When the first Fermi card, the GeForce GTX 480, was released, we were excited but also disappointed. Excited because the card NVIDIA had talked about for so long was finally here in our labs; disappointed because while the Fermi architecture promised 512 CUDA cores, the GeForce GTX 480 had only 480. There were other undesirable traits to contend with as well, such as extremely high operating temperatures and power draw.
If you recall from our earlier article on the GeForce GTX 480, the full Fermi chip was supposed to have 16 SMs (streaming multiprocessors) with 512 CUDA cores, but the GTX 480, powered by the GF100 chip, had only 15 SMs enabled, for a total of 480 CUDA cores. Although NVIDIA never explicitly explained why, it is widely believed that, at that point in time at least, yields were too low due to the complexity of the new chip. This wasn’t helped by the fact that TSMC, the foundry tasked with producing the new chips, had numerous issues with its 40nm process, which affected not only NVIDIA but also ATI.
Recently, however, in the face of competition from AMD and its new Northern Islands cards, the rumor mill has been abuzz with news that NVIDIA was preparing a card that would finally flaunt the full 512 cores that the Fermi architecture is capable of, in a bid to cement its grip on the title of world’s fastest single-GPU card. That card is the new GeForce GTX 580, and it has finally arrived in our labs.
Architecturally, despite being given the GF110 codename by NVIDIA, the GeForce GTX 580 is very similar to the older GTX 480, the key exception being an additional SM. This additional SM gives the GeForce GTX 580 an extra 32 CUDA cores and four texture mapping units, bringing the total number of cores and texture mapping units on the GTX 580 to 512 and 64 respectively. On top of that, NVIDIA has incorporated two enhancements to speed up the GeForce GTX 580: full-speed FP16 texture filtering and support for new tile formats that improve Z-cull efficiency.
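The totals above follow directly from the SM count. Here is a minimal sketch of that arithmetic, assuming (as the figures above imply) that every SM contributes 32 CUDA cores and four texture mapping units; the function and constant names are our own and purely illustrative.

```python
# Per-SM resources as implied by the text: one extra SM adds
# 32 CUDA cores and 4 texture mapping units.
CORES_PER_SM = 32
TMUS_PER_SM = 4


def fermi_totals(active_sms):
    """Return (CUDA cores, texture mapping units) for a given SM count."""
    return active_sms * CORES_PER_SM, active_sms * TMUS_PER_SM


for name, sms in [("GeForce GTX 480", 15), ("GeForce GTX 580", 16)]:
    cores, tmus = fermi_totals(sms)
    print(f"{name}: {sms} SMs -> {cores} CUDA cores, {tmus} TMUs")
```

With 15 SMs this works out to 480 cores and 60 texture units, and with all 16 SMs enabled it reaches 512 and 64.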
Performance enhancements aside, NVIDIA was also anxious to improve the power efficiency and thermal characteristics of the new card. To that end, it reworked the chip at the transistor level, using lower-leakage transistors on less timing-sensitive processing paths and higher-speed transistors on more critical ones. This also allowed NVIDIA to cut back on the overall number of transistors used, and the end result is that the GeForce GTX 580 has a lower rated TDP of 244W compared to the GeForce GTX 480’s 250W.
Impressively, despite the slightly lower rated TDP, the GeForce GTX 580 runs at significantly higher clock speeds. Its core is clocked at 772MHz, while its shader and memory are clocked at 1544MHz and 4008MHz respectively, compared to the GeForce GTX 480’s core, shader and memory clock speeds of 701MHz, 1401MHz and 3696MHz DDR. That the GTX 580 can run at higher clock speeds, have more cores and still carry a lower rated TDP goes to show just how hard NVIDIA has worked at optimizing the GF110 chip.
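To put those clock speeds in perspective, the short sketch below works out the percentage increase of each GTX 580 clock over its GTX 480 counterpart. It uses only the figures quoted above; the helper name is our own.

```python
# Clock speeds in MHz, taken from the figures quoted in the text.
gtx480 = {"core": 701, "shader": 1401, "memory": 3696}
gtx580 = {"core": 772, "shader": 1544, "memory": 4008}


def percent_gain(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100


for domain in gtx480:
    gain = percent_gain(gtx480[domain], gtx580[domain])
    print(f"{domain}: {gtx480[domain]}MHz -> {gtx580[domain]}MHz (+{gain:.1f}%)")
```

The core and shader clocks come out roughly 10% higher and the memory clock roughly 8% higher, on top of the extra SM.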
Before we continue, here’s how the GeForce GTX 580 stacks up against its closest competitors.
|Model||NVIDIA GeForce GTX 580||NVIDIA GeForce GTX 480||NVIDIA GeForce GTX 470||ATI Radeon HD 5970||AMD Radeon HD 6870||ATI Radeon HD 5870|
|Core Code||GF110||GF100||GF100||Hemlock||Barts XT||Cypress XT|
|Transistor Count||3000 million||3200 million||3200 million||4300 million||1700 million||2150 million|
|Stream Processors||512 Stream Processors||480 Stream Processors||448 Stream Processors||3200 Stream processing units||1120 Stream processing units||1600 Stream processing units|
|Stream Processor Clock||1544MHz||1401MHz||1215MHz||725MHz||900MHz||850MHz|
|Texture Mapping Units (TMU) or Texture Filtering (TF) units||64||60||56||160||56||80|
|Raster Operator units (ROP)||48||48||40||64||32||32|
|Memory Clock||4008MHz GDDR5||3696MHz GDDR5||3348MHz GDDR5||4000MHz GDDR5||4200MHz GDDR5||4800MHz GDDR5|
|DDR Memory Bus||384-bit||384-bit||320-bit||256-bit||256-bit||256-bit|
|PCI Express Interface||PCIe ver 2.0 x16||PCIe ver 2.0 x16||PCIe ver 2.0 x16||PCIe ver 2.0 x16||PCIe ver 2.0 x16||PCIe ver 2.0 x16|
|Molex Power Connectors||1 x 6-pin, 1 x 8-pin||1 x 6-pin, 1 x 8-pin||2 x 6-pin||1 x 6-pin, 1 x 8-pin||2 x 6-pin||2 x 6-pin|
|Multi GPU Technology||SLI||SLI||SLI||CrossFireX||CrossFireX||CrossFireX|
|DVI Output Support||2 x Dual-Link||2 x Dual-Link||2 x Dual-Link||2 x Dual-Link||1 x Dual-Link, 1 x Single-Link||2 x Dual-Link|
|HDCP Output Support||Yes||Yes||Yes||Yes||Yes||Yes|
|Street Price||Launch Price: US$499||~US$500||~US$259||~US$500||US$239||~US$360|
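The memory clock and bus width rows can be combined into a rough peak memory bandwidth figure. The sketch below assumes that the memory clocks listed in the table are effective (DDR) data rates, so bandwidth is simply the data rate multiplied by the bus width in bytes; the function name is our own, and the results are theoretical peaks rather than measured numbers.

```python
def bandwidth_gb_s(effective_mhz, bus_bits):
    """Theoretical peak bandwidth in GB/s.

    Assumes effective_mhz is the effective (DDR) data rate, so
    MHz * (bus width in bytes) gives MB/s; divide by 1000 for GB/s.
    """
    return effective_mhz * (bus_bits / 8) / 1000


# (memory clock in MHz, bus width in bits), taken from the table above.
cards = {
    "GeForce GTX 580": (4008, 384),
    "GeForce GTX 480": (3696, 384),
    "Radeon HD 5870": (4800, 256),
}

for name, (clock, bus) in cards.items():
    print(f"{name}: {bandwidth_gb_s(clock, bus):.1f} GB/s")
```

Under that assumption, the wider 384-bit bus gives the GTX 580 more theoretical bandwidth than the Radeon HD 5870, even though the latter has the higher memory clock.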