Performance Review: NVIDIA GeForce GTX 980 Ti

By Koh Wanzi - 1 Jun 2015

Introduction: DirectX 12, 4K gaming and virtual reality

The future of PC Gaming: DirectX 12, 4K gaming, and virtual reality

Meet the NVIDIA GeForce GTX 980 Ti, a monster card that sits just below the GeForce GTX Titan X.

Following the release of the beast of a graphics card that was the NVIDIA GeForce GTX Titan X, rumors swirled that NVIDIA would be releasing another high-end card to fill the gap between the GeForce GTX 980 and the Titan X. As it turns out, the rumors are true. NVIDIA today took the wraps off the GeForce GTX 980 Ti, a GM200-based card that purports to offer performance similar to the Titan X but with less video memory. And given that AMD is poised to release its Radeon R9 300 series of graphics cards this June, we’re guessing that the timing of NVIDIA’s launch is no coincidence.

At first glance, NVIDIA’s new card appears quite similar to the GeForce GTX Titan X, save for the fact that its cooling shroud is a slate gray instead of black. Under the shroud, it features a GM200 GPU that packs the same stunning 8 billion transistors as the Titan X. However, unlike the GeForce GTX 780 Ti, which turned out to be even more powerful than 2013’s Titan, the GeForce GTX 980 Ti ships with 22 Streaming Multiprocessors (SMMs) and 2816 CUDA cores, down from 24 SMMs and 3072 cores on the Titan X.

The NVIDIA GeForce GTX 980 Ti features a pared-down GM200 GPU with 22 SMMs, down from 24 on the Titan X. (Image Source: VideoCardz)

The GeForce GTX 980 Ti also sports 176 texture mapping units (TMUs), putting it squarely between the GeForce GTX Titan X and the GTX 980. However, it retains six Graphics Processing Clusters (GPCs) like the Titan X, along with the full complement of six 64-bit memory controllers that gives it a 384-bit memory bus to deal with memory-hungry 4K resolutions and VR gaming. In addition, the new card boasts 6GB of GDDR5 memory with an effective memory clock of 7010MHz. This puts its peak memory bandwidth at 336.5GB/s, 50% higher than that of the GeForce GTX 980.
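That 50% figure follows directly from the wider bus, since both cards run their memory at the same effective clock. A quick sketch of the arithmetic, using the figures from the spec table in this review:

```python
# Peak memory bandwidth = effective memory clock (transfers/s) x bus width (bytes/transfer).
def peak_bandwidth_gbps(effective_clock_mhz: float, bus_width_bits: int) -> float:
    """Return peak memory bandwidth in GB/s."""
    bytes_per_transfer = bus_width_bits / 8   # a 384-bit bus moves 48 bytes per transfer
    return effective_clock_mhz * 1e6 * bytes_per_transfer / 1e9

gtx_980_ti = peak_bandwidth_gbps(7010, 384)   # 336.48, quoted as 336.5GB/s
gtx_980 = peak_bandwidth_gbps(7010, 256)      # 224.32, quoted as 224GB/s
print(f"GTX 980 Ti: {gtx_980_ti:.1f}GB/s vs GTX 980: {gtx_980:.1f}GB/s "
      f"({gtx_980_ti / gtx_980 - 1:.0%} higher)")
```

Because 384 / 256 = 1.5, the gap works out to exactly 50% at identical memory clocks.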

The card runs at a base clock frequency of 1000MHz and a boost clock frequency of 1075MHz. However, the boost clock figure is actually an average derived from the speeds the card typically reaches across a range of real-world applications, not the maximum clock speed it can attain. As NVIDIA was careful to point out, the card could very well boost to even higher clock speeds whenever workload, temperature, and power conditions allow.
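In other words, the advertised boost clock behaves like a mean over observed application clocks rather than a ceiling. A toy illustration, where the per-application clocks are hypothetical rather than measured figures:

```python
# Hypothetical clock speeds observed while running different applications.
observed_clocks_mhz = {
    "Game A": 1100,
    "Game B": 1050,
    "Game C": 1088,
    "Benchmark D": 1062,
}

# The advertised boost clock is the average of these observations...
boost_clock = sum(observed_clocks_mhz.values()) / len(observed_clocks_mhz)
print(f"Advertised boost clock: {boost_clock:.0f}MHz")   # 1075MHz

# ...while individual applications can run well above it.
print(f"Highest observed clock: {max(observed_clocks_mhz.values())}MHz")   # 1100MHz
```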

Here's a quick rundown of the card's specifications.

In terms of video connectivity options, the card has three DisplayPort outputs, one HDMI port and one dual-link DVI port. The card will of course also support G-Sync, NVIDIA’s variable refresh rate technology that aims to eliminate screen tearing and stutter.

The NVIDIA GeForce GTX 980 Ti has three DisplayPort outputs, one HDMI and one dual-link DVI port.

And with a Thermal Design Power (TDP) of 250 watts, it’s powered by two PCIe graphics power connectors: one six-pin and one eight-pin.

The card requires one six-pin and one eight-pin PCIe power connector.

Here’s a look at how the GeForce GTX 980 Ti compares against other enthusiast graphics card SKUs in the market:

NVIDIA GeForce GTX 980 Ti 6GB GDDR5 and competitive SKUs compared

| Specification | NVIDIA GeForce GTX 980 Ti | NVIDIA GeForce GTX 980 | NVIDIA GeForce GTX Titan X | AMD Radeon R9 295X2 |
|---|---|---|---|---|
| Core Code | GM200 | GM204 | GM200 | Vesuvius |
| GPU Transistor Count | 8 billion | 5.2 billion | 8 billion | 12.4 billion (2 x 6.2 billion) |
| Manufacturing Process | 28nm | 28nm | 28nm | 28nm |
| Core Clock | 1000MHz (Boost: 1075MHz) | 1126MHz (Boost: 1216MHz) | 1000MHz (Boost: 1075MHz) | Up to 1020MHz |
| Stream Processors | 2816 | 2048 | 3072 | 5632 (2 x 2816) |
| Stream Processor Clock | 1000MHz | 1126MHz | 1000MHz | Up to 1020MHz |
| Texture Mapping Units (TMUs) | 176 | 128 | 192 | 352 |
| Raster Operator Units (ROPs) | 96 | 64 | 96 | 128 |
| Memory Clock (DDR) | 7010MHz | 7010MHz | 7010MHz | 5000MHz |
| Memory Bus Width | 384-bit | 256-bit | 384-bit | 2 x 512-bit |
| Memory Bandwidth | 336.5GB/s | 224GB/s | 336.5GB/s | 640GB/s |
| PCI Express Interface | PCIe 3.0 | PCIe 3.0 | PCIe 3.0 | PCIe 3.0 x16 |
| Power Connectors | 1 x 6-pin, 1 x 8-pin | 2 x 6-pin | 1 x 6-pin, 1 x 8-pin | 2 x 8-pin |
| Multi-GPU Technology | SLI | SLI | SLI | AMD CrossFireX |
| DVI Outputs | 1 | 1 | 1 | 1 |
| HDMI Outputs | 1 | 1 | 1 | – |
| DisplayPort Outputs | 3 | 3 | 3 | 4 |
| HDCP Output Support | Yes | Yes | Yes | Yes |

With the impending release of AMD’s next-generation cards, NVIDIA could hardly be content to let the red camp steal all the thunder, and it’s hoping that the new GeForce GTX 980 Ti will lead the charge towards what it sees as the next level of PC gaming. In its view, three new technologies are quickly reshaping the gaming landscape – DirectX 12, ultra-high resolution 4K gaming, and virtual reality (VR). The GeForce GTX 980 Ti is designed to deliver performance in all these key areas.

 

Next-generation API: DirectX feature level 12.1

The NVIDIA GeForce GTX 980 Ti will support DirectX feature level 12.1

The NVIDIA GeForce GTX 980 Ti ushers in DirectX feature level 12.1, which enables developers to take advantage of new features like Volume Tiled Resources, Conservative Raster, and Raster Ordered Views. All second-generation Maxwell cards will also support these features. While the low-level DirectX 12 API allows developers to work more closely with the hardware and gives them more control over the GPU’s resources in order to reduce CPU overhead and boost performance, DirectX feature level 12.1 builds on this foundation to further improve efficiency and graphics realism.

Volume Tiled Resources improves upon the implementation of Tiled Resources in DirectX 12. The latter method breaks down textures into tiles, so instead of streaming all the textures indiscriminately - which can be quite inefficient and graphically taxing - only the tiles which are required for rendering are stored in the GPU’s memory. This effectively allows game developers to produce higher fidelity graphics while using less memory.

However, one limitation of Tiled Resources is that it works on flat, 2D textures only. Volume Tiled Resources changes that with the addition of a third dimension. Now the efficiency of Tiled Resources can be applied to objects that span all three axes, or in other words, 3D textures. Many visual effects used in games are volumetric in nature, such as fluids, clouds, smoke, fire and fog. Volume Tiled Resources allows the GPU to use its memory even more efficiently, which will in turn allow developers to incorporate more intricate details into their games, like smoke that is generated using sparse fluid simulation for greater realism.
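The memory saving is easy to see with some back-of-the-envelope arithmetic. The tile size and residency fraction below are illustrative assumptions, not figures from NVIDIA:

```python
# Compare keeping a whole 3D texture resident in GPU memory versus keeping
# only the tiles actually needed for rendering (the Volume Tiled Resources idea).
def texture_memory_mb(tiles_per_axis: int, tile_bytes: int, resident_fraction: float):
    """Return (fully resident MB, sparsely resident MB) for a cubic tiled 3D texture."""
    total_tiles = tiles_per_axis ** 3
    full_bytes = total_tiles * tile_bytes
    return full_bytes / 2**20, full_bytes * resident_fraction / 2**20

# A volume texture split into 32 x 32 x 32 tiles of 64KB each, where only 10%
# of the tiles are needed for the visible parts of, say, a smoke effect:
full_mb, resident_mb = texture_memory_mb(32, 64 * 1024, 0.10)
print(f"Fully resident: {full_mb:.0f}MB, sparse residency: {resident_mb:.0f}MB")
```

Under these assumptions, a texture that would otherwise occupy 2GB of video memory fits in roughly 205MB, which is exactly the kind of headroom that lets developers push higher-fidelity volumetric effects.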

Another key feature of DirectX feature level 12.1 is Conservative Raster, which is essentially a more accurate method for determining whether or not a pixel is covered by a primitive, the triangles that serve as the building blocks of the geometry in a scene. In traditional rasterization, a pixel is only considered covered if the primitive covers a specific sample point within that pixel, for instance, the center of the pixel.

On the other hand, conservative rasterization rules dictate that a pixel is considered covered if any part of it is covered by a primitive. By providing hardware acceleration for the process, the GPU can perform these calculations far more efficiently than a software implementation could. This enables game developers to employ new approaches to improve image quality, and conservative raster can prove very useful in certain cases, like generating ray-traced shadows that are free from aliasing artifacts or gaps.
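The difference between the two coverage rules can be sketched in a few lines. This is a simplified 2D illustration, and the conservative test approximates full square-versus-triangle overlap by also sampling the pixel's corners, which suffices for this example but is not a complete implementation:

```python
def point_in_triangle(p, a, b, c):
    """Test whether point p lies inside triangle abc using signed cross products."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    s1, s2, s3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    return not (min(s1, s2, s3) < 0 and max(s1, s2, s3) > 0)

def center_sample_covered(px, py, tri):
    """Traditional rule: the pixel counts as covered only if its CENTER is inside."""
    return point_in_triangle((px + 0.5, py + 0.5), *tri)

def conservative_covered(px, py, tri):
    """Conservative rule: the pixel counts as covered if ANY part of it is touched.
    Approximated here by testing the pixel's four corners as well as its center."""
    samples = [(px, py), (px + 1, py), (px, py + 1), (px + 1, py + 1),
               (px + 0.5, py + 0.5)]
    return any(point_in_triangle(s, *tri) for s in samples)

# A small triangle clips the corner of pixel (0, 0) but misses its center:
tri = ((0.0, 0.0), (0.3, 0.0), (0.0, 0.3))
print(center_sample_covered(0, 0, tri))   # False: the center (0.5, 0.5) is outside
print(conservative_covered(0, 0, tri))    # True: the corner (0, 0) is inside
```

Under the traditional rule this triangle contributes nothing to the pixel, so techniques that depend on exact coverage can show gaps; the conservative rule catches the partial overlap.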

 

Ultra-high resolution 4K gaming

NVIDIA has designed its new card to deliver performance in ultra-high resolution 4K gaming.

The GeForce GTX 980 Ti was also designed with 4K gaming in mind. While peak memory bandwidth plays an important role in a GPU’s performance, especially at ultra-high resolutions, various bottlenecks within the memory subsystem can also prevent the GPU from achieving its peak performance. To minimize this issue, NVIDIA’s second-generation Maxwell GPUs – which now include the GeForce GTX 980 Ti – employ a new memory architecture that is designed to allow the GPU to use its memory bandwidth more effectively.

Each of the GM200 GPU’s SMM units features its own dedicated 96KB of shared memory, while the L1 and texture caches are combined into a 48KB pool of memory per SMM. In comparison, previous-generation Kepler GPUs were designed with shared memory and L1 cache sharing the same on-chip storage. By merging the functionality of the L1 and texture caches in each SMM, NVIDIA has allowed each SMM to have its own dedicated space of shared memory on GM200. In addition, the GM200 GPU is equipped with 3MB of L2 cache.

And when combined with the GeForce GTX 980 Ti’s 384-bit memory bus width – up from 256-bit on the GeForce GTX 980 – NVIDIA says the new card is fully capable of churning out playable frame rates in the latest games at 4K resolutions.

 

Multi-res shading for improved VR performance

The new card supports multi-res shading, which helps improve performance in the demanding VR space.

The GeForce GTX 980 Ti also features support for multi-res shading, which purports to provide a 1.3 to 2x improvement in pixel shader performance over traditional rendering for VR. VR headsets use a specially designed lens to create the focus and field of view needed for an immersive experience. But because the lens introduces its own distortions, the display renders a warped, fisheye-like image to match the optical properties of the lens. The rendered image features an enlarged center and a compressed periphery, so when viewed through the lens, the viewer sees a correctly proportioned image that lets them focus on the center.

However, this method is actually rather inefficient because GPUs are designed to render to a flat 2D screen and cannot natively produce the fisheye images needed for VR. To get around this, the GPU first renders the image normally and then warps the periphery just before it is sent out to the display. This solution works, but because the GPU has to render the entire scene in full detail before warping the image for a VR display, many of the rendered pixels in the periphery are simply discarded. This means that the GPU has essentially wasted resources rendering pixels that are never used.

NVIDIA’s multi-res shading intends to address this issue and improve the GPU’s efficiency by utilizing the multi-projection engine – also known as Viewport Multicast – found on all second-generation Maxwell GPUs. Each image is divided into discrete viewports, which are in turn scaled to a different size based on the maximum sampling resolution needed within that portion of the image. While the center viewport stays the same size, much like in traditional rendering, the outer viewports are scaled down to match the lower resolutions needed in the image periphery.

Image Source: NVIDIA

Viewport Multicast accelerates the geometry broadcast stage that’s used to create the various viewports. Instead of rendering the scene in multiple passes for each viewport, the engine renders the scene’s geometry and sends it out to the relevant viewports all in a single pass. This results in a far more efficient rendering process for the GPU. And given that VR is far more unforgiving when it comes to performance drops than traditional flat screen displays, every boost in performance counts.
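The potential saving is straightforward to estimate. In the sketch below, the render target size, the fraction occupied by the full-resolution center viewport, and the scale applied to the outer viewports are all illustrative assumptions, not NVIDIA’s actual parameters:

```python
# Estimate shaded-pixel counts for traditional rendering versus a multi-res
# layout with a full-resolution center viewport and downscaled outer viewports.
def shaded_pixels(width: int, height: int, center_frac: float = 0.5,
                  outer_scale: float = 0.7):
    """Return (traditional pixel count, multi-res pixel count)."""
    full = width * height
    center = (width * center_frac) * (height * center_frac)   # kept at full res
    outer = (full - center) * outer_scale ** 2                # shaded at reduced res
    return full, int(center + outer)

# A hypothetical per-eye render target:
full, multires = shaded_pixels(1512, 1680)
print(f"Traditional: {full} px, multi-res: {multires} px, "
      f"{full / multires:.2f}x fewer pixels shaded")
```

With these assumptions the GPU shades roughly 1.6x fewer pixels, which sits comfortably within the 1.3 to 2x range NVIDIA quotes.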

Overall rating: 8.5/10
  • Performance: 9
  • Features: 8.5
  • Value: 8.5

The Good
  • Excellent performance that rivals the Titan X
  • Delivers playable frame rates at 4K resolutions
  • Highly overclockable
  • Offers decent value for its performance

The Bad
  • Reference cooler runs hot