Product Listing

NVIDIA GeForce GTX 1080 review: A new king is crowned

By Koh Wanzi - 28 May 2016
Launch SRP: S$1188

GDDR5X memory and Asynchronous Compute

The first GDDR5X card

The other major change from the Maxwell flagship is the GeForce GTX 1080’s use of GDDR5X memory (the GeForce GTX 1070 still uses GDDR5). As we’ve noted before, HBM2 is still fairly expensive to produce, and GDDR5X provides a way to boost memory bandwidth while still keeping costs in check.

The GeForce GTX 1080 features 8GB of GDDR5X memory, clocked at a 10Gbps transfer rate. That seems pretty high at first, until you consider the fact that Micron’s chips are actually rated for up to 13Gbps and maybe even higher. Nevertheless, this might actually be a deliberate choice, as a low starting point gives all parties more breathing room to scale up in the future.

The primary benefit of GDDR5X over GDDR5 is much improved memory bandwidth.

GDDR5X’s main improvement over GDDR5 is its use of quad-data rate (QDR) bus in place of a double-data rate (DDR) bus. This effectively doubles the rate at which data can be transmitted, all without needing to raise clock speeds.

In addition to bandwidth improvements, NVIDIA says that it has designed a new circuit architecture to go with the higher transfer rates enabled by GDDR5X. Great attention has also been given to the channel between the GPU and memory die, in order to minimize any signal degradation that might impede performance.

The final feather in the cap is GDDR5X’s lower voltage of 1.35 volts, which combines with the circuit and channel improvements to bring about a far more power efficient operation.

But further adding onto GDDR5X’s bandwidth improvements is also an enhanced memory compression pipeline. To put things simply, this is brought about by better color compression modes, which reduces the size of the data that have to be fetched from the memory per frame. In a word, this frees up around 20 percent of additional effective bandwidth for use in other areas that could boost performance.

Maxwell Pascal color compression

Nevertheless, the fact remains that GDDR5X doesn't solve the problems of GDDR5 memory, like soon-to-be untenable power consumption, which was what prompted AMD to adopt HBM in the first place. HBM2 is the future, and it's likely that flagship cards will go down that route, with GDDR5X being reserved for mid-range models.


Finally, hardware support for asynchronous compute

Asynchronous compute is one of the rare cases where NVIDIA is playing catch-up with AMD, but catch up it has. While Maxwell cards lagged behind their AMD counterparts in DirectX 12 games like Ashes of the Singularity because of the lack of proper hardware support for asynchronous compute, that is happily no longer the case.

We’ve briefly touched on asynchronous compute before in our review of the AMD Radeon R7 370 and R9 380. But to recap, asynchronous compute is a term that describes the ability to simultaneously process complex – comprising multiple independent or “asynchronous” tasks – workloads

These workloads are separated into graphics and compute tasks, but because they sometimes overlap with each other, and usually utilize different GPU resources, the GPU can boost performance by running these workloads simultaneously instead of serially. This amounts to more efficient resource utilization, and one example could be a PhysX workload (compute) running concurrently with graphics rendering (graphics).

To address this, Pascal has implemented something called dynamic load balancing, which allows either compute or graphics workloads to maximize use of the GPU’s resources if they are freed up. In comparison, Maxwell relied on static partitioning of the GPU into subsets for graphics and compute workloads. This worked well if the balance of work between the two matched the partitioning ratio. But if there was an imbalance and one was taking longer than the other, and both needed to be completed for the pipeline to move forward, the partition handling the quicker task would end up idling, possibly resulting in reduced performance.

NVIDIA dynamic load balancing

NVIDIA also improved Pascal’s pre-emption capabilities with a feature called Pixel Level Preemption.  To cut a long story short, this means Pascal can interrupt a longer task in order to switch to a more time-sensitive workload with very low latency involved, which potentially benefits VR gaming features like asynchronous time warp (ATW). Because ATW works by inserting a repeated frame (corrected for head position) in the event that the GPU cannot keep up with the display's refresh rate, it ideally runs as late as possible to get the most accurate positional information from the headset's sensors. However, before Pascal's pre-emption improvements, ATW requests had to be submitted several milliseconds in advance, leading to greater perceived latency. With Pascal, the finer-grained pre-emption allows ATW requests to be submitted much later.

NVIDIA Pascal preemption

Join HWZ's Telegram channel here and catch all the latest tech news!
9.0
  • Performance 9.5
  • Features 9
  • Value 9
The Good
Stellar performance
More powerful than last-gen flagship, but cheaper
Excellent build quality and attractive design
Low power consumption
Good overclocking headroom
Crammed full of new features for multi-display setups and VR
The Bad
No HBM2 memory
"Reference" design now comes with extra price premium
Our articles may contain affiliate links. If you buy through these links, we may earn a small commission.