
A Primer on AMD's Radeon R9 290 Series

By Vijay Anand - 24 Oct 2013

GCN 2.0 and New PowerTune

At the Core of Things - An Updated GCN Architecture

Last generation’s Southern Islands GPUs introduced significant changes to graphics processing with the debut of the Graphics Core Next (GCN) architecture. The new Hawaii-based GPU featured in the Radeon R9 290 series takes it up a notch with further refinements and an updated core layout to accommodate even more firepower. The core layout, while revamped, is not a radical departure from the Radeon HD 7970 and its first-generation GCN architecture. This time round, you'll notice that four main functional processing blocks, called shader engines, make up the Radeon R9 290X GPU. All other supporting functional units surround the shader engine blocks:-

This is the full functional block diagram of the Hawaii GPU found on the Radeon R9 290X.

Depending on future GPU models, these large shader engine blocks can be cut back accordingly to make up a smaller GPU die. Within each shader engine block, you'll find one geometry processor. With four shader engine blocks, the R9 290X has four geometry processors - that's double what the previous generation Tahiti core offers. The core processing unit in any of AMD’s modern GPUs is the GCN compute unit (CU). Each shader engine block can hold from 1 to 11 compute units. In the case of the fully decked Radeon R9 290X, there are 11 CUs in each shader engine. With four such shader engines, the Radeon R9 290X offers a total of 44 compute units, up from the 32 offered on the previous generation.

A closer look at a shader engine block, whose architecture supports up to 11 CUs. The R9 290X has four such shader engines for a total of 44 CUs.

Each GCN CU remains largely identical to that of the Southern Islands GPUs (Radeon HD 7000 series), but with a few updates. These include support for flat addressing, which now allows the hardware to determine direct addressing, and improved media processing instructions, especially for the Maskable Quad Sum of Absolute Differences (MQSAD) introduced in the previous generation, whose function is to allow background pixels to be ignored while helping to isolate moving objects. So yes, on the whole each GCN CU still has four vector units of 16 processing elements each, which gives you 64 stream processors per GCN CU.
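To make the arithmetic concrete, here is a minimal Python sketch that derives the totals from the figures quoted above (four shader engines, 11 CUs each, four 16-element vector units per CU). The class and constant names are our own and purely illustrative; they are not taken from any AMD tool or document.

```python
from dataclasses import dataclass

# Four vector units per GCN CU, 16 processing elements each (from the text).
STREAM_PROCESSORS_PER_CU = 4 * 16


@dataclass(frozen=True)
class GpuLayout:
    """Illustrative container for a GCN core layout (hypothetical name)."""
    name: str
    shader_engines: int
    cus_per_engine: int

    @property
    def compute_units(self) -> int:
        # Total CUs = shader engines x CUs per shader engine.
        return self.shader_engines * self.cus_per_engine

    @property
    def stream_processors(self) -> int:
        # Total stream processors = CUs x 64.
        return self.compute_units * STREAM_PROCESSORS_PER_CU


r9_290x = GpuLayout("Radeon R9 290X", shader_engines=4, cus_per_engine=11)

# The previous generation's 32 CUs are quoted in the text; only the total is used here.
previous_gen_sps = 32 * STREAM_PROCESSORS_PER_CU

print(r9_290x.compute_units)      # 44
print(r9_290x.stream_processors)  # 2816
print(previous_gen_sps)           # 2048
```

Scaling the same layout down (fewer shader engines or fewer CUs per engine) is how a smaller die would be described in this model.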

The GPU compute unit (CU) in the Radeon R9 and R7 largely remains similar to the previous generation; what differs is the number of compute units available per GPU. This also means that the number of supporting processing engines and blocks that work alongside the basic GCN compute unit differs in each GPU configuration.

Supporting the graphics processing blocks are other functions like the rasterizer, render back-ends, geometry processors, L2 cache (1MB in total, up from 768KB) and the memory interface. All of these have been incrementally updated, but the biggest change is the allocation of units per shader engine block. We’ll detail this when we’ve obtained clearance, but two aspects have been publicly acknowledged. The first is the doubling of the render back-ends on the top-tier R9 290 series compared to the Radeon HD 7970 (to cater to 4K resolution gaming). The second is the much denser 512-bit memory interface on the new R9 290X, which takes up much less die space for the bandwidth it delivers (measured in bandwidth per mm² of die area).
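As a rough illustration of why a wider bus matters, the sketch below applies the standard peak-bandwidth formula (bus width in bytes multiplied by the per-pin data rate). Only the 512-bit width comes from this article. The 384-bit width used for the Radeon HD 7970 and the 5.0 Gbps data rate are our own assumptions for the sake of comparison, and the function name is hypothetical.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: (bus width / 8) bytes x per-pin rate in Gbps."""
    return bus_width_bits / 8 * data_rate_gbps


# Assumption: the same per-pin data rate on both buses, to isolate the effect of width.
ASSUMED_DATA_RATE_GBPS = 5.0

wide = peak_bandwidth_gb_s(512, ASSUMED_DATA_RATE_GBPS)    # 512-bit bus (from the text)
narrow = peak_bandwidth_gb_s(384, ASSUMED_DATA_RATE_GBPS)  # 384-bit bus (assumed, not from the text)

print(wide)    # 320.0
print(narrow)  # 240.0
```

At an equal data rate, the wider bus moves a third more data per second, which is the gain AMD is weighing against the die area the interface occupies.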

The new R9 and R7 series of GPUs are still manufactured on the 28nm process node, but given all the enhancements and the increased number of processing units and blocks, the top-end R9 290 series packs over 6 billion transistors and naturally has a larger die than its predecessor. Despite that, AMD assured us that the R9 290 series die is 25% smaller than that of NVIDIA’s Titan and is more efficient per mm² of die area (though there’s no mention of how the performance will stack up). In its place, we have some stats from AMD comparing the Radeon R9 290X against the Radeon HD 7970 GHz Edition:-

 

Power to the People - A New PowerTune for 2013/2014

AMD’s PowerTune technology is the company’s counterpart to the better-known GPU Boost used on NVIDIA’s products, even though AMD debuted its technology earlier. Essentially, the PowerTune featured on the previous generation Southern Islands GPUs (Radeon HD 7000 series) analyzes the ‘active power signature’ of the card to make use of unused thermal headroom. Most use-case scenarios hardly approach the graphics card’s TDP, so technologies like AMD PowerTune use the remaining power budget to push core clock speeds and give enthusiasts increased performance.

The design goals of the new AMD PowerTune on AMD Radeon R9 and R7 graphics cards.

More than 1.5 years after the Radeon HD 7970 debuted, we’re glad to know that the Radeon R9 290 parts will feature a much more comprehensive PowerTune technology, all the more so because NVIDIA’s GPU Boost 2.0 has been available in its current generation of offerings since earlier this year. As painted in AMD’s PowerTune manifesto, it aims to be the most advanced controller to date: it will now factor in not only active power consumption, but also other attributes such as real-time temperature, voltage draw and even fan speed.


As shown in the block diagram above, a Digital Power Management (DPM) arbitrator checks the card’s temperature, power consumption and voltage draw to determine how best to adjust the other attributes to maximize the potential of the hardware. Temperature targets (the default threshold is 95 degrees Celsius) and fan noise are the newer factors in the PowerTune architecture, providing more control and optimization. For the longest time, we’ve been complaining about AMD’s reference coolers being rather noisy in the thick of gaming. Fortunately, the ability to set the fan speed to your preference helps you control the acoustics of the card and also ensures there are no drastic changes in noise levels as the card moves between various stages of usage.
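To give a feel for how an arbitrator of this kind might juggle these inputs, here is a toy Python sketch. It is emphatically not AMD's algorithm: only the 95 degrees Celsius default comes from the text, while the power budget, clock range, step sizes, fan cap and all names are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Limits:
    temp_c: float = 95.0       # default temperature target quoted in the article
    power_w: float = 250.0     # hypothetical board power budget
    max_clock_mhz: int = 1000  # hypothetical "up to" clock
    min_clock_mhz: int = 300   # hypothetical clock floor
    clock_step_mhz: int = 25   # hypothetical clock step
    max_fan_pct: int = 55      # hypothetical user-set fan speed cap
    fan_step_pct: int = 5      # hypothetical fan step


def arbitrate(clock_mhz: int, fan_pct: int, temp_c: float,
              power_w: float, limits: Limits) -> Tuple[int, int]:
    """Return the next (clock, fan) setting for one control step of the toy model."""
    # Over the power budget: the fan cannot help, so lower the clock.
    if power_w > limits.power_w:
        return max(limits.min_clock_mhz, clock_mhz - limits.clock_step_mhz), fan_pct
    # Over the temperature target: spend fan headroom first, then lower the clock.
    if temp_c > limits.temp_c:
        if fan_pct < limits.max_fan_pct:
            return clock_mhz, min(limits.max_fan_pct, fan_pct + limits.fan_step_pct)
        return max(limits.min_clock_mhz, clock_mhz - limits.clock_step_mhz), fan_pct
    # Within every limit: use the headroom to raise the clock toward its cap.
    return min(limits.max_clock_mhz, clock_mhz + limits.clock_step_mhz), fan_pct


limits = Limits()
print(arbitrate(1000, 40, 96.0, 200.0, limits))  # (1000, 45): too hot, fan has headroom
print(arbitrate(1000, 55, 96.0, 200.0, limits))  # (975, 55): too hot, fan already at cap
print(arbitrate(950, 40, 80.0, 200.0, limits))   # (975, 40): headroom, clock rises
```

In this toy model a lower fan cap trades acoustics for clock speed: once the fan reaches the cap, only the clock can give way.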

At the end of the day, the overall performance of the Radeon R9 and R7 graphics cards is determined by the remaining power budget available. Due to the dynamic nature of PowerTune and the various parameters under its control (plus further user inputs), AMD is officially acknowledging that the R9 and R7 cards will no longer have a single advertised clock speed; instead, they will be advertised as “Up to xxxxMHz”. This change of marketing is reflected in the large specs table we’ve tabulated above.

The 4 pillars of the new AMD PowerTune that’s featured in the Radeon R9 290X.

Alas, all of this is playing catch-up, as NVIDIA’s GPU Boost 2.0 and partner utilities have supported all of these features in the GeForce 700 series of graphics cards since earlier this year. Nevertheless, it’s good to know AMD recognizes what needs to be done to appeal to the modern gamer. Performance numbers aren’t everything, as the overall user experience is important too. Having said that, we notice that AMD hasn’t talked about any improvements to the cooler, shroud or fan used, and we suspect that without the updated PowerTune, the reference cards might once again show the ugly side of the previous generation Radeon cards. We’ll find out when and if we get hold of a reference-based Radeon R9 graphics card.
