
Engineering AMD's Radeon RX 7900 series and the RDNA 3 GPU (Updated)

By Vijay Anand - 15 Nov 2022


The Radeon RX 7900 XTX RDNA 3 GPU, as held up by Dr. Lisa Su, AMD CEO. (image source: AMD)

Note: This is a continuation of our earlier story on the Radeon RX 7900 series performance expectations and the RDNA 3 core GPU architecture.

Making breakthroughs and design choices

In an era where advancing processor performance is getting increasingly complex, it falls to clever engineers to extract more out of given design parameters and available technology. For example, progressing through successive wafer fabrication process nodes used to be a reliable way to get regular performance jumps, simply because you could pack more transistors into a given die size, which generally equates to more capabilities on the new chip. However, inflation over the years, plus the complexity of engineering each subsequent node, has steadily driven up the cost of silicon wafers and of achieving good yields on them.

In today's context, having excessive transistors in your silicon likely only adds to the cost and to the complexity of taming it to run efficiently without overheating and throttling. Plus, process node advancements no longer scale as uniformly in capability and performance as they once did at larger gate sizes: different silicon functions now exhibit varying gains, as AMD has captured in the diagram below.

(Image source: AMD)

AMD is well aware of these limitations. It debuted one of the more modern chiplet-based chip designs when it engineered the Zen 2 microarchitecture for its second-gen AMD EPYC processors, having noted that the DRAM and Infinity Fabric tend not to scale well with process shrinks. Separating the CPU chiplets from the I/O die allowed AMD to use an advanced process node for the core processor chiplets, which benefited from increased performance, while using more mature process nodes for the I/O interfaces. This enabled AMD to cram in more CPU cores at the same power and was more cost-effective to manufacture than traditional monolithic chip designs.

AMD's existing Navi 21 GPU (Radeon RX 6900 series) based on RDNA 2 GPU architecture is an extremely large and expensive monolithic chip at over 520mm2 and cramming 26.3 billion transistors on a 7nm TSMC process technology. It's brute force for sure, but it's not really efficient, and AMD needed to tackle this to put its best foot forward. As such, for AMD's RDNA 3 GPU microarchitecture, AMD looked towards their EPYC and Ryzen CPUs that continue to use the chiplet topology for inspiration. Could AMD tear its GPU into multiple dies?

 

The birth of the world's first chiplet gaming GPU | Breakthrough no.1

Dissecting the Navi 21 GPU core into Navi 31, the basis of RDNA 3's chiplet topology. (Image source: AMD)

Keeping the wafer fabrication process scaling limitations in mind, AMD's top minds chose to split off the components that scaled poorly, while moving ahead to shrink the graphics core (the Graphics Compute Die, or GCD) with TSMC's advanced N5 (5nm) process. Thus, the memory interfaces, memory controllers and improved Infinity Cache were reorganised into their own little chiplet, the MCD (Memory Cache Die). Each MCD has a 64-bit memory controller and is fabricated on a much more mature and cost-efficient 6nm process that's currently used on advanced memory chips. Multiple MCDs are combined to make up the GPU's specced memory bandwidth.

The net effect is that AMD has landed 165% more transistors per mm2 and managed to cram 58 billion transistors across a chiplet architecture with an RDNA 3 GPU that has a total die area equivalent to the single monolithic die used to cram 26 billion transistors on RDNA 2's GPU. That's a big upswing.

Radeon GPU dies compared

| | Radeon RX 7900 XTX | Radeon RX 7900 XT | Radeon RX 6950 XT |
|---|---|---|---|
| GPU Code | Navi 31 (RDNA 3) | Navi 31 (RDNA 3) | Navi 21 (RDNA 2) |
| Package type | Multi-chip Module (MCM): GCD + 6 MCD | Multi-chip Module (MCM): GCD + 5 MCD | Monolithic die |
| Process | 5nm (GCD) + 6nm (MCD) | 5nm (GCD) + 6nm (MCD) | 7nm |
| Full Die Area | 300mm2 (GCD) + 222mm2 (6 x 37) | 300mm2 (GCD) + 185mm2 (5 x 37) | 520mm2 |
| Transistors | 58 billion | 58 billion | 26 billion |
| Total Memory Bus size | 384-bit | 320-bit | 256-bit |
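
As a quick sanity check on the table, the bus width and total die area of each Navi 31 card follow directly from its MCD count. The Python sketch below is only an illustration: the 64-bit, 37mm2 and 300mm2 figures come from this article, while the `Navi31Config` class and its field names are hypothetical and do not correspond to any AMD tool or API.

```python
from dataclasses import dataclass

# Figures taken from the article and the table above.
MCD_BUS_BITS = 64     # memory controller width per MCD
MCD_AREA_MM2 = 37     # die area per MCD
GCD_AREA_MM2 = 300    # die area of the graphics compute die


@dataclass(frozen=True)
class Navi31Config:
    """Hypothetical helper describing one Navi 31 package variant."""
    name: str
    mcd_count: int

    @property
    def bus_width_bits(self) -> int:
        # Each MCD contributes one 64-bit memory controller.
        return self.mcd_count * MCD_BUS_BITS

    @property
    def total_die_area_mm2(self) -> int:
        # One GCD plus however many MCDs sit on the package.
        return GCD_AREA_MM2 + self.mcd_count * MCD_AREA_MM2


xtx = Navi31Config("Radeon RX 7900 XTX", mcd_count=6)
xt = Navi31Config("Radeon RX 7900 XT", mcd_count=5)

for card in (xtx, xt):
    print(f"{card.name}: {card.bus_width_bits}-bit bus, "
          f"{card.total_die_area_mm2}mm2 total die area")
```

Six MCDs give the 384-bit bus and 522mm2 of total silicon for the XTX, while five give 320-bit and 485mm2 for the XT, which is how the chiplet approach lets AMD scale the memory subsystem by adding or removing MCDs.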

Thus, the RDNA 3 GPU architecture is the first-ever chiplet topology GPU, and the Radeon RX 7900 XTX and Radeon RX 7900 XT are the first graphics cards to use it. In fact, the move to a chiplet-style architecture is such a big step that AMD specifically decided to mark the occasion by bringing back the "XTX" SKU for its top graphics card.

(Image source: AMD)

However, coming up with the right way to dissect the classic monolithic GPU is only part of the equation. How can all these chiplets communicate effectively with one another?

 

The Secret Sauce: Infinity Fanout Links | Breakthrough no.2

On AMD's CPUs, the Infinity Fabric's high-speed organic package links were the enabler, interconnecting the chiplets fast enough to meet CPU bandwidth requirements, with hundreds of signals crossing between the I/O die and the Core Complex Dies (CCDs).

(Image source: AMD)

However, AMD realized that even when tinkering with the Navi 21 RDNA 2 GPU architecture, inter-GPU shader engine communication requires a massive amount of connectivity that numbers tens of thousands of signals, plus much higher bandwidth needs than the processor cores on the EPYC and Ryzen CPUs. 

The engineers had to come up with a brand-new interface and interconnect, which is the fastest ever in the industry. Meet Infinity Links, operating at 9.2Gb/s with high-performance fanout to enable an industry-leading 5.3TB/s of chiplet interconnect bandwidth.
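
To get a feel for the scale of those two figures, here is a rough back-of-envelope sketch in Python. It rests on assumptions that are ours, not AMD's: decimal units (1TB = 10^12 bytes), every link running at the full 9.2Gb/s at the same time, and the peak bandwidth being shared evenly across the six MCDs of the full Navi 31 package. The function names are purely illustrative.

```python
LINK_RATE_GBPS = 9.2        # per-link rate quoted by AMD, in gigabits per second
PEAK_BANDWIDTH_TBPS = 5.3   # aggregate bandwidth quoted by AMD, in terabytes per second
MCD_COUNT = 6               # MCDs on the full Navi 31 package


def implied_link_count(peak_tb_per_s: float, link_gb_per_s: float) -> float:
    """Links needed to reach the peak figure if every link runs flat out.

    Assumption: decimal units and no protocol overhead.
    """
    peak_bits_per_s = peak_tb_per_s * 1e12 * 8
    return peak_bits_per_s / (link_gb_per_s * 1e9)


def per_mcd_bandwidth_gb_per_s(peak_tb_per_s: float, mcd_count: int) -> float:
    """Even share of the peak bandwidth per MCD, in gigabytes per second."""
    return peak_tb_per_s * 1000 / mcd_count


links = implied_link_count(PEAK_BANDWIDTH_TBPS, LINK_RATE_GBPS)
share = per_mcd_bandwidth_gb_per_s(PEAK_BANDWIDTH_TBPS, MCD_COUNT)

print(f"Implied data links at full rate: about {links:,.0f}")
print(f"Even share per MCD: about {share:.0f}GB/s")
```

Under these assumptions, the quoted figures imply several thousand data links in total and close to 900GB/s per MCD. These are estimates derived from the headline numbers, not specifications published by AMD.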

Yet another industry first for a high-speed chiplet interconnect. (Image source: AMD)

Undoubtedly, the Infinity Links are as crucial to the story as the chiplet architecture itself. They are engineered for low-voltage operation and support aggressive clock gating to conserve power. The result is a highly energy-efficient interconnect that uses 80% less energy per bit than the organic package links used for the Infinity Fabric on AMD's CPUs, and that consumes a net total of only 5% of the GPU's overall power.

(Image source: AMD)

 

Engineering the card for practicality

The engineering and design choices don't stop at what's deep under the hood; even the physical design of the product has been carefully thought through. For example, the power envelope of the card was key, and so was AMD's design goal of making it as easily compatible with existing systems as possible.

(Image source: AMD)

So while it might seem like AMD was purposely holding back after noticing the current debacle where some GeForce RTX 4090 owners were facing problems with the new 16-pin 12VHPWR (PCIe Gen 5) power connector, the Radeon RX 7900 series was on the drawing board more than a year before all this began.

Simply put, AMD designed the Radeon RX 7900 series to use the standard PCIe 8-pin power connector all along. With a total board power of 355W for the Radeon RX 7900 XTX and 300W for the RX 7900 XT, AMD says you will likely only need a standard 800W or 750W PSU respectively to run the card smoothly with the rest of your system. That's pretty much what anyone with a high-power graphics card from the last few years would already have, so it's unlikely you'll need a new PSU unless you're upgrading from a much lower tier of graphics card.
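
For readers who want to check the numbers, the sketch below compares each card's total board power against what two 8-pin connectors plus the PCIe slot can nominally supply. The board power and PSU figures are AMD's, as quoted above, and the two-connector count is that of the reference card. The 150W-per-connector and 75W slot limits are the commonly cited PCIe ratings and are our assumption here, not something AMD states in its materials. The helper name is illustrative.

```python
# Nominal PCIe power limits (assumption: standard ratings, not from AMD's slides).
PCIE_SLOT_W = 75     # power available through the x16 slot
PCIE_8PIN_W = 150    # power available per 8-pin auxiliary connector

EIGHT_PIN_CONNECTORS = 2  # as fitted on the reference Radeon RX 7900 cards

# name -> (total board power in W, AMD's recommended PSU in W), from the article
CARDS = {
    "Radeon RX 7900 XTX": (355, 800),
    "Radeon RX 7900 XT": (300, 750),
}


def connector_budget_w(eight_pin_count: int) -> int:
    """Nominal power deliverable via the slot plus N 8-pin connectors."""
    return PCIE_SLOT_W + eight_pin_count * PCIE_8PIN_W


budget = connector_budget_w(EIGHT_PIN_CONNECTORS)

for name, (board_power, psu) in CARDS.items():
    connector_headroom = budget - board_power   # margin within connector limits
    rest_of_system = psu - board_power          # PSU capacity left for CPU, drives, etc.
    print(f"{name}: {board_power}W of {budget}W connector budget "
          f"({connector_headroom}W spare), {rest_of_system}W left for the rest of the system")
```

On these assumptions both cards fit within the nominal 375W budget of two 8-pin connectors and the slot, though the XTX does so with only 20W to spare.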

That said, we'll let the images and slide shots from AMD do the talking, as there are a lot of finer points, but they are quite clearly captured here on the refinements done over the predecessor:-

Note that this represents the Radeon RX 7900 XTX, but the RX 7900 XT is fairly similar -- the former has 20 power phases for power management and delivery while the latter sports 17 power phases. (Image source: AMD)

(Image source: AMD)

Note that the Radeon RX 7900 XTX uses triple 85mm fans while the RX 7900 XT takes it down a notch with triple 80mm fans. (Image source: AMD)

You'll note that the overall design hasn't changed much from the Radeon RX 6900 and 6800 series, but it has been enhanced with a number of small updates. After all, if the total board power of the newcomers isn't much different from that of their predecessors, there's no real need to shake things up. Plus, AMD has kept its cards more compact than the competition they are aimed at:-

The new Radeon cards might not look any different or impressive, but they are practically designed to deliver what matters most. (Image source: AMD)

Here are our close-up photos of AMD's flagship Radeon RX 7900 XTX in person:-

Don't let the plain looks bother you. It'll probably fit better than NVIDIA's competitive product in any given system.

A Logitech MX Anywhere 2S is placed next to the card for scale, which clearly shows that the Radeon RX 7900 XTX is quite reasonably sized and nowhere near as large as the NVIDIA GeForce RTX 40 series. However, the Radeon RX 7900 XTX is still a very hefty product. Don't let the plain looks fool you.

A thick aluminium die-cast backplate helps improve the PCB rigidity (and adds to the weight too). Overall, you'll note that the heat from the card will be exhausted back into the chassis, which is also where the triple fans force-feed air from the chassis to cool the card. It remains to be seen how well this air cooler design can cope with high-performance graphics cards in a sealed chassis environment, which we fully intend to find out in due time.

Power requirements are taken care of by two standard 8-pin PCIe auxiliary power connectors. Nothing fancy, and far older systems can easily welcome this new graphics powerhouse as long as they have a PSU rated higher than 750W (preferably 800W for the XTX model).

Controllable RGB light strips are in too.

 

Future-proofing with new display standards | Breakthrough no.3

The new Radeon RX 7900 series will also be the first enthusiast graphics cards to support the DisplayPort 2.1 standard, a big leap over everything else on the market, which still only supports DisplayPort 1.4. Adding to that, the new graphics cards will have two dedicated DP 2.1 ports, a USB Type-C port that also supports DP 2.1 via DisplayPort Alternate Mode, and an HDMI 2.1 port. Pretty radical, considering these will be the first cards to support the specification.

There you have it, something that AMD can boast that NVIDIA's new card doesn't yet support. (Image source: AMD)

So, what do you actually gain out of a card that supports DP 2.1? Far higher refresh rates and resolutions for next-gen display experiences. Think 480Hz at 4K resolution or even 165Hz at 8K resolution. However, this spec is more forward-looking than immediately useful, as there aren't any screens that support the standard - yet. We hear that CES 2023 will be where display makers such as ASUS, Acer, Dell, LG and Samsung will show off their top-of-the-line screens with crazy new capabilities thanks to DP 2.1 support.

As crazy as those refresh rate figures seem, note that the end goal is to push out really smooth and silky gameplay through really high refresh rates with fantastic quality, such as 4K480 or 8K165. (Image source: AMD)

Samsung's next-gen Odyssey Neo G9 gaming monitor will be the first 8K ultrawide display with high refresh rates to need DisplayPort 2.1 bandwidth. Watch out for it at CES 2023. (Image Source: AMD)

DP 2.1 is basically a slightly updated standard from the more important DisplayPort 2.0 spec with official support for DP40 and DP80 cables, along with improved USB 4.0 Type-C spec support. The most important aspect is Ultra-High Bit Rate (UHBR) transmission modes that DP 2.0 debuted. There are three transmission modes:-

  • UHBR 10 = 10Gbps transmission rates per lane
  • UHBR 13.5 = 13.5Gbps transmission rates per lane
  • UHBR 20 = 20Gbps transmission rates per lane.
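
To put those per-lane figures in context, here is a minimal Python sketch that multiplies each UHBR rate by DisplayPort's four lanes and sets the result against the uncompressed pixel data rate of the headline display modes. The pixel-rate estimate is deliberately simplified and is our assumption: it counts active pixels only at 24 bits per pixel, ignoring blanking intervals and link encoding overhead. The function names are illustrative.

```python
DP_LANES = 4  # lanes in a DisplayPort connection

# Per-lane transmission rates in Gbps, from the list above.
UHBR_PER_LANE_GBPS = {
    "UHBR 10": 10.0,
    "UHBR 13.5": 13.5,
    "UHBR 20": 20.0,
}


def raw_link_gbps(mode: str) -> float:
    """Raw link bandwidth for a UHBR mode across all four lanes."""
    return UHBR_PER_LANE_GBPS[mode] * DP_LANES


def uncompressed_video_gbps(width: int, height: int, refresh_hz: int,
                            bits_per_pixel: int = 24) -> float:
    """Active-pixel data rate in Gbps.

    Assumption: no blanking, no encoding overhead, no compression.
    """
    return width * height * refresh_hz * bits_per_pixel / 1e9


for mode in UHBR_PER_LANE_GBPS:
    print(f"{mode}: {raw_link_gbps(mode):.0f}Gbps raw link bandwidth")

for label, (w, h, hz) in {"4K480": (3840, 2160, 480),
                          "8K165": (7680, 4320, 165)}.items():
    print(f"{label}: about {uncompressed_video_gbps(w, h, hz):.1f}Gbps uncompressed")
```

By this rough estimate, 4K480 works out to about 95.6Gbps and 8K165 to about 131.4Gbps before any compression, both above the raw bandwidth of even the fastest UHBR mode. That suggests such modes depend on some form of stream compression, an inference from the arithmetic rather than a figure from AMD.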

With a total of four lanes within the DisplayPort connection, the total bandwidth supported is 40Gbps, 54Gbps and 80Gbps, respectively. The Radeon RX 7900 series implements DisplayPort 2.1 at UHBR 13.5, giving its DP 2.1 ports a maximum display bandwidth of 54Gbps to handle the crazy throughput required for ultra-high refresh rates at 4K and 8K resolutions while gaming.

Roughly what kind of performance can you expect, since there are so many more pixels to crunch and at faster refresh rates? The performance data below was compiled by AMD to give you an idea, but it also alludes to the Radeon RX 7900 XTX relying on FidelityFX Super Resolution (FSR) to give it the necessary boost.

 (Image Source: AMD)

 (Image Source: AMD)

 

When can you experience the next-gen AMD gaming experience?

So all this sounds promising, and RDNA 3 could be the real deal we've been expecting from AMD. When can we actually see real-world performance from in-house testing? Stay tuned, as that's likely going to be closer to the launch of the Radeon RX 7900 series, which is slated for availability on 13th December 2022.
