AMD Threadripper 2990WX preview: Just how fast is 32 cores?
A peek at the performance of AMD's latest 32-core monster chip.
New generation, twice the number of cores
One year after AMD released its 16-core Threadripper 1950X, the company is back with more. While the second-generation Threadripper processors focus on fixing the weaknesses of their predecessors, the flagship products also cram in more cores than ever.
In a single generation, AMD has doubled the number of cores from 16 to 32, a clear sign that its multi-die design and Infinity Fabric interconnect are starting to pay off handsomely. It’s now miles ahead of Intel in terms of multi-threaded performance, while the latter is still at the point of showcasing hypothetical 28-core overclocked CPUs on stage.
The point is that Intel isn’t anywhere near a multi-core consumer CPU that can properly challenge AMD’s 32-core monster. In the meantime, AMD is already making a play for creative professionals with a shipping product that has access to a wide ecosystem of motherboards and coolers.
What’s new in the way AMD is organizing its Threadripper line-up?
The second-generation Threadripper processors can be divided into two families – the regular X series and a new WX series.
From a marketing standpoint, AMD says the two WX chips are for “creators and innovators”, while the X series chips are for “enthusiasts and gamers”. Of course, those are merely differences in target audience, and there’s nothing stopping a gamer from going for the flagship Threadripper 2990WX as all the chips drop into the same X399 motherboards.
Still, this differentiation is new for AMD, and by creating a product family expressly geared for workstations, AMD is probably hoping to appeal to even more hardcore content creators.
Image Source: AMD
How does the architecture of the new Threadripper chips differ from their predecessors?
At their core, not much has changed. They are based on an optimized Zen+ architecture, which features a series of tweaks aimed at improving performance in latency-sensitive tasks.
These changes can be summarized as follows:
- 15% reduction in L3 cache latency
- 9% reduction in L2 cache latency
- 8% reduction in L1 cache latency
- 2% reduction in DRAM latency
On top of that, AMD has added official support for JEDEC DDR4-2933, up from DDR4-2667 previously.
In addition, the new chips use GlobalFoundries’ newer 12nm process, which offers transistor performance that is around 10 to 15 per cent better than preceding nodes, according to AMD. While 14nm Zen was optimized for transistor density, 12nm Zen+ was optimized for power and efficiency.
Image Source: AMD
This allows for an extended range of clock speeds and lowers the core voltage required at any given frequency by roughly 80 to 120mV, which in turn reduces current draw. In practice, this means that the Threadripper 2950X chip can clock up to 4.4GHz, compared to the 4.0GHz on the 1950X.
Are there any new features?
The second-generation Threadripper processors support Precision Boost 2, which is a new and more opportunistic frequency boosting algorithm governed by built-in temperature, current, and clock speed limits.
While these limits are on the more conservative side by default to accommodate trying conditions such as hotter and more humid climates, users will be able to exceed the recommended thermal specification with a beefier cooler.
This will allow the processor to boost higher, and the new Extended Frequency Range 2 (XFR2) feature will let the chip run at a higher average frequency. While this capability was restricted to just a small number of cores with the first-generation Ryzen products, XFR2 now operates across any number of cores and threads.
How did AMD manage to go from 16 cores to 32 so quickly?
At its heart, Zen was designed around four key focus areas, namely, performance, throughput, efficiency, and scalability.
The last of these forms the crux of how AMD has been able to so easily double the number of cores in just a single generation. Zen comprises modular 4-core CPU complexes, or CCXes, that are attached to the Infinity Fabric interconnect. This modular approach allows AMD to efficiently scale core, thread, and cache quantities to cater to a range of client, server, and HPC demands.
Each CCX is a natively quad-core module. Each core has 64KB of L1 instruction cache, 32KB of L1 data cache, and 512KB of dedicated L2 cache. The CCX also has 8MB of L3 cache shared across its four cores.
More than one CCX can be present in a Zen-based product. In the case of the first-generation Ryzen Threadripper, there are two dies containing two CCXes per die for a total of 16 cores. Individual cores in each CCX can be symmetrically disabled to allow for products with fewer cores. For example, the Threadripper 1920X has a 3+3+3+3 configuration.
On the other hand, the 32-core Ryzen Threadripper 2990WX contains a total of four active dies, each with two CCXes. However, only Dies 0 and 2 (also referred to as the IO Dies) provide PCIe lanes and memory channels, with each die supplying 32 PCIe 3.0 lanes and two memory channels.
The other two dies are Compute Dies without local PCIe or DRAM access. Instead, they receive DRAM and PCIe access from the IO Dies via the Infinity Fabric. All dies are connected via a mesh topology at approximately 25GBps.
The two IO Dies (0,2) each provide 32 PCIe 3.0 lanes and two memory channels. (Image Source: AMD)
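The core, cache, and I/O totals quoted above follow directly from the per-CCX and per-die figures. The short Python sketch below simply multiplies them out. The `Chip` class and its field names are hypothetical, and it assumes two threads per core, that disabling cores leaves the L3 cache untouched, and that both first-generation dies have local I/O.

```python
from dataclasses import dataclass

# Per-CCX and per-die figures quoted in the article.
CCX_PER_DIE = 2              # two CCXes per die
CORES_PER_CCX = 4            # each CCX is natively quad-core
L3_MB_PER_CCX = 8            # 8MB of L3 shared within a CCX
PCIE_LANES_PER_IO_DIE = 32
MEM_CHANNELS_PER_IO_DIE = 2
THREADS_PER_CORE = 2         # assumption: SMT enabled (32 cores, 64 threads)


@dataclass(frozen=True)
class Chip:
    """Hypothetical description of a Threadripper package."""
    name: str
    active_dies: int
    io_dies: int                        # dies with local PCIe and DRAM access
    cores_per_ccx: int = CORES_PER_CCX  # lower when cores are disabled symmetrically

    @property
    def cores(self) -> int:
        return self.active_dies * CCX_PER_DIE * self.cores_per_ccx

    @property
    def threads(self) -> int:
        return self.cores * THREADS_PER_CORE

    @property
    def l3_mb(self) -> int:
        # Assumption: disabled cores do not reduce the L3 cache.
        return self.active_dies * CCX_PER_DIE * L3_MB_PER_CCX

    @property
    def pcie_lanes(self) -> int:
        return self.io_dies * PCIE_LANES_PER_IO_DIE

    @property
    def memory_channels(self) -> int:
        return self.io_dies * MEM_CHANNELS_PER_IO_DIE


CHIPS = [
    # Assumption: both dies on first-generation parts have local I/O.
    Chip("Threadripper 1950X", active_dies=2, io_dies=2),
    Chip("Threadripper 1920X", active_dies=2, io_dies=2, cores_per_ccx=3),
    Chip("Threadripper 2990WX", active_dies=4, io_dies=2),
]

for chip in CHIPS:
    print(
        f"{chip.name}: {chip.cores} cores / {chip.threads} threads, "
        f"{chip.l3_mb}MB L3, {chip.pcie_lanes} PCIe lanes, "
        f"{chip.memory_channels} memory channels"
    )
```

Note that the 2990WX doubles the cores and L3 cache of the 1950X, but its PCIe lane and memory channel counts stay the same because only two of its four dies have local I/O.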
Other than their core counts, how does the 32-core Ryzen Threadripper 2990WX differ from the 16-core Ryzen Threadripper 2950X?
The Ryzen Threadripper 2950X has a more straightforward topology consisting of just two active dies, each with eight cores, two memory channels, and 32 PCIe 3.0 lanes.
The Threadripper 2950X has just two active dies. (Image Source: AMD)
In addition, the 2950X’s topology may be configured into either one large uniform memory access (UMA) domain or two separate non-uniform memory access (NUMA) domains. In UMA mode, threads and DRAM transactions are distributed evenly across the chip to maximize bandwidth. However, in NUMA mode, memory latency is minimized by trying to pair active cores and local DRAM together.
This flexibility allows the 2950X to be configured to suit either gaming or creative performance.
In comparison, the Threadripper 2990WX is targeted at professional content creators, so it is exclusively configured as a NUMA solution.
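As a rough illustration of the difference between the two modes, the toy Python model below decides which die's DRAM serves each memory page. The function names, the page-level round-robin, and the fallback rule for dies without local memory are all assumptions made for this sketch. Real hardware interleaves at a finer granularity, and the operating system's scheduler and memory allocator make the actual NUMA placement decisions.

```python
from collections import Counter


def die_for_page(mode: str, page: int, core_die: int, memory_dies: tuple) -> int:
    """Return the die whose DRAM serves `page` in this toy model."""
    if mode == "UMA":
        # Spread pages round-robin over every die with memory attached,
        # trading latency for bandwidth.
        return memory_dies[page % len(memory_dies)]
    if mode == "NUMA":
        # Prefer DRAM attached to the requesting core's own die.
        if core_die in memory_dies:
            return core_die
        # A die without local DRAM has to reach another die over the fabric
        # (toy choice: always the first memory die).
        return memory_dies[0]
    raise ValueError(f"unknown mode: {mode}")


def placement(mode: str, core_die: int, memory_dies: tuple, pages: int = 8) -> Counter:
    """Count how many of `pages` land on each die for a core on `core_die`."""
    return Counter(
        die_for_page(mode, page, core_die, memory_dies) for page in range(pages)
    )


# 2950X-style layout: two dies, both with local memory.
print("UMA :", dict(placement("UMA", core_die=1, memory_dies=(0, 1))))
print("NUMA:", dict(placement("NUMA", core_die=1, memory_dies=(0, 1))))

# 2990WX-style layout: only dies 0 and 2 have memory; die 1 is a compute die.
print("NUMA, compute die:", dict(placement("NUMA", core_die=1, memory_dies=(0, 2))))
```

In this model, UMA splits a core's pages evenly between both dies, while NUMA keeps them all on the local die. A core on a compute die has no local option at all, so every access crosses the Infinity Fabric.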
Finally, will I be able to use my new Threadripper processor in an older X399 motherboard?
Image Source: AMD
Yes. All AMD X399 motherboards are fully compatible with the entire second-generation Threadripper line-up, including the 2990WX. However, you will probably need to update the motherboard BIOS.
Fortunately, all available X399 boards support USB flashback for easier BIOS updates, regardless of the processor in the socket. This means all existing X399 boards are effectively drop-in ready for the new Threadripper chips.
Test setup
The configurations of the test setups we used for the different processors are listed below.
AMD Ryzen Threadripper 2
- AMD Ryzen Threadripper 2990WX (3.0GHz, 64MB L3 cache)
- Enermax Liqtech TR4 240
- ASUS ROG Zenith Extreme
- 4 x 4GB G.Skill Flare X DDR4-3200 (Auto timings: CAS 14-14-14-34)
- ASUS ROG Strix GeForce GTX 1080 Ti
- Samsung 850 Pro 250GB SSD
- Windows 10 Home (64-bit)
AMD Ryzen Threadripper
- AMD Ryzen Threadripper 1920X (3.5GHz, 32MB L3 cache) / 1950X (3.4GHz, 32MB L3 cache)
- Thermaltake Floe Riing 360 TT Premium Edition
- ASUS ROG Zenith Extreme
- 4 x 4GB Corsair Vengeance LPX DDR4-2666 (Auto timings: CAS 15-17-17-35)
- NVIDIA GeForce GTX 1080 Ti (GeForce Driver Version 384.94)
- Samsung 850 EVO 250GB SSD
- Windows 10 Home (64-bit)
Intel Core X
- Intel Core i9-7900X (3.3GHz, 13.75MB L3 cache) / Intel Core i7-7740X (4.3GHz, 8MB L3 cache)
- Cooler Master MasterLiquid 240
- Gigabyte X299 Aorus Gaming 9
- 4 x 4GB Corsair Vengeance LPX DDR4-2666 (Auto timings: CAS 15-17-17-35)
- NVIDIA GeForce GTX 1080 Ti (GeForce Driver Version 384.94)
- Samsung 850 EVO 250GB SSD
- Windows 10 Home (64-bit)
Cinebench R15
Cinebench R15 utilizes up to 256 threads to evaluate a processor’s performance in photorealistic 3D rendering.
Unsurprisingly, the Threadripper 2990WX charted a blazing path here and it was around 53 per cent quicker than the 16-core Threadripper 1950X in the multi-threaded benchmark.
However, the single-threaded benchmark is where Intel's Core i9-7900X continues to shine.
SPECviewperf 12.1
SPECviewperf is used to assess the 3D graphics performance of systems in professional applications. Each individual workload, called a viewset, represents graphics and content from an actual real-world application. SPECviewperf runs a total of eight different viewsets, but we’ve picked the four with the greatest performance variation across CPUs to display here.
The 3ds-max viewset comes from traces of the graphics workload generated by 3ds Max 2016, while maya-04 is derived from Autodesk’s Maya 2013 application. The catia-04 viewset involves the numerous rendering modes from the CATIA V6 R2012 application, and includes things like anti-aliasing, depth of field, and ambient occlusion. Finally, the sw-03 viewset comes from SolidWorks 2013 SP1, and involves various rendering modes including environment maps.
The Threadripper 2990WX didn't do well here, falling behind the 1950X and Intel Core i9-7900X in a couple of benchmarks. It also seems like select viewsets such as maya-04 heavily favor the higher clock speeds offered by the Intel Core X chip, as both Threadripper chips ended up lagging behind.
Handbrake
Handbrake is a video transcoder that converts videos into a format for use on PCs and portable electronic devices, and is a good indicator of a processor’s video encoding capabilities. YouTube content creators, Twitch streamers, and other video creators will be most interested in this performance metric.
The 64 threads on the Threadripper 2990WX helped it edge ahead here while transcoding a 1.5GB .mkv file. It's a good bit faster than the 1950X and Core i9-7900X, even though we've observed that Handbrake tends to show diminishing returns beyond six cores.
Final thoughts
These benchmarks offer just a peek at a slice of the Threadripper 2990WX's performance, but its Cinebench score leaves little doubt that it is a multi-threading beast for the professional applications that can fully utilize it.
However, it doesn't always come out ahead. It trails in some SPECviewperf 12.1 viewsets, and its relatively low 3.0GHz base clock might still see it losing out in certain workloads. That said, 3.0GHz is a very respectable base clock for a 32-core CPU, and the Threadripper 2990WX certainly looks like it has what it takes to deliver the performance its target audience needs while not compromising too much on single-threaded performance.
At S$2,738, this isn't a chip that even a hardcore gaming enthusiast would pay for. Instead, it's professionals whose work revolves around 3D and VR animation and simulation, character modeling, massive virtualization, and professional 3D ray tracing and rendering who will find the greatest utility in this processor.
We'll have a full review coming soon, so stay tuned for a more comprehensive look at the 2990WX's performance.