
NVIDIA's pro-grade GPUs adopt Ampere architecture for massive performance leaps

By Vijay Anand - on 13 Apr 2021, 1:27am


Note: This article was first published on 13 Apr 2021.

Simply put, it's a great time to be a creator.

Today's announcement at NVIDIA's Digital GTC 2021 finally upgrades the Turing-based Quadro RTX solutions (for both desktop and mobile) to the newer Ampere architecture that gamers have been lapping up via the GeForce RTX 3000 series for a while now (or at least as much as retail stocks would allow).

Just like the 2x generational performance leap seen on the GeForce RTX 3000 series, the new NVIDIA RTX A5000 and RTX A4000 GPUs take over the Quadro RTX lineup with the following key features derived from the Ampere architecture:-

  • Second-gen RT cores for twice the throughput of the previous generation with the ability to run concurrent ray tracing, shading and denoising tasks.
     
  • Third-gen Tensor cores also double throughput for AI inferencing and deep learning tasks thanks to the new TF32 and BFloat16 data formats, and these gains go up to 10x with structural sparsity to improve efficiency. Read more about these data formats and structural sparsity in our Ampere architecture overview.
     
  • Many more CUDA cores, thanks to Ampere's newer 8nm lithography process by Samsung, as opposed to Turing's older 12nm FinFET process. This, in turn, allows the GPU to process up to 2.5x the FP32 operations, greatly increasing its graphics and compute workload capacity.
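
As a rough sanity check of the FP32 claim, the sketch below estimates theoretical peak FP32 throughput as 2 x CUDA cores x boost clock, counting one fused multiply-add per core per clock as two operations. The function name is hypothetical, and the inputs are the RTX A6000 and Quadro RTX 6000 figures from the comparison table in this article, since no clock speeds are listed for the RTX A4000 and A5000. NVIDIA's "up to 2.5x" figure may well be derived differently.

```python
def peak_fp32_tflops(cuda_cores: int, boost_mhz: float) -> float:
    """Theoretical peak FP32 TFLOPS, assuming one FMA (2 ops) per core per clock."""
    ops_per_second = 2 * cuda_cores * boost_mhz * 1e6
    return ops_per_second / 1e12


# CUDA core counts and boost clocks as listed in this article's table.
rtx_a6000 = peak_fp32_tflops(cuda_cores=10752, boost_mhz=1860)
quadro_rtx_6000 = peak_fp32_tflops(cuda_cores=4608, boost_mhz=1770)
ratio = rtx_a6000 / quadro_rtx_6000

print(f"RTX A6000:       {rtx_a6000:.1f} TFLOPS")
print(f"Quadro RTX 6000: {quadro_rtx_6000:.1f} TFLOPS")
print(f"Generational ratio: {ratio:.2f}x")
```

Under these assumptions the ratio lands a little under 2.5x, which is in line with the "up to 2.5x" claim.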

The NVIDIA RTX A4000 can be had in a single-slot form factor. Considering that it's a close cousin of the GeForce RTX 3070, that's impressive.

The desktop-bound RTX A4000 and A5000 GPUs support PCIe 4.0 connectivity and double the graphics memory of their predecessors, with 16GB and 24GB of GDDR6 respectively, both with ECC memory support. Of the two, only the A5000 is bestowed with NVLink, which can pair two of these cards for a total GPU memory of 48GB. The RTX A5000 is also powerful enough to support NVIDIA RTX vWS software for multiple virtual workstation instances that enable remote users to share resources and drive high-end design and compute workloads.

Speed is everything when we need to evaluate new concepts for the most adventurous vehicles, and the NVIDIA RTX A5000 really delivers what we need. The basic viewport rendering is incredibly fast in Octane Render — 5x faster — and unlocks things we couldn’t have even tried before. -- Erick Green, 3D / CGI Lead, Polaris

The RTX A4000 and A5000 round out the pro-visualization lineup, which launched with the RTX A6000 in late 2020. Here's how they stack up against each other and against older GPUs:-

| Graphics Card | RTX A4000 | RTX A5000 | RTX A6000 | GeForce RTX 3080 | Quadro RTX 6000 | GeForce RTX 2080 Ti |
| --- | --- | --- | --- | --- | --- | --- |
| GPU | Ampere (GA104) | Ampere (GA103) | Ampere (GA102) | Ampere (GA102) | Turing (TU102) | Turing (TU102) |
| Process | 8nm (Samsung) | 8nm (Samsung) | 8nm (Samsung) | 8nm (Samsung) | 12nm FinFET | 12nm FinFET |
| Die Size (mm²) | 392 | 628 | 628 | 628 | 754 | 754 |
| Transistors | 17.4 billion | 28 billion | 28 billion | 28 billion | 18.6 billion | 18.6 billion |
| CUDA Cores | 6144 | 8192 | 10752 | 8704 | 4608 | 4352 |
| Tensor Cores | 192 | 256 | 336 | 336 | 576 | 544 |
| Tensor Performance | 153.4 TFLOPS | 222.2 TFLOPS | 238 TFLOPS | 238 TFLOPS | 130 TFLOPS | 89 TFLOPS |
| RT Cores | 48 | 64 | 84 | 84 | 72 | 68 |
| RT Performance | 37.4 TFLOPS | 54.2 TFLOPS | 58 TFLOPS | 58 TFLOPS | ? | 34 RT TFLOPS |
| GPU base / boost clock speeds | - | - | 1455MHz / 1860MHz | 1440MHz / 1710MHz | 1400MHz / 1770MHz | 1350MHz / 1545MHz |
| Memory | 16GB GDDR6 with ECC | 24GB GDDR6 with ECC | 48GB GDDR6 with ECC | 10GB GDDR6X | 24GB GDDR6 | 11GB GDDR6 |
| Memory clock speed | 1.75Gbps | 2.0Gbps | 2.0Gbps | 2.375Gbps | 14,000MHz | 14,000MHz |
| Memory bus width | 256-bit | 384-bit | 384-bit | 320-bit | 384-bit | 352-bit |
| Memory bandwidth | 448GB/s | 768GB/s | 768GB/s | 760GB/s | 672GB/s | 616GB/s |
| TDP | 140W | 230W | 300W | 320W | 295W | 250W |
| Price | -- | -- | US$4,694 | US$699 | US$6,300 | US$999 |
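
The memory bandwidth figures in the table follow from the bus width and the per-pin data rate: bandwidth (GB/s) = bus width (bits) x data rate (Gbps) / 8. The sketch below reproduces them. It assumes the memory clock figures listed for the Ampere cards need to be multiplied by 8 to get the effective per-pin data rate, and that the 14,000MHz listed for the Turing cards corresponds to 14Gbps. Both are interpretations that happen to match the listed bandwidths rather than anything stated by NVIDIA here, and the function name is hypothetical.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, effective per-pin data rate in Gbps)
# Ampere entries: listed memory clock x 8 (assumption); Turing entries: 14 Gbps.
cards = {
    "RTX A4000": (256, 1.75 * 8),
    "RTX A5000": (384, 2.0 * 8),
    "RTX A6000": (384, 2.0 * 8),
    "GeForce RTX 3080": (320, 2.375 * 8),
    "Quadro RTX 6000": (384, 14.0),
    "GeForce RTX 2080 Ti": (352, 14.0),
}

bandwidths = {
    name: memory_bandwidth_gb_s(width, rate) for name, (width, rate) in cards.items()
}

for name, gb_s in bandwidths.items():
    print(f"{name}: {gb_s:.0f} GB/s")
```

The same formula also matches the bandwidth figures listed for the laptop GPUs further down.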

 

For creators on the move

For professionals on the go needing thin and light form factors, the new NVIDIA RTX A2000, NVIDIA RTX A3000, RTX A4000 and RTX A5000 laptop GPUs deliver accelerated performance without compromising mobility. They include the latest generations of Max-Q 3.0 and RTX technologies and are backed by the NVIDIA Studio ecosystem, which includes exclusive driver technology that enhances creative apps for optimal levels of performance and reliability.

| Graphics Card | RTX A5000 Laptop | RTX A4000 Laptop | RTX A3000 Laptop | RTX A2000 Laptop | T1200 Laptop | T600 Laptop |
| --- | --- | --- | --- | --- | --- | --- |
| GPU | Ampere (GA104) | Ampere (GA104) | Ampere (GA106) | Ampere (GA107) | Turing (TU117) | Turing (TU117) |
| Process | 8nm (Samsung) | 8nm (Samsung) | 8nm (Samsung) | 8nm (Samsung) | 12nm FinFET | 12nm FinFET |
| CUDA Cores | 6144 | 5120 | 4096 | 2560 | 1024 | 896 |
| Tensor Cores | 192 | 160 | 128 | 80 | NIL | NIL |
| Tensor Performance | 174 TFLOPS | 142.5 TFLOPS | 102.2 TFLOPS | 74.7 TFLOPS | NIL | NIL |
| RT Cores | 48 | 40 | 32 | 20 | NIL | NIL |
| RT Performance | 75.6 TFLOPS | 34.8 TFLOPS | 25 TFLOPS | 18.2 TFLOPS | NIL | NIL |
| GPU base / boost clock speeds | - | - | - | - | - | - |
| Memory | 16GB GDDR6 | 8GB GDDR6 | 6GB GDDR6 | 4GB GDDR6 | 4GB GDDR6 | 4GB GDDR6 |
| Memory clock speed | 1.75Gbps | 1.5Gbps | 1.375Gbps | 1.5Gbps | 1.5Gbps | 1.25Gbps |
| Memory bus width | 256-bit | 256-bit | 192-bit | 128-bit | 128-bit | 128-bit |
| Memory bandwidth | 448GB/s | 384GB/s | 264GB/s | 192GB/s | 192GB/s | 160GB/s |
| TGP | 80 - 165W | 80 - 140W | 60 - 130W | 35 - 95W | 35 - 95W | 25W |

NVIDIA also introduced the NVIDIA T1200 and NVIDIA T600 laptop GPUs, based on its previous-generation Turing architecture. These are designed for multi-application professional workflows and are a significant upgrade in performance and capabilities from integrated graphics.

 

Delivering cutting edge graphics, video and AI services through enterprise servers

The NVIDIA A10 GPU for enterprise server deployment.

What if you're in an organisation that relies on enterprise server solutions for shared utilization of CPU, GPU and AI performance? That's when you'll need a vendor who can deploy an accelerator to meet the cutting-edge needs of designers, engineers, artists, scientists, and others who may not be equipped with desktops or laptops running the RTX GPU solutions covered above.

Enter the NVIDIA A10 Tensor Core GPU, which combines with NVIDIA RTX Virtual Workstation (vWS) software to deliver modern compute performance to clients from within an enterprise server. Featuring the same Ampere architecture as some of the above solutions and 24GB of GDDR6 memory, the A10 is a single-slot, full-height, full-length card designed with a 150W TDP. With 72 RT Cores, the NVIDIA A10 seems to be using a variant of the GA102 GPU core that's used on the GeForce RTX 3080 and RTX A6000 GPUs.

NVIDIA says the A10 can deliver up to 2.5x faster virtual workstation performance and inference performance over the NVIDIA T4. The NVIDIA A10 is supported as part of NVIDIA-Certified Systems, in the on-prem data center, in the cloud, and at the edge. It builds on the rich ecosystem of AI frameworks from the NVIDIA NGC catalog, CUDA-X libraries, over 2.3 million developers, and over 1,800 GPU-optimized applications to help enterprises solve the most critical challenges in their business.

For enhanced virtual desktop infrastructure (VDI) deployment for remote workers, NVIDIA also has the A16 GPU solution. The A16 crams four Ampere GPUs on one board, each with 16GB of graphics memory for a total of 64GB onboard. With the right virtual PC configuration, the A16 can support up to 64 concurrent users per board. NVIDIA also says the A16 delivers an experience indistinguishable from a physical PC, which allows remote workers to seamlessly transition between working at the office and at home.
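
To put the A16's numbers in perspective, the short sketch below works out what a fully loaded board implies per user. It assumes the frame buffer is split evenly across users with no overhead, which is a simplification for illustration and not a statement of how NVIDIA's virtual PC profiles are actually configured. The function and its parameter names are hypothetical.

```python
def a16_per_user_share(gpus_per_board: int = 4,
                       memory_per_gpu_gb: int = 16,
                       users_per_board: int = 64) -> dict:
    """Even split of A16 board resources across concurrent users (assumption)."""
    total_memory_gb = gpus_per_board * memory_per_gpu_gb
    return {
        "total_memory_gb": total_memory_gb,
        "users_per_gpu": users_per_board // gpus_per_board,
        "memory_per_user_gb": total_memory_gb / users_per_board,
    }


share = a16_per_user_share()
print(share)
```

With the figures quoted above, that works out to 16 users per GPU and 1GB of graphics memory per user at the maximum user count.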

 

Market Availability

The new NVIDIA RTX desktop GPUs and NVIDIA data centre GPUs will be available from global distribution partners and OEMs starting later this month.

The new NVIDIA RTX laptop GPUs will be available in mobile workstations anticipated in Q2 this year from global OEMs.

Meanwhile, the NVIDIA A16 will be available later this year.


Here are more stories from NVIDIA's Digital GTC 2021 event:-

1) NVIDIA joins the ARMs race with their first data centre CPU called Grace

2) NVIDIA announces Drive Atlan, an SoC for cars that delivers over 1,000 TOPs
