NVIDIA announces Turing-based Tesla T4 GPU for AI workloads and hyperscale data centers

By Wong Chung Wee - on 13 Sep 2018, 5:11pm

The NVIDIA Tesla T4 GPU. (Source: NVIDIA)

At GTC Japan today, NVIDIA announced the new Tesla T4 GPU, which is “powered by Turing Tensor cores” and designed for AI inference. It is touted to enhance the user experience in current AI applications by speeding up inference on trained models produced by deep learning systems. According to NVIDIA, as these models grow more accurate and complex, the Tesla T4 GPU has the compute capability to handle multi-precision computation for real-time inference applications such as video analytics and conversational AI.
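To see why reduced precision speeds up inference, consider the basic idea behind INT8 quantization: weights trained in FP32 are mapped onto an 8-bit integer range, trading a small, bounded rounding error for much cheaper arithmetic. The sketch below is an illustrative symmetric-quantization example in plain Python, not NVIDIA's actual TensorRT calibration algorithm:

```python
# A minimal sketch of the reduced-precision idea behind multi-precision
# inference: FP32 weights are mapped to signed 8-bit integers.
# Illustrative only; TensorRT's real calibration is more sophisticated.

def quantize_int8(values):
    """Map a list of floats onto the signed 8-bit range [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0
    return [round(v / scale) for v in values], scale

def dequantize(q_values, scale):
    """Recover approximate floats from the INT8 representation."""
    return [q * scale for q in q_values]

weights = [0.42, -1.3, 0.057, 0.9]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)

# The round-trip error is bounded by half a quantization step (scale / 2).
max_error = max(abs(w - a) for w, a in zip(weights, approx))
print(q, round(max_error, 4))
```

The payoff is that the inner loops of a network can then run on integer hardware such as Tensor cores, which process INT8 operands at a much higher rate than FP32.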

With its relatively low 75W power requirement and PCIe form factor, the Tesla T4 GPU is readily deployable in servers and can be scaled to deliver up to 1 petaflops of inference performance in a single scaled-up server. The Tesla T4 has 320 Turing Tensor cores and 2,560 NVIDIA CUDA cores. It features 16GB of GDDR6 VRAM with a memory bandwidth of more than 320GB/s.
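The per-server petaflops figure is plausibly reached by aggregating the reduced-precision throughput of several cards. A back-of-the-envelope sketch, assuming roughly 130 TOPS of INT8 throughput per T4 and an 8-GPU server (both figures are assumptions from NVIDIA's launch materials, not stated in this article):

```python
# Rough arithmetic behind "1 petaflops in a single scaled-up server".
# Assumptions (hedged): ~130 TOPS INT8 per Tesla T4, 8 GPUs per server.
TOPS_PER_T4_INT8 = 130   # assumed per-card INT8 inference throughput
GPUS_PER_SERVER = 8      # assumed hyperscale server density

total_tops = TOPS_PER_T4_INT8 * GPUS_PER_SERVER
print(f"{total_tops} TOPS, about {total_tops / 1000:.2f} peta-ops")
```

Under these assumptions an 8-card server lands in the region of a thousand trillion operations per second, which matches the order of magnitude of NVIDIA's claim.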

The TensorRT optimizer and engine. (Source: NVIDIA)

NVIDIA also took the opportunity to introduce the new TensorRT Hyperscale Platform, positioned as its next-generation AI data center platform. The platform consists of hardware, primarily the Tesla T4 GPU, and a “comprehensive set of new inference software.” The NVIDIA TensorRT 5 inference optimizer and runtime engine is able to leverage Turing Tensor cores, and it supports multi-precision workloads with improved performance.

The diagrammatic representation of the TensorRT inference server container. (Source: NVIDIA)

As part of the new software solution from NVIDIA, the TensorRT inference server “encapsulates” data models and frameworks for easy deployment in a cloud computing environment. The latest version of the TensorRT inference server container is also available from NVIDIA GPU Cloud for developers to “experiment with” now.
