NVIDIA's Head Honchos Speak - Computing Trends and Platform Updates

High Performance Computing with Tesla

Here's a Tesla module made for compact blade systems. This particular model is the Tesla X2070, with 6GB of GDDR5, 448 CUDA cores and a peak double-precision performance of 515 GFLOPS.

  • Targeted at HPC applications such as oil and gas exploration, molecular dynamics, and other computational science workloads that require heavy numerical processing.
  • The biggest competitor is multi-core CPUs, since the Tesla group is practically starting from the ground up. From that point of view, the Tesla business has tremendous growth potential.
  • According to some university professors, the biggest problem they face is compute power (only 50TFLOPS on average). They usually have to rely on supercomputing centers, but by the time their jobs run and they can follow up with further compute experiments, the researchers have often lost interest and moved on to another idea. This waiting game to try out one's ideas is counterproductive at the end of the day. A Tesla-equipped system can grant researchers parallel computing power for comparatively little money and space.
  • A separate purpose-built chip isn't required for HPC usage. There's a very strong overlap between the type of computing done in HPC and on the consumer side: the physics simulations run through PhysX in games, for example, are the same kind of computation HPC users need. The Tesla team used the same GPU foundation and then added what is necessary for the HPC environment, which is an effective strategy for the team at this point in time. The type of processing needed in the consumer market, the Quadro market and HPC is very similar. Much like strategies in the CPU world, Tesla adds double precision, ECC and a few other features to make the GPU suitable for HPC.
  • On why some GeForce products are actually faster than Tesla products: a lot of margin has been added to the GPU to ensure it can run in the data centre 24/7, doing a very valuable job where someone could be relying on it for significant revenue. Its goal is the highest reliability, not the highest performance. If it were tuned purely for performance, there's a possibility that results might not turn out as intended, which costs money, so it's better off being tuned for reliability.
  • The biggest benefit and value of a GPU is what a small company or start-up can do with it, rather than the big companies, which have very large budgets and can afford large-scale systems to deliver what they need.
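The shared-foundation point above is concrete in CUDA itself: the same data-parallel kernel model that accelerates consumer effects also runs double-precision science code on Tesla boards. A minimal sketch (hypothetical kernel and sizes, not from the article):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// y[i] += alpha * x[i] — the kind of data-parallel arithmetic shared by
// game physics and HPC, here in the double precision that Tesla adds.
__global__ void axpy(int n, double alpha, const double *x, double *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] += alpha * x[i];
}

int main() {
    const int n = 1 << 20;
    double *x, *y;
    cudaMalloc(&x, n * sizeof(double));
    cudaMalloc(&y, n * sizeof(double));
    // ... fill x and y on the device (omitted), then launch one thread
    // per element, 256 threads per block
    axpy<<<(n + 255) / 256, 256>>>(n, 2.0, x, y);
    cudaDeviceSynchronize();
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

The same kernel compiles and runs unchanged on GeForce, Quadro or Tesla hardware; only the board's reliability features and double-precision throughput differ.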

To move on to CUDA in the cloud and Tesla in the cloud: the first thing NVIDIA needed was ready-made solutions that cloud providers could buy. Cloud providers don't buy graphics cards, they buy whole systems, which is why NVIDIA's partnerships with HP, IBM, Dell and others are so important. Unlike last year, when there was just one vendor, several top system providers now offer configurations with Tesla, so many more companies can obtain turnkey solutions without much effort. We saw several configurations and models on show at GTC 2010.

The Tesla module seen above belongs to one of the highest-performing Tesla blade systems, the T-Platforms TB2-TL, where each blade node as seen here has dual Intel quad-core Xeon L5630 processors, dual Tesla X2070 GPU modules and up to 24GB of DDR3 main memory. Fully populated, the complete TB2-TL blade enclosure has 192 quad-core processors, 192 Tesla GPUs, up to 384GB of main memory and 192GB of GPU memory, delivers a peak performance of 105TFLOPS, weighs 154.6kg and consumes 12kW of power. Mind boggling for a 7U enclosure.
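As a rough sanity check on the quoted peak (assuming the X2070's 515 GFLOPS double-precision figure and the L5630's 2.13 GHz clock at 4 double-precision FLOPs per cycle per core):

$$192 \times 515\ \text{GFLOPS} \approx 98.9\ \text{TFLOPS (GPUs)}$$
$$192 \times 4\ \text{cores} \times 2.13\ \text{GHz} \times 4 \approx 6.5\ \text{TFLOPS (CPUs)}$$

Together that gives roughly 105 TFLOPS, matching the stated figure.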

Here's another rack unit, Appro's 1U Tetra GPU Solution. Each 1U unit houses dual Intel Xeon processors and quad Tesla M2050 GPUs. The interesting aspect of this unit is its modular integration of the Tesla GPUs, which are neatly installed on both sides of the chassis. This is an example of how server vendors these days are thoughtfully designing racks to cater to GPU processing as well.