Meta begins the rollout of the world’s fastest supercomputer
Social media giant Meta has completed the first phase of its AI Research SuperCluster (RSC) supercomputer. When it is fully deployed in mid-2022, the company says, it will be the fastest supercomputer of its type in the world.
Designed specifically to train machine learning systems, RSC will help researchers develop better AI models that can learn from trillions of examples, work across hundreds of different languages, seamlessly analyse text, images, and video together, and develop new augmented reality tools.
Under the hood of a supercomputer
RSC currently uses 760 NVIDIA DGX A100 systems as its compute nodes. They pack a total of 6,080 NVIDIA A100 GPUs linked on an NVIDIA Quantum 200Gb/s InfiniBand network to deliver 1,895 petaflops of TF32 performance.
Phase two, which will be completed later this year, will see RSC expand to 16,000 GPUs, which Meta believes will deliver five exaflops of mixed precision AI performance.
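As a rough sanity check on the figures quoted above, the short Python sketch below derives the number of GPUs per DGX system, the implied TF32 throughput per GPU, and the growth in GPU count planned for phase two. The variable names are illustrative, and the per-GPU number is simply the quoted total divided evenly across all GPUs, not a measured value.

```python
# Figures as quoted in the article for phase one and phase two of RSC.
DGX_SYSTEMS = 760          # NVIDIA DGX A100 compute nodes (phase one)
PHASE_ONE_GPUS = 6_080     # NVIDIA A100 GPUs (phase one)
TF32_PETAFLOPS = 1_895     # quoted TF32 performance (phase one)
PHASE_TWO_GPUS = 16_000    # planned GPU count (phase two)

# GPUs housed in each DGX system.
gpus_per_system = PHASE_ONE_GPUS // DGX_SYSTEMS

# Implied TF32 throughput per GPU, in teraflops (1 petaflop = 1,000 teraflops).
# Assumption: the quoted total is spread evenly across every GPU.
tf32_teraflops_per_gpu = TF32_PETAFLOPS * 1_000 / PHASE_ONE_GPUS

# How much larger the GPU count becomes in phase two.
gpu_growth_factor = PHASE_TWO_GPUS / PHASE_ONE_GPUS

print(f"GPUs per DGX system: {gpus_per_system}")
print(f"Implied TF32 teraflops per GPU: {tf32_teraflops_per_gpu:.1f}")
print(f"Phase two GPU growth: {gpu_growth_factor:.2f}x")
```

Note that the phase-one figure is a TF32 number while the phase-two target is quoted for mixed precision, so the two performance totals are not directly comparable; only the GPU counts are compared here.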
Pure Storage’s FlashArray and FlashBlade provide the scalable storage solution for the RSC to help it analyse both structured and unstructured data for faster response times.
RSC’s storage tier has 175 petabytes of Pure Storage FlashArray, 46 petabytes of cache storage in Penguin Computing Altus systems, and 10 petabytes of Pure Storage FlashBlade.
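To put the storage figures above in proportion, the following minimal sketch totals the three tiers and prints each tier's share. The dictionary keys are shorthand labels chosen for this example; the capacities are the ones quoted in the article.

```python
# Storage capacities in petabytes, as quoted in the article.
storage_petabytes = {
    "Pure Storage FlashArray": 175,
    "Penguin Computing Altus (cache)": 46,
    "Pure Storage FlashBlade": 10,
}

# Combined capacity across all three tiers.
total_petabytes = sum(storage_petabytes.values())

# Share of the total held by each tier, as a percentage.
storage_shares = {
    name: 100 * capacity / total_petabytes
    for name, capacity in storage_petabytes.items()
}

print(f"Total storage: {total_petabytes} PB")
for name, share in storage_shares.items():
    print(f"{name}: {share:.1f}% of total")
```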
What it all adds up to
According to Meta, RSC will be able to train large-scale natural language processing models three times faster, helping AI models determine whether something said or posted constitutes hate speech or includes harmful content as it is typed or spoken.
To protect user data, Meta said that RSC is isolated from the larger Internet, with no direct inbound or outbound connections, and traffic can flow only from Meta’s production data centres.
In terms of performance, once RSC’s second phase is complete and it reaches the full five exaflops, it will outpace the four exaflops delivered by the National Energy Research Scientific Computing Center’s (NERSC) Perlmutter, which was brought online in the first half of 2021.