Deploying an NVIDIA Omniverse server: Considerations and complications
By now, NVIDIA’s Omniverse isn’t anything new.
Omniverse is an open platform for real-time 3D design collaboration, designed to simplify workflows and let multiple content collaborators work on the same project at the same time from anywhere in the world.
At a recent ASUS and NVIDIA Omniverse event, we sat down with Morris Tan, Country Product Manager for Server at ASUS; Raghu Ganti, Regional Sales Leader for ProViz and NVIDIA Enterprise Software at NVIDIA; and Loh Siak Hong, Technical Director from their SI partner PTC, to find out what businesses need to consider when deciding to deploy an Omniverse server.
Q. What is the difference between a CPU-based server and a GPU-based server? Are there workloads that favour a particular type of server?
CPU-based servers have no accelerators to offload work from the CPU, while GPU-based servers include GPU accelerators that work can be offloaded onto. CPUs are generalised processors that can handle a wide variety of tasks, but largely in a sequential manner, while GPUs are specialised processors whose parallel mathematical capabilities are ideal for graphics and machine learning. Recently, quite a number of applications that were traditionally 100% CPU-based workloads have been moving towards hybrid CPU+GPU or fully GPU compute-based workloads because they are easier to scale up.
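The sequential-versus-parallel distinction can be illustrated with a toy sketch. This is not a benchmark: a Python thread pool merely stands in for the many lanes of a GPU, and real offloading would go through a library such as CUDA or CuPy.

```python
# Toy illustration of the execution-model difference described above.
# Threads here are only a stand-in for GPU lanes; real GPU offloading
# would use an accelerator library (e.g. CUDA/CuPy).
from concurrent.futures import ThreadPoolExecutor

data = list(range(1_000))

# "CPU style": one general-purpose core walks the data in order.
sequential = [x * x for x in data]

# "GPU style": the same element-wise operation dispatched across many
# workers at once.
with ThreadPoolExecutor(max_workers=8) as pool:
    parallel = list(pool.map(lambda x: x * x, data))

assert sequential == parallel  # same result, different execution model
```

The point is that an element-wise operation with no dependencies between elements maps naturally onto many parallel lanes, which is why such workloads benefit most from GPU acceleration.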
Q. Are there cost advantages to each when used for something common like a vSphere deployment?
Depending on the type of workload, there may be, but for most purposes, a GPU accelerator will increase the capability of the deployed VM and thus increase the value of what is being developed on it.
Q. When looking at an AI server deployment, what should a business be looking for?
Businesses should look for servers that have the power and flexibility to run workloads of varying types and sizes so that they can handle the whole end-to-end AI pipeline. They should also use systems that can be managed using modern cloud-native management frameworks, which are typically based on Kubernetes and can make use of virtualisation. NVIDIA-Certified Systems from ASUS provide scalable, high-performance, secure solutions from the data centre that can handle a diverse range of accelerated workloads, including AI training, inference, and data analytics.
Q. If we are pushing AI processing to the edge, will that make a difference to things?
Do you have sensors or other sources of data that are always working or monitoring the environment? Do you need to process the data as it is collected? Environments where data is collected continuously and needs to be analysed, processed, and interpreted (inference) at the point of collection will likely benefit from edge computing.
The benefits an intelligent edge solution brings include:
- Lower latency - avoiding the time it takes to move data to and from the data centre,
- Reduced bandwidth - edge computing resources can be a cost-effective way to scale compute infrastructure without significantly increasing bandwidth to the edge, and
- Data sovereignty - data collected and processed at the edge can stay there; there is no need to transfer it back to the data centre if all processing occurs at the point of collection.
Q. What about Omniverse? What sort of server best suits an Omniverse deployment?
There's no single server configuration that best suits an Omniverse deployment, but an RTX-enabled GPU within the server is a requirement. Because the Omniverse platform encompasses so many different workflows - graphics, simulation, and synthetic data generation, to name a few - a GPU-based server is required. Beyond that, it depends on the complexity of the project: the more complex the project, the more collaboration it demands from the respective creators and reviewers. If those creators and reviewers are in different locations, collaboration becomes more difficult and the project takes longer.
Q. Given that Omniverse can run on a desktop, what is the argument for a server deployment? When working in a team, is there a limitation for collaboration?
Omniverse can run without issue on desktops, on servers via virtualised or bare-metal deployments, and even on laptops. However, only a server deployment gives an organisation centralised management, critical security, and ease of administrative deployment.
Best practices for enterprise virtualisation environments all apply to Omniverse running on servers in the enterprise, giving a better overview of the project. When designing an Omniverse deployment where collaboration is a requirement, it's critical that each component - the client systems, GPUs, and the Nucleus server - is taken into account for optimal performance and experience.
Q. What are some common Omniverse use cases from customers?
One of the main use cases for Omniverse is digital twins. We have several examples of companies using Omniverse to create their own industrial application of a digital twin - from Ericsson creating a digital twin of a city for 5G optimisation, to BMW building factories of the future, to Lockheed Martin building simulation environments to calculate wildfire spread.
Another example of an Omniverse deployment is CG Ark, a school for animation. The school is using an NVIDIA Omniverse solution, powered by an ASUS ESC4000-E10 server and NVIDIA A40 GPUs, to build the first XR smart virtual studio for real-time collaboration and shooting with Unreal Engine. It allows CG Ark to connect talent with the industry and also provides the training required for future professional needs. We will see enterprises and developers putting Omniverse to use immediately.
Q. What is the typical deployment size? How large a deployment have you seen?
We have seen Omniverse deployments ranging from a single user or two users collaborating together, up to hundreds of users all over the world. For example, some of the following use cases have been deployed on the servers listed below.
| Scene | Interior design | Engineering | 3D Animation |
|---|---|---|---|
| Dimensions | 800mm x 440mm x 88.9mm (2U) | 800mm x 440mm x 88.9mm (2U) | 800mm x 440mm x 175.6mm (4U) |
| CPU | AMD EPYC 7453 (28C) | Intel 6348 x2 (56C) | AMD EPYC 7663 x2 (108C) |
| Memory | DDR4-3200 32GB x8 | DDR4-3200 32GB x16 | DDR4-3200 64GB x32 |
| GPU | NVIDIA A5000 x2 | NVIDIA A6000 x4 | NVIDIA A40 x8 |
| SSD | NVMe 3.84TB x2 | NVMe 7.68TB x4 | NVMe 7.68TB x2 |
| VDI users | 5-8 | 8-12 | 12-16 |
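To turn the example configurations above into rough per-seat numbers, the sketch below transcribes the table and divides resources across the high end of each VDI seat range. The even-split assumption is a simplification for illustration only, not official sizing guidance.

```python
# Per-seat resource estimates derived from the example configs above.
# Assumption: resources divide evenly across VDI seats (a simplification).
configs = {
    "Interior design": {"cores": 28,  "mem_gb": 256,  "gpus": 2, "seats": (5, 8)},
    "Engineering":     {"cores": 56,  "mem_gb": 512,  "gpus": 4, "seats": (8, 12)},
    "3D Animation":    {"cores": 108, "mem_gb": 2048, "gpus": 8, "seats": (12, 16)},
}

def per_seat(cfg):
    # Use the high end of the seat range for a conservative estimate.
    _, hi = cfg["seats"]
    return {
        "cores": round(cfg["cores"] / hi, 1),
        "mem_gb": round(cfg["mem_gb"] / hi, 1),
        "gpu_fraction": round(cfg["gpus"] / hi, 2),
    }

for scene, cfg in configs.items():
    print(scene, per_seat(cfg))
```

At full occupancy this works out to roughly 3.5 cores, 32GB of memory, and a quarter of a GPU per interior-design seat, scaling up to about half an A40 per 3D-animation seat.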
Q. What are some common deployment issues to look out for?
Some areas of concern when it comes to deployment are:
- GPU selection - the chosen GPU may not be powerful enough for the workflows required,
- CPU selection - not enough cores for the required user density per server, or not enough clock speed for the applications being run,
- System memory size - not enough to ensure each user has adequate onboard resources, and
- Nucleus Server placement - placed too far from the client systems, whether local or in a data centre, to ensure the best latency experience.
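The four concern areas above can be captured in a hypothetical pre-deployment sanity check. The function name, parameters, and threshold values below are illustrative assumptions, not NVIDIA sizing guidance; adjust them to your workloads and network.

```python
# Hypothetical pre-deployment check covering the four concern areas.
# All thresholds are illustrative assumptions, not vendor guidance.
def check_deployment(cores, cores_per_user, users, mem_gb, mem_per_user_gb,
                     gpu_is_rtx, nucleus_latency_ms):
    issues = []
    if not gpu_is_rtx:
        issues.append("GPU selection: Omniverse requires an RTX-enabled GPU")
    if cores < cores_per_user * users:
        issues.append("CPU selection: not enough cores for the target user density")
    if mem_gb < mem_per_user_gb * users:
        issues.append("System memory: not enough per-user headroom")
    if nucleus_latency_ms > 20:  # assumed latency budget; tune for your network
        issues.append("Nucleus Server: too far from clients for good latency")
    return issues

# Example: the 2U interior-design box from the sizing table, checked
# against an assumed 4 cores and 32GB per user at 8 users.
problems = check_deployment(cores=28, cores_per_user=4, users=8,
                            mem_gb=256, mem_per_user_gb=32,
                            gpu_is_rtx=True, nucleus_latency_ms=5)
print(problems or "deployment looks sane")
```

Under these assumed per-user requirements the 28-core configuration comes up short on cores at 8 users, which is exactly the kind of gap such a check is meant to surface before hardware is ordered.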
Q. Can Omniverse deployments be considered a niche solution? Would an HPC cluster better lend itself to an Omniverse deployment?
No, Omniverse is not a niche solution; its user scenarios can be very wide. As with the previous question on typical deployment size, it can range up to hundreds of users. HPC clusters consist of two types of GPUs: A100s for AI and computing, and A40s for visualisation and computing, as in the ASUS ESC8000A server. If the cluster has A40s, they can be used for Omniverse deployments.