Why NVIDIA is betting on powering Deep Learning Neural Networks
Object recognition and the path to powering Deep Learning Neural Networks
NVIDIA is primarily in the business of visualization, be it for gaming, producing movies, or rendering life-like models, among many other uses. With the advent of speedy internet connectivity, it has also ventured into delivering graphics compute horsepower as a service through the NVIDIA Grid ecosystem, enabling cloud-based gaming services and virtualized environments driven by a virtualized GPU in the cloud.
If you take a step back and look at the big picture, all of the above scenarios still do the same thing: the GPU delivers a pretty picture or video, one way or another.
With the advent of NVIDIA powering super cars and using its Tegra compute capabilities to go beyond driving your instrumentation panel and other multimedia displays within the car, the company has intentionally or unintentionally enabled a new growth path - object recognition via deep learning.
With NVIDIA being a leader in graphics visualization technology, the next challenge was to find out what else the growing GPU compute horsepower could achieve beyond creating a visually powerful cockpit display through the Drive CX digital cockpit computer (powered by the recently launched Tegra X1 chip).
To answer that, NVIDIA announced the Drive PX system, which it claims is a supercomputer for the car, powered by dual Tegra X1 chips with a total compute performance of 2.3 teraflops. The goal is to enable the car to drive itself, but to do so, it needs to understand its surroundings, which requires lots of processing power, hence the two Tegra X1 chips. Using an array of up to 12 high resolution cameras fitted around the car to capture the surroundings in great detail, the Drive PX can process the input to understand its environment and the elements within it (object classification) and react appropriately. For example, it can identify an ambulance from afar and begin maneuvering the car to give way, or notice that the traffic light has changed from green to red and ensure the car stops at a safe distance from the cars ahead of it.
The Drive PX system is able to achieve all this because it uses a deep neural network model to advance the algorithms used in achieving successful object recognition within a given environment. Here's an example quoted from our previous feature on how Tegra X1 is paving the way for smart cars of the future:-
For example, it might not be able to properly identify a pedestrian as a pedestrian if it was partially blocked by an obstacle. On the other hand, a deep neural network model would be able to do so as it uses a more flexible recognition algorithm.
To arrive at an algorithm that classifies objects and scenarios successfully, or to improve the object recognition algorithm when new scenarios arise, the training has to be processed off-site via deep learning (a branch of machine learning). When ready, the Drive PX system is updated with the latest algorithm and data sets.
So while NVIDIA launched the Drive PX to tackle front-end processing tasks, the other half of the equation to enable self-driving cars was to empower the deep learning neural networks that help evolve the algorithms used by the Drive PX. If all this sounds like Knight Rider coming to life (add to that the fact that you now have smartwatches that can take calls), it will probably become a reality not too far from now.
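The split described above, with heavy training done off-site and the finished model pushed to the car, can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: a single-neuron perceptron stands in for a deep neural network, the feature values are made up, and names such as `train_offsite` and `classify_in_car` are hypothetical and not part of any NVIDIA software.

```python
import json

# Hypothetical two-stage workflow: train off-site, then ship the weights to the car.
# A perceptron stands in for the deep network; all names and data are illustrative.

def score(weights, bias, features):
    """Weighted sum of the input features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def train_offsite(samples, epochs=20, learning_rate=0.1):
    """Back-end step: learn weights from labelled samples (perceptron rule)."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            predicted = 1 if score(weights, bias, features) > 0 else 0
            error = label - predicted  # 0 when the prediction is already right
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return {"weights": weights, "bias": bias}

def classify_in_car(model, features):
    """Front-end step: inference only, using whatever model was last delivered."""
    return 1 if score(model["weights"], model["bias"], features) > 0 else 0

# Made-up features: (relative width, relative height).
# Label 1 = pedestrian (tall and narrow), 0 = vehicle (wide and low).
training_data = [
    ((0.2, 0.9), 1),
    ((0.3, 0.8), 1),
    ((0.9, 0.4), 0),
    ((0.8, 0.3), 0),
]

# Train off-site, serialize the result as an "update package", load it in the car.
update_package = json.dumps(train_offsite(training_data))
deployed_model = json.loads(update_package)

print(classify_in_car(deployed_model, (0.25, 0.85)))  # tall, narrow object
print(classify_in_car(deployed_model, (0.85, 0.35)))  # wide, low object
```

The point of the sketch is the division of labor: the expensive loop lives in the training function, while the in-car function only evaluates the delivered weights, so improving recognition means shipping a new update package rather than retraining in the vehicle.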
Driving the neural networks and further applications
This is why the new GeForce GTX Titan X was launched to power the back-end and drive deep learning neural network performance to greater heights. While NVIDIA's existing Tesla and Grid series of accelerators can tackle these duties, they are easily bested by the GeForce GTX Titan X as it is truly the most advanced single GPU that NVIDIA has in its arsenal currently. With the Titan X boasting 3,072 CUDA cores, 7 teraflops of single-precision performance, 12GB of memory and 336.5GB/s memory throughput, it is the ideal GPU to process and train deep neural networks. To prove its point, NVIDIA has trained AlexNet, an industry-standard model, using its GPUs and 1.2 million images from ImageNet's dataset and the results are impressive:-
Training that took months without GPU acceleration was slashed to just under a week by the original Titan, and the new Titan X combined with cuDNN, NVIDIA's deep neural network library, has brought that figure down to just over a couple of days. The improvement from one Titan to the next might not look as dramatic on the graph as the leap from a non-accelerated solution, but time is money, and sometimes the difference in time is all it takes for a company, working committee or researcher to decide whether to embark on a proposed solution or project at all. The implications are immense and the applications for deep learning are aplenty.
Deep Learning Applications:-
- Autonomous cars
- Advanced intelligent video surveillance
- Automatic image tagging
- Voice recognition
- Revolutionary medical research and much more.
Deep Learning Visualized
We're at an exciting age where the infrastructure and technology are in place to allow deep neural networks (DNN) to progress, thanks to speedy broadband internet connectivity, which has spurred on cloud computing and big data as a byproduct. In addition, the democratization of supercomputing components such as the Titan X, which can plow through the data effectively for just US$999, further aids the deployment of DNN solutions.
To get a quick idea of how deep learning works in the area of image recognition: it uses a type of DNN called a convolutional neural network. It effectively combs through an image in several layers, each processing ever finer elements, to analyze and derive the outcome: objectively classifying and describing the image and its elements. Of course, that's just a high-level approximation of the deep systematic analysis that the GPU is well attuned for, so here's a quick dissection from Andrej Karpathy, who has done extensive work in the field of machine learning and computer vision:-
As mentioned earlier, the outcome of running these images through the neural networks is to detect, classify and characterize them. The following are some examples of the output where the neural networks were tasked to automatically caption the image:-
The above are just early examples of the image recognition and identification capabilities of deep learning neural networks. The potential is there, but there's definitely room for improvement, as seen in some of the outputs provided by the system.
Given a few years, advances in technology and improved data sets, output accuracy can only go up. In fact, Baidu's CEO Robin Li told Bloomberg that 10% of the company's search queries are currently done by voice, and that voice and image search queries will surpass text queries in five years. As such, Baidu is also aggressively banking on a deep learning system to service this need in the future.
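To make the layer-by-layer idea behind a convolutional neural network a little more concrete, here is a minimal sketch of what a single convolutional stage does: slide a small filter over the image, keep only the positive responses, then shrink the result by pooling. It is our own toy illustration, not taken from Karpathy's material or from NVIDIA's tools. The tiny image, the hand-picked filter values and the function names are all assumptions, and a real network learns its filters from data and stacks many such stages.

```python
# Toy illustration of one convolutional stage: convolve, apply ReLU, then pool.
# The image, the filter values and all function names are hypothetical.

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kernel_h) for j in range(kernel_w))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Keep positive responses and zero out the rest."""
    return [[max(0, value) for value in row] for row in feature_map]

def max_pool2x2(feature_map):
    """Keep the strongest response in each non-overlapping 2x2 block."""
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]) - 1, 2)
        ]
        for r in range(0, len(feature_map) - 1, 2)
    ]

# 6x6 grayscale image: four dark columns (0) on the left, two bright ones (1) on the right.
image = [[0, 0, 0, 0, 1, 1] for _ in range(6)]

# Hand-picked filter that responds to a dark-to-bright vertical edge.
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

feature_map = relu(convolve2d(image, vertical_edge))  # 4x4 map of edge responses
pooled = max_pool2x2(feature_map)                     # 2x2 summary of that map

print(feature_map)
print(pooled)
```

In this example the filter stays silent over the uniformly dark region and responds only where the brightness changes, so the pooled output marks the right half of the image as containing an edge. Later layers in a real network combine many such responses into progressively more complex features, which is the kind of highly parallel arithmetic a GPU handles well.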
Simplifying usage of Neural Networks
NVIDIA is all too aware that interfacing with, using and training neural networks is a difficult task. Making sure the right hardware is in place is just one part of the equation; simplifying the process for researchers and developers to deploy their own DNNs is equally important. To tackle this, NVIDIA has also announced the Deep Learning GPU Training System (DIGITS) for data scientists. Working hand-in-hand with it to make DNN deployment even faster, NVIDIA has also announced the DIGITS DevBox hardware system that will come pre-loaded with all one needs to jump start tinkering with neural networks.
To wrap up, we're excited about NVIDIA's deep learning endeavors and we can't wait to see the fruits of its labor in the long run - including autonomous cars. Whether they will succeed or not is something we'll have to evaluate next year, but with NVIDIA's next generation Pascal GPU architecture waiting to be 'unpacked' in 2016, things can only get better for the deep learning scene. We'll leave you with this parting slide that sums up the companies that are supporting GPU-accelerated deep learning:-