New Nvidia GPUs promise better AI
By Digital News Asia September 15, 2016
- Single Tesla GPU server can replace 13 CPU-only servers
- Improvements for self-driving cars and image recognition
Nvidia has unveiled the latest additions to its Pascal architecture-based deep learning platform, with the new Tesla P4 and P40 GPU accelerators and accompanying software.
Modern AI services such as voice-activated assistance, email spam filters, and movie and product recommendation engines are rapidly growing in complexity. According to Nvidia, they require up to 10x more computing power than neural networks from a year ago. Current CPU-based technology cannot deliver the real-time responsiveness that modern AI services require, leading to a poor user experience.
The company says the Tesla P4 and P40 are specifically designed for inferencing, which uses trained deep neural networks to recognise speech, images or text in response to queries from users and devices. These GPUs are based on the Pascal architecture.
According to Nvidia, the Tesla P4 delivers the highest energy efficiency for data centers. Its small form factor and low-power design, which starts at 50 watts, let it fit in any server and help make it 40x more energy efficient than CPUs for inferencing. A server with a single Tesla P4 can replace 13 CPU-only servers for video inferencing workloads, delivering over 8x savings in total cost of ownership, including server and power costs.
A server with eight Tesla P40 accelerators can replace the performance of more than 140 CPU servers. At approximately US$5,000 per CPU server, this results in savings of more than US$650,000 in server acquisition cost.
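The arithmetic behind that savings figure can be sketched as follows. This is only a rough check under stated assumptions: the article gives the number of CPU servers replaced, the approximate price per CPU server and the claimed savings, but not the price of the eight-GPU server. The function names and the "implied remainder" are illustrative, and because the quoted figures are stated as "more than" and "approximately", the result is indicative only.

```python
# Figures quoted in the article (lower bounds and approximations).
CPU_SERVERS_REPLACED = 140    # "more than 140 CPU servers"
CPU_SERVER_COST_USD = 5_000   # "approximately US$5,000 per CPU server"
QUOTED_SAVINGS_USD = 650_000  # "more than US$650,000"


def gross_cpu_cost(servers: int, unit_cost: int) -> int:
    """Acquisition cost of the CPU-only servers being replaced."""
    return servers * unit_cost


def implied_remainder(gross: int, savings: int) -> int:
    """Difference between the gross CPU cost and the quoted savings.

    Assumption (not stated in the article): this gap roughly corresponds
    to what is left over for the GPU server itself.
    """
    return gross - savings


gross = gross_cpu_cost(CPU_SERVERS_REPLACED, CPU_SERVER_COST_USD)
remainder = implied_remainder(gross, QUOTED_SAVINGS_USD)

print(f"Gross CPU server cost: US${gross:,}")
print(f"Gap between gross cost and quoted savings: US${remainder:,}")
```

On these numbers the CPU servers alone would cost about US$700,000, so the quoted savings leave a gap of roughly US$50,000 that the article does not break down.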
“With the Tesla P100 and now Tesla P4 and P40, Nvidia offers the only end-to-end deep learning platform for the data center, unlocking the enormous power of AI for a broad range of industries,” said Ian Buck, general manager of accelerated computing at Nvidia. “They slash training time from days to hours. They enable insight to be extracted instantly. And they produce real-time responses for consumers from AI-powered services.”
The new single-processor configuration of the Drive PX 2 AI computing platform for AutoCruise functions, which include highway automated driving and HD mapping, consumes just 10 watts of power and enables vehicles to use deep neural networks to process data from multiple cameras and sensors. A car using the small form-factor Drive PX 2 for AutoCruise can understand in real time what is happening around it, precisely locate itself on an HD map and plan a safe path forward.
According to Nvidia, more than 80 automakers, tier 1 suppliers, startups and research institutions are developing autonomous vehicle solutions using the Drive PX architecture, which scales from a single mobile processor configuration to a combination of two mobile processors and two discrete GPUs. This enables automakers to move from development into production for a wide range of self-driving solutions.
The company is innovating not just on hardware. Complementing the Tesla P4 and P40 are two software packages to accelerate AI inferencing: TensorRT and the DeepStream SDK.
TensorRT is a library created for optimising deep learning models for production deployment that delivers instant responsiveness for the most complex networks. It maximises throughput and efficiency of deep learning applications by taking trained neural nets and optimising them for reduced precision operations.
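To illustrate what optimising a trained network for reduced precision can mean in general, here is a minimal sketch of symmetric 8-bit quantization of a few weights. It is not TensorRT's API or its actual calibration method, which the article does not describe. The choice of 8-bit integers, the function names and the sample weights are all assumptions made for illustration.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale.

    Illustrative symmetric scheme: the largest magnitude maps to 127.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


# Hypothetical trained weights, chosen only for the example.
weights = [0.42, -1.27, 0.05, 0.9]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 codes:", q)
print("scale:", scale)
print("largest round-trip error:", max_error)
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half the scale per value. That trade of a small accuracy loss for cheaper arithmetic and memory traffic is the general idea behind reduced-precision inference.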
The DeepStream SDK taps into the power of a Pascal server to simultaneously decode and analyse up to 93 HD video streams in real time, compared with seven streams with dual CPUs. This addresses one of the grand challenges of AI: understanding video content at scale for applications such as self-driving cars, interactive robots, filtering and ad placement. Integrating deep learning into video applications allows companies to offer smart, innovative video services that were previously impossible to deliver.
“Delivering simple and responsive experiences to each of our users is very important to us,” said Greg Diamos, senior researcher at Baidu. “We have deployed GPUs in production to provide AI-powered services such as our Deep Speech 2 system, and the use of GPUs enables a level of responsiveness that would not be possible on un-accelerated servers.”
The Tesla P4 and P40 are planned to be available in November and October respectively, in qualified servers offered by ODM, OEM and channel partners.