Volta is, without question, the most advanced processor that Nvidia has ever put into the field, and it is hard to imagine what it will be able to do for an encore.
Nvidia, well regarded in the gaming industry for its market-leading graphics processors, has been overshadowed in the AI field by the likes of Google, whose Tensor Processing Unit (TPU) is a specialised chip for powering AI applications. For neural networks, the basic building blocks are matrix multiplication and addition. There were some issues with the "Maxwell" line of GPUs in that we never did see a part with lots of double-precision flops, and some features slated for one generation of GPU were pushed out to later ones, but Nvidia has more or less delivered what it said it would, and did so on time. The result is what Nvidia says is a 12x speedup in deep-learning training over Pascal, and a 6x speedup in inferencing.
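To make the "matrix multiplication and addition" point concrete, here is a minimal NumPy sketch of a dense neural-network layer's forward pass; the shapes and values are illustrative choices, not figures from Nvidia or the article:

```python
import numpy as np

# A dense layer is just a matrix multiplication followed by an addition
# (the bias term) -- the fused multiply-add pattern that GPU and TPU
# hardware is built to accelerate. All sizes here are arbitrary examples.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))   # a batch of 4 inputs with 8 features each
W = rng.standard_normal((8, 3))   # weights mapping 8 features to 3 outputs
b = rng.standard_normal(3)        # bias, one value per output

y = x @ W + b                     # the core op: multiply, then add
print(y.shape)                    # (4, 3): one 3-value output per input
```

Training and inference alike reduce to long chains of exactly this operation, which is why speedups on it translate almost directly into end-to-end speedups.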
A high-level look at the Tesla V100's specs.
It is a new GPU architecture with over 21 billion transistors. 150 watts is pretty much the upper limit for an accelerator in a hyperscale environment, but Nvidia engineers at GTC suggested that the wattage can be dialed down further for customers with more constrained power budgets.
Nvidia's new competitors argue that they can make hardware faster and more efficient at running AI software by designing chips tuned for the task from scratch instead of adapting graphics-chip technology. The presentation line-up at the event included the highly anticipated Nvidia Volta. Huang also announced TensorRT, an inference optimiser that takes trained models from frameworks such as TensorFlow and Caffe and tunes their runtime performance on GPUs. In tasks that can take advantage of them, Nvidia claims that the new tensor cores offer a 4x performance boost versus Pascal, which in theory makes the V100 a better performer than Google's dedicated Tensor Processing Unit (TPU).
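The operation each Volta tensor core fuses into a single clock is a small matrix multiply-accumulate, D = A × B + C, with half-precision inputs accumulated in single precision. The sketch below emulates that numeric behaviour in NumPy purely for illustration (it is not CUDA, and the 4×4 tile size is the one Nvidia describes for a single tensor core):

```python
import numpy as np

# Emulation of the tensor-core primitive: D = A @ B + C on a 4x4 tile,
# with FP16 inputs multiplied and the result accumulated in FP32.
A = np.ones((4, 4), dtype=np.float16)   # half-precision input tile
B = np.ones((4, 4), dtype=np.float16)   # half-precision input tile
C = np.zeros((4, 4), dtype=np.float32)  # single-precision accumulator

# Widen to FP32 before multiplying to mimic the FP32 accumulate path.
D = A.astype(np.float32) @ B.astype(np.float32) + C
print(D[0, 0])  # 4.0 -- each element is a dot product of four ones
```

Accumulating in FP32 is what keeps the cheap FP16 multiplies numerically usable for training, where rounding error would otherwise compound across layers.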
The Tensor Cores are complemented with a large 20MB register file, 16GB of HBM2 RAM at 900GB/s, and 300GB/s NVLink for IO.
Architectural enhancements in Volta are abundant and wide-ranging.
With its Tensor Cores, Volta is "no longer a general-purpose GPU architecture, so Nvidia cannot be accused of using its GPU hammer and seeing every problem as a nail", said Kevin Krewell, principal analyst at Tirias Research.
The company's new DGX Station is a desktop workstation modeled after the larger DGX-1 AI supercomputer, but designed for use at home, in a lab, or at an office desk. The V100 will start shipping by the end of the year to data centers owned by Amazon, Microsoft, and other cloud computing providers in several different configurations. Microsoft has invested heavily in using FPGAs to power its machine-learning software and made them a core piece of its cloud platform, Azure. The on-ramp approach to GPU-based cloud computing addresses a growing need to gather into a single stack the proliferation of deep learning frameworks, drivers, libraries, operating systems, and processors used for AI development.