09 Aug Intel Acquires Nervana Systems Which Could Significantly Enhance Future Machine Learning Capabilities

Intel has announced that it will acquire Nervana Systems, a Deep Learning startup based in San Diego and Silicon Valley, to extend its capabilities in the fast-moving market for training deep neural networks used in artificial intelligence (AI) applications. I recently wrote about Nervana here. Training neural networks is a hot market in which companies typically use GPUs to teach a machine how to process text, image, voice and other data types. Nervana is developing an accelerator and software tailored to this task, rather than relying on a more general-purpose GPU to do the heavy lifting. This acquisition gives Intel a specific product and IP for Deep Learning, which can be used in standalone accelerators and integrated with future Intel technology to deliver more competitive and innovative products.

Why Does Intel Need Yet Another Architecture?

A GPU does a great job with machine learning because it has thousands of floating point units that can be used in parallel for the matrix (tensor) operations that make up the bulk of the processing in training a deep neural network (DNN). But most GPUs have a lot of other capabilities as well, tailored for processing graphic images and producing graphics output. In addition, GPUs provide the higher-precision floating point used by High Performance Computing (HPC) applications like financial analysis, simulation and modeling, which Deep Learning algorithms do not require. All this functionality takes up valuable space and power on the GPU chip. In theory, therefore, the Nervana approach could deliver higher performance, lower costs, or both for these computationally intensive workloads. However, the company has not yet provided any performance projections for its chip.
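To make the precision point concrete, here is a minimal sketch using only the Python standard library. It stores a value in three IEEE 754 formats and reports the bytes used and the rounding error. The three formats and the helper names (`round_trip`, `precision_report`) are my own illustration; Nervana has not said which number format its chip uses.

```python
import struct

# Illustrative IEEE 754 formats; which one a given accelerator uses is an assumption.
FORMATS = {
    "half (16-bit)": "<e",
    "single (32-bit)": "<f",
    "double (64-bit)": "<d",
}


def round_trip(value, fmt):
    """Store `value` in the given binary format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def precision_report(value):
    """Map each format name to (bytes per value, absolute rounding error)."""
    report = {}
    for name, fmt in FORMATS.items():
        stored = round_trip(value, fmt)
        report[name] = (struct.calcsize(fmt), abs(stored - value))
    return report


if __name__ == "__main__":
    for name, (size, error) in precision_report(0.1).items():
        print(f"{name}: {size} bytes, rounding error {error:.3e}")
```

A 16-bit value takes a quarter of the storage of a 64-bit one, at the cost of a larger rounding error. That is the kind of trade a chip built only for Deep Learning could make, while an HPC-oriented part cannot.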

Nervana has also not disclosed many details about its processor, focusing for now on its NEON software for accelerating GPUs in the Nervana Cloud while it works to finish the chip for a 2017 debut. But the company has previously shared that the Nervana Engine will include an on-die fabric switch that interconnects multiple engines in a 3D torus topology. This feature will enable the engines to scale to a large number of cooperating accelerators, a capability needed to train more complex DNNs, such as convolutional and recurrent neural networks. Exploiting this functionality will require additional engineering by system vendors or Intel, so it may take some time to materialize. We will know more sometime next year about how well the chip performs, and how well it will support popular AI frameworks such as Caffe, Torch and TensorFlow.
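For readers unfamiliar with the term, a 3D torus is a three-dimensional grid whose edges wrap around, so every node has a direct link to six neighbors. The sketch below shows generic torus arithmetic only. The function names and the 4 x 4 x 4 example size are hypothetical, and nothing here describes how Nervana's fabric switch actually routes traffic.

```python
def torus_neighbors(node, dims):
    """Return the six direct neighbors of `node` in a 3D torus of shape `dims`.

    Assumes every dimension has at least 3 nodes, so the neighbors are distinct.
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            # The modulo provides the wraparound link at the edge of the grid.
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def torus_hops(a, b, dims):
    """Minimum number of links between nodes `a` and `b`, using wraparound."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def torus_diameter(dims):
    """Worst-case hop count between any two nodes in the torus."""
    return sum(d // 2 for d in dims)


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical 64-node system
    print(torus_neighbors((0, 0, 0), dims))
    print(torus_hops((0, 0, 0), (3, 3, 3), dims))  # 3
    print(torus_diameter(dims))  # 6
```

In this hypothetical 64-node example, no two nodes are more than six hops apart. Short worst-case paths like this are why the topology suits large groups of cooperating accelerators.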

Does Intel Still Need a Big GPU?

When it comes to processors, Intel has one or more of every architecture flavor except big GPUs. They have desktop processors with integrated (little) GPUs, Xeon CPUs for servers, many-core Xeon Phi (“Knights Landing”) for HPC and supercomputers, and Altera FPGAs for specific function accelerators including inference engines for Deep Learning. But I am often asked if Intel still needs a heavy-duty GPU. With this acquisition, I think the answer now is “no”; they can cover much of the GPU acceleration space between Xeon Phi, Altera FPGAs, and now the Nervana Engine IP for AI. And Intel’s recent push into autonomous driving systems could benefit from a low-power DNN engine like the one Nervana appears to be developing.

What Will Intel Do With Nervana’s Technology?

Since the Nervana team is building a standalone accelerator today, Intel will likely continue down that path, at least for the initial release. But Intel excels at integrating technology, be it on chip or on multi-chip packages. Adding the Nervana Engine IP to a Xeon CPU could deliver a low-cost approach to onboard acceleration, but scaling that design would not be straightforward, as the CPU-to-accelerator ratio would be fixed at 1:1. Therefore, I think Intel may eventually productize the Nervana IP in several form factors, perhaps standalone products for strong scaling in training, and one or more integrated solutions that use those trained neural networks for inference workloads.

In any event, it appears Intel has now closed one of the few gaps in their datacenter product line and stands to better participate in the incredible growth of the AI market. I must note that there remains a lot of work to do, and NVIDIA certainly isn’t standing still. NVIDIA sets a high bar by which all contenders will be measured, and has nurtured a rich ecosystem of software and research institutions around the world that will take significant time and resources to replicate.