25 Jun Microsoft Cranks AI Efforts Up To 11

Microsoft is justifiably proud of its hardware/software co-design approach to accelerating a wide range of data center workloads using Intel FPGAs. The company recently shared some progress in this area and subsequently announced its acquisition of the AI startup Bonsai to ease the on-ramp for building AI on Microsoft. These advances strengthen Microsoft’s AI strategy and warrant a closer look. I believe the company is very well-positioned to lead the penetration of AI into the enterprise market, where its productivity software and cloud success give it a springboard for growth.

Brainwave: reconfigurable hardware for AI inference 

The Brainwave project uses large arrays of Intel FPGAs to accelerate deep neural network (DNN) inference processing for search, ad targeting, facial recognition, and more. As Moore’s Law fades, CPUs cannot deliver the performance needed to crunch these AI networks at the low latencies and high scale that Microsoft demands, so the company has been investing in FPGA acceleration for at least five years. In fact, according to Microsoft, its penchant for FPGAs (and its tendency to place at least one in practically every server it installs) led to Intel’s acquisition of Altera in 2015.

Microsoft has also created an innovative, networked infrastructure that allows software to tap into a fabric of FPGAs in parallel, providing very low latency for virtually limitless processing. Now, all this hard work appears to be paying off: Doug Burger, a Microsoft Technical Fellow for Azure, claims Brainwave outperforms “the market leader” for inference processing.

Keep in mind that Microsoft Brainwave is all about inference processing, where a DNN trained by GPUs (presumably made by the same market leader; hint: NVIDIA) is used to infer attributes of an image, text, or voice sample. While nobody I know breaks down NVIDIA GPU sales by training and inference, it is safe to assume that the vast majority of its AI business today is in training (although the growth of NVIDIA Drive PX for autonomous vehicles and NVIDIA Jetson for robotics is driven by the need for high-performance, complex edge inference).

Accelerators like GPUs typically batch together a number of traversals of the pre-trained DNN for inference, in order to amortize the relatively high latency of transferring data and code to the GPU across multiple queries. The cost of batching is that each query may have to wait for the batch to fill before it is processed. Microsoft has built an ingenious fabric of FPGAs to enable the near real-time AI it claims with Brainwave, by directly attaching each FPGA to the top-of-rack (ToR) switch, as depicted in Figure 1. This enables the scaling of parallel work to be practically unlimited, all while keeping the latency very low.

Figure 1: Microsoft’s interconnect design enables low latency and high scalability by attaching each FPGA directly to the Top of Rack switch to allow pooling and sharing of FPGA resources across a 40Gb/s network. MICROSOFT

Microsoft also shared some indication of the performance of its FPGA network compared to “NPUs,” or Neural Processing Units—albeit without stating which chip it is comparing it to. There are two takeaways for FPGA inference processing here: 1) the “best performing NPU” (which I assume is NVIDIA Volta GPUs) is roughly comparable to FPGAs for large batch sizes, and more importantly, 2) Brainwave performance does not decrease (latencies do not increase) as batch size is reduced. This is important for enabling “real-time” AI, which Microsoft sees as critical for many applications.

Figure 2: Microsoft Brainwave delivers near real-time latencies regardless of batch size, unlike NPUs such as GPUs. MICROSOFT
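
To make the batching trade-off concrete, here is a minimal back-of-the-envelope sketch in Python. It is a toy model of my own, not Microsoft’s or NVIDIA’s methodology: the function names and every number in it (a 2 ms fixed transfer overhead, 0.1 ms of compute per query, one query arriving every 0.5 ms) are hypothetical values chosen only to show the shape of the curve. It illustrates why a larger batch lowers the overhead paid per query while raising the time the first query in the batch has to wait.

```python
# Toy model of batched inference on an accelerator.
# All default values are illustrative assumptions, not measured figures.

def amortized_cost_ms(batch_size, fixed_overhead_ms=2.0, compute_per_query_ms=0.1):
    """Average accelerator time charged to each query in a batch.

    The fixed overhead (e.g. moving data to the device) is paid once per
    batch, so it is divided across all queries in that batch.
    """
    return fixed_overhead_ms / batch_size + compute_per_query_ms


def first_query_latency_ms(batch_size, fixed_overhead_ms=2.0,
                           compute_per_query_ms=0.1, arrival_interval_ms=0.5):
    """Latency seen by the first query to arrive in a batch.

    It waits for the remaining (batch_size - 1) queries to arrive, then for
    the fixed overhead, then for the whole batch to be computed.
    """
    wait_to_fill_ms = (batch_size - 1) * arrival_interval_ms
    return wait_to_fill_ms + fixed_overhead_ms + batch_size * compute_per_query_ms


if __name__ == "__main__":
    for batch_size in (1, 5, 20, 50):
        print(
            f"batch={batch_size:3d}  "
            f"cost/query={amortized_cost_ms(batch_size):.2f} ms  "
            f"first-query latency={first_query_latency_ms(batch_size):.2f} ms"
        )
```

Under these assumed numbers, the per-query cost falls as the batch grows while the first query’s latency climbs, which is the tension Figure 2 describes. A design that is already efficient at a batch size of one sidesteps the trade-off.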

Microsoft acquires Bonsai for edge inference software

As I mentioned earlier, Microsoft also announced its acquisition of privately held, Berkeley-based Bonsai to ease the burden of training sophisticated neural networks. Bonsai, in which Microsoft had previously invested, is a pioneer in reinforcement learning for developing edge autonomous systems. These systems reduce the need for the application domain scientist to understand the rigors of traditional DNN training models. Microsoft Research is a leader in reinforcement learning, so this acquisition does not surprise me. It strengthens Microsoft’s portfolio of tools for AI development.

Conclusions

I have covered Microsoft’s AI efforts several times before, and as followers know, I am pretty impressed with the depth and breadth of the company’s AI research and offerings. Microsoft now provides some 29 APIs and pre-trained DNNs on Azure to simplify AI development in the enterprise and has augmented its productivity application portfolio with useful AI tools. Looking at what Microsoft has achieved with pervasive FPGAs across its data centers, one can see why the company is betting on these soft (reconfigurable) ASICs to accelerate many different AI workloads in its cloud and internal infrastructure. This speaks not only to Microsoft’s competitive advantages but also to the growth opportunity in front of Intel and its competitor Xilinx for future data center deployments.