08 Nov Arm Adds More AI Firepower

As I covered in a recent article on the status of startups building chips for AI, the AI accelerator market will become quite crowded over the next 12-18 months. Not to be outdone, Arm is now extending its neural network processor family, adding logic designs for mid-range and entry-level devices alongside the flagship Ethos-N77, which targets higher-end smartphones. Companies that need AI logic for their own Arm-based mobile and embedded chips now have three Ethos designs to consider, each offering good performance in a small, very low-power footprint.

The question I often field, however, is whether Arm is too late to the party, since large Arm partners such as Qualcomm have already designed their own AI accelerator engines. The bottom line, in my opinion, is that the AI chip industry is only in the second inning, and there’s lots of game left to play. However, Arm will face partners who are already competitors in the AI design space. Let’s look at Arm’s design, which of course is licensable IP, not a chip manufactured by Arm.

The Arm Ethos design

Since many larger mobile chip designers have already added an AI engine to their application and modem SoCs, it looks like Arm decided to build IP for the rest of the market, offering blocks those partners can combine with their own or other licensed IP. The Ethos design comes complete with a scalable MAC Engine for the math, a Programmable Layer Engine (PLE) for activations and vector processing, and a Network Control Unit (NCU) to manage the neural network workflow. The MAC + PLE pair is a scalable unit, available in 4, 8 and 16 blocks in the Ethos-N37, N57 and N77 IP. The company projects that these designs, clocked at 1GHz, will attain 1, 2 and 4 trillion operations per second (TOPS) respectively. The MAC supports the 8- and 16-bit integer math now commonly used for efficient inference processing. Importantly, the PLE and the NCU do the work often handled by an application CPU, so this design should deliver low latency at low power and low cost.
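To make the scaling arithmetic concrete, here is a minimal back-of-the-envelope sketch in Python. It takes only the block counts and projected TOPS quoted above and works backward to the per-block throughput they imply. The convention that one multiply-accumulate counts as two operations is my assumption, not something Arm states here, and the function name is purely illustrative.

```python
# Figures quoted in the article: MAC + PLE blocks per design and
# projected throughput (TOPS) at a 1 GHz clock.
CLOCK_HZ = 1e9
DESIGNS = {
    "Ethos-N37": {"blocks": 4, "tops": 1},
    "Ethos-N57": {"blocks": 8, "tops": 2},
    "Ethos-N77": {"blocks": 16, "tops": 4},
}

# Assumption (not from the article): one multiply-accumulate is
# counted as two operations, a common convention for TOPS figures.
OPS_PER_MAC = 2


def implied_macs_per_block_per_cycle(blocks, tops, clock_hz=CLOCK_HZ):
    """Work backward from a TOPS figure to MACs per block per clock cycle."""
    ops_per_second = tops * 1e12
    ops_per_cycle = ops_per_second / clock_hz
    return ops_per_cycle / OPS_PER_MAC / blocks


for name, design in DESIGNS.items():
    macs = implied_macs_per_block_per_cycle(design["blocks"], design["tops"])
    print(f"{name}: ~{macs:.0f} MACs per block per cycle")
```

Under that assumption, all three designs work out to the same per-block figure, which is consistent with throughput scaling linearly with the number of blocks. Because the TOPS numbers are rounded, the true per-block figure may differ slightly.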

To help solve the memory bottleneck AI chips face, these designs include on-die SRAM, from 512KB to 4MB, reducing traffic to external memory. In addition, Arm equipped the designs with on-the-fly compression and memory management to further reduce memory consumption. Nice! Finally, the design supports scaling out to handle larger jobs, interconnecting up to 16 Ethos processors.
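As a rough illustration of why on-die SRAM size and integer precision matter, the sketch below checks whether a model's weights would fit in the smallest and largest SRAM sizes quoted above. The 3-million-parameter model is hypothetical, and the calculation deliberately ignores activations, compression and any other use of the SRAM, so treat it as a simplification rather than a sizing tool.

```python
# SRAM range quoted in the article, in bytes.
SRAM_MIN_BYTES = 512 * 1024        # 512KB
SRAM_MAX_BYTES = 4 * 1024 * 1024   # 4MB

# Hypothetical model size, chosen only for illustration.
HYPOTHETICAL_PARAMS = 3_000_000


def weight_bytes(num_params, bits_per_weight):
    """Storage needed for the weights alone, with no compression."""
    return num_params * bits_per_weight // 8


def fits_on_die(num_params, bits_per_weight, sram_bytes):
    """True if the uncompressed weights fit entirely in the given SRAM."""
    return weight_bytes(num_params, bits_per_weight) <= sram_bytes


for bits in (8, 16):
    size = weight_bytes(HYPOTHETICAL_PARAMS, bits)
    for label, sram in (("512KB", SRAM_MIN_BYTES), ("4MB", SRAM_MAX_BYTES)):
        verdict = "fits" if fits_on_die(HYPOTHETICAL_PARAMS, bits, sram) else "does not fit"
        print(f"{bits}-bit weights ({size:,} bytes) in {label} SRAM: {verdict}")
```

In this simplified picture, the hypothetical model fits in 4MB only at 8-bit precision; anything that does not fit must be streamed from external memory, which is the traffic the on-die SRAM and compression are meant to reduce.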

Figure 1: All three Ethos products feature the same rich set of power-efficient and scalable logic.
 ARM

To help partners building chips for a wide variety of applications needing AI, Arm now offers IP across a fairly wide spectrum, all supported by the same software stack based on Arm NN. The devices Arm envisions for these AI engines cover a broad set of popular AI tasks.

Figure 2: The Ethos N37, 57, and 77 cover the broad range of applications, devices, and price points
ARM

Frankly, I’ve been wondering when Arm would come out with a competitive series of designs for AI acceleration; there is no doubt that it is late to the party and that all mobile SoCs need AI. But as I mentioned, this party is just getting started. AI is no longer a “nice to have” option: every smartphone will need to perform computational photography, language translation and voice recognition, even at the low end. The Ethos design is well thought out, with important features that will help ensure high performance and low battery consumption for these computationally intensive tasks.

Arm’s success here will not come without challenges. I would point out that Arm’s apparent delay created an opportunity for companies like Qualcomm, whose Snapdragon 855 delivers 7 TOPS today and powers the Google Pixel 4, among others. The amazing photography everyone is raving about in the Pixel 4, and the ability to perform voice processing without network connectivity, are both enabled by Google software running on the Hexagon AI Engine.