Giuseppe Desoli, PhD, ST Company Fellow
He holds master's and PhD degrees in electrical engineering from the University of Genoa. From 1995 to 2002 he worked for Hewlett-Packard Laboratories in the US, developing microprocessor architectures, compilers, and tools. He is one of the original architects of the ST200 family of VLIW embedded processors, later integrated into most of ST's multimedia products. In 2002 he joined STMicroelectronics as an R&D Director and lead architect, continuing to work on microprocessor architectures and pioneering multiprocessor systems for embedded SoCs for set-top boxes and home gateways. Since 2012 he has served as Chief Architect for the System Research & Application (SRA) central R&D group, responsible for developing HW AI architectures and tools for edge applications that are being integrated into multiple ST products. Since 2015 he has pioneered the development and deployment of HW-accelerated AI in STMicroelectronics for advanced deep-learning-based applications. He currently leads the SRA AI architecture team developing advanced AI HW digital IPs and tools supporting ST's product groups. He is one of the proponents, and the coordinator, of the advanced corporate R&D project on neuromorphic computing; he contributes to multiple Innovation Office initiatives, including ST's technology council; he chairs ST's fellows scientific committee, reporting to the corporation; and he coordinates the ST AI Affinity team. He has co-authored more than 70 scientific publications, holds more than 40 patents in the fields of microprocessor architectures, AI HW acceleration, algorithms, compilers, and tools, and has coordinated multiple EU-funded research projects.
Various memory technologies, including SRAM, PCM, RRAM, and MRAM, are being explored for efficient implementation of neural network accelerators using both digital and analog computing schemes. However, the analog approach can suffer from inaccuracies and resolution loss due to device variations and a low signal-to-noise ratio, while digital in-memory computing (IMC) can be preferred in some applications for its deterministic behavior and compatibility with technology scaling rules. In this paper, we present an architecture for a scalable, design-time-parametric Neural Processing Unit (NPU) for edge AI applications, utilizing digital SRAM IMC (DIMC) with 8T standard bitcells integrated into IMC tiles supporting 1-, 2-, and 4-bit operation. The NPU is instantiated as multiple clusters with digital logic and is driven by a custom tensor-slicing, graph-optimizing compiler aided by advanced data-mover HW engines. A prototype System-on-Chip (SoC) has been manufactured in 18nm FD-SOI technology and is capable of operating at low Vdd. The end-to-end system-level energy efficiency achieved on representative neural network benchmarks ranges from 40 to 310 TOPS/W. Additionally, we present efficient mappings and performance results for several relevant applications of this technology to ultra-low-power use cases for battery-operated devices relying on advanced AI algorithms.
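The 1-, 2-, and 4-bit tile operation can be illustrated with a minimal sketch of how digital IMC designs commonly compose multi-bit multiply-accumulates from single-bit bitcell reductions: each tile performs an elementwise AND followed by a population count on one bit plane, and partial sums are shift-added across planes. The function names and decomposition here are illustrative assumptions, not the actual ST design:

```python
# Hedged sketch of bit-serial digital IMC (DIMC) multiply-accumulate.
# Assumption: each 8T-bitcell tile computes a 1-bit AND + popcount;
# higher precisions (2b/4b) are built by shift-adding bit-plane
# partial sums. This is a generic DIMC pattern, not ST's implementation.

def imc_tile_popcount(act_bits, weight_bits):
    """1-bit tile operation: elementwise AND followed by popcount."""
    return sum(a & w for a, w in zip(act_bits, weight_bits))

def dimc_dot(activations, weights, a_bits=4, w_bits=4):
    """Unsigned dot product computed bit-serially over both operands.

    Decomposes each operand into bit planes, reduces each plane pair
    with the 1-bit tile op, and shift-adds the partial sums:
    sum(a*w) = sum_{i,j} popcount(a_plane_i & w_plane_j) << (i + j).
    """
    acc = 0
    for i in range(a_bits):
        a_plane = [(a >> i) & 1 for a in activations]
        for j in range(w_bits):
            w_plane = [(w >> j) & 1 for w in weights]
            acc += imc_tile_popcount(a_plane, w_plane) << (i + j)
    return acc
```

With 4-bit operands, `dimc_dot([3, 1], [2, 4])` reproduces the exact integer dot product `3*2 + 1*4 = 10`, showing why the digital scheme is deterministic: unlike analog charge- or current-domain summation, every partial sum is an exact popcount, so precision is set at design time rather than limited by device variation.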
Session: Memory for Edge Computing (Thursday PM)
AI R&D Director & Company Fellow, STMicroelectronics
A digital SRAM In-Memory Computing Multi-Tiled Neural Processing Unit in 18nm FD-SOI with 40-310 TOPS/W and its use cases for ultra-low-power inference applications