Title: Efficient large-scale deep learning on the SpiNNaker2 Neuromorphic Supercomputer
Duration: 01.07.2022 – 30.06.2025
Research Area: Methods and Hardware for Neuro-Inspired Computing
The SpiNNcloud is a neuro-inspired supercomputer based on 5 million low-power ARM microcontrollers that operate asynchronously and communicate in a packet-based fashion. It is by far the largest brain-inspired computer worldwide and offers unique potential for energy-efficient real-time AI processing. In this project, we want to develop and implement scalable deep learning models on the SpiNNaker2 supercomputer. In addition to vanilla models such as CNNs, ResNets, and Transformers, neuro-inspired models will be developed that, for example, make use of structural and temporal sparsity. Such sparsity makes these models much easier to implement on the SpiNNaker2 system and avoids communication bottlenecks through scalable event-based communication. The efficient deep learning models will be benchmarked against standard machine learning hardware.
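To illustrate the temporal-sparsity idea, the following minimal Python sketch transmits a packet only for units whose activation changed noticeably since the last time step; the function name and threshold are illustrative assumptions, not part of the actual SpiNNaker2 software stack.

```python
import numpy as np

def delta_events(x_prev, x_curr, threshold=0.1):
    """Emit (index, value) events only for units whose activation changed
    by more than `threshold` since the last time step (illustrative only)."""
    changed = np.abs(x_curr - x_prev) > threshold
    idx = np.nonzero(changed)[0]
    return list(zip(idx.tolist(), x_curr[idx].tolist()))

rng = np.random.default_rng(0)
x_prev = rng.normal(size=8).astype(np.float32)
# Mostly small changes, so most units stay silent and send no packet.
x_curr = x_prev + rng.normal(scale=0.05, size=8).astype(np.float32)
events = delta_events(x_prev, x_curr)
print(f"{len(events)} of 8 units send a packet:", events)
```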
Further, a sustainable workflow will be established to ensure reproducible and scalable research on the SpiNNcloud for a wide user community.
The aim is to develop and implement state-of-the-art deep learning models on the SpiNNcloud supercomputer while requiring less energy than conventional AI hardware such as NVIDIA GPUs. This is to be shown for various architectures such as convolutional neural networks, recurrent neural networks, and transformers. Technically, the aim is to develop modified DNN models that are suitable for the event-based SpiNNaker2 platform and to implement a scalable software stack for mapping these DNN models onto the hardware.
The brain processes information orders of magnitude more efficiently than AI models on GPUs. Yet brain-inspired models such as spiking neural networks lag behind DNNs in terms of task performance. The challenge is to bring concepts from neuroscience into DNNs to achieve both high performance and energy efficiency. Another challenge is to quantize the models to 8-bit integers for efficient inference with the SpiNNaker2 DNN accelerators.
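As a concrete illustration, symmetric post-training quantization maps float weights to int8 with a single scale factor. This is a minimal sketch under an assumed per-tensor scheme and does not reflect the exact quantization pipeline used for the SpiNNaker2 accelerators.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor post-training quantization to int8.
    Returns quantized weights and the scale needed to dequantize."""
    scale = np.max(np.abs(w)) / 127.0  # assumes w is not all zeros
    w_q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return w_q, scale

w = np.random.default_rng(1).normal(size=(4, 4)).astype(np.float32)
w_q, scale = quantize_int8(w)
err = np.max(np.abs(w_q.astype(np.float32) * scale - w))
print("max abs quantization error:", err)
```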
Event-based gated recurrent units (EGRU) were developed in collaboration with D. Kappel (RUB) and A. Subramoney (RHUL) [Subramoney et al. 2023]. EGRU models were implemented on SpiNNaker2 for language modelling and event-based vision, requiring 18x less energy than an NVIDIA A100 GPU [Nazeer et al. 2024].
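The core idea behind EGRU is a GRU-like internal state that emits an output event only when it crosses a threshold, so communication and downstream computation are skipped for silent units. The sketch below is a deliberately simplified illustration of that mechanism, not the exact model; for the full equations see Subramoney et al. 2023.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def egru_like_step(c, x, Wu, Wz, theta=0.5):
    """One step of a GRU-like unit with thresholded event output
    (simplified illustration of the EGRU idea)."""
    u = sigmoid(x @ Wu)            # update gate
    z = np.tanh(x @ Wz)            # candidate state
    c = u * z + (1.0 - u) * c      # gated state update
    events = c * (c > theta)       # output only where the state exceeds theta
    c = c - events                 # reset units that fired
    return c, events

rng = np.random.default_rng(2)
c = np.zeros(16)
Wu = rng.normal(scale=0.5, size=(8, 16))
Wz = rng.normal(scale=0.5, size=(8, 16))
for t in range(5):
    c, ev = egru_like_step(c, rng.normal(size=8), Wu, Wz)
    print(f"t={t}: {np.count_nonzero(ev)}/16 units emitted an event")
```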
We use the SpiNNcloud, a massively parallel neuro-inspired supercomputer with highly efficient DNN accelerators for int8 operations. To increase efficiency, we leverage neuro-inspired concepts like event-based communication and structural sparsity. Post-training quantization and quantization-aware training will help to prepare DNN models for efficient inference on SpiNNaker2. State-of-the-art approaches from ML compilers are adopted for the efficient mapping of the models to SpiNNaker2.
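Complementary to post-training quantization, quantization-aware training inserts a quantize-dequantize ("fake quantization") step into the forward pass so the network learns weights that survive int8 rounding. The following is a minimal sketch of the forward-pass operation only; in an actual training loop a straight-through estimator would pass gradients through the rounding unchanged.

```python
import numpy as np

def fake_quant(w, num_bits=8):
    """Quantize-dequantize in the forward pass, as used in
    quantization-aware training. Backprop would use a straight-through
    estimator, treating this op as the identity for gradients."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for int8
    scale = np.max(np.abs(w)) / qmax        # assumes w is not all zeros
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

rng = np.random.default_rng(3)
w = rng.normal(size=(4, 4)).astype(np.float32)
x = rng.normal(size=(2, 4)).astype(np.float32)
y = x @ fake_quant(w)  # the network trains against int8-rounded weights
print(y.shape)
```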
After implementing standard models such as ResNets and Transformers on single-chip SpiNNaker2 systems, we aim to scale up the network models to run on multiple 48-chip server boards. In addition, more efficient transformer models for natural language processing shall be developed and deployed on the SpiNNcloud supercomputer.
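Scaling to multi-board systems requires partitioning each layer across chips. The sketch below shows only a naive, purely illustrative even split; the actual mapping software stack must additionally respect per-chip memory limits and minimize inter-chip packet traffic.

```python
import numpy as np

def partition_layer(num_neurons, num_chips=48):
    """Evenly assign a layer's neurons to the chips of one 48-chip board
    (naive illustration, ignoring memory and traffic constraints)."""
    return np.array_split(np.arange(num_neurons), num_chips)

parts = partition_layer(4096)
print(len(parts), "chips,", len(parts[0]), "neurons on chip 0")
```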