RIKEN Center for Computational Science, High Performance Big Data Research Team


DL4Fugaku Project

Overview

DL4Fugaku Project [Slides]

Large-scale deep learning has emerged as an essential machine learning approach for research challenges such as image classification and speech recognition. Fast, large-scale deep learning lets us train neural networks on more training data in less time. The supercomputer Fugaku is expected to deliver high performance for deep learning: A64FX, the general-purpose processor used in Fugaku, provides high-speed half-precision floating-point (FP16) and 8-bit integer (INT8) operations for matrix multiplications, together with high-bandwidth HBM2 memory (1,024 GB/s) for convolutions. In addition, Fugaku employs the next-generation Tofu interconnect D (TofuD), which accelerates gradient reduction operations. However, to exploit the hardware performance of Fugaku/A64FX, the entire software stack must be tuned, from deep learning frameworks down to the low-level numerical libraries they use.
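As a rough illustration of why both of these hardware features matter, the following sketch shows a single data-parallel training step in which each rank computes local gradients with FP16 matrix products and the gradients are then summed across ranks with an MPI Allreduce, the kind of reduction that TofuD is designed to accelerate. This is not DL4Fugaku code; it only assumes NumPy and mpi4py, and the layer shapes, learning rate and variable names are made up for the example.

```python
# Illustrative sketch (not DL4Fugaku code): one data-parallel SGD step.
# FP16 matrix products stand in for the half-precision compute path on A64FX;
# the Allreduce stands in for gradient reduction over the interconnect.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD

# Toy "layer": weights and a rank-local mini-batch, kept in FP16.
rng = np.random.default_rng(seed=comm.Get_rank())
weights = np.ones((256, 128), dtype=np.float16)
inputs = rng.standard_normal((32, 256)).astype(np.float16)
targets = rng.standard_normal((32, 128)).astype(np.float16)

# Forward pass and a simple squared-error gradient, all FP16 matmuls.
outputs = inputs @ weights
error = outputs - targets
local_grad = (inputs.T @ error) / inputs.shape[0]

# Sum gradients over all ranks; accumulate in FP32 for numerical safety,
# then average. This collective is where interconnect bandwidth matters.
global_grad = np.empty_like(local_grad, dtype=np.float32)
comm.Allreduce(local_grad.astype(np.float32), global_grad, op=MPI.SUM)
global_grad /= comm.Get_size()

# Plain SGD update applied identically on every rank.
weights -= (0.01 * global_grad).astype(np.float16)

if comm.Get_rank() == 0:
    print("gradient norm:", np.linalg.norm(global_grad))
```

Run with, for example, `mpiexec -n 4 python sgd_step.py`; production frameworks perform the same pattern at much larger scale and overlap the reduction with computation.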

To achieve fast and scalable deep learning on Fugaku, we launched a new project, DL4Fugaku (Deep Learning for Fugaku). The goals of the project are (1) performance analysis and tuning of deep learning frameworks and of the low-level numerical libraries they use; (2) reliable deployment of large-scale deep learning environments; and (3) improved usability for production use on Fugaku. The DL4Fugaku project team consists of PIs and researchers from the Application Development Unit, the High-Performance AI System Research Team, the High-Performance Big Data Research Team and the Large-Scale Parallel Numerical Computing Technology Research Team, in collaboration with partners from industry, academia and government: AIST, Arm, Cybozu, Fujitsu Laboratories, Fujitsu Limited, Linaro and Tokyo Tech. To streamline logistics and accelerate software development, RIKEN R-CCS signed an MOU with Fujitsu Ltd. for further collaboration on the DL4Fugaku project.

News

Repositories