Overview

Large-scale deep learning has become an essential machine learning approach for many research challenges, such as image classification and speech recognition. Fast, large-scale deep learning lets us train neural networks on more training data in less time. Supercomputer Fugaku is expected to deliver high-performance computing for deep learning because A64FX, the general-purpose processor used in Fugaku, provides high-speed half-precision floating-point (FP16) and 8-bit integer (INT8) operations for matrix multiplication and high-bandwidth HBM2 memory (1,048 GB/sec) for convolutions. In addition, Fugaku employs the next-generation Tofu interconnect (TofuD) for gradient reduction operations. However, exploiting the hardware performance of Fugaku/A64FX requires tuning the entire software stack, from deep learning frameworks down to low-level numerical libraries.
To achieve fast and scalable deep learning on Fugaku, we launched a new project, DL4Fugaku (Deep Learning for Fugaku). The goals of the project are (1) performance analysis and tuning of deep learning frameworks and the low-level numerical libraries they use; (2) reliable deployment of large-scale deep learning environments; and (3) improved usability for production use on Fugaku. We organized the DL4Fugaku project team from PIs and researchers in the application development unit, the high-performance AI system research team, the high-performance big data research team, and the large-scale parallel numerical computing technology research team, in collaboration with partners from industry, academia, and government: AIST, ARM, Cybozu, Fujitsu Laboratories, Fujitsu Limited, Linaro, and Tokyo Tech. To facilitate logistics and accelerate software development, RIKEN R-CCS signed an MOU with Fujitsu Ltd. for further collaboration on the DL4Fugaku project.
News
- (May 22, 2023) Tokyo Tech, Tohoku University, Fujitsu and RIKEN start collaboration to develop distributed training of Large Language Models
https://www.titech.ac.jp/english/news/2023/066798
- (May 22, 2023) Development of distributed parallel training methods for large language models under the Supercomputer Fugaku policy-response allocation | RIKEN (in Japanese)
https://riken.jp/pr/news/2023/20230522_2/index.html
- (November 16, 2022) Morpho provides "SoftNeuro®" to a project led by the University of Tokyo, Tohoku University, and Kobe University to accelerate high-resolution galaxy formation simulations using deep-learning prediction of supernova shell expansion: supporting deep-learning-based 3D simulation on Supercomputer Fugaku (in Japanese)
https://www.morphoinc.com/news/20221116-jpr-sn
- (March 7, 2022) Fujitsu and Tokyo Medical and Dental University leverage world’s fastest supercomputer and AI technology for scientific discovery to shed light on drug resistance in cancer treatment
https://www.fujitsu.com/global/about/resources/news/press-releases/2022/0307-01.html
- (March 3, 2022) Fujitsu leverages world’s fastest supercomputer and AI tech in joint field trial for safe and efficient tsunami evacuations in Kawasaki City
https://www.fujitsu.com/global/about/resources/news/press-releases/2022/0303-01.html
- (December 9, 2021) World’s Largest Scale Deep Learning on Supercomputer Fugaku Achieved World’s Highest Performance
https://blog.fltech.dev/entry/2021/12/09/mlperfhpcv1-fugaku-en
- (November 26, 2021) We ran the world’s largest-scale deep learning on Fugaku and took first place in the world (in Japanese)
https://blog.fltech.dev/entry/2021/11/26/mlperfhpcv1-fugaku
- (November 18, 2021) Fujitsu and RIKEN Claim 1st Place for MLPerf HPC Benchmark with Supercomputer Fugaku: World’s fastest performance for the number of deep learning models trained per time unit for CosmoFlow, a key machine learning processing benchmark
https://www.fujitsu.com/global/about/resources/news/press-releases/2021/1118-02.html
- (November 18, 2021) Supercomputer Fugaku takes first place worldwide in the MLPerf HPC machine learning processing benchmark: world’s fastest training per unit time for the CosmoFlow deep learning model (in Japanese)
https://pr.fujitsu.com/jp/news/2021/11/18-1.html
- (September 30, 2021) “The 16,384-node Parallelism of 3D-CNN Training on An Arm CPU based Supercomputer” has been accepted at HiPC 2021
https://hipc.org/accepted-papers/
- (September 30, 2021) “MLPerf HPC: Benchmarking Machine Learning Workloads on HPC Systems” has been accepted at the MLHPC2021@SC21 Workshop
https://ornl.github.io/MLHPC/index.html
- (November 19, 2020) Fujitsu, AIST, and RIKEN Achieve Unparalleled Speed on the MLPerf HPC Machine Learning Processing Benchmark Leveraging Leading Japanese Supercomputer Systems
https://www.fujitsu.com/global/about/resources/news/press-releases/2020/1119-02.html
- (November 19, 2020) Top-level speed achieved on the MLPerf HPC machine learning processing benchmark (in Japanese)
https://pr.fujitsu.com/jp/news/2020/11/19-1.html
- (November 18, 2020) A deep dive into the deep learning library for the Fugaku CPU A64FX: researchers recount the path of its development (in Japanese)
https://blog.fltech.dev/entry/2020/11/18/fugaku-onednn-deep-dive-ja
- (November 18, 2020) The untold story behind the birth of "Xbyak_aarch64", the JIT compiler that supports deep learning processing on Fugaku (in Japanese)
https://gihyo.jp/news/interview/2020/11/1801
- (November 18, 2020) The Fugaku version of Xbyak is merged into oneDNN, the Intel deep learning library (in Japanese)
https://blog.cybozu.io/entry/xbyak_for_fugaku
- (November 11, 2020) HPC and AI Initiatives for Supercomputer Fugaku and Future Prospects
https://www.fujitsu.com/global/about/resources/publications/technicalreview/2020-03/article09.html
- (October 13, 2020) "HPC and AI Initiatives for Supercomputer Fugaku and Future Prospects" (in Japanese)
https://www.fujitsu.com/jp/about/resources/publications/technicalreview/2020-03/article09.html
- (November 26, 2019) Signed a memorandum of understanding with Fujitsu Limited toward building an AI (artificial intelligence) framework on Fugaku (in Japanese)
https://www.r-ccs.riken.jp/outreach/topics/191126-2/
Repositories
- GitHub repository: DL for Fugaku
https://github.com/dl4fugaku
- GitHub repository: Fujitsu (Xbyak_aarch64, Xbyak_translator_aarch64, etc.)
https://github.com/fujitsu