Research Outline
High Performance Big Data Research Team
The High Performance Big Data Research Team at RIKEN R-CCS researches and develops system software for large-scale HPC (High Performance Computing) systems such as the supercomputer Fugaku. In particular, we study state-of-the-art techniques for the convergence of HPC, AI, and big data technologies, as well as fundamental R&D in HPC. To this end, we develop system software to accelerate deep learning and big data processing on large-scale HPC systems (HPC for AI/BD). We also apply AI and big data processing techniques to resolve technical challenges in large-scale HPC systems, and we study techniques for designing next-generation large-scale HPC systems. Research topics include (but are not limited to):
(1) Scalability and acceleration of big data processing with massively parallel I/O by making use of next-generation storage and file systems;
(2) Development of massively parallel algorithms and programming models for next-generation non-volatile memory and deeply hierarchical memory/storage architectures;
(3) Research and development for big data collection, transfer, accumulation, management, and utilization for data science;
(4) Development of system software for integrating applications (Society 5.0 simulation as well as system operation), artificial intelligence training, and training data collection;
(5) Scalability and acceleration of large-scale deep learning and inference;
(6) Scalability and acceleration of high-reliability technologies such as checkpointing with large-scale I/O;
(7) Architecture exploration for the development of next-generation large-scale computers;
(8) Development of tools to support application development and execution environments for large-scale computers;
(9) Any other research and development related to high performance computing.
Data Management Platform Development Unit, AI for Science Platform Division
In the TRIP-AGIS project, R-CCS has been developing advanced high-performance computing systems to promote AI for Science research. To develop and utilize generative AI models for scientific research (scientific foundation models) that handle a variety of multimodal data, we have been conducting performance analysis and developing system software to improve the performance of the supercomputer “Fugaku” and AI-dedicated systems equipped with GPUs. Our unit develops a data management infrastructure for the efficient development and utilization of scientific foundation models. In addition, in conjunction with the automation technology developed in TRIP-AGIS, our unit will conduct research and development of fundamental data technologies to enable real-time processing of enormous amounts of diverse data. This will enable high-speed training and inference cycles, with the aim of accelerating the development and utilization of scientific foundation models. Specifically, we have been conducting the following research and development:
(1) Optimization of data placement for training and inference utilizing a hierarchical memory/storage system;
(2) Research and development of high-performance and scalable fault-tolerant techniques for large-scale model training and inference;
(3) Research and development of data compression techniques to enhance data communication/transfer, management, and model training and inference;
(4) Research and development of workflow systems to improve the efficiency of model training, fine-tuning, inference, and utilization;
(5) Other research topics to promote AI for Science research.
Advanced HPC Technologies Development Unit, Next-Generation HPC Infrastructure Development Division
Advanced computing infrastructure plays a crucial role in realizing Society 5.0 and the Sustainable Development Goals (SDGs). To that end, it is essential to accelerate the digital transformation (DX) of research, beyond the capabilities of next-generation computing systems, by integrating digital twin technologies with AI-driven simulations. This integration enables seamless collaboration between deductive simulations, inductive simulations, and large-scale data analysis, thereby driving scientific and technological innovation. With the end of Moore’s Law, achieving higher performance by extending traditional computing infrastructure has become increasingly challenging. To overcome this, new architectures that significantly enhance performance and power efficiency are indispensable. In particular, the key lies in moving beyond FLOPS-centric design to optimize data movement and maximize energy efficiency. The Advanced HPC Technologies Development Unit conducts research and development, including feasibility studies, on essential technologies for building advanced computing infrastructure. On the architectural side, we aim to develop massively parallel and highly scalable computing architectures by integrating cutting-edge technologies such as high-density 3D-stacked memory, chip-to-chip optical communication using silicon photonics, and quantum-HPC hybrid computing. On the system software side, our focus is on pioneering system software technologies that leverage AI across a broad range of fields to enhance usability, optimize performance, and enable AI-driven operations (e.g., resource management and data utilization platforms). Additionally, we promote co-design with applications to fully leverage the potential of advanced computing infrastructure.
Furthermore, in collaboration with industry and international research institutions, we advance feasibility studies on versatile and sustainable advanced computing infrastructure, laying the foundation for the computing platforms that will support future society.
Introduction Video
Projects
- November 26, 2019 DL4Fugaku Project