Preferred Networks (PFN) has developed the Chainer™ open-source deep learning framework and has been working to build large-scale clusters that support its research and development activities with the aim of applying deep learning technology in the real world.
To promote this initiative further, PFN is developing MN-Core™, a processor dedicated to the acceleration of deep learning research.
High-speed computing is one of the big challenges in deep learning, which requires an enormous amount of computation.
MN-Core is optimized for the training phase of deep learning. Unlike a general-purpose accelerator, it achieves high processing performance by limiting itself to a minimal set of functions. In addition to this minimal design, PFN’s proprietary MN-Core has a dedicated circuit for matrix operations, which are essential to deep learning, to make deep learning much faster.
Performance per watt is becoming increasingly important in processor development, mainly because cooling capacity is reaching its limit. MN-Core is expected to achieve 1 TFLOPS/W in half-precision floating point, which would place it among the best in the world for performance per watt.
|Estimated power consumption (W)||500|
|Peak performance (TFLOPS)||32.8 (DP) / 131 (SP) / 524 (HP)|
|Estimated performance per watt (TFLOPS / W)||0.066 (DP) / 0.26 (SP) / 1.0 (HP)|
(Notes) DP: double precision, SP: single precision, HP: half precision.
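The performance-per-watt row follows directly from the other two rows: peak performance divided by estimated power consumption. A minimal sketch of that arithmetic, using only the figures from the table above (the variable and function names are illustrative, not part of any PFN software):

```python
# Figures copied from the table above; names are illustrative only.
POWER_W = 500  # estimated power consumption in watts
PEAK_TFLOPS = {"DP": 32.8, "SP": 131.0, "HP": 524.0}


def perf_per_watt(peak_tflops: float, power_w: float) -> float:
    """Return performance per watt in TFLOPS/W."""
    return peak_tflops / power_w


efficiency = {
    precision: perf_per_watt(peak, POWER_W)
    for precision, peak in PEAK_TFLOPS.items()
}

for precision, value in efficiency.items():
    print(f"{precision}: {value:.4f} TFLOPS/W")
```

The unrounded quotients are 0.0656, 0.262, and 1.048 TFLOPS/W, which the table reports as 0.066, 0.26, and 1.0.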
MN-Core's hardware architecture packs matrix arithmetic units (MAUs) at very high density.
The simple architecture – entirely SIMD with no conditional branches – can process a large amount of data all at once.
An MAU and four processor elements (PEs) form one matrix arithmetic block (MAB) where the PEs provide data to the MAU.
Each PE has an integer arithmetic unit, and operations frequently used in deep learning are also implemented in hardware.
A total of 2,048 MABs are integrated into one package, which comprises four dies with 512 MABs each. The MABs are arranged hierarchically and support multiple modes for moving data between levels of the hierarchy, such as scatter, gather, broadcast, and reduction. This enables flexible programming.
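The unit counts above can be checked with a few multiplications, and the four data movement modes can be illustrated in plain Python. This is only a sketch: the counts come from the text, but the functions below use the generic textbook meaning of scatter, gather, broadcast, and reduction on lists. They are hypothetical helpers and do not represent MN-Core's actual instruction set or programming interface.

```python
# Counts stated in the text.
MABS_PER_DIE = 512
DIES_PER_PACKAGE = 4
PES_PER_MAB = 4
MAUS_PER_MAB = 1

total_mabs = MABS_PER_DIE * DIES_PER_PACKAGE   # MABs per package
total_pes = total_mabs * PES_PER_MAB           # PEs per package
total_maus = total_mabs * MAUS_PER_MAB         # MAUs per package


# Generic illustrations of the four modes (hypothetical helpers).
def broadcast(value, n_children):
    """Send the same value to every child."""
    return [value] * n_children


def scatter(values, n_children):
    """Split a list into equal contiguous chunks, one per child."""
    if len(values) % n_children != 0:
        raise ValueError("values must divide evenly among children")
    size = len(values) // n_children
    return [values[i * size:(i + 1) * size] for i in range(n_children)]


def gather(chunks):
    """Concatenate the children's chunks back into one list."""
    return [item for chunk in chunks for item in chunk]


def reduce_sum(partials):
    """Combine the children's partial results into a single value."""
    return sum(partials)


print(total_mabs, total_pes, total_maus)
print(scatter(list(range(8)), DIES_PER_PACKAGE))
```

Under these counts, one package holds 2,048 MAUs fed by 8,192 PEs.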
The MN-Core Board is a PCI Express board on which MN-Core is mounted. A specially designed heatsink with a blower fan keeps MN-Core from overheating so that it can deliver its full performance.
(Table) MN-Core Board
|Interface||PCI Express Gen3 x16|
|Memory size||32 GB|
|Power consumption||600 W (Estimated value)|
The MN-Core Server is a 7U rack-mount server designed to house four MN-Core Boards.
In addition to a high-performance CPU and large-capacity memory, it has a specially designed internal structure that, combined with 12 powerful built-in fans, air-cools the heat generated by the four MN-Core Boards.
With four MN-Core Boards, its computation speed per node is expected to be about 2 PFLOPS in half precision.
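The per-node figure is consistent with the per-chip peak performance quoted earlier. A minimal sketch of the calculation, assuming each MN-Core Board carries exactly one MN-Core package (the text implies this but does not state it outright); the names are illustrative:

```python
# Peak performance per MN-Core package, from the first table (TFLOPS).
PEAK_TFLOPS_PER_BOARD = {"DP": 32.8, "SP": 131.0, "HP": 524.0}
BOARDS_PER_SERVER = 4  # stated in the text

# Assumption: one MN-Core package per board, so board peak == package peak.
node_tflops = {
    precision: peak * BOARDS_PER_SERVER
    for precision, peak in PEAK_TFLOPS_PER_BOARD.items()
}
node_pflops_hp = node_tflops["HP"] / 1000  # 1 PFLOPS = 1000 TFLOPS

print(f"Half-precision peak per node: {node_pflops_hp:.3f} PFLOPS")
```

Four boards at 524 TFLOPS each give 2,096 TFLOPS, which matches the "about 2 PFLOPS" figure above.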
(Table) MN-Core Server
|Number of mounted MN-Cores||4 MN-Core Boards|
|CPU||Dual socket up to TDP 200W|
|Memory||DDR4 up to 2666MHz / Up to 3TB ECC 3DS LRDIMM, 1TB ECC RDIMM|
|Storage||Up to 24 SAS/SATA drive bays / 8x 2.5" SAS/SATA supported natively, 2x 2.5" NVMe supported natively|
|Power unit||4 × 2000 W (2+2 Redundant) Titanium Level|
|Size||H311mm, W437mm, D737mm (7U Rack-mountable)|
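As a rough sanity check, the power figures in the tables above can be compared against the power supply capacity. This sketch rests on several assumptions that the source does not state: that each board draws its estimated 600 W, that each CPU draws its maximum 200 W TDP, and that "2+2 Redundant" means two supplies carry the load while two stand by. The names are illustrative.

```python
# Figures from the tables above.
BOARD_POWER_W = 600   # estimated, per MN-Core Board
BOARDS = 4
CPU_TDP_W = 200       # upper limit per socket
CPU_SOCKETS = 2
PSU_W = 2000

# Assumption: in a 2+2 redundant setup, two supplies carry the full load.
ACTIVE_PSUS = 2

accelerator_w = BOARD_POWER_W * BOARDS   # power drawn by the boards
cpu_w = CPU_TDP_W * CPU_SOCKETS          # power drawn by the CPUs
budget_w = PSU_W * ACTIVE_PSUS           # capacity without the spares
headroom_w = budget_w - accelerator_w - cpu_w

print(f"Boards: {accelerator_w} W, CPUs: {cpu_w} W, "
      f"budget: {budget_w} W, headroom: {headroom_w} W")
```

Under these assumptions, the remaining headroom has to cover memory, storage, fans, and conversion losses.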
We have constructed the MN-3 computing cluster using MN-Core.
A research group led by Prof. Junichiro Makino of Kobe University has played a key role in developing the specifications for MN-Core. Thanks to their support, PFN was able to design and develop hardware backed by proven technology.
Emeritus Prof. Kei Hiraki of The University of Tokyo has also kindly provided guidance on the evaluation of the high-speed transmission board.
The development of MN-Core originated when PFN was entrusted with a public project by the New Energy and Industrial Technology Development Organization (NEDO) of Japan. In this project, PFN carried out research and development to create a processor in conjunction with members of Prof. Makino’s research team, such as Takayuki Muranushi and Miyuki Tsubouchi, both of RIKEN, Japan’s national research institute. The knowledge obtained through this project was fully utilized to design and develop MN-Core.
(From left: Prof. Makino and Prof. Hiraki. Photo provided by: Mari Inaba, an associate professor of The University of Tokyo)
* MN-Core™ and Chainer™ are the trademarks or the registered trademarks of Preferred Networks, Inc. in Japan and elsewhere.