Since 2011, the most powerful supercomputer systems ranked in the Top500 list have been hybrid systems composed of thousands of nodes that include CPUs and accelerators, such as Xeon Phi and GPUs. Programming and deploying applications on those systems is still a challenging task, due to the complexity of the systems and the need to mix several programming interfaces (MPI, CUDA, Intel Xeon Phi) in the same application.
This workshop aims to explore the state of the art in developing applications on massive accelerated HPC architectures, including practical issues of hybrid usage models with MPI, OpenMP, and other accelerator programming models. The idea is to publish novel work on the use of available programming interfaces (MPI, CUDA, Intel Xeon Phi) and tools for code development, application performance optimization, and application deployment on accelerated systems, as well as on the advantages and limitations of accelerated HPC systems. Experiences with real-world applications, including scientific computing, numerical simulations, healthcare, energy, data analysis, etc., are also encouraged.
Manycore challenge in Kyoto: What we learned from HPC programming with KNC
It is expected that manycore processors with wide SIMD mechanisms will play a key role in post-petascale and exascale computing. At the same time, it is also anticipated that high-performance programming for such processors is not easy, or more pessimistically, extremely tough. This talk presents our early experiments with our XC30 supercomputer, which has a Xeon Phi KNC processor in each node, focusing on our PIC-based plasma simulation code, whose complicated kernel can be considered one of the toughest examples of 10^2-scale multithreading and 8-way SIMD parallelization. Our preliminary performance results will encourage HPC application programmers, while the effort it took to achieve good performance should reveal the necessity of highly sophisticated programming tools and environments.
Hiroshi Nakashima is a professor at the Academic Center for Computing and Media Studies, the supercomputing center of Kyoto University, and its former director. He has worked in parallel computing for more than 30 years, designing parallel computers, implementing parallel programming languages and frameworks, and developing parallel applications and libraries for them.
Towards Next Generation Parallel Accelerated Computing with High Productivity
Accelerating devices such as GPUs, MICs, and FPGAs are among the most powerful computing resources, providing high performance-per-energy and performance-per-space ratios for a wide range of large-scale computational science. On the other hand, the complexity of programming that combines various frameworks such as CUDA, OpenCL, OpenACC, OpenMP, and MPI is growing and seriously degrades programmability and productivity.
We have been developing XcalableMP (XMP), a parallel programming language for distributed-memory architectures ranging from PC clusters to MPPs, and enhancing it to support accelerating devices in heterogeneous parallel processing systems. XMP is a PGAS language, and XMP-dev and XMP-ACC are its extensions for accelerating devices. We are also developing a new technology for direct inter-node GPU communication, the TCA (Tightly Coupled Accelerators) architecture network, spanning everything from special hardware to the applications covered by this concept.
In this talk, I will introduce our on-going project which vertically integrates all these components toward the new generation of parallel accelerated computing.
Prof. Taisuke Boku received his Master's and PhD degrees from the Department of Electrical Engineering at Keio University. After his career as an assistant professor in the Department of Physics at Keio University, he joined the Center for Computational Sciences (formerly the Center for Computational Physics) at the University of Tsukuba, where he is currently the deputy director, the HPC division leader, and the system manager of supercomputing resources. He has been working there for more than 20 years on HPC system architecture, system software, and performance evaluation on various scientific applications. During these years, he has played a central role in the system development of CP-PACS (ranked number one in the TOP500 in 1996), FIRST (hybrid cluster with gravity accelerator), PACS-CS (bandwidth-aware cluster), and HA-PACS (high-density GPU cluster), representative supercomputers in Japan. He also contributed to the system design of the K Computer as a member of the architecture design working group at RIKEN, and is currently a member of the operation advisory board of AICS, RIKEN. He received the ACM Gordon Bell Prize in 2011. His recent research interests include accelerated HPC systems and direct communication hardware/software for accelerators in HPC systems based on FPGA technology.
|13:00 - 13:45||Hiroshi Nakashima, Manycore challenge in Kyoto: What we learned from HPC programming with KNC|
|13:45 - 14:30||Taisuke Boku, Towards Next Generation Parallel Accelerated Computing with High Productivity|
|14:30 - 15:00||Yuki Sumiyoshi, Akihiro Fujii, Akira Nukada and Teruo Tanaka. Mixed Precision AMG method for Many Core Accelerators|
|15:00 - 15:30||Coffee Break|
|15:30 - 16:00||Mikiko Sato, Go Fukazawa, Akio Shimada, Atsushi Hori, Yutaka Ishikawa, Mitaro Namiki, Kazumi Yoshinaga and Yuichi Tsujita. Design of Multiple PVAS on InfiniBand Cluster System Consisting of Many-core and Multi-core|
|16:00 - 16:30||Ting-Hsuan Chien, Chia-Jung Chen and Rong-Guey Chang. An Adaptive Zero-Copy Strategy for Ubiquitous High Performance Computing|
|16:30 - 17:00||Jesus Carretero, Javier Garcia Blas, David E. Singh, Florin Isaila, Thomas Fahringer, Radu Prodan, George Bosilca, Alexey Lastovetsky, Christi Symeonidou, Horacio Perez-Sanchez and Jose M. Cecilia. Optimizations to enhance sustainability of MPI applications|
Topics
Areas of interest of the workshop include, but are not limited to:
Extended versions of selected distinguished papers accepted and presented at ESAA 2014, after further revision, will be published in a special issue of the International Journal of Computers & Electrical Engineering.
Paper submission guidelines
Contributors are invited to submit a full paper in English as a PDF document not exceeding 6 pages. The title page should contain an abstract of at most 100 words and five specific, topical keywords. The paper must be formatted according to the double-column ACM ICPS proceedings style. Preparing the contribution in LaTeX and submitting it in camera-ready format are strongly recommended. Style files can be found at www.acm.org/publications/icps-instructions/.
All contributions will be fully peer reviewed by the program committee. Registration for the EuroMPI/ASIA 2014 main conference is mandatory to attend the workshops.
The online paper submission system is open: https://www.easychair.org/conferences/?conf=esaa2014.