International Workshop on Enhancing Parallel Scientific Applications with Accelerated HPC (ESAA 2014)

to be held as part of EuroMPI/ASIA 2014, Kyoto, Japan, September 9-12, 2014


Since 2011, the most powerful supercomputer systems ranked in the Top500 list have been hybrid systems composed of thousands of nodes that include CPUs and accelerators, such as Xeon Phi and GPUs. Programming and deploying applications on those systems is still challenging, due to the complexity of the systems and the need to mix several programming interfaces (MPI, CUDA, Intel Xeon Phi) in the same application.

This workshop aims to explore the state of the art in developing applications for accelerated massive HPC architectures, including practical issues of hybrid usage models with MPI, OpenMP, and other accelerator programming models. The goal is to publish novel work on the use of available programming interfaces (MPI, CUDA, Intel Xeon Phi) and tools for code development, application performance optimization, and application deployment on accelerated systems, as well as on the advantages and limitations of accelerated HPC systems. Experiences with real-world applications, including scientific computing, numerical simulations, healthcare, energy, data analysis, etc., are also encouraged.

Invited speakers

Manycore challenge in Kyoto: What we learned from HPC programming with KNC

Hiroshi Nakashima

Manycore processors with wide-SIMD mechanisms are expected to play a key role in post-petascale and exascale computing. At the same time, it is also anticipated that high-performance programming for such processors will not be easy or, more pessimistically, will be extremely tough. This talk presents our early experiments with our XC30 supercomputer, which has a Xeon Phi KNC processor in each node, focusing on our PIC-based plasma simulation code, whose complicated kernel can be considered one of the toughest examples of 10^2-scale multithreading and 8-way SIMD parallelization. Our preliminary performance results will encourage HPC application programmers, while the effort we needed to achieve good performance should reveal the necessity of highly sophisticated programming tools and environments.

Hiroshi Nakashima is a professor at the Academic Center for Computing and Media Studies, the supercomputing center of Kyoto University, and its former director. He has worked in parallel computing for more than 30 years, designing parallel computers, implementing parallel programming languages and frameworks, and developing parallel applications and libraries for them.

Towards Next Generation Parallel Accelerated Computing with High Productivity

Taisuke Boku

Accelerating devices such as GPUs, MICs, or FPGAs are among the most powerful computing resources, providing high performance per unit of energy and per unit of space for a wide range of large-scale computational science. On the other hand, the complexity of programming that combines various frameworks such as CUDA, OpenCL, OpenACC, OpenMP, and MPI is growing and seriously degrades programmability and productivity.

We have been developing the XcalableMP (XMP) parallel programming language for distributed-memory architectures, from PC clusters to MPPs, and enhancing its capability to include accelerating devices for heterogeneous parallel processing systems. XMP is a kind of PGAS language, and XMP-dev and XMP-ACC are its extensions for accelerating devices. We are also developing a new technology for direct inter-node GPU communication, the TCA (Tightly Coupled Accelerators) architecture, with this concept covering everything from special network hardware to the applications.

In this talk, I will introduce our ongoing project, which vertically integrates all these components toward a new generation of parallel accelerated computing.

Prof. Taisuke Boku received Master's and PhD degrees from the Department of Electrical Engineering at Keio University. After his career as an assistant professor in the Department of Physics at Keio University, he joined the Center for Computational Sciences (formerly the Center for Computational Physics) at the University of Tsukuba, where he is currently the deputy director, the HPC division leader, and the system manager of supercomputing resources. He has been working there for more than 20 years on HPC system architecture, system software, and performance evaluation of various scientific applications. Over these years, he has played a central role in the system development of CP-PACS (ranked number one in the TOP500 in 1996), FIRST (hybrid cluster with gravity accelerator), PACS-CS (bandwidth-aware cluster), and HA-PACS (high-density GPU cluster), which are representative supercomputers in Japan. He also contributed to the system design of the K Computer as a member of the architecture design working group at RIKEN and is currently a member of the operation advisory board of AICS, RIKEN. He received the ACM Gordon Bell Prize in 2011. His recent research interests include accelerated HPC systems and direct communication hardware/software for accelerators in HPC systems based on FPGA technology.


13:00 - 13:45 Hiroshi Nakashima, Manycore challenge in Kyoto: What we learned from HPC programming with KNC
13:45 - 14:30 Taisuke Boku, Next Generation Parallel Accelerated Computing with High Productivity
14:30 - 15:00 Yuki Sumiyoshi, Akihiro Fujii, Akira Nukada and Teruo Tanaka. Mixed Precision AMG method for Many Core Accelerators
15:00 - 15:30 Coffee Break
15:30 - 16:00 Mikiko Sato, Go Fukazawa, Akio Shimada, Atsushi Hori, Yutaka Ishikawa, Mitaro Namiki, Kazumi Yoshinaga and Yuichi Tsujita. Design of Multiple PVAS on InfiniBand Cluster System Consisting of Many-core and Multi-core
16:00 - 16:30 Ting-Hsuan Chien, Chia-Jung Chen and Rong-Guey Chang. An Adaptive Zero-Copy Strategy for Ubiquitous High Performance Computing
16:30 - 17:00 Jesus Carretero, Javier Garcia Blas, David E. Singh, Florin Isaila, Thomas Fahringer, Radu Prodan, George Bosilca, Alexey Lastovetsky, Christi Symeonidou, Horacio Perez-Sanchez and Jose M. Cecilia. Optimizations to enhance sustainability of MPI applications


Areas of interest of the workshop include, but are not limited to:

  • Tools, libraries, and environments for accelerators.
  • Hybrid and heterogeneous programming with MPI and accelerators.
  • Performance evaluation of scientific applications based on accelerators.
  • Automatic performance tuning of scientific applications with accelerators.
  • Integrating accelerators into existing HPC middleware.
  • Run-times for accelerators.
  • Energy-efficient HPC solutions based on accelerators.
  • Storage cache solutions based on SSD accelerators.
  • Parallel data analysis for MPI and SSD.
  • Real-world scientific and engineering applications using accelerated HPC.
  • Future trends and prospects for accelerated HPC.

Important dates

  • Submission deadline extended to: May 31st, 2014 (no further extension)
  • Author notification: July 6th, 2014
  • Camera Ready papers due: July 18th, 2014
  • Conference: September 9th-12th, 2014

Special Issue

    Extended versions of selected distinguished papers accepted and presented at ESAA 2014 will, after further revision, be published in a special issue of the International Journal of Computers & Electrical Engineering.


Workshop Organizers:
  • Prof. Jesus Carretero. University Carlos III of Madrid, Spain
  • Dr. Javier Garcia Blas. University Carlos III of Madrid, Spain

Program Committee:
  • Ivona Brandic, Vienna University of Technology, Austria
  • Minyi Guo, Shanghai Jiao Tong University, China
  • Francisco Igual Peña, Universidad Complutense, Spain
  • Florin Isaila, Argonne National Labs, Chicago, USA
  • Emmanuel Jeannot, INRIA, France
  • Hai Jin, Huazhong University of Science and Technology, China
  • Timothy K. Jones, University of Cambridge, UK
  • Christos Kartsaklis, Oak Ridge National Laboratory, USA
  • Laurent Lefevre, INRIA, Ecole Normale Superieure of Lyon, University of Lyon, France
  • Diego R. Llanos, University of Valladolid, Spain
  • Dimitar Lukarski, Uppsala University, Sweden
  • Svetozar Margenov, Bulgarian Academy of Sciences, Bulgaria
  • Raffaele Montella, University of Napoli Parthenope, Italy
  • Ravi S Nanjundiah, Indian Institute of Science in Bangalore, India
  • Ariel Olesiak, Poznan Supercomputing and Networking Center, Poland
  • Antonio J. Peña, Argonne National Labs, Chicago, USA
  • Enrique S. Quintana-Orti, Universidad Jaume I de Castellon, Spain
  • Matei Ripeanu, University of British Columbia, Canada
  • Leonel Sousa, Instituto Superior Tecnico, Universidade de Lisboa, Portugal
  • Alexander Supalov, INTEL, USA
  • Rupa K. Thulasiram, University of Manitoba, Canada
  • Manuel Ujaldon, Universidad de Malaga, Spain
  • Roman Wyrzykowski, Czestochowa University of Technology, Poland
  • Julius Zilinskas, Vilnius University, Lithuania
  • Xingshe Zhou, Northwestern Polytech University, China

Paper submission guidelines

    Contributors are invited to submit a full paper in English as a PDF document not exceeding 6 pages. The title page should contain an abstract of at most 100 words and five specific, topical keywords. The paper must be formatted according to the double-column ACM ICPS proceedings style. Using LaTeX to prepare the contribution and submitting it in camera-ready format are strongly recommended. Style files can be found at

    All contributions will be fully peer reviewed by the program committee. Registration for the EuroMPI/ASIA 2014 main conference is mandatory to attend the workshops.

    The online paper submission system is open:
