Research Lines

Grid storage techniques

New input/output models and architectures for clusters

Multicore input/output mechanism

  • Developing collective I/O techniques for multicore architectures.
  • New methodologies for exploiting memory hierarchy in I/O operations.
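As a rough illustration of what a collective I/O technique does, the sketch below simulates the request-aggregation step of two-phase I/O, a well-known collective scheme. It is not the project's implementation. The function names `coalesce` and `file_domains`, the block sizes, and the number of aggregators are all hypothetical. The small interleaved requests of several processes are merged into contiguous byte ranges, and the resulting span is divided among a few aggregator processes that would each issue one large access.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (offset, length) in bytes


def coalesce(requests: List[Range]) -> List[Range]:
    """Merge overlapping or adjacent byte ranges into contiguous ones."""
    merged: List[Range] = []
    for offset, length in sorted(requests):
        if length <= 0:
            continue  # ignore empty requests
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            # The request touches or overlaps the previous range: extend it.
            start, prev_len = merged[-1]
            end = max(start + prev_len, offset + length)
            merged[-1] = (start, end - start)
        else:
            merged.append((offset, length))
    return merged


def file_domains(ranges: List[Range], n_aggregators: int) -> List[Range]:
    """Split the overall byte span evenly among aggregators (hypothetical policy)."""
    if not ranges or n_aggregators < 1:
        return []
    lo = min(o for o, _ in ranges)
    hi = max(o + l for o, l in ranges)
    chunk = -(-(hi - lo) // n_aggregators)  # ceiling division
    domains: List[Range] = []
    for i in range(n_aggregators):
        start = lo + i * chunk
        end = min(start + chunk, hi)
        if start < end:
            domains.append((start, end - start))
    return domains


# Assumed workload: four processes, each accessing interleaved 4-byte blocks.
per_process = [[(rank * 4 + k * 16, 4) for k in range(3)] for rank in range(4)]
all_requests = [r for reqs in per_process for r in reqs]
merged = coalesce(all_requests)
domains = file_domains(merged, 2)
print(merged)   # contiguous ranges after aggregation
print(domains)  # one large access per aggregator
```

In this assumed workload, twelve 4-byte requests collapse into a single contiguous range, which two aggregators then cover with one access each. A real implementation would also have to redistribute the data between the aggregators and the requesting processes.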

Input/output optimization in massively parallel applications

The ever-increasing computation and memory demands of parallel applications require a corresponding increase in the performance of I/O systems. In this project we propose optimizations for the Message Passing Interface (MPI) standard targeting the following goals:

  • Increase the performance of data intensive parallel applications
  • Efficient parallel file accesses
  • Fast communication
  • On-demand virtualization of distributed storage
  • Easy reconfiguration of the parallel I/O system
  • Flexible parallel I/O benchmarking

Our optimizations are implemented in the MPICH distribution. Our modified version will be made available on this web page.

Goals

The project has the following specific goals:

  1. To develop new storage techniques for grid environments that improve data access performance and make those environments easier to use from the data management, resource scheduling, and job execution perspectives. To that end, we will propose new grid scheduling, data localization, resource management, global data access, and load distribution techniques. We will also propose techniques that provide fault tolerance and improved reliability in grid environments.

  2. To propose new input/output models and architectures for clusters that remove some of the existing bottlenecks and provide high scalability and performance, while increasing reliability through new fault-tolerance and data replication methods adapted to the proposed architectures.

  3. To study the influence of multicore processors on the input/output mechanisms of cluster architectures from two points of view: the impact of the increased per-node bandwidth on the efficiency of the input/output system, and the use of multicore processors in the input/output nodes.

  4. To propose new methods to optimize input/output in massively parallel applications. We will propose optimization techniques based on data distribution across compute nodes and optimization techniques based on data locality within the processor.

  5. To integrate the developed solutions into a prototype of the Expand parallel file system, developed by the research group over the last few years. Expand is a parallel file system based on standard servers. It currently uses NFS as its base protocol for clusters and GridFTP, together with Globus, in grid environments. We plan both to enhance the existing prototype and to use a real prototype to validate the new techniques and solutions developed in the project.
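To make the idea of a parallel file system built on standard servers more concrete, the sketch below shows one common way such a system can spread a file over several servers: round-robin block striping. This is an assumption for illustration only. The text above does not describe Expand's actual data layout, and the names `locate` and `split_request`, the stripe size, and the server count are hypothetical.

```python
from typing import List, Tuple


def locate(offset: int, stripe_size: int, n_servers: int) -> Tuple[int, int]:
    """Map a logical file offset to (server index, offset in that server's subfile)."""
    stripe = offset // stripe_size
    server = stripe % n_servers  # stripes are assigned round-robin
    local = (stripe // n_servers) * stripe_size + offset % stripe_size
    return server, local


def split_request(offset: int, length: int, stripe_size: int,
                  n_servers: int) -> List[Tuple[int, int, int]]:
    """Break a logical request into (server, local_offset, length) pieces."""
    pieces: List[Tuple[int, int, int]] = []
    end = offset + length
    while offset < end:
        server, local = locate(offset, stripe_size, n_servers)
        # Stop at the stripe boundary or at the end of the request.
        n = min(stripe_size - offset % stripe_size, end - offset)
        pieces.append((server, local, n))
        offset += n
    return pieces


# Assumed layout: 4-byte stripes over 3 servers; a 10-byte request at offset 6.
pieces = split_request(6, 10, stripe_size=4, n_servers=3)
print(pieces)
```

Under these assumptions the single logical request is served by all three servers, and each piece could be fetched in parallel through whatever protocol the servers expose.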