High-performance and fault-tolerant techniques for massive data distribution in online communities

1 June, 2016


In recent years, the amount of information produced and consumed has grown spectacularly. New Internet applications such as social networks, Web 2.0 sites and user-generated content networks have increased the amount of information available on the Internet. This growth in available data has not been matched by a corresponding improvement in network connectivity. In fact, the limiting factor is not the available bandwidth itself, but the ratio between the available bandwidth and the amount of data to be distributed or consumed: for residential users the available bandwidth is usually small, whereas for enterprise users it is the sheer amount of information that limits communication.

Technological advances have also changed the behaviour of users and systems. New technology allows enterprises and scientists to solve problems at a finer level of detail, and an increase in detail usually leads to an increase in the amount of information produced by the applied algorithms. For residential users, the evolution of consumer electronics such as digital cameras, video recorders and multimedia devices has increased the size of multimedia content.

These two ongoing trends create a stringent need for systems that can efficiently distribute content to their users and that can evolve in capacity and processing capability as the behaviour of their user community evolves. This PhD proposal focuses on defining a new architecture for the distribution of huge data sets, based on the publish/subscribe paradigm with intelligent components. The research effort is divided into two main areas: social knowledge and user/environment constraints. The study of the user community will provide information about access patterns, content popularity, etc.
The system will leverage this social knowledge to dynamically adapt parameters such as mirror provisioning and content replication. Additionally, the architecture will take into account user and environment constraints, such as quality of service or available bandwidth, by addressing challenging issues such as download and notification scheduling, download priorities, etc.
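The idea of feeding social knowledge back into the distribution system can be sketched as a small topic-based publish/subscribe broker that also counts deliveries per topic and derives a replication factor from that popularity. This is only an illustrative sketch: the class name `PubSubBroker`, the method `replication_factor`, and the heuristic of one extra replica per fixed number of deliveries are assumptions for exposition, not part of the proposed architecture.

```python
from collections import defaultdict


class PubSubBroker:
    """Minimal topic-based publish/subscribe broker.

    Besides delivering messages, it tracks per-topic popularity,
    which a distribution system could use to decide how many
    mirrors or replicas a piece of content deserves.
    """

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks
        self.popularity = defaultdict(int)    # topic -> delivery count

    def subscribe(self, topic, callback):
        """Register a callback to be invoked on each publication to `topic`."""
        self.subscribers[topic].append(callback)

    def publish(self, topic, payload):
        """Deliver `payload` to all subscribers of `topic`,
        counting each delivery as one unit of popularity."""
        for callback in self.subscribers[topic]:
            callback(payload)
            self.popularity[topic] += 1

    def replication_factor(self, topic, base=1, per_deliveries=100):
        """Naive popularity-driven heuristic (an assumption, not the
        thesis' method): one extra replica per `per_deliveries` deliveries."""
        return base + self.popularity[topic] // per_deliveries


# Usage sketch: a subscriber receives published content, and the
# broker's counters could later drive mirror provisioning decisions.
broker = PubSubBroker()
received = []
broker.subscribe("videos/cats", received.append)
broker.publish("videos/cats", "clip-001")
```

In a real deployment the delivery counts would be aggregated over time windows and combined with other signals (access patterns, user locality) before adjusting replication, but the feedback loop has the same shape.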


@phdthesis{higuero2016,
  author={Daniel Higuero Alonso-Mardones},
  title={High-performance and fault-tolerant techniques for massive data distribution in online communities},
  school={Universidad Carlos III de Madrid},
  year={2016}
}