Massivizing Datacenter Scheduling to Bring All Data Services to All People

Long Summary (300 words)

Datacenters support our Digital Economy. They host the global cloud market (€84bn by 2017), including many simple data services we take for granted, like storage (Dropbox). To be efficient, datacenters use smart resource management, in particular scheduling (RM&S). Many traditional scheduling techniques work well for simple data services, but only a few do so when multiple simple data services are combined into single, complex data services. Even for these few good schedulers, configuration requires significant cost and expertise. Consequently, only organizations with royal budgets and expert human resources can deploy complex, datacenter-based data services (CDDS).

In MagnaData, I aim to develop flexible and efficient (massivizing) RM&S techniques for CDDS. Inspired by the Magna Charta (1215), which led to royal rights being brought to the masses, I envision MagnaData as the first step in a long-term research line enabling "all data services for all people".

MagnaData promises three fundamental research contributions, none of which has been explored before for CDDS. First, leveraging social relationships to increase efficiency. After creating new knowledge about (implicit) social relationships and how they impact usage, MagnaData will develop the first socially aware RM&S for CDDS. Second, enabling CDDS to change their performance and availability requirements per component. MagnaData will enable programmatic requirement changes, then use them in novel RM&S techniques. Third, enabling self-aware selection of datacenter schedulers, currently done by human experts. MagnaData will develop the first portfolio scheduler for CDDS, which will also incorporate the other contributions as building blocks.

I will combine fundamental and experimental research. Using my unique background and inspired by four broad application domains (learning, industry, academia, and governance), I will create a MagnaData prototype, embedded in the Apache Hadoop and Google Kubernetes data-service ecosystems, tested in real datacenters, and demonstrated for the application domains. MagnaData will achieve substantial impact through competitive publications, open-source software and open-access data, and diverse project stakeholders.

Short Summary (50 words)

Datacenters are factories producing (hosting) data services for our Digital Economy. MagnaData will develop groundbreaking resource management and scheduling techniques. These techniques help engineers manage increasingly large datacenters, and address how social and sophisticated customers use data services. This makes datacenters much more flexible and efficient, and improves customer experience.

The Team