Meterstick: Benchmarking Performance Variability in Cloud and Self-hosted Minecraft-like Games

Research Track, 2023 ACM/SPEC International Conference on Performance Engineering (ICPE '23)


Abstract

Due to their increasing popularity and strict performance requirements, online games have become a cloud-based and self-hosted workload of interest for the performance engineering community. One of the most popular types of online games is the Minecraft-like Game (MLG), in which players can terraform the environment. The most popular MLG, Minecraft, provides not only entertainment but also educational support and social interaction to over 130 million people worldwide. MLGs currently support their many players by replicating isolated instances, each of which supports only up to a few hundred players under favorable conditions. In practice, as we show here, the real upper limit of supported players can be much lower. In this work, we posit that performance variability is a key cause of the lack of scalability in MLGs, experimentally investigate the causes of performance variability, and derive actionable insights. We propose an operational model for MLGs, which extends the state of the art with essential aspects, e.g., by considering environment-based workloads, which are sizable workload components that do not depend on player input (once set in action). Starting from this model, we design the first benchmark that focuses on MLG performance variability, defining specialized workloads, metrics, and processes. We conduct real-world benchmarking of MLGs, both cloud-based and self-hosted. We find that environment-based workloads and cloud deployment are significant sources of performance variability: peak latency degrades sharply, to 20.7 times the arithmetic mean, and exceeds performance requirements by a factor of 7.4. We derive actionable insights for game developers, game operators, and other stakeholders to tame performance variability.
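To give an intuition for the variability metric quoted above (peak latency relative to the arithmetic mean), the following is a minimal, illustrative Python sketch; it is not the paper's Meterstick implementation, and the function name, sample values, and 50 ms tick budget are hypothetical.

```python
# Illustrative sketch (not Meterstick code): given per-tick response-time
# samples from an MLG server, compute the ratio of the worst observed tick
# latency to the arithmetic mean, i.e., the kind of "peak-to-mean" figure
# reported in the abstract.
from statistics import mean


def peak_to_mean_ratio(tick_latencies_ms):
    """Ratio of the maximum tick latency to the arithmetic mean latency."""
    return max(tick_latencies_ms) / mean(tick_latencies_ms)


# Hypothetical samples: a mostly stable ~50 ms tick with one large spike.
samples = [48.0, 50.2, 49.5, 51.0, 47.8, 1035.0]
print(f"peak/mean = {peak_to_mean_ratio(samples):.1f}x")
```

A large ratio such as this indicates that rare but severe latency spikes, rather than the average case, dominate the player-perceived performance.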