Toward Practical and Scalable Systems Evaluation for Post-Moore Datacenters
Hejing Li
Max Planck Institute for Software Systems
02 Feb 2026, 10:00 am - 11:00 am
Saarbrücken building E1 5, room 029
SWS Student Defense Talks - Thesis Proposal
Solid system evaluation under realistic workloads and environments is essential for datacenter network research. Modern datacenter systems are shaped by a wide range of factors - including hardware behavior, software components, and complex interactions between them - whose combined effects on end-to-end performance are often difficult to predict. However, evaluating such systems on physical testbeds is frequently infeasible due to scale, cost, limited experimental control, and the increasing reliance on specialized hardware that may be unavailable or still under development. As a result, researchers often turn to simulation. Existing simulators, however, typically focus on isolated components, such as network protocols, host architectures, or hardware RTL, making it challenging to conduct faithful end-to-end evaluations or to scale experiments to realistic datacenter sizes.

The goal of this thesis is to provide researchers with practical and scalable tools and methodologies for conducting faithful end-to-end system evaluation targeting modern datacenters. To this end, the thesis is structured around three components. First, it introduces SimBricks, an end-to-end simulation framework that enables the modular composition of best-of-breed simulators, allowing unmodified hardware and software system implementations to be evaluated together within a single virtual testbed. Second, it presents SplitSim, a simulation framework designed to make large-scale end-to-end evaluation practical by supporting mixed-fidelity simulation, controlled decomposition, and efficient resource utilization. Finally, the thesis will include a set of case studies that apply SplitSim to the evaluation of large-scale networked systems, demonstrating a concrete evaluation workflow and distilling lessons on navigating trade-offs between fidelity, scalability, and simulation cost.