We’re delighted to share that a paper from the CNI community, “Learning to Offload in Hierarchical Inference Systems with Queues”, by Srinivas Nomula et al., has been accepted to IEEE INFOCOM 2025—one of the premier conferences in networking. Congratulations to the authors on this achievement!
Here, the authors turn to a problem that sits right at the crossroads of learning, networks, and real-world systems: when should an edge device trust its own intelligence, and when should it ask for help?
The paper studies exactly that question: how edge devices can decide when to rely on their own on-device AI models and when to offload data to a more powerful edge server. In such hierarchical inference systems, devices first make local predictions and offload only those inputs for which the local model is uncertain.
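The offload-when-uncertain idea can be sketched in a few lines. The softmax-confidence measure and the threshold value below are illustrative assumptions for this post, not the paper's exact decision criterion:

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of class logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def should_offload(logits, threshold=0.8):
    # Offload to the edge server when the local model's
    # top-class confidence falls below the threshold.
    return max(softmax(logits)) < threshold

# One class clearly dominates: keep the prediction on-device.
print(should_offload([4.0, 0.1, 0.2]))   # → False
# Near-tied logits: the local model is uncertain, so offload.
print(should_offload([1.0, 0.9, 1.1]))   # → True
```

In a real deployment the threshold itself is the knob being tuned: raising it sends more traffic to the server (better accuracy, more queueing delay), which is precisely the trade-off the paper analyzes.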
Unlike earlier studies that assume instant responses from servers, this research considers realistic wireless settings where offloaded requests may wait in a queue and results arrive with delay.
The authors develop a learning-based approach, called BOLD Hedge-Q, that adapts these offloading decisions over time while keeping the offloading queue stable. The analysis provides guarantees on the trade-off between inference accuracy and response time, helping bridge the gap between theoretical models and practical edge-AI deployments.
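The name Hedge-Q suggests the classic Hedge (multiplicative-weights) family of online learning algorithms. As a rough illustration of that family only, and emphatically not the authors' BOLD Hedge-Q, here is a textbook Hedge update over a few candidate offloading thresholds, each treated as an "expert":

```python
import math

def hedge_update(weights, losses, eta=0.5):
    # Multiplicative-weights (Hedge) step: exponentially down-weight
    # experts (here, candidate offloading thresholds) that incurred
    # high loss, then renormalize to a probability distribution.
    new = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(new)
    return [w / total for w in new]

# Three candidate thresholds start with uniform weight.
weights = [1 / 3, 1 / 3, 1 / 3]
# Suppose the middle threshold incurred the lowest combined
# accuracy/delay loss in the last round (hypothetical numbers).
weights = hedge_update(weights, losses=[1.0, 0.2, 0.8])
print(max(range(3), key=lambda i: weights[i]))  # → 1
```

The paper's contribution lies in coupling such online updates with queueing dynamics, so that the learner stays stable even when feedback from the server arrives with delay.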