Hierarchical Inference (HI) has emerged as a promising approach for efficient distributed inference between end devices equipped with small pre-trained Deep Learning (DL) models and edge/cloud servers running large DL models. Under HI, a device uses its local DL model to perform inference on the data samples it collects, and only the data samples on which this inference is likely to be incorrect are offloaded to a remote DL model running on the server. Thus, gauging the likelihood of incorrect local inference is key to implementing HI. A natural approach is to compute a confidence metric for the local DL inference and then use a threshold on this confidence metric to decide whether to offload. The HI online learning problem was recently studied to learn an optimal threshold for the confidence metric over a sequence of data samples collected over time. However, existing algorithms have computational complexity that grows with the number of rounds and do not exhibit sub-linear regret. In this work, we propose algorithms with sub-linear regret for this problem. Using runtime measurements on a Raspberry Pi, we demonstrate that our algorithm has a runtime that is an order of magnitude lower and achieves a cumulative loss close to that of the alternatives.
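To make the offloading rule described in the abstract concrete, the following is a minimal sketch (not the paper's implementation) of threshold-based HI: the device runs its small local model, computes a confidence score for the prediction (here, the maximum softmax probability), and offloads the sample to the server model only when that confidence falls below a threshold. The names local_model, server_model, and theta are illustrative placeholders.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the local model's output logits.
    z = np.exp(logits - np.max(logits))
    return z / z.sum()

def hi_infer(sample, local_model, server_model, theta=0.8):
    """Hierarchical Inference with a confidence threshold (illustrative sketch).

    local_model: callable returning class logits for a sample (on-device model).
    server_model: callable returning a label (large edge/cloud model).
    theta: confidence threshold; how it is chosen/learned online is the
           subject of the paper and is not modeled here.
    """
    probs = softmax(local_model(sample))
    confidence = probs.max()          # confidence metric for the local inference
    if confidence >= theta:
        return int(probs.argmax())    # accept the local prediction
    return server_model(sample)       # offload the likely-incorrect sample
```

The paper's contribution concerns how the threshold theta is learned online over a sequence of samples with sub-linear regret; the sketch above only illustrates the per-sample offloading decision once a threshold is given.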
Sharayu Moharir is an Associate Professor in the Department of Electrical Engineering, Indian Institute of Technology Bombay. Before that, she was a visiting fellow at the School of Technology and Computer Science at the Tata Institute of Fundamental Research (TIFR), Mumbai. She obtained her Ph.D. in Electrical and Computer Engineering from the University of Texas at Austin, and her B.Tech. in Electrical Engineering and M.Tech. in Communications and Signal Processing from IIT Bombay.