Abstract: This paper proposes a novel queueing-theoretic approach to enable stochastic congestion-aware scheduling for distributed machine learning inference queries. Our proposed framework, called ...