Abstract: Communication overhead represents a primary bottleneck in distributed deep learning, impeding training scalability. Although existing gradient sparsification techniques reduce network ...
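The abstract names gradient sparsification as the mechanism for reducing communication volume. As a point of reference only, the sketch below illustrates one common variant, top-k sparsification with a locally retained residual for error feedback; the function `topk_sparsify` and its `ratio` parameter are illustrative assumptions, not the paper's method.

```python
import numpy as np

def topk_sparsify(grad, ratio=0.01):
    """Keep only the largest-magnitude `ratio` fraction of gradient entries.

    Returns the indices and values that would be communicated, plus the
    residual (dropped entries) that error-feedback schemes add back into
    the next iteration's gradient. Illustrative sketch, not the paper's
    specific algorithm.
    """
    flat = grad.ravel()
    k = max(1, int(flat.size * ratio))
    # Indices of the k largest-magnitude entries.
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    values = flat[idx]
    # Residual: everything not selected stays local instead of being sent.
    residual = flat.copy()
    residual[idx] = 0.0
    return idx, values, residual.reshape(grad.shape)

# Example: sparsify a random gradient tensor to 1% density.
g = np.random.randn(4, 256).astype(np.float32)
idx, vals, res = topk_sparsify(g, ratio=0.01)
print(f"communicated {idx.size} of {g.size} entries")
```

Only the selected indices and values are exchanged between workers, which is how such schemes cut network traffic at the cost of extra bookkeeping for the dropped entries.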