Pitt CS graduate student Mingda Zhang, assistant professor Adriana Kovashka, and professor Rebecca Hwa received the best paper award at the Efficient Deep Learning for Computer Vision CVPR Workshop (ECV 2021) for their work in collaboration with Drs. Chun-Te Chu, Andrey Zhmoginov, Andrew Howard, Brendan Jou, Yukun Zhu, and Li Zhang of Google Research.
Their paper, "BasisNet: Two-stage Model Synthesis for Efficient Inference," presents BasisNet, which combines recent advances in efficient neural network architectures, conditional computation, and early termination in a simple new form. The approach uses a lightweight model to preview the input and generate input-dependent combination coefficients, which then control the synthesis of a more accurate specialist model that makes the final prediction. The two-stage model synthesis strategy can be applied to any network architecture, and both stages are trained jointly. The authors also show that proper training recipes are critical for improving the generalizability of such high-capacity neural networks. On the ImageNet classification benchmark, BasisNet with MobileNets as the backbone demonstrated a clear advantage in the accuracy-efficiency trade-off over several strong baselines. Specifically, BasisNet-MobileNetV3 obtained 80.3% top-1 accuracy with only 290M multiply-add operations (MAdds), halving the computational cost of the previous state of the art without sacrificing accuracy. With early termination, the average cost can be further reduced to 198M MAdds while maintaining 80.0% accuracy on ImageNet.
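The two-stage flow described above can be illustrated with a toy sketch. This is not the authors' implementation: it stands in single linear layers for the real networks, and the function names (`synthesize`, `basisnet_predict`), the softmax normalization of the coefficients, and the confidence `threshold` used for early termination are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Apply a weight matrix (list of rows) to the input vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def synthesize(bases, coeffs):
    """Build specialist weights as a weighted sum of basis weight matrices."""
    rows, cols = len(bases[0]), len(bases[0][0])
    return [
        [sum(c * b[i][j] for c, b in zip(coeffs, bases)) for j in range(cols)]
        for i in range(rows)
    ]


def basisnet_predict(x, light_cls, light_coef, bases, threshold=0.9):
    """Return (predicted_class, stage_used) for input vector x.

    light_cls  : hypothetical lightweight classifier weights (classes x features)
    light_coef : hypothetical coefficient-head weights (num_bases x features)
    bases      : list of basis weight matrices (each classes x features)
    threshold  : assumed confidence cutoff for early termination
    """
    # Stage 1: the lightweight model previews the input.
    preview = softmax(linear(light_cls, x))
    if max(preview) >= threshold:
        # Early termination: the cheap prediction is confident enough.
        return preview.index(max(preview)), "early"

    # Stage 1 also emits input-dependent combination coefficients.
    coeffs = softmax(linear(light_coef, x))

    # Stage 2: synthesize the specialist and make the final prediction.
    specialist = synthesize(bases, coeffs)
    probs = softmax(linear(specialist, x))
    return probs.index(max(probs)), "specialist"


# Toy usage: 2 classes, 2 features, 2 bases.
x = [1.0, -1.0]
light_cls = [[0.1, 0.0], [0.0, 0.1]]
light_coef = [[1.0, 0.0], [0.0, 1.0]]
bases = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
label, stage = basisnet_predict(x, light_cls, light_coef, bases)
print(label, stage)
```

In this sketch, lowering `threshold` lets more inputs exit after the first stage, trading some accuracy for lower average cost, which is the same kind of trade-off behind the reported drop from 290M to 198M MAdds.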