Abstract: In recent years, massive datasets have significantly driven the advancement of visual learning such as multi-modal large model at the expense of high computational costs and extensive ...
Abstract: In video-based emotion recognition, effective multi-modal fusion techniques are essential to leverage the complementary relationship between audio and visual modalities. Recent ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results