3D Convolutional Neural Networks for Cross Audio-Visual Matching Recognition