Abstract: In this research, we created a multimodal deep learning model, which incorporates visual, audio, and textual data that can perform better than video categorization using only one modality.
Some results have been hidden because they may be inaccessible to you
Show inaccessible results