Hierarchical banzhaf interaction for general video-language representation learning
Multimodal representation learning, with contrastive learning, plays an important role in the
artificial intelligence domain. As an important subfield, video-language representation …
artificial intelligence domain. As an important subfield, video-language representation …