Google Scholar

Automated video labelling: Identifying faces by corroborative evidence

A Brown, E Coto, A Zisserman - 2021 IEEE 4th International …, 2021 - ieeexplore.ieee.org

We present a method for automatically labelling all faces in video archives, such as TV
broadcasts, by combining multiple evidence sources and multiple modalities (visual and …

Cited by 18 · All 6 versions

Deep cross-modal face naming for people news retrieval

Y Tian, L Zhou, Y Zhang, T Zhang… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org

How to integrate multimodal information sources for face naming in multimodal news is a hot
and yet challenging problem. A novel deep cross-modal face naming scheme is developed …

Cited by 12 · All 4 versions

[PDF] arxiv.org

Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models

M Nguyen, F Dernoncourt, S Yoon… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce an approach to identifying speaker names in dialogue transcripts, a crucial
task for enhancing content accessibility and searchability in digital media archives. Despite …

All 4 versions

[PDF] acm.org

Self-contained entity discovery from captioned videos

M Ayoughi, P Mettes, P Groth - ACM Transactions on Multimedia …, 2023 - dl.acm.org

This article introduces the task of visual named entity discovery in videos without the need
for task-specific supervision or task-specific external knowledge sources. Assigning specific …

Cited by 2 · All 3 versions

[PDF] thecvf.com

From face recognition to models of identity: A Bayesian approach to learning about unknown identities from unsupervised data

DC de Castro, S Nowozin - Proceedings of the European …, 2018 - openaccess.thecvf.com

Current face recognition systems robustly recognize identities across a wide variety of
imaging conditions. In these systems recognition is performed via classification into known …

Cited by 10 · All 10 versions

[PDF] springer.com

Expertise detection in crowdsourcing forums using the composition of latent topics and joint syntactic–semantic cues

YD Woldemariam - SN Computer Science, 2021 - Springer

We develop an NLP method for inferring potential contributors among multitude of users
within crowdsourcing forums (CSFs). The method basically provides a way to predict …

Cited by 2 · All 4 versions

[PDF] hal.science

Hierarchical multi-label propagation using speaking face graphs for multimodal person discovery

GB da Fonseca, G Sargent, R Sicre… - Multimedia Tools and …, 2021 - Springer

TV archives are growing in size so fast that manually indexing becomes unfeasible.
Automatic indexing techniques can be applied to overcome this issue, and this work …

Cited by 3 · All 7 versions

[PDF] aclanthology.org

Adapting language specific components of cross-media analysis frameworks to less-resourced languages: the case of Amharic

Y Woldemariam, A Dahlgren - … of the 1st joint workshop on spoken …, 2020 - aclanthology.org

We present an ASR based pipeline for Amharic that orchestrates NLP components within a
cross media analysis framework (CMAF). One of the major challenges that are inherently …

Cited by 4 · All 6 versions

[PDF] diva-portal.org

NLP methods for improving user rating systems in crowdsourcing forums and speech recognition of less resourced languages

YD Woldemariam - 2024 - diva-portal.org

We develop NLP and ASR methods (e.g., algorithms, architectures) for solving these
problems: biases induced by user rating, ranking, recommendation and search engine …

[PDF] upc.edu

UPC multimodal speaker diarization system for the 2018 Albayzin Challenge

MÀ India Massana, I Sagastiberri… - … 2018: program and …, 2018 - upcommons.upc.edu

This paper presents the UPC system proposed for the Multimodal Speaker Diarization task
of the 2018 Albayzin Challenge. This approach works by processing individually the speech …

Cited by 3 · All 9 versions


Towards large scale multimedia indexing: A case study on person discovery in broadcast news

Automated video labelling: Identifying faces by corroborative evidence
