Augmented datasheets for speech datasets and ethical decision-making

O Papakyriakopoulos, ASG Choi, W Thong… - Proceedings of the …, 2023 - dl.acm.org
Speech datasets are crucial for training Speech Language Technologies (SLT); however,
the lack of diversity of the underlying training data can lead to serious limitations in building …

Open Korean corpora: A practical report

WI Cho, S Moon, Y Song - arXiv preprint arXiv:2012.15621, 2020 - arxiv.org
Korean is often referred to as a low-resource language in the research community. While
this claim is partially true, it is also because the availability of resources is inadequately …

Improving Jejueo-Korean Translation With Cross-Lingual Pretraining Using Japanese and Korean

F Zheng, E Marrese-Taylor… - Proceedings of the 9th …, 2022 - aclanthology.org
Jejueo is a critically endangered language spoken on Jeju Island and is closely related to
but mutually unintelligible with Korean. Parallel data between Jejueo and Korean is scarce …

Deep Learning-based Korean Dialect Machine Translation Research Considering Linguistics Features and Service

S Lim, C Park, Y Yang - Journal of the Korea Convergence Society, 2022 - koreascience.kr
Based on the importance of dialect research, preservation, and communication, this paper
conducted a study on machine translation of Korean dialects for dialect users who may be …

Revisiting Korean Corpus Studies through Technological Advances

WI Cho, S Moon, Y Song - Proceedings of the 37th Pacific Asia …, 2023 - aclanthology.org
The Korean language has been largely studied recently owing to the active development of
Korean-specific language models and the disclosure of natural language processing (NLP) …

Improving Korean-Jeju Machine Translation Using SOV Languages and Cross-Lingual Model Pretraining

JW Kim, CM Bae, MS Kim - Annual Conference of KIPS, 2024 - koreascience.kr
ACK 2024 Conference Proceedings (Vol. 31, No. 2). 1. Introduction: Jejueo-Korean …

Korean-Jejueo Machine Translation Using a Copy Mechanism

박다솔, 차정원 - KIISE Transactions on Computing Practices, 2022 - air.changwon.ac.kr
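Abstract: This paper focuses on the fact that, because both are written in Hangul, Jejueo and
Korean share a considerable amount of vocabulary, and, given that Korean-Jejueo data is characterized by word substitution, a copy mechanism …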