Taja Kuzman
Title
Cited by
Year
Automatic genre identification: a survey
T Kuzman, N Ljubešić
Language Resources and Evaluation, 1-34, 2023
Cited by: 119; Year: 2023
Neural machine translation of literary texts from English to Slovene
T Kuzman, Š Vintar, M Arcan
Proceedings of the qualities of literary machine translation, 1-9, 2019
Cited by: 39; Year: 2019
Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 2.0
T Erjavec, M Ogrodniczuk, P Osenova, N Ljubešić, K Simov, V Grigorova, ...
CLARIN ERIC, 2021
Cited by: 35*; Year: 2021
ChatGPT: Beginning of an End of Manual Linguistic Data Annotation? Use Case of Automatic Genre Identification
T Kuzman, I Mozetič, N Ljubešić
arXiv preprint arXiv:2303.03953, 2023
Cited by: 30*; Year: 2023
MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages
M Bañón, M Esplà-Gomis, ML Forcada, C García-Romero, T Kuzman, ...
23rd Annual Conference of the European Association for Machine Translation …, 2022
Cited by: 22; Year: 2022
Training corpus ssj500k 1.3. Slovenian language resource repository CLARIN.SI
S Krek, T Erjavec, K Dobrovoljc, S Može, N Ledinek, N Holz
Cited by: 20; Year: 2013
Automatic genre identification for robust enrichment of massive text collections: Investigation of classification methods in the era of large language models
T Kuzman, I Mozetič, N Ljubešić
Machine Learning and Knowledge Extraction 5 (3), 1149-1175, 2023
Cited by: 15; Year: 2023
The GINCO training dataset for web genre identification of documents out in the wild
T Kuzman, P Rupnik, N Ljubešić
arXiv preprint arXiv:2201.03857, 2022
Cited by: 15; Year: 2022
CLASSLA-web: Comparable Web Corpora of South Slavic Languages Enriched with Linguistic and Genre Annotation
N Ljubešić, T Kuzman
arXiv preprint arXiv:2403.12721, 2024
Cited by: 8; Year: 2024
BENCHić-lang: A benchmark for discriminating between Bosnian, Croatian, Montenegrin and Serbian
P Rupnik, T Kuzman, N Ljubešić
Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial …, 2023
Cited by: 8; Year: 2023
Training corpus ssj500k 1.4
S Krek, K Dobrovoljc, T Erjavec, S Može, N Ledinek, N Holz
Centre for Language Resources and Technologies, University of Ljubljana, 2015
Cited by: 8; Year: 2015
Get to know your parallel data: Performing English variety and genre classification over MaCoCu corpora
T Kuzman, P Rupnik, N Ljubešić
Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial …, 2023
Cited by: 7; Year: 2023
Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages
R van Noord, T Kuzman, P Rupnik, N Ljubešić, M Esplà-Gomis, ...
arXiv preprint arXiv:2403.08693, 2024
Cited by: 6; Year: 2024
Assessing comparability of genre datasets via cross-lingual and cross-dataset experiments
T Kuzman, N Ljubešić, S Pollak
Jezikovne tehnologije in digitalna humanistika: zbornik konference …, 2022
Cited by: 6; Year: 2022
Verbal multiword expressions in Slovene
P Gantar, S Krek, T Kuzman
Computational and Corpus-Based Phraseology: Second International Conference …, 2017
Cited by: 6; Year: 2017
ChatGPT: Beginning of an End of Manual Linguistic Data Annotation? Use Case of Automatic Genre Identification. March 7, 2023
T Kuzman, I Mozetič, N Ljubešić
Reference Source, 0
Cited by: 6
Slovene-English parallel corpus MaCoCu-sl-en 2.0
M Bañón, M Chichirau, M Esplà-Gomis, ML Forcada, A Galiano-Jiménez, ...
Jožef Stefan Institute, 2023
Cited by: 5; Year: 2023
Choice of plausible alternatives dataset in Serbian COPA-SR
N Ljubešić, M Starović, T Kuzman, T Samardžić
Jožef Stefan Institute, 2022
Cited by: 5; Year: 2022
Exploring the Impact of Lexical and Grammatical Features on Automatic Genre Identification
T Kuzman, N Ljubešić
Proceedings of the Odkrivanje Znanja in Podatkovna Skladišča—SiKDD …, 2022
Cited by: 5; Year: 2022
Annotated Corpora and Tools of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions
C Ramisch, SR Cordeiro, A Savary, V Vincze, V Barbu Mititelu, A Bhatia, ...
LINDAT/CLARIAH-CZ Digital Library at the Institute of Formal and Applied …, 2018
Cited by: 5; Year: 2018