Taja Kuzman

Cited by

	All	Since 2020
Citations	409	372
h-index	8	8
i10-index	8	7

Citations per year

Year	Citations
2016	8
2017	8
2018	7
2019	11
2020	6
2021	13
2022	23
2023	125
2024	184
2025	15

Public access (based on funding mandates)

15 articles available
1 article not available

Co-authors

Nikola Ljubešić; Researcher at Jožef Stefan Institute; verified email at ijs.si
Peter Rupnik; Jožef Stefan Institute; verified email at ijs.si
Rik van Noord; Assistant professor in Humane AI and NLP, University of Groningen; verified email at rug.nl
Antonio Toral; Assistant Professor, University of Groningen; verified email at rug.nl
Gema Ramírez-Sánchez; CEO at Prompsit Language Engineering, computational linguist; verified email at prompsit.com
Vít Suchomel; Masaryk University and Lexical Computing Ltd.; verified email at mail.muni.cz
Mikel L. Forcada (ORCID 0000-0003-0...; Professor of Computer Languages and Systems, Universitat d'Alacant; verified email at ua.es
Leopoldo Pla Sempere; Universidad de Alicante; verified email at dlsi.ua.es
Marta Bañón; Prompsit Language Engineering; verified email at prompsit.com
Simon Krek; Researcher at Jožef Stefan Institute; verified email at ijs.si
Tomaž Erjavec; Jožef Stefan Institute; verified email at ijs.si
Miquel Esplà-Gomis; Universitat d'Alacant; verified email at dlsi.ua.es
Jaka Čibej; Researcher, University of Ljubljana; verified email at ff.uni-lj.si
Polona Gantar; Researcher, Faculty of Arts, Ljubljana, Slovenia; verified email at guest.arnes.si
Kaja Dobrovoljc; Research Associate, University of Ljubljana & Jozef Stefan Institute; verified email at ijs.si
Špela Vintar; Full Professor, University of Ljubljana; verified email at ff.uni-lj.si
Špela Arhar Holdt; Research Associate at University of Ljubljana, Slovenia; verified email at cjvt.si
Mihael Arcan; Co-Founder and Chief Scientific Officer (CSO) @ Lua Health; verified email at luahealth.io
Darja Fišer; Assistant Professor, University of Ljubljana; verified email at ff.uni-lj.si
Mojca Brglez; Junior Researcher / PhD student, University of Ljubljana; verified email at ff.uni-lj.si


Taja Kuzman

Department of Knowledge Technologies, Jožef Stefan Institute

Verified email at ijs.si

computational linguistics, language technology, natural language processing, web corpora, genre identification


Title	Cited by	Year
Automatic genre identification: a survey; T Kuzman, N Ljubešić; Language Resources and Evaluation, 1-34, 2023	117	2023
Neural machine translation of literary texts from English to Slovene; T Kuzman, Š Vintar, M Arcan; Proceedings of the Qualities of Literary Machine Translation, 1-9, 2019	38	2019
Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 2.0; T Erjavec, M Ogrodniczuk, P Osenova, N Ljubešić, K Simov, V Grigorova, ...; CLARIN ERIC, 2021	35*	2021
ChatGPT: Beginning of an End of Manual Linguistic Data Annotation? Use Case of Automatic Genre Identification; T Kuzman, I Mozetič, N Ljubešić; arXiv preprint arXiv:2303.03953, 2023	30*	2023
MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages; M Bañón, M Esplà-Gomis, ML Forcada, C García-Romero, T Kuzman, ...; 23rd Annual Conference of the European Association for Machine Translation …, 2022	22	2022
Training corpus ssj500k 1.3. Slovenian language resource repository CLARIN.SI; S Krek, T Erjavec, K Dobrovoljc, S Može, N Ledinek, N Holz	20	2013
The GINCO training dataset for web genre identification of documents out in the wild; T Kuzman, P Rupnik, N Ljubešić; arXiv preprint arXiv:2201.03857, 2022	15	2022
Automatic genre identification for robust enrichment of massive text collections: Investigation of classification methods in the era of large language models; T Kuzman, I Mozetič, N Ljubešić; Machine Learning and Knowledge Extraction 5 (3), 1149-1175, 2023	14	2023
CLASSLA-web: Comparable Web Corpora of South Slavic Languages Enriched with Linguistic and Genre Annotation; N Ljubešić, T Kuzman; arXiv preprint arXiv:2403.12721, 2024	8	2024
BENCHić-lang: A benchmark for discriminating between Bosnian, Croatian, Montenegrin and Serbian; P Rupnik, T Kuzman, N Ljubešić; Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial …, 2023	8	2023
Training corpus ssj500k 1.4; S Krek, K Dobrovoljc, T Erjavec, S Može, N Ledinek, N Holz; Centre for Language Resources and Technologies, University of Ljubljana, 2015	8	2015
Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages; R van Noord, T Kuzman, P Rupnik, N Ljubešić, M Esplà-Gomis, ...; arXiv preprint arXiv:2403.08693, 2024	6	2024
Get to know your parallel data: Performing English variety and genre classification over MaCoCu corpora; T Kuzman, P Rupnik, N Ljubešić; Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial …, 2023	6	2023
Assessing comparability of genre datasets via cross-lingual and cross-dataset experiments; T Kuzman, N Ljubešić, S Pollak; Jezikovne tehnologije in digitalna humanistika: zbornik konference …, 2022	6	2022
Verbal multiword expressions in Slovene; P Gantar, S Krek, T Kuzman; Computational and Corpus-Based Phraseology: Second International Conference …, 2017	6	2017
Slovene-English parallel corpus MaCoCu-sl-en 2.0; M Bañón, M Chichirau, M Esplà-Gomis, ML Forcada, A Galiano-Jiménez, ...; Jožef Stefan Institute, 2023	5	2023
Choice of plausible alternatives dataset in Serbian COPA-SR; N Ljubešić, M Starović, T Kuzman, T Samardžić; Jožef Stefan Institute, 2022	5	2022
Exploring the Impact of Lexical and Grammatical Features on Automatic Genre Identification; T Kuzman, N Ljubešić; Proceedings of the Odkrivanje Znanja in Podatkovna Skladišča - SiKDD …, 2022	5	2022
Annotated Corpora and Tools of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions; C Ramisch, SR Cordeiro, A Savary, V Vincze, V Barbu Mititelu, A Bhatia, ...; LINDAT/CLARIAH-CZ Digital Library at the Institute of Formal and Applied …, 2018	5	2018
ChatGPT: Beginning of an End of Manual Linguistic Data Annotation? Use Case of Automatic Genre Identification. March 7, 2023; T Kuzman, I Mozetič, N Ljubešić; Reference Source, 0	5
