Data governance in the age of large-scale data-driven language technology

Y Jernite, H Nguyen, S Biderman, A Rogers… - Proceedings of the …, 2022 - dl.acm.org
The recent emergence and adoption of Machine Learning technology, and specifically of
Large Language Models, has drawn attention to the need for systematic and transparent …

Donate speech

K Lindén, T Jauhiainen, M Lennes, M Kurimo, A Rossi… - CLARIN, 2022 - degruyter.com
The Donate Speech campaign aimed to collect 10,000 hours of ordinary, casual Finnish
speech to be used for studying language as well as for developing technology and services …

The CLARIN Committee for Legal and Ethical Issues and the Normative Layer of the CLARIN Infrastructure: Ville Oksanen, in memoriam (26 December 1976-23 …

P Kamocki, A Kelli, K Lindén - CLARIN: The Infrastructure for …, 2022 - degruyter.com
The normative layer of CLARIN is, alongside the organizational and technical layers, an
essential part of the infrastructure. It consists of the regulatory framework (statutory law, case …

The impact of copyright and personal data laws on the creation and use of models for language technologies

A Kelli, A Tavast, K Lindén, K Vider… - … Papers from the …, 2020 - researchportal.helsinki.fi
The authors address the legal issues relating to the creation and use of language models.
The article begins with an explanation of the development of language technologies. The …

Challenges of transformation of research data into open data: The perspective of social sciences and humanities

A Kelli, T Mets, K Vider, A Värv… - International …, 2018 - intellectdiscover.com
The authors address the transformation of research data into open data. The article draws
on the experience in four countries: Sweden, Finland, Estonia and Lithuania. The …

Sharing is caring: a legal perspective on sharing language data containing personal data and the division of liability between researchers and research organisations

A Kelli, K Lindén, K Vider, P Kamocki… - LINKÖPING …, 2021 - usiena-air.unisi.it
The article focuses on determining responsible parties and the division of potential liability
arising from sharing language data (LD) containing personal data (PD). A key issue here is …

Categorizing legal features in a metadata-oriented task: defining the conditions of use

M Rigault, V Arranz, V Mapelli… - Proceedings of the …, 2022 - aclanthology.org
In recent times, more attention has been brought by the Human Language Technology
(HLT) community to the legal framework for making available and reusing Language …

A CLARIN contractual framework for sharing personal data for scientific research

K Lindén, A Kelli, A Nousias - Selected Papers from the …, 2020 - researchportal.helsinki.fi
The development and use of language resources often involve the processing of personal
data. Processing has to have a legal ground. The General Data Protection Regulation …

CLARIN contractual framework for sharing language data: The perspective of personal data protection

A Kelli, K Lindén, K Vider, P Kamocki… - CLARIN Annual …, 2020 - researchportal.helsinki.fi
The article analyses the responsibility for ensuring compliance with the General Data
Protection Regulation (GDPR) in research settings. As a general rule, organisations are …