- Academic Search

Geographic Adaptation of Pretrained Language Models

V Hofmann, G Glavaš, N Ljubešić… - Transactions of the …, 2024 - direct.mit.edu

While pretrained language models (PLMs) have been shown to possess a plethora of
linguistic knowledge, the existing body of research has largely neglected extralinguistic …

Cited by: 3

CLASSLA-Stanza: The next step for linguistic processing of South Slavic Languages

L Terčon, N Ljubešić - arXiv preprint …

Mapping the languages of Twitter in Finland

T Hiippala, T Väisänen, T Toivonen, O Järv - Neuphilologische Mitteilungen, 2020 - JSTOR

Twitter is a popular social media platform for scholarly research, because the
user-generated content on the platform can also include geographic and temporal information …

Cited by: 17

Using social-media data to investigate morphosyntactic variation and dialect syntax in a lesser-used language: Two case studies from Welsh

D Willis - Glossa, 2020 - ora.ox.ac.uk

Data gathered from social media have been used extensively to examine lexical dialect
variation in widely used languages such as English and Spanish, but their use to date in …

Cited by: 17

Together we are stronger: Bootstrapping language technology infrastructure for South Slavic languages with CLARIN.SI

N Ljubešić, T Erjavec, M Miličević Petrović… - … . The Infrastructure for …, 2022 - degruyter.com

In this chapter we describe the recent developments in language technology infrastructure
building for three South Slavic languages–Slovenian, Croatian, and Serbian. These …

Cited by: 6

CLASSLA-web: Comparable Web Corpora of South Slavic Languages Enriched with Linguistic and Genre Annotation

N Ljubešić, T Kuzman - arXiv preprint arXiv:2403.12721, 2024 - arxiv.org

This paper presents a collection of highly comparable web corpora of Slovenian, Croatian,
Bosnian, Montenegrin, Serbian, Macedonian, and Bulgarian, covering thereby the whole …

Cited by: 8

How to optimize your Twitter collection: Dutch keywords for better coverage

T Kreutz, W Daelemans - Computational Linguistics in the …, 2019 - clinjournal.org

Twitter allows API calls to retrieve one percent of all tweets at any time using a search word
list. Since some languages, including Dutch, make up less than one percent of all tweets on …

Cited by: 7

6 Data Collection and Representation for Similar Languages, Varieties and Dialects

T Samardžić, N Ljubešić - Similar Languages, Varieties, and …, 2021 - books.google.com

Collections of digital text intended for research–known as language corpora–have been
used as linguistic data since the pioneering work on the Brown corpus by Francis and …

Cited by: 3

The Russian invasion of Ukraine through the lens of ex-Yugoslavian Twitter

B Evkoski, I Mozetič, PK Novak, N Ljubešić - 2022 - researchgate.net

The Russian invasion of Ukraine marks a dramatic change in international
relations globally, as well as at specific, already unstable, regions. The geographical area of …

Cited by: 1

Borders and boundaries in Bosnian, Croatian, Montenegrin and Serbian: Twitter data to the rescue

Geographic Adaptation of Pretrained Language Models
