Where in the world are you? Geolocation and language identification in Twitter

M Graham, SA Hale, D Gaffney - The Professional Geographer, 2014 - Taylor & Francis
The movements of ideas and content between locations and languages are unquestionably
crucial concerns to researchers of the information age, and Twitter has emerged as a …

Microblog language identification: Overcoming the limitations of short, unedited and idiomatic text

S Carter, W Weerkamp, M Tsagkias - Language Resources and …, 2013 - Springer
Multilingual posts can potentially affect the outcomes of content analysis on microblog
platforms. To this end, language identification can provide a monolingual set of content for …

Global connectivity and multilinguals in the Twitter network

SA Hale - Proceedings of the SIGCHI conference on human …, 2014 - dl.acm.org
This article analyzes the global connectivity of the Twitter retweet and mentions network and
the role of multilingual users engaging with content in multiple languages. The network is …

When sparse traditional models outperform dense neural networks: the curious case of discriminating between similar languages

M Medvedeva, M Kroon, B Plank - … of the Fourth Workshop on NLP …, 2017 - aclanthology.org
We present the results of our participation in the VarDial 4 shared task on discriminating
closely related languages. Our submission includes simple traditional models using linear …

Selecting and weighting n-grams to identify 1100 languages

RD Brown - Text, Speech, and Dialogue: 16th International …, 2013 - Springer
This paper presents a language identification algorithm using cosine similarity against a
filtered and weighted subset of the most frequent n-grams in training data with optional inter …

[BOOK] Twitter als Basis wissenschaftlicher Studien: Eine Bewertung gängiger Erhebungs-und Analysemethoden der Twitter-Forschung

F Pfaffenberger - 2016 - library.oapen.org
With regard to the analysis of Twitter data, a similarly nuanced picture emerges:
Twitter-related studies focus not only on pure content analyses, but …

[PDF] How people use twitter in different languages

W Weerkamp, S Carter, M Tsagkias - 2011 - researchgate.net
In this paper we describe how Twitter is used in various languages. We observe notable
differences between languages regarding the use of hashtags, links, mentions, and …

[PDF] Where in the world are you? Geolocation and language identification in Twitter

S Hale, D Gaffney, M Graham - Proceedings of ICWSM, 2012 - academia.edu
The movements of ideas and content between locations and languages are unquestionably
crucial concerns to researchers of the information age, and Twitter has emerged as a …

[HTML] Word-length algorithm for language identification of under-resourced languages

A Selamat, N Akosu - Journal of King Saud University-Computer and …, 2016 - Elsevier
Language identification is widely used in machine learning, text mining, information
retrieval, and speech processing. Available techniques for solving the problem of language …

[PDF] Generalized language identification

MH Lui - 2014 - minerva-access.unimelb.edu.au
Language identification is the task of determining the natural language that a
document or part thereof is written in. The central theme of this thesis is generalized …