Probing pre-trained language models for cross-cultural differences in values

A Arora, LA Kaffee, I Augenstein - arXiv preprint arXiv:2203.13722, 2022 - arxiv.org
Language embeds information about social, cultural, and political values people hold. Prior
work has explored social and potentially harmful biases encoded in Pre-Trained Language …

More than a biomarker: could language be a biosocial marker of psychosis?

L Palaniyappan - npj Schizophrenia, 2021 - nature.com
Automated extraction of quantitative linguistic features has the potential to predict objectively
the onset and progression of psychosis. These linguistic variables are often considered to …

Simple, interpretable and stable method for detecting words with usage change across corpora

H Gonen, G Jawahar, D Seddah… - arXiv preprint arXiv …, 2021 - arxiv.org
The problem of comparing two bodies of text and searching for words that differ in their
usage between them arises often in digital humanities and computational social science …

BERTweetFR: Domain adaptation of pre-trained language models for French tweets

Y Guo, V Rennard, C Xypolopoulos… - arXiv preprint arXiv …, 2021 - arxiv.org
We introduce BERTweetFR, the first large-scale pre-trained language model for French
tweets. Our model is initialized using the general-domain French language model …

Geolocation of multiple sociolinguistic markers in Buenos Aires

O Kellert, NH Matlis - PLOS ONE, 2022 - journals.plos.org
Analysis of language geography is increasingly being used for studying spatial patterns of
social dynamics. This trend is fueled by social media platforms such as Twitter which provide …

Joint embedding of structure and features via graph convolutional networks

S Lerique, JL Abitbol, M Karsai - Applied Network Science, 2020 - Springer
The creation of social ties is largely determined by the entangled effects of people's
similarities in terms of individual characters and friends. However, feature and structural …

When dialects collide: how socioeconomic mixing affects language use

T Louf, JJ Ramasco, D Sánchez, M Karsai - arXiv preprint arXiv …, 2023 - arxiv.org
The socioeconomic background of people and how they use standard forms of language are
not independent, as demonstrated in various sociolinguistic studies. However, the extent to …

Measuring international online human values with word embeddings

G Magno, V Almeida - ACM Transactions on the Web (TWEB), 2021 - dl.acm.org
As the Internet grows in number of users and in the diversity of services, it becomes more
influential on people's lives. It has the potential of constructing or modifying the opinion, the …

Location, occupation, and semantics based socioeconomic status inference on Twitter

JL Abitbol, M Karsai, E Fleury - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
The socioeconomic status of people depends on a combination of individual characteristics
and environmental variables, thus its inference from online behavioral data is a difficult task …

American cultural regions mapped through the lexical analysis of social media

T Louf, B Gonçalves, JJ Ramasco, D Sánchez… - Humanities and Social …, 2023 - nature.com
Cultural areas represent a useful concept that cross-fertilizes diverse fields in social
sciences. Knowledge of how humans organize and relate their ideas and behavior within a …