How can we detect homophobia and transphobia? experiments in a multilingual code-mixed setting for social media governance
Homophobia or Transphobia can be defined as the hatred, discomfort, or dislike of lesbian,
gay, transgender or bisexual people. Studies have shown that these individuals were more …
L3cube-hindbert and devbert: Pre-trained bert transformer models for devanagari based hindi and marathi languages
R Joshi - arXiv preprint arXiv:2211.11418, 2022 - arxiv.org
The monolingual Hindi BERT models currently available on the model hub do not perform
better than the multi-lingual models on downstream tasks. We present L3Cube-HindBERT, a …
L3cube-mahacorpus and mahabert: Marathi monolingual corpus, marathi bert language models, and resources
R Joshi - arXiv preprint arXiv:2202.01159, 2022 - arxiv.org
We present L3Cube-MahaCorpus a Marathi monolingual data set scraped from different
internet sources. We expand the existing Marathi monolingual corpus with 24.8 M sentences …
A review of bangla natural language processing tasks and the utility of transformer models
Bangla--ranked as the 6th most widely spoken language across the world
(https://www.ethnologue.com/guides/ethnologue200), with 230 million native speakers--is still …
Mono vs multilingual bert for hate speech detection and text classification: A case study in marathi
A Velankar, H Patil, R Joshi - IAPR Workshop on Artificial Neural Networks …, 2022 - Springer
Transformers are the most eminent architectures used for a vast range of Natural Language
Processing tasks. These models are pre-trained over a large text corpus and are meant to …
Distributed deep learning in open collaborations
Modern deep learning applications require increasingly more compute to train
state-of-the-art models. To address this demand, large corporations and institutions use dedicated High …
Hope speech detection in under-resourced kannada language
Numerous methods have been developed to monitor the spread of negativity in modern
years by eliminating vulgar, offensive, and fierce comments from social media platforms …
Hate speech detection: a comparison of mono and multilingual transformer model with cross-language evaluation
Warning: This paper contains examples of the language that some people may find
offensive. Transformer-based Language models have achieved state-of-the-art performance …
Authorship classification in a resource constraint language using convolutional neural networks
Authorship classification is a method of automatically determining the appropriate author of
an unknown linguistic text. Although research on authorship classification has significantly …
Role of language relatedness in multilingual fine-tuning of language models: A case study in indo-aryan languages
We explore the impact of leveraging the relatedness of languages that belong to the same
family in NLP models using multilingual fine-tuning. We hypothesize and validate that …