MultiLS: A multi-task lexical simplification framework
Lexical Simplification (LS) automatically replaces difficult-to-read words with easier
alternatives while preserving a sentence's original meaning. LS is a precursor to Text …
Advancing Generative AI for Portuguese with Open Decoder Gervásio PT
To advance the neural decoding of Portuguese, in this paper we present a fully open
Transformer-based, instruction-tuned decoder model that sets a new state of the art in this …
TeenyTinyLlama: open-source tiny language models trained in Brazilian Portuguese
NK Corrêa, S Falk, S Fatimah, A Sen… - Machine Learning with …, 2024 - Elsevier
Large language models (LLMs) have significantly advanced natural language processing,
but their progress has yet to be equal across languages. While most LLMs are trained in …
The importance of context for sentiment analysis in dialogues
Sentiment Analysis (SA) can be applied to dialogues to determine the emotional tone
throughout the conversation. This is beneficial for dialogue systems because it may improve …
ptt5-v2: A closer look at continued pretraining of T5 models for the Portuguese language
Despite advancements in Natural Language Processing (NLP) and the growing
availability of pretrained models, the English language remains the primary focus of model …
Exploring Portuguese Hate Speech Detection in Low-Resource Settings: Lightly Tuning Encoder Models or In-Context Learning of Large Models?
Automatically identifying hate speech is an emerging field driven by the growth of social
media and the consequent amplification of communication. However, this domain faces …
PORTULAN ExtraGLUE datasets and models: Kick-starting a benchmark for the neural processing of Portuguese
Leveraging research on the neural modelling of Portuguese, we contribute a collection of
datasets for an array of language processing tasks and a corresponding collection of fine …
Automatic text readability assessment in European Portuguese
The automatic assessment of text readability and the classification of texts by levels is
essential for language education and language-related industries that rely on effective …
A named entity recognition approach for Portuguese legislative texts using self-learning
Even if technology has made legislative documents more accessible, they are often written
in jargon that makes them hard to understand for ordinary citizens, researchers, journalists …
RoBERTaLexPT: A legal RoBERTa model pretrained with deduplication for Portuguese
This work investigates the application of Natural Language Processing (NLP) in the legal
context for the Portuguese language, emphasizing the importance of adapting pre-trained …