PhoBERT: Pre-trained language models for Vietnamese

DQ Nguyen, AT Nguyen - arXiv preprint arXiv:2003.00744, 2020 - arxiv.org
We present PhoBERT with two versions, PhoBERT-base and PhoBERT-large, the first public
large-scale monolingual language models pre-trained for Vietnamese. Experimental results …

Vietnamese hate and offensive detection using PhoBERT-CNN and social media streaming data

K Quoc Tran, A Trong Nguyen, PG Hoang… - Neural Computing and …, 2023 - Springer
Society needs to develop a system to detect hate and offense to build a healthy and safe
environment. However, current research in this field still faces four major shortcomings …

The impact of wastewater treatment effluent on Crocodile River quality in Ehlanzeni district, Mpumalanga province, South Africa

TT Phungela, T Maphanga, BS Chidi… - South African Journal …, 2022 - scielo.org.za
Excessive discharge of poorly treated effluent has impacted global water resource systems
intensely. The declining state of wastewater treatment plants (WWTPs) is a significant source …

HSD shared task in VLSP campaign 2019: Hate speech detection for social good

XS Vu, T Vu, MV Tran, T Le-Cong, H Nguyen - arXiv preprint arXiv …, 2020 - arxiv.org
The paper describes the organisation of the "HateSpeech Detection" (HSD) task at the VLSP
workshop 2019 on detecting the fine-grained presence of hate speech in Vietnamese textual …

ViVQA: Vietnamese visual question answering

KQ Tran, AT Nguyen, ATH Le… - Proceedings of the 35th …, 2021 - aclanthology.org
Visual question answering (VQA) is a hot topic that has recently drawn the attention of
researchers from domains as diverse as natural language processing and computer vision …

Enhancing lexical-based approach with external knowledge for Vietnamese multiple-choice machine reading comprehension

K Van Nguyen, KV Tran, ST Luu, AGT Nguyen… - IEEE …, 2020 - ieeexplore.ieee.org
Although Vietnamese is the 17th most popular native-speaker language in the world, there
are not many research studies on Vietnamese machine reading comprehension (MRC), the …

XLMR4MD: new Vietnamese dataset and framework for detecting the consistency of description and permission in Android applications using large language models

QN Nguyen, NT Cam, K Van Nguyen - Computers & Security, 2024 - Elsevier
Google Play and other application marketplaces have various Android applications and
metadata. Among these, description information and privacy policy help explain the …

Predicting job titles from job descriptions with multi-label text classification

HT Tran, HHP Vo, ST Luu - 2021 8th NAFOSTED Conference …, 2021 - ieeexplore.ieee.org
Finding a suitable job and hunting for eligible candidates are important to job seeking and
human resource agencies. With the vast information about job descriptions, employees and …

ViGPTQA-state-of-the-art LLMs for Vietnamese question answering: system overview, core models training, and evaluations

MT Nguyen, KT Tran, N Van Nguyen… - Proceedings of the 2023 …, 2023 - aclanthology.org
Large language models (LLMs) and their applications in low-resource languages (such as
in Vietnamese) are limited due to lack of training data and benchmarking datasets. This …

MC-OCR challenge: Mobile-captured image document recognition for Vietnamese receipts

XS Vu, QA Bui, NV Nguyen… - … on Computing and …, 2021 - ieeexplore.ieee.org
The paper describes the organisation of the "Mobile Captured Receipt Recognition
Challenge" (MC-OCR) task at the RIVF conference 2021 on recognizing the fine-grained …