Toward human-like evaluation for natural language generation with error analysis

Q Lu, L Ding, L Xie, K Zhang, DF Wong… - arXiv preprint arXiv …, 2022 - arxiv.org
The state-of-the-art language model-based automatic metrics, e.g. BARTScore, benefiting
from large-scale contextualized pre-training, have been successfully used in a wide range of …

An overview on machine translation evaluation

L Han - arXiv preprint arXiv:2202.11027, 2022 - arxiv.org
Since the 1950s, machine translation (MT) has become one of the important tasks of AI and
development, and has experienced several different periods and stages of development …

Large language models and control mechanisms improve text readability of biomedical abstracts

Z Li, S Belkadi, N Micheletti, L Han, M Shardlow… - arXiv preprint arXiv …, 2023 - arxiv.org
Biomedical literature often uses complex language and inaccessible professional
terminologies. That is why simplification plays an important role in improving public health …

Neural machine translation of clinical text: an empirical investigation into multilingual pre-trained language models and transfer-learning

L Han, S Gladkoff, G Erofeev, I Sorokina… - Frontiers in Digital …, 2024 - frontiersin.org
Clinical text and documents contain very rich information and knowledge in healthcare, and
their processing using state-of-the-art language technology becomes very important for …

Investigating massive multilingual pre-trained machine translation models for clinical domain via transfer learning

L Han, G Erofeev, I Sorokina, S Gladkoff… - arXiv preprint arXiv …, 2022 - arxiv.org
Massively multilingual pre-trained language models (MMPLMs) are developed in recent
years demonstrating superpowers and the pre-knowledge they acquire for downstream …

Investigating large language models and control mechanisms to improve text readability of biomedical abstracts

Z Li, S Belkadi, N Micheletti, L Han… - 2024 IEEE 12th …, 2024 - ieeexplore.ieee.org
Biomedical literature often uses complex language and inaccessible professional
terminologies. That is why simplification plays an important role in improving public health …

Topic modelling of Swedish newspaper articles about coronavirus: a case study using latent Dirichlet allocation method

B Griciūtė, L Han, G Nenadic - 2023 IEEE 11th International …, 2023 - ieeexplore.ieee.org
Topic Modelling (TM) is a natural language processing (NLP) method for discovering topics
in a collection of documents. Being an unsupervised method, it is a valuable tool when trying …

Linguistically-motivated Yorùbá-English machine translation

I Adebara, M Abdul-Mageed… - Proceedings of the 29th …, 2022 - aclanthology.org
Translating between languages where certain features are marked morphologically in one
but absent or marked contextually in the other is an important test case for machine …

Predicting Perfect Quality Segments in MT Output with Fine-Tuned OpenAI LLM: Is it possible to capture editing distance patterns from historical data?

S Gladkoff, G Erofeev, L Han, G Nenadic - arXiv preprint arXiv:2308.00158, 2023 - arxiv.org
Translation Quality Estimation (TQE) is an important step before deploying the output
translation into usage. TQE is also critical in assessing machine translation (MT) and human …

Student's t-Distribution: On Measuring the Inter-Rater Reliability When the Observations are Scarce

S Gladkoff, L Han, G Nenadic - arXiv preprint arXiv:2303.04526, 2023 - arxiv.org
In natural language processing (NLP) we always rely on human judgement as the golden
quality evaluation method. However, there has been an ongoing debate on how to better …