History, development, and principles of large language models: an introductory survey

Z Wang, Z Chu, TV Doan, S Ni, M Yang, W Zhang - AI and Ethics, 2024 - Springer
Abstract Language models serve as a cornerstone in natural language processing, utilizing
mathematical methods to generalize language laws and knowledge for prediction and …

SeamlessM4T-Massively Multilingual & Multimodal Machine Translation

L Barrault, YA Chung, MC Meglioli, D Dale… - arXiv preprint arXiv …, 2023 - arxiv.org
What does it take to create the Babel Fish, a tool that can help individuals translate speech
between any two languages? While recent breakthroughs in text-based models have …

Contrastive search is what you need for neural text generation

Y Su, N Collier - arXiv preprint arXiv:2210.14140, 2022 - arxiv.org
Generating text with autoregressive language models (LMs) is of great importance to many
natural language processing (NLP) applications. Previous solutions for this task often …

Introduction to latent variable energy-based models: a path toward autonomous machine intelligence

A Dawid, Y LeCun - Journal of Statistical Mechanics: Theory and …, 2024 - iopscience.iop.org
Current automated systems have crucial limitations that need to be addressed before
artificial intelligence can reach human-like levels and bring new technological revolutions …

A benchmark for learning to translate a new language from one grammar book

G Tanzer, M Suzgun, E Visser, D Jurafsky… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) can perform impressive feats with in-context learning or
lightweight finetuning. It is natural to wonder how well these models adapt to genuinely new …

A picture is worth more than 77 text tokens: Evaluating CLIP-style models on dense captions

J Urbanek, F Bordes, P Astolfi… - Proceedings of the …, 2024 - openaccess.thecvf.com
Curation methods for massive vision-language datasets trade off between dataset size and
quality. However even the highest quality of available curated captions are far too short to …

Steering large language models for cross-lingual information retrieval

P Guo, Y Ren, Y Hu, Y Cao, Y Li, H Huang - Proceedings of the 47th …, 2024 - dl.acm.org
In today's digital age, accessing information across language barriers poses a significant
challenge, with conventional search systems often struggling to interpret and retrieve …

Lingua Franca–Entity-Aware Machine Translation Approach for Question Answering over Knowledge Graphs

N Srivastava, A Perevalov, D Kuchelev… - Proceedings of the 12th …, 2023 - dl.acm.org
This research paper proposes an approach called Lingua Franca that improves machine
translation quality by utilizing information from a knowledge graph to translate named …

Towards massive multilingual holistic bias

XE Tan, P Hansanti, C Wood, B Yu, C Ropers… - arXiv preprint arXiv …, 2024 - arxiv.org
In the current landscape of automatic language generation, there is a need to understand,
evaluate, and mitigate demographic biases as existing models are becoming increasingly …

Contamination Report for Multilingual Benchmarks

S Ahuja, V Gumma, S Sitaram - arXiv preprint arXiv:2410.16186, 2024 - arxiv.org
Benchmark contamination refers to the presence of test datasets in Large Language Model
(LLM) pre-training or post-training data. Contamination can lead to inflated scores on …