Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Prompting large language model for machine translation: A case study

B Zhang, B Haddow, A Birch - International Conference on …, 2023 - proceedings.mlr.press
Research on prompting has shown that excellent performance can be achieved with little or
even no supervised training across many tasks. However, prompting for machine translation is still under …
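
A minimal sketch of the in-context setup this line of work studies: a handful of demonstration pairs followed by the test sentence, with the model's continuation taken as the translation. The example pairs, the German-English direction, and the prompt layout are illustrative assumptions, not the paper's exact configuration.

```python
# Few-shot translation prompting: k demonstration pairs, then the test source.
# The pairs and the language direction are illustrative assumptions.
FEW_SHOT_PAIRS = [
    ("Guten Morgen.", "Good morning."),
    ("Wie geht es dir?", "How are you?"),
]

def build_translation_prompt(source: str,
                             src_lang: str = "German",
                             tgt_lang: str = "English") -> str:
    """Concatenate demonstrations, then leave the target of the test item open."""
    blocks = [f"{src_lang}: {s}\n{tgt_lang}: {t}" for s, t in FEW_SHOT_PAIRS]
    blocks.append(f"{src_lang}: {source}\n{tgt_lang}:")
    return "\n\n".join(blocks)

if __name__ == "__main__":
    # Feed the printed string to any causal LM; its continuation is the translation.
    print(build_translation_prompt("Das Wetter ist heute schön."))
```

With FEW_SHOT_PAIRS left empty, the same template degrades gracefully to the zero-shot variant.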

Crosslingual generalization through multitask finetuning

N Muennighoff, T Wang, L Sutawika, A Roberts… - arXiv preprint arXiv …, 2022 - arxiv.org
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …
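
The core mechanism of multitask prompted finetuning is a data transformation: supervised examples from many tasks are rendered through natural-language templates into (input, target) text pairs, and one model is finetuned on the mixture. A toy sketch of that data side, with templates invented here rather than the xP3 prompt collection the paper uses.

```python
# Multitask prompted finetuning, data side: raw supervised examples from
# different tasks become (input, target) text pairs for one shared model.
TEMPLATES = {
    "nli": "Premise: {premise}\nHypothesis: {hypothesis}\n"
           "Does the premise entail the hypothesis?",
    "sentiment": "Review: {text}\nIs this review positive or negative?",
}

def to_prompted_pair(task: str, example: dict, label: str) -> tuple[str, str]:
    """Render one raw example as an (input, target) pair for finetuning."""
    return TEMPLATES[task].format(**example), label

mixture = [
    to_prompted_pair("nli",
                     {"premise": "A dog runs.", "hypothesis": "An animal moves."},
                     "yes"),
    to_prompted_pair("sentiment", {"text": "I loved this film."}, "positive"),
]
for inp, tgt in mixture:
    print(inp, "->", tgt, sep="\n")
```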

Prompting PaLM for translation: Assessing strategies and performance

D Vilar, M Freitag, C Cherry, J Luo, V Ratnakar… - arXiv preprint arXiv …, 2022 - arxiv.org
Large language models (LLMs) that have been trained on multilingual but not parallel text
exhibit a remarkable ability to translate between languages. We probe this ability in an in …
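
Assessing prompting strategies, as this paper does, amounts to generating translations under each prompt format and scoring them against references. A sketch assuming sacrebleu is installed; the strategy templates and the German-English setup are illustrative, not the paper's exact protocol.

```python
import sacrebleu  # pip install sacrebleu

def zero_shot(src: str) -> str:
    """Instruction-only prompt, no demonstrations."""
    return f"Translate German to English.\nGerman: {src}\nEnglish:"

def few_shot(src: str, demos: list[tuple[str, str]]) -> str:
    """In-context demonstration pairs before the test sentence."""
    shots = "\n".join(f"German: {s}\nEnglish: {t}" for s, t in demos)
    return f"{shots}\nGerman: {src}\nEnglish:"

def score(hypotheses: list[str], references: list[str]) -> float:
    """Corpus-level BLEU for the outputs produced under one strategy."""
    return sacrebleu.corpus_bleu(hypotheses, [references]).score

# With real model outputs collected under each strategy, compare the scores.
print(score(["the cat sat on the mat ."], ["the cat sat on the mat ."]))  # 100.0
```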

Multilingual large language model: A survey of resources, taxonomy and frontiers

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao… - arXiv preprint arXiv …, 2024 - arxiv.org
Multilingual Large Language Models build on powerful Large Language Models to handle
and respond to queries in multiple languages, achieving remarkable …

Faithful logical reasoning via symbolic chain-of-thought

J Xu, H Fei, L Pan, Q Liu, ML Lee, W Hsu - arXiv preprint arXiv:2405.18357, 2024 - arxiv.org
While the recent Chain-of-Thought (CoT) technique enhances the reasoning ability of large
language models (LLMs) with the theory of mind, it might still struggle in handling logical …
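
The contrast with free-form CoT is that symbolic reasoning makes every inference step mechanically checkable. A toy illustration of that property using propositional forward chaining; the rule format and solver are simplifications invented here, not the SymbCoT pipeline, which has an LLM translate problems into first-order logic before reasoning over them.

```python
# Facts and rules are explicit symbols; deduction is repeated modus ponens,
# so the derivation itself serves as an auditable proof.
FACTS = {"rainy"}
RULES = [({"rainy"}, "wet_ground"), ({"wet_ground"}, "slippery")]  # premises -> conclusion

def forward_chain(facts: set[str], rules) -> set[str]:
    """Apply rules until a fixed point; every added fact is justified by a rule."""
    derived, changed = set(facts), True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

print(forward_chain(FACTS, RULES))  # {'rainy', 'wet_ground', 'slippery'}
```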

Nonparametric masked language modeling

S Min, W Shi, M Lewis, X Chen, W Yih… - arXiv preprint arXiv …, 2022 - arxiv.org
Existing language models (LMs) predict tokens with a softmax over a finite vocabulary,
which can make it difficult to predict rare tokens or phrases. We introduce NPM, the first …
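
The snippet's key idea fits in a few lines: the masked position's embedding is scored against embeddings of tokens in a reference corpus, and the output distribution is a softmax over corpus positions rather than over a fixed vocabulary, so any string present in the corpus, however rare, is predictable. The random vectors below are stand-ins for NPM's trained encoders.

```python
# Nonparametric prediction in miniature: nearest corpus token, not vocab softmax.
import numpy as np

rng = np.random.default_rng(0)
corpus_tokens = ["Thessaloniki", "Marseille", "Okinawa"]   # rare strings are fine
corpus_embs = rng.normal(size=(3, 16))                     # one vector per corpus token
query = corpus_embs[1] + 0.01 * rng.normal(size=16)        # [MASK] position's embedding

scores = corpus_embs @ query                               # inner-product similarity
probs = np.exp(scores - scores.max())
probs /= probs.sum()                                       # distribution over corpus positions
print(corpus_tokens[int(np.argmax(probs))])                # expected: Marseille
```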

Meet in the middle: A new pre-training paradigm

A Nguyen, N Karampatziakis… - Advances in Neural …, 2023 - proceedings.neurips.cc
Most language models (LMs) are trained and applied in an autoregressive left-to-right
fashion, predicting the next token from the preceding ones. However, this ignores that the full …
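
A conceptual sketch of the objective this snippet describes: a forward model predicts each token from the left context, a backward model predicts it from the right context, and an agreement term pushes their distributions together so each direction benefits from the full sequence. The dummy arrays and the total-variation-style penalty are illustrative assumptions about the form of the loss, not the paper's exact formulation.

```python
import numpy as np

def nll(p: np.ndarray, gold: int) -> float:
    """Negative log-likelihood of the gold token under one direction."""
    return -float(np.log(p[gold]))

def agreement(p: np.ndarray, q: np.ndarray) -> float:
    """Total-variation-style distance between the two directions' predictions."""
    return 0.5 * float(np.abs(p - q).sum())

p_fwd = np.array([0.7, 0.2, 0.1])  # forward model, toy 3-word vocabulary
p_bwd = np.array([0.6, 0.3, 0.1])  # backward model, same position
gold = 0
loss = nll(p_fwd, gold) + nll(p_bwd, gold) + agreement(p_fwd, p_bwd)
print(round(loss, 4))
```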

Towards generating functionally correct code edits from natural language issue descriptions

S Fakhoury, S Chakraborty, M Musuvathi… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs), such as OpenAI's Codex, have demonstrated their potential
to generate code from natural language descriptions across a wide range of programming …
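
The evaluation this title points toward is behavioral: an edit counts as functionally correct if the project's tests pass afterwards, not if it textually matches a reference patch. A sketch of that loop, assuming a pytest-based project; the prompt wording and helper names are invented for illustration.

```python
import subprocess

def build_edit_prompt(issue: str, snippet: str) -> str:
    """Prompt the model with the issue text plus the localized source code."""
    return ("Issue description:\n" + issue +
            "\n\nRelevant code:\n" + snippet +
            "\n\nRewrite the code so the issue is fixed. Output only the code.")

def functionally_correct(repo_dir: str) -> bool:
    """Judge the edit by behavior: run the test suite on the edited checkout."""
    return subprocess.run(["pytest", "-q"], cwd=repo_dir).returncode == 0

print(build_edit_prompt("divide() crashes when b == 0",
                        "def divide(a, b):\n    return a / b"))
```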

This land is {Your, My} land: Evaluating geopolitical bias in language models through territorial disputes

B Li, S Haider, C Callison-Burch - … of the 2024 Conference of the …, 2024 - aclanthology.org
Do the Spratly Islands belong to China, the Philippines, or Vietnam? A pretrained
large language model (LLM) may answer differently if asked in the languages of each …
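
The probing setup the abstract describes is straightforward to sketch: pose the same territorial-dispute question in each claimant country's language and check whether the answers agree. The translations below are ours, and ask_model is a placeholder for any chat LLM call.

```python
# Language-dependent answers to the same factual question indicate bias.
QUESTIONS = {
    "English": "Do the Spratly Islands belong to China, the Philippines, or Vietnam?",
    "Chinese": "南沙群岛属于中国、菲律宾还是越南？",
    "Vietnamese": "Quần đảo Trường Sa thuộc về Trung Quốc, Philippines hay Việt Nam?",
}

def ask_model(question: str) -> str:
    # Placeholder: route the question to an actual LLM here.
    return "<model answer>"

answers = {lang: ask_model(q) for lang, q in QUESTIONS.items()}
consistent = len(set(answers.values())) == 1  # True only if all languages agree
print(answers, consistent)
```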