Foundation and large language models: fundamentals, challenges, opportunities, and social impacts

D Myers, R Mohawesh, VI Chellaboina, AL Sathvik… - Cluster …, 2024 - Springer
Abstract Foundation and Large Language Models (FLLMs) are models that are trained using
a massive amount of data with the intent to perform a variety of downstream tasks. FLLMs …

Transformer: A general framework from machine translation to others

Y Zhao, J Zhang, C Zong - Machine Intelligence Research, 2023 - Springer
Abstract Machine translation is an important and challenging task that aims at automatically
translating natural language sentences from one language into another. Recently …

Break the sequential dependency of LLM inference using lookahead decoding

Y Fu, P Bailis, I Stoica, H Zhang - arXiv preprint arXiv:2402.02057, 2024 - arxiv.org
Autoregressive decoding of large language models (LLMs) is memory bandwidth bounded,
resulting in high latency and significant waste of the parallel processing power of modern …

A survey on non-autoregressive generation for neural machine translation and beyond

Y Xiao, L Wu, J Guo, J Li, M Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Non-autoregressive (NAR) generation, which was first proposed in neural machine translation
(NMT) to speed up inference, has attracted much attention in both machine learning and …

Artificial intelligence foundation and pre-trained models: Fundamentals, applications, opportunities, and social impacts

A Kolides, A Nawaz, A Rathor, D Beeman… - … Modelling Practice and …, 2023 - Elsevier
With the emergence of foundation models (FMs) that are trained on large amounts of data at
scale and adaptable to a wide range of downstream applications, AI is experiencing a …

Multilingual text categorization and sentiment analysis: a comparative analysis of the utilization of multilingual approaches for classifying Twitter data

G Manias, A Mavrogiorgou, A Kiourtis… - Neural Computing and …, 2023 - Springer
Text categorization and sentiment analysis are two of the most typical natural language
processing tasks with various emerging applications implemented and utilized in different …

AMOM: adaptive masking over masking for conditional masked language model

Y Xiao, R Xu, L Wu, J Li, T Qin, TY Liu… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Transformer-based autoregressive (AR) methods have achieved appealing performance for
varied sequence-to-sequence generation tasks, e.g., neural machine translation …

ESM all-atom: multi-scale protein language model for unified molecular modeling

K Zheng, S Long, T Lu, J Yang, X Dai, M Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Protein language models have demonstrated significant potential in the field of protein
engineering. However, current protein language models primarily operate at the residue …

Importance-aware data augmentation for document-level neural machine translation

M Wu, Y Wang, G Foster, L Qu, G Haffari - arXiv preprint arXiv:2401.15360, 2024 - arxiv.org
Document-level neural machine translation (DocNMT) aims to generate translations that are
both coherent and cohesive, in contrast to its sentence-level counterpart. However, due to its …

Code-switching with word senses for pretraining in neural machine translation

V Iyer, E Barba, A Birch, JZ Pan, R Navigli - arXiv preprint arXiv …, 2023 - arxiv.org
Lexical ambiguity is a significant and pervasive challenge in Neural Machine Translation
(NMT), with many state-of-the-art (SOTA) NMT systems struggling to handle polysemous …