[PDF] Language model behavior: A comprehensive survey

TA Chang, BK Bergen - Computational Linguistics, 2024 - direct.mit.edu
Transformer language models have received widespread public attention, yet their
generated text is often surprising even to NLP researchers. In this survey, we discuss over …

LightGlue: Local feature matching at light speed

P Lindenberger, PE Sarlin… - Proceedings of the …, 2023 - openaccess.thecvf.com
We introduce LightGlue, a deep neural network that learns to match local features across
images. We revisit multiple design decisions of SuperGlue, the state of the art in sparse …

AudioLM: a language modeling approach to audio generation

Z Borsos, R Marinier, D Vincent… - … ACM Transactions on …, 2023 - ieeexplore.ieee.org
We introduce AudioLM, a framework for high-quality audio generation with long-term
consistency. AudioLM maps the input audio to a sequence of discrete tokens and casts …

A holistic approach to undesired content detection in the real world

T Markov, C Zhang, S Agarwal, FE Nekoul… - Proceedings of the …, 2023 - ojs.aaai.org
We present a holistic approach to building a robust and useful natural language
classification system for real-world content moderation. The success of such a system relies …

Do large language models understand us?

BA y Arcas - Daedalus, 2022 - direct.mit.edu
Large language models (LLMs) represent a major advance in artificial intelligence and, in
particular, toward the goal of human-like artificial general intelligence. It is sometimes …

[PDF] ChatGPT vs. Bard: a comparative study

I Ahmed, A Roy, M Kajol, U Hasan, PP Datta… - Authorea …, 2023 - authorea.com
The rapid progress in conversational AI has given rise to advanced language models
capable of generating human-like text. Among these models, ChatGPT and Bard …

A survey of neural code intelligence: Paradigms, advances and beyond

Q Sun, Z Chen, F Xu, K Cheng, C Ma, Z Yin… - arXiv preprint arXiv …, 2024 - arxiv.org
Neural Code Intelligence--leveraging deep learning to understand, generate, and optimize
code--holds immense potential for transformative impact on society as a whole. Bridging the …

Omni-SMoLA: Boosting generalist multimodal models with soft mixture of low-rank experts

J Wu, X Hu, Y Wang, B Pang… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this work we present Omni-SMoLA, a multimodal architecture that mixes many multimodal
experts efficiently and achieves both high specialist and generalist performance. In contrast …

Understanding the potential of FPGA-based spatial acceleration for large language model inference

H Chen, J Zhang, Y Du, S Xiang, Z Yue… - ACM Transactions on …, 2024 - dl.acm.org
Recent advancements in large language models (LLMs) boasting billions of parameters
have generated a significant demand for efficient deployment in inference workloads. While …

Fairness-aware structured pruning in transformers

A Zayed, G Mordido, S Shabanian, I Baldini… - Proceedings of the …, 2024 - ojs.aaai.org
The increasing size of large language models (LLMs) has introduced challenges in their
training and inference. Removing model components is perceived as a solution to tackle the …