Photonic matrix multiplication lights up photonic accelerator and beyond

H Zhou, J Dong, J Cheng, W Dong, C Huang… - Light: Science & …, 2022 - nature.com
Matrix computation, as a fundamental building block of information processing in science
and technology, contributes most of the computational overheads in modern signal …

Sustainable AI: Environmental implications, challenges and opportunities

CJ Wu, R Raghavendra, U Gupta… - Proceedings of …, 2022 - proceedings.mlsys.org
This paper explores the environmental impact of the super-linear growth trends for AI from a
holistic perspective, spanning Data, Algorithms, and System Hardware. We characterize the …

FlashAttention: Fast and memory-efficient exact attention with IO-awareness

T Dao, D Fu, S Ermon, A Rudra… - Advances in Neural …, 2022 - proceedings.neurips.cc
Transformers are slow and memory-hungry on long sequences, since the time and memory
complexity of self-attention are quadratic in sequence length. Approximate attention …
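The quadratic cost this entry refers to is visible in a naive implementation: standard self-attention materializes an n × n score matrix before the softmax. The sketch below is only an illustrative NumPy baseline of that quadratic behavior, not the paper's IO-aware FlashAttention kernel; the function name and shapes are assumptions for the example.

```python
# Illustrative baseline (not the FlashAttention algorithm): standard
# self-attention builds an (n, n) score matrix, so memory grows
# quadratically with sequence length n.
import numpy as np

def naive_attention(Q, K, V):
    """Q, K, V: (n, d) arrays. Returns the (n, d) attention output."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (n, n) -- the quadratic term
    scores -= scores.max(axis=-1, keepdims=True)    # softmax, numerically stable
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                              # (n, d)

n, d = 4096, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = naive_attention(Q, K, V)
print(out.shape)  # (4096, 64); the intermediate score matrix holds n**2 entries
```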

Training compute-optimal large language models

J Hoffmann, S Borgeaud, A Mensch… - arXiv preprint arXiv …, 2022 - arxiv.org
We investigate the optimal model size and number of tokens for training a transformer
language model under a given compute budget. We find that current large language models …
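A rough back-of-the-envelope sketch of the trade-off this entry studies, assuming the commonly used approximation C ≈ 6·N·D training FLOPs for a transformer with N parameters trained on D tokens, and the rule of thumb often drawn from this paper of roughly 20 tokens per parameter. The constants and function below are illustrative assumptions, not the paper's fitted scaling laws.

```python
# Back-of-the-envelope sketch, not the paper's fitted scaling laws.
# Assumes C ~= 6 * N * D training FLOPs and ~20 tokens per parameter,
# both hedged approximations used here only for illustration.
TOKENS_PER_PARAM = 20  # illustrative constant, not a fitted value

def compute_optimal_split(flop_budget: float) -> tuple[float, float]:
    """Return (params N, tokens D) with D = 20*N and 6*N*D = flop_budget."""
    n = (flop_budget / (6 * TOKENS_PER_PARAM)) ** 0.5
    d = TOKENS_PER_PARAM * n
    return n, d

if __name__ == "__main__":
    for budget in (1e21, 1e23, 1e25):
        n, d = compute_optimal_split(budget)
        print(f"C={budget:.0e} FLOPs -> ~{n:.2e} params, ~{d:.2e} tokens")
```

For example, a budget near 6e23 FLOPs works out to roughly 70B parameters and 1.4T tokens under these assumptions, which is consistent with the paper's reported Chinchilla configuration.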

Mip-NeRF 360: Unbounded anti-aliased neural radiance fields

JT Barron, B Mildenhall, D Verbin… - Proceedings of the …, 2022 - openaccess.thecvf.com
Though neural radiance fields ("NeRF") have demonstrated impressive view synthesis
results on objects and small bounded regions of space, they struggle on "unbounded" …

RegNeRF: Regularizing neural radiance fields for view synthesis from sparse inputs

M Niemeyer, JT Barron, B Mildenhall… - Proceedings of the …, 2022 - openaccess.thecvf.com
Neural Radiance Fields (NeRF) have emerged as a powerful representation for the
task of novel view synthesis due to their simplicity and state-of-the-art performance. Though …

Edge learning using a fully integrated neuro-inspired memristor chip

W Zhang, P Yao, B Gao, Q Liu, D Wu, Q Zhang, Y Li… - Science, 2023 - science.org
Learning is highly important for edge intelligence devices to adapt to different application
scenes and owners. Current technologies for training neural networks require moving …

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Large language models in law: A survey

J Lai, W Gan, J Wu, Z Qi, SY Philip - AI Open, 2024 - Elsevier
The advent of artificial intelligence (AI) has significantly impacted the traditional judicial
industry. Moreover, recently, with the development of the concept of AI-generated content …