Open-Sora: Democratizing efficient video production for all

Z Zheng, X Peng, T Yang, C Shen, S Li, H Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Vision and language are two foundational senses for humans, and together they form the basis of
our cognitive ability and intelligence. While significant breakthroughs have been made in AI …

Stable and low-precision training for large-scale vision-language models

M Wortsman, T Dettmers… - Advances in …, 2023 - proceedings.neurips.cc
We introduce new methods for 1) accelerating and 2) stabilizing training for large language-
vision models. 1) For acceleration, we introduce SwitchBack, a linear layer for int8 quantized …
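For intuition, here is a minimal sketch of an int8 quantized linear layer in the spirit this snippet describes: low-precision matmuls for the output and the input gradient, with the weight gradient kept in 16-bit (the split the paper describes for SwitchBack). The row-wise absmax scaling and all names below are illustrative assumptions, and the quantize-dequantize simulation stands in for the real int8 GEMM kernels the paper uses.

```python
import torch

def absmax_quantize(x, bits=8):
    # Row-wise absmax scaling into the signed int8 range [-127, 127].
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax, qmax)
    return q, scale  # q holds integer values (stored as float for simplicity)

class Int8SwitchLinear(torch.autograd.Function):
    # Sketch: quantized matmuls for the output and the input gradient,
    # a 16-bit matmul for the weight gradient.
    @staticmethod
    def forward(ctx, x, weight):
        ctx.save_for_backward(x, weight)
        qx, sx = absmax_quantize(x)
        qw, sw = absmax_quantize(weight)
        # A real kernel would run the int8 GEMM directly; we dequantize
        # and use a float matmul so the sketch runs anywhere.
        return (qx * sx) @ (qw * sw).t()

    @staticmethod
    def backward(ctx, grad_out):
        x, weight = ctx.saved_tensors
        qg, sg = absmax_quantize(grad_out)
        qw, sw = absmax_quantize(weight)
        grad_input = (qg * sg) @ (qw * sw)  # low-precision path
        # The weight gradient stays in 16-bit (bfloat16 here), the matmul
        # kept at higher precision for training stability.
        grad_weight = grad_out.t().to(torch.bfloat16) @ x.to(torch.bfloat16)
        return grad_input, grad_weight.to(weight.dtype)

x = torch.randn(4, 64, requires_grad=True)
w = torch.randn(128, 64, requires_grad=True)
Int8SwitchLinear.apply(x, w).sum().backward()  # populates x.grad, w.grad
```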

Virchow2: Scaling self-supervised mixed magnification models in pathology

E Zimmermann, E Vorontsov, J Viret, A Casson… - arXiv preprint arXiv …, 2024 - arxiv.org
Foundation models are rapidly being developed for computational pathology applications.
However, it remains an open question which factors are most important for downstream …

On the implicit bias of Adam

MD Cattaneo, JM Klusowski, B Shigida - arXiv preprint arXiv:2309.00079, 2023 - arxiv.org
In previous literature, backward error analysis was used to find ordinary differential
equations (ODEs) approximating the gradient descent trajectory. It was found that finite step …
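The snippet is cut off before stating the finding; for context, the canonical backward-error-analysis result for plain gradient descent (the setting this paper extends to Adam) is that finite step sizes add an implicit gradient-norm penalty to the loss. A standard statement of that modified ODE, given here as background rather than as the paper's own result:

```latex
% Gradient descent, \theta_{k+1} = \theta_k - h\,\nabla f(\theta_k),
% is approximated to higher order in the step size h not by the
% gradient flow on f, but by the modified flow
\dot{\theta}(t) = -\nabla \tilde{f}\bigl(\theta(t)\bigr),
\qquad
\tilde{f}(\theta) = f(\theta) + \frac{h}{4}\,\bigl\lVert \nabla f(\theta) \bigr\rVert^2 .
% Finite steps thus implicitly regularize the gradient norm; the paper
% asks what the analogous implicit bias is for Adam.
```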

XGen-7B technical report

E Nijkamp, T Xie, H Hayashi, B Pang, C Xia… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have become ubiquitous across various domains,
transforming the way we interact with information and conduct research. However, most high …

Jointly training large autoregressive multimodal models

E Aiello, L Yu, Y Nie, A Aghajanyan, B Oguz - arXiv preprint arXiv …, 2023 - arxiv.org
In recent years, advances in the large-scale pretraining of language and text-to-image
models have revolutionized the field of machine learning. Yet, integrating these two …

Towards foundation models for materials science: The Open MatSci ML Toolkit

KLK Lee, C Gonzales, M Spellings, M Galkin… - Proceedings of the SC' …, 2023 - dl.acm.org
Artificial intelligence and machine learning have shown great promise in their ability to
accelerate novel materials discovery. As researchers and domain scientists seek to unify …

Why Transformers need Adam: A Hessian perspective

Y Zhang, C Chen, T Ding, Z Li, R Sun… - arXiv preprint arXiv …, 2024 - arxiv.org
SGD performs worse than Adam by a significant margin on Transformers, but the reason
remains unclear. In this work, we provide an explanation of SGD's failure on Transformers …

ReContrast: Domain-specific anomaly detection via contrastive reconstruction

J Guo, L Jia, W Zhang, H Li - Advances in Neural …, 2024 - proceedings.neurips.cc
Most advanced unsupervised anomaly detection (UAD) methods rely on modeling feature
representations of frozen encoder networks pre-trained on large-scale datasets, e.g. …

Data efficient neural scaling law via model reusing

P Wang, R Panda, Z Wang - International Conference on …, 2023 - proceedings.mlr.press
The number of parameters in large transformers has been observed to grow exponentially.
Despite notable performance improvements, concerns have been raised that such a …