Tim Dettmers
Allen Institute for AI; Carnegie Mellon University
Verified email at allenai.org - Homepage
Title
Cited by
Year
Convolutional 2D knowledge graph embeddings
T Dettmers, P Minervini, P Stenetorp, S Riedel
AAAI 2018, 2018
3213 · 2018
QLoRA: Efficient finetuning of quantized LLMs
T Dettmers, A Pagnoni, A Holtzman, L Zettlemoyer
NeurIPS 2023 (Oral), 2023
2394 · 2023
BLOOM: A 176B-parameter open-access multilingual language model
T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
1748 · 2023
LLM.int8(): 8-bit matrix multiplication for transformers at scale
T Dettmers, M Lewis, Y Belkada, L Zettlemoyer
NeurIPS 2022, 2022
974* · 2022
Sparse networks from scratch: Faster training without losing performance
T Dettmers, L Zettlemoyer
arXiv preprint arXiv:1907.04840, 2019
392 · 2019
Base layers: Simplifying training of large, sparse models
M Lewis, S Bhosale, T Dettmers, N Goyal, L Zettlemoyer
ICML 2021, 2021
252 · 2021
8-bit Optimizers via Block-wise Quantization
T Dettmers, M Lewis, S Shleifer, L Zettlemoyer
ICLR 2022 (Spotlight), 2022
242 · 2022
8-bit Approximations for Parallelism in Deep Learning
T Dettmers
ICLR 2016, 2016
234 · 2016
SpQR: A sparse-quantized representation for near-lossless LLM weight compression
T Dettmers, R Svirschevski, V Egiazarian, D Kuznedelev, E Frantar, ...
arXiv preprint arXiv:2306.03078, 2023
202 · 2023
The case for 4-bit precision: k-bit inference scaling laws
T Dettmers, L Zettlemoyer
ICML 2023, 2023
194 · 2023
Branch-train-merge: Embarrassingly parallel training of expert language models
M Li, S Gururangan, T Dettmers, M Lewis, T Althoff, NA Smith, ...
arXiv preprint arXiv:2208.03306, 2022
145 · 2022
Petals: Collaborative inference and fine-tuning of large models
A Borzunov, D Baranchuk, T Dettmers, M Ryabinin, Y Belkada, ...
ACL 2022, Demonstration, 2022
96* · 2022
Stable and low-precision training for large-scale vision-language models
M Wortsman, T Dettmers, L Zettlemoyer, A Morcos, A Farhadi, L Schmidt
NeurIPS 2023, 2023
37 · 2023
Swarm parallelism: Training large models can be surprisingly communication-efficient
M Ryabinin, T Dettmers, M Diskin, A Borzunov
NeurIPS 2023, 2023
26 · 2023
Jack the Reader - A machine reading framework
D Weissenborn, P Minervini, T Dettmers, I Augenstein, J Welbl, ...
arXiv preprint arXiv:1806.08727, 2018
12 · 2018
Training transformers together
A Borzunov, M Ryabinin, T Dettmers, Q Lhoest, L Saulnier, M Diskin, ...
NeurIPS 2021 Demonstration, 2022
11 · 2022
MatFormer: Nested transformer for elastic inference
S Kudugunta, A Kusupati, T Dettmers, K Chen, I Dhillon, Y Tsvetkov, ...
arXiv preprint arXiv:2310.07707, 2023
7 · 2023
High performance natural language processing
G Ilharco, C Ilharco, I Turc, T Dettmers, F Ferreira, K Lee
EMNLP 2020, Tutorial, 2020
7 · 2020
OLMoE: Open mixture-of-experts language models
N Muennighoff, L Soldaini, D Groeneveld, K Lo, J Morrison, S Min, W Shi, ...
arXiv preprint arXiv:2409.02060, 2024
4 · 2024
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore
R Shao, J He, A Asai, W Shi, T Dettmers, S Min, L Zettlemoyer, PWW Koh
Advances in Neural Information Processing Systems 37, 91260-91299, 2025
3 · 2025
Articles 1–20