Tim Dettmers
Allen Institute for AI; Carnegie Mellon University
Verified email at allenai.org - Homepage
Title
Cited by
Year
Convolutional 2D knowledge graph embeddings
T Dettmers, P Minervini, P Stenetorp, S Riedel
AAAI 2018, 2018
Cited by 3226 · 2018
QLoRA: Efficient finetuning of quantized LLMs
T Dettmers, A Pagnoni, A Holtzman, L Zettlemoyer
NeurIPS 2023 (Oral), 2023
Cited by 2479 · 2023
BLOOM: A 176B-parameter open-access multilingual language model
T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
Cited by 1773 · 2023
LLM.int8(): 8-bit matrix multiplication for transformers at scale
T Dettmers, M Lewis, Y Belkada, L Zettlemoyer
NeurIPS 2022, 2022
Cited by 1006* · 2022
Sparse networks from scratch: Faster training without losing performance
T Dettmers, L Zettlemoyer
arXiv preprint arXiv:1907.04840, 2019
Cited by 392 · 2019
8-bit Optimizers via Block-wise Quantization
T Dettmers, M Lewis, S Shleifer, L Zettlemoyer
ICLR 2022 (Spotlight), 2022
Cited by 254 · 2022
Base layers: Simplifying training of large, sparse models
M Lewis, S Bhosale, T Dettmers, N Goyal, L Zettlemoyer
ICML 2021, 2021
Cited by 253 · 2021
8-bit Approximations for Parallelism in Deep Learning
T Dettmers
ICLR 2016, 2016
Cited by 232 · 2016
SpQR: A sparse-quantized representation for near-lossless LLM weight compression
T Dettmers, R Svirschevski, V Egiazarian, D Kuznedelev, E Frantar, ...
arXiv preprint arXiv:2306.03078, 2023
Cited by 214 · 2023
The case for 4-bit precision: k-bit inference scaling laws
T Dettmers, L Zettlemoyer
ICML 2023, 2023
Cited by 198 · 2023
Branch-train-merge: Embarrassingly parallel training of expert language models
M Li, S Gururangan, T Dettmers, M Lewis, T Althoff, NA Smith, ...
arXiv preprint arXiv:2208.03306, 2022
Cited by 156 · 2022
Petals: Collaborative inference and fine-tuning of large models
A Borzunov, D Baranchuk, T Dettmers, M Ryabinin, Y Belkada, ...
ACL 2022, Demonstration, 2022
Cited by 102* · 2022
QLoRA: Efficient finetuning of quantized LLMs. arXiv 2023
T Dettmers, A Pagnoni, A Holtzman, L Zettlemoyer
arXiv preprint arXiv:2305.14314, 2023
Cited by 55 · 2023
Stable and low-precision training for large-scale vision-language models
M Wortsman, T Dettmers, L Zettlemoyer, A Morcos, A Farhadi, L Schmidt
NeurIPS 2023, 2023
Cited by 42 · 2023
Swarm parallelism: Training large models can be surprisingly communication-efficient
M Ryabinin, T Dettmers, M Diskin, A Borzunov
NeurIPS 2023, 2023
Cited by 27 · 2023
MatFormer: Nested transformer for elastic inference
F Devvrit, S Kudugunta, A Kusupati, T Dettmers, K Chen, IS Dhillon, ...
NeurIPS 2024, 2024
Cited by 12 · 2024
Jack the Reader - A machine reading framework
D Weissenborn, P Minervini, T Dettmers, I Augenstein, J Welbl, ...
arXiv preprint arXiv:1806.08727, 2018
Cited by 12 · 2018
Training transformers together
A Borzunov, M Ryabinin, T Dettmers, Q Lhoest, L Saulnier, M Diskin, ...
NeurIPS 2021 Demonstration, 2022
Cited by 10 · 2022
MatFormer: Nested transformer for elastic inference
S Kudugunta, A Kusupati, T Dettmers, K Chen, I Dhillon, Y Tsvetkov, ...
arXiv preprint arXiv:2310.07707, 2023
Cited by 9 · 2023
High performance natural language processing
G Ilharco, C Ilharco, I Turc, T Dettmers, F Ferreira, K Lee
EMNLP 2020, Tutorial, 2020
Cited by 7 · 2020
Articles 1–20