Thomas Wolf
Co-founder at HuggingFace
Verified email at polytechnique.edu
Title
Cited by
Year
Transformers: State-of-the-art natural language processing
T Wolf, L Debut, V Sanh, J Chaumond, C Delangue, A Moi, P Cistac, ...
Proceedings of the 2020 conference on empirical methods in natural language …, 2020
Cited by 17097*, 2020
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
V Sanh, L Debut, J Chaumond, T Wolf
arXiv preprint arXiv:1910.01108, 2019
Cited by 8865, 2019
Multitask prompted training enables zero-shot task generalization
V Sanh, A Webson, C Raffel, SH Bach, L Sutawika, Z Alyafeai, A Chaffin, ...
arXiv preprint arXiv:2110.08207, 2021
Cited by 1805, 2021
Bloom: A 176b-parameter open-access multilingual language model
T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
Cited by 1782, 2023
Starcoder: may the source be with you!
R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ...
arXiv preprint arXiv:2305.06161, 2023
Cited by 1009*, 2023
Transfer learning in natural language processing
S Ruder, ME Peters, S Swayamdipta, T Wolf
Proceedings of the 2019 conference of the North American chapter of the …, 2019
Cited by 820, 2019
Datasets: A community library for natural language processing
Q Lhoest, AV Del Moral, Y Jernite, A Thakur, P Von Platen, S Patil, ...
arXiv preprint arXiv:2109.02846, 2021
Cited by 605*, 2021
Transfertransfo: A transfer learning approach for neural network based conversational agents
T Wolf, V Sanh, J Chaumond, C Delangue
arXiv preprint arXiv:1901.08149, 2019
Cited by 547, 2019
Zephyr: Direct distillation of lm alignment
L Tunstall, E Beeching, N Lambert, N Rajani, K Rasul, Y Belkada, ...
arXiv preprint arXiv:2310.16944, 2023
Cited by 523, 2023
Movement pruning: Adaptive sparsity by fine-tuning
V Sanh, T Wolf, A Rush
Advances in neural information processing systems 33, 20378-20389, 2020
Cited by 499, 2020
Diffusers: State-of-the-art diffusion models
P Von Platen, S Patil, A Lozhkov, P Cuenca, N Lambert, K Rasul, ...
Cited by 461, 2022
Natural language processing with transformers
L Tunstall, L Von Werra, T Wolf
O'Reilly Media, Inc., 2022
Cited by 458, 2022
Two-dimensional superconductivity at a Mott insulator/band insulator interface LaTiO3/SrTiO3
J Biscaras, N Bergeal, A Kushwaha, T Wolf, A Rastogi, RC Budhani, ...
Nature communications 1 (1), 89, 2010
Cited by 355, 2010
Open llm leaderboard
E Beeching, C Fourrier, N Habib, S Han, N Lambert, N Rajani, ...
Cited by 318, 2023
The stack: 3 tb of permissively licensed source code
D Kocetkov, R Li, LB Allal, J Li, C Mou, CM Ferrandis, Y Jernite, M Mitchell, ...
arXiv preprint arXiv:2211.15533, 2022
Cited by 290, 2022
A hierarchical multi-task approach for learning embeddings from semantic tasks
V Sanh, T Wolf, S Ruder
Proceedings of the AAAI conference on artificial intelligence 33 (01), 6949-6956, 2019
Cited by 282, 2019
Huggingface’s transformers: State-of-the-art natural language processing. arXiv 2019
T Wolf, L Debut, V Sanh, J Chaumond, C Delangue, A Moi, P Cistac, ...
arXiv preprint arXiv:1910.03771 10, 2020
Cited by 247, 2020
Scaling data-constrained language models
N Muennighoff, A Rush, B Barak, T Le Scao, N Tazi, A Piktus, S Pyysalo, ...
Advances in Neural Information Processing Systems 36, 50358-50376, 2023
Cited by 243, 2023
Starcoder 2 and the stack v2: The next generation
A Lozhkov, R Li, LB Allal, F Cassano, J Lamy-Poirier, N Tazi, A Tang, ...
arXiv preprint arXiv:2402.19173, 2024
Cited by 213, 2024
Grounding large language models in interactive environments with online reinforcement learning
T Carta, C Romac, T Wolf, S Lamprier, O Sigaud, PY Oudeyer
International Conference on Machine Learning, 3676-3713, 2023
Cited by 172, 2023