The curious case of neural text degeneration A Holtzman, J Buys, L Du, M Forbes, Y Choi arXiv preprint arXiv:1904.09751, 2019 | 3231 | 2019 |
A Formal Perspective on Byte-Pair Encoding V Zouhar, C Meister, JL Gastaldi, L Du, T Vieira, M Sachan, R Cotterell arXiv preprint arXiv:2306.16837, 2023 | 38 | 2023 |
A Measure-Theoretic Characterization of Tight Language Models L Du, LT Hennigen, T Pimentel, C Meister, J Eisner, R Cotterell arXiv preprint arXiv:2212.10502, 2022 | 30 | 2022 |
Tokenization and the Noiseless Channel V Zouhar, C Meister, JL Gastaldi, L Du, M Sachan, R Cotterell arXiv preprint arXiv:2306.16842, 2023 | 29 | 2023 |
Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning S Xie, J Qiu, A Pasad, L Du, Q Qu, H Mei arXiv preprint arXiv:2210.10041, 2022 | 20 | 2022 |
On the Representational Capacity of Recurrent Neural Language Models F Nowak, A Svete, L Du, R Cotterell arXiv preprint arXiv:2310.12942, 2023 | 11 | 2023 |
Formal Aspects of Language Modeling R Cotterell, A Svete, C Meister, T Liu, L Du arXiv preprint arXiv:2311.04329, 2023 | 10 | 2023 |
Structured Voronoi sampling A Amini, L Du, R Cotterell Advances in Neural Information Processing Systems 36, 2024 | 6 | 2024 |
Autoregressive Modeling with Lookahead Attention L Du, H Mei, J Eisner arXiv preprint arXiv:2305.12272, 2023 | 5 | 2023 |
When is a Language Process a Language Model? L Du, H Lee, J Eisner, R Cotterell Findings of the Association for Computational Linguistics ACL 2024, 11083-11094, 2024 | 1 | 2024 |
Principled Gradient-based Markov Chain Monte Carlo for Text Generation L Du, A Amini, LT Hennigen, XV Yu, J Eisner, H Lee, R Cotterell ICML 2024, 2023 | | 2023 |
Principled Gradient-Based MCMC for Conditional Sampling of Text L Du, A Amini, LT Hennigen, H Lee, J Eisner, R Cotterell Forty-first International Conference on Machine Learning | | |