Matteo Pagliardini
Title
Cited by
Year
Unsupervised learning of sentence embeddings using compositional n-gram features
M Pagliardini, P Gupta, M Jaggi
NAACL-HLT, 2018, 2017
Cited by 956 · 2017
Meditron-70b: Scaling medical pretraining for large language models
Z Chen, AH Cano, A Romanou, A Bonnet, K Matoba, F Salvi, ...
arXiv preprint arXiv:2311.16079, 2023
Cited by 262 · 2023
Agree to disagree: Diversity through disagreement for better transferability
M Pagliardini, M Jaggi, F Fleuret, SP Karimireddy
ICLR 2023, 2022
Cited by 78 · 2022
Better word embeddings by disentangling contextual n-gram information
P Gupta, M Pagliardini, M Jaggi
NAACL-HLT, 2019, 2019
Cited by 52 · 2019
Taming gans with lookahead
T Chavdarova, M Pagliardini, SU Stich, M Jaggi, F Fleuret
ICLR 2021, 2020
Cited by 40* · 2020
Fast attention over long sequences with dynamic sparse flash attention
M Pagliardini, D Paliotta, M Jaggi, F Fleuret
Advances in Neural Information Processing Systems 36, 59808-59831, 2023
Cited by 29* · 2023
Doge: Domain reweighting with generalization estimation
S Fan, M Pagliardini, M Jaggi
arXiv preprint arXiv:2310.15393, 2023
Cited by 24 · 2023
The peril of popular deep learning uncertainty estimation methods
Y Liu, M Pagliardini, T Chavdarova, SU Stich
Bayesian Deep Learning workshop, at NeurIPS 2021, 2021
Cited by 21 · 2021
Unsupervised learning of sentence embeddings using compositional n-gram features (2017)
M Pagliardini, P Gupta, M Jaggi
arXiv preprint arXiv:1703.02507, 2017
Cited by 13 · 2017
The ademamix optimizer: Better, faster, older
M Pagliardini, P Ablin, D Grangier
arXiv preprint arXiv:2409.03137, 2024
Cited by 7 · 2024
Denseformer: Enhancing information flow in transformers via depth weighted averaging
M Pagliardini, A Mohtashami, F Fleuret, M Jaggi
Advances in Neural Information Processing Systems 37, 136479-136508, 2025
Cited by 6 · 2025
A primal-dual approach to solving variational inequalities with general constraints
T Chavdarova, T Yang, M Pagliardini, MI Jordan
arXiv preprint arXiv:2210.15659, 2022
Cited by 5* · 2022
Meditron: Open medical foundation models adapted for clinical practice
A Bosselut, Z Chen, A Romanou, A Bonnet, A Hernández-Cano, ...
Cited by 4 · 2024
Improving generalization via uncertainty driven perturbations
M Pagliardini, G Manunza, M Jaggi, MI Jordan, T Chavdarova
arXiv preprint arXiv:2202.05737, 2022
Cited by 3 · 2022
Cotformer: More tokens with attention make up for less depth
A Mohtashami, M Pagliardini, M Jaggi
Workshop on Advancing Neural Network Training: Computational Efficiency …, 2023
Cited by 2 · 2023
Diversity through disagreement for better transferability
M Pagliardini, M Jaggi, F Fleuret, SP Karimireddy
NeurIPS 2022 Workshop on Distribution Shifts: Connecting Methods and …, 2022
Cited by 2 · 2022
Fast causal attention with dynamic sparsity
D Paliotta, M Pagliardini, M Jaggi, F Fleuret
Workshop on Efficient Systems for Foundation Models@ ICML2023, 2023
Cited by 1 · 2023
Improved generalization-robustness trade-off via uncertainty targeted attacks
M Pagliardini, G Manunza, M Jaggi, T Chavdarova
Cited by 1 · 2022
Leveraging the true depth of LLMs
RC González, D Paliotta, M Pagliardini, M Jaggi, F Fleuret
arXiv preprint arXiv:2502.02790, 2025
2025
Leveraging the true depth of LLMs
R Calvo González, D Paliotta, M Pagliardini, M Jaggi, F Fleuret
arXiv e-prints, arXiv: 2502.02790, 2025
2025