Yibo Jiang
Title
Cited by
Year
Beyond reverse KL: Generalizing direct preference optimization with diverse divergence constraints
C Wang, Y Jiang, C Yang, H Liu, Y Chen
International conference on learning representations (ICLR), Spotlight, 2023
Cited by: 59; Year: 2023
Invariant and transportable representations for anti-causal domain shifts
Y Jiang, V Veitch
Advances in Neural Information Processing Systems (NeurIPS) 35, 20782-20794, 2022
Cited by: 34; Year: 2022
Learning nonparametric latent causal graphs with unknown interventions
Y Jiang, B Aragam
Advances in Neural Information Processing Systems (NeurIPS) 36, 2023
Cited by: 28; Year: 2023
The Geometry of Categorical and Hierarchical Concepts in Large Language Models
K Park, YJ Choe, Y Jiang, V Veitch
International conference on learning representations (ICLR), 2024
Cited by: 16; Year: 2024
On the origins of linear representations in large language models
Y Jiang, G Rajendran, P Ravikumar, B Aragam, V Veitch
International conference on machine learning (ICML), 2024
Cited by: 15; Year: 2024
Associative memory in iterated overparameterized sigmoid autoencoders
Y Jiang, C Pehlevan
International conference on machine learning (ICML), 4828-4838, 2020
Cited by: 15; Year: 2020
Meta-learning to cluster
Y Jiang, N Verma
arXiv preprint arXiv:1910.14134, 2019
Cited by: 13; Year: 2019
Uncovering meanings of embeddings via partial orthogonality
Y Jiang, B Aragam, V Veitch
Advances in Neural Information Processing Systems (NeurIPS) 36, 2023
Cited by: 9; Year: 2023
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
Y Jiang, G Rajendran, P Ravikumar, B Aragam
Advances in Neural Information Processing Systems (NeurIPS) 37, 2024
Cited by: 5; Year: 2024
Direct Acquisition Optimization for Low-Budget Active Learning
Z Zhao, Y Jiang, Y Chen
arXiv preprint arXiv:2402.06045, 2024
Cited by: 4; Year: 2024
Model-agnostic meta-learning using runge-kutta methods
DJ Im, Y Jiang, N Verma
arXiv preprint arXiv:1910.07368, 2019
Cited by: 4; Year: 2019
Quantifying generalization complexity for large language models
Z Qi, H Luo, X Huang, Z Zhao, Y Jiang, X Fan, H Lakkaraju, J Glass
International conference on learning representations (ICLR), 2024
Cited by: 3; Year: 2024
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment
C Wang, Z Zhao, Y Jiang, Z Chen, C Zhu, Y Chen, J Liu, L Zhang, X Fan, ...
arXiv preprint arXiv:2501.09620, 2025
Year: 2025