Yufeng Zhang
Verified email at u.northwestern.edu
Title
Cited by
Year
What and how does in-context learning learn? Bayesian model averaging, parameterization, and generalization
Y Zhang, F Zhang, Z Yang, Z Wang
arXiv preprint arXiv:2305.19420, 2023
Cited by 66, 2023
Generative adversarial imitation learning with neural network parameterization: Global optimality and convergence rate
Y Zhang, Q Cai, Z Yang, Z Wang
International Conference on Machine Learning, 11044-11054, 2020
Cited by 35*, 2020
Learning from demonstration: Provably efficient adversarial policy imitation with linear function approximation
Z Liu, Y Zhang, Z Fu, Z Yang, Z Wang
International Conference on Machine Learning, 14094-14138, 2022
Cited by 26*, 2022
Provably Efficient Actor-Critic for Risk-Sensitive and Robust Adversarial RL: A Linear-Quadratic Case
Y Zhang, Z Yang, Z Wang
International Conference on Artificial Intelligence and Statistics, 2764-2772, 2021
Cited by 20, 2021
Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration
R Wu, Y Zhang, Z Yang, Z Wang
Advances in Neural Information Processing Systems 34, 2021
Cited by 19, 2021
Provably efficient offline reinforcement learning for partially observable Markov decision processes
H Guo, Q Cai, Y Zhang, Z Yang, Z Wang
International Conference on Machine Learning, 8016-8038, 2022
Cited by 18, 2022
Federated offline reinforcement learning
D Zhou, Y Zhang, A Sonabend-W, Z Wang, J Lu, T Cai
Journal of the American Statistical Association 119 (548), 3152-3163, 2024
Cited by 14, 2024
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Y Zhang, Q Cai, Z Yang, Y Chen, Z Wang
Advances in Neural Information Processing Systems 33, 19680-19692, 2020
Cited by 13, 2020
An analysis of attention via the lens of exchangeability and latent variable models
Y Zhang, B Liu, Q Cai, L Wang, Z Wang
arXiv preprint arXiv:2212.14852, 2022
Cited by 10, 2022
Infinite-dimensional optimization for zero-sum games via variational transport
L Liu, Y Zhang, Z Yang, R Babanezhad, Z Wang
International Conference on Machine Learning, 7033-7044, 2021
Cited by 10*, 2021
Can large language models play games? a case study of a self-play approach
H Guo, Z Liu, Y Zhang, Z Wang
arXiv preprint arXiv:2403.05632, 2024
Cited by 8, 2024
Variational transport: A convergent particle-based algorithm for distributional optimization
Z Yang, Y Zhang, Y Chen, Z Wang
arXiv preprint arXiv:2012.11554, 2020
Cited by 7, 2020
Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic
Y Zhang, S Chen, Z Yang, M Jordan, Z Wang
Advances in Neural Information Processing Systems 34, 2021
Cited by 6, 2021
Lobass: Gauging learnability in supervised fine-tuning data
H Zhou, T Liu, Q Ma, J Yuan, P Liu, Y You, H Yang
arXiv preprint arXiv:2310.13008, 2023
Cited by 5, 2023
Fullstack bench: Evaluating LLMs as full stack coder
S Liu, H Zhu, J Liu, S Xin, A Li, R Long, L Chen, J Yang, J Xia, ZY Peng, ...
arXiv preprint arXiv:2412.00535, 2024
Cited by 4, 2024
Seed-CTS: Unleashing the power of tree search for superior performance in competitive coding tasks
H Wang, B Liu, Y Zhang, J Chen
arXiv preprint arXiv:2412.12544, 2024
Cited by 1, 2024
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
S Zhang, Z Liu, B Liu, Y Zhang, Y Yang, Y Liu, L Chen, T Sun, Z Wang
arXiv preprint arXiv:2410.08067, 2024
Cited by 1, 2024
BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data
X Wang, Q Cui, Y Tao, Y Wang, Z Chai, X Han, B Liu, J Yuan, J Su, ...
arXiv preprint arXiv:2410.00773, 2024
2024
A Mean-Field Analysis of Neural Gradient Descent-Ascent: Applications to Functional Conditional Moment Equations
Y Zhu, Y Zhang, Z Wang, Z Yang, X Chen
arXiv preprint arXiv:2404.12312, 2024
2024
Articles 1–19