Yihao Feng
Apple AIML
Verified email at apple.com
Title
Cited by
Year
Action-dependent Control Variates for Policy Optimization via Stein's Identity
H Liu, Y Feng, Y Mao, D Zhou, J Peng, Q Liu
arXiv preprint arXiv:1710.11198, 2017
Cited by: 105; Year: 2017
Unicontrol: A unified diffusion model for controllable visual generation in the wild
C Qin, S Zhang, N Yu, Y Feng, X Yang, Y Zhou, H Wang, JC Niebles, ...
arXiv preprint arXiv:2305.11147, 2023
Cited by: 102; Year: 2023
Hive: Harnessing human feedback for instructional visual editing
S Zhang, X Yang, Y Feng, C Qin, CC Chen, N Yu, Z Chen, H Wang, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 94; Year: 2024
Dynamic pricing and information disclosure for fresh produce: An artificial intelligence approach
C Yang, Y Feng, A Whinston
Production and Operations Management 31 (1), 155-171, 2022
Cited by: 90; Year: 2022
Doubly robust bias reduction in infinite horizon off-policy estimation
Z Tang, Y Feng, L Li, D Zhou, Q Liu
ICLR 2020, 2020
Cited by: 84; Year: 2020
Learning to draw samples with amortized stein variational gradient descent
Y Feng, D Wang, Q Liu
arXiv preprint arXiv:1707.06626, 2017
Cited by: 83; Year: 2017
Bolaa: Benchmarking and orchestrating llm-augmented autonomous agents
Z Liu, W Yao, J Zhang, L Xue, S Heinecke, R Murthy, Y Feng, Z Chen, ...
arXiv preprint arXiv:2308.05960, 2023
Cited by: 77; Year: 2023
Libero: Benchmarking knowledge transfer for lifelong robot learning
B Liu, Y Zhu, C Gao, Y Feng, Q Liu, Y Zhu, P Stone
Advances in Neural Information Processing Systems 36, 2024
Cited by: 76; Year: 2024
A kernel loss for solving the bellman equation
Y Feng, L Li, Q Liu
Advances in Neural Information Processing Systems 32, 2019
Cited by: 70; Year: 2019
Retroformer: Retrospective large language agents with policy gradient optimization
W Yao, S Heinecke, JC Niebles, Z Liu, Y Feng, L Xue, R Murthy, Z Chen, ...
arXiv preprint arXiv:2308.02151, 2023
Cited by: 59; Year: 2023
Incremental few-shot text classification with multi-round new classes: Formulation, dataset and system
C Xia, W Yin, Y Feng, P Yu
arXiv preprint arXiv:2104.11882, 2021
Cited by: 57; Year: 2021
Accountable off-policy evaluation with kernel bellman statistics
Y Feng, T Ren, Z Tang, Q Liu
International Conference on Machine Learning, 3102-3111, 2020
Cited by: 46; Year: 2020
Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward
R Zhang, L Gui, Z Sun, Y Feng, K Xu, Y Zhang, D Fu, C Li, A Hauptmann, ...
arXiv preprint arXiv:2404.01258, 2024
Cited by: 42*; Year: 2024
Unsupervised out-of-domain detection via pre-trained transformers
K Xu, T Ren, S Zhang, Y Feng, C Xiong
arXiv preprint arXiv:2106.00948, 2021
Cited by: 41; Year: 2021
FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability
C Xia, C Xing, J Du, X Yang, Y Feng, R Xu, W Yin, C Xiong
arXiv preprint arXiv:2402.18667, 2024
Cited by: 37*; Year: 2024
Two methods for wild variational inference
Q Liu, Y Feng
arXiv preprint arXiv:1612.00081, 2016
Cited by: 26; Year: 2016
Apigen: Automated pipeline for generating verifiable and diverse function-calling datasets
Z Liu, T Hoang, J Zhang, M Zhu, T Lan, S Kokane, J Tan, W Yao, Z Liu, ...
arXiv preprint arXiv:2406.18518, 2024
Cited by: 25; Year: 2024
Famo: Fast adaptive multitask optimization
B Liu, Y Feng, P Stone, Q Liu
Advances in Neural Information Processing Systems 36, 2024
Cited by: 23; Year: 2024
Fantastic rewards and how to tame them: A case study on reward learning for task-oriented dialogue systems
Y Feng, S Yang, S Zhang, J Zhang, C Xiong, M Zhou, H Wang
arXiv preprint arXiv:2302.10342, 2023
Cited by: 22; Year: 2023
Agentohana: Design unified data and training pipeline for effective agent learning
J Zhang, T Lan, R Murthy, Z Liu, W Yao, M Zhu, J Tan, T Hoang, Z Liu, ...
arXiv preprint arXiv:2402.15506, 2024
Cited by: 21; Year: 2024