Yi Su
Google DeepMind
Verified email at google.com
Title
Cited by
Year
Gemini: a family of highly capable multimodal models
G Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by 2480 (2023)
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
G Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ...
arXiv preprint arXiv:2403.05530, 2024
Cited by 971 (2024)
Doubly robust off-policy evaluation with shrinkage
Y Su, M Dimakopoulou, A Krishnamurthy, M Dudík
International Conference on Machine Learning, 2020, 2019
Cited by 109 (2019)
Offline RL for natural language generation with implicit language Q learning
C Snell, I Kostrikov, Y Su, M Yang, S Levine
arXiv preprint arXiv:2206.11871, 2022
Cited by 88 (2022)
CAB: Continuous adaptive blending for policy evaluation and learning
Y Su, L Wang, M Santacatterina, T Joachims
International Conference on Machine Learning, 6005-6014, 2019
Cited by 84 (2019)
Off-policy bandits with deficient support
N Sachdeva, Y Su, T Joachims
Proceedings of the 26th ACM SIGKDD International Conference on Knowledge …, 2020
Cited by 79 (2020)
Online adaptation to label distribution shift
R Wu, C Guo, Y Su, KQ Weinberger
Advances in Neural Information Processing Systems 34, 11340-11351, 2021
Cited by 58 (2021)
Adaptive Estimator Selection for Off-Policy Evaluation
Y Su, P Srinath, A Krishnamurthy
International Conference on Machine Learning, 2020
Cited by 45 (2020)
Training language models to self-correct via reinforcement learning
A Kumar, V Zhuang, R Agarwal, Y Su, JD Co-Reyes, A Singh, K Baumli, ...
arXiv preprint arXiv:2409.12917, 2024
Cited by 43 (2024)
Optimizing Rankings for Recommendation in Matching Markets
Y Su, M Bayoumi, T Joachims
Proceedings of the ACM Web Conference 2022, 328-338, 2022
Cited by 31 (2022)
Context-Aware Language Modeling for Goal-Oriented Dialogue Systems
C Snell, S Yang, J Fu, Y Su, S Levine
NAACL, 2022
Cited by 26 (2022)
Recommendations as treatments
T Joachims, B London, Y Su, A Swaminathan, L Wang
AI Magazine 42 (3), 19-30, 2021
Cited by 21 (2021)
Data-driven offline decision-making via invariant representation learning
H Qi, Y Su, A Kumar, S Levine
Advances in Neural Information Processing Systems 35, 13226-13237, 2022
Cited by 17 (2022)
Data-driven model-based optimization via invariant representation learning
H Qi, Y Su, A Kumar, S Levine
Proc. Adv. Neur. Inf. Proc. Syst (NeurIPS), 2022
Cited by 6 (2022)
Training language models to self-correct via reinforcement learning, 2024
A Kumar, V Zhuang, R Agarwal, Y Su, JD Co-Reyes, A Singh, K Baumli, ...
URL https://arxiv.org/abs/2409.12917
Cited by 6
Unified off-policy learning to rank: a reinforcement learning perspective
Z Zhang, Y Su, H Yuan, Y Wu, R Balasubramanian, Q Wu, H Wang, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by 4 (2024)
Long-Term Value of Exploration: Measurements, Findings and Algorithms
Y Su, X Wang, EY Le, L Liu, Y Li, H Lu, B Lipshitz, S Badam, L Heldt, S Bi, ...
Proceedings of the 17th ACM International Conference on Web Search and Data …, 2024
Cited by 3 (2024)
Learning from logged bandit feedback of multiple loggers
Y Su, A Agarwal, T Joachims
ICML Workshop on Machine Learning for Causal Inference, Counterfactual …, 2018
Cited by 3 (2018)
Value of exploration: Measurements, findings and algorithms
Y Su, X Wang, EY Le, L Liu, Y Li, H Lu, B Lipshitz, S Badam, L Heldt, S Bi, ...
CoRR, 2023
Cited by 2 (2023)
International Conference on Machine Learning
Y Su, L Wang, M Santacatterina, T Joachims
Cited by 2 (2019)