Cyril Zhang
Microsoft Research NYC
Verified email at microsoft.com
Title
Cited by
Year
Phi-3 technical report: A highly capable language model locally on your phone
M Abdin, J Aneja, H Awadalla, A Awadallah, AA Awan, N Bach, A Bahree, ...
arXiv preprint arXiv:2404.14219, 2024
Cited by: 770; Year: 2024
Transformers learn shortcuts to automata
B Liu, JT Ash, S Goel, A Krishnamurthy, C Zhang
International Conference on Learning Representations, 2023
Cited by: 181; Year: 2023
Hidden progress in deep learning: SGD learns parities near the computational limit
B Barak, BL Edelman, S Goel, S Kakade, E Malach, C Zhang
Advances in Neural Information Processing Systems, 2022
Cited by: 140; Year: 2022
Inductive biases and variable creation in self-attention mechanisms
BL Edelman, S Goel, S Kakade, C Zhang
International Conference on Machine Learning, 5793-5831, 2022
Cited by: 139; Year: 2022
Understanding contrastive learning requires incorporating inductive biases
N Saunshi, J Ash, S Goel, D Misra, C Zhang, S Arora, S Kakade, ...
International Conference on Machine Learning, 19250-19286, 2022
Cited by: 128; Year: 2022
Learning Linear Dynamical Systems via Spectral Filtering
E Hazan, K Singh, C Zhang
Advances in Neural Information Processing Systems, 6705-6715, 2017
Cited by: 118; Year: 2017
Spectral filtering for general linear dynamical systems
E Hazan, H Lee, K Singh, C Zhang, Y Zhang
Advances in Neural Information Processing Systems 31, 2018
Cited by: 113; Year: 2018
Efficient Regret Minimization in Non-Convex Games
E Hazan, K Singh, C Zhang
International Conference on Machine Learning 70, 1433-1441, 2017
Cited by: 111; Year: 2017
Efficient full-matrix adaptive regularization
N Agarwal, B Bullins, X Chen, E Hazan, K Singh, C Zhang, Y Zhang
International Conference on Machine Learning, 102-110, 2019
Cited by: 91*; Year: 2019
Exposing Attention Glitches with Flip-Flop Language Modeling
B Liu, JT Ash, S Goel, A Krishnamurthy, C Zhang
Advances in Neural Information Processing Systems 36, 2023
Cited by: 48; Year: 2023
Disentangling adaptive gradient methods from learning rates
N Agarwal, R Anil, E Hazan, T Koren, C Zhang
arXiv preprint arXiv:2002.11803, 2020
Cited by: 45; Year: 2020
Calibration, entropy rates, and memory in language models
M Braverman, X Chen, SM Kakade, K Narasimhan, C Zhang, Y Zhang
International Conference on Machine Learning, 2020
Cited by: 43; Year: 2020
A combined spectroscopic and photometric stellar activity study of Epsilon Eridani
MJ Giguere, DA Fischer, CXY Zhang, JM Matthews, C Cameron, ...
The Astrophysical Journal 824 (2), 150, 2016
Cited by: 38; Year: 2016
Acceleration via fractal learning rate schedules
N Agarwal, S Goel, C Zhang
International Conference on Machine Learning, 87-99, 2021
Cited by: 28; Year: 2021
Not-So-Random Features
B Bullins, C Zhang, Y Zhang
International Conference on Learning Representations, 2018
Cited by: 28; Year: 2018
Towards provable control for unknown linear dynamical systems
S Arora, E Hazan, H Lee, K Singh, C Zhang, Y Zhang
International Conference on Learning Representations, Workshop, 2018
Cited by: 26; Year: 2018
No-regret prediction in marginally stable systems
U Ghai, H Lee, K Singh, C Zhang, Y Zhang
Conference on Learning Theory, 2020
Cited by: 22; Year: 2020
Can large language models explore in-context?
A Krishnamurthy, K Harris, DJ Foster, C Zhang, A Slivkins
arXiv preprint arXiv:2403.15371, 2024
Cited by: 20; Year: 2024
Machine learning for mechanical ventilation control
D Suo, N Agarwal, W Xia, X Chen, U Ghai, A Yu, P Gradu, K Singh, ...
arXiv preprint arXiv:2102.06779, 2021
Cited by: 19*; Year: 2021
Phi-4 technical report
M Abdin, J Aneja, H Behl, S Bubeck, R Eldan, S Gunasekar, M Harrison, ...
arXiv preprint arXiv:2412.08905, 2024
Cited by: 18; Year: 2024