Deep residual learning for image recognition K He, X Zhang, S Ren, J Sun Proceedings of the IEEE conference on computer vision and pattern …, 2016 | 256251 | 2016 |
Adam: A method for stochastic optimization DP Kingma arXiv preprint arXiv:1412.6980, 2014 | 203318 | 2014 |
Attention is all you need A Vaswani Advances in Neural Information Processing Systems, 2017 | 153082 | 2017 |
Random forests L Breiman Machine Learning 45, 5-32, 2001 | 147794 | 2001 |
ImageNet classification with deep convolutional neural networks A Krizhevsky, I Sutskever, GE Hinton Advances in Neural Information Processing Systems 25, 2012 | 138841 | 2012 |
Very deep convolutional networks for large-scale image recognition K Simonyan, A Zisserman arXiv preprint arXiv:1409.1556, 2014 | 138133 | 2014 |
BERT: Pre-training of deep bidirectional transformers for language understanding J Devlin arXiv preprint arXiv:1810.04805, 2018 | 127380 | 2018 |
Long Short-term Memory S Hochreiter Neural Computation MIT-Press, 1997 | 120842 | 1997 |
Scikit-learn: Machine learning in Python F Pedregosa, G Varoquaux, A Gramfort, V Michel, B Thirion, O Grisel, ... Journal of Machine Learning Research 12, 2825-2830, 2011 | 106259 | 2011 |
Deep learning Y LeCun, Y Bengio, G Hinton Nature 521 (7553), 436-444, 2015 | 91451 | 2015 |
The elements of statistical learning: data mining, inference, and prediction T Hastie Springer, 2009 | 82239 | 2009 |
ImageNet: A large-scale hierarchical image database J Deng, W Dong, R Socher, LJ Li, K Li, L Fei-Fei 2009 IEEE conference on computer vision and pattern recognition, 248-255, 2009 | 80682 | 2009 |
Convex optimization S Boyd Cambridge UP, 2004 | 79358 | 2004 |
Reinforcement learning: An introduction RS Sutton A Bradford Book, 2018 | 78894 | 2018 |
Generative adversarial nets I Goodfellow, J Pouget-Abadie, M Mirza, B Xu, D Warde-Farley, S Ozair, ... Advances in Neural Information Processing Systems 27, 2014 | 77831 | 2014 |
Deep learning I Goodfellow MIT Press, 2016 | 73760 | 2016 |
Gradient-based learning applied to document recognition Y LeCun, L Bottou, Y Bengio, P Haffner Proceedings of the IEEE 86 (11), 2278-2324, 1998 | 73262 | 1998 |
Support-Vector Networks C Cortes Machine Learning, 1995 | 72816 | 1995 |
The nature of statistical learning theory V Vapnik Springer Science & Business Media, 2013 | 70274 | 2013 |
Classification and regression trees L Breiman Routledge, 2017 | 67897 | 2017 |