Sheng Ma
Title
Cited by
Year
DBAR: an efficient routing algorithm to support multiple concurrent applications in networks-on-chip
S Ma, N Enright Jerger, Z Wang
Proceedings of the 38th annual international symposium on Computer …, 2011
Cited by: 227, Year: 2011
Whole packet forwarding: Efficient design of fully adaptive routing algorithms for networks-on-chip
S Ma, NE Jerger, Z Wang
IEEE International Symposium on High-Performance Comp Architecture, 1-12, 2012
Cited by: 112, Year: 2012
Low-cost binary128 floating-point FMA unit design with SIMD support
L Huang, S Ma, L Shen, Z Wang, N Xiao
IEEE Transactions on Computers 61 (5), 745-751, 2011
Cited by: 54, Year: 2011
Supporting efficient collective communication in NoCs
S Ma, NE Jerger, Z Wang
IEEE International Symposium on High-Performance Comp Architecture, 1-12, 2012
Cited by: 50, Year: 2012
Leaving one slot empty: Flit bubble flow control for torus cache-coherent NoCs
S Ma, Z Wang, Z Liu, NE Jerger
IEEE Transactions on Computers 64 (3), 763-777, 2013
Cited by: 48, Year: 2013
A Survey of Design and Optimization for Systolic Array-based DNN Accelerators
R Xu, S Ma, Y Guo, D Li
ACM Computing Surveys 56 (1), 1-37, 2023
Cited by: 41, Year: 2023
A high performance reliable NoC router
L Wang, S Ma, C Li, W Chen, Z Wang
Integration 58, 583-592, 2017
Cited by: 38, Year: 2017
Novel flow control for fully adaptive routing in cache-coherent NoCs
S Ma, Z Wang, NE Jerger, L Shen, N Xiao
IEEE Transactions on Parallel and Distributed Systems 25 (9), 2397-2407, 2013
Cited by: 38, Year: 2013
Configurable multi-directional systolic array architecture for convolutional neural networks
R Xu, S Ma, Y Wang, X Chen, Y Guo
ACM Transactions on Architecture and Code Optimization (TACO) 18 (4), 1-24, 2021
Cited by: 33, Year: 2021
Heterogeneous systolic array architecture for compact CNNs hardware accelerators
R Xu, S Ma, Y Wang, Y Guo, D Li, Y Qiao
IEEE Transactions on Parallel and Distributed Systems 33 (11), 2860-2871, 2021
Cited by: 28, Year: 2021
Networks-on-chip: from implementations to programming paradigms
S Ma, L Huang, M Lai, W Shi
Morgan Kaufmann, 2014
Cited by: 28, Year: 2014
SIF: Overcoming the limitations of SIMD devices via implicit permutation
L Huang, L Shen, Z Wang, W Shi, N Xiao, S Ma
HPCA-16 2010 The Sixteenth International Symposium on High-Performance …, 2010
Cited by: 28, Year: 2010
A low-cost conflict-free NoC for GPGPUs
X Zhao, S Ma, Y Liu, L Eeckhout, Z Wang
Proceedings of the 53rd Annual Design Automation Conference, 1-6, 2016
Cited by: 23, Year: 2016
A heterogeneous low-cost and low-latency ring-chain network for GPGPUs
X Zhao, S Ma, C Li, L Eeckhout, Z Wang
2016 IEEE 34th International Conference on Computer Design (ICCD), 472-479, 2016
Cited by: 20, Year: 2016
CMSA: Configurable multi-directional systolic array for convolutional neural networks
R Xu, S Ma, Y Wang, Y Guo
2020 IEEE 38th International Conference on Computer Design (ICCD), 494-497, 2020
Cited by: 19, Year: 2020
Priority-based PCIe scheduling for multi-tenant multi-GPU systems
C Li, Y Sun, L Jin, L Xu, Z Cao, P Fan, D Kaeli, S Ma, Y Guo, J Yang
IEEE Computer Architecture Letters 18 (2), 157-160, 2019
Cited by: 17, Year: 2019
RHS-TRNG: A resilient high-speed true random number generator based on STT-MTJ device
S Fu, T Li, C Zhang, H Li, S Ma, J Zhang, R Zhang, L Wu
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2023
Cited by: 16, Year: 2023
HeSA: Heterogeneous systolic array architecture for compact CNNs hardware accelerators
R Xu, S Ma, Y Wang, Y Guo
2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), 657-662, 2021
Cited by: 16, Year: 2021
A comprehensive comparison between virtual cut-through and wormhole routers for cache coherent Network on-Chips
P Wang, S Ma, H Lu, Z Wang
IEICE Electronics Express 11 (14), 20140496-20140496, 2014
Cited by: 14, Year: 2014
Coordinated DMA: improving the DRAM access efficiency for matrix multiplication
S Ma, Z Liu, S Chen, L Huang, Y Guo, Z Wang, M Zhang
IEEE Transactions on Parallel and Distributed Systems 30 (10), 2148-2164, 2019
Cited by: 13, Year: 2019