A light CNN for deep face representation with noisy labels
The volume of convolutional neural network (CNN) models proposed for face recognition
has been continuously growing larger to better fit the large amount of training data. When …
Transmuter: Bridging the efficiency gap using memory and dataflow reconfiguration
With the end of Dennard scaling and Moore's law, it is becoming increasingly difficult to build
hardware for emerging applications that meet power and performance targets, while …
EPOC: A 28-nm 5.3 pJ/SOP Event-driven Parallel Neuromorphic Hardware with Neuromodulation-based Online Learning
Bio-inspired neuromorphic hardware with learning ability is highly promising to achieve
human-like intelligence, particularly in terms of high energy efficiency and strong …
ASIP Performance Enhancement by Hazard Control through Scoreboard
X Zhou, Y Man, P Hao, W Chen, B Yang, B Ding, D Liu - Micromachines, 2024 - mdpi.com
The application-specific instruction set processor (ASIP) has been gradually accepted in AI,
communication, media, game and industry control. The digital signal processor (DSP) is a …
An approach to build cycle accurate full system VLIW simulation platform
L Yang, L Wang, X Zhang, DL Wang - Simulation Modelling Practice and …, 2016 - Elsevier
Very long instruction word (VLIW) architecture is widely used in the design of digital signal
processors (DSPs) and application-specific processors because of its hardware simplicity …
Multilayer Dataflow based Butterfly Sparsity Orchestration to Accelerate Attention Workloads
Recent neural networks (NNs) with self-attention exhibit competitiveness across different AI
domains, but the essential attention mechanism brings massive computation and memory …
Obstacle-aware symmetrical clock tree construction
M Liu, Z Zhang, W Sun, D Wang - 2017 IEEE 60th International …, 2017 - ieeexplore.ieee.org
High performance chip design is always a hot topic in integrated circuit (IC) field. Clock
design plays a critical role in improving chip performance and affecting power consumption …
Parallel polar encoding in 5G communication
Y Guo, S **e, Z Liu, L Yang… - 2018 IEEE Symposium on …, 2018 - ieeexplore.ieee.org
Because of its theoretical capacity-achieving property, polar code has become the coding
scheme of the control channel in the 5G communication standard. Although its encoding …
A novel hardware support for heterogeneous multi-core memory system
T Hussain - Journal of Parallel and Distributed Computing, 2017 - Elsevier
Memory technology is one of the cornerstones of heterogeneous multi-core system
efficiency. Many memory techniques are developed to give good performance within the …
Conflict-Free Parallel Data Access Technology for Matrix Calculation in Memory System of ASIP of 5G/6G Macro Base Stations
W Chen, D Liu - IEEE Transactions on Computer-Aided Design …, 2023 - ieeexplore.ieee.org
Among the physical layer baseband algorithms in macro base stations, the matrix
processing has the dominant computing cost, large data access overhead, and complicated …