A light CNN for deep face representation with noisy labels

X Wu, R He, Z Sun, T Tan - IEEE transactions on information …, 2018 - ieeexplore.ieee.org
The volume of convolutional neural network (CNN) models proposed for face recognition
has been continuously growing larger to better fit the large amount of training data. When …

Transmuter: Bridging the efficiency gap using memory and dataflow reconfiguration

S Pal, S Feng, D Park, S Kim, A Amarnath… - Proceedings of the …, 2020 - dl.acm.org
With the end of Dennard scaling and Moore's law, it is becoming increasingly difficult to build
hardware for emerging applications that meet power and performance targets, while …

EPOC: A 28-nm 5.3 pJ/SOP Event-driven Parallel Neuromorphic Hardware with Neuromodulation-based Online Learning

F Chen, Q Tian, L Xie, Y Zhou, Z Wu… - … Circuits and Systems, 2024 - ieeexplore.ieee.org
Bio-inspired neuromorphic hardware with learning ability is highly promising to achieve
human-like intelligence, particularly in terms of high energy efficiency and strong …

ASIP Performance Enhancement by Hazard Control through Scoreboard

X Zhou, Y Man, P Hao, W Chen, B Yang, B Ding, D Liu - Micromachines, 2024 - mdpi.com
The application-specific instruction set processor (ASIP) has been gradually accepted in AI,
communication, media, game and industry control. The digital signal processor (DSP) is a …

An approach to build cycle accurate full system VLIW simulation platform

L Yang, L Wang, X Zhang, DL Wang - Simulation Modelling Practice and …, 2016 - Elsevier
Very long instruction word (VLIW) architecture is widely used in the design of digital signal
processors (DSPs) and application-specific processors because of its hardware simplicity …

Multilayer Dataflow based Butterfly Sparsity Orchestration to Accelerate Attention Workloads

H Wu, W Li, K Yan, Z Fan, T Liu, Y Liu, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent neural networks (NNs) with self-attention exhibit competitiveness across different AI
domains, but the essential attention mechanism brings massive computation and memory …

Obstacle-aware symmetrical clock tree construction

M Liu, Z Zhang, W Sun, D Wang - 2017 IEEE 60th International …, 2017 - ieeexplore.ieee.org
High performance chip design is always a hot topic in integrated circuit (IC) field. Clock
design plays a critical role in improving chip performance and affecting power consumption …

Parallel polar encoding in 5G communication

Y Guo, S Xie, Z Liu, L Yang… - 2018 IEEE Symposium on …, 2018 - ieeexplore.ieee.org
Because of its theoretical capacity-achieving property, polar code has become the coding
scheme of the control channel in the 5G communication standard. Although its encoding …

A novel hardware support for heterogeneous multi-core memory system

T Hussain - Journal of Parallel and Distributed Computing, 2017 - Elsevier
Memory technology is one of the cornerstones of heterogeneous multi-core system
efficiency. Many memory techniques are developed to give good performance within the …

Conflict-Free Parallel Data Access Technology for Matrix Calculation in Memory System of ASIP of 5G/6G Macro Base Stations

W Chen, D Liu - IEEE Transactions on Computer-Aided Design …, 2023 - ieeexplore.ieee.org
Among the physical layer baseband algorithms in macro base stations, the matrix
processing has the dominant computing cost, large data access overhead, and complicated …