SpAtten: Efficient sparse attention architecture with cascade token and head pruning

H Wang, Z Zhang, S Han - 2021 IEEE International Symposium …, 2021 - ieeexplore.ieee.org
The attention mechanism is becoming increasingly popular in Natural Language Processing
(NLP) applications, showing performance superior to convolutional and recurrent …

DUAL: Acceleration of clustering algorithms using digital-based processing in-memory

M Imani, S Pampana, S Gupta, M Zhou… - 2020 53rd Annual …, 2020 - ieeexplore.ieee.org
Today's applications generate a large amount of data that need to be processed by learning
algorithms. In practice, the majority of the data are not associated with any labels …

Modeling and simulating in-memory memristive deep learning systems: An overview of current efforts

C Lammie, W Xiang, MR Azghadi - Array, 2022 - Elsevier
Deep Learning (DL) systems have demonstrated unparalleled performance in many
challenging engineering applications. As the complexity of these systems inevitably …

Accelerating applications using edge tensor processing units

KC Hsu, HW Tseng - Proceedings of the International Conference for …, 2021 - dl.acm.org
Neural network (NN) accelerators have been integrated into a wide spectrum of computer
systems to accommodate the rapidly growing demands for artificial intelligence (AI) and …

HyDREA: Utilizing Hyperdimensional Computing for a More Robust and Efficient Machine Learning System

J Morris, K Ergun, B Khaleghi, M Imani… - ACM Transactions on …, 2022 - dl.acm.org
Today's systems rely on sending all the data to the cloud and then using complex
algorithms, such as Deep Neural Networks, which require billions of parameters and many …

Toolflow for the algorithm-hardware co-design of memristive ANN accelerators

M Wabnitz, T Gemmeke - Memories-Materials, Devices, Circuits and …, 2023 - Elsevier
The capabilities of artificial neural networks are rapidly evolving, and so are the expectations for
them to solve ever more challenging tasks in numerous everyday situations. Larger, more …

A survey of near-data processing architectures for neural networks

M Hassanpour, M Riera, A González - Machine Learning and Knowledge …, 2022 - mdpi.com
Data-intensive workloads and applications, such as machine learning (ML), are
fundamentally limited by traditional computing systems based on the von Neumann …

FloatAP: Supporting High-Performance Floating-Point Arithmetic in Associative Processors

K Yang, JF Martínez - 2024 57th IEEE/ACM International …, 2024 - ieeexplore.ieee.org
Associative Processors (AP) enable in-situ, data-parallel computation in content-
addressable memories (CAM). In particular, arithmetic operations are accomplished via bit …
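
The truncated sentence points at the core mechanism: an associative processor performs arithmetic bit-serially, handling one bit column at a time with search-and-write passes that update every word in the CAM in parallel. The sketch below illustrates only that generic bit-serial scheme, not the FloatAP design itself; all names and the LSB-first bit layout are invented for the example.

```python
import numpy as np

def ap_add(a_bits: np.ndarray, b_bits: np.ndarray) -> np.ndarray:
    """AP-style bit-serial addition: one bit column per step, with the
    sum/carry update applied to every row (word) in parallel, as a CAM
    would do via truth-table search/write passes."""
    rows, nbits = a_bits.shape
    carry = np.zeros(rows, dtype=np.uint8)
    out = np.zeros_like(a_bits)
    for i in range(nbits):  # LSB first
        a, b = a_bits[:, i], b_bits[:, i]
        # A real AP would CAM-search each (a, b, carry) input pattern and
        # rewrite the matching rows; here the truth table is applied directly.
        out[:, i] = a ^ b ^ carry
        carry = (a & b) | (carry & (a ^ b))
    return out

def to_bits(x: np.ndarray, nbits: int) -> np.ndarray:
    """Integers -> LSB-first bit columns."""
    return ((x[:, None] >> np.arange(nbits)) & 1).astype(np.uint8)

def from_bits(bits: np.ndarray) -> np.ndarray:
    return (bits.astype(np.int64) << np.arange(bits.shape[1])).sum(axis=1)

x, y = np.array([3, 10, 250]), np.array([5, 7, 4])
print(from_bits(ap_add(to_bits(x, 9), to_bits(y, 9))))  # [  8  17 254]
```

Note the latency pattern this implies: cost grows with operand width rather than with the number of words, which is why wide floating-point formats (FloatAP's target) are the hard case for APs.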

HyDREA: Towards more robust and efficient machine learning systems with hyperdimensional computing

J Morris, K Ergun, B Khaleghi, M Imani… - … , Automation & Test …, 2021 - ieeexplore.ieee.org
Today's systems, especially in the age of federated learning, rely on sending all the data to
the cloud and then using complex algorithms, such as Deep Neural Networks, which require …

ARAS: An Adaptive Low-Cost ReRAM-Based Accelerator for DNNs

M Sabri, M Riera, A González - arXiv preprint arXiv:2410.17931, 2024 - arxiv.org
Processing Using Memory (PUM) accelerators have the potential to perform Deep Neural
Network (DNN) inference by using arrays of memory cells as computation engines. Among …
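
As background on the PUM idea the abstract invokes: a ReRAM array stores a weight matrix as cell conductances, the input vector is applied as row voltages, and each column's summed current realizes one dot product through Ohm's and Kirchhoff's laws. A minimal idealized sketch follows (invented names, no device non-idealities, and a generic positive/negative conductance pairing that is not specific to ARAS):

```python
import numpy as np

def crossbar_mvm(G: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Idealized crossbar read: I_j = sum_i G[i, j] * v[i].
    Each cell sources current G*V (Ohm's law) and bitline currents
    sum for free (Kirchhoff's current law), so the whole
    matrix-vector product takes one analog step."""
    return v @ G

# Signed weights split across two non-negative conductance arrays.
W = np.array([[0.5, -1.0],
              [2.0, 0.25]])
G_pos, G_neg = np.clip(W, 0, None), np.clip(-W, 0, None)
v = np.array([1.0, -0.5])
print(crossbar_mvm(G_pos, v) - crossbar_mvm(G_neg, v))  # equals v @ W
```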