Zero-shot kernel learning

H Zhang, P Koniusz - … of the IEEE conference on computer …, 2018 - openaccess.thecvf.com
In this paper, we address an open problem of zero-shot learning. Its principle is based on
learning a mapping that associates feature vectors extracted from i.e. images and attribute …

DRAMHiT: A Hash Table Architected for the Speed of DRAM

V Narayanan, D Detweiler, T Huang… - Proceedings of the …, 2023 - dl.acm.org
Despite decades of innovation, existing hash tables fail to achieve peak performance on
modern hardware. Built around a relatively simple computation, i.e., a hash function, which in …

COMET: Communication-optimised multi-threaded error-detection technique

K Mitropoulou, V Porpodas, TM Jones - Proceedings of the International …, 2016 - dl.acm.org
Relentless technology scaling has made transistors more vulnerable to soft, or transient,
errors. To keep systems robust against these, current error detection techniques use …

X-OpenMP—eXtreme fine-grained tasking using lock-less work stealing

P Nookala, K Chard, I Raicu - Future Generation Computer Systems, 2024 - Elsevier
Processors with 100s of threads of execution are among the state-of-the-art in high-end
computing systems. This transition to many-core computing has required the community to …

FFQ: A fast single-producer/multiple-consumer concurrent FIFO queue

S Arnautov, P Felber, C Fetzer… - 2017 IEEE International …, 2017 - ieeexplore.ieee.org
With the spreading of multi-core architectures, operating systems and applications are
becoming increasingly more concurrent and their scalability is often limited by the primitives …

Enabling extremely fine-grained parallelism via scalable concurrent queues on modern many-core architectures

P Nookala, P Dinda, KC Hale… - … on Modeling, Analysis …, 2021 - ieeexplore.ieee.org
Enabling efficient fine-grained task parallelism is a significant challenge for hardware
platforms with increasingly many cores. Existing techniques do not scale to hundreds of …

Cache‐aware design of general‐purpose Single‐Producer–Single‐Consumer queues

V Maffione, G Lettieri, L Rizzo - Software: Practice and …, 2019 - Wiley Online Library
Data processing pipelines normally use lockless Single‐Producer–Single‐Consumer
(SPSC) queues to efficiently decouple their processing threads and achieve high …

EQueue: Elastic lock-free FIFO queue for core-to-core communication on multi-core processors

J Wang, Y Tian, X Fu - IEEE Access, 2020 - ieeexplore.ieee.org
In recent years, the number of CPU cores in a multi-core processor keeps increasing. To
leverage the increasing hardware resource, programmers need to develop parallelized …

A cache-friendly concurrent lock-free queue for efficient inter-core communication

X Meng, X Zeng, X Chen, X Ye - 2017 IEEE 9th International …, 2017 - ieeexplore.ieee.org
Buffer sharing based on pipeline parallelism is quite susceptible to inter-core
communication overhead. Existing work on concurrent lock-free (CLF) queue algorithm did …

Inter-Core Communication Mechanisms for Microkernel Operating System based on Signal Transmission and Shared Memory

C Liu, L Luo, M Li, P Lei, L Chen… - 2021 7th International …, 2021 - ieeexplore.ieee.org
With the coming of the Internet of things (IoT) era and the development of semiconductor
equipment, multicore processors have begun to be widely used in IoT devices to meet their …