Google Scholar

An optimizing pipeline stall reduction algorithm for power and performance on multi-core CPUs

V Saravanan, KD Pralhaddas, DP Kothari… - … -centric Computing and …, 2015 - Springer

The power-performance trade-off is one of the major considerations in micro-architecture
design. Pipelined architecture has brought a radical change in the design to capitalize on …

Cited by 51. All 14 versions.

[PDF] acm.org

Efficient warp execution in presence of divergence with collaborative context collection

F Khorasani, R Gupta, LN Bhuyan - Proceedings of the 48th International …, 2015 - dl.acm.org

GPU's SIMD architecture is a double-edged sword confronting parallel tasks with control
flow divergence. On the one hand, it provides a high performance yet power-efficient …

Cited by 45. All 8 versions.

[PDF] uni-konstanz.de

Generative data models for validation and evaluation of visualization techniques

C Schulz, A Nocaj, M El-Assady, S Frey… - Proceedings of the …, 2016 - dl.acm.org

We argue that there is a need for substantially more research on the use of generative data
models in the validation and evaluation of visualization techniques. For example, user …

Cited by 28. All 11 versions.

GPU subwarp interleaving

S Damani, M Stephenson, R Rangan… - … Symposium on High …, 2022 - ieeexplore.ieee.org

Raytracing applications have naturally high thread divergence, low warp occupancy and are
limited by memory latency. In this paper, we present an architectural enhancement called …

Cited by 9. All 3 versions.

[PDF] sanadamani.com

Speculative reconvergence for improved SIMT efficiency

S Damani, DR Johnson, M Stephenson… - Proceedings of the 18th …, 2020 - dl.acm.org

GPUs perform most efficiently when all threads in a warp execute the same sequence of
instructions convergently. However, when threads in a warp encounter a divergent branch …

Cited by 15. All 3 versions.

[PDF] googleapis.com

Device and method for scheduling multiple thread groups on SIMD lanes upon divergence in a single thread group

SH ** - US Patent 10,831,490, 2020 - Google Patents

Provided are an apparatus and a method for effectively managing threads diverged by a
conditional branch based on Single Instruction Multiple-based Data (SIMD). The apparatus …

Cited by 25. All 4 versions.

[HTML] sciencedirect.com

[HTML] An efficient algorithm for the calculation of sub-grid distances for higher-order LBM boundary conditions in a GPU simulation environment

D Mierke, CF Janßen, T Rung - Computers & Mathematics with Applications, 2020 - Elsevier

This paper presents a new and efficient algorithm for the calculation of sub-grid distances in
the context of a lattice Boltzmann method (LBM). LBMs usually operate on equidistant …

Cited by 15. All 4 versions.

[PDF] ucr.edu

Eliminating intra-warp load imbalance in irregular nested patterns via collaborative task engagement

F Khorasani, B Rowe, R Gupta… - 2016 IEEE International …, 2016 - ieeexplore.ieee.org

Nested patterns are one of the most frequently occurring algorithmic themes in GPU
applications where coarse-grained tasks are constituted from a number of fine-grained ones …

Cited by 19. All 5 versions.

[PDF] googleapis.com

System, method, and computer program product for managing divergences and synchronization points during thread block execution by using a double sided queue …

O Giroux, GF Diamos - US Patent 9,459,876, 2016 - Google Patents

Background: Threads (i.e., an abstract construct of an instance of a program executing on
a processor) have a basic guarantee of forward progress. In other words, if one thread …

Cited by 21. All 4 versions.

[PDF] psu.edu

CUIRRE: An open-source library for load balancing and characterizing irregular applications on GPUs

T Zhang, W Shu, MY Wu - Journal of parallel and distributed computing, 2014 - Elsevier

While Graphics Processing Units (GPUs) show high performance for problems with
regular structures, they do not perform well for irregular tasks due to the mismatches …

Cited by 16. All 4 versions.

SIMT microscheduling: Reducing thread stalling in divergent iterative algorithms

An optimizing pipeline stall reduction algorithm for power and performance on multi-core CPUs
