REPLICA T7-16-128—A 2048-threaded 16-core 7-FU chained VLIW chip multiprocessor

M Forsell, J Roivainen - 2014 48th Asilomar Conference on …, 2014 - ieeexplore.ieee.org
Processor-based solutions are getting increasingly popular over dedicated
logic/accelerators among embedded system designers due to their flexibility and …

A quantitative comparison of PRAM based emulated shared memory architectures to current multicore CPUs and GPUs

E Hansson, E Alnervik, C Kessler… - ARCS 2014; 2014 …, 2014 - ieeexplore.ieee.org
The performance of current multicore CPUs and GPUs is limited in computations making
frequent use of communication/synchronization between the subtasks executed in parallel …

NUMA computing with hardware and software co-support on configurable emulated shared memory architectures

M Forsell, E Hansson, C Kessler, JM Mäkelä… - International Journal of …, 2014 - jstage.jst.go.jp
The emulated shared memory (ESM) architectures are good candidates for future general
purpose parallel computers due to their ability to provide an easy-to-use explicitly parallel …

Towards a parallel debugging framework for the massively multi-threaded, step-synchronous REPLICA architecture

JM Mäkelä, V Leppänen, M Forsell - Proceedings of the 14th …, 2013 - dl.acm.org
Modern chip-multiprocessors pack an increasing amount of computational cores with each
generation. Along with new computational power comes a problem of managing a large …

A source-to-source compiler for the PRAM language Fork to the REPLICA many-core architecture

C Zhou - 2012 - diva-portal.org
This thesis describes the implementation of a source to source compiler that translates Fork
language to REPLICA baseline language. The Fork language is a high-level programming …

A quantitative comparison of emulated shared memory architectures to current multicore CPUs and GPUs

E Hansson, E Alnervik, C Kessler, M Forsell - 2013 - hgpu.org
The performance of current multicore CPUs and GPUs is limited in computations making
frequent use of communication/synchronization between the subtasks executed in parallel …

[PDF] Composable hierarchical synchronization support for REPLICA

JM Mäkelä, V Leppänen, M Forsell - 13th Symposium on …, 2013 - researchgate.net
Synchronization is a key concept in parallel programming. General purpose languages and
architectures often assume a restricted form of synchronicity with the focus on asynchronous …

[BOOK] Code Generation and Global Optimization Techniques for a Reconfigurable PRAM-NUMA Multicore Architecture

E Hansson - 2014 - search.proquest.com
In this thesis we describe techniques for code generation and global optimization for a
PRAM-NUMA multicore architecture. We specifically focus on the REPLICA architecture …

Evaluation of the Configurable Architecture REPLICA with Emulated Shared Memory

E Alnervik - 2014 - diva-portal.org
The purpose of this thesis is to, by benchmarking different types of computation problems on
REPLICA, similar parallel architectures (SB-PRAM and XMT) and more diverse ones (Xeon …

[CITATION] Institutionen för datavetenskap

E Alnervik