Abstract
Current GPUs are massively parallel multicore processors optimised for workloads with a large degree of SIMD parallelism. Good performance requires highly idiomatic programs, whose development is work intensive and requires expert knowledge. To raise the level of abstraction, we propose a domain-specific high-level language of array computations that captures appropriate idioms in the form of collective array operations. We embed this purely functional array language in Haskell with an online code generator for NVIDIA's CUDA GPGPU programming environment. We regard the embedded language's collective array operations as algorithmic skeletons; our code generator instantiates CUDA implementations of those skeletons to execute embedded array programs. This paper outlines our embedding in Haskell, details the design and implementation of the dynamic code generator, and reports on initial benchmark results. These results suggest that we can compete with moderately optimised native CUDA code, while enabling much simpler source programs.
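To make the idea of collective array operations as algorithmic skeletons concrete, here is a minimal sketch of a deeply embedded array language in Haskell. It is an illustration under stated assumptions, not the paper's actual API: the type `Acc`, its constructors, and the functions `run`, `skeletons`, and `dotp` are hypothetical names. Scalar functions are ordinary Haskell functions here, whereas a real code generator would also need to reify them, and a list-based reference interpreter stands in for CUDA code generation.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- Hypothetical miniature array language: each constructor is one
-- collective operation, i.e. one skeleton a backend would instantiate.
data Acc a where
  Use     :: [Float] -> Acc [Float]                       -- embed host data
  Map     :: (Float -> Float) -> Acc [Float] -> Acc [Float]
  ZipWith :: (Float -> Float -> Float)
          -> Acc [Float] -> Acc [Float] -> Acc [Float]
  Fold    :: (Float -> Float -> Float) -> Float
          -> Acc [Float] -> Acc Float

-- Reference interpreter on lists (assumption: stands in for the
-- step where a real backend would generate and launch GPU kernels).
run :: Acc a -> a
run (Use xs)        = xs
run (Map f a)       = map f (run a)
run (ZipWith f a b) = zipWith f (run a) (run b)
run (Fold f z a)    = foldl f z (run a)

-- List the skeletons a program needs, in evaluation order.
skeletons :: Acc a -> [String]
skeletons (Use _)         = []
skeletons (Map _ a)       = skeletons a ++ ["map"]
skeletons (ZipWith _ a b) = skeletons a ++ skeletons b ++ ["zipWith"]
skeletons (Fold _ _ a)    = skeletons a ++ ["fold"]

-- Dot product expressed purely with collective operations.
dotp :: [Float] -> [Float] -> Acc Float
dotp xs ys = Fold (+) 0 (ZipWith (*) (Use xs) (Use ys))

main :: IO ()
main = do
  print (run (dotp [1, 2, 3] [4, 5, 6]))        -- prints 32.0
  print (skeletons (dotp [1, 2, 3] [4, 5, 6]))  -- prints ["zipWith","fold"]
```

Because the program is a data structure rather than directly executed code, a backend can inspect it before running it, which is what makes it possible to select a skeleton implementation per operation at run time.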
Original language | English |
---|---|
Title of host publication | DAMP'11 - Proceedings of the 6th ACM Workshop on Declarative Aspects of Multicore Programming |
Pages | 3-14 |
Number of pages | 12 |
Publication status | Published - 7 Mar 2011 |
Externally published | Yes |
Event | 6th Workshop on Declarative Aspects of Multicore Programming, DAMP 2011 - Austin, TX, United States; Duration: 23 Jan 2011 → 23 Jan 2011 |
Conference
Conference | 6th Workshop on Declarative Aspects of Multicore Programming, DAMP 2011 |
---|---|
Country/Territory | United States |
City | Austin, TX |
Period | 23/01/11 → 23/01/11 |
Keywords
- Arrays
- Data parallelism
- Dynamic compilation
- GPGPU
- Haskell
- Skeletons