Abstract
Purely functional, embedded array programs are a good match for SIMD hardware, such as GPUs. However, the naive compilation of such programs quickly leads to both code explosion and an excessive use of intermediate data structures. The resulting slowdown is not acceptable on target hardware that is usually chosen to achieve high performance. In this paper, we discuss two optimisation techniques, sharing recovery and array fusion, that tackle code explosion and eliminate superfluous intermediate structures. Both techniques are well known from other contexts, but they present unique challenges for an embedded language compiled for execution on a GPU. We present novel methods for implementing sharing recovery and array fusion, and demonstrate their effectiveness on a set of benchmarks.
Original language | English |
---|---|
Title of host publication | ICFP 2013 - Proceedings of the 2013 ACM SIGPLAN International Conference on Functional Programming |
Pages | 49-60 |
Number of pages | 12 |
Publication status | Published - 12 Nov 2013 |
Externally published | Yes |
Event | 2013 18th ACM SIGPLAN International Conference on Functional Programming, ICFP 2013 - Boston, MA, United States (Duration: 25 Sept 2013 → 27 Sept 2013) |
Conference
Conference | 2013 18th ACM SIGPLAN International Conference on Functional Programming, ICFP 2013 |
---|---|
Country/Territory | United States |
City | Boston, MA |
Period | 25/09/13 → 27/09/13 |
Keywords
- Array fusion
- Arrays
- Data parallelism
- Dynamic compilation
- Embedded language
- GPGPU
- Haskell
- Sharing recovery