Optimising purely functional GPU programs

Trevor L. McDonell, Manuel M.T. Chakravarty, Gabriele Keller, Ben Lippmeier

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Purely functional, embedded array programs are a good match for SIMD hardware, such as GPUs. However, the naive compilation of such programs quickly leads to both code explosion and an excessive use of intermediate data structures. The resulting slowdown is not acceptable on target hardware that is usually chosen to achieve high performance. In this paper, we discuss two optimisation techniques, sharing recovery and array fusion, that tackle code explosion and eliminate superfluous intermediate structures. Both techniques are well known from other contexts, but they present unique challenges for an embedded language compiled for execution on a GPU. We present novel methods for implementing sharing recovery and array fusion, and demonstrate their effectiveness on a set of benchmarks.
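
The two problems named in the abstract can be illustrated with a minimal sketch. It assumes plain Haskell lists and a toy expression type; the names `Exp`, `size`, `naive` and `shared` are hypothetical and are not taken from the paper, and the code shows the symptoms (an intermediate structure, an exponentially growing term) rather than the paper's methods for removing them.

```haskell
module Main where

-- Array fusion, shown on lists: two traversals with an intermediate
-- list versus one traversal with none. Both compute the same result.
unfused :: [Float] -> [Float]
unfused xs = map (* 2) (map (+ 1) xs)

fused :: [Float] -> [Float]
fused = map ((* 2) . (+ 1))

-- A toy embedded expression language (hypothetical, for illustration).
data Exp
  = Lit Int
  | Var String
  | Add Exp Exp
  | Let String Exp Exp
  deriving (Eq, Show)

-- Number of nodes in an expression tree.
size :: Exp -> Int
size (Lit _)     = 1
size (Var _)     = 1
size (Add a b)   = 1 + size a + size b
size (Let _ b e) = 1 + size b + size e

-- Sharing in the host language is invisible in the embedded term:
-- each use of `e` becomes a separate copy of the subtree, so the
-- term doubles in size at every step (code explosion).
naive :: Int -> Exp
naive n = iterate (\e -> Add e e) (Lit 1) !! n

-- The same computation with sharing made explicit through let
-- bindings; the term grows linearly in n.
shared :: Int -> Exp
shared n = Let "x0" (Lit 1) (go 1)
  where
    go i
      | i > n     = Var ("x" ++ show n)
      | otherwise = Let ("x" ++ show i) (Add (Var prev) (Var prev)) (go (i + 1))
      where
        prev = "x" ++ show (i - 1)

main :: IO ()
main = do
  print (unfused [1, 2, 3] == fused [1, 2, 3])
  print (size (naive 10), size (shared 10))
```

Sharing recovery, in this picture, is the step that turns a term shaped like `naive n` back into one shaped like `shared n`; fusion is the step that turns `unfused` into `fused`.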

Original language: English
Title of host publication: ICFP 2013 - Proceedings of the 2013 ACM SIGPLAN International Conference on Functional Programming
Pages: 49-60
Number of pages: 12
Publication status: Published - 12 Nov 2013
Externally published: Yes
Event: 2013 18th ACM SIGPLAN International Conference on Functional Programming, ICFP 2013 - Boston, MA, United States
Duration: 25 Sept 2013 to 27 Sept 2013

Conference

Conference: 2013 18th ACM SIGPLAN International Conference on Functional Programming, ICFP 2013
Country/Territory: United States
City: Boston, MA
Period: 25/09/13 to 27/09/13

Keywords

  • Array fusion
  • Arrays
  • Data parallelism
  • Dynamic compilation
  • Embedded language
  • GPGPU
  • Haskell
  • Sharing recovery
