Optimising purely functional GPU programs

Trevor L. McDonell, Manuel M.T. Chakravarty, Gabriele Keller, Ben Lippmeier

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Purely functional, embedded array programs are a good match for SIMD hardware, such as GPUs. However, the naive compilation of such programs quickly leads to both code explosion and an excessive use of intermediate data structures. The resulting slowdown is not acceptable on target hardware that is usually chosen to achieve high performance. In this paper, we discuss two optimisation techniques, sharing recovery and array fusion, that tackle code explosion and eliminate superfluous intermediate structures. Both techniques are well known from other contexts, but they present unique challenges for an embedded language compiled for execution on a GPU. We present novel methods for implementing sharing recovery and array fusion, and demonstrate their effectiveness on a set of benchmarks.

Original language: English
Pages (from-to): 49-60
Number of pages: 12
Journal: ACM SIGPLAN Notices
Volume: 48
Issue number: 9
Publication status: Published - 1 Sept 2013
Externally published: Yes

Keywords

  • Array fusion
  • Arrays
  • Data parallelism
  • Dynamic compilation
  • Embedded language
  • GPGPU
  • Haskell
  • Sharing recovery
