Zero-Overhead Parallel Scans for Multi-Core CPUs

Research output: Contribution to conferencePaperAcademic

Abstract

We present three novel parallel scan algorithms for multi-core CPUs which do not need to fix the number of available cores at the start, and have zero overhead compared to sequential scans when executed on a single core. These two properties are in contrast with most existing parallel scan algorithms, which are asymptotically optimal, but have a constant factor overhead compared to sequential scans when executed on a single core. We achieve these properties by adapting the classic three-phase scan algorithms. The resulting algorithms also exhibit better performance than the original ones on multiple cores. Furthermore, we adapt the chained scan with decoupled look-back algorithm to also have these two properties. While this algorithm was originally designed for GPUs, we show it is also suitable for multi-core CPUs, outperforming the classic three-phase scans in our benchmarks, by better using the caches of the processor at the cost of more synchronisation. In general our adaptive chained scan is the fastest parallel scan, but in specific situations our assisted reduce-then-scan is better.
Original languageEnglish
Pages52-61
Number of pages12
DOIs
Publication statusPublished - 6 Mar 2024
Event15th International Workshop on Programming Models and Applications for Multicores and Manycores - Edinburgh, United Kingdom
Duration: 3 Mar 20243 Mar 2024

Conference

Conference15th International Workshop on Programming Models and Applications for Multicores and Manycores
Abbreviated titlePMAM '24
Country/TerritoryUnited Kingdom
CityEdinburgh
Period3/03/243/03/24

Fingerprint

Dive into the research topics of 'Zero-Overhead Parallel Scans for Multi-Core CPUs'. Together they form a unique fingerprint.

Cite this