Abstract
In earlier work, we presented a one-dimensional cache-oblivious sparse matrix–vector
(SpMV) multiplication scheme which has its roots in one-dimensional sparse matrix partitioning.
Partitioning is often used in distributed-memory parallel computing for the SpMV
multiplication, an important kernel in many applications. A logical extension is to move
towards using a two-dimensional partitioning. In this paper, we present our research in
this direction, extending the one-dimensional method for cache-oblivious SpMV multiplication
to two dimensions, while still allowing only row and column permutations on the
sparse input matrix. This extension requires a generalisation of the compressed row storage
data structure to a block-based data structure, for which several variants are investigated.
Experiments performed on three different architectures show further
improvements of the two-dimensional method compared to the one-dimensional method,
especially in those cases where the one-dimensional method already provided significant
gains. The largest gain obtained by our new reordering is over a factor of 3 in SpMV speed,
compared to the natural matrix ordering.
Original language | English |
---|---|
Pages (from-to) | 806-819 |
Number of pages | 14 |
Journal | Parallel Computing |
Volume | 37 |
Issue number | 12 |
DOIs | |
Publication status | Published - Dec 2011 |
Keywords
- Matrix–vector multiplication
- Sparse matrix
- Parallel computing
- Recursive bipartitioning
- Fine-grain
- Cache-oblivious