I am running sparse matrix-vector multiplication with CSR compression using the CUSPARSE and CUSP libraries. I am getting the matrices from the University of Florida matrix collection (.mtx files) and have noticed some strange performance behavior with the different matrix “types” - real and binary.
When executing sparse matrix-vector multiplication on the GPU with .mtx matrix type REAL, the GPU libraries run much faster than the CPU sparse matrix-vector version (which is as expected). However, when executing with .mtx matrix type BINARY, the GPU libraries run much slower than the CPU version. The number of rows, columns, and non-zeros doesn’t seem to matter, but when TYPE is defined as BINARY the GPU performance drops dramatically.
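For reference, this is roughly the kind of CPU baseline I am comparing against - a plain CSR matrix-vector product. This is only an illustrative sketch (the function and variable names are my own, not from CUSPARSE or CUSP):

```c
#include <stddef.h>

/* Minimal CPU reference CSR SpMV: y = A * x.
   row_ptr has n_rows + 1 entries; col_idx and vals each have nnz entries.
   Hypothetical sketch of a CPU baseline, not library code. */
static void csr_spmv(int n_rows, const int *row_ptr, const int *col_idx,
                     const double *vals, const double *x, double *y)
{
    for (int i = 0; i < n_rows; ++i) {
        double sum = 0.0;
        /* accumulate over the non-zeros of row i */
        for (int j = row_ptr[i]; j < row_ptr[i + 1]; ++j)
            sum += vals[j] * x[col_idx[j]];
        y[i] = sum;
    }
}
```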
I have been searching the literature on this but can find nothing. Has anyone read anything regarding this performance differential between BINARY and REAL .mtx sparse matrices? This is an interesting aspect that I have not seen before, and I am curious whether anyone has run into similar issue(s).
What are these BINARY matrices? Matrices with boolean elements? If so, I could imagine there are highly specialized algorithms or code sequences for computing with those that may not be implemented in CUSPARSE.
Feature support in libraries associated with CUDA is largely driven by customer requests and feedback, so consider filing an enhancement request (through the bug reporting form linked from the registered developer website; prefix the synopsis with “RFE:” to mark it as a request for enhancement).
Thank you for the post. BINARY refers to type=binary in the .mtx description as defined in the University of Florida Sparse Matrix Collection (https://sparse.tamu.edu). The actual values in the matrix file appear in the following format for .mtx type=REAL and type=BINARY:
1 1 0.4234
1 2 12.0021
where the first two values are the row and column index, and the third is the element value (for REAL). I am assuming that the element value for BINARY is 1 for every row/column entry listed in the file, and 0 otherwise. However, I am just guessing.
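Under that assumption, reading an entry from either file type could look like the sketch below: if only a row and column index are present, the value is taken to be 1.0. This is just my guess at the convention, encoded as code (the function name is hypothetical):

```c
#include <stdio.h>

/* Parse one .mtx coordinate line. For type=REAL the line is
   "row col value"; for BINARY, assuming my guess above is right,
   only "row col" is present and the value is implicitly 1.0. */
static int parse_mtx_entry(const char *line, int *row, int *col, double *val)
{
    int n = sscanf(line, "%d %d %lf", row, col, val);
    if (n == 2) {           /* BINARY entry: no explicit value */
        *val = 1.0;
        n = 3;
    }
    return n == 3;          /* 1 on success, 0 on malformed line */
}
```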
Based on this, the BINARY matrices do seem to be composed of Boolean elements with values of either 0 or 1, which means such matrices could be manipulated efficiently with integer instructions in a bit-parallel fashion, whereas matrices in REAL format are processed with classical floating-point arithmetic.
As a working hypothesis, processing BINARY matrices comprising Boolean elements with methods tuned for the processing of REAL matrices comprising floating-point numbers would be inefficient, both in terms of memory bandwidth efficiency and computational throughput, since each 1-bit Boolean would be represented by (probably) a 32-bit floating-point number.
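To make the bit-parallel idea concrete: with 64 Boolean elements packed into one integer word, a dot-product-like operation reduces to a bitwise AND plus a population count, touching 1/32 of the memory a 32-bit float representation would. This is purely an illustration of the hypothesis, not anything CUSPARSE does (it uses the GCC/Clang `__builtin_popcountll` intrinsic):

```c
#include <stdint.h>

/* Bit-parallel "dot product" of two Boolean vectors packed 64 elements
   per word: counts the positions where both vectors hold a 1.
   Illustrative sketch of the bit-parallel hypothesis above. */
static int bool_dot(const uint64_t *a, const uint64_t *b, int n_words)
{
    int count = 0;
    for (int i = 0; i < n_words; ++i)
        count += __builtin_popcountll(a[i] & b[i]); /* GCC/Clang builtin */
    return count;
}
```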
While CUSPARSE may be able to handle BINARY matrices correctly (I am not sure), my understanding is that it is currently optimized for floating-point computation, i.e. REAL matrices.
Interesting hypothesis, and likely correct. CUSPARSE and CUSP are optimized for floating-point elements, and Boolean elements may be a different story.
I will look into it and see what happens. Thanks again :)