How much of a performance boost should one expect when going from double-precision (CUDA_R_64F) cuSparseSpSV to single-precision (CUDA_R_32F) cuSparseSpSV? I am only getting an 18% reduction in compute time on a lusol, when one would expect at least a 50% reduction from halving the precision.
Performance is not proportional to the data type size. Other factors also affect the execution time, especially if the routine is not entirely memory-bound. Some examples are register pressure, cache behavior, and instruction dependencies.
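To put a rough number on this, here is a minimal back-of-the-envelope sketch, assuming the matrix is stored in CSR with 32-bit indices and the solve is purely bandwidth-bound. The function names and the example sizes are hypothetical, and this is only a traffic estimate, not how cuSPARSE works internally. It suggests that even in the best case the memory traffic does not halve, because the index arrays stay the same size in both precisions.

```python
# Rough traffic model for a CSR triangular solve (an illustrative assumption,
# not how cuSPARSE accounts for memory internally).

INDEX_BYTES = 4  # 32-bit column indices and row offsets, same for both precisions


def csr_solve_bytes(n_rows: int, nnz: int, value_bytes: int) -> int:
    """Estimate bytes moved for one solve with the given value size."""
    matrix_values = nnz * value_bytes          # nonzero values
    column_indices = nnz * INDEX_BYTES         # one column index per nonzero
    row_offsets = (n_rows + 1) * INDEX_BYTES   # CSR row pointer array
    vectors = 2 * n_rows * value_bytes         # read right-hand side, write solution
    return matrix_values + column_indices + row_offsets + vectors


def best_case_reduction(n_rows: int, nnz: int) -> float:
    """Fractional drop in traffic when going from 8-byte to 4-byte values."""
    fp64 = csr_solve_bytes(n_rows, nnz, 8)
    fp32 = csr_solve_bytes(n_rows, nnz, 4)
    return 1.0 - fp32 / fp64


if __name__ == "__main__":
    # Hypothetical factor: 1,000,000 rows, 10 nonzeros per row.
    print(f"{best_case_reduction(1_000_000, 10_000_000):.1%}")
```

Under these assumptions the reduction always stays below 50%, and that is before accounting for the row-to-row dependencies in a triangular solve, which limit parallelism regardless of precision.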
Thank you for the insight!