dgeev Lapack call CUDA batch mode?

cuda_hpc80 · January 26, 2012, 2:44pm

I am trying to find a CUDA equivalent of the dgeev routine from LAPACK.

I compiled magma-1.1.0 on a Tesla C2070 and ran its dgeev test, which benchmarks matrices of size 1024 to 8064. Interestingly, for a 1024x1024 matrix the GPU takes longer than the CPU.

   N     CPU Time(s)    GPU Time(s)     ||R||_F / ||A||_F
==========================================================
1024        31.66          51.06
2048       251.49         138.11
3072       515.84         322.13
4032       738.23         578.76
5184      1429.96         793.89
6016      1634.60        1136.89
7040      2171.73        1432.91
8064      3345.07        1625.88

I would like to run dgeev on 100,000 separate 10x10 matrices (i.e. in burst, or batch, mode).

In this scenario, each GPU thread would solve one 10x10 matrix. With 64 threads launched, for example, 64 of the 10x10 matrices would be solved in parallel.
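To make the one-thread-per-matrix idea concrete, here is a minimal sketch of how the indexing and memory layout could look. It is not a dgeev implementation: the eigen-solve is replaced by a clearly marked placeholder (the trace, which equals the sum of the eigenvalues), and the names `per_matrix_kernel` and `batched_placeholder` are made up for illustration. It assumes a device with double-precision support and omits CUDA error checking for brevity.

```cuda
#include <cassert>
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

constexpr int N = 10;  // matrix dimension from the post

// One thread per matrix. Matrices are stored interleaved: element (i, j) of
// matrix b lives at A[(i * N + j) * batch + b], so neighbouring threads read
// neighbouring addresses (coalesced loads).
__global__ void per_matrix_kernel(const double* A, double* out, int batch)
{
    int b = blockIdx.x * blockDim.x + threadIdx.x;
    if (b >= batch) return;

    // Copy this thread's matrix into thread-private storage.
    double M[N][N];
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            M[i][j] = A[static_cast<size_t>(i * N + j) * batch + b];

    // PLACEHOLDER for the eigen-solve. A real dgeev replacement would run
    // Hessenberg reduction and shifted QR iteration on M at this point.
    // The trace (sum of the eigenvalues) stands in for that work.
    double trace = 0.0;
    for (int i = 0; i < N; ++i) trace += M[i][i];
    out[b] = trace;
}

// Hypothetical host helper: takes `batch` row-major 10x10 matrices stored
// back to back, repacks them into the interleaved layout, and launches one
// thread per matrix. Error checking is omitted in this sketch.
std::vector<double> batched_placeholder(const std::vector<double>& h_mats, int batch)
{
    assert(h_mats.size() == static_cast<size_t>(batch) * N * N);

    std::vector<double> h_A(h_mats.size());
    for (int b = 0; b < batch; ++b)
        for (int e = 0; e < N * N; ++e)
            h_A[static_cast<size_t>(e) * batch + b] =
                h_mats[static_cast<size_t>(b) * N * N + e];

    double* d_A = nullptr;
    double* d_out = nullptr;
    cudaMalloc(&d_A, h_A.size() * sizeof(double));
    cudaMalloc(&d_out, batch * sizeof(double));
    cudaMemcpy(d_A, h_A.data(), h_A.size() * sizeof(double), cudaMemcpyHostToDevice);

    const int threads = 64;  // 64 threads per block, as in the post's example
    const int blocks = (batch + threads - 1) / threads;
    per_matrix_kernel<<<blocks, threads>>>(d_A, d_out, batch);

    // The blocking copy below also waits for the kernel to finish.
    std::vector<double> h_out(batch);
    cudaMemcpy(h_out.data(), d_out, batch * sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_A);
    cudaFree(d_out);
    return h_out;
}
```

Calling `batched_placeholder(mats, 100000)` would launch 1563 blocks of 64 threads, with the last block partly idle. The interleaved layout is a design choice: with plain back-to-back storage, adjacent threads would read addresses 100 doubles apart, which wastes memory bandwidth.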

Any suggestions for a CUDA library that can handle this?

PS: I have looked at CULA R12 and haven’t found anything on their forums that suggests a burst mode for small matrices.

Thanks in advance.
