I was wondering if anyone can answer a small question about CUBLAS.
I am using a GTX275 board and have two complex matrices, 1024*1024 and 1024*1000, that I multiply together.
This complex multiplication takes about 47 milliseconds on average (using the cublasCgemm() function), and I was wondering if there is any way I could speed it up. I need the speedup because I am hoping to build a real-time system, so I should get this operation below 20 ms if possible.
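To put those timings in perspective, here is a rough back-of-the-envelope sketch of the throughput they imply. It assumes the product is (1024*1024) times (1024*1000) in single-precision complex, and it uses the common convention of 8 real floating-point operations per complex multiply-add; the function names are just illustrative, not part of CUBLAS.

```python
def cgemm_flops(m, n, k):
    """Real floating-point operations for an m*k by k*n complex product.

    Assumes the usual convention: one complex multiply-add costs
    4 real multiplies + 4 real adds = 8 operations.
    """
    return 8 * m * n * k


def gflops(m, n, k, seconds):
    """Sustained throughput in GFLOP/s for a given run time."""
    return cgemm_flops(m, n, k) / seconds / 1e9


# Assumed sizes: A is 1024*1024, B is 1024*1000.
M, N, K = 1024, 1000, 1024

measured = gflops(M, N, K, 47e-3)  # current average time
target = gflops(M, N, K, 20e-3)    # real-time goal

print(f"total work: {cgemm_flops(M, N, K) / 1e9:.2f} GFLOP")
print(f"at 47 ms:   {measured:.0f} GFLOP/s")
print(f"at 20 ms:   {target:.0f} GFLOP/s")
print(f"speedup needed: {target / measured:.2f}x")
```

Under these assumptions, getting below 20 ms means sustaining roughly 2.35 times the throughput I measure now, so the question is whether the board can actually deliver that.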
I am new to CUDA and just read about constant memory. The first matrix (1024*1024) will not change, so it can be set by the CPU at startup. If I use texture or constant memory for it, would I see an improvement?
Are the CUBLAS functions open source? Is there a link to them somewhere?
Thanks a lot for your help!