Sample MatrixCUBLAS fails on V100

dorra.boughzalayuhqo · January 9, 2020, 9:44am

Hello everyone,

I was trying to run the sample matrixCUBLAS from the CUDA Samples SDK on a V100. However, it fails to run and reports:

   Performance= inf GFlop/s, Time= 0.000 msec, Size= 196608000 Ops

The same sample runs very well on other GPUs.

Can anyone relate to this?

Thank you,
Dorra

Robert_Crovella · January 10, 2020, 10:51am

There isn’t any CUDA sample code called matrixCUBLAS that I am aware of. There is one called matrixMulCUBLAS. For that sample code, I’m able to run it on a Tesla V100 without difficulty. You may have a problem with your CUDA install or some other problem with that setup.

$ /usr/local/cuda/samples/bin/x86_64/linux/release/matrixMulCUBLAS
[Matrix Multiply CUBLAS] - Starting...
GPU Device 0: "Tesla V100-PCIE-32GB" with compute capability 7.0

GPU Device 0: "Tesla V100-PCIE-32GB" with compute capability 7.0

MatrixA(640,480), MatrixB(480,320), MatrixC(640,320)
Computing result using CUBLAS...done.
Performance= 6073.35 GFlop/s, Time= 0.032 msec, Size= 196608000 Ops
Computing result using host CPU...done.
Comparing CUBLAS Matrix Multiply with CPU results: PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
$
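
For readers comparing the two outputs, here is a minimal C++ sketch of the arithmetic behind the "Size" and "Performance" figures. It assumes the usual 2*M*N*K operation count for a matrix multiply and an IEEE 754 platform. The function names are illustrative and are not taken from the sample's source. Under those assumptions, a measured time of zero produces inf rather than an error, which is consistent with the timed work not having run at all.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Floating-point operations for C(M,N) = A(M,K) * B(K,N):
// one multiply and one add per inner-product term.
double gemm_flops(std::uint64_t m, std::uint64_t n, std::uint64_t k) {
    return 2.0 * static_cast<double>(m) * static_cast<double>(n) *
           static_cast<double>(k);
}

// Throughput in GFlop/s from an operation count and a time in milliseconds.
// On IEEE 754 platforms a zero time yields +inf instead of raising an error.
double gflops(double flops, double msec) {
    return (flops * 1.0e-9) / (msec / 1000.0);
}

// Hypothetical guard (not part of the sample): reject a timing that is
// zero, negative, or non-finite before reporting a throughput figure.
bool timing_is_plausible(double msec) {
    return std::isfinite(msec) && msec > 0.0;
}
```

With the dimensions printed above, gemm_flops(640, 320, 480) gives 196608000, matching the reported Size. Feeding the rounded 0.032 msec into gflops gives 6144, close to the reported 6073.35 (the printed time is rounded to three decimals), while a time of 0.0 gives inf, as in the failing run.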
