Hello everyone,
I was trying to run the matrixCUBLAS sample from the CUDA Samples SDK on a V100. However, it fails to run and prints:
Performance= inf GFlop/s, Time= 0.000 msec, Size= 196608000 Ops
The same sample runs fine on other GPUs.
Has anyone else seen this?
Thank you,
Dorra
I'm not aware of any CUDA sample code called matrixCUBLAS. There is one called matrixMulCUBLAS, and I'm able to run that sample on a Tesla V100 without difficulty. You may have a problem with your CUDA install or some other problem with that setup.
$ /usr/local/cuda/samples/bin/x86_64/linux/release/matrixMulCUBLAS
[Matrix Multiply CUBLAS] - Starting...
GPU Device 0: "Tesla V100-PCIE-32GB" with compute capability 7.0
GPU Device 0: "Tesla V100-PCIE-32GB" with compute capability 7.0
MatrixA(640,480), MatrixB(480,320), MatrixC(640,320)
Computing result using CUBLAS...done.
Performance= 6073.35 GFlop/s, Time= 0.032 msec, Size= 196608000 Ops
Computing result using host CPU...done.
Comparing CUBLAS Matrix Multiply with CPU results: PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
$
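For reference, here is a minimal sketch of how the numbers on the Performance line relate to each other. It assumes the sample counts 2*M*K*N operations for the multiply and divides by the measured time per multiply, which is how I recall the sample computing it; the function names matmul_ops and gflops are mine and do not come from the sample source.

```cpp
#include <cassert>
#include <cmath>

// Floating-point operation count for C = A * B, where A is m x k and B is k x n:
// one multiply and one add per inner-product term.
double matmul_ops(int m, int k, int n) {
    return 2.0 * m * k * n;
}

// Throughput in GFlop/s for `ops` operations completed in `msec` milliseconds.
// With msec == 0.0 the division yields +inf on IEEE 754 platforms, matching
// the "Performance= inf GFlop/s, Time= 0.000 msec" line in the failing run.
double gflops(double ops, double msec) {
    assert(msec >= 0.0);  // a negative elapsed time would indicate a timing bug
    return (ops * 1.0e-9) / (msec / 1000.0);
}
```

With the sizes above, matmul_ops(640, 480, 320) gives the 196608000 Ops shown in both runs. Plugging in the rounded 0.032 msec gives roughly 6144 GFlop/s, close to the reported 6073.35, which presumably comes from the unrounded time. A time of exactly 0.000 msec gives inf, which suggests the timed work never actually ran on the GPU in the failing setup.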