deviceQuery passes and then fails

flyviapie · July 5, 2016, 3:51pm

Hello.

I have installed CUDA 7.5 on an Ubuntu 14.04 server. The machine has 4 graphics cards in it. My ultimate goal is to install and run Caffe for a machine learning application I am working with, and although I can compile Caffe just fine I fail the runtests every time I attempt them. I believe I have done something incorrectly while installing CUDA despite attempting to follow the official install instructions precisely. I have noticed that when I reboot the system I am able to run ./deviceQuery and it passes:

user@cuda:~/NVIDIA_CUDA-7.5_Samples/bin/x86_64/linux/release$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 4 CUDA Capable device(s)

Device 0: "Tesla C2050"
  CUDA Driver Version / Runtime Version          7.5 / 7.5
  CUDA Capability Major/Minor version number:    2.0
  Total amount of global memory:                 2687 MBytes (2817982464 bytes)
  (14) Multiprocessors, ( 32) CUDA Cores/MP:     448 CUDA Cores
  GPU Max Clock rate:                            1147 MHz (1.15 GHz)
  Memory Clock rate:                             1500 Mhz
  Memory Bus Width:                              384-bit
  L2 Cache Size:                                 786432 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (65535, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Enabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 1: "GeForce GT 430"
  CUDA Driver Version / Runtime Version          7.5 / 7.5
  CUDA Capability Major/Minor version number:    2.1
  Total amount of global memory:                 1023 MBytes (1072185344 bytes)
  ( 2) Multiprocessors, ( 48) CUDA Cores/MP:     96 CUDA Cores
  GPU Max Clock rate:                            1400 MHz (1.40 GHz)
  Memory Clock rate:                             900 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 131072 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (65535, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 2 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 2: "GeForce GTX 470"
  CUDA Driver Version / Runtime Version          7.5 / 7.5
  CUDA Capability Major/Minor version number:    2.0
  Total amount of global memory:                 1280 MBytes (1341849600 bytes)
  (14) Multiprocessors, ( 32) CUDA Cores/MP:     448 CUDA Cores
  GPU Max Clock rate:                            1215 MHz (1.22 GHz)
  Memory Clock rate:                             1674 Mhz
  Memory Bus Width:                              320-bit
  L2 Cache Size:                                 655360 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (65535, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 5 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 3: "GeForce GTX 470"
  CUDA Driver Version / Runtime Version          7.5 / 7.5
  CUDA Capability Major/Minor version number:    2.0
  Total amount of global memory:                 1280 MBytes (1341849600 bytes)
  (14) Multiprocessors, ( 32) CUDA Cores/MP:     448 CUDA Cores
  GPU Max Clock rate:                            1215 MHz (1.22 GHz)
  Memory Clock rate:                             1674 Mhz
  Memory Bus Width:                              320-bit
  L2 Cache Size:                                 655360 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (65535, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 6 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> Peer access from Tesla C2050 (GPU0) -> GeForce GT 430 (GPU1) : No
> Peer access from Tesla C2050 (GPU0) -> GeForce GTX 470 (GPU2) : Yes
> Peer access from Tesla C2050 (GPU0) -> GeForce GTX 470 (GPU3) : Yes
> Peer access from GeForce GT 430 (GPU1) -> Tesla C2050 (GPU0) : No
> Peer access from GeForce GT 430 (GPU1) -> GeForce GTX 470 (GPU2) : No
> Peer access from GeForce GT 430 (GPU1) -> GeForce GTX 470 (GPU3) : No
> Peer access from GeForce GTX 470 (GPU2) -> Tesla C2050 (GPU0) : Yes
> Peer access from GeForce GTX 470 (GPU2) -> GeForce GT 430 (GPU1) : No
> Peer access from GeForce GTX 470 (GPU2) -> GeForce GTX 470 (GPU3) : Yes
> Peer access from GeForce GTX 470 (GPU3) -> Tesla C2050 (GPU0) : Yes
> Peer access from GeForce GTX 470 (GPU3) -> GeForce GT 430 (GPU1) : No
> Peer access from GeForce GTX 470 (GPU3) -> GeForce GTX 470 (GPU2) : Yes

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.5, CUDA Runtime Version = 7.5, NumDevs = 4, Device0 = Tesla C2050, Device1 = GeForce GT 430, Device2 = GeForce GTX 470, Device3 = GeForce GTX 470
Result = PASS

I do not have the experience with CUDA to be able to tell if there are significant problems with the above output, I have only been going off of the “PASS” result so far.

Here is where my troubles start. After getting this initial PASS, I move on to test other example programs. They seem to work fine initially. I attempted the following commands in order, leading up to the generation of the errors. The first two tests with matrixMul and matrixMulCUBLAS work fine, and then I begin having issues, culminating in the machine becoming unable to pass deviceQuery a second time:

user@cuda:~/NVIDIA_CUDA-7.5_Samples/bin/x86_64/linux/release$ ./matrixMul
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "Tesla C2050" with compute capability 2.0

MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
Performance= 173.56 GFlop/s, Time= 0.755 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

user@cuda:~/NVIDIA_CUDA-7.5_Samples/bin/x86_64/linux/release$ ./matrixMulCUBLAS 
[Matrix Multiply CUBLAS] - Starting...
GPU Device 0: "Tesla C2050" with compute capability 2.0

MatrixA(640,480), MatrixB(480,320), MatrixC(640,320)
Computing result using CUBLAS...done.
Performance= 516.29 GFlop/s, Time= 0.381 msec, Size= 196608000 Ops
Computing result using host CPU...done.
Comparing CUBLAS Matrix Multiply with CPU results: PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Here specifically is where things take a negative turn. It should be noted that I chose to run matrixMul → matrixMulCUBLAS → simpleP2P in that order simply on a whim to try out test programs and see if they work correctly since I have isolated my CUDA install as the likely source of my issues with Caffe.

user@cuda:~/NVIDIA_CUDA-7.5_Samples/bin/x86_64/linux/release$ ./simpleP2P
[./simpleP2P] - Starting...
Checking for multiple GPUs...
CUDA error at simpleP2P.cu:63 code=10(cudaErrorInvalidDevice) "cudaGetDeviceCount(&gpu_n)"

Since the last test of simpleP2P didn’t go well I attempted to backtrack and run matrixMul again:

user@cuda:~/NVIDIA_CUDA-7.5_Samples/bin/x86_64/linux/release$ ./matrixMul
[Matrix Multiply Using CUDA] - Starting...
cudaGetDevice returned error invalid device ordinal (code 10), line(396)
cudaGetDeviceProperties returned error invalid device ordinal (code 10), line(409)
MatrixA(160,160), MatrixB(320,160)
cudaMalloc d_A returned error invalid device ordinal (code 10), line(164)

And finally, running deviceQuery a second time:

user@cuda:~/NVIDIA_CUDA-7.5_Samples/bin/x86_64/linux/release$ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 10
-> invalid device ordinal
Result = FAIL

If anyone could shed some light on what is going on here I would be forever grateful. I have been working to get this running for some time with no luck.

Robert_Crovella · July 5, 2016, 4:17pm

You have an unstable system. It’s going to be difficult to diagnose from the information you’ve provided so far. I would start by checking the basics: power delivery and cooling. Use the nvidia-smi tool repeatedly to see if you can see any anomalies before the GPUs “disappear”. Is your power supply up to the task of powering 4 GPUs? Is your motherboard designed/tested/advertised to support 4 GPUs?

Your initial deviceQuery output looks OK.

If you want to get some lower level data, besides the nvidia-smi output, you could try running:

dmesg | grep NVRM

after things take the “negative turn”.

If you want to run Caffe for deep learning, and use GPU acceleration, the best way to do this (with current Caffe builds) is to use the cuDNN library. This is already integrated into current Caffe builds. But cuDNN requires cc3.0 or higher GPUs, and all your GPUs are cc2.x (Fermi class) so none of them are supported by cuDNN. It doesn’t mean you “can’t” use Caffe, but it will be using the GPUs not at all, or else on a “slower path” in a branch that is probably no longer maintained by the Caffe developers. The way forward here is to use cuDNN.

flyviapie · July 5, 2016, 5:26pm

Thank you for the response. If you don’t mind me asking, how are you able to tell from what I posted that the system is unstable versus there being some less serious software-based issue? In other words, is the erratic behavior (deviceQuery behavior unexpectedly changing) indicative of hardware instability? I am remotely logged into this machine (which I inherited from another student) and I will be attempting to get answers to the hardware questions you mentioned.

I suppose one thing I can try in the mean time is to try and run Caffe without any type of GPU acceleration just on the CPU.

Here is my output from dmesg | grep NVRM

user@cuda:~$ dmesg | grep NVRM
[   12.191504] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  352.39  Fri Aug 14 18:09:10 PDT 2015
[   64.682144] NVRM: Your system is not currently configured to drive a VGA console
[   64.682147] NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
[   64.682149] NVRM: requires the use of a text-mode VGA console. Use of other console
[   64.682150] NVRM: drivers including, but not limited to, vesafb, may result in
[   64.682152] NVRM: corruption and stability problems, and is not supported.
[  126.171751] NVRM: request_irq() failed (-22)
[  126.171754] NVRM: nvidia_frontend_open: minor 2, module->open() failed, error -22
[  138.176564] NVRM: request_irq() failed (-22)
[  138.176567] NVRM: nvidia_frontend_open: minor 2, module->open() failed, error -22
[  176.024943] NVRM: request_irq() failed (-22)
[  176.024946] NVRM: nvidia_frontend_open: minor 2, module->open() failed, error -22
[ 1481.441616] NVRM: request_irq() failed (-22)
[ 1481.441620] NVRM: nvidia_frontend_open: minor 2, module->open() failed, error -22

Robert_Crovella · July 5, 2016, 5:43pm

By unstable I was just reflecting what you have said already, that things work for some period of time, then they don’t. Normally, the CUDA install works or it doesn’t. In an “unstable” situation like this, one of the things I would want to be sure of is that power and cooling are not an issue.

It doesn’t mean that it is for sure either a hardware or software issue.

How did you install CUDA 7.5? Did you use the runfile installer method?

My guess right now is that you still have the nouveau driver. This needs to be removed:

[url]http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau[/url]

flyviapie · July 6, 2016, 3:43pm

Sorry for the delay, I have actually reached a bit of a resolution. I’m posting what happened in the hopes it could (somehow) help someone else in the future dealing with similar issues.

I actually have properly blacklisted the nouveau drivers as per those instructions. Upon double checking that and rebooting I decided to try everything again and to my suprise all of my original tests worked fine with no problems. Since they all worked out I decided to plunge forward and try to compile and test Caffe (recall that using Caffe was my original intent).

I compiled and ran through the Caffe runtest, pinging nvidia-smi repeatedly along the way. While running I got mostly “good” outputs from nvidia-smi, for example here are a few:

Before running the Caffe Runtest:

user@cuda:~$ nvidia-smi
Tue Jul  5 13:30:51 2016       
+------------------------------------------------------+                       
| NVIDIA-SMI 352.39     Driver Version: 352.39         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla C2050         Off  | 0000:01:00.0     Off |                    0 |
| 30%   40C    P0    N/A /  N/A |      6MiB /  2687MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 430      Off  | 0000:02:00.0     N/A |                  N/A |
| 52%   38C    P0    N/A /  N/A |      3MiB /  1022MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 470     Off  | 0000:05:00.0     N/A |                  N/A |
| 40%   34C    P0    N/A /  N/A |      4MiB /  1279MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 470     Off  | 0000:06:00.0     N/A |                  N/A |
| 40%   31C    P0    N/A /  N/A |      4MiB /  1279MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    1                  Not Supported                                         |
|    2                  Not Supported                                         |
|    3                  Not Supported                                         |
+-----------------------------------------------------------------------------+

One example of output during the runtest:

user@cuda:~$ nvidia-smi
Tue Jul  5 13:47:08 2016       
+------------------------------------------------------+                       
| NVIDIA-SMI 352.39     Driver Version: 352.39         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla C2050         Off  | 0000:01:00.0     Off |                    0 |
| 30%   49C    P0    N/A /  N/A |     68MiB /  2687MiB |     74%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 430      Off  | 0000:02:00.0     N/A |                  N/A |
| 52%   37C    P8    N/A /  N/A |     24MiB /  1022MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 470     Off  | 0000:05:00.0     N/A |                  N/A |
| 40%   40C    P3    N/A /  N/A |     63MiB /  1279MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 470     Off  | 0000:06:00.0     N/A |                  N/A |
| 40%   40C    P3    N/A /  N/A |     63MiB /  1279MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     11620    C   .build_release/test/test_all.testbin            60MiB |
|    1                  Not Supported                                         |
|    2                  Not Supported                                         |
|    3                  Not Supported                                         |
+-----------------------------------------------------------------------------+

Then right before the runtest was aborted:

user@cuda:~$ nvidia-smi
Tue Jul  5 13:48:42 2016       
+------------------------------------------------------+                       
| NVIDIA-SMI 352.39     Driver Version: 352.39         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla C2050         Off  | 0000:01:00.0     Off |                    0 |
| 30%   58C    P0    N/A /  N/A |     69MiB /  2687MiB |     70%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 430      Off  | 0000:02:00.0     N/A |                  N/A |
| 52%   35C    P8    N/A /  N/A |     24MiB /  1022MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   2  ERR!               ERR!  | ERR!            ERR! |                 ERR! |
|ERR!  ERR! ERR!    ERR! / ERR! |     63MiB /  1279MiB |    ERR!         ERR! |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 470     Off  | 0000:06:00.0     N/A |                  N/A |
| 40%   43C   P12    N/A /  N/A |     63MiB /  1279MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     11620    C   .build_release/test/test_all.testbin            61MiB |
|    1                  Not Supported                                         |
|    2                  ERROR: GPU is lost                                    |
|    3                  Not Supported                                         |
+-----------------------------------------------------------------------------+

Final output:

user@cuda:~$ nvidia-smi
Unable to determine the device handle for GPU 0000:05:00.0: Unknown Error

And at this point the Caffe runtest has aborted with an unknown error.

So I rebooted the machine and got this from nvidia-smi:

user@cuda:~$ nvidia-smi
Wed Jul  6 10:22:53 2016       
+------------------------------------------------------+                       
| NVIDIA-SMI 352.39     Driver Version: 352.39         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla C2050         Off  | 0000:01:00.0     Off |                    0 |
| 30%   45C    P0    N/A /  N/A |      6MiB /  2687MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 430      Off  | 0000:02:00.0     N/A |                  N/A |
| 52%   40C    P0    N/A /  N/A |      3MiB /  1022MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 470     Off  | 0000:06:00.0     N/A |                  N/A |
| 40%   33C    P0    N/A /  N/A |      4MiB /  1279MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    1                  Not Supported                                         |
|    2                  Not Supported                                         |
+-----------------------------------------------------------------------------+

I take all of this to mean that one of the 470 cards was in fact failing, and actually
burned out. The system appears to be stable after now, meaning that accross multiple reboots nvidia-smi output has remained consistent and is able to pass every sample program test from the NVIDIA_CUDA-7.5_Samples directory with no issues.

The fact remains that the cards are too old to make use of cuDNN or to properly use
Caffe, but that is a separate issue I suppose.

Also a separate issue is the fact that Ubuntu 14.04 is actually not a viable platform to install and run Caffe on due to multiple dependency issues that cannot be resolved in a safe manner (despite 14.04 being listed as a good platform to use on the official documentation page for Caffe…).

Thanks for your help txbob.

Topic		Replies	Views
Windows 7 no CUDA-capable device is detected CUDA Setup and Installation	23	19246	January 9, 2018
why "all CUDA-capable devices are busy or unavailable" ? CUDA Programming and Performance	34	64023	April 20, 2011
trying to get a tesla k10 online. cuda_5.5.22_linux_64.run fails Linux	18	5782	February 16, 2014
Ubuntu 14.04: optimus + CUDA Linux	16	43171	March 10, 2016
Install Problem CUDA Programming and Performance	32	12681	December 17, 2009
nvidia-smi "No devices were found" error CUDA Setup and Installation	23	61945	February 14, 2021
bandwidthTest example throws cudaErrorCallRequiresNewerDriver error when launched via nv-nsight-cu-cli Nsight Compute linux , driver	17	1237	February 9, 2024
deviceQuery OK, everything else hangs Cuda sdk 4.1 examples simply hang, no errors, no warnings CUDA Programming and Performance	12	8872	April 23, 2012
CUDA 4 + driver 270.35 (C2050) random errors CUDA Programming and Performance	13	18685	April 7, 2011
340.106 nvidia-uvm.ko fails to build under kernel 4.14.y Linux	16	7294	October 14, 2021

deviceQuery passes and then fails

Related topics