No GPU option in digits-devserver


While Creating Image Classification Model with DIGITS, I don’t have an option to select a GPU.
What’s the problem?

Thanks, Yoni.

Hi mandel.yonatan, what are the specifications of the system you are running DIGITS on?


Ubuntu 16.04 LTS
Intel Core i7-6700 CPU @ 3.40GHz × 4

Thank you.

Have you installed cuDNN and nvcaffe-0.15 correctly? DIGITS uses them underneath to access the GPU.

Yes, according to your GitHub:

How can I check the installation went ok?

If it went correctly, and your GPU is supported (I'm not entirely sure about the GT 730 2GB), DIGITS should normally just start with the GPU.
Barring that, you can navigate a terminal into your Caffe build tree and try running:

$ make runtest

Generally speaking, you’ll want a training GPU with at least 6GB of memory for DNNs like AlexNet/GoogLeNet/ResNet on ImageNet/COCO.
You may be able to change the Batch Size and Batch Accumulation hyperparameters; see this step of the tutorial for reference.
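As a rough sketch of how those two hyperparameters interact (variable names here are illustrative, not the exact DIGITS field names): only the batch-size samples are resident on the GPU at once, while the solver accumulates gradients over several passes before each weight update, so memory usage scales with the batch size while the effective batch stays larger.

```python
# Illustrative numbers only, not taken from a specific DIGITS job.
batch_size = 2          # samples held in GPU memory per forward/backward pass
batch_accumulation = 5  # gradient accumulation steps before a weight update

# The solver behaves as if it trained with this batch size,
# while GPU memory only has to hold `batch_size` samples at a time.
effective_batch = batch_size * batch_accumulation
print(effective_batch)  # 10
```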


In a terminal I navigated to the caffe/build directory and ran make runtest.
I don’t think it helped, since I still don’t have an option to select a GPU under “New Image Classification Model” in DIGITS.
However, when I press Create, the following appears:

Job Status Running
Initialized at 09:31:01 AM (1 second)
Running at 09:31:02 AM
Train Caffe Model Running
Estimated time remaining: ?

Initialized at 09:31:01 AM (1 second)
Running at 09:31:02 AM
GeForce GT 730 (#0)
1.2 GB / 1.95 GB (61.4%)
64 °C
Process #6171
CPU Utilization
900 MB (11.4%)

But after 10 minutes the error appears:

Train Caffe Model Error
Initialized at 09:13:51 AM (1 second)
Running at 09:13:52 AM (11 minutes, 13 seconds)
Error at 09:25:06 AM
(Total - 11 minutes, 15 seconds)
ERROR: Out of memory: failed to allocate 12845056 bytes on device 0
This network produces output loss2/accuracy-top5
This network produces output loss2/loss
Network initialization done.
Solver scaffolding done.
Starting Optimization
Learning Rate Policy: step
Iteration 0, Testing net (#0)
Ignoring source layer train-data
Ignoring source layer label_train-data_1_split
Test net output #0: accuracy = 0.0214326
Test net output #1: accuracy-top5 = 0.7368
Test net output #2: loss = 2.27526 (* 1 = 2.27526 loss)
Test net output #3: loss1/accuracy = 0.0955242
Test net output #4: loss1/accuracy-top5 = 0.697909
Test net output #5: loss1/loss = 4.35742 (* 0.3 = 1.30723 loss)
Test net output #6: loss2/accuracy = 0.0842165
Test net output #7: loss2/accuracy-top5 = 0.703586
Test net output #8: loss2/loss = 2.27488 (* 0.3 = 0.682465 loss)
Out of memory: failed to allocate 12845056 bytes on device 0
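For what it's worth, the failed allocation itself is fairly small; a quick back-of-envelope using the numbers from the job status above (illustrative arithmetic only; the memory actually available also depends on the display, other processes, and fragmentation) shows the card was close to full:

```python
# Numbers taken from the error message and job status above.
failed_alloc = 12845056          # bytes Caffe could not allocate
total_mem = 1.95 * 1024**3       # GT 730 capacity reported by DIGITS, ~1.95 GB
used_mem = 1.2 * 1024**3         # usage shown in the job status snapshot

print(round(failed_alloc / 2**20, 2))         # size of the failed request in MiB
print(round((total_mem - used_mem) / 2**20))  # nominally free MiB at snapshot time
```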

Is there anything else I can check or must I switch to a newer GPU?

Thanks, Yoni.

Hi Dustin,

Changing the Batch Size to 2 and the Accumulation to 5 solved the problem (though I still don’t have the option to choose the GPU in DIGITS). Now it’s running.

By the way, do you think a GeForce 940MX (2GB) is more capable than the GT 730 (2GB)?
How important is the memory size (2 vs. 4 GB)?

Thank you, Yoni.

OK great. Technically you should be able to train this way, although it may not be completely ideal.

It seems that although the GPU selection menu isn’t listing your card, the Job Status output shows it is still being detected correctly. That should be OK since you have one GPU; the menu is really for selecting among multiple GPUs.

Memory capacity is typically very important for training; without enough memory, you may not be able to complete the job on some networks.
For upgrading from a lower-end card I would recommend the GeForce GTX 1060 6GB. Here is one example compact version.
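To give a feel for why memory matters, here is a rough, illustrative estimate of just one activation blob (the conv1 output of the standard AlexNet definition) in float32. Caffe also keeps a gradient buffer of the same size, and a real network has many more layers, so the totals add up quickly on a 2GB card:

```python
# AlexNet conv1 output dimensions (96 filters, 55x55 spatial output)
# with a hypothetical training batch of 128; float32 is 4 bytes.
batch, channels, height, width = 128, 96, 55, 55
bytes_per_float = 4

blob_bytes = batch * channels * height * width * bytes_per_float
print(round(blob_bytes / 2**20))  # MiB for a single data blob, before gradients
```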

Thank you very much Dustin.