Hi everybody,
I have two problems with DIGITS. (Note: I have installed only Caffe 0.15, not TensorFlow.)
Problem 1) When I run "./digits-devserver" on localhost:5000, I get these errors:
[WARNING] Failed to load 2 jobs.
[DEBUG] 20180424-202847-b6fe - IOError: [Errno 2] No such file or directory: '/home/deep/digits/digits/jobs/20180424-202847-b6fe/status.pickle'
[DEBUG] 20180424-203035-3b6a - IOError: [Errno 2] No such file or directory: '/home/deep/digits/digits/jobs/20180424-203035-3b6a/status.pickle'
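For Problem 1, here is a minimal sketch of how I could list the job directories that have no status.pickle. It assumes the jobs live under /home/deep/digits/digits/jobs, as in the log above. The function name list_incomplete_jobs is my own and is not part of DIGITS.

```shell
# List job directories that do not contain a status.pickle file.
# The default path is taken from the log output; pass another path to override.
list_incomplete_jobs() {
    jobs_dir="${1:-/home/deep/digits/digits/jobs}"
    for job in "$jobs_dir"/*/; do
        # Skip the unexpanded glob when the directory is empty or missing.
        [ -d "$job" ] || continue
        # Print only the job ID (directory name) when status.pickle is absent.
        [ -f "${job}status.pickle" ] || basename "$job"
    done
}

list_incomplete_jobs "$@"
```

This only reports the directories; it does not delete or repair anything.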
Problem 2) When I train a model, I get this error output:
[DEBUG] Network sanity check - train
[DEBUG] Network sanity check - val
[DEBUG] Network sanity check - deploy
[INFO ] Train Caffe Model task started.
[INFO ] Task subprocess args: "/home/deep/caffe/build/tools/caffe train --solver=/home/deep/digits/digits/jobs/20180424-213359-b64a/solver.prototxt --gpu=0"
[ERROR] Train Caffe Model: Cannot create cuDNN handle. cuDNN won't be available.
[ERROR] Train Caffe Model: Cannot create cuDNN handle. cuDNN won't be available.
[ERROR] Train Caffe Model: Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM
[ERROR] Train Caffe Model: Cannot create cuDNN handle. cuDNN won't be available.
[ERROR] Train Caffe Model task failed with error code -6
My ~/.bashrc includes the following:
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CAFFE_ROOT=~/caffe
export DIGITS_ROOT=~/digits
export PYTHONPATH=/home/deep/caffe/python:${PYTHONPATH:+:${PYTHONPATH}}
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.111                Driver Version: 384.111                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:01:00.0  On |                  N/A |
| N/A   48C    P2    30W /  N/A |    619MiB /  8111MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1044      G   /usr/lib/xorg/Xorg                           344MiB |
|    0      1828      G   compiz                                       150MiB |
|    0     14169      G   ...-token=17443F43231739145F843E03090635BC   122MiB |
+-----------------------------------------------------------------------------+
$ dpkg -l | egrep 'digits|caffe|libcudnn|libnccl|cudart|nvidia'
ii  cuda-cudart-9-0                         9.0.176-1          amd64 CUDA Runtime native Libraries
ii  cuda-cudart-dev-9-0                     9.0.176-1          amd64 CUDA Runtime native dev links, headers
ii  libcudnn7                               7.1.3.16-1+cuda9.1 amd64 cuDNN runtime libraries
ii  libcudnn7-dev                           7.1.3.16-1+cuda9.1 amd64 cuDNN development libraries and headers
ii  libcudnn7-doc                           7.0.5.15-1+cuda9.0 amd64 cuDNN documents and samples
ii  nvidia-384                              384.111-0ubuntu1   amd64 NVIDIA binary driver - version 384.111
ii  nvidia-384-dev                          384.111-0ubuntu1   amd64 NVIDIA binary Xorg driver development files
ii  nvidia-machine-learning-repo-ubuntu1604 1.0.0-1            amd64 nvidia-machine-learning repository configuration files
ii  nvidia-modprobe                         390.30-0ubuntu1    amd64 Load the NVIDIA kernel driver and create device files
ii  nvidia-opencl-icd-384                   384.111-0ubuntu1   amd64 NVIDIA OpenCL ICD
ii  nvidia-prime                            0.8.2              amd64 Tools to enable NVIDIA's Prime
ii  nvidia-settings                         390.30-0ubuntu1    amd64 Tool for configuring the NVIDIA graphics driver
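One thing I notice in this listing is that libcudnn7 and libcudnn7-dev are tagged +cuda9.1, while my CUDA runtime packages are 9.0. I have not confirmed whether this is related to the cuDNN errors. Below is a rough sketch of how the CUDA tag could be pulled out of the installed package versions for comparison. The helper name cuda_suffix is my own, and it assumes the version strings end in a "+cudaX.Y" tag as they do above.

```shell
# Print the CUDA version tag embedded in a package version string,
# e.g. "7.1.3.16-1+cuda9.1" -> "9.1". Returns 1 if there is no such tag.
cuda_suffix() {
    case "$1" in
        *+cuda*) printf '%s\n' "${1##*+cuda}" ;;
        *) return 1 ;;
    esac
}

# Report the tag for the installed cuDNN packages, if dpkg-query is available.
if command -v dpkg-query >/dev/null 2>&1; then
    for pkg in libcudnn7 libcudnn7-dev; do
        ver=$(dpkg-query -W -f='${Version}' "$pkg" 2>/dev/null) || continue
        printf '%s built for CUDA %s\n' "$pkg" "$(cuda_suffix "$ver" || echo unknown)"
    done
fi
```

The result can then be compared by eye against the CUDA directory used in ~/.bashrc (/usr/local/cuda-9.0).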