AWS K80 Docker

Hi everyone… I installed DIGITS on my AWS cloud server with a K80 GPU, but when I try to train the model, this message appears:

Started BatchTransformer thread 87
Loading mean file from: /workspace/jobs/20190818-080628-28c8/train_db/mean.binaryproto
Loading mean file from: /workspace/jobs/20190818-080628-28c8/train_db/mean.binaryproto
Loading mean file from: /workspace/jobs/20190818-080628-28c8/train_db/mean.binaryproto
Data Reader threads: 3, out queues: 12, depth: 10
{0} Starting 3 internal thread(s) on device 0
Started internal thread 91 on device 0, rank 0
Opened lmdb /workspace/jobs/20190818-080628-28c8/train_db/features
Started internal thread 92 on device 0, rank 0
Opened lmdb /workspace/jobs/20190818-080628-28c8/train_db/features
Started internal thread 93 on device 0, rank 0
Opened lmdb /workspace/jobs/20190818-080628-28c8/train_db/features
Output data size: 10, 3, 384, 1248
Parser threads: 3 (auto)
Transformer threads: 4 (auto)
Started internal thread 78 on device 0, rank 0
Started internal thread 79 on device 0, rank 0
Started internal thread 82 on device 0, rank 0
Started internal thread 80 on device 0, rank 0
Check failed: error == cudaSuccess (209 vs. 0) no kernel image is available for execution on the device

I have CUDA 10.1 installed with NVIDIA driver 418:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 00000000:00:1E.0 Off |                    0 |
| N/A   64C    P0    58W / 149W |     92MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1849      C   python                                         81MiB |
+-----------------------------------------------------------------------------+

Any ideas?

The K80 has CUDA compute capability 3.7. Error 209 indicates that the underlying framework was compiled without 3.7 support; for example, nvcaffe was built for compute capability 5.2 and above.
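You can check which GPU architectures a given caffe build actually contains with the CUDA toolkit's cuobjdump tool. The library path below is only a guess at where a DIGITS container keeps it; locate the real file first.

```shell
# Find the nvcaffe shared library inside the container (path varies by image).
find / -name 'libcaffe*' 2>/dev/null

# List the sm_XX architectures embedded in the binary.
# If sm_37 is absent, the build cannot launch kernels on a K80.
cuobjdump --list-elf /usr/local/lib/libcaffe-nv.so | grep -o 'sm_[0-9]*' | sort -u
```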

Please consider launching DIGITS on a different AWS GPU instance type (for example, a P3 instance). If that is not an option for you, you will need to build nvcaffe with Kepler support and run DIGITS on top of it.
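A minimal sketch of such a rebuild is below. It assumes the NVCaffe CMake build exposes the usual Caffe architecture variables (`CUDA_ARCH_NAME`, `CUDA_ARCH_BIN`); the exact variable names can differ between NVCaffe releases, so check cmake/Cuda.cmake in the tree you clone.

```shell
# Sketch: rebuild NVCaffe with Kepler (compute capability 3.7) kernels.
git clone https://github.com/NVIDIA/caffe.git nvcaffe
cd nvcaffe
mkdir build && cd build

# Select the target architectures manually instead of auto-detection,
# including 3.7 for the K80.
cmake -DCUDA_ARCH_NAME=Manual \
      -DCUDA_ARCH_BIN="37" \
      -DCUDA_ARCH_PTX="37" ..
make -j"$(nproc)"
```

Dependencies (protobuf, boost, cuDNN, and so on) must already be installed for the configure step to succeed.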

Hi!
I am having the same problem using the DIGITS Docker image (nvcr.io/nvidia/digits:19.02-caffe).
When I try to train, I get the following error:
Copying source layer inception_5b/relu_5x5_reduce Type:ReLU #blobs=0
Copying source layer inception_5b/5x5 Type:Convolution #blobs=2
Copying source layer inception_5b/relu_5x5 Type:ReLU #blobs=0
Copying source layer inception_5b/pool Type:Pooling #blobs=0
Copying source layer inception_5b/pool_proj Type:Convolution #blobs=2
Copying source layer inception_5b/relu_pool_proj Type:ReLU #blobs=0
Copying source layer inception_5b/output Type:Concat #blobs=0
Ignoring source layer pool5/7x7_s1
Ignoring source layer pool5/drop_7x7_s1
Ignoring source layer loss3/classifier
Ignoring source layer loss3/loss3
Starting Optimization
Solving Learning Rate Policy: exp
Reserving 23918336 bytes of shared learnable space for type FLOAT
Initial Test started…
Iteration 0, Testing net (#0)
Ignoring source layer train_data
Ignoring source layer train_label
Ignoring source layer train_transform
Check failed: error == cudaSuccess (209 vs. 0) no kernel image is available for execution on the device

My CUDA installation is as follows:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 435.21       Driver Version: 435.21       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce 840M        Off  | 00000000:03:00.0 Off |                  N/A |
| N/A   43C    P0    N/A /  N/A |    446MiB /  2004MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

I don’t know whether I should also build nvcaffe with Kepler support and run DIGITS on top of it. If so, how should I do it?