Error -11 training DetectNet example: https://github.com/NVIDIA/DIGITS/tree/master/examples/object-detection

Hello, I am following the DetectNet object-detection example linked in the title.

My hardware setup:
Intel(R) Core™ i7-7700 CPU @ 3.60GHz
NVIDIA Driver Version: 384.130
4 x NVIDIA GeForce GTX 1070 Ti 8 GB

Installed DIGITS version:
6.1.1
Caffe version:
0.15.14

For the NVcaffe installation I followed this guide:
https://github.com/NVIDIA/DIGITS/blob/master/docs/BuildCaffe.md
After building, make runtest passed 100% of the tests.

When I run training for DetectNet, I get the following error:

ERROR: error code -11
Setting up coverage_loss
Top shape: (1)
with loss weight 1
Memory required for data: 1274879368
Creating layer cluster

From caffe_output.log :

I0717 10:31:13.465857 563 net.cpp:159] Memory required for data: 1274879364
I0717 10:31:13.465859 563 layer_factory.hpp:77] Creating layer coverage_loss
I0717 10:31:13.465863 563 net.cpp:94] Creating Layer coverage_loss
I0717 10:31:13.465867 563 net.cpp:435] coverage_loss <- coverage_coverage/sig_0_split_0
I0717 10:31:13.465868 563 net.cpp:435] coverage_loss <- coverage-label_slice-label_4_split_0
I0717 10:31:13.465873 563 net.cpp:409] coverage_loss -> loss_coverage
I0717 10:31:13.465911 563 net.cpp:144] Setting up coverage_loss
I0717 10:31:13.465915 563 net.cpp:151] Top shape: (1)
I0717 10:31:13.465916 563 net.cpp:154] with loss weight 1
I0717 10:31:13.465919 563 net.cpp:159] Memory required for data: 1274879368
I0717 10:31:13.465921 563 layer_factory.hpp:77] Creating layer cluster
*** Aborted at 1531816275 (unix time) try "date -d @1531816275" if you are using GNU date ***
PC: @ 0x7f903acee873 std::_Hashtable<>::clear()
*** SIGSEGV (@0x9) received by PID 563 (TID 0x7f92dae5a0c0) from PID 9; stack trace: ***
@ 0x7f92d8ab94b0 (unknown)
@ 0x7f903acee873 std::_Hashtable<>::clear()
@ 0x7f903ace0346 google::protobuf::DescriptorPool::FindFileByName()
@ 0x7f903acbeac8 google::protobuf::python::cdescriptor_pool::AddSerializedFile()
@ 0x7f92d96e49f0 PyEval_EvalFrameEx
@ 0x7f92d981a05c PyEval_EvalCodeEx
@ 0x7f92d977046d (unknown)
@ 0x7f92d9743273 PyObject_Call
@ 0x7f92d9763b75 (unknown)
@ 0x7f92d96fa173 (unknown)
@ 0x7f92d9743273 PyObject_Call
@ 0x7f92d96e135c PyEval_EvalFrameEx
@ 0x7f92d981a05c PyEval_EvalCodeEx
@ 0x7f92d96dbda9 PyEval_EvalCode
@ 0x7f92d977d244 PyImport_ExecCodeModuleEx
@ 0x7f92d977dc1f (unknown)
@ 0x7f92d977f390 (unknown)
@ 0x7f92d977f658 (unknown)
@ 0x7f92d978076b PyImport_ImportModuleLevel
@ 0x7f92d96ea8b8 (unknown)
@ 0x7f92d9743273 PyObject_Call
@ 0x7f92d9819487 PyEval_CallObjectWithKeywords
@ 0x7f92d96df7e6 PyEval_EvalFrameEx
@ 0x7f92d981a05c PyEval_EvalCodeEx
@ 0x7f92d96dbda9 PyEval_EvalCode
@ 0x7f92d977d244 PyImport_ExecCodeModuleEx
@ 0x7f92d977dc1f (unknown)
@ 0x7f92d977f390 (unknown)
@ 0x7f92d977f658 (unknown)
@ 0x7f92d978076b PyImport_ImportModuleLevel
@ 0x7f92d96ea8b8 (unknown)
@ 0x7f92d9743273 PyObject_Call

Any suggestions would be appreciated.
Thanks in advance,
Marco Gonnelli

Can you run the same experiment on nvcr.io/nvidia/digits:19.05-caffe?

I have encountered the same issue and wasn’t able to resolve it.
As a quick workaround, I removed the Python layers at the end of the network.
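
For anyone who wants to try the same workaround, here is a minimal sketch of how the Python layers could be stripped from a network prototxt. It is not taken from DIGITS or Caffe: the helper name strip_python_layers and the sample network below are made up for illustration. It works on the prototxt as plain text (so it does not import caffe or protobuf, which is where the crash above occurs) and assumes no braces appear inside string values or comments. It also assumes nothing else in the network consumes the outputs of the removed layers. Note that the outputs those layers produce will no longer be available after removing them.

```python
import re

# Matches the start of a top-level `layer { ... }` block in a Caffe prototxt.
LAYER_START = re.compile(r'\blayer\s*\{')
# Matches a `type: "Python"` (or 'Python') field inside a layer block.
PYTHON_TYPE = re.compile(r'\btype\s*:\s*["\']Python["\']')


def strip_python_layers(prototxt: str) -> str:
    """Return the prototxt text with every layer of type "Python" removed.

    Hypothetical helper: plain-text brace matching, not a real prototxt parser.
    """
    kept = []
    pos = 0
    n = len(prototxt)
    while True:
        match = LAYER_START.search(prototxt, pos)
        if match is None:
            kept.append(prototxt[pos:])  # trailing text after the last layer
            break
        # Walk forward to the brace that closes this layer block.
        depth = 1
        end = match.end()
        while end < n and depth > 0:
            if prototxt[end] == '{':
                depth += 1
            elif prototxt[end] == '}':
                depth -= 1
            end += 1
        block = prototxt[match.start():end]
        kept.append(prototxt[pos:match.start()])  # text before the block
        if not PYTHON_TYPE.search(block):
            kept.append(block)  # keep every non-Python layer unchanged
        pos = end
    return ''.join(kept)


# Illustrative input only; module and layer names are placeholders.
sample = '''
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
}
layer {
  name: "cluster"
  type: "Python"
  bottom: "conv1"
  top: "bbox-list"
  python_param {
    module: "my_module"
    layer: "MyLayer"
  }
}
'''

cleaned = strip_python_layers(sample)
print(cleaned)
```

The cleaned text can then be pasted into the custom network field in place of the original definition.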