I am trying to use the PyTorch NGC container to train a model. My Docker version is 19.03. I pulled the container from NGC and ran it as https://ngc.nvidia.com/catalog/containers/nvidia:pytorch suggests. To train the model I am using the Detectron2 framework. When I ran some of my code, I got the following error:
ImportError: /opt/conda/lib/python3.6/site-packages/detectron2-0.1-py3.6-linux-x86_64.egg/detectron2/_C.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe26detail37_typeMetaDataInstance_preallocated_32E

Any help? Googling this error suggests that the installed torch and torchvision versions do not match. That makes no sense to me, because the container was pulled from NGC and is supposed to come preconfigured with matching versions.
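One way to check the version-mismatch theory is to print what is actually installed inside the container. The following is a minimal sketch, not an official diagnostic: `report_versions` is a hypothetical helper name, and the module list is taken from the traceback above. The undefined symbol lives in the `caffe2` namespace, which ships as part of PyTorch, so a failure on `detectron2._C` while `torch` imports cleanly would be consistent with the Detectron2 extension having been compiled against a different PyTorch build than the one in the container.

```python
import importlib


def report_versions(module_names):
    """Return {module name: version string, or the import error message}."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # ImportError, including undefined-symbol failures
            report[name] = "import failed: {}".format(exc)
        else:
            # Not every module defines __version__ (compiled extensions usually do not).
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    # detectron2._C is the compiled extension named in the traceback.
    modules = ["torch", "torchvision", "detectron2", "detectron2._C"]
    for name, version in report_versions(modules).items():
        print("{:<15} {}".format(name, version))
```

If the versions reported for torch and torchvision look consistent but `detectron2._C` still fails to import, rebuilding Detectron2 from source inside the container (so it compiles against the container's own PyTorch) is probably the next thing to try.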
BTW, workspace/README.md mentions that a conda env named pytorch-35 already exists on the machine, but I only have the base env.