I have multiple deployed Docker images based on nvidia/cuda:11.8.0-devel-ubuntu22.04. The latest push to this base image tag has caused my images to stop working.
I have two separate problems, both of which occur only in images built after the push on the 10th. Images built before that (from the same Dockerfile) still work.
The first issue: my images no longer run on AWS Kubernetes nodes using the NVIDIA AMI. Kubernetes reports an error saying the CUDA version needs to be updated to 11.8 or later. The image does contain CUDA 11.8, and I've verified that, but something in the new build is causing a compatibility issue with the AMI.
The second issue involves one of the Python libraries I use, PyVista. PyVista provides a Python API to the VTK C++ library for mesh manipulation and visualization. One of its built-in functions (it computes a 2D slice from a 3D mesh) throws a warning and returns incorrect results on the new image. I've verified it's the image rather than my code: running the exact same Python code in the old and new images gives different results. A minimal sketch of the kind of call that fails is below.
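To make the second issue concrete, here is a rough stand-in for the failing call. The Sphere() mesh is just a placeholder (my real data is loaded from disk), and I'm assuming PyVista's standard slice() filter here as a proxy for the function in my pipeline:

```python
import pyvista as pv

# Placeholder mesh; the real pipeline loads a 3D mesh from file.
mesh = pv.Sphere()

# Plane slice through the 3D mesh, normal to the z-axis.
# On the new image this kind of call emits a VTK warning and the
# resulting slice geometry is wrong; on the old image it is correct.
slice_2d = mesh.slice(normal="z")

print(slice_2d.n_points, slice_2d.n_cells)
```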
I found that the images pushed last week include gcc 11.4, whereas the older images have gcc 11.3. I suspect this is one source of the problem.
I've lost four days trying to figure out why none of my new Docker images work. I would like to just use the old version of nvidia/cuda:11.8.0-devel-ubuntu22.04, but it's no longer available on either Docker Hub or NGC. Is there any way I can get the old image?
Has anyone else had issues with the new image?