We would like to test the cuOpt library to see whether we can make use of it in our specific use case, but we cannot get it to run because of the specifics of our infrastructure.
We are using an Azure Cloud HPS Kubernetes Cluster with multiple Nvidia A100 GPUs. Our containers are built from the Nvidia container registry; we are currently using this container image
Since we are already running our code in a container, we were wondering whether there is any chance to access separately the Python library that ships with your custom container image from nvcr.io/ea-reopt-member-zone/ea-cuopt?
The only option listed in the setup guide, https://developer.nvidia.com/docs/reopt/python/setup.html, does not work for us.
Thank you in advance,
We do not have an available distribution outside of the container at this time. However, maybe we can find a workaround.
I have not tried any of the things below specifically, but I do have an OpenShift/Kubernetes background, so maybe we can find a way to make this work. Let me try to understand your situation a little better:
Is the CUDA version a hard constraint for you, that is, must you run on CUDA 11.3?
- It sounds like your application is containerized already. Could you use the EA cuOpt image as your base image instead of nvidia/cuda:11.3.0-devel-ubuntu20.04 and rebuild your application?
- Could you create a multistage build using your existing application image as a source and copy your application binaries into a new image based on the EA cuOpt container?
- Could you put your application in a volume and mount it on a cuOpt container so that it would be available inside the container?
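For the multistage option, a rough, untested sketch of what the Dockerfile might look like follows. The application image name `my-registry/my-app:latest` and the path `/opt/app` are hypothetical placeholders, and the `v0.1` tag is assumed to be the EA cuOpt tag you have access to; adjust all three to your setup.

```dockerfile
# Stage 1: the existing application image, used only as a source of files.
# "my-registry/my-app:latest" is a hypothetical placeholder name.
FROM my-registry/my-app:latest AS app

# Stage 2: the EA cuOpt image becomes the runtime base.
# The v0.1 tag is an assumption; use the tag available to you.
FROM nvcr.io/ea-reopt-member-zone/ea-cuopt:v0.1

# Copy the application out of the first stage.
# /opt/app is a placeholder for wherever your application lives.
COPY --from=app /opt/app /opt/app
WORKDIR /opt/app
```

Note that this only carries over the files you copy explicitly. Any Python or system packages installed in the first image would still have to be installed again in the cuOpt-based stage.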
Thank you for the quick reply.
- No, this version is simply one of the most frequently used, which is why we chose it. I do not think there would be any conflicts if we shifted to a previous minor release (like 11.2); as long as it is >11, it should be fine.
The 1st option sounds like something we should try.
Do you have any recommendations and/or steps where we should start?
I will investigate this a bit.
Aside from changing the base image in the FROM to nvcr.io/ea-reopt-member-zone/ea-cuopt:v0.1, I think it will be necessary to install all of your Python dependencies into the “cuopt” conda env inside the image (since that is where the cuopt packages and library are installed).
Let me look into how to work with that conda environment from a Dockerfile.
Then of course any additional Ubuntu packages you are installing in your current image will need to be added too.
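As a starting point, the pieces above could be combined roughly as in the sketch below. This is untested against the EA image and rests on several assumptions: that the `conda` executable is on the PATH in the base image, that the build runs as a user allowed to call `apt-get`, and that the conda version supports `conda run`. The package names `git` and `curl`, the file `requirements.txt`, the path `/opt/app`, and the entry point `main.py` are hypothetical placeholders.

```dockerfile
# Base image swapped from nvidia/cuda:11.3.0-devel-ubuntu20.04 to EA cuOpt.
FROM nvcr.io/ea-reopt-member-zone/ea-cuopt:v0.1

# Additional Ubuntu packages from the current image.
# git and curl are placeholders; list the packages you actually need.
RUN apt-get update \
    && apt-get install -y --no-install-recommends git curl \
    && rm -rf /var/lib/apt/lists/*

# Install the application's Python dependencies into the "cuopt" conda env.
# Assumption: conda is on PATH in the base image.
COPY requirements.txt /tmp/requirements.txt
RUN conda run -n cuopt python -m pip install --no-cache-dir -r /tmp/requirements.txt

# Add the application itself (placeholder path).
COPY . /opt/app
WORKDIR /opt/app

# Start the application inside the same env so the cuopt packages are importable.
# main.py is a placeholder for the real entry point.
CMD ["conda", "run", "--no-capture-output", "-n", "cuopt", "python", "main.py"]
```

Using `conda run -n cuopt` for both the install step and the start command keeps everything in the one environment, rather than relying on shell activation, which does not carry over between Dockerfile instructions.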
Thanks for the reply.
We are now setting up the container with the cuOpt image; we will see where that leads us.