I am running DIGITS (in Docker) as an inference server through its API, with port 5050 assigned to it. I would like to use the two other GPUs in the system for inference as well, but the cURL call format has no GPU selection parameter. I think I would need to change some of the inference code so that each Docker container permanently runs on a different GPU.
Which scripts, and which lines, would I need to modify in DIGITS for this? I would appreciate any guidance.
If you plan to pin each container to one GPU, you can add -e NVIDIA_VISIBLE_DEVICES=0 when you launch the container, so no DIGITS code needs to change. For example,
docker run -it --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 -p 5000:5000 $DIGITS_DOCKER_IMAGE
and the second DIGITS container instance, pinned to GPU 1 and published on host port 5001, can be
docker run -it --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=1 -p 5001:5000 $DIGITS_DOCKER_IMAGE
For more information about GPU isolation in docker, please check https://github.com/NVIDIA/nvidia-docker/wiki/Frequently-Asked-Questions#i-have-multiple-gpu-devices-how-can-i-isolate-them-between-my-containers.
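If you have more than two GPUs, the same pattern can be generated in a loop. The following is only a rough sketch: NUM_GPUS, BASE_PORT, the digits_cmd helper and the placeholder image name are assumptions for illustration, and it uses -d (detached) instead of -it so that several containers can be started one after another. It prints the commands rather than running them, so you can review them first.

```shell
#!/bin/sh
# Sketch: print one `docker run` command per GPU.
# GPU N is published on host port BASE_PORT + N (container port stays 5000).
# All defaults below are assumptions; override them in the environment.
DIGITS_DOCKER_IMAGE="${DIGITS_DOCKER_IMAGE:-your-digits-image}"  # placeholder name
NUM_GPUS="${NUM_GPUS:-3}"
BASE_PORT="${BASE_PORT:-5000}"

# Hypothetical helper: build the command line for a single GPU index.
digits_cmd() {
    gpu="$1"
    port=$((BASE_PORT + gpu))
    echo "docker run -d --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=$gpu -p $port:5000 $DIGITS_DOCKER_IMAGE"
}

gpu=0
while [ "$gpu" -lt "$NUM_GPUS" ]; do
    digits_cmd "$gpu"
    gpu=$((gpu + 1))
done
```

Once the printed commands look right, you could pipe the output to sh to launch the containers, then send each cURL request to the port of the GPU you want to use.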
Thank you very much, this is a lot better than modifying the code.