I’m trying to run inference over gRPC between two Docker containers on the same machine (Ubuntu 24.04 on WSL2).
One container runs Triton Inference Server and the other runs the DeepStream Python Apps; both images are deepstream:8.0-triton-multiarch, and the app I’m running is deepstream-test3.
CUDA buffer sharing is enabled, but I encounter CUDA IPC errors on both sides.
I have set enable_cuda_buffer_sharing: true in config_triton_grpc_infer_primary_peoplenet.txt and I got the following errors:
Triton side error: "failed to open CUDA IPC handle: invalid resource handle"
DeepStream side errors:
INFO: TritonGrpcBackend id:1 initialized for model: peoplenet
ERROR: Failed to register CUDA shared memory.
ERROR: Failed to set inference input: failed to register shared memory region: invalid args
ERROR: gRPC backend run failed to create request for model: peoplenet
According to the documentation, CUDA buffer sharing should work when both processes are on the same machine, so I am trying to identify what is wrong.
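Since CUDA IPC handles can only be opened by processes that share the same IPC namespace, one sanity check is whether --ipc=host/--pid=host actually took effect in both containers. A minimal Linux-only sketch (the helper name ns_id is mine) — run it inside each container and compare the printed values, which should match:

```python
import os

def ns_id(pid="self", ns="ipc"):
    """Return a namespace identifier string like 'ipc:[4026531839]'.

    Two processes can exchange CUDA IPC handles only if their 'ipc'
    namespace identifiers are equal.
    """
    return os.readlink(f"/proc/{pid}/ns/{ns}")

# Print this in both containers; with --ipc=host and --pid=host the
# values should be identical across the Triton and DeepStream containers.
print("ipc:", ns_id())
print("pid:", ns_id(ns="pid"))
```

Also possibly relevant: NVIDIA's CUDA-on-WSL documentation has historically listed CUDA IPC as unsupported under WSL2, so the environment itself may be a factor here.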
• Hardware Platform: GeForce RTX 3050 6GB (laptop)
• DeepStream Version: 8.0 (deepstream:8.0-triton-multiarch)
• JetPack Version (valid for Jetson only): N/A
• TensorRT Version: 10.9.0.34-1+cuda12.8
• NVIDIA GPU Driver Version: 581.80
• Issue Type: question / possible bug
• How to reproduce the issue: see the steps below (sample app, configuration file, and command lines)
Triton container:
docker run --name triton_from_ds --gpus '"device=0"' -it --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 --ipc=host --pid=host --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -v ${PWD}/model_repository:/models -v ${PWD}:/workspace deepstream-python-8.0 bash
Download and export PeopleNet to TensorRT as in the app README (deepstream_python_apps/apps/deepstream-test3/README at master · NVIDIA-AI-IOT/deepstream_python_apps · GitHub). I used the config.pbtxt provided.
Model repository layout:
- /model_repository/peoplenet/
    - model.plan (exported as in the README linked above)
    - config.pbtxt (same as the repo)
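In case it matters, here is a sketch of what that config.pbtxt contains (the tensor names and dims are assumptions based on a typical PeopleNet/DetectNet_v2 export, not copied from my file; check them against the actual model.plan before trusting this):

```
name: "peoplenet"
platform: "tensorrt_plan"
max_batch_size: 1
input [
  {
    name: "input_1:0"
    data_type: TYPE_FP32
    dims: [ 3, 544, 960 ]
  }
]
output [
  {
    name: "output_cov/Sigmoid:0"
    data_type: TYPE_FP32
    dims: [ 3, 34, 60 ]
  },
  {
    name: "output_bbox/BiasAdd:0"
    data_type: TYPE_FP32
    dims: [ 12, 34, 60 ]
  }
]
```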
Run the server with:
tritonserver --model-repository=/models
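Before starting the DeepStream side, I confirm the server is actually up. A minimal stdlib-only sketch (the helper name triton_ready is mine; 8000 is Triton's default HTTP port and the /v2/health/ready endpoint is part of its standard HTTP API):

```python
import urllib.request
import urllib.error

def triton_ready(url="http://localhost:8000/v2/health/ready", timeout=2.0):
    """Return True if Triton's HTTP health endpoint reports ready."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

print("Triton ready:", triton_ready())
```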
DeepStream container:
docker run --name deepstream --gpus '"device=0"' -it --rm --network=host --ipc=host --pid=host --privileged --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY -w /opt/nvidia/deepstream/deepstream-8.0 deepstream-python-8.0 bash
Follow deepstream_python_apps/apps/deepstream-test3/README at master · NVIDIA-AI-IOT/deepstream_python_apps · GitHub to download the PeopleNet labels.txt. I have the same setup.
In the DeepStream container:
cd /opt/nvidia/deepstream/deepstream/sources/deepstream_python_apps/apps/deepstream-test3
python3 deepstream_test_3.py \
    -i file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_office.mp4 \
    --pgie nvinferserver-grpc \
    -c config_triton_grpc_infer_primary_peoplenet.txt
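For reference, the gRPC section of config_triton_grpc_infer_primary_peoplenet.txt looks roughly like this (a sketch of the nvinferserver protobuf-text format; the url and the other values are assumptions from my setup, and most fields are trimmed):

```
infer_config {
  unique_id: 1
  gpu_ids: [0]
  backend {
    triton {
      model_name: "peoplenet"
      version: -1
      grpc {
        url: "127.0.0.1:8001"
        enable_cuda_buffer_sharing: true
      }
    }
  }
}
```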
Note: deepstream-python-8.0 is just deepstream:8.0-triton-multiarch with the Python bindings installed. You can build the same image with this Dockerfile:
# Base image
FROM nvcr.io/nvidia/deepstream:8.0-triton-multiarch
# Set environment variables
ENV DEBIAN_FRONTEND=noninteractive
ENV DS_PYTHON_APPS_PATH=/opt/nvidia/deepstream/deepstream/sources/deepstream_python_apps
# 1. Install System Dependencies
RUN apt-get update && apt-get install -y \
python3-gi \
python3-dev \
python3-gst-1.0 \
python3-venv \
python3-pip \
git \
wget \
libgstrtspserver-1.0-0 \
gstreamer1.0-rtsp \
libgirepository1.0-dev \
gobject-introspection \
gir1.2-gst-rtsp-server-1.0 \
ffmpeg \
&& rm -rf /var/lib/apt/lists/*
# 2. Clone DeepStream Python Apps Repository
WORKDIR /opt/nvidia/deepstream/deepstream/sources/
RUN git clone https://github.com/NVIDIA-AI-IOT/deepstream_python_apps.git
# 3. Setup Virtual Environment (with system packages)
# We create it in a standard location
ENV VIRTUAL_ENV=/opt/pyds
RUN python3 -m venv --system-site-packages $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
# 4. Install Python Dependencies
RUN pip3 install cuda-python
# 5. Download and Install Prebuilt Bindings Wheel
# download the specific wheel for DeepStream 8.0 / Python 3.12 / x86_64
# Note: Update the URL if a newer version is released.
WORKDIR $DS_PYTHON_APPS_PATH
RUN wget https://github.com/NVIDIA-AI-IOT/deepstream_python_apps/releases/download/v1.2.2/pyds-1.2.2-cp312-cp312-linux_x86_64.whl \
&& pip3 install pyds-1.2.2-cp312-cp312-linux_x86_64.whl \
&& rm pyds-1.2.2-cp312-cp312-linux_x86_64.whl
# 6. Set Working Directory to Test App 1
WORKDIR $DS_PYTHON_APPS_PATH/apps/deepstream-test1
CMD ["/bin/bash"]