Description
I am trying to run the notebook examples from the official torch-tensorrt repository, and I get warnings when calling torch_tensorrt.compile.
Environment
I am using the following container:
nvcr.io/nvidia/pytorch:25.01-py3
and I am running everything on H100 GPUs.
Relevant Files
I am encountering issues in most examples from this folder: TensorRT/notebooks at main · pytorch/TensorRT
Steps To Reproduce
- Creating the container:
  docker run --gpus "device=5" -it -d -v ./Nvidia:/workspace/Nvidia --net=host --ipc=host nvcr.io/nvidia/pytorch:25.01-py3
  I followed the documentation of the official repository and also tried removing the flags --ulimit memlock=-1 --ulimit stack=67108864, but it had no impact.
- Accessing the notebooks:
  jupyter notebook --allow-root --ip 0.0.0.0 --port 8888 --no-browser
- Running the notebooks:
  I get these warnings when executing torch_tensorrt.compile, yet nvtop shows that plenty of GPU memory is still free. After a while, the notebook crashes. A rough sketch of the compile call is shown below.
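This is a minimal sketch of the kind of call the notebooks make (the ResNet model and input shape are illustrative assumptions, not the exact notebook code):

    import torch
    import torch_tensorrt
    import torchvision.models as models

    # Assumed example model; the notebooks use various torchvision models.
    model = models.resnet50(weights=None).eval().cuda()

    # The warnings appear during this call, while nvtop still reports plenty of free GPU memory.
    trt_model = torch_tensorrt.compile(
        model,
        inputs=[torch_tensorrt.Input((1, 3, 224, 224), dtype=torch.float32)],
        enabled_precisions={torch.float32},
    )

    with torch.no_grad():
        out = trt_model(torch.randn(1, 3, 224, 224, device="cuda"))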
However, I don't have these issues when using the Python API of TensorRT with the following container:
nvcr.io/nvidia/tensorrt:25.01-py3
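For comparison, the TensorRT Python API path that works for me in that container looks roughly like this (a sketch assuming an ONNX export of the model; "model.onnx" is a placeholder, not a file from the notebooks):

    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)  # explicit-batch network (TensorRT 10)
    parser = trt.OnnxParser(network, logger)

    # "model.onnx" is a placeholder for an exported version of the same model.
    with open("model.onnx", "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))

    config = builder.create_builder_config()
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB workspace
    engine = builder.build_serialized_network(network, config)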
Here are some potential solutions to address the warnings and crashes you are seeing while running the notebook examples from the official torch-tensorrt repository in the nvcr.io/nvidia/pytorch:25.01-py3 container on H100 GPUs:
- Update Software: Ensure that you are using the latest version of torch-tensorrt, as updates may include fixes for known issues related to H100 GPUs.
- Compatibility Check: Verify the compatibility of the nvcr.io/nvidia/pytorch:25.01-py3 container with the version of torch-tensorrt you are using (see the version-check sketch after this list).
- Upgrade Containers: If the warnings and crashes continue, consider updating both torch-tensorrt and the container image to the most recent versions to pick up any bug fixes.
- Documentation Review: Consult the official documentation for any specific guidance or best practices regarding the use of torch-tensorrt with H100 GPUs.
- Community Support: If the issues persist, you may want to reach out to support channels or community forums associated with torch-tensorrt or NVIDIA for further assistance.
These steps should help you troubleshoot and mitigate the issues while using torch-tensorrt in your current environment.
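As a starting point for the compatibility check above, here is a minimal sketch that prints the relevant versions inside the container (standard module names are assumed):

    import torch
    import torch_tensorrt
    import tensorrt

    print("PyTorch:        ", torch.__version__)
    print("CUDA (torch):   ", torch.version.cuda)
    print("GPU:            ", torch.cuda.get_device_name(0))
    print("TensorRT:       ", tensorrt.__version__)
    print("Torch-TensorRT: ", torch_tensorrt.__version__)

Comparing these values against the torch-tensorrt release notes, and against the nvcr.io/nvidia/tensorrt:25.01-py3 container that works for you, should help narrow down whether a version mismatch is involved.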