I am currently trying to bootstrap a PyTorch/ONNX/TensorRT container to serve some models for inference. Specifically, I am exploring serving PyTorch models exported to ONNX through ONNX Runtime with the TensorRT Execution Provider.
I was wondering if there is a way to find out exactly how these containers are built. Are there Dockerfiles somewhere that show exactly what goes into each image?
For example, looking at the nvcr.io/nvidia/pytorch:21.03-py3 container, it seems to come with a bunch of things I do not need. One concrete issue is that it ships with onnxruntime preinstalled, whereas I need my own onnxruntime-gpu-tensorrt build installed (which I compiled from source using NVIDIA’s TensorRT container as a base).
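To make the goal concrete, here is a rough sketch of the kind of minimal image I am after — based on the TensorRT container rather than the PyTorch one, with my own wheel installed. The wheel filename is a placeholder for whatever my source build produces, not an actual release artifact:

```dockerfile
# Sketch: start from the TensorRT base image instead of the full PyTorch one.
FROM nvcr.io/nvidia/tensorrt:21.03-py3

# Remove any preinstalled onnxruntime so it cannot shadow the custom build.
RUN pip uninstall -y onnxruntime onnxruntime-gpu || true

# Install the wheel produced by compiling onnxruntime from source with
# TensorRT support (copied into the build context beforehand).
# "onnxruntime_gpu_tensorrt.whl" is a hypothetical filename.
COPY onnxruntime_gpu_tensorrt.whl /tmp/
RUN pip install /tmp/onnxruntime_gpu_tensorrt.whl
```

What I would like to know is whether the upstream Dockerfiles exist somewhere so I can see which of the remaining pieces (CUDA libraries, cuDNN, etc.) the official image adds on top of a base like this.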
I have looked inside the containers and found sample Dockerfiles, but they only show how to either patch and recompile PyTorch, or how to add extra software on top of the container.