ModuleNotFoundError: No module named 'tensorrtserver'

My setup is as follows:

GPU: RTX 2080 Ti
CUDA version: 10.2
Driver: 450.57
cuDNN: libcudnn8_8.0.2.39-1+cuda10.2_amd64
OS: Ubuntu 18.04

#Run the Triton Inference Server
docker pull nvcr.io/nvidia/tritonserver:20.07-v1-py3
docker run --gpus=2 --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v/home/mgsaeed/wd500gb/github/triton-inference-server/docs/examples/model_repository:/models nvcr.io/nvidia/tritonserver:20.07-v1-py3 tritonserver --model-repository=/models
Quick Test: curl -v localhost:8000/v2/health/ready (not working)
Quick Test: curl localhost:8000/api/status (working)
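If I understand the v1 HTTP API correctly, the -v1 server image exposes its health endpoints under /api rather than /v2, so the equivalent readiness check should be (the endpoint name here is my assumption):
Quick Test: curl -v localhost:8000/api/health/ready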

#Inference via clientsdk
#docker pull nvcr.io/nvidia/tritonserver:20.07-py3-clientsdk (not working)
#docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:20.07-py3-clientsdk (not working)

docker pull nvcr.io/nvidia/tensorrtserver:20.02-py3-clientsdk (worked)
docker run -it --rm --net=host nvcr.io/nvidia/tensorrtserver:20.02-py3-clientsdk (worked)
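As a quick sanity check from inside this clientsdk container, the client module should import cleanly (a check I would use; printing __file__ just shows where the package is installed):
python -c "import tensorrtserver; print(tensorrtserver.__file__)"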

Tested using the following examples (both worked from inside the container):
/workspace/install/bin/image_client -m resnet50_netdef -s INCEPTION /workspace/images/mug.jpg
python /workspace/install/python/image_client.py -m resnet50_netdef -s INCEPTION /workspace/images/mug.jpg

After this I did a manual build using the source code from https://github.com/NVIDIA/triton-inference-server/releases/

wget https://github.com/NVIDIA/triton-inference-server/archive/v1.15.0.tar.gz
install all prerequisites as per the Dockerfile.client file
mkdir builddir && cd builddir
cmake -DCMAKE_BUILD_TYPE=Release …/build
make -j8 trtis-clients
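After the build, the client artifacts land in the install tree used below; based on how the clientsdk container is laid out, I am assuming a tensorrtserver wheel should appear next to the Python example (worth checking):
ls ./builddir/trtis-clients/install/bin      # client binaries such as image_client
ls ./builddir/trtis-clients/install/python   # image_client.py and, if produced, a tensorrtserver*.whl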

The build succeeded. However, when testing outside the container (the manual cmake build):
./builddir/trtis-clients/install/bin/image_client -m resnet50_netdef -s INCEPTION qa/images/mug.jpg (works)
but
python ./builddir/trtis-clients/install/python/image_client.py -m resnet50_netdef -s INCEPTION qa/images/mug.jpg (doesn’t work)

and throws this error:
Traceback (most recent call last):
File “./builddir/trtis-clients/install/python/image_client.py”, line 34, in <module>
from tensorrtserver.api import *
ModuleNotFoundError: No module named ‘tensorrtserver’

Upon investigation I have noticed that there is a tensorrtserver Python module available inside the container (nvcr.io/nvidia/tensorrtserver:20.02-py3-clientsdk), which is why it works there, but this module is not available outside the container.
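My understanding is that Dockerfile.client pip-installs the wheel built by the client target, which would explain why the import only works inside the container. Assuming my cmake build produced the same wheel under the install tree (the exact filename is a guess), installing it on the host might make the import work (I have not verified this):
pip3 install --user ./builddir/trtis-clients/install/python/tensorrtserver*.whl
python3 -c "from tensorrtserver.api import *"   # sanity check once the wheel is installed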

Could you please help me with the best way to resolve this dependency? Thanks.

Regards,
Ghazni

Actually, your questions are all related to Triton Server instead of TLT. Please try to follow the Triton user guide to fix your issue.

Thank you Morganh.

Yes, I am following the Triton user guide available at the link below, and I have followed the steps for both the container setup and the cmake build.

https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/quickstart.html

And yes, they are not related to TLT; however, there wasn’t a Triton Inference Server category in the forums, which is why this is posted under TLT.

So any help on how to make the tensorrtserver Python module available for the cmake build would be much appreciated. Many thanks.

Regards,
Ghazni

Please change as below and retry.

docker run --gpus=2 --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v/home/mgsaeed/wd500gb/github/triton-inference-server/docs/examples/model_repository:/models nvcr.io/nvidia/tritonserver:20.07-v1-py3 tritonserver --model-repository=/models

to

docker run --gpus=2 --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v/home/mgsaeed/wd500gb/github/triton-inference-server/docs/examples/model_repository:/models nvcr.io/nvidia/tritonserver:20.07-v1-py3 trtserver --model-repository=/models

Thank you for the reply. I investigated this a bit more by using the same CUDA 10.2 version in the containers as well as in my cmake-based setup. I have tried your suggestion; however, it was not successful. Details below:

Started the server (success):
nvidia-docker run --gpus=2 --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v/home/mgsaeed/wd500gb/inferencing_server/infer_models/model_repository:/models nvcr.io/nvidia/tritonserver:20.03-py3 trtserver --model-repository=/models

Testing
./install/bin/image_client -m resnet50_netdef -s INCEPTION …/qa/images/mug.jpg (success)
Request 0, batch size 1
Image ‘…/qa/images/mug.jpg’:
504 (COFFEE MUG) = 0.723992

./install/bin/image_client -i grpc -u localhost:8001 -m resnet50_netdef -s INCEPTION …/qa/images/mug.jpg (Success)
Request 0, batch size 1
Image ‘…/qa/images/mug.jpg’:
504 (COFFEE MUG) = 0.723992

python3 ./install/python/image_client.py -m resnet50_netdef -s INCEPTION …/qa/images/mug.jpg (Failed)
Traceback (most recent call last):
File “./install/python/image_client.py”, line 34, in <module>
from tensorrtserver.api import *
ModuleNotFoundError: No module named ‘tensorrtserver’

I have followed the steps in Dockerfile.client to build this; however, there is a difference between my build and the Docker clientsdk container. Here is what I mean:

docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:20.03-py3-clientsdk (the 20.03 version, corresponding to the server)

I could run the following example successfully:
python /workspace/install/python/image_client.py -m resnet50_netdef -s INCEPTION /workspace/images/mug.jpg

I noticed that when I go into the Python prompt and run help(“modules”), I receive a clean list of all modules, including the module “tensorrtserver”.

However, in my cmake build on my Ubuntu 18.04 machine (outside the clientsdk container), when I go into the Python prompt and type help(“modules”), I get the following warning:

/usr/lib/python3.6/dist-packages/tensorrt/legacy/infer/__init__.py:5: DeprecationWarning: The infer submodule will been removed in a future version of the TensorRT Python API
You can suppress these warnings by setting tensorrt.legacy._deprecated_helpers.SUPPRESS_DEPRECATION_WARNINGS=True after importing, or setting the TRT_SUPPRESS_DEPRECATION_WARNINGS environment variable to 1
warn_deprecated(“The infer submodule will been removed in a future version of the TensorRT Python API”)

and there is no “tensorrtserver” module in the list produced by the help(“modules”) command.

I don’t know why that is after following the instructions in Dockerfile.client, but it sounds like the reason behind this issue. Thank you.
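For reference, a few checks that should help narrow this down on the host (paths assume the same install tree as above; the wheel name is an assumption):
python3 -m pip show tensorrtserver        # no output would mean the client wheel was never installed
python3 -c "import sys; print(sys.prefix); print(sys.path)"    # confirm which interpreter and search path are in use
ls ./install/python                       # check whether the build produced a tensorrtserver*.whl here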

In short,
When you use nvcr.io/nvidia/tritonserver:20.03-py3-clientsdk directly, there is no issue.
When you build the client with the steps in Dockerfile.client, there is an issue when running “python /workspace/install/python/image_client.py”. How about running “./install/bin/image_client”?

Yes that is correct.

How about running “./install/bin/image_client”?

Yes above works fine.

The problem is only when I run “python /workspace/install/python/image_client.py”.

OK, so there is no blocking issue now. You can use “./install/bin/image_client”.
For “python /workspace/install/python/image_client.py”, I suggest you dig into it further.

Thanks for the confirmation. Just wanted to let you know so that it can be fixed in upcoming releases.

Yes, it is not blocking at the moment. I am working on some other high-priority items.

If you confirm that it is an issue in Triton Server, please create a topic in its forum: https://forums.developer.nvidia.com/c/ai-deep-learning/libraries-sdk/inference-server/97

or ask questions on its GitHub.