Hi!
I cannot run TensorRT 2 inference with INT8 precision inside Docker; FP32 and FP16 work fine. Whenever I use INT8, I get the error: `ERROR LAUNCHING INT8-to-INT8 GEMM: 8`.
My system is the following:
HOST:
- Eight P100 GPUs.
- OS: Red Hat Enterprise Linux 7.3
- NVIDIA driver: 367.48
- CUDA: 8.0.44
DOCKER:
- Docker version 17.03.1-ce, build c6d412e
- NVIDIA Docker version 1.0.1-1
- Image derived from 'nvidia/cuda:8.0-cudnn5-devel-ubuntu16.04'.
- CUDA: 8.0.61
I run the standard samples located in /usr/src/gie_samples:
/usr/src/gie_samples/samples# ./bin/giexec --deploy=./data/samples/googlenet/googlenet.prototxt --output=prob --batch=2 --device=0 --iterations=10 --int8
Thanks!