Crash when changing batch size on 2080Ti

Description

I am running the same model on a 1080Ti and a 2080Ti. Everything works on the 1080Ti, but the 2080Ti raises an error when the batch size is changed between calls to context->execute(batchsize, buffer):

ERROR: …/rtSafe/cuda/caskFullyConnectedRunner.cpp (288) - Cuda Error in execute: 1 (invalid argument)
ERROR: FAILED_EXECUTION: std::exception

Environment

TensorRT Version : 7.0.0.11
GPU Type : 1080Ti and 2080Ti
Nvidia Driver Version : 440
CUDA Version : 10.2
CUDNN Version : 7.6.5
Operating System + Version : Ubuntu 16.04
Python Version (if applicable) :
TensorFlow Version (if applicable) :
PyTorch Version (if applicable) :
Baremetal or Container (if container which image + tag) :

Relevant Files

https://drive.google.com/drive/folders/1BAcKVbOEt3DYL_k7IMdubn1TBM2Jbnhq?usp=sharing

Steps To Reproduce

  1. Load the model from the link above.
  2. Set its maxBatchSize to 10.
  3. Run inference with batch size 1 first.
  4. Then run inference with batch size 2; it crashes (see the sketch below).
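
For reference, the call pattern from steps 3 and 4 looks roughly like the sketch below (a minimal sketch only; engine creation and buffer allocation are omitted, and runTwice/deviceBuffers are hypothetical names):

    #include "NvInfer.h"

    // Sketch of the failing pattern, assuming the engine was built with
    // builder->setMaxBatchSize(10) and deviceBuffers holds the bound device memory.
    void runTwice(nvinfer1::IExecutionContext* context, void** deviceBuffers)
    {
        // First call with batch size 1 succeeds on both GPUs.
        context->execute(1, deviceBuffers);

        // Changing the batch size to 2 on the next call fails with
        // "Cuda Error in execute: 1 (invalid argument)" on the 2080Ti only.
        context->execute(2, deviceBuffers);
    }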

I am not able to reproduce the issue on my setup with TRT 7 and an RTX 2060 GPU.
Could you please share the script along with a verbose log so we can help better?

Meanwhile, please try running the Caffe model with the “trtexec” command-line tool in verbose mode for debugging:
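
A possible invocation (the prototxt/caffemodel file names and the output blob name are placeholders, not taken from the report):

    trtexec --deploy=model.prototxt --model=model.caffemodel --output=prob --batch=2 --verbose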

Thanks


I will try it, thanks.

I have tried sampleMNIST. If I call cudaStreamCreate(&stream) in the SampleMNIST constructor, the problem is reproduced; otherwise it works fine.
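
For clarity, this is roughly the change that triggers it (a sketch only; the mStream member is my addition, assuming the stream is otherwise created later in infer() as in the shipped sample):

    // sampleMNIST.cpp (sketch): creating the CUDA stream in the constructor,
    // i.e. before the engine is built, reproduces the crash on the 2080Ti.
    class SampleMNIST
    {
    public:
        SampleMNIST(const samplesCommon::CaffeSampleParams& params)
            : mParams(params)
        {
            cudaStreamCreate(&mStream); // moved here from infer(); this triggers the failure
        }

    private:
        samplesCommon::CaffeSampleParams mParams;
        cudaStream_t mStream{nullptr};
        // ... rest of the sample unchanged ...
    };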

You can reproduce the problem with the sampleMNIST.cpp provided at https://drive.google.com/drive/folders/1BAcKVbOEt3DYL_k7IMdubn1TBM2Jbnhq?usp=sharing

The issue is fixed, and the fix should be available in the next TRT release.
Please stay tuned for the TRT release announcement.

Thanks

Glad to hear that. Please release it as soon as possible, since the 2080Ti is the replacement for the 1080Ti, which is widely used.