Description
I used the BERT demo from the TensorRT GitHub repo (link here).
My BERT model has two outputs with different shapes. I marked each of them with network.mark_output() so that both become engine outputs, and the engine builds successfully:
[TensorRT] INFO: Detected 3 inputs and 2 output network tensors.
[TensorRT] INFO: Detected 3 inputs and 2 output network tensors.
[TensorRT] INFO: Detected 3 inputs and 2 output network tensors.
[TensorRT] INFO: Saving Engine to bert_slot_384.engine
[TensorRT] INFO: Done.
But the problem happens when I use inference.py to run inference. I didn't change any of the CUDA/memory code, and it fails with the error message below. With only one output, inference works fine.
```
[TensorRT] ERROR: engine.cpp (165) - Cuda Error in ~ExecutionContext: 700 (an illegal memory access was encountered)
[TensorRT] ERROR: INTERNAL_ERROR: std::exception
[TensorRT] ERROR: Parameter check failed at: …/rtSafe/safeContext.cpp::terminateCommonContext::165, condition: cudaEventDestroy(context.start) failure.
[TensorRT] ERROR: Parameter check failed at: …/rtSafe/safeContext.cpp::terminateCommonContext::170, condition: cudaEventDestroy(context.stop) failure.
[TensorRT] ERROR: …/rtSafe/safeRuntime.cpp (32) - Cuda Error in free: 700 (an illegal memory access was encountered)
terminate called after throwing an instance of 'nvinfer1::CudaError'
  what(): std::exception
Aborted (core dumped)
```
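Since inference works with a single output, the crash looks like the inference code still allocates buffers for only one output, while the engine now has 5 bindings (3 inputs + 2 outputs); passing a bindings list that is too short makes the second output write into unmapped memory, which matches the illegal-access error above. A minimal host-side sketch of sizing one buffer per binding (the binding names, shapes, and dtypes here are illustrative assumptions, and numpy stands in for the pycuda allocations):

```python
import numpy as np

# Illustrative binding table: 3 inputs + 2 outputs, matching the builder's
# "Detected 3 inputs and 2 output network tensors" message.
# Shapes and dtypes are assumptions, not the real BERT engine's.
BINDINGS = [
    ("input_ids",   (1, 384),     np.int32,   True),
    ("segment_ids", (1, 384),     np.int32,   True),
    ("input_mask",  (1, 384),     np.int32,   True),
    ("logits",      (1, 384, 2),  np.float32, False),
    ("slot_logits", (1, 384, 10), np.float32, False),
]

def allocate_buffers(bindings):
    """Allocate one host buffer per binding. With pycuda you would also
    cuda.mem_alloc(buf.nbytes) a matching device buffer for every entry
    and pass ALL device pointers, in binding order, to execute_async()."""
    inputs, outputs = [], []
    for name, shape, dtype, is_input in bindings:
        buf = np.zeros(shape, dtype=dtype)
        (inputs if is_input else outputs).append((name, buf))
    return inputs, outputs

inputs, outputs = allocate_buffers(BINDINGS)
# Every output needs its own buffer; an engine with 2 outputs requires
# 5 pointers in the bindings list, not the 4 a one-output script prepares.
print(len(inputs), len(outputs))
```

In the real inference.py this means looping over all engine bindings (rather than hard-coding a single output) when building the buffer and bindings lists, and copying every output back after execution.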
Environment
TensorRT Version: 6.0
GPU Type: 2080TI
Nvidia Driver Version: 418.39
CUDA Version: 10.1