I am porting a C++ program from the P4 platform to the T4 platform. It works fine on P4, but it fails on T4 with error messages related to illegal memory access. Does T4 not support running multiple algorithm models on a single GPU card?
Can you please provide a small repro package with the code and any other necessary data so I can debug this further? It is possible that code written for TensorRT 3 has since been deprecated in TensorRT 5.
I have some models trained in Caffe and then optimized with TensorRT. They run well in a single thread, but I am having problems when using multiple threads, with each thread running its own model. I am pasting the error just below:
ERROR: CUDA cask failure at execution for trt_maxwell_scudnn_128x32_relu_small_nn_v1.
ERROR: cuda/caskConvolutionLayer.cpp (256) - Cuda Error in execute: 77
ERROR: cuda/caskConvolutionLayer.cpp (256) - Cuda Error in execute: 77
I generated the PLAN file on a Turing microarchitecture card (GeForce 1050 Ti and T4), so the part that says “failure at execution for trt_maxwell_scudnn_128x32_relu_small_nn_v1” seems strange to me. Does that make sense?
Can you please share:
1. original model
2. converted model
3. scripts that you ran to convert the model
4. scripts for running the model with multiple threads
5. dataset used for models/scripts if any
so I can reproduce and debug this issue? You can private message this information to me if you don’t want to share it publicly.