Cuda Error in NCHWToNCHW

J-Penny · December 17, 2018, 6:40am

Using Jetson Xavier with JetPack 4.1, TensorRT 5.0, and the Caffe model parser.

Hello,

I have successfully implemented a PriorBox plugin to parse the Caffe model, but when I call buildCudaEngine to create the engine, the following errors always occur:

ERROR: cuda/reformat.cu (408) - Cuda Error in NCHWToNCHW: 11
ERROR: cuda/reformat.cu (408) - Cuda Error in NCHWToNCHW: 11
sample_detection: TensorRtCaffeModel.cpp:59: void caffeToGIEModel(const string&, const string&, const std::vector<std::__cxx11::basic_string<char> >&, unsigned int, nvcaffeparser1::IPluginFactoryExt*, nvinfer1::IHostMemory*&): Assertion `engine' failed.
Aborted (core dumped)

In the class PriorBoxPlugin : nvinfer1::IPluginExt, I have implemented supportsFormat. From the error log I cannot tell what is wrong. Do I need to declare the format, and if so, where?

Please explain the cause of this problem and provide a solution. Thanks a lot.

bool supportsFormat(DataType type, PluginFormat format) const override
{ 
	return (type == DataType::kFLOAT || type == DataType::kHALF) && format == PluginFormat::kNCHW; 
}
void configureWithFormat(const Dims* inputDims, int nbInputs, const Dims* outputDims, int nbOutputs, DataType type, PluginFormat format, int maxBatchSize) override
{
	assert((type == DataType::kFLOAT || type == DataType::kHALF) && format == PluginFormat::kNCHW);
	assert(nbInputs == 2);
        .......
}

J-Penny · December 19, 2018, 7:38am

Hello,
To make the problem easier to analyze, I have uploaded the project's code files to the repository below. I hope to get a reply to the problem described above as soon as possible.
Thanks a lot.
https://github.com/Penny-J/Detection_test

NVES_K · January 16, 2019, 11:32pm

Moving this over to the Jetson forums for support.

J-Penny · January 17, 2019, 10:56am

Hi,
Following your suggestion, I made the following modifications:

// The return value of getOutputDimensions changed from (1, 2, top_data_size) to (2, top_data_size)
 Dims getOutputDimensions(int index, const Dims* inputs, int nbInputDims)override
 {
 assert(index == 0 && nbInputDims == 2 && inputs[0].nbDims==3  );

 //dim=layer_height * layer_width * num_priors_ * 4; 
 top_data_size = inputs[0].d[1] * inputs[0].d[2] *mPriorBoxParamters.numPriors *4;

 // Debug output: each label now matches the index actually printed.
 std::cout << "getOutputDimensions top_data_size:" << top_data_size << " inputs[0]d[0]:"<< inputs[0].d[0] << " d[1]:" << inputs[0].d[1] << " d[2]:" << inputs[0].d[2] <<"  num_priors_:" << mPriorBoxParamters.numPriors << std::endl;
 //std::cout << "getOutputDimensions inputs[1]d[0]:"<< inputs[1].d[0] << " d[1]:" << inputs[1].d[1] << " d[2]:" << inputs[1].d[2] << std::endl;
 
 return Dims2(2, top_data_size);
 }
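
The size arithmetic in the snippet above can be checked on its own, outside TensorRT. The following is a minimal sketch in plain C++: `ChwShape`, `priorBoxRowSize`, and `priorBoxTotalSize` are hypothetical names introduced here, and the row layout (coordinates first, variances second) is an assumption based on the usual Caffe SSD PriorBox convention, not something stated in the post.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a CHW feature-map shape; not a TensorRT type.
struct ChwShape {
    int channels;
    int height;
    int width;
};

// Number of floats in one row of the prior box output:
// 4 coordinates per prior, for every cell of the feature map.
std::size_t priorBoxRowSize(const ChwShape& featureMap, int numPriors) {
    assert(featureMap.height > 0 && featureMap.width > 0 && numPriors > 0);
    return static_cast<std::size_t>(featureMap.height) * featureMap.width * numPriors * 4;
}

// Total floats for a (2, rowSize) output. Assumed layout: row 0 holds
// the box coordinates and row 1 holds the variances.
std::size_t priorBoxTotalSize(const ChwShape& featureMap, int numPriors) {
    return 2 * priorBoxRowSize(featureMap, numPriors);
}
```

For a 38 x 38 feature map with 4 priors per cell, one row holds 38 * 38 * 4 * 4 = 23104 floats and the whole output holds 46208, which is the element count the copy in enqueue has to match.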

// Modify the output of the enqueue function
int PriorBoxPlugin::enqueue(int batchSize, const void*const *inputs, void** outputs, void* workspace,cudaStream_t stream)

{
    ……

    //Before the change
   //cudaMemcpy(outputs[0],top_data,(dim*sizeof(float)),cudaMemcpyHostToDevice);
   //cudaMemcpy(outputs[1],variance_data,(dim*sizeof(float)),cudaMemcpyHostToDevice);

   //after the change
   cudaMalloc(&outputs[0],top_data_size*2*sizeof(float));
   cudaMemcpy(outputs[0],top_data,(top_data_size*2*sizeof(float)),cudaMemcpyHostToDevice);
   return 0;
}

After the modification, the original error no longer appeared, but a new one (a segmentation fault) occurred, as shown below.
I don't know where or how the functions makeEngineFromGraph and buildSingleLayer are called, or what kind of mistake leads to this problem. How can I solve it? Thanks for your reply.

Thread 1 "face_detection" received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) where
#0  0x0000000000000000 in  ()
#1  0x0000007faf148fcc in nvinfer1::builder::buildSingleLayer(nvinfer1::rt::EngineBuildContext&, nvinfer1::builder::Node&, std::unordered_map<std::string, std::unique_ptr<nvinfer1::rt::Region, std::default_delete<nvinfer1::rt::Region> >, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, std::unique_ptr<nvinfer1::rt::Region, std::default_delete<nvinfer1::rt::Region> > > > > const&, nvinfer1::CpuMemoryGroup&, std::unordered_map<std::string, std::vector<float, std::allocator<float> >, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, std::vector<float, std::allocator<float> > > > >*, bool) ()
    at /usr/lib/aarch64-linux-gnu/libnvinfer.so.5
#2  0x0000007faf14cd84 in nvinfer1::builder::makeEngineFromGraph(nvinfer1::CudaEngineBuildConfig const&, nvinfer1::rt::HardwareContext const&, nvinfer1::builder::Graph&, std::unordered_map<std::string, std::vector<float, std::allocator<float> >, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, std::vector<float, std::allocator<float> > > > >*, int) () at /usr/lib/aarch64-linux-gnu/libnvinfer.so.5
#3  0x0000007faf14facc in nvinfer1::builder::buildEngine(nvinfer1::CudaEngineBuildConfig&, nvinfer1::rt::HardwareContext const&, nvinfer1::Network const&) ()
    at /usr/lib/aarch64-linux-gnu/libnvinfer.so.5
#4  0x0000007faf1ba2ec in nvinfer1::builder::Builder::buildCudaEngine(nvinfer1::INetworkDefinition&) () at /usr/lib/aarch64-linux-gnu/libnvinfer.so.5
#5  0x000000555555cd24 in caffeToGIEModel(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, unsigned int, nvcaffeparser1::IPluginFactoryExt*, nvinfer1::IHostMemory*&) ()
#6  0x0000005555564894 in main ()

AastaLLL · January 25, 2019, 6:02am

Hi,

CUDA error 11 indicates that one or more of the parameters passed to the API call is not within an acceptable range of values.
Could you double-check your implementation for any issues?

Here is a plugin sample for your reference:

Thanks.
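
On the error number quoted above: the "11" in the log is a raw CUDA runtime error code. The sketch below is a small plain C++ lookup for a few codes, assuming the numbering used by CUDA runtimes before 10.1 (JetPack 4.1 ships CUDA 10.0). Newer toolkits renumbered these values, so the table should not be reused as-is on them. `legacyCudaErrorName` is a hypothetical helper; in real CUDA code, `cudaGetErrorName` and `cudaGetErrorString` return this information directly.

```cpp
#include <string>

// Hypothetical helper: names a few CUDA runtime error codes under the
// pre-10.1 numbering. Codes not listed here are reported as unknown.
std::string legacyCudaErrorName(int code) {
    switch (code) {
        case 0:  return "cudaSuccess";
        case 2:  return "cudaErrorMemoryAllocation";
        case 4:  return "cudaErrorLaunchFailure";
        case 11: return "cudaErrorInvalidValue";
        default: return "unknown";
    }
}
```

Under that assumption, code 11 is `cudaErrorInvalidValue`, which matches the "parameter not within an acceptable range" description.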

J-Penny · January 25, 2019, 6:48am

Hello,
This problem was solved by adding the following call before creating the engine:

initLibNvInferPlugins(&gLogger,"");

Can you tell me what this function does? Thanks for your reply.

AastaLLL · January 31, 2019, 8:11am

Hi,

This function is required by the plugin layer.
You can also find this information in our documentation:

https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#add_custom_layer
----------------------------------------------------
Note: To use TensorRT registered plugins in your application, the libnvinfer_plugin.so library must be loaded and all plugins must be registered. This can be done by calling initLibNvInferPlugins(void* logger, const char* libNamespace)() in your application code.
----------------------------------------------------

Thanks.
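
The ordering requirement in the quoted note (register the plugins first, then build) can be illustrated with a toy model. The sketch below is plain C++ and is not TensorRT code: `ToyPluginRegistry`, `toyInitPlugins`, `toyBuild`, and the layer type names are all hypothetical. It assumes, as one plausible reading of this thread, that the build step needs a creator for some plugin layer that only the library registration provides.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a plugin registry. This is NOT the TensorRT API.
class ToyPluginRegistry {
public:
    void registerCreator(const std::string& layerType) { creators_.insert(layerType); }
    bool hasCreator(const std::string& layerType) const { return creators_.count(layerType) != 0; }
private:
    std::set<std::string> creators_;
};

// Plays the role of initLibNvInferPlugins: registers the library's
// creators. The layer type names are made up for this example.
void toyInitPlugins(ToyPluginRegistry& registry) {
    registry.registerCreator("ExamplePriorBox");
    registry.registerCreator("ExampleDetectionOutput");
}

// Plays the role of the build step: succeeds only if every plugin
// layer type used by the network has a registered creator.
bool toyBuild(const ToyPluginRegistry& registry,
              const std::vector<std::string>& pluginLayerTypes) {
    for (const std::string& type : pluginLayerTypes) {
        if (!registry.hasCreator(type)) {
            return false;
        }
    }
    return true;
}
```

In this model, calling `toyBuild` before `toyInitPlugins` fails and calling it afterwards succeeds, which mirrors placing `initLibNvInferPlugins(&gLogger, "")` before the engine is created.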
