TensorRT inference process

Description

Hi, I am new to the TensorRT C++ API, and I am trying to load a pre-built TensorRT engine and run it. After deserializing the engine, I ran into trouble following the TensorRT documentation.
The documentation says the inference process requires the user to "set up a buffer array pointing to the input and output buffers on the GPU". But in the code below:

void* buffers[2];
buffers[inputIndex] = inputBuffer;
buffers[outputIndex] = outputBuffer;

the variables inputBuffer and outputBuffer are not defined anywhere in the preceding sections.

Could you give me more details about these two variables? I have put my unfinished code below for reference.

int main(int argc, char** argv){
    std::cout << "Read and load the engine" << std::endl;
    IRuntime* runtime = createInferRuntime(sample::gLogger);
    std::string cached_path = "./itest_8.trt";
    // Open in binary mode: the serialized engine is not text.
    std::ifstream fin(cached_path, std::ios::binary);
    std::string cached_engine = "";
    std::stringstream buffer;
    buffer << fin.rdbuf();
    cached_engine.append(buffer.str());
    fin.close();

    ICudaEngine* loaded_engine = runtime->deserializeCudaEngine(cached_engine.data(), cached_engine.size(), nullptr);
    std::cout << "Loading complete!" << std::endl;

    std::cout << "now let's do some inference" << std::endl;
    IExecutionContext* context = loaded_engine->createExecutionContext();
    int input_node = loaded_engine->getBindingIndex("input_1");
    int output_node = loaded_engine->getBindingIndex("output_1");
    void* buffers[2];

    ////////////////////////////////////////
    buffers[input_node] = inputbuffer;  // Now the problem is here
    ////////////////////////////////////////
    return 0;
}

Many Thanks!

Environment

TensorRT Version: 7.2.2
GPU Type: Titan V
Nvidia Driver Version: 450.51.05
CUDA Version: 11.0
CUDNN Version: 8.0.4
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.7
TensorFlow Version (if applicable): 1.14
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Hi,
Can you try running your model with the trtexec command and share the "--verbose" log if the issue persists?

You can refer to the link below for the list of supported operators. If any operator is not supported, you will need to create a custom plugin to support that operation.

Also, please share your model and script, if you have not already, so that we can help you better.

Thanks!

&&&& RUNNING TensorRT.trtexec # ./trtexec --loadEngine=./itest_8.trt --batch=1 --verbose
[04/30/2021-14:14:46] [I] === Model Options ===
[04/30/2021-14:14:46] [I] Format: *
[04/30/2021-14:14:46] [I] Model:
[04/30/2021-14:14:46] [I] Output:
[04/30/2021-14:14:46] [I] === Build Options ===
[04/30/2021-14:14:46] [I] Max batch: 1
[04/30/2021-14:14:46] [I] Workspace: 16 MiB
[04/30/2021-14:14:46] [I] minTiming: 1
[04/30/2021-14:14:46] [I] avgTiming: 8
[04/30/2021-14:14:46] [I] Precision: FP32
[04/30/2021-14:14:46] [I] Calibration:
[04/30/2021-14:14:46] [I] Refit: Disabled
[04/30/2021-14:14:46] [I] Safe mode: Disabled
[04/30/2021-14:14:46] [I] Save engine:
[04/30/2021-14:14:46] [I] Load engine: /home/wlx/new_project/itest_8.trt
[04/30/2021-14:14:46] [I] Builder Cache: Enabled
[04/30/2021-14:14:46] [I] NVTX verbosity: 0
[04/30/2021-14:14:46] [I] Tactic sources: Using default tactic sources
[04/30/2021-14:14:46] [I] Input(s)s format: fp32:CHW
[04/30/2021-14:14:46] [I] Output(s)s format: fp32:CHW
[04/30/2021-14:14:46] [I] Input build shapes: model
[04/30/2021-14:14:46] [I] Input calibration shapes: model
[04/30/2021-14:14:46] [I] === System Options ===
[04/30/2021-14:14:46] [I] Device: 0
[04/30/2021-14:14:46] [I] DLACore:
[04/30/2021-14:14:46] [I] Plugins:
[04/30/2021-14:14:46] [I] === Inference Options ===
[04/30/2021-14:14:46] [I] Batch: 1
[04/30/2021-14:14:46] [I] Input inference shapes: model
[04/30/2021-14:14:46] [I] Iterations: 10
[04/30/2021-14:14:46] [I] Duration: 3s (+ 200ms warm up)
[04/30/2021-14:14:46] [I] Sleep time: 0ms
[04/30/2021-14:14:46] [I] Streams: 1
[04/30/2021-14:14:46] [I] ExposeDMA: Disabled
[04/30/2021-14:14:46] [I] Data transfers: Enabled
[04/30/2021-14:14:46] [I] Spin-wait: Disabled
[04/30/2021-14:14:46] [I] Multithreading: Disabled
[04/30/2021-14:14:46] [I] CUDA Graph: Disabled
[04/30/2021-14:14:46] [I] Separate profiling: Disabled
[04/30/2021-14:14:46] [I] Skip inference: Disabled
[04/30/2021-14:14:46] [I] Inputs:
[04/30/2021-14:14:46] [I] === Reporting Options ===
[04/30/2021-14:14:46] [I] Verbose: Enabled
[04/30/2021-14:14:46] [I] Averages: 10 inferences
[04/30/2021-14:14:46] [I] Percentile: 99
[04/30/2021-14:14:46] [I] Dump refittable s:Disabled
[04/30/2021-14:14:46] [I] Dump output: Disabled
[04/30/2021-14:14:46] [I] Profile: Disabled
[04/30/2021-14:14:46] [I] Export timing to JSON file:
[04/30/2021-14:14:46] [I] Export output to JSON file:
[04/30/2021-14:14:46] [I] Export profile to JSON file:
[04/30/2021-14:14:46] [I]
[04/30/2021-14:14:54] [I] === Device Information ===
[04/30/2021-14:14:54] [I] Selected Device: TITAN V
[04/30/2021-14:14:54] [I] Compute Capability: 7.0
[04/30/2021-14:14:54] [I] SMs: 80
[04/30/2021-14:14:54] [I] Compute Clock Rate: 1.455 GHz
[04/30/2021-14:14:54] [I] Device Global Memory: 12066 MiB
[04/30/2021-14:14:54] [I] Shared Memory per SM: 96 KiB
[04/30/2021-14:14:54] [I] Memory Bus Width: 3072 bits (ECC disabled)
[04/30/2021-14:14:54] [I] Memory Clock Rate: 0.85 GHz
[04/30/2021-14:14:54] [I]
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::NMS_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::Reorg_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::Region_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::Clip_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::LReLU_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::PriorBox_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::Normalize_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::RPROI_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::FlattenConcat_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::CropAndResize version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::Proposal version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::Split version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[04/30/2021-14:14:54] [V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[04/30/2021-14:14:55] [W] [TRT] TensorRT was linked against cuDNN 8.0.5 but loaded cuDNN 8.0.4
[04/30/2021-14:14:56] [W] [TRT] TensorRT was linked against cuBLAS/cuBLAS LT 11.2.0 but loaded cuBLAS/cuBLAS LT 11.1.0
[04/30/2021-14:14:56] [V] [TRT] Deserialize required 1106093 microseconds.
[04/30/2021-14:14:56] [I] Engine loaded in 1.76435 sec.
[04/30/2021-14:14:56] [W] [TRT] TensorRT was linked against cuDNN 8.0.5 but loaded cuDNN 8.0.4
[04/30/2021-14:14:56] [W] [TRT] TensorRT was linked against cuBLAS/cuBLAS LT 11.2.0 but loaded cuBLAS/cuBLAS LT 11.1.0
[04/30/2021-14:14:56] [V] [TRT] Allocated persistent device memory of size 39212544
[04/30/2021-14:14:56] [V] [TRT] Allocated activation device memory of size 116225536
[04/30/2021-14:14:56] [V] [TRT] Assigning persistent memory blocks for various profiles
[04/30/2021-14:14:56] [I] Starting inference
[04/30/2021-14:14:59] [I] Warmup completed 1 queries over 200 ms
[04/30/2021-14:14:59] [I] Timing trace has 666 queries over 2.75008 s
[04/30/2021-14:14:59] [I] Trace averages of 10 runs:
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.44529 ms - Host latency: 4.8321 ms (end to end 7.86976 ms, enqueue 2.26097 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.43332 ms - Host latency: 4.82082 ms (end to end 8.12551 ms, enqueue 2.17996 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.43042 ms - Host latency: 4.81671 ms (end to end 8.06527 ms, enqueue 1.99925 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.42807 ms - Host latency: 4.81323 ms (end to end 7.92094 ms, enqueue 1.83416 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.42715 ms - Host latency: 4.81287 ms (end to end 8.02704 ms, enqueue 1.76178 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.17424 ms - Host latency: 4.55819 ms (end to end 7.41914 ms, enqueue 1.67219 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.14781 ms - Host latency: 4.53007 ms (end to end 7.6062 ms, enqueue 1.59985 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.12979 ms - Host latency: 4.51404 ms (end to end 7.77371 ms, enqueue 1.88416 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.12723 ms - Host latency: 4.51429 ms (end to end 7.834 ms, enqueue 2.11445 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.10267 ms - Host latency: 4.48977 ms (end to end 7.74086 ms, enqueue 2.04337 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08863 ms - Host latency: 4.47354 ms (end to end 7.09607 ms, enqueue 2.18604 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.09067 ms - Host latency: 4.47547 ms (end to end 7.47484 ms, enqueue 2.07023 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08401 ms - Host latency: 4.47104 ms (end to end 7.70482 ms, enqueue 2.06426 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.0829 ms - Host latency: 4.46992 ms (end to end 7.9511 ms, enqueue 2.14661 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08402 ms - Host latency: 4.47418 ms (end to end 7.09452 ms, enqueue 2.28236 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08076 ms - Host latency: 4.46663 ms (end to end 7.33628 ms, enqueue 2.08529 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07993 ms - Host latency: 4.46367 ms (end to end 7.64805 ms, enqueue 2.01919 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08186 ms - Host latency: 4.46851 ms (end to end 7.86574 ms, enqueue 1.85051 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.0844 ms - Host latency: 4.47157 ms (end to end 7.96527 ms, enqueue 1.83258 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08369 ms - Host latency: 4.46936 ms (end to end 7.42255 ms, enqueue 1.93965 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08187 ms - Host latency: 4.46757 ms (end to end 7.46995 ms, enqueue 2.15243 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08207 ms - Host latency: 4.46857 ms (end to end 7.39984 ms, enqueue 2.0499 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07881 ms - Host latency: 4.46538 ms (end to end 7.68507 ms, enqueue 1.95341 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08275 ms - Host latency: 4.47366 ms (end to end 7.89254 ms, enqueue 2.1233 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.0871 ms - Host latency: 4.47847 ms (end to end 7.93748 ms, enqueue 2.11808 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08237 ms - Host latency: 4.46838 ms (end to end 7.48705 ms, enqueue 2.0648 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.09088 ms - Host latency: 4.47905 ms (end to end 7.21792 ms, enqueue 2.10291 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.09344 ms - Host latency: 4.49016 ms (end to end 7.7327 ms, enqueue 2.13967 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08782 ms - Host latency: 4.47954 ms (end to end 7.90985 ms, enqueue 2.0653 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08718 ms - Host latency: 4.47308 ms (end to end 7.9911 ms, enqueue 1.94706 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07974 ms - Host latency: 4.46454 ms (end to end 7.65446 ms, enqueue 1.84956 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.0792 ms - Host latency: 4.46511 ms (end to end 7.48359 ms, enqueue 1.83281 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.0799 ms - Host latency: 4.46553 ms (end to end 7.84554 ms, enqueue 1.967 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08434 ms - Host latency: 4.47159 ms (end to end 7.81114 ms, enqueue 2.18958 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08157 ms - Host latency: 4.46794 ms (end to end 7.98835 ms, enqueue 2.03068 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08042 ms - Host latency: 4.46547 ms (end to end 7.98448 ms, enqueue 1.86172 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08185 ms - Host latency: 4.46481 ms (end to end 7.31252 ms, enqueue 1.70641 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07881 ms - Host latency: 4.46306 ms (end to end 7.66566 ms, enqueue 1.70571 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08167 ms - Host latency: 4.47 ms (end to end 7.83624 ms, enqueue 1.70654 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.09543 ms - Host latency: 4.49028 ms (end to end 7.70867 ms, enqueue 1.71919 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.09526 ms - Host latency: 4.49038 ms (end to end 7.6366 ms, enqueue 1.67061 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.09509 ms - Host latency: 4.49001 ms (end to end 7.49316 ms, enqueue 1.67634 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.09336 ms - Host latency: 4.49277 ms (end to end 7.7749 ms, enqueue 1.69236 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.09575 ms - Host latency: 4.48899 ms (end to end 7.98733 ms, enqueue 1.62747 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.09229 ms - Host latency: 4.4939 ms (end to end 7.74849 ms, enqueue 1.61987 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.09482 ms - Host latency: 4.48926 ms (end to end 7.80413 ms, enqueue 1.75386 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.09729 ms - Host latency: 4.49399 ms (end to end 7.74124 ms, enqueue 2.25955 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.09705 ms - Host latency: 4.49424 ms (end to end 7.9873 ms, enqueue 2.18704 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.09265 ms - Host latency: 4.48613 ms (end to end 7.98384 ms, enqueue 2.049 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08103 ms - Host latency: 4.46963 ms (end to end 7.5092 ms, enqueue 2.01763 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07854 ms - Host latency: 4.46416 ms (end to end 7.63745 ms, enqueue 2.20237 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07869 ms - Host latency: 4.46348 ms (end to end 7.69004 ms, enqueue 2.04788 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07947 ms - Host latency: 4.4645 ms (end to end 7.7304 ms, enqueue 1.94814 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08359 ms - Host latency: 4.47068 ms (end to end 7.92515 ms, enqueue 1.84417 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07607 ms - Host latency: 4.46248 ms (end to end 8.00276 ms, enqueue 2.20955 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07866 ms - Host latency: 4.46501 ms (end to end 7.52761 ms, enqueue 2.25459 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07788 ms - Host latency: 4.4637 ms (end to end 8.01196 ms, enqueue 2.15667 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08003 ms - Host latency: 4.46489 ms (end to end 7.70994 ms, enqueue 2.02893 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07688 ms - Host latency: 4.46316 ms (end to end 7.75447 ms, enqueue 1.93926 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07954 ms - Host latency: 4.46458 ms (end to end 7.90881 ms, enqueue 1.93975 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07825 ms - Host latency: 4.46692 ms (end to end 7.54536 ms, enqueue 2.21606 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.0791 ms - Host latency: 4.46443 ms (end to end 7.63345 ms, enqueue 2.0592 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08218 ms - Host latency: 4.46775 ms (end to end 7.96272 ms, enqueue 1.94109 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07944 ms - Host latency: 4.46541 ms (end to end 7.48708 ms, enqueue 1.87207 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.08196 ms - Host latency: 4.46794 ms (end to end 7.81377 ms, enqueue 1.84001 ms)
[04/30/2021-14:14:59] [I] Average on 10 runs - GPU latency: 4.07993 ms - Host latency: 4.46682 ms (end to end 7.94912 ms, enqueue 2.06599 ms)
[04/30/2021-14:14:59] [I] Host Latency
[04/30/2021-14:14:59] [I] min: 4.44922 ms (end to end 4.5033 ms)
[04/30/2021-14:14:59] [I] max: 4.93512 ms (end to end 8.72623 ms)
[04/30/2021-14:14:59] [I] mean: 4.50229 ms (end to end 7.72822 ms)
[04/30/2021-14:14:59] [I] median: 4.47177 ms (end to end 7.9743 ms)
[04/30/2021-14:14:59] [I] percentile: 4.8248 ms at 99% (end to end 8.65201 ms at 99%)
[04/30/2021-14:14:59] [I] throughput: 242.175 qps
[04/30/2021-14:14:59] [I] walltime: 2.75008 s
[04/30/2021-14:14:59] [I] Enqueue Time
[04/30/2021-14:14:59] [I] min: 1.52295 ms
[04/30/2021-14:14:59] [I] max: 2.68457 ms
[04/30/2021-14:14:59] [I] median: 1.99921 ms
[04/30/2021-14:14:59] [I] GPU Compute
[04/30/2021-14:14:59] [I] min: 4.06836 ms
[04/30/2021-14:14:59] [I] max: 4.55066 ms
[04/30/2021-14:14:59] [I] mean: 4.11436 ms
[04/30/2021-14:14:59] [I] median: 4.08472 ms
[04/30/2021-14:14:59] [I] percentile: 4.43701 ms at 99%
[04/30/2021-14:14:59] [I] total compute time: 2.74016 s
&&&& PASSED TensorRT.trtexec # ./trtexec --loadEngine=./itest_8.trt --batch=1 --verbose

I have run it using trtexec and it has passed.

Hi @364083042,

Sorry for the delayed response. The buffers are used to copy data to and from GPU memory.
Please refer to the C++ samples to set up inference.

Thank you.