Deploying a Caffe model on Jetson TX2 with the help of TensorRT

shubham7494 · January 23, 2018, 10:59am

Hello everyone,
I have a TensorRT engine (API Reference :: NVIDIA Deep Learning TensorRT Documentation) built from a Caffe model. Can I deploy this engine directly onto a Jetson TX2, or do I have to make any modifications?
Thanks!

dusty_nv · January 23, 2018, 5:02pm

Hi shubham7494, the TensorRT engine should be built on the TX2, because TensorRT performs GPU-specific profiling and optimizations at this phase. But yes, after the TensorRT CUDA engine is built for your caffemodel, you can deploy it at runtime without further modification.

See here for a TensorRT code example which runs on Jetson TX1/TX2: https://github.com/dusty-nv/jetson-inference

shubham7494 · January 24, 2018, 5:15am

Hello dusty_nv,
Do I have to copy the caffemodel to the TX2 and build the engine there?
Also, does the TX2 support the Python API for building the engine from a caffemodel, or do I have to use the C++ API?
Thank you.

dusty_nv · January 24, 2018, 5:32pm

Hi shubh, you’ll need to copy the caffemodel to a TX2 and build the engine there.
You can then save the engine to a file and load it again in the future, which saves build time, or copy it to run on other TX2s.

Jetson/ARM does not currently support the TensorRT Python API, so this has to be done through the C++ API for now.
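
The save-and-reload pattern described above can be sketched generically. This is only an illustration of the caching logic, not TensorRT code: `load_or_build` and `build_fn` are hypothetical names, and the bytes in the demo are a placeholder rather than a real engine. It assumes the slow build step can hand back the serialized engine as a byte string, and that the cache file is only reused on the same kind of device it was built on.

```python
import os
import tempfile


def load_or_build(path, build_fn):
    """Return serialized engine bytes, building them only if no cache file exists.

    build_fn is a hypothetical zero-argument callable that returns the
    serialized engine as bytes (for example, a wrapper around the slow
    build step run on the target device).
    """
    if os.path.exists(path):
        # Cache hit: skip the build entirely.
        with open(path, "rb") as f:
            return f.read()

    data = build_fn()

    # Write under a temporary name first, then rename, so an interrupted
    # run cannot leave a truncated engine file behind.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
    os.replace(tmp_path, path)
    return data


if __name__ == "__main__":
    calls = []

    def fake_build():
        # Stand-in for the real (slow) engine build.
        calls.append(1)
        return b"placeholder-engine-bytes"

    with tempfile.TemporaryDirectory() as d:
        engine_path = os.path.join(d, "model.engine")
        first = load_or_build(engine_path, fake_build)
        second = load_or_build(engine_path, fake_build)
        print(first == second, "builds:", len(calls))
```

In the demo the second call reads the file instead of invoking the build function again, which is where the time saving comes from.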

shubham7494 · January 29, 2018, 4:22am

Thank you dusty_nv.

shubham7494 · January 30, 2018, 11:06am

Hi dusty_nv,
When I try to build the engine, I get the following error at the buildCudaEngine call in the TensorRT C++ API:

*** Error in `./sample_CIFAR10_debug': free(): invalid next size (fast): 0x00002afdef67ca20 ***
Aborted (core dumped)
make: *** [test_debug] Error 134

Please help.
(For the same caffemodel and prototxt file, I am able to build the engine using the Python API.)

AastaLLL · February 2, 2018, 6:08am

Hi,

It looks like something is handled incorrectly in your application.
We recommend starting from one of our standard samples and modifying it to isolate the problem.

Default:
/usr/src/tensorrt/samples/*

Online:
https://github.com/dusty-nv/jetson-inference#classifying-images-with-imagenet

Thanks.

shubham7494 · February 2, 2018, 10:50am

Hi AastaLLL,
I am working with the sample programs only; even sampleInt8 gives me the same error.
Thanks.

AastaLLL · February 5, 2018, 8:30am

Hi,

INT8 is only available on GPUs with compute capability 6.1. The TX2 is compute capability 6.2, which does not support INT8.
Do you see this error in other samples?

Thanks.
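
The compute-capability rule in the reply above can be written as a small lookup. This is a minimal sketch that encodes only the two architectures named in this thread. The table and the function name `int8_supported` are hypothetical, and any other compute capability is reported as unknown rather than guessed.

```python
# INT8 availability by CUDA compute capability (major, minor).
# Only the two entries stated in this thread are included; everything
# else is deliberately left out and reported as unknown.
INT8_SUPPORT = {
    (6, 1): True,   # INT8 is available on compute capability 6.1
    (6, 2): False,  # Jetson TX2 is 6.2, which has no INT8 support
}


def int8_supported(major, minor):
    """Return True or False when the table knows the answer, else None."""
    return INT8_SUPPORT.get((major, minor))


if __name__ == "__main__":
    for cc in [(6, 1), (6, 2), (3, 5)]:
        print(cc, int8_supported(*cc))
```

For any device not in the table, check the TensorRT support matrix for the installed release rather than extending the dictionary by guesswork.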

shubham7494 · February 5, 2018, 9:02am

Hi,

Yes, I get the same error with sampleGoogleNet too. Also, if I use another dataset and model and modify the sampleMNIST file accordingly, it still gives the same error.

Thanks.

shubham7494 · February 6, 2018, 4:37am

The giexec command line gives the same error too.

shubham@pas-lab-server5:~/TensorRT-3.0.2/bin$ ./giexec --deploy=/home/shubham/TensorRT-3.0.2/data/mnist/mnist.prototxt --model=/home/shubham/TensorRT-3.0.2/data/mnist/mnist.caffemodel --output=prob --half2 --engine=/home/shubham/TensorRT-3.0.2/new.engine
deploy: /home/shubham/TensorRT-3.0.2/data/mnist/mnist.prototxt
model: /home/shubham/TensorRT-3.0.2/data/mnist/mnist.caffemodel
output: prob
half2
engine: /home/shubham/TensorRT-3.0.2/new.engine
Input "data": 1x28x28
Output "prob": 10x1x1
Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
*** Error in `./giexec': free(): invalid next size (fast): 0x00007f5f2f3b1660 ***
Aborted (core dumped)

AastaLLL · February 8, 2018, 8:41am

Hi,

Looks like you are not on a Jetson platform.
Could you share the deviceQuery information with us first?

/usr/local/cuda-9.0/bin/cuda-install-samples-9.0.sh .
cd NVIDIA_CUDA-9.0_Samples/1_Utilities/deviceQuery
make
./deviceQuery

Thanks

shubham7494 · February 9, 2018, 4:27am

Yes, I am not working on the TX2 yet.
This is the result of ./deviceQuery:

shubham@pas-lab-server5:~/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery$ ./deviceQuery
./deviceQuery Starting…

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Tesla K20Xm"
CUDA Driver Version / Runtime Version 9.0 / 8.0
CUDA Capability Major/Minor version number: 3.5
Total amount of global memory: 5700 MBytes (5976424448 bytes)
(14) Multiprocessors, (192) CUDA Cores/MP: 2688 CUDA Cores
GPU Max Clock rate: 732 MHz (0.73 GHz)
Memory Clock rate: 2600 Mhz
Memory Bus Width: 384-bit
L2 Cache Size: 1572864 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 132 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = Tesla K20Xm
Result = PASS

AastaLLL · February 12, 2018, 8:02am

Hi,

Your CUDA driver and CUDA runtime are different versions:
>> CUDA Driver Version / Runtime Version 9.0 / 8.0

Please set up your environment with matching CUDA versions first:

For CUDA 8.0, it should be
>> CUDA Driver Version / Runtime Version 8.0 / 8.0

TensorRT 3.0.2 for Ubuntu 1604 and CUDA 8.0 DEB local repo packages

For CUDA 9.0, it should be
>> CUDA Driver Version / Runtime Version 9.0 / 9.0

TensorRT 3.0.2 for Ubuntu 1604 and CUDA 9.0 DEB local repo packages

Thanks.
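
The driver/runtime comparison from this reply can be automated by reading the deviceQuery output. This is a minimal sketch that assumes the "CUDA Driver Version / Runtime Version X.Y / X.Y" line has the format shown in this thread. The function names `cuda_versions` and `versions_match` are hypothetical.

```python
import re


def cuda_versions(device_query_output):
    """Extract the (driver, runtime) version strings from deviceQuery output."""
    match = re.search(
        r"CUDA Driver Version / Runtime Version\s+(\d+\.\d+)\s*/\s*(\d+\.\d+)",
        device_query_output,
    )
    if match is None:
        raise ValueError("no 'CUDA Driver Version / Runtime Version' line found")
    return match.group(1), match.group(2)


def versions_match(device_query_output):
    """Return True when the driver and runtime report the same version."""
    driver, runtime = cuda_versions(device_query_output)
    return driver == runtime


if __name__ == "__main__":
    # The line reported earlier in this thread.
    sample = "CUDA Driver Version / Runtime Version 9.0 / 8.0"
    print(cuda_versions(sample), versions_match(sample))
```

To use it on a real machine, pipe the text printed by ./deviceQuery into `versions_match`. A result of False corresponds to the mismatch flagged above.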
