I’ve created a complete tutorial on how to train a custom model (a hand detector) and deploy it with ‘tf_trt_models’ onto the JTX2. Refer to the following link for details.
I tried to make ‘tf_trt_models’ work for ‘faster_rcnn’ and ‘rfcn’ models. It was not straightforward (see reference below). I had to put in quite a few hacks to be able to build TF-TRT optimized graphs for those models. For example, I reduced the number of region proposals in the ‘faster_rcnn’/‘rfcn’ models from 300 to 32 (otherwise I’d be stuck with Out-Of-Memory issues on the JTX2). I was finally able to get the models to work to a certain degree (not completely working yet…), and I put my latest code in my GitHub repository.
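For reference, a proposal-count reduction like the one described (300 down to 32) is normally done in the TF Object Detection API pipeline config before re-exporting the model. A minimal sketch of the relevant fragment, assuming the standard ‘faster_rcnn’ config schema (the rest of the file is left unchanged):

```
model {
  faster_rcnn {
    # ... other model settings unchanged ...
    first_stage_max_proposals: 32  # reduced from the default 300
  }
}
```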
Then I measured and compared the following TF-TRT optimized models on JTX2 (in MAX-N mode).
ssd_mobilenet_v1_coco (90 classes): 43.5 ms
ssd_inception_v2_coco (90 classes): 45.9 ms
ssd_mobilenet_v1_egohands (1 class): 24.5 ms
ssd_mobilenet_v2_egohands (1 class): 28.7 ms
ssdlite_mobilenet_v2_egohands (1 class): 28.9 ms
ssd_inception_v2_egohands (1 class): 25.9 ms
rfcn_resnet101_egohands (1 class): 351 ms
faster_rcnn_resnet50_egohands (1 class): 226 ms
faster_rcnn_resnet101_egohands (1 class): 317 ms
faster_rcnn_inception_v2_egohands (1 class): 117 ms
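A minimal sketch of how per-frame timings like these can be collected. The `run_inference` callable and `average_inference_ms` helper are hypothetical stand-ins for the actual TF session call; the warm-up runs are excluded from the average because the first few runs include one-time graph/engine setup cost:

```python
import time

def average_inference_ms(run_inference, frames, warmup=10):
    """Average per-frame latency in milliseconds, skipping warm-up runs."""
    # Warm-up: first runs pay one-time graph/TensorRT engine setup cost.
    for frame in frames[:warmup]:
        run_inference(frame)
    timed = frames[warmup:]
    start = time.time()
    for frame in timed:
        run_inference(frame)
    return (time.time() - start) * 1000.0 / len(timed)
```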
I am working on a Jetson TX2 with JetPack 3.3, which has TensorFlow v1.9 and TensorRT v4. Currently I am testing the SSD Inception V2 model, getting an average of 50 ms per frame with no TensorRT optimization of my frozen model. I am stuck in the optimization process, as I want to further reduce the inference time with the trt.create_inference_graph function.
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,   # my frozen GraphDef
    outputs=output_names,           # list of output node names
    max_workspace_size_bytes=1 << 25)
I get the following error:
2019-04-23 21:31:36.655950: I tensorflow/core/grappler/devices.cc:51] Number of eligible GPUs (core count >= 8): 0
2019-04-23 21:31:41.939334: W tensorflow/contrib/tensorrt/convert/convert_graph.cc:515] subgraph conversion error for subgraph_index:0 due to: "Invalid argument: Output node 'FeatureExtractor/InceptionV2/InceptionV2/Mixed_3b/concat-4-LayoutOptimizer' is weights not tensor" SKIPPING… ( 844 nodes)
I saw this issue in one of the previous comments, was there any solution for this?
I have a solution to the “extremely long model loading time problem” of TF-TRT now. Please check out my blog post for details: https://jkjung-avt.github.io/tf-trt-revisited/.
I have flashed JetPack 4.2 on my TX2 board.
I trained an object detection model using a Feature Pyramid Network (model size is 242 MB).
Whenever I try the inference code on the TX2, the process gets killed just before starting the session for processing a frame (it is able to load the model, though).
I tried limiting/allocating the full GPU memory usage using TensorFlow (version 1.13), but it didn't work.
Any solution for this?
Thanks in advance.
@varun365 There is a very similar issue report on GitHub: https://github.com/NVIDIA-AI-IOT/tf_trt_models/issues/6#issuecomment-498207648
The problem is clearly due to out-of-memory, and I’m not able to solve it. Hopefully TF-TRT will be improved over time so that such larger models work on a future release.