TensorRT - max_batch_size issue

Hi everyone,

I’m currently trying to run object detection using TRT, but I keep hitting this recurring error:

2019-03-14 10:11:56.751652: F tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:84] input tensor batch larger than max_batch_size: 1
Aborted (core dumped)

To freeze my graph I’m using a script (included file) based on the following repository:

It works perfectly, but when I try to run detection with my own code (which makes no mention of batch size) on either video or single images, I hit the error above.
Note: when I use the script to do detection (the last part of the tutorial in the GitHub repo), I can make it work, but without displaying the bounding boxes, because my network isn’t built like his YOLOv3. I don’t understand why I don’t get the batch-size error with that script.

Any idea how to fix this?
Thanks in advance
TRT_optimizing.txt (1.92 KB)

Hi,

We also have a tutorial for using TF-TRT:
https://github.com/NVIDIA-AI-IOT/tf_trt_models

The error indicates that your max_batch_size is set to 1 but the batch dimension of your input tensor is larger than that.
Please try increasing the max_batch_size value to see if it helps:

import tensorflow.contrib.tensorrt as trt

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,         # frozen model
    outputs=your_outputs,                 # output node names
    max_batch_size=2,                     # specify your max batch size
    max_workspace_size_bytes=2*(10**9),   # specify the max workspace
    precision_mode="FP32")                # precision, can be "FP32" or "FP16"

Thanks.

Hi AastaLLL,

I already tried, up to max_batch_size=10…

Is it linked to the number of images in the input? I always feed images one by one…
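For reference, here is roughly how I feed a single image (a quick sketch of my code; sess, input_tensor, and detections stand in for my actual variable names, and the 416x416 resize is just an example input size):

import cv2
import numpy as np

# load one image and add an explicit batch dimension of 1
image = cv2.imread("test.jpg")
image = cv2.resize(image, (416, 416))  # resize to the network input size
batch = np.expand_dims(image, axis=0)  # shape becomes (1, 416, 416, 3)

# run inference on the single-image batch
boxes = sess.run(detections, feed_dict={input_tensor: batch})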

Thanks in advance

Hi,

Does your output log also change to 10?

input tensor batch larger than max_batch_size: 1

If yes, could you print out the input buffer size?
Thanks.

Hi AastaLLL,

Yes, the output log changes; it follows the batch size I used earlier for training.

How exactly can I print out the input buffer size?

A colleague took a look at it this week, and he had to set max_batch_size = 1000 to make it work…
But we now face another issue we are currently dealing with, though that one is related to our script rather than to the Jetson.

I do have a question, though, about this output log when I launch my inference script:

Using TensorFlow backend.
GPU is available!
[_DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 268435456), _DeviceAttributes(/job:localhost/replica:0/task:0/device:GPU:0, GPU, 3588841472)]

Especially the

device:GPU:0

What does it mean? I’ve seen the same output on other forums but with GPU:1 or even GPU:2. Is it the number of GPUs available, or the index of the GPU?
I’m pretty sure it uses the GPU for inference (there’s no reason why it wouldn’t), but I’d rather ask to be certain.

Thanks in advance

Hi,

device:GPU:0 indicates that the device is a GPU and that its index is 0.
On TX2 there is only one GPU, and it is always bound to index 0.
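You can list every device TensorFlow sees, and check that a GPU is among them, with the standard TF 1.x utilities (a minimal sketch):

import tensorflow as tf
from tensorflow.python.client import device_lib

# enumerate every device TensorFlow can use; each GPU gets its own index
for device in device_lib.list_local_devices():
    print(device.name)  # e.g. /device:CPU:0, /device:GPU:0

# True if at least one GPU is available to TensorFlow
print(tf.test.is_gpu_available())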

Thanks.