How to set parameters when using TensorRT to optimize InceptionV4 on a Jetson TX2?

Hi

I am optimizing InceptionV4 with TensorRT, which means I am working from the frozen graph file. I am getting a “Low memory warning”, and most of the time the Jetson TX2 crashes. So I want to understand how the memory is allocated. Here is my reasoning, please tell me if it is OK.

import tensorflow.contrib.tensorrt as trt

trt_graph_def = trt.create_inference_graph(
        input_graph_def=classifier_graph_def,
        outputs=['InceptionV4/Logits/Logits/BiasAdd'],
        max_batch_size=1,
        max_workspace_size_bytes=3*(10**9),
        precision_mode='FP16')
print('Generated TensorRT graph def')

The Jetson TX2 has 8 GB of RAM, shared between the CPU and GPU.

  • A batch size is the number of examples processed in a single batch, which is why I do not understand this line: max_batch_size=1
  • Here I am specifying 3 GB as the maximum GPU memory available to TensorRT as workspace: max_workspace_size_bytes=3*(10**9)
  • And here I am giving 67% (5 GB) of GPU memory to TensorFlow and 33% (3 GB) to TensorRT: trt_gpu_ops=tf.GPUOptions(per_process_gpu_memory_fraction=0.67) (see the sketch after this list)
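
For reference, here is how those two knobs fit together in one script. This is only a minimal sketch assuming TensorFlow 1.x with the contrib TF-TRT module; the frozen-graph path is a placeholder, and the output node name is taken from the snippet above.

import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

# Cap TensorFlow's share of the shared 8 GB so TensorRT has room for its workspace.
trt_gpu_ops = tf.GPUOptions(per_process_gpu_memory_fraction=0.67)

# Load the frozen InceptionV4 graph (placeholder path).
with tf.gfile.GFile('inception_v4_frozen.pb', 'rb') as f:
    classifier_graph_def = tf.GraphDef()
    classifier_graph_def.ParseFromString(f.read())

# max_batch_size is the largest batch the engine will accept at inference time,
# not a training setting; 1 means one image per inference call.
trt_graph_def = trt.create_inference_graph(
    input_graph_def=classifier_graph_def,
    outputs=['InceptionV4/Logits/Logits/BiasAdd'],
    max_batch_size=1,
    max_workspace_size_bytes=3 * (10**9),
    precision_mode='FP16')

# The memory fraction only takes effect when it is passed to the session config.
with tf.Session(config=tf.ConfigProto(gpu_options=trt_gpu_ops)) as sess:
    tf.import_graph_def(trt_graph_def, name='')
    # ... run inference here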

Hi,

You can set a smaller workspace.
Usually we set it from 16 MiB to 1 GiB, depending on the model size.
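
For example, dropping the workspace from 3 GB to 1 GiB only changes one argument. A sketch assuming the same contrib TF-TRT call from the question:

trt_graph_def = trt.create_inference_graph(
    input_graph_def=classifier_graph_def,
    outputs=['InceptionV4/Logits/Logits/BiasAdd'],
    max_batch_size=1,
    max_workspace_size_bytes=1 << 30,   # 1 GiB; you can go as low as 1 << 24 (16 MiB)
    precision_mode='FP16')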

Another suggestion is to use pure TensorRT instead.
It’s known that TensorFlow might duplicate the memory for the entire model in some older versions.

Using pure TensorRT can save you a lot of the memory that TF would otherwise allocate.
A sample can be found here:
/usr/src/tensorrt/samples/sampleUffSSD
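
In Python, the legacy UFF workflow looks roughly like this. This is only a sketch assuming a TensorRT 5/6-era JetPack with the uff package installed; the input node name and shape for InceptionV4 are assumptions and should be checked against your frozen graph.

import uff
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Convert the frozen TensorFlow graph to UFF (output node taken from the question).
uff_model = uff.from_tensorflow_frozen_model(
    'inception_v4_frozen.pb',                       # placeholder path
    ['InceptionV4/Logits/Logits/BiasAdd'])

with trt.Builder(TRT_LOGGER) as builder, \
        builder.create_network() as network, \
        trt.UffParser() as parser:
    # Input name/shape are assumptions; verify them in the frozen graph.
    parser.register_input('input', (3, 299, 299))
    parser.register_output('InceptionV4/Logits/Logits/BiasAdd')
    parser.parse_buffer(uff_model, network)

    builder.max_batch_size = 1
    builder.max_workspace_size = 1 << 30   # 1 GiB workspace
    builder.fp16_mode = True               # the TX2 GPU supports fast FP16

    engine = builder.build_cuda_engine(network)
    with open('inception_v4_fp16.engine', 'wb') as f:
        f.write(engine.serialize())

The serialized engine can then be loaded with a trt.Runtime and run without keeping TensorFlow in memory at all.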

Thanks.