Hi
I am optimizing InceptionV4 with TensorRT, starting from the frozen graph file. I keep getting a "Low memory warning", and most of the time the Jetson TX2 crashes, so I want to understand how the memory is allocated. Here is my reasoning; please tell me if it is correct.
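# TF 1.x contrib module (assumed from the `trt` alias used below)
import tensorflow.contrib.tensorrt as trt

trt_graph_def = trt.create_inference_graph(
    input_graph_def=classifier_graph_def,
    outputs=['InceptionV4/Logits/Logits/BiasAdd'],
    max_batch_size=1,
    max_workspace_size_bytes=3*(10**9),
    precision_mode='FP16')  # precision_mode is passed as a string
print('Generated TensorRT graph def')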
The Jetson TX2 has 8 GB of RAM, which is shared between the CPU and the GPU.
- As I understand it, the batch size is the number of training examples in a single batch, which is why I do not understand this line: max_batch_size=1
- Here I am specifying 3 GB as the maximum GPU memory available to TensorRT: max_workspace_size_bytes=3*(10**9)
- And here I am giving 67% (about 5 GB) of the GPU memory to TensorFlow and leaving 33% (about 3 GB) for TensorRT: trt_gpu_ops = tf.GPUOptions(per_process_gpu_memory_fraction=0.67)
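
To check the arithmetic behind the reasoning above, here is a small sketch in plain Python. It rests on assumptions that are not confirmed anywhere in this post: that the 8 GB are decimal gigabytes, that the TensorFlow fraction applies to the whole 8 GB, and that the TensorRT workspace is a separate pool on top of TensorFlow's share. The helper `memory_budget` is a made-up name for illustration, not part of TensorFlow or TensorRT.

```python
GB = 10**9  # decimal gigabyte, matching the 3*(10**9) used for the workspace


def memory_budget(total_bytes, tf_fraction, trt_workspace_bytes):
    """Split total_bytes the way the post describes and report what is left.

    Assumption: TensorFlow's share and the TensorRT workspace are counted
    as two separate pools taken from the same shared RAM.
    """
    tf_bytes = round(total_bytes * tf_fraction)
    leftover = total_bytes - tf_bytes - trt_workspace_bytes
    return {
        "tensorflow": tf_bytes,
        "tensorrt_workspace": trt_workspace_bytes,
        "leftover": leftover,
    }


budget = memory_budget(8 * GB, 0.67, 3 * GB)
for name, n_bytes in budget.items():
    # A negative leftover means the two requests together exceed total RAM.
    print(f"{name:>20}: {n_bytes / GB:+.2f} GB")
```

Under those assumptions, 67% of 8 GB is 5.36 GB rather than 5 GB, and adding a 3 GB workspace comes to 8.36 GB, so the leftover is negative before the operating system and other CPU processes, which use the same shared RAM, are counted at all. Whether TensorRT really allocates its workspace outside TensorFlow's fraction is exactly the part I am unsure about.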