Hi,
Here is my platform:
CentOS 7
P4
CUDA 10
Python 2.7
TensorFlow 1.15
I want to convert a model to INT8, so I send calibration data via:

def feed_dict_fn_1():
    return {'image_example:0': ['Hello']}

new_graph_def = converter.calibrate(fetch_names=preserve_nodes,
                                    num_runs=1,
                                    feed_dict_fn=feed_dict_fn_1)

but I get this error:
File "/usr/lib64/python2.7/site-packages/tensorflow_core/python/compiler/tensorrt/trt_convert.py", line 613, in calibrate
  fetches, feed_dict=feed_dict_fn() if feed_dict_fn else None)
File "/usr/lib64/python2.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
  run_metadata_ptr)
File "/usr/lib64/python2.7/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
  feed_dict_tensor, options, run_metadata)
File "/usr/lib64/python2.7/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
  run_metadata)
File "/usr/lib64/python2.7/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
  raise type(e)(node_def, op, message)
InvalidArgumentError: You must feed a value for placeholder tensor 'image_example' with dtype string
  [[node image_example (defined at usr/lib64/python2.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]
It seems that I am sending the calibration data, but the placeholder tensor fails to receive it.
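One thing worth checking here (a minimal sketch, not a confirmed fix): `session.run` feed-dict keys are tensor names of the form `<op_name>:<output_index>`, while the error message prints only the op name (`image_example`). The helper below just makes that convention explicit; the commented TF calls show how one might list the graph's placeholders to verify the feed key matches.

```python
# Sketch only: clarifies the tensor-name convention used by feed_dict keys.
def tensor_name(op_name, output_index=0):
    """Return the feed-dict key ("<op>:<index>") for a given op name."""
    return '%s:%d' % (op_name, output_index)

# In a TF 1.x graph (illustration only, not executed here):
#   placeholders = [n.name for n in original_graph_def.node if n.op == 'Placeholder']
#   print(placeholders)  # verify 'image_example' is really the op to feed
#   feed = {tensor_name('image_example'): ['Hello']}

print(tensor_name('image_example'))  # -> image_example:0
```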
Hi,
Thanks for your response. I can run calibration now, but I only feed one picture for calibration, and calibration still used 12 GB of GPU memory and failed with: DefaultLogger …/builder/cudnnCalibrator.cpp (703) - Cuda Error in add: 2 (out of memory)
Hi,
Thanks for your response. I set max_workspace_size_bytes=(1 << 30) * 4 in trt.TrtGraphConverter, and my GPU has 12 GB, but converter.calibrate uses all 12 GB and CUDA cannot allocate enough memory:
Cuda error in file src/implicit_gemm.cu at line 585: out of memory
Here is my code:
config = tf.ConfigProto(
    gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=0.50))
converter = trt.TrtGraphConverter(input_graph_def=original_graph_def,
                                  nodes_blacklist=preserve_nodes,
                                  session_config=config,
                                  max_batch_size=self.max_batch_size,
                                  max_workspace_size_bytes=(1 << 30) * 4,
                                  precision_mode="INT8",
                                  minimum_segment_size=self.minimum_segment_size,
                                  is_dynamic_op=self.is_dynamic_op,
                                  maximum_cached_engines=self.maximum_cached_engines,
                                  use_calibration=True)
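For reference, the memory budget in this configuration works out roughly as follows (a sketch; the 12 GiB total is an assumption based on the card mentioned in the thread, and TensorRT's build/calibration buffers are typically allocated with cudaMalloc, outside TensorFlow's capped pool):

```python
GIB = 1 << 30

max_workspace = (1 << 30) * 4      # max_workspace_size_bytes above -> 4 GiB
gpu_total = 12 * GIB               # assumed ~12 GiB card
tf_cap = int(gpu_total * 0.50)     # per_process_gpu_memory_fraction=0.50 -> 6 GiB

# If TensorRT allocates its workspace and FP32 calibration buffers outside
# TensorFlow's capped pool, peak usage can approach tf_cap + max_workspace
# plus the model weights, which can exhaust a 12 GiB budget.
print(tf_cap // GIB, max_workspace // GIB)  # -> 6 4
```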
Hi,
Can you try running your model in FP32 mode and check the memory usage?
Also, please share the model file so we can help better.
Thanks
Hi,
When I run my model in TRT-FP32 mode, the memory usage is 12045MiB / 12196MiB.
I am sorry that I can't share the model because of permission restrictions, but I get this error with several models. Thank you.
Hi,
It seems your model size is almost 12 GB.
For calibration, computation is done in FP32 and all weights also need to be FP32. If the weights need more memory than your GPU has, calibration will fail.
One option is to break the network down into pieces and calibrate them one by one, so that you calibrate smaller networks with less weight data; alternatively, use a GPU with more memory.
Thanks
Hi,
Thank you for your response. I think all 12 GB of the GPU is used because I allow TensorFlow to use the whole GPU, since I can also run the model on a 7 GB GPU. But I don't know my model size; do you mean that my model is too big to calibrate directly?
I have several models that fail to calibrate on a 12 GB GPU even though they run on a 7 GB GPU, so is there a simple way to solve this problem? Thank you very much.
Hi,
Without the model it's difficult to reproduce and debug the issue to find the root cause.
But here are a few things you can try:
- Set maximum_cached_engines to 1
- Create the model in static mode (is_dynamic_op=False)
- Break the network down into pieces and calibrate them one by one, so that you calibrate smaller networks
- You can also try TF 2.0. Please refer to this sample: https://github.com/tensorflow/tensorrt/blob/54c446be88db4e1ac76e56b438e1c6df79e6018d/tftrt/examples/image-classification/TF-TRT-inference-from-saved-model.ipynb
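The first two suggestions can be expressed as converter settings, sketched below (assuming the TF 1.15 trt.TrtGraphConverter API used earlier in the thread; only the arguments relevant to the suggestions are shown):

```python
# Hedged sketch of the suggested settings; pass these into the
# trt.TrtGraphConverter(...) call from the earlier post.
suggested_kwargs = dict(
    maximum_cached_engines=1,   # suggestion 1: keep a single cached engine per segment
    is_dynamic_op=False,        # suggestion 2: build engines statically, not at runtime
    precision_mode="INT8",
    use_calibration=True,
)

# converter = trt.TrtGraphConverter(input_graph_def=original_graph_def,
#                                   nodes_blacklist=preserve_nodes,
#                                   **suggested_kwargs)
print(sorted(suggested_kwargs))
```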
Thanks