Description
Hi,
I am new to NVIDIA tools. I am running the notebook example for training, testing, and exporting UNet Industrial to TF-TRT on the DAGM dataset, provided at UNet Industrial Inference Demo with TF-TRT.
I was able to run the whole pipeline and exported the checkpoint for inference with TF-TRT. I am running inference on a single image with the following code:
# inference with saved TF-TRT model
import horovod.tensorflow as hvd
import tensorflow as tf
from time import time

hvd.init()
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

start = time()
with tf.Session(graph=tf.Graph(), config=config) as sess:
    # Load the exported SavedModel (the timer above also covers this step)
    tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], SAVED_MODEL_DIR)
    nodes = [n.name for n in tf.get_default_graph().as_graph_def().node]
    # print(nodes)
    # Run inference on a single preprocessed image
    output = sess.run(["UNet_v1/sigmoid:0"], feed_dict={"input:0": img})
print(f'Time spent: {time() - start}')
I expected this to be very fast (since that is the premise of running inference in NGC containers), but it takes 60~70 seconds to run the inference.
What can I do to speed this up?
Is there another way to load my .pb model and run predictions?
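For reference, here is how I plan to separate the SavedModel loading time from the per-image sess.run time, in case the 60~70 s is dominated by loading and (as far as I understand) the TF-TRT engine build that can happen on the first run, rather than by the inference itself. This is just a rough timing sketch, not something I have verified yet; SAVED_MODEL_DIR and img are the same variables as above.
```
import horovod.tensorflow as hvd
import tensorflow as tf
from time import time

hvd.init()
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

with tf.Session(graph=tf.Graph(), config=config) as sess:
    # Time the SavedModel load separately from inference
    load_start = time()
    tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], SAVED_MODEL_DIR)
    print(f'Load time: {time() - load_start:.2f} s')

    # Warm-up: as far as I understand, the first run may trigger TF-TRT engine building
    sess.run(["UNet_v1/sigmoid:0"], feed_dict={"input:0": img})

    # Measure only the steady-state inference call
    n_runs = 10
    infer_start = time()
    for _ in range(n_runs):
        output = sess.run(["UNet_v1/sigmoid:0"], feed_dict={"input:0": img})
    print(f'Mean inference time: {(time() - infer_start) / n_runs * 1000:.2f} ms')
```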
The model is attached as a zip file.
TR-TRT-model-FP32.zip (6.5 MB)
Environment
GPU Type: NVIDIA RTX A6000
Container:
```
docker build . --rm -t unet_industrial:latest
```