Building a network takes too long


Hi, I am trying to build a U-Net like the one here (GitHub - milesial/Pytorch-UNet: PyTorch implementation of the U-Net for image semantic segmentation with high quality images) by compiling it and saving the serialized TensorRT engine. However, the process is too slow: it takes 45 minutes at 2048x2048 resolution. Is there any way to speed up the network serialization?


TensorRT Version: TensorRT-
GPU Type: NVIDIA GeForce GTX 1660 Ti with Max-Q Design
Nvidia Driver Version:
CUDA Version: 11
CUDNN Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi @oelgendy1,

We request you to provide more details. Could you please let us know how you are building the TensorRT engine?

Thank you.

Thanks @spolisetty for your reply. I am using the same method as in sampleOnnxMNIST.cpp.

I am using the same build parameters as in the sample code. Then I save the serialized engine using:

    // Serialize the built engine and write the blob to disk
    nvinfer1::IHostMemory* serializedModel = mEngine->serialize();
    std::ofstream ofs(engineName, std::ios::out | std::ios::binary);
    ofs.write(static_cast<const char*>(serializedModel->data()), serializedModel->size());
    serializedModel->destroy();  // free the host buffer once written (pre-TensorRT 8 API)

I also tried to use trtexec.exe and it is very slow as well. It takes 45 minutes on my machine to build a U-Net engine for a 2048x2048 image!

Does the workspace size affect the network build time? And if I reduce it, will I get a sub-optimal serialized engine?

Hi @oelgendy1,

Could you please share the ONNX model with us so we can try it on our end? Meanwhile, we request you to check the GPU utilization during serialization. Also, please refer to the TensorRT FAQ (section "How do I choose the optimal workspace size").

Thank you.


Thanks @spolisetty for your reply. I tried another GPU and the network build is faster (10 minutes). It highly depends on the GPU.

Hi @oelgendy1,

We also recommend you refer to Best Practices For TensorRT Performance :: NVIDIA Deep Learning TensorRT Documentation.

Thank you.