Extremely long time to load TRT-optimized frozen TF graphs

I’m happy it helped!
I can’t comment on the TX2 or Xavier, as I’m still mainly using the DPX2, but I’d love it if NVIDIA sorted this out.

I apologize for not responding earlier. I too followed the steps provided by dariusz.filipski.

As we are deploying many TX2s, I put the steps into two scripts. They have not yet been tested on a second unit, but I hope I faithfully recorded what I did.

protobuf_build_part_one.sh

#!/bin/bash
if [[ $EUID -eq 0 ]]; then
   echo "This script must be run as a non-root user"
   exit 1
fi
# instructions from nvidia forum
# https://devtalk.nvidia.com/default/topic/1046492/tensorrt/extremely-long-time-to-load-trt-optimized-frozen-tf-graphs/post/5315675/#5315675
# download files
wget https://github.com/protocolbuffers/protobuf/releases/download/v3.6.1/protobuf-python-3.6.1.zip
wget https://github.com/protocolbuffers/protobuf/releases/download/v3.6.1/protoc-3.6.1-linux-aarch_64.zip
# unzip them
unzip protoc-3.6.1-linux-aarch_64.zip -d protoc-3.6.1
unzip protobuf-python-3.6.1.zip
# Update the protoc
sudo cp protoc-3.6.1/bin/protoc /usr/bin/protoc
# verify version number
echo "Verify version number"
protoc --version
# BUILD AND INSTALL THE LIBRARIES
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp
# stop here if the source directory is missing, rather than building in the wrong place
cd protobuf-3.6.1/ || exit 1
./autogen.sh
./configure
make
make check
sudo make install

# if old version exists these steps may be required
# Remove unnecessary links to the old version
#    sudo rm /usr/lib/aarch64-linux-gnu/libprotobuf.a
#    sudo rm /usr/lib/aarch64-linux-gnu/libprotobuf-lite.a
#    sudo rm /usr/lib/aarch64-linux-gnu/libprotobuf-lite.so
#    sudo rm /usr/lib/aarch64-linux-gnu/libprotobuf.so

# Move old version of the libraries to the same folder where the new ones have been installed, for clarity
#    sudo cp -d /usr/lib/aarch64-linux-gnu/libproto* /usr/local/lib/
#    sudo rm /usr/lib/aarch64-linux-gnu/libproto*

# Refresh shared library cache
sudo ldconfig


# Check the updated version
echo "Verify version number again"
protoc --version

# reboot -- then do part two
echo "reboot -- then do part two"

protobuf_build_part_two.sh

#!/bin/bash
if [[ $EUID -eq 0 ]]; then
   echo "This script must be run as a non-root user"
   exit 1
fi
# instructions from nvidia forum
# https://devtalk.nvidia.com/default/topic/1046492/tensorrt/extremely-long-time-to-load-trt-optimized-frozen-tf-graphs/post/5315675/#5315675

# this is part two, did you do part one ?
# BUILD AND INSTALL THE PYTHON-PROTOBUF MODULE
# stop here if part one was not run from this directory
cd protobuf-3.6.1/python/ || exit 1
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp

# Fix setup.py to force compilation with c++11 standard
echo "Fix setup.py to force c++11 standard"
echo "See protobuf_setup_diff.txt"
echo "BEWARE OF THE INDENT"
sleep 5
vi setup.py

# Build, test and install
sudo apt-get -y install python3-dev
python3 setup.py build --cpp_implementation
python3 setup.py test --cpp_implementation
sudo python3 setup.py install --cpp_implementation

# Make the cpp backend a default one when user logs in
sudo sh -c "echo 'export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp' >> /etc/profile.d/protobuf.sh"

If someone tries these, please let me know if they need adjustment.
Thanks.

Also thanks again to dariusz.filipski
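
A quick way to check whether part two took effect is to ask the installed module which backend it actually loaded. The following is a minimal sketch, not part of the original scripts. It assumes the internal module google.protobuf.internal.api_implementation is available in the installed protobuf version; the helper name protobuf_backend is made up for illustration.

```python
import importlib
import os


def protobuf_backend():
    """Return the backend name reported by protobuf (for example 'cpp' or
    'python'), or None if the protobuf package cannot be imported."""
    try:
        # Internal protobuf module; assumed present in the installed version.
        api = importlib.import_module(
            "google.protobuf.internal.api_implementation"
        )
    except ImportError:
        return None
    return api.Type()


if __name__ == "__main__":
    # The environment variable only requests a backend; Type() reports
    # what was really loaded.
    requested = os.environ.get("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION")
    print("requested backend:", requested)
    print("active backend:", protobuf_backend())
```

If the active backend is still reported as python after a fresh login, the cpp extension was probably not built or the environment variable is not being exported.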

I’ve verified Dariusz’s finding and put the solution into a script. Please check out my blog post for details: https://jkjung-avt.github.io/tf-trt-revisited/.

Big thanks to Dariusz for sharing this again.

Any help with Jetson Nano? The model loading really takes a long time.

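import tensorflow as tf  # TF 1.x API (tf.gfile, tf.GraphDef)
trt_graph = tf.GraphDef()  # empty message that receives the serialized graph
with tf.gfile.GFile('./ssd_mobilenet_v1_coco_trt.pb', 'rb') as pf:
    trt_graph.ParseFromString(pf.read())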
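
To see whether the time really goes into parsing rather than reading the file, the two steps can be timed separately. This is only a sketch: the helper name timed is made up for illustration, and the workload below is a stand-in so the snippet runs without TensorFlow installed.

```python
import time


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in workload: copy a buffer ("read") and measure its length
    # ("parse"). Substitute the real calls when TensorFlow is available.
    payload = b"\x00" * 1_000_000
    data, read_s = timed(bytes, payload)
    size, parse_s = timed(len, data)
    print(f"read: {read_s:.6f} s, parse: {parse_s:.6f} s")
```

With the snippet above, the real calls would be timed(pf.read) followed by timed(trt_graph.ParseFromString, data). If the second number dominates, the protobuf backend is the likely bottleneck.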

Were you able to make any .pb file work on the Jetson Nano, spsayakpaul?

I also need help with the Jetson Nano.

I don’t have a Nano to test with, but @jkjung13 clearly wrote on his blog that he verified it on one. So simply try following the steps in this thread or his great blog article at https://jkjung-avt.github.io/tf-trt-revisited/

Thanks to @dbusby, @dariusz.filipski, and all of you. It works on the TX2.

You’re welcome!

Note that the __init__.py file in the ~/.local/lib/python3.6/site-packages/protobuf-3.8.0-py3.6-linux-aarch64.egg/google/ folder must be removed, or else Python won’t be able to find the module google.protobuf. I found this necessary when installing with python3 setup.py install --user --cpp_implementation, which I used to avoid polluting the system installation of Python.
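
The file can be located before deleting anything. The sketch below only lists candidates and leaves the removal to you. The function name find_stray_init_files is made up for illustration, and the glob pattern assumes the egg directory follows the protobuf-*.egg naming shown in the path above.

```python
from pathlib import Path


def find_stray_init_files(site_packages):
    """Return google/__init__.py files inside protobuf-*.egg directories
    under the given site-packages folder (empty list if none are found)."""
    root = Path(site_packages)
    if not root.is_dir():
        return []
    return sorted(root.glob("protobuf-*.egg/google/__init__.py"))


if __name__ == "__main__":
    import site

    # Look in the per-user site-packages, where --user installs end up.
    for path in find_stray_init_files(site.getusersitepackages()):
        print("candidate for removal:", path)
```

After removing the file, python3 -c "import google.protobuf" should succeed if this was the cause.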

Besides the protobuf library, the problem can also come from TensorFlow’s CUDA compute capability support. I compiled TensorFlow 2.2.0 from source and ran into the same long model loading time. I investigated by running the script under strace -e trace=open,openat to see what was causing it, and found that the time was going into libnvidia-ptxjitcompiler.so.1. So it looks like JIT compilation was the cause. I expected the next run to be fine since it would use the cache, but that wasn’t the case.
I then recompiled TensorFlow with CUDA support and added the 6.1,7.0,7.5 CUDA compute capabilities (the default is only 3.5), and after that the long loading time went away. I ran this on GCP with a Tesla T4 and on my local GTX 1080 Ti.
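
The reasoning can be sketched as a small check. The Tesla T4 has compute capability 7.5 and the GTX 1080 Ti has 6.1, so a build that contains only 3.5 has no native code for either card and has to fall back to PTX JIT compilation. The helper names below are made up for illustration, and the matching rule (same major version, built minor not higher than the device minor) is a simplification of CUDA binary compatibility.

```python
def parse_capabilities(spec):
    """Parse a comma-separated list such as '6.1,7.0,7.5' into a list of
    (major, minor) integer tuples."""
    caps = []
    for item in spec.split(","):
        item = item.strip()
        if item:
            major, minor = item.split(".")
            caps.append((int(major), int(minor)))
    return caps


def has_native_code(device, built):
    """True if some built capability has the same major version as the
    device and a minor version that is not higher (simplified rule)."""
    return any(b[0] == device[0] and b[1] <= device[1] for b in built)


if __name__ == "__main__":
    devices = {"Tesla T4": (7, 5), "GTX 1080 Ti": (6, 1)}
    for spec in ("3.5", "6.1,7.0,7.5"):
        built = parse_capabilities(spec)
        for name, cap in devices.items():
            status = "native" if has_native_code(cap, built) else "PTX JIT"
            print(f"built for {spec}: {name} -> {status}")
```

When building TensorFlow from source, the list is usually supplied at the configure step, for example through the TF_CUDA_COMPUTE_CAPABILITIES environment variable; check the build documentation for your version.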