model.save('path') does not produce the SavedModel format

I’m using JetPack 4.4.1.
I ran
model.save('path')
to save the trained EfficientNet model, but the result is an HDF5 file instead of the SavedModel format, and I don’t understand why. Do you know the reason?
In short, the SavedModel folder is not created.

Hi,

You may need root privileges to create a folder under certain directories.
Could you give it a try?

Thanks.

〈effi_model_save.py〉

import os
import sys
import numpy as np
from tensorflow.keras.preprocessing import image
from keras.applications.imagenet_utils import decode_predictions

from efficientnet.keras import EfficientNetB2
from efficientnet.keras import preprocess_input
import tensorflow as tf
from tensorflow import keras

model = EfficientNetB2(weights='imagenet')
save_path = '/home/effi_test/effi_saved_model'
model.save(save_path)

sudo python3 effi_model_save.py

I ran it with sudo as above, but got the following error:

2021-03-08 00:57:31.962215: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.2'; dlerror: libcudart.so.10.2: cannot open shared object file: No such file or directory
2021-03-08 00:57:31.962311: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-03-08 00:57:31.962488: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.2'; dlerror: libcudart.so.10.2: cannot open shared object file: No such file or directory
2021-03-08 00:57:31.962529: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Segmentation fault

By the way, I am using a Docker container. The host directory /home/effi_test is mounted at /home in the container.

When I instead run it without sudo,

python3 effi_model_save.py

the folder is not created and the model is saved as an HDF5 file.
How can I save it in SavedModel format inside the container?

Hi,

The default save format in TF 1.x is HDF5.
Could you set save_format to 'tf' and see if it works?

save_format: Either 'tf' or 'h5', indicating whether to save the model to Tensorflow SavedModel or HDF5. Defaults to 'tf' in TF 2.X, and 'h5' in TF 1.X.

Thanks.

Try

model.save(save_path, save_format='tf')
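
To confirm which format actually ended up on disk after saving, a small standard-library check can help. This is only a sketch: the helper name detect_saved_format is hypothetical, and it assumes that a SavedModel is a directory containing saved_model.pb (or saved_model.pbtxt) and that an HDF5 file written by Keras starts with the standard 8-byte HDF5 signature.

```python
import os

# HDF5 files normally begin with this 8-byte format signature.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def detect_saved_format(path):
    """Return 'savedmodel', 'hdf5', or 'unknown' for whatever was written at `path`.

    Hypothetical helper for illustration; it is not part of TensorFlow.
    """
    if os.path.isdir(path):
        # A SavedModel is a directory holding the serialized graph file.
        for name in ("saved_model.pb", "saved_model.pbtxt"):
            if os.path.isfile(os.path.join(path, name)):
                return "savedmodel"
        return "unknown"
    if os.path.isfile(path):
        # A single file: compare its first 8 bytes with the HDF5 signature.
        with open(path, "rb") as f:
            if f.read(8) == HDF5_SIGNATURE:
                return "hdf5"
    return "unknown"
```

For example, calling detect_saved_format(save_path) right after model.save(...) shows whether a SavedModel directory or a single HDF5 file was produced.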

By the way, according to
https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/keras/applications
EfficientNet in tf.keras.applications is not supported in TF 1.x, only in TF 2.x. Since my environment is TF 1.15, I cannot use

from tensorflow.keras.applications import EfficientNetBx

so I use

from efficientnet.keras import EfficientNetBx

instead.
Are the two equivalent?

Thank you.

My environment is JetPack 4.4.1 on a Xavier AGX.
The container image is ceai3216217/j44_tf2.3.0:v1 (built by changing the wheel version in Dockerfile.tensorflow to TensorFlow 2.3.0).

・Start the container and run pip3 install pillow.
・Run pred.py. 100 inferences took about 20 seconds.
・Save the trained model with effi_model_save.py.
・Convert the saved model with tf_convert.py.
・Finally, run suiron_rt.py. 100 inferences took 269 seconds, which is considerably slower than the plain pred.py.
I don’t know the cause.

pred.py

import os
import sys
import numpy as np
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.efficientnet import preprocess_input, decode_predictions
import tensorflow as tf
import time

model = tf.keras.applications.EfficientNetB2(weights='imagenet')

# Load one image and preprocess it to the 260x260 input used for EfficientNetB2
img_path = './panda.jpg'
img = image.load_img(img_path, target_size=(260, 260))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
x = tf.constant(x)

# Time 100 consecutive predictions on the same input
start = time.time()
for ii in range(100):
    y = model.predict(x, steps=1)
elapsed_time = time.time() - start
print("elapsed_time:{0}".format(elapsed_time) + "[sec]")

print(decode_predictions(y))

effi_model_save.py

import os
import sys
import numpy as np
import tensorflow as tf

save_path = 'effi_saved_model'

model = tf.keras.applications.EfficientNetB2(weights='imagenet')
model.save(save_path)

tf_convert.py

import os
import sys
from tensorflow.python.compiler.tensorrt import trt_convert as trt

load_path = 'effi_saved_model'
save_path = 'effi_saved_model_TFTRT'

conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP16,
    max_workspace_size_bytes=8000000000)
converter = trt.TrtGraphConverterV2(input_saved_model_dir=load_path, conversion_params=conversion_params)
converter.convert()
converter.save(output_saved_model_dir=save_path)

suiron_rt.py

from __future__ import absolute_import, division, print_function, unicode_literals
import os
import sys
import numpy as np
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.efficientnet import preprocess_input, decode_predictions
import tensorflow as tf
import time

from tensorflow.python.compiler.tensorrt import trt_convert as trt
from tensorflow.python.saved_model import tag_constants

def predict_tftrt(input_saved_model):
    # Load one image and preprocess it to the 260x260 input used for EfficientNetB2
    img_path = './panda.jpg'
    img = image.load_img(img_path, target_size=(260, 260))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    x = tf.constant(x)

    # Load the TF-TRT converted SavedModel and pick its serving signature
    saved_model_loaded = tf.saved_model.load(input_saved_model, tags=[tag_constants.SERVING])
    signature_keys = list(saved_model_loaded.signatures.keys())
    #print(signature_keys)

    infer = saved_model_loaded.signatures['serving_default']
    #print(infer.structured_outputs)

    # Time 100 consecutive inferences on the same input
    start = time.time()
    for ii in range(100):
        labeling = infer(x)

    elapsed_time = time.time() - start
    print("elapsed_time:{0}".format(elapsed_time) + "[sec]")

    preds = labeling['predictions'].numpy()
    print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))

input_saved_model = "effi_saved_model_TFTRT"
predict_tftrt(input_saved_model)

I don’t understand why TF-TRT inference is so slow.
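
One thing that may be worth ruling out is whether one-time start-up cost is included in the 100 timed calls. The sketch below separates a few untimed warm-up calls from the measured loop. It is an assumption, not something confirmed in this thread, that the first calls to a TF-TRT model are much slower than later ones (for example while TensorRT engines are being built). The helper name time_inference and its warmup/runs parameters are hypothetical.

```python
import time


def time_inference(infer_fn, x, warmup=10, runs=100):
    """Call infer_fn(x) `warmup` times, then time `runs` further calls.

    Returns (warmup_seconds, timed_seconds, last_result).
    Hypothetical helper for illustration only.
    """
    result = None

    # Warm-up phase: any one-time initialization cost lands here.
    t0 = time.perf_counter()
    for _ in range(warmup):
        result = infer_fn(x)
    warmup_seconds = time.perf_counter() - t0

    # Measured phase: steady-state calls only.
    t1 = time.perf_counter()
    for _ in range(runs):
        result = infer_fn(x)
    timed_seconds = time.perf_counter() - t1

    return warmup_seconds, timed_seconds, result
```

In suiron_rt.py this could be called as time_inference(infer, x) in place of the hand-written loop. If the warm-up time dominates, the slowdown is mostly initialization rather than steady-state inference.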

Hi,

Did you maximize the device performance first?

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

Thanks

The power mode was already set to 0 (MAXN) as instructed, but it had no effect.

Hi,

Did you also lock the clocks at maximum with jetson_clocks?
To find the bottleneck, we recommend running your application with our profiler first.

Thanks.