I'm using JetPack 4.4.1.
I ran
model.save('path')
to save the trained EfficientNet model, but I don't understand why the result is an HDF5 file instead of the SavedModel format. Do you know why?
In short, a SavedModel folder is never created.
Hi,
You may need root permission to create a folder under some directories.
Could you give it a try?
Thanks.
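For reference, a quick way to check the permission from Python itself (a minimal sketch of mine; the filename is made up and the path is the one used later in this thread):
〈permission_check.py — hypothetical helper, not part of the thread〉
import os

save_path = '/home/effi_test/effi_saved_model'

try:
    # creating the target folder up front surfaces permission problems early
    os.makedirs(save_path, exist_ok=True)
    print('writable:', os.access(save_path, os.W_OK))
except PermissionError as err:
    print('no permission to create the folder:', err)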
〈effi_model_save.py〉
import os
import sys
import numpy as np
from tensorflow.keras.preprocessing import image
from keras.applications.imagenet_utils import decode_predictions
from efficientnet.keras import EfficientNetB2
from efficientnet.keras import preprocess_input
import tensorflow as tf
from tensorflow import keras

# load EfficientNetB2 with ImageNet weights and save it
model = EfficientNetB2(weights='imagenet')
save_path = '/home/effi_test/effi_saved_model'
model.save(save_path)
sudo python3 effi_model_save.py
I ran it with sudo as above, but got the following error:
2021-03-08 00:57:31.962215: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.2'; dlerror: libcudart.so.10.2: cannot open shared object file: No such file or directory
2021-03-08 00:57:31.962311: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-03-08 00:57:31.962488: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.2'; dlerror: libcudart.so.10.2: cannot open shared object file: No such file or directory
2021-03-08 00:57:31.962529: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Segmentation fault
By the way, I am running inside a Docker container; /home/effi_test on the host is mounted to /home in the container.
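For context, a container start with that bind mount would look roughly like this (my reconstruction; the image name is taken from later in this thread and the exact flags may differ):
$ sudo docker run -it --runtime nvidia -v /home/effi_test:/home ceai3216217/j44_tf2.3.0:v1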
When I run
python3 effi_model_save.py
without sudo, the folder cannot be created and the model is saved as an HDF5 file instead.
Either way, how can I save the model in SavedModel format inside the container?
Hi,
The default save format in TF 1.x is HDF5.
Could you set save_format to 'tf' to see if it works?
From the documentation: save_format: Either 'tf' or 'h5', indicating whether to save the model to the TensorFlow SavedModel format or HDF5. Defaults to 'tf' in TF 2.X, and 'h5' in TF 1.X.
Thanks.
Try
model.save(save_path, save_format='tf')
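For what it's worth, here is a minimal sketch for checking which format was produced (the filename and the tiny model are mine, just to keep the check fast; a SavedModel is a directory, while 'h5' gives a single file):
〈format_check.py — hypothetical sketch〉
import os
import tensorflow as tf

# any tf.keras model works for this check; a tiny one keeps it fast
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

save_path = './format_check_model'
model.save(save_path, save_format='tf')  # force the SavedModel format

# a SavedModel directory contains saved_model.pb and a variables/ folder
print(os.listdir(save_path))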
By the way, looking at
https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/keras/applications
it says that the EfficientNet models in tf.keras.applications are supported in TF 2.x but not in TF 1.x. Since my environment is TF 1.15, instead of
from tensorflow.keras.applications import EfficientNetBx
I use
from efficientnet.keras import EfficientNetBx
Are they the same?
Thank you.
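For reference, these are different packages: efficientnet.keras comes from the third-party qubvel/efficientnet project, while tf.keras.applications only gained EfficientNet in TF 2.3, and their preprocess_input functions are not interchangeable (the tf.keras variant is a pass-through, since normalization happens inside the model). A version-tolerant import could look like this (my own sketch, not from the thread):

try:
    # available from TF 2.3 on
    from tensorflow.keras.applications import EfficientNetB2
    from tensorflow.keras.applications.efficientnet import preprocess_input
except ImportError:
    # third-party fallback for TF 1.x; note its preprocessing differs
    from efficientnet.keras import EfficientNetB2, preprocess_input

model = EfficientNetB2(weights='imagenet')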
My environment is JetPack 4.4.1 on a Xavier AGX.
The container is ceai3216217/j44_tf2.3.0:v1 (built by changing the wheel version in Dockerfile.tensorflow to tensorflow 2.3.0).
・Start the container and pip3 install pillow.
・Run pred.py: the time for 100 inferences was about 20 seconds.
・Save the trained model with effi_model_save.py.
・Convert the saved model with tf_convert.py.
・Finally run suiron_rt.py: the time for 100 inferences is 269 seconds, which is considerably slower than plain pred.py.
I don't know the cause.
〈pred.py〉
import os
import sys
import numpy as np
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.efficientnet import preprocess_input, decode_predictions
import tensorflow as tf
import time

model = tf.keras.applications.EfficientNetB2(weights='imagenet')

img_path = './panda.jpg'
img = image.load_img(img_path, target_size=(260, 260))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
x = tf.constant(x)

start = time.time()
for ii in range(100):
    y = model.predict(x, steps=1)
elapsed_time = time.time() - start
print("elapsed_time:{0}".format(elapsed_time) + "[sec]")
print(decode_predictions(y))
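A side note on this timing loop (my observation, not from the thread): model.predict adds per-call overhead for single-sample loops, so a direct call on the model is often a fairer baseline. Reusing model and x from pred.py above:

_ = model(x, training=False)  # warm-up call, excluded from the measurement

start = time.time()
for ii in range(100):
    y = model(x, training=False)
elapsed_time = time.time() - start
print("elapsed_time:{0}".format(elapsed_time) + "[sec]")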
〈effi_model_save.py〉
import os
import sys
import numpy as np
import tensorflow as tf

save_path = 'effi_saved_model'
model = tf.keras.applications.EfficientNetB2(weights='imagenet')
model.save(save_path)
〈tf_convert.py〉
import os
import sys
from tensorflow.python.compiler.tensorrt import trt_convert as trt

load_path = 'effi_saved_model'
save_path = 'effi_saved_model_TFTRT'

# convert the SavedModel to TF-TRT with FP16 precision
conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP16,
    max_workspace_size_bytes=8000000000)
converter = trt.TrtGraphConverterV2(input_saved_model_dir=load_path,
                                    conversion_params=conversion_params)
converter.convert()
converter.save(output_saved_model_dir=save_path)
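One possible factor behind the slow TF-TRT numbers (an assumption on my side, not confirmed in this thread): without converter.build(), the TensorRT engines are built lazily at the first inference call, so that cost can end up inside the timed loop. Pre-building with a representative input before saving would look like this in tf_convert.py:

import numpy as np

def input_fn():
    # one dummy batch with the EfficientNetB2 input shape (260x260 RGB)
    yield [np.zeros((1, 260, 260, 3), dtype=np.float32)]

converter.convert()
converter.build(input_fn=input_fn)  # build the TRT engines now, not at first inference
converter.save(output_saved_model_dir=save_path)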
〈suiron_rt.py〉
from __future__ import absolute_import, division, print_function, unicode_literals
import os
import sys
import numpy as np
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.efficientnet import preprocess_input, decode_predictions
import tensorflow as tf
import time
from tensorflow.python.compiler.tensorrt import trt_convert as trt
from tensorflow.python.saved_model import tag_constants

def predict_tftrt(input_saved_model):
    img_path = './panda.jpg'
    img = image.load_img(img_path, target_size=(260, 260))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    x = tf.constant(x)

    # load the TF-TRT SavedModel and grab its serving signature
    saved_model_loaded = tf.saved_model.load(input_saved_model, tags=[tag_constants.SERVING])
    signature_keys = list(saved_model_loaded.signatures.keys())
    #print(signature_keys)
    infer = saved_model_loaded.signatures['serving_default']
    #print(infer.structured_outputs)

    start = time.time()
    for ii in range(100):
        labeling = infer(x)
    elapsed_time = time.time() - start
    print("elapsed_time:{0}".format(elapsed_time) + "[sec]")

    preds = labeling['predictions'].numpy()
    print('{} - Predicted: {}'.format(img_path, decode_predictions(preds, top=3)[0]))

input_saved_model = "effi_saved_model_TFTRT"
predict_tftrt(input_saved_model)
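Along the same lines (a small change of mine, not in the original script): if the engines were not pre-built, the first infer(x) call includes the TensorRT engine build, so a warm-up call before starting the timer keeps that cost out of the measurement:

_ = infer(x)  # warm-up: engine build / first-call costs happen here

start = time.time()
for ii in range(100):
    labeling = infer(x)
elapsed_time = time.time() - start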
I don't understand why TF-TRT inference is this slow.
Hi,
Did you maximize the device performance first?
$ sudo nvpmodel -m 0
$ sudo jetson_clocks
Thanks
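To double-check that the mode actually took effect, the current setting can be queried with:
$ sudo nvpmodel -q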
I had already set the power mode to 0: MAXN as instructed, but it made no difference.
Hi,
Did you also fix the clocks to maximum with jetson_clocks?
To find the bottleneck, it's recommended to run your application with our profiler first.
https://developer.nvidia.com/tools-overview
Thanks.