- We converted a Mask RCNN model to a TF SavedModel. The .pb file is huge: 255 MB.
- We tried to run it with tensorflow-gpu (version 1.13.1+nv19.3) on a Jetson Xavier with 16 GB RAM.
- With os.environ["CUDA_VISIBLE_DEVICES"]="0" the process gets killed after some time.
- I noticed that memory usage (15+ GB) goes to 100% before it is killed.
- With os.environ["CUDA_VISIBLE_DEVICES"]="1" it runs fine but takes a long time (that's not what we want).
- It runs fine on a normal GPU desktop with a GeForce GTX 1060, 16 GB RAM and a Core i7.
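To pin down the observation that memory fills up before the kill, it may help to log available memory at a few points in the script. On Jetson boards the CPU and GPU share the same physical RAM, so GPU allocations show up in the system memory figures. This is a minimal sketch that assumes a Linux-style /proc/meminfo; the helper names parse_meminfo, available_fraction and read_meminfo are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: integer value}.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_fraction(info):
    """Fraction of total RAM that the kernel reports as still available."""
    return info["MemAvailable"] / float(info["MemTotal"])


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the memory statistics file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Printing available_fraction(read_meminfo()) right after the model is loaded and again just before the first sess.run() call should show which step consumes the memory.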
Code below:
import os
import time

import numpy as np
import tensorflow as tf

# Select the GPU before TensorFlow creates a session.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

export_path = "./serving_model/1"

config = tf.ConfigProto()
# config.gpu_options.per_process_gpu_memory_fraction = 0.4
config.log_device_placement = True
config.allow_soft_placement = True

# Tried tf.device('/gpu:0') and os.environ["CUDA_VISIBLE_DEVICES"]="0"
# both together and one at a time: no success.
with tf.device('/gpu:0'):
    with tf.Session(config=config) as sess:
        # loader.load() imports the graph and restores the saved variables.
        loaded = tf.saved_model.loader.load(sess, ["serve"], export_path)
        graph = tf.get_default_graph()
        # Disabled: running the initializer after loading would reset the
        # restored weights to their initial values.
        # sess.run(tf.global_variables_initializer())
        print('session run initialized..')

        opt = []
        x_tensor1 = graph.get_tensor_by_name("input_image:0")
        x_tensor2 = graph.get_tensor_by_name("input_image_meta:0")
        x_tensor3 = graph.get_tensor_by_name("input_anchors:0")
        op_to_restore = graph.get_tensor_by_name("mrcnn_detection/Reshape_1:0")

        # Dummy inputs; np.float32 replaces the deprecated np.float alias.
        x_test1 = np.zeros(shape=(1, 1024, 1024, 3), dtype=np.float32)
        x_test2 = np.zeros(shape=(1, 14), dtype=np.float32)
        x_test3 = np.zeros(shape=(1, 261888, 4), dtype=np.float32)
        feed_dict = {x_tensor1: x_test1, x_tensor2: x_test2, x_tensor3: x_test3}

        print('before loop')
        for i in range(1):
            t1 = time.time()
            opt = sess.run(op_to_restore, feed_dict)
            t2 = time.time()
            print('i = {} time taken: {}'.format(i, (t2 - t1)))
        print(opt)
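
By default TensorFlow 1.x tries to reserve nearly all GPU memory for the process. On a Jetson that memory is the same 16 GB the CPU uses, so this may be what pushes the system to 100%. Two standard tf.ConfigProto GPU options limit this: allow_growth and per_process_gpu_memory_fraction. The following is a sketch under those assumptions, not a verified fix; the helper names and the 4 GB reserve in the usage note are hypothetical.

```python
def gpu_memory_fraction(total_mb, reserve_mb):
    """Fraction of memory to offer TensorFlow, leaving reserve_mb for the OS.

    The result is clamped to [0.1, 1.0] so that an overly large reserve
    still yields a usable value.
    """
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    fraction = (total_mb - reserve_mb) / float(total_mb)
    return max(0.1, min(1.0, fraction))


def make_session_config(fraction):
    """Build a TF 1.x session config that caps GPU memory use."""
    # Imported here so gpu_memory_fraction() can be used without TensorFlow.
    import tensorflow as tf

    config = tf.ConfigProto()
    # Allocate GPU memory on demand instead of grabbing it all up front.
    config.gpu_options.allow_growth = True
    # Upper bound on the share of GPU memory this process may use.
    config.gpu_options.per_process_gpu_memory_fraction = fraction
    config.allow_soft_placement = True
    return config
```

For example, config = make_session_config(gpu_memory_fraction(16384, 4096)) would cap TensorFlow at 75% of a 16 GB board, and the result can be passed to tf.Session(config=config) in place of the config above.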