TF 1.15 engine restarts on predict and hangs

Hi,

I have been working on adding inference to a tkinter UI program using threading.

I had it working a while ago, but then I went down the path of trying to optimize with pure TRT. The documentation and help for that are so lacking that I am now trying to get back to a working model and begin again.

I am using a Jetson Nano with JetPack 4.3 and TF 1.15.

I built my own trainers and produced a .h5 file. I load this file, store the default graph, and make the prediction function before starting the thread that will make predictions.

Code initialised before the thread:

        model = tf.keras.models.load_model(g1.modelFile, compile = True)
        #model = tensorflow.saved_model.load(g1.modelFile, compile = False)
        
        print('Model Loaded..')

        try:

            modelGraph = tf.compat.v1.get_default_graph()
            print(f'Graph Stored: {modelGraph}')
        
        except (EnvironmentError, SystemError, TypeError, IOError, ValueError, AttributeError) as err:
            print(f'Error setting modelGraph: {err}')

        try:

            model._make_predict_function()
            print('prediction function made..')
        except (ValueError, TypeError, SystemError, SyntaxError, EnvironmentError, AttributeError) as err:
            print(f'Error making predict from model: {err}')

        
        # Create a threaded processImages()
        thread_pool_executor.submit(processImages)
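
A side note on the submit() call above: an exception raised inside a task started with ThreadPoolExecutor.submit() is stored on the returned Future rather than printed, and the except clauses in this post only list specific exception types. The sketch below shows one way to make such failures visible. The names process_images and report_failure are hypothetical stand-ins, not part of the real program, and this only surfaces Python exceptions. It would not explain a hang inside native code.

```python
from concurrent.futures import ThreadPoolExecutor
import traceback


def process_images():
    """Hypothetical stand-in for processImages(); it fails on purpose."""
    raise RuntimeError("simulated failure inside the worker thread")


def report_failure(future):
    """Print the traceback of any exception the submitted task raised."""
    exc = future.exception()  # None if the task finished cleanly
    if exc is not None:
        traceback.print_exception(type(exc), exc, exc.__traceback__)


with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(process_images)
    # Without this callback (or a call to future.result()), the error stays hidden.
    future.add_done_callback(report_failure)
```

In the real program the same callback could be attached to the Future returned by thread_pool_executor.submit(processImages).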

Code inside thread:

for i in img_Files:

        CATEGORIES = ['Vacant', 'Occupied']

        try:

            IMG_SIZE = 75
            print(f'Image print: {i}')
            img_path = str(i)
            print(f'img_path: {img_path}')

            try:
                img_array = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
                print(img_array.shape)
            except (ValueError, EnvironmentError, IOError, TypeError, SystemError, AttributeError) as err:
                print(f'Error opening image: {err}')

            img_array = img_array / 255.0
            new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
            print(f'New array shape: {new_array.shape}')
            infer_array = new_array.reshape(-1, IMG_SIZE, IMG_SIZE, 1)
            print(f'infer_array shape: {infer_array.shape}')

        except (ValueError, EnvironmentError, AttributeError, SystemError, TypeError, IOError) as err:
            print(f'Error predicting: {err}')

        try:

            with modelGraph.as_default():
                print('Graph opened..')
                try:
                    print('Predicting..')
                    prediction = model.predict_classes(infer_array)
                    print(f'Prediction: {prediction}')
                except (ValueError, EnvironmentError, AttributeError, SystemError, TypeError, IOError, SyntaxError) as err:
                    print(f'Error predicting: {err}')

        except (ValueError, EnvironmentError, AttributeError, SystemError, TypeError, IOError, SyntaxError) as err:
            print(f'Error predicting: {err}')

        '''try:
            print('predicting..')
            prediction = model.predict_classes(infer_array)
            print('predicted..')
        
        except (ValueError, EnvironmentError, AttributeError, SystemError, TypeError, IOError, SyntaxError) as err:
            print(f'Error predicting: {err}')'''

        print(f'Prediction: {prediction}')
        #print(int(prediction[0][0]))

        result = 0

        if prediction > 0.8:

            result = 1

        else:

            result = 0
            
        #print(CATEGORIES[int(prediction[0][0])])
        print(f'Status: {CATEGORIES[result]}')
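
One thing that may be worth checking in the code above: predict_classes returns class indices rather than probabilities, so the `> 0.8` comparison effectively tests whether the class is 1. A minimal sketch of mapping the index straight to a label follows. The helper name label_for is hypothetical, and the nested lists stand in for the array returned for a single image, assuming it is indexable as prediction[0][0] as the commented-out line suggests.

```python
CATEGORIES = ['Vacant', 'Occupied']


def label_for(prediction):
    """Map a predict_classes-style result for one image to a category name.

    Assumes prediction[0][0] holds the class index (0 or 1).
    """
    class_index = int(prediction[0][0])
    return CATEGORIES[class_index]


# Plain nested lists used as stand-ins for the model output.
print(f'Status: {label_for([[1]])}')
print(f'Status: {label_for([[0]])}')
```

If a confidence threshold such as 0.8 is really wanted, model.predict (which returns the raw probabilities) would be the call to threshold instead.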

This is what happens when it runs the prediction: it simply restarts the engine and then hangs until the application is completely frozen. It throws no errors.

2020-05-16 18:23:03.634763: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:950] ARM64 does not support NUMA - returning NUMA node zero
2020-05-16 18:23:03.634954: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 31 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/math_grad.py:1424: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Model Loaded..
Graph Stored: <tensorflow.python.framework.ops.Graph object at 0x7f529014e0>
prediction function made..
images written to folder
Image file is: /home/tom/Downloads/RAM_disk/10209.jpg
Image file is: /home/tom/Downloads/RAM_disk/10208.jpg
Image file is: /home/tom/Downloads/RAM_disk/10207.jpg
bay number: 10209
<class 'str'>
<class 'int'>
10209 is a Right Bay
Image print: /home/tom/Downloads/RAM_disk/10209.jpg
img_path: /home/tom/Downloads/RAM_disk/10209.jpg
(480, 640)
New array shape: (75, 75)
infer_array shape: (1, 75, 75, 1)
Graph opened..
Predicting..
2020-05-16 18:23:07.203678: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:950] ARM64 does not support NUMA - returning NUMA node zero
2020-05-16 18:23:07.203952: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216
pciBusID: 0000:00:00.0
2020-05-16 18:23:07.204113: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-05-16 18:23:07.204770: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-05-16 18:23:07.205129: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-05-16 18:23:07.205429: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-05-16 18:23:07.205635: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-05-16 18:23:07.205974: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-05-16 18:23:07.206353: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-16 18:23:07.206992: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:950] ARM64 does not support NUMA - returning NUMA node zero
2020-05-16 18:23:07.207757: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:950] ARM64 does not support NUMA - returning NUMA node zero
2020-05-16 18:23:07.207918: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-05-16 18:23:07.208031: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-16 18:23:07.208079: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-05-16 18:23:07.208110: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-05-16 18:23:07.208565: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:950] ARM64 does not support NUMA - returning NUMA node zero
2020-05-16 18:23:07.209064: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:950] ARM64 does not support NUMA - returning NUMA node zero
2020-05-16 18:23:07.209285: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 31 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
2020-05-16 18:23:07.483041: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0 

I know it's a bit difficult without all the code, but there is quite a lot of it.

The thing is, I can run inference with this code on its own just fine:

import cv2
import tensorflow as tf

CATEGORIES = ['Vacant', 'Occupied']

def prepare(filepath):

    IMG_SIZE = 75
    img_array = cv2.imread(filepath, cv2.IMREAD_GRAYSCALE)
    img_array = img_array / 255.0
    new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
    return new_array.reshape(-1, IMG_SIZE, IMG_SIZE, 1)

#model = tf.keras.models.load_model('TRT_Export/frozen_model-001.pb')
model = tf.keras.models.load_model('3conv-64nodes-2dense-CNN-003.h5', compile=False)
#model = tf.keras.models.load_model('outconv')

prediction = model.predict_classes([prepare('occ.jpg')])

print(f'Prediction: {prediction}')
#print(int(prediction[0][0]))

result = 0

if prediction > 0.8:

    result = 1

else:

    result = 0
    

print(f'Status: {CATEGORIES[result]}')

No errors, no problems. But when I integrate this into my other code I have a lot of issues. I found out about storing the graph and using it when making the prediction, as well as initialising the prediction function before starting the thread.
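
For context on why the graph has to be stored at all: TensorFlow's default graph is a property of the current thread, so a new thread does not see the graph the model was loaded into unless it is handed over explicitly. The sketch below illustrates that mechanism with plain threading.local and no TensorFlow. The names set_default, get_default and main_graph are hypothetical stand-ins, not TensorFlow APIs.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for TensorFlow's per-thread "default graph" slot.
_state = threading.local()


def set_default(graph):
    _state.graph = graph


def get_default():
    return getattr(_state, 'graph', None)


main_graph = object()    # stand-in for the graph the model was loaded into
set_default(main_graph)  # done in the main thread, like loading the model


def worker_without_handoff():
    # The worker thread has its own empty slot, so the main graph is not visible.
    return get_default() is main_graph


def worker_with_handoff(graph):
    # Passing the object in and activating it mirrors `with modelGraph.as_default():`.
    set_default(graph)
    return get_default() is main_graph


with ThreadPoolExecutor(max_workers=1) as executor:
    seen_without = executor.submit(worker_without_handoff).result()
    seen_with = executor.submit(worker_with_handoff, main_graph).result()

print(seen_without, seen_with)  # False True
```

It may be that the Keras session needs the same treatment as the graph (for example, capturing it before the thread starts and calling tf.compat.v1.keras.backend.set_session inside the worker). That would be consistent with the log showing the GPU device being created a second time at predict time, but it is an unverified assumption.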

Other than that, I am not sure what to do from here. Any help is much appreciated.

Thank you