TensorRT optimization of Keras model on Jetson TX2

Hi all,

I am currently developing a CNN-based litter detection system to run live on a Jetson TX2. However, the frame rate is currently very low, so I’ve been looking into accelerating the trained model with TensorRT. The model is trained and run in Keras.
Having worked through various other guides and help topics, I think I have managed to write a Python program to freeze/convert my .h5 Keras model into a .pb, but I’m now running into issues when converting that into a UFF file. I get various “Warning: No conversion function registered for layer” messages, which I have no idea how to resolve. This is my current output:
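For reference, the freeze/convert pipeline I wrote follows the usual TF 1.x recipe, roughly like this (file names and paths are placeholders for my own model; this is a sketch, not the exact script):

```python
# Sketch: freeze a Keras .h5 model to a .pb and convert it to UFF.
# Assumes TF 1.x with Keras and the `uff` package shipped with TensorRT.
import uff
from keras import backend as K
from keras.models import load_model
from tensorflow.python.framework import graph_io, graph_util

model = load_model('model.h5')            # placeholder path
output_name = model.output.op.name        # e.g. 'activation_10/Softmax'

sess = K.get_session()
# Bake variables into constants so the graph is self-contained.
frozen = graph_util.convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), [output_name])
graph_io.write_graph(frozen, '.', 'model.pb', as_text=False)

# Convert the frozen graph to UFF for the TensorRT parser.
uff_model = uff.from_tensorflow_frozen_model('model.pb', [output_name])
```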

2018-08-02 13:14:54.511053: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-08-02 13:14:54.561749: W tensorflow/stream_executor/cuda/cuda_driver.cc:513] A non-primary context 0x465c8d0 for device 0 exists before initializing the StreamExecutor. The primary context is now 0x4621ac0. We haven't verified StreamExecutor works with that.
2018-08-02 13:14:54.561939: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:897] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-08-02 13:14:54.562293: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties: 
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.62
pciBusID: 0000:01:00.0
totalMemory: 10.92GiB freeMemory: 10.29GiB
2018-08-02 13:14:54.562304: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-08-02 13:14:54.805068: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-02 13:14:54.805107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-08-02 13:14:54.805113: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-08-02 13:14:54.805288: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9925 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-08-02 13:14:56.881180: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-08-02 13:14:56.881216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-02 13:14:56.881222: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958]      0 
2018-08-02 13:14:56.881226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0:   N 
2018-08-02 13:14:56.881307: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9925 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
[TensorRT] ERROR: UFFParser: Validator error: dropout_1/cond/Switch: Unsupported operation _Switch
[TensorRT] ERROR: Failed to parse UFF model stream
/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
Using TensorFlow backend.
/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
  File "/usr/lib/python3.5/dist-packages/tensorrt/utils/_utils.py", line 255, in uff_to_trt_engine
    assert(parser.parse(stream, network, model_datatype))
Using output node activation_10/Softmax
Converting to UFF graph
Warning: No conversion function registered for layer: Merge yet.
Converting as custom op Merge dropout_3/cond/Merge
name: "dropout_3/cond/Merge"
op: "Merge"
input: "dropout_3/cond/Switch_1"
input: "dropout_3/cond/dropout/mul"
attr {
  key: "N"
  value {
    i: 2
  }
}
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}

Warning: No conversion function registered for layer: Floor yet.
Converting as custom op Floor dropout_3/cond/dropout/Floor
name: "dropout_3/cond/dropout/Floor"
op: "Floor"
input: "dropout_3/cond/dropout/add"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}

Warning: No conversion function registered for layer: Switch yet.
Converting as custom op Switch dropout_3/cond/Switch
name: "dropout_3/cond/Switch"
op: "Switch"
input: "dropout_1/keras_learning_phase"
input: "dropout_1/keras_learning_phase"
attr {
  key: "T"
  value {
    type: DT_BOOL
  }
}

Warning: No conversion function registered for layer: PlaceholderWithDefault yet.
Converting as custom op PlaceholderWithDefault dropout_1/keras_learning_phase
name: "dropout_1/keras_learning_phase"
op: "PlaceholderWithDefault"
input: "dropout_1/keras_learning_phase/input"
attr {
  key: "dtype"
  value {
    type: DT_BOOL
  }
}
attr {
  key: "shape"
  value {
    shape {
    }
  }
}

Warning: No conversion function registered for layer: RandomUniform yet.
Converting as custom op RandomUniform dropout_3/cond/dropout/random_uniform/RandomUniform
name: "dropout_3/cond/dropout/random_uniform/RandomUniform"
op: "RandomUniform"
input: "dropout_3/cond/dropout/Shape"
attr {
  key: "T"
  value {
    type: DT_INT32
  }
}
attr {
  key: "dtype"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "seed"
  value {
    i: 87654321
  }
}
attr {
  key: "seed2"
  value {
    i: 2989234
  }
}

Warning: No conversion function registered for layer: Switch yet.
Converting as custom op Switch dropout_3/cond/mul/Switch
name: "dropout_3/cond/mul/Switch"
op: "Switch"
input: "activation_9/Relu"
input: "dropout_1/keras_learning_phase"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "_class"
  value {
    list {
      s: "loc:@activation_9/Relu"
    }
  }
}

DEBUG: convert reshape to flatten node
Warning: No conversion function registered for layer: Merge yet.
Converting as custom op Merge dropout_2/cond/Merge
name: "dropout_2/cond/Merge"
op: "Merge"
input: "dropout_2/cond/Switch_1"
input: "dropout_2/cond/dropout/mul"
attr {
  key: "N"
  value {
    i: 2
  }
}
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}

Warning: No conversion function registered for layer: Floor yet.
Converting as custom op Floor dropout_2/cond/dropout/Floor
name: "dropout_2/cond/dropout/Floor"
op: "Floor"
input: "dropout_2/cond/dropout/add"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}

Warning: No conversion function registered for layer: Switch yet.
Converting as custom op Switch dropout_2/cond/Switch
name: "dropout_2/cond/Switch"
op: "Switch"
input: "dropout_1/keras_learning_phase"
input: "dropout_1/keras_learning_phase"
attr {
  key: "T"
  value {
    type: DT_BOOL
  }
}

Warning: No conversion function registered for layer: RandomUniform yet.
Converting as custom op RandomUniform dropout_2/cond/dropout/random_uniform/RandomUniform
name: "dropout_2/cond/dropout/random_uniform/RandomUniform"
op: "RandomUniform"
input: "dropout_2/cond/dropout/Shape"
attr {
  key: "T"
  value {
    type: DT_INT32
  }
}
attr {
  key: "dtype"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "seed"
  value {
    i: 87654321
  }
}
attr {
  key: "seed2"
  value {
    i: 7273518
  }
}

Warning: No conversion function registered for layer: Switch yet.
Converting as custom op Switch dropout_2/cond/mul/Switch
name: "dropout_2/cond/mul/Switch"
op: "Switch"
input: "max_pooling2d_3/MaxPool"
input: "dropout_1/keras_learning_phase"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "_class"
  value {
    list {
      s: "loc:@max_pooling2d_3/MaxPool"
    }
  }
}

Warning: No conversion function registered for layer: Merge yet.
Converting as custom op Merge dropout_1/cond/Merge
name: "dropout_1/cond/Merge"
op: "Merge"
input: "dropout_1/cond/Switch_1"
input: "dropout_1/cond/dropout/mul"
attr {
  key: "N"
  value {
    i: 2
  }
}
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}

Warning: No conversion function registered for layer: Floor yet.
Converting as custom op Floor dropout_1/cond/dropout/Floor
name: "dropout_1/cond/dropout/Floor"
op: "Floor"
input: "dropout_1/cond/dropout/add"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}

Warning: No conversion function registered for layer: Switch yet.
Converting as custom op Switch dropout_1/cond/Switch
name: "dropout_1/cond/Switch"
op: "Switch"
input: "dropout_1/keras_learning_phase"
input: "dropout_1/keras_learning_phase"
attr {
  key: "T"
  value {
    type: DT_BOOL
  }
}

Warning: No conversion function registered for layer: RandomUniform yet.
Converting as custom op RandomUniform dropout_1/cond/dropout/random_uniform/RandomUniform
name: "dropout_1/cond/dropout/random_uniform/RandomUniform"
op: "RandomUniform"
input: "dropout_1/cond/dropout/Shape"
attr {
  key: "T"
  value {
    type: DT_INT32
  }
}
attr {
  key: "dtype"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "seed"
  value {
    i: 87654321
  }
}
attr {
  key: "seed2"
  value {
    i: 1395036
  }
}

Warning: No conversion function registered for layer: Switch yet.
Converting as custom op Switch dropout_1/cond/mul/Switch
name: "dropout_1/cond/mul/Switch"
op: "Switch"
input: "max_pooling2d_2/MaxPool"
input: "dropout_1/keras_learning_phase"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "_class"
  value {
    list {
      s: "loc:@max_pooling2d_2/MaxPool"
    }
  }
}

Warning: No conversion function registered for layer: Switch yet.
Converting as custom op Switch dropout_1/cond/Switch_1
name: "dropout_1/cond/Switch_1"
op: "Switch"
input: "max_pooling2d_2/MaxPool"
input: "dropout_1/keras_learning_phase"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "_class"
  value {
    list {
      s: "loc:@max_pooling2d_2/MaxPool"
    }
  }
}

Warning: No conversion function registered for layer: Switch yet.
Converting as custom op Switch dropout_2/cond/Switch_1
name: "dropout_2/cond/Switch_1"
op: "Switch"
input: "max_pooling2d_3/MaxPool"
input: "dropout_1/keras_learning_phase"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "_class"
  value {
    list {
      s: "loc:@max_pooling2d_3/MaxPool"
    }
  }
}

Warning: No conversion function registered for layer: Switch yet.
Converting as custom op Switch dropout_3/cond/Switch_1
name: "dropout_3/cond/Switch_1"
op: "Switch"
input: "activation_9/Relu"
input: "dropout_1/keras_learning_phase"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "_class"
  value {
    list {
      s: "loc:@activation_9/Relu"
    }
  }
}

No. nodes: 117
Traceback (most recent call last):
  File "/usr/lib/python3.5/dist-packages/tensorrt/utils/_utils.py", line 255, in uff_to_trt_engine
    assert(parser.parse(stream, network, model_datatype))
AssertionError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "trt_modelconvert.py", line 134, in <module>
    main()
  File "trt_modelconvert.py", line 123, in main
    MAX_WORKSPACE)
  File "/usr/lib/python3.5/dist-packages/tensorrt/utils/_utils.py", line 263, in uff_to_trt_engine
    raise AssertionError('UFF parsing failed on line {} in statement {}'.format(line, text))
AssertionError: UFF parsing failed on line 255 in statement assert(parser.parse(stream, network, model_datatype))

Is it feasible to “fix” these problems, or do I just need to wait for a newer version of TensorRT?

Thanks

My best guess is that some ops are not supported by TensorRT, so you might need to change them.
Also, I think the graph contains a lot of nodes that are not related to inference; you might want to get rid of those.

But how do I go about doing that, when I’ve trained the net using Keras, which doesn’t intuitively give access to that level of detail? Is it possible to “go into” Keras’s library code and change the function calls from within?

I am no machine learning expert, and don’t quote me on this, but I think you can safely export a model without any training nodes (it should be equivalent to using the same model file and removing any training-related code), and try TensorRT on that.

The reason I assumed you still have training nodes is that I see RandomUniform ops. In TensorFlow, RandomUniform is used when initializing weights, and judging by the node names in your log (dropout_*/cond/dropout/random_uniform/...), here it is also generating the random dropout masks used only during training. I think you can remove those nodes, since inference needs neither weight initialization nor dropout.
Also, there are possibly some other ops that are not supported. Your best chance is to replace them with similar ops.
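One way to get an inference-only graph without touching Keras internals is to set the Keras learning phase to 0 *before* the model is loaded; Keras then builds the graph without the training/inference conditional, so the dropout Switch/Merge/RandomUniform subgraphs never appear in the frozen .pb. A minimal sketch (the .h5 path is a placeholder):

```python
# Sketch: force Keras to build an inference-only graph before freezing.
# Assumes TF 1.x; must run in a fresh session, before any model is built.
from keras import backend as K
from keras.models import load_model

K.set_learning_phase(0)          # 0 = inference; must precede load_model
model = load_model('model.h5')   # graph now has no dropout cond/* branches

# For a graph that is already frozen, TF also offers a pruning pass that
# strips training-only nodes from a GraphDef:
from tensorflow.python.framework import graph_util
# clean_graph_def = graph_util.remove_training_nodes(frozen_graph_def)
```

After re-freezing the model this way, the UFF converter should no longer encounter the unsupported Switch/Merge/Floor/RandomUniform ops.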