Should I expect changed scores with FP32?

After optimizing a graph using this function:

graphdef_trt = tensorflow.contrib.tensorrt.create_inference_graph(
    outputs=[OUTPUT_NODE1, OUTPUT_NODE2],
    max_workspace_size_bytes=1 << 32,

the predictions of the model changed slightly.

Is this normal behavior?
If yes, could you please provide a link where I can find why this happens?

Used docker image

I tried ‘FP16’: the predictions are the same as for ‘FP32’ but differ from those of the original graph.
I tried ‘INT8’: the predictions are the same as for the original graph, but inference time increased from 6 ms to 320 ms (with ‘FP16’ and ‘FP32’ the inference time is 4 ms).
Something strange is happening.
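
For reference, timings like the ones above are easier to compare when warm-up runs are excluded, since the first calls into a freshly built graph can include one-time setup cost. A minimal sketch of such a measurement follows. The helper name `mean_latency_ms` and the stand-in function `fake_inference` are hypothetical; in a real test `fake_inference` would be replaced by the actual session call on the graph being measured.

```python
import time


def mean_latency_ms(run_once, warmup=10, runs=100):
    """Return the mean wall-clock time of run_once() in milliseconds.

    The first `warmup` calls are executed but not timed, so that
    one-time initialization does not skew the average.
    """
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(runs):
        run_once()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs


def fake_inference():
    # Hypothetical stand-in for one inference call on the graph under test.
    sum(i * i for i in range(1000))


print("mean latency: %.3f ms" % mean_latency_ms(fake_inference))
```

Running the same helper against the original graph and each converted graph gives numbers measured under identical conditions.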


Can you provide more detail on how the predictions were “slightly changed”? It would help us debug this issue if you could provide a small repro that contains the source, model, and dataset that demonstrate the symptoms.
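
One possible way to quantify the difference is to report the largest absolute and relative deviation between the two sets of scores, along with whether they agree within a tolerance. The sketch below is only an illustration: the function name `compare_predictions`, the tolerances, and the sample scores are all hypothetical and would need to be replaced with the real outputs of the original and optimized graphs, flattened to lists of floats.

```python
import math


def compare_predictions(reference, candidate, rel_tol=1e-5, abs_tol=1e-6):
    """Summarize element-wise differences between two flat lists of scores."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")

    abs_diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    # Guard the relative difference against division by zero.
    rel_diffs = [d / max(abs(r), 1e-12) for d, r in zip(abs_diffs, reference)]

    within = all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, c in zip(reference, candidate)
    )
    return {
        "max_abs_diff": max(abs_diffs) if abs_diffs else 0.0,
        "max_rel_diff": max(rel_diffs) if rel_diffs else 0.0,
        "within_tolerance": within,
    }


# Hypothetical scores, not taken from the model in this thread.
original_scores = [0.101, 0.250, 0.649]
trt_scores = [0.1010004, 0.2499997, 0.6490001]

report = compare_predictions(original_scores, trt_scores)
print(report)
```

Posting these figures for the same input on the original, ‘FP32’, ‘FP16’, and ‘INT8’ graphs would make it clear how large the change actually is.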

NVIDIA Enterprise Support.