Hello!
I am trying to run a SCRFD 500m model on a TX2, from https://github.com/deepinsight/insightface/blob/master/detection/scrfd. I managed to convert the model into a TensorRT engine, but its outputs are wildly different from the version deployed on my laptop, which runs the ONNX version of the model as explained in the model repo.
I am aware that the TensorRT conversion may not include some of the preprocessing and postprocessing steps, but there are no minimal examples for SCRFD showing how to handle its inputs and outputs. For example, can I feed the exact same input to the TRT engine as to the ONNX model, or do I need to apply some preprocessing first? How can I tell which steps, if any, are not included in the TensorRT version?
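For reference, my reading of scrfd.py is that the ONNX pipeline normalizes the frame with cv2.dnn.blobFromImage(img, 1.0/128, input_size, (127.5, 127.5, 127.5), swapRB=True) before inference. A numpy-only equivalent of that step (a sketch based on my reading, assuming the frame is already resized/letterboxed to the network input size) would be:

```python
import numpy as np

def scrfd_preprocess(bgr_img: np.ndarray) -> np.ndarray:
    """Replicate blobFromImage(img, 1/128, size, (127.5,)*3, swapRB=True)
    in plain numpy. Assumes bgr_img is already resized/letterboxed to the
    network input size (e.g. 640x640), HWC layout, uint8, BGR order."""
    rgb = bgr_img[:, :, ::-1].astype(np.float32)        # swapRB=True: BGR -> RGB
    rgb = (rgb - 127.5) / 128.0                         # mean/std as in scrfd.py
    blob = np.transpose(rgb, (2, 0, 1))[np.newaxis]     # HWC -> NCHW, add batch dim
    return np.ascontiguousarray(blob)

# quick sanity check on a dummy frame
dummy = np.full((640, 640, 3), 127, dtype=np.uint8)
blob = scrfd_preprocess(dummy)
print(blob.shape)  # (1, 3, 640, 640)
```

If the TRT engine was exported from the raw ONNX graph, I assume this normalization still has to happen on the host before the blob is copied to the device.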
Is there something I can do to make the TRT engine produce exactly the same results as the ONNX version? For example,
from insightface.model_zoo import SCRFD
scrfd_model_path = "face/saved_models/det_500m.onnx"
...
scrfd_detector = SCRFD(scrfd_model_path, scrfd_session)
scrfd_bboxes_orig, _ = scrfd_detector.detect(rgb_orig) # rgb_orig is image as an array
yields the outputs I want, but
input_data = preprocess_image(image_path, input_shape) # just shuffles dimensions and channels
...
np.copyto(inputs[0]["host"], input_data.ravel())  # stage input in pinned host buffer
cuda.memcpy_htod_async(inputs[0]["device"], inputs[0]["host"], stream)  # host -> device
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
cuda.memcpy_dtoh_async(outputs[0]["host"], outputs[0]["device"], stream)  # device -> host
stream.synchronize()
output_data = outputs[0]["host"].reshape(5, -1)
gives a very different output. Even if I assume the outputs are unfiltered bboxes, they are not meaningful when overlaid on the input image (many are smaller than 10x10 pixels). This suggests the input needs some preprocessing, but I don't know what is required. If possible, I also want to see exactly what TensorRT included in the converted model.
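For context, my understanding of the Python-side decoding in scrfd.py (which, if I read it correctly, runs outside the ONNX graph and would therefore also be missing from the TRT engine) is roughly: each output head yields per-anchor scores and bbox distances for one stride level, and the distances are decoded against a grid of anchor centers. A sketch of that decode for one stride, in plain numpy (names and the score threshold are my own, not from the repo):

```python
import numpy as np

def decode_stride(scores, bbox_preds, stride, input_size=640, thresh=0.5):
    """Decode one SCRFD output head (one stride level), roughly as
    scrfd.py's forward() does: build anchor centers on a grid, scale
    predicted distances by the stride, and convert (center, distances)
    to corner boxes, keeping detections above a score threshold."""
    h = w = input_size // stride
    num_anchors = 2  # SCRFD places 2 anchors per grid cell
    ys, xs = np.mgrid[:h, :w]
    # anchor centers in input-image pixels, (x, y), repeated per anchor
    centers = np.stack([xs, ys], axis=-1).reshape(-1, 2).astype(np.float32) * stride
    centers = np.repeat(centers, num_anchors, axis=0)
    d = bbox_preds * stride                      # distances are in stride units
    boxes = np.stack([centers[:, 0] - d[:, 0],   # x1 = cx - left
                      centers[:, 1] - d[:, 1],   # y1 = cy - top
                      centers[:, 0] + d[:, 2],   # x2 = cx + right
                      centers[:, 1] + d[:, 3]],  # y2 = cy + bottom
                     axis=-1)
    keep = scores.ravel() >= thresh
    return boxes[keep], scores.ravel()[keep]
```

After this, scrfd.py still applies NMS across all stride levels before returning detections. To see which tensors my engine actually exposes (and in what shapes), I believe the TensorRT 8 Python API allows iterating over engine.num_bindings and printing engine.get_binding_name(i) / engine.get_binding_shape(i), which should confirm whether the engine outputs raw per-stride heads rather than final boxes.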