I need to implement a pipeline like this:
video-input -> PGIE (detector) -> bboxes -> image PRE-PROCESSING based on bboxes (alignment, affine transformation) -> SGIE (custom) -> FEATURE VECTOR (float array)
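For reference, here is roughly how I picture the topology as a gst-launch line (standard DeepStream element names; the config file paths and muxer dimensions are placeholders, and the missing piece is exactly the custom step between the two `nvinfer` elements):

```shell
# Sketch only -- config paths are placeholders, not real files
gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! m.sink_0 \
  nvstreammux name=m batch-size=1 width=1280 height=720 ! \
  nvinfer config-file-path=pgie_config.txt ! \
  `# <-- alignment / affine transform on detected regions should go here` \
  nvinfer config-file-path=sgie_config.txt ! \
  fakesink
```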
I have the PGIE working and am able to obtain bboxes. My questions are:
How do I perform image pre-processing (alignment, affine transformation) on regions obtained from the primary detector before passing them to the SGIE? And how do I pass the transformed regions to the secondary engine (rather than just the bboxes and the original frame)?
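To clarify the kind of alignment I mean, here is a minimal numpy sketch, independent of DeepStream: estimating a 2D affine transform from detected landmarks inside a bbox to canonical target positions (the landmark coordinates below are made up for illustration):

```python
import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Least-squares 2D affine transform mapping src_pts -> dst_pts.

    src_pts, dst_pts: (N, 2) arrays of corresponding points (N >= 3).
    Returns a 2x3 matrix A such that dst ~= A @ [x, y, 1].
    """
    n = src_pts.shape[0]
    # Homogeneous coordinates: each row becomes [x, y, 1]
    src_h = np.hstack([src_pts, np.ones((n, 1))])
    # Solve src_h @ A.T ~= dst_pts in the least-squares sense
    A_t, *_ = np.linalg.lstsq(src_h, dst_pts, rcond=None)
    return A_t.T  # shape (2, 3)

def apply_affine(A, pts):
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    pts_h = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return pts_h @ A.T

# Hypothetical detected landmarks and canonical targets, in bbox pixels
detected = np.array([[38.0, 52.0], [72.0, 50.0], [56.0, 80.0]])
canonical = np.array([[30.0, 40.0], [70.0, 40.0], [50.0, 75.0]])

A = estimate_affine(detected, canonical)
aligned = apply_affine(A, detected)
print(np.allclose(aligned, canonical))  # True: 3 point pairs fit exactly
```

In the real pipeline, the same matrix `A` would then be used to warp the cropped region's pixels (e.g. with an image library) before the SGIE sees it; the sketch only shows how the transform itself is obtained.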
How do I obtain the result of a secondary custom inference engine when it is neither a classifier, nor a detector, nor a segmenter, and its output is a vector of 128 floats? (I will later compute the distance between the resulting vector and a pre-calculated set of vectors loaded from disk.)
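The downstream matching step I have in mind looks like this numpy sketch (the gallery data and the distance threshold are made up; the real vectors would come from the SGIE output and from disk):

```python
import numpy as np

def nearest_match(query, gallery, threshold=1.0):
    """Return (index, distance) of the gallery vector closest to query,
    or (None, None) if no distance falls below threshold.

    query: (128,) float array -- the SGIE feature vector.
    gallery: (M, 128) array of pre-computed vectors loaded from disk.
    """
    dists = np.linalg.norm(gallery - query, axis=1)  # Euclidean distances
    best = int(np.argmin(dists))
    if dists[best] < threshold:
        return best, float(dists[best])
    return None, None

# Hypothetical embeddings standing in for the real data
rng = np.random.default_rng(0)
gallery = rng.normal(size=(5, 128))
query = gallery[3] + 0.01 * rng.normal(size=128)  # near gallery entry 3

idx, dist = nearest_match(query, gallery)
print(idx)  # 3
```

So what I am missing is only the DeepStream side: how to get the raw 128-float tensor out of the SGIE into application code where a function like this can run.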
Please point me in the right direction for implementing such a pipeline.