Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): RTX 3060
• DeepStream Version: 7.1
• JetPack Version (valid for Jetson only):
• TensorRT Version: 10.5
• NVIDIA GPU Driver Version (valid for GPU only): 560.70
• Issue Type (questions, new requirements, bugs): Question
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file contents, the command line used, and other details for reproducing.)
• Requirement details (This is for new requirements. Include the module name — for which plugin or for which sample application — and the function description.)
Is it possible to implement an LSTM model as an SGIE using nvinferserver? I mean, to run a secondary GIE after its corresponding secondary pre-process.
I’ve seen on the forum that some users have managed to implement SGIEs with tensor inputs instead of images. However, all the implementations I’ve come across use nvinfer rather than nvinferserver.
Regarding the LSTM model issue, is this topic helpful?
Regarding the “nvdspreprocess + sgie” usage, please refer to the DeepStream SDK native sample deepstream-preprocess-test; config_preprocess_sgie.txt is the nvdspreprocess config used before the SGIE. You need to add the SGIE to the pipeline and configure the SGIE to take its input tensor from the preprocess meta.
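As a rough sketch, the nvinferserver config for the SGIE would enable tensor input from the attached preprocess meta via the input_tensor_from_meta block (model name, unique_id, and other keys below are placeholders for your setup, not taken from the sample):

```
# Hypothetical SGIE fragment of an nvinferserver config.
# The essential part is input_tensor_from_meta, which tells
# nvinferserver to consume the tensor attached by nvdspreprocess
# instead of preprocessing frames itself.
infer_config {
  unique_id: 2
  gpu_ids: [0]
  backend {
    triton {
      model_name: "lstm_model"   # placeholder model name
      version: -1
    }
  }
}
input_tensor_from_meta {
  is_first_dim_batch: true
}
```

The tensor layout produced by nvdspreprocess must match what the model expects, so the nvdspreprocess config and this block have to be kept consistent.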
I have been studying that sample, but I am still not able to find a solution. I need nvdspreprocess to output a tensor containing the center coordinates of the bounding boxes of detected objects, after tracking them over previous frames.
I need the sgie to receive an input tensor of size [1, 30, 2] for each detected object, representing the (x, y) coordinates of the object over the last 30 frames. The LSTM will then predict the (x, y) coordinates for the subsequent frames, resulting in an output tensor of size [1, 3, 2] for each detected object if 3 frames are predicted.
Additionally, I am unsure how to retrieve the output from the sgie, whether for display on the OSD or saving to a file.
Finally, in my current pipeline I notice that the outputs of nvdspreprocess and the SGIE are linked to a fakesink. Is this correct, considering that the data is carried in NvDsBatchMeta?
Regarding “representing the (x, y) coordinates of the object over the last 30 frames”, please refer to the sample deepstream-pose-classification mentioned above. In this sample, the first model detects persons and the second model detects 34 body keypoints (x, y, z). The nvdspreprocess builds a tensor of [3 × 300 × 34] for each detected object, representing the 34 keypoint (x, y, z) coordinates of the object over the last 300 frames.
Regarding the LSTM, please refer to the doc in the nvinferserver introduction. In short, you need to use the IInferCustomProcessor interface. Please refer to the sample /opt/nvidia/deepstream/deepstream/sources/TritonOnnxYolo/nvdsinferserver_custom_impl_yolo/nvdsinferserver_custom_process_yolo.cpp: in inferenceDone you can save the inference results to a variable A, and in extraInputProcess you can read the data back from A and add it as extra inputs.
What is the LSTM model used for? Could you share the scenario?