Module::execute(): StreamTensor parameter life cycle and data format

This post covers several sub-topics, each given as a quoted passage ("Text") from the user guide followed by a question. The passages are extracted from the DeepStream User Guide (ID: DU-08633-001_v03.1).

Text: calls the “execute” function of each module serially when executing DeviceWorker->start()
DeviceWorker will be in a suspended state until the user pushes video packets into it.
Question: Should Module::execute() be called only after both DeviceWorker->start() and DeviceWorker->pushPacket() have been called? There are upstream and downstream relations between the modules in one flexible pipeline, so Module::execute() should be called serially, one module at a time, in the same order the modules were added to DeviceWorker. After the last module's execute function has been called, will execution start again from the first one?

Text: IStreamTensor should be created with createStreamTensor.
Question: After reading the code in sample/nvDecInfer_detection/parserModule_resnet18.h, my guess is: the StreamTensor object should be created and destroyed by the upstream/input module, and the downstream/output module should only be allowed to read data through the `const std::vector<IStreamTensor *>& in` parameter. Is that right?

Text: Be sure to pass or update the Trace information, which includes information about the frames index, video index, etc.
Question: All custom modules in the DeepStream sample code are downstream of the inference module. So what is TRACE_INFO::boxInfo if a custom module is instead downstream of the ColorSpaceConvertor module? Should it be meaningless? In sample/nvDecInfer_detection/parserModule_resnet18.h, the code seems to maintain the BBOXS_PER_FRAME structure contained in the output tensor's CPU data inside ParserModule::execute(). Where is the code that passes or updates the trace information, and which of TRACE_INFO and BBOXS_PER_FRAME should be used to transfer the frame and video indices?

Text: For the input of module, you can use functions with the “get” prefix to acquire the information and data.
Question: Suppose we want to develop a custom module whose input module comes from a third party, such as the modules embedded in DeepStream. When we touch the input tensor to check whether its data is in CPU or GPU memory, should we call getMemoryType first and then getGpuData or getCpuData accordingly, or should there be documentation describing the full, detailed content of the output tensor published with the third-party module?


After calling DeviceWorker->start(), the DeepStream system calls each module's execute function serially.
Execution is launched whenever a packet is available; it does not depend on the previous pipeline pass finishing.

Yes. All modules use the IStreamTensor format.

TRACE_INFO contains the frame index and video index information.
Trace information is passed or updated when creating the StreamTensor.

You can call the getMemoryType() function to decide which get function should be used.

Since all modules use the IStreamTensor format, a module needs to cast and parse the IStreamTensor to get the details, including any hierarchy information.


A custom module should be defined like this:

class UserDefinedModule : public IModule {

and data can be accessed via:

IStreamTensor* getOutputTensor(const int tensorIndex) ...


It seems best to use the getOutputTensor API to obtain the IStreamTensor interface and then work with it.


In the DeepStream technical stack, TensorRT already makes heavy use of tensors for its layer inputs and outputs. I just learned that classification outputs in deep learning cannot use the tensor directly; we need to translate the LABEL into a one-hot vector so it can be computed with. Do you think it is suitable to use the tensor as the main data type at DeepStream's abstraction level? It is difficult for DeepStream developers to understand shapes and the like; could it be something more concrete, like packets or frames? Thanks.


The tensor is the underlying data structure of most deep learning use cases.

We can represent labels, images, voice, and language with tensor structures.
That's why we prefer to use the tensor as our data type.


You made me rethink the data-type abstraction; maybe object and track info can be represented with tensors more naturally, and an O(bject)T(rack)YX format would be quite compatible with NCHW.
I accepted your post as the answer. Thank you very much for providing this detailed technical information; it is very helpful for our design problems.
Thanks again!