The current ModuleProfiler only supports the ms and batchSize metrics, but in a complex Flexible Pipeline each module works on a different element type and granularity: the Decode Module outputs Tensor frames, while the Parser Module outputs Objects. The ms and batchSize metrics alone don't seem to be enough; we also want to know how many elements have been processed.
You can use nvprof to get more detailed profiling data.
Okay, we will try nvprof. So if some module is the bottleneck for detecting sub-objects in a frame, we need to profile it ourselves?
Yes, DeepStream only provides the basic profiler.
nvprof can output more detailed information, such as the time spent in each API call or each layer.
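For reference, a few common nvprof invocations that cover those cases; `./my_pipeline_app` is a placeholder for your actual application binary and arguments:

```shell
# Default summary mode: aggregated time per CUDA kernel and per API call
nvprof ./my_pipeline_app

# Per-invocation GPU trace: one line per kernel launch (inference layers
# show up as individual kernels here), with start time and duration
nvprof --print-gpu-trace ./my_pipeline_app

# Per-invocation API trace: a timestamped line for every CUDA API call
nvprof --print-api-trace ./my_pipeline_app

# Save the profile to a file for later inspection in the Visual Profiler
nvprof -o profile.nvvp ./my_pipeline_app
```

Note that the trace modes can produce very large output on a long-running pipeline, so it is usually best to profile a short, representative run.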
So you recommend nvprof since we want detailed timing, even per API call or per layer? Thank you very much; nvprof should be the tool we want.
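For the element-count part, which nvprof does not cover, a do-it-yourself per-module profiler can be quite small. The sketch below is a hypothetical illustration, not part of the DeepStream API: `ElementProfiler` and the toy Decode/Parser stand-ins are made up here to show the idea of counting elements per module rather than only batches.

```python
import time
from collections import defaultdict


class ElementProfiler:
    """Hypothetical per-module profiler: tracks time (ms), call count,
    and how many elements each module has produced."""

    def __init__(self):
        self.stats = defaultdict(lambda: {"calls": 0, "ms": 0.0, "elements": 0})

    def profile(self, module_name, fn, batch):
        start = time.perf_counter()
        output = fn(batch)  # run the module on its input batch
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        s = self.stats[module_name]
        s["calls"] += 1
        s["ms"] += elapsed_ms
        s["elements"] += len(output)  # count elements, not just batches
        return output


profiler = ElementProfiler()

# Toy stand-ins: "Decode" emits one frame per input buffer, "Parser" emits
# two detected objects per frame, so the element counts differ per module.
frames = profiler.profile(
    "Decode", lambda bufs: [f"frame{i}" for i in range(len(bufs))], [b"buf"] * 4
)
objects = profiler.profile(
    "Parser", lambda fs: [f"{f}-obj{k}" for f in fs for k in range(2)], frames
)

print(profiler.stats["Decode"]["elements"])  # 4 frames
print(profiler.stats["Parser"]["elements"])  # 8 objects
```

Because each module reports its own element count, a module that fans out heavily (e.g. many sub-objects per frame) becomes visible even when its batch size looks identical to its neighbors'.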