Does running models on multiple streams in TensorRT use multicore concurrency?

Hi all,

I modified the Fast R-CNN code to add two streams, each running a separate Fast R-CNN model.
I learned from CUDA's documentation that multiple streams can enable multicore concurrency and improve performance. But when I tested this on an RTX 2060, I didn't see any performance improvement.
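
For reference, here is a minimal sketch of the two-stream pattern I am describing. It is not my actual Fast R-CNN code: `busyKernel` and `timeTwoLaunches` are hypothetical names, and the kernel is only a stand-in for the work a model would enqueue. It times two launches issued to the same stream against two launches issued to separate streams.

```cuda
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for a model's work: a long arithmetic loop per element.
__global__ void busyKernel(float *data, int n, int iters) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < iters; ++k) v = v * 1.0001f + 0.5f;
        data[i] = v;
    }
}

// Launches the kernel twice and returns the elapsed host time in milliseconds.
// sameStream == true  : both launches go to one stream, so they run back to back.
// sameStream == false : each launch gets its own stream, so they are allowed to overlap.
float timeTwoLaunches(bool sameStream, int blocks) {
    const int threads = 128;
    const int n = blocks * threads;
    const int iters = 1 << 20;

    float *buf[2];
    cudaStream_t streams[2];
    for (int i = 0; i < 2; ++i) {
        cudaMalloc(&buf[i], n * sizeof(float));
        cudaMemset(buf[i], 0, n * sizeof(float));
        cudaStreamCreate(&streams[i]);
    }
    cudaDeviceSynchronize();

    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < 2; ++i) {
        cudaStream_t s = sameStream ? streams[0] : streams[i];
        busyKernel<<<blocks, threads, 0, s>>>(buf[i], n, iters);
    }
    cudaDeviceSynchronize();  // wait for both launches to finish
    auto t1 = std::chrono::steady_clock::now();

    for (int i = 0; i < 2; ++i) {
        cudaStreamDestroy(streams[i]);
        cudaFree(buf[i]);
    }
    return std::chrono::duration<float, std::milli>(t1 - t0).count();
}

int main() {
    cudaFree(0);             // force context creation before timing
    timeTwoLaunches(true, 4);  // warm-up run, result discarded

    float oneStream = timeTwoLaunches(true, 4);
    float twoStreams = timeTwoLaunches(false, 4);
    std::printf("one stream:  %.2f ms\n", oneStream);
    std::printf("two streams: %.2f ms\n", twoStreams);
    return 0;
}
```

This can be built with `nvcc` and run directly; it prints the two timings so they can be compared. The grid size of 4 blocks is an arbitrary choice for the sketch.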