SSD with NON_BLOCKING stream

I have an SSD model with a custom layer. I initialize TensorRT net with a plugin factory which creates my plugin.
The issue is that when I do inference and pass IExecutionContext::enqueue a stream created with CU_STREAM_NON_BLOCKING, the network returns invalid results, but when I create the stream with CU_STREAM_DEFAULT everything works fine.
Looking in Nsight, I see that most of the network kernels, including my custom layer, are executed on the stream provided to IExecutionContext::enqueue, while a few kernels named "void copy_kernel<float, int=0>(cublasCopyParams)" are executed on another stream.
It seems like one of the layers uses a stream other than the one provided to enqueue.
Can I work around this? I want to avoid synchronization with the default stream.
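For reference, a minimal sketch of the two stream setups being compared, using the CUDA runtime API and the TensorRT 5/6-era enqueue signature (the batch size and the surrounding setup are illustrative, not from the original post):

```cpp
#include <cuda_runtime.h>
#include <NvInfer.h>

// Sketch only: `context` and `bindings` come from the usual TensorRT
// engine/context setup, which is omitted here.
void runInference(nvinfer1::IExecutionContext* context, void** bindings)
{
    cudaStream_t stream;

    // Works: a blocking (default-flag) stream synchronizes implicitly
    // with the legacy default stream, so internal launches that land
    // on the default stream stay ordered with the rest of the network.
    cudaStreamCreateWithFlags(&stream, cudaStreamDefault);

    // Fails as described above: a non-blocking stream skips that
    // implicit synchronization, so the stray cuBLAS copy_kernel
    // launches can race with the work enqueued on `stream`.
    // cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking);

    context->enqueue(1 /*batchSize*/, bindings, stream, nullptr);
    cudaStreamSynchronize(stream);
    cudaStreamDestroy(stream);
}
```

If the stray kernels really are issued on the legacy default stream, the implicit sync of a blocking stream would explain why CU_STREAM_DEFAULT works and CU_STREAM_NON_BLOCKING does not; that is an inference from the Nsight trace, not a confirmed explanation.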


I also get the same error. Like the SSD above, I was working on a TensorRT implementation of the RefineDet model. It worked with "context.execute()".
However, "context.enqueue()" returned incorrect results.
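That difference is consistent with the synchronization behavior of the two call paths: execute() is synchronous, while enqueue() only launches work on the given stream. A sketch, assuming the TensorRT 5/6-era implicit-batch API (batch size and variable names are illustrative):

```cpp
// Synchronous path: execute() blocks until the whole network has run,
// so any internal work on a different stream has finished on return.
context->execute(1 /*batchSize*/, bindings);

// Asynchronous path: enqueue() returns immediately after launching on
// `stream`; internal launches on another stream are only ordered
// against it if `stream` is a blocking stream.
context->enqueue(1 /*batchSize*/, bindings, stream, nullptr);
cudaStreamSynchronize(stream);
```

So a network that is correct with execute() but wrong with enqueue() on a non-blocking stream points at a missing cross-stream dependency rather than a bug in the layer implementation itself.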