Hi
I am seeing odd behavior. I am running two processes, one decoding and one encoding, and running them together makes decoding much slower: decoding alone takes around 7 ms, but with the encoder running at the same time it can take 30 ms.
Interestingly, I am running this on several machines, and only some of them behave this way; on the others, decode and encode run together without any problems.
Do you have an explanation for that?
Edit: I did some research, and most of the time is spent on the copy from host to device (function cuMemcpy2DAsync). Do you have any suggestions for improving the performance?