Thank you for your quick and detailed response.
In my case I get the live video from an SDI card. Professional SDI video is formatted as packed UYVY 4:2:2, and it seems that NVENC cannot accept 4:2:2 UYVY data, so I need to reformat the data before handing it over. Since I have to move the data from the CPU to the GPU anyway, I will have all the pixels ‘in my hand’ on the CPU. That should mean that tight CPU code would be waiting on memory accesses anyway, so doing the processing on the CPU should cost little or nothing extra. And to get moving I need to use what I have mastered today, so I implemented an AVX2 scaler and a packed UYVY 4:2:2 to NV12 4:2:0 semi-planar converter.
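To make the data layout concrete, here is a plain scalar sketch of the conversion in C. It is not the AVX2 code I run, only a reference to compare against. The function name, the pitch parameters and the choice to average chroma from the two source rows are my own assumptions for illustration; UYVY byte order is U0 Y0 V0 Y1, and NV12 is a full-resolution Y plane followed by an interleaved UV plane at half height.

```c
#include <stddef.h>
#include <stdint.h>

/* Rounded average, same rounding as _mm256_avg_epu8: (a + b + 1) >> 1. */
static uint8_t avg_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Scalar reference (hypothetical helper): packed UYVY 4:2:2 -> NV12 4:2:0.
 * width and height are in pixels and must be even; pitches are in bytes. */
static void uyvy_to_nv12_ref(const uint8_t *src, size_t src_pitch,
                             uint8_t *dst_y, size_t y_pitch,
                             uint8_t *dst_uv, size_t uv_pitch,
                             size_t width, size_t height)
{
    for (size_t row = 0; row < height; row += 2) {
        const uint8_t *s0 = src + row * src_pitch;   /* upper source row */
        const uint8_t *s1 = s0 + src_pitch;          /* lower source row */
        uint8_t *y0 = dst_y + row * y_pitch;
        uint8_t *y1 = y0 + y_pitch;
        uint8_t *uv = dst_uv + (row / 2) * uv_pitch; /* one UV row per two Y rows */

        for (size_t x = 0; x < width; x += 2) {
            const uint8_t *p0 = s0 + 2 * x;          /* U0 Y0 V0 Y1 */
            const uint8_t *p1 = s1 + 2 * x;

            /* Luma is copied unchanged. */
            y0[x]     = p0[1];
            y0[x + 1] = p0[3];
            y1[x]     = p1[1];
            y1[x + 1] = p1[3];

            /* Chroma is averaged vertically (assumption) and stored as U,V pairs. */
            uv[x]     = avg_u8(p0[0], p1[0]);
            uv[x + 1] = avg_u8(p0[2], p1[2]);
        }
    }
}
```

Running the AVX2 path and this reference over the same frame and comparing the output buffers is a cheap way to catch shuffle-mask mistakes.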
The AVX2 code basically shuffles the data around and scales to 1:1, 1:2 and 1:4 sizes (to keep it simple for now), like this for Y, and the same for UV:
ySamplesLine1 = _mm256_avg_epu8(ySamplesLine1, ySamplesLine2);
ySamplesLine2 = _mm256_avg_epu8(ySamplesLine3, ySamplesLine4);
ySamplesLine3 = _mm256_srli_si256(ySamplesLine1, 1);
ySamplesLine4 = _mm256_srli_si256(ySamplesLine2, 1);
ySamplesLine1 = _mm256_avg_epu8(ySamplesLine1, ySamplesLine3);
ySamplesLine2 = _mm256_avg_epu8(ySamplesLine2, ySamplesLine4);
ySamplesLine1HiHalf = _mm256_shuffle_epi8(ySamplesLine1, shuffleMaskUV); // It's not UV data; the mask just happens to look the same as when filtering.
ySamplesLine1HiHalf = _mm256_permute4x64_epi64(ySamplesLine1HiHalf, 0b10001000);
ySamplesLine2HiHalf = _mm256_shuffle_epi8(ySamplesLine2, shuffleMaskUV);
ySamplesLine2HiHalf = _mm256_permute4x64_epi64(ySamplesLine2HiHalf, 0b10001000);
ySamplesLine1 = _mm256_permute2x128_si256(ySamplesLine1LowHalf, ySamplesLine1HiHalf, 0b00100000);
ySamplesLine2 = _mm256_permute2x128_si256(ySamplesLine2LowHalf, ySamplesLine2HiHalf, 0b00100000);
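For checking the averaging part of the excerpt above, a scalar sketch of the same two-step 1:2 reduction could look like the following. It assumes the shuffle mask keeps every second byte after the horizontal average (the mask is not shown here), and the function names are made up for illustration. It averages two rows vertically, then averages neighbouring columns, with the same round-up behaviour as `_mm256_avg_epu8` at each step.

```c
#include <stddef.h>
#include <stdint.h>

/* Same rounding as _mm256_avg_epu8: (a + b + 1) >> 1. */
static uint8_t avg_round_up(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Hypothetical scalar reference for a 1:2 downscale of one pair of rows.
 * Writes src_width / 2 output samples to dst. */
static void halve_row_pair_ref(const uint8_t *row0, const uint8_t *row1,
                               uint8_t *dst, size_t src_width)
{
    for (size_t x = 0; x + 1 < src_width; x += 2) {
        /* Step 1: vertical average of the two rows. */
        uint8_t left  = avg_round_up(row0[x],     row1[x]);
        uint8_t right = avg_round_up(row0[x + 1], row1[x + 1]);
        /* Step 2: horizontal average of the two neighbouring columns. */
        dst[x / 2] = avg_round_up(left, right);
    }
}
```

One thing this makes visible is that rounding up twice is not the same as a single rounded 2x2 mean: for the samples 0, 1, 1, 2 the two-step result is 2, while (0 + 1 + 1 + 2 + 2) >> 2 is 1. For a preview scaler that small bias is probably acceptable, but it matters if the output is compared bit-for-bit against another scaler.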
However, the GPU is the way to go in the long run, both for better scaling and possibly for other image filters. Any pointers on where I could learn to intercept the image inside the GPU using CUDA before handing it over to NVENC would be really appreciated.
Once again, thanks for helping.