Hi, I have an image I on the device (properly allocated via ‘cudaMallocPitch’).
Now I want to create a ‘sub-image’ S for a rectangular part of the original image I by ‘shallow copy’, so that the buffer address in S simply refers to a specific offset within the original image I.
This way I don’t have to copy anything, and any modification to the pixels in S (e.g. by calling an NPP function that works on S) actually changes the pixel values in I.
On the CPU, this works nicely (e.g. for running IPP functions on only a specific rectangular part of the image).
So, my questions:
- Can this be done? I suppose it can be done in the same way as for CPU image buffers by doing some offset calculations. Note the pitch of S and I will be the same.
- Does it incur any (significant) performance penalty when a kernel processes an image buffer that does not start at a ‘properly aligned’ address (as will be the case for the start address of the buffer in S)? I suppose there will be some coalescing issues, and it might depend on the compute capability (e.g. 1.3 vs. 2.0).
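
To make the first question concrete, here is a minimal sketch of the offset calculation I have in mind. It is plain host-side pointer arithmetic; I am assuming the same arithmetic is valid for a device pointer returned by ‘cudaMallocPitch’, as long as the host never dereferences it. The names ‘ImageView’ and ‘makeSubImage’ are hypothetical helpers of my own, not part of CUDA or NPP.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical non-owning view: it only points into memory owned elsewhere.
struct ImageView {
    std::uint8_t* data;   // address of the first pixel of the view
    std::size_t   pitch;  // bytes between the starts of consecutive rows
    int           width;  // width in pixels
    int           height; // height in rows
};

// Shallow sub-image: pure pointer arithmetic, no pixel is copied.
// (x, y) is the top-left corner of the rectangle inside the parent,
// bytesPerPixel is e.g. 1 for 8-bit single channel, 4 for 32-bit float.
inline ImageView makeSubImage(const ImageView& parent,
                              int x, int y, int w, int h,
                              std::size_t bytesPerPixel)
{
    // The rectangle must lie completely inside the parent image.
    assert(x >= 0 && y >= 0 && w > 0 && h > 0);
    assert(x + w <= parent.width && y + h <= parent.height);

    ImageView sub;
    sub.data   = parent.data
               + static_cast<std::size_t>(y) * parent.pitch   // skip y rows
               + static_cast<std::size_t>(x) * bytesPerPixel; // skip x pixels
    sub.pitch  = parent.pitch; // S and I share the same pitch
    sub.width  = w;
    sub.height = h;
    return sub;
}
```

As far as I understand, the NPP functions take a pointer, a line step in bytes, and an ROI size, so I would pass ‘S.data’, ‘S.pitch’ and the width/height of S. Note that ‘S.data’ will in general only be aligned to the pixel size, not to the alignment that ‘cudaMallocPitch’ guarantees for row starts, which is what the second question is about.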