We are implementing a camera grabber that writes to unified memory, using the MMAPI samples as a reference. In line 447 of v4l2cuda/capture.cpp, the buffer size is rounded up from the actual image size to the next multiple of the page size:
buffer_size = (buffer_size + page_size - 1) & ~(page_size - 1);
What is the reason for that?
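To make sure I am reading the expression correctly, here is a minimal sketch of what I understand it to do. The helper names round_up_pow2 and page_aligned_size are my own and do not appear in the sample, and I am assuming the page size comes from sysconf(_SC_PAGESIZE) and is a power of two, which the bit-mask trick requires.

```cpp
#include <cassert>
#include <cstddef>
#include <unistd.h>

// Round `size` up to the next multiple of `alignment`.
// Adding (alignment - 1) pushes any partial page over the next boundary;
// masking with ~(alignment - 1) then clears the low bits.
// This only works when `alignment` is a power of two.
inline std::size_t round_up_pow2(std::size_t size, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper (not from the sample): page-align a buffer size
// using the page size reported by the OS at runtime.
inline std::size_t page_aligned_size(std::size_t size) {
    const long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    return round_up_pow2(size, static_cast<std::size_t>(page_size));
}
```

With a 4096-byte page, a 640x480 YUYV frame (614400 bytes) is already a whole number of pages and stays unchanged, while a 1920x1080 YUYV frame (4147200 bytes) ends in a half page and would be padded to 4149248 bytes.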
On a side note, I am also wondering whether it is fine for the kernel to access the globally attached buffer while CUDA kernels are running. If I am not mistaken, this would lead to BUS_ERRORs in user space.