size of shared memory between CPU and GPU

Hi Nvidia:
In tegra_multimedia_api/samples/v4l2cuda, we use the command ./capture-cuda -d /dev/video0 -u -z to get video data from a USB camera and convert it with the GPU. The memory we use is the shared memory between the CPU and GPU, is that right? If so, how large is the shared memory between the CPU and GPU?
Looking forward to your reply! Thanks a lot!


The function for allocating memory is cudaMallocManaged().
The basic idea of unified memory can be found here:

In short, there are two buffers, one on the CPU side and one on the GPU side, sharing the same pointer.
Coherence between the two processors is handled automatically by the CUDA driver.
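As a minimal sketch of that pattern (not taken from the sample itself; the kernel name fill_gpu is hypothetical and error checking is omitted):

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Hypothetical kernel: the GPU writes into the managed buffer.
__global__ void fill_gpu(unsigned char *buf, size_t n)
{
    size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        buf[i] = 0xFF;
}

int main(void)
{
    unsigned char *buf = NULL;
    size_t size = 640 * 480 * 3;   /* example RGB frame size */

    /* One allocation, one pointer, usable by both CPU and GPU. */
    cudaMallocManaged(&buf, size, cudaMemAttachGlobal);

    buf[0] = 0x00;                          /* CPU writes before the launch...  */
    fill_gpu<<<(size + 255) / 256, 256>>>(buf, size);
    cudaDeviceSynchronize();                /* ...and waits before reading back */
    printf("first byte after kernel: 0x%02X\n", buf[0]);

    cudaFree(buf);
    return 0;
}
```

Note that on Tegra the CPU should not touch a managed buffer between the kernel launch and cudaDeviceSynchronize(); the sketch above respects that ordering.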

For the buffer size, please check file ‘samples/v4l2cuda/capture.cpp’:

size_t size = width * height * 3;
cudaMallocManaged (&cuda_out_buffer, size, cudaMemAttachGlobal);

It is width * height * 3 — for example, a 640 x 480 frame needs 640 * 480 * 3 = 921,600 bytes (about 0.9 MB).


Hi AastaLLL:
Thank you for your reply!
Sorry, I didn’t express my question clearly. size_t size = width * height * 3 is the size allocated in this program; what I want to know is the total Unified Memory space available on the TX2.
Looking forward to your reply! Thank you!

The amount of “shared memory” is software controlled. Substantially all of the 8 GB of RAM on the TX2 module “could” be shared memory (minus the bits the OS needs to function).

Hi snarky:
Thank you for your reply!
May I understand your answer as follows:
There is a total of 8 GB of RAM on the TX2 module, and all of this 8 GB could be used as “shared memory” via cudaMallocManaged().
Is that right? And how should I understand “minus the bits the OS needs to function”?
Looking forward to your reply! Thank you!

It should be obvious that the amount of RAM used by your system processes depends a lot on what you are doing with the system.
You cannot get a good answer without developing an actual solution: disable the system services you don’t need, run the services and applications you do need, and measure.
For estimation purposes, a lightweight system might need about 200 MB for overhead; a heavy system installation might need more like 1 GB for overhead, but the specifics of your application will vary.
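One way to do that measurement (hedged: the exact numbers vary per system and per L4T release) is with the standard Linux tools on the TX2 itself:

```shell
# Total vs. available RAM, in megabytes.
free -m

# On Jetson boards, tegrastats also reports live RAM usage:
# sudo tegrastats
```

Subtracting “used” from “total” after booting your trimmed-down image gives a realistic upper bound on what cudaMallocManaged() can draw from.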


For an application in user space, the memory used is roughly the amount allocated.
But in some cases pages may be duplicated, depending on how the CPU and GPU interleave their accesses to the buffer.

Here is a tutorial for your reference.