cudaMalloc -- 2 regions of GPU memory?

Hi all,

I have a CUDA program that allocates several structures on the GPU. To print the device address of each pointer, I use the following code after every allocation:

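[i]// cudaMalloc returns its status directly, so check the return value.
cudaError_t error = cudaMalloc((void**)&mat, size_mat);

if (error != cudaSuccess) {
    printf("cudaMalloc: %s\n", cudaGetErrorString(error));
    exit(EXIT_FAILURE);
}[/i]

// %p prints the full pointer; %X expects an unsigned int and can truncate a 64-bit address.
printf("mat: %p\n", (void*)mat);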

ADDRESSING RESULTS:

[b]mat: 0x3AB0000

coordinates: 0x3AC0000

inertiaDev: 0x1000800

errorcodeDev: 0x3B00000

neigh_list: 0x3B10000

neighEl: 0x1000900

potVet: 0x3C10000

sysVirDev: 0x3C20000

coordDev: 0x3C30000

eDev: 0x3C40000

typeDev: 0x1001700

massDev: 0x1002500

interactionTypeDev1: 0x3C50000

sigDev: 0x1002600

epsDev: 0x1002800

productElectroMomentsDev: 0x1002A00[/b]

I printed the addresses in allocation order. The program appears to use two different regions of memory (one at 0x3****** and the other at 0x1******), and each allocation lands in one of the two.

Since the allocations are performed synchronously, why doesn’t the program fill one region of the GPU first and then move on to the other, rather than alternating between the two?

Thank you in advance for your answers!

Keep in mind that GPUs (like CPUs) use virtual memory addresses, so the addresses you print may well be mapped to contiguous physical memory regions. Why the driver hands out virtual addresses with different prefixes is anyone’s guess…
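
If you want to dig further, one thing you could try is printing the requested size next to each address, to see whether the prefix correlates with allocation size. The sketch below is only a diagnostic under assumptions: the helper name allocAndReport and the list of sizes are made up for illustration, and whether the driver groups allocations by size is a hypothesis to test, not documented behavior.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original post): allocate `bytes` on the
// device, abort on failure, and print the virtual address that was returned.
static void *allocAndReport(const char *name, size_t bytes) {
    void *ptr = nullptr;
    cudaError_t err = cudaMalloc(&ptr, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc(%s): %s\n", name, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
    printf("%-6s %10zu bytes -> %p\n", name, bytes, ptr);
    return ptr;
}

int main() {
    // Arbitrary mix of small and large sizes, interleaved on purpose.
    const size_t sizes[] = {256, 1 << 20, 512, 4 << 20, 1024, 64 << 10};
    constexpr int n = sizeof(sizes) / sizeof(sizes[0]);
    void *ptrs[n];

    for (int i = 0; i < n; ++i) {
        char name[16];
        snprintf(name, sizeof(name), "buf%d", i);
        ptrs[i] = allocAndReport(name, sizes[i]);
    }

    // Release everything in the same order it was allocated.
    for (int i = 0; i < n; ++i) {
        cudaFree(ptrs[i]);
    }
    return 0;
}
```

Whatever pattern shows up is specific to your driver version and GPU, and these are still virtual addresses, so it tells you nothing definite about the physical layout.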