I have a general question about the use of variables in a kernel.
In my code I am able to use a variable, e.g. “int blkperpnt”, that is declared on the CPU but is accessible on the GPU, as in the example below:
__global__ void Kernal(float *d_ptrPnt,float *d_ptrVert,int numPoints,int numVertices,float *d_ptrPntVert,int blkperpnt)
{
    //1. The value of blkperpnt is available at this point
    //2. but blkperpnt was not declared on the GPU
    //3. blkperpnt is never explicitly copied from CPU to GPU
}
void PV(float *d_ptrPnt,float *d_ptrVert,int numPoints,int numVertices,float *d_ptrPntVert,int blkperpnt)
{
    Kernal<<<numblocks,numthreads,sharedmem>>>(d_ptrPnt,d_ptrVert,numPoints,numVertices,d_ptrPntVert,blkperpnt);
}
How is it possible that a variable declared on the CPU is available on the GPU?
Is the above method correct, or is there another method I should follow?
Yes, passing variables as arguments is the simplest way. Why do you think that your example variable is not declared on the GPU? When you place it in the formal parameter list of a __global__ function, you are declaring it in the context of that function, which lives on the GPU. The runtime/driver takes care of copying the value you pass from host code.
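To make the point about by-value kernel arguments concrete, here is a minimal sketch. The kernel name scaleKernel and the variable scale are hypothetical and only for illustration; they are not from the original poster's code. The host variable scale is never copied with cudaMemcpy, yet its value is visible in the kernel because it is passed as an argument at launch.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: 'n' and 'scale' are by-value parameters. Their values
// are copied to the device as part of the kernel launch.
__global__ void scaleKernel(float *d_data, int n, int scale)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        d_data[i] *= scale;   // 'scale' holds the value passed from the host
}

int main()
{
    const int n = 8;
    int scale = 3;            // ordinary host variable, no explicit copy needed
    float h_data[n];
    for (int i = 0; i < n; ++i)
        h_data[i] = (float)i;

    // Arrays, unlike scalars, need device memory and an explicit copy.
    float *d_data = nullptr;
    if (cudaMalloc((void **)&d_data, n * sizeof(float)) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(d_data, h_data, n * sizeof(float), cudaMemcpyHostToDevice);

    scaleKernel<<<1, n>>>(d_data, n, scale);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_data, d_data, n * sizeof(float), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i)
        printf("%d: %.1f\n", i, h_data[i]);

    cudaFree(d_data);
    return 0;
}
```

Note the difference between the two kinds of argument: the pointer d_data is also passed by value, but what it points to must already live in memory the device can access.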
No, I will not provide you with source code. You can google for that yourself. Passing arguments to kernel functions is something that 99.999999% of all CUDA code does.
That your code compiles is not so surprising… The question is rather if it can run in any meaningful way. What is the pointer you are passing pointing to, and are you getting those values?
In section 3.2.6.3 of the programming guide I find that you need to pass the flag cudaHostAllocMapped to cudaHostAlloc() to get host memory into the address space of the device.
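As a rough sketch of what that looks like in practice, the following allocates mapped host memory and lets a kernel access it directly. The kernel name incrementKernel and the buffer size are hypothetical, and the sketch assumes device 0 supports mapped host memory (it checks canMapHostMemory and exits otherwise).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel that reads and writes mapped host memory directly.
__global__ void incrementKernel(int *d_ptr, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        d_ptr[i] += 1;
}

int main()
{
    // Request host-memory mapping before any other CUDA work on the device.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    if (!prop.canMapHostMemory) {
        printf("Device 0 cannot map host memory\n");
        return 1;
    }

    const int n = 8;
    int *h_ptr = nullptr;
    // cudaHostAllocMapped maps the allocation into the device address space.
    if (cudaHostAlloc((void **)&h_ptr, n * sizeof(int), cudaHostAllocMapped) != cudaSuccess) {
        printf("cudaHostAlloc failed\n");
        return 1;
    }
    for (int i = 0; i < n; ++i)
        h_ptr[i] = i;

    // Obtain the device-side pointer that corresponds to the host allocation.
    int *d_ptr = nullptr;
    cudaHostGetDevicePointer((void **)&d_ptr, h_ptr, 0);

    incrementKernel<<<1, n>>>(d_ptr, n);
    cudaDeviceSynchronize();  // wait before reading the results on the host

    for (int i = 0; i < n; ++i)
        printf("%d: %d\n", i, h_ptr[i]);

    cudaFreeHost(h_ptr);
    return 0;
}
```

A pointer obtained from plain malloc() or new does not get this treatment; passing such a pointer to a kernel compiles, but dereferencing it on the device is not valid in general.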