zero-copy memory

I am having trouble using zero-copy memory. I read an article that suggested using zero-copy memory as a first pass when converting a program to CUDA. It sounded like a good idea, but I am having a great deal of trouble.

Perhaps someone can comment on my understanding or misunderstanding of zero-copy memory.

My understanding is that after the following code has run, I will have a single set of data that can be accessed from either the host or the GPU: on the host through TF, and on the device through devPtrTF.

Is this correct?

if (cudaHostAlloc((void **)&TF, sizeof(double) * Nz * Ny * (Nx + 2L), cudaHostAllocMapped) != cudaSuccess)
{
        printf("\n CUDA error: cudaHostAlloc TF error");
        return 1;
}


if (cudaHostGetDevicePointer((void **)&devPtrTF, TF, 0) != cudaSuccess)
{
        printf("\n CUDA error: cudaHostGetDevicePointer TF error");
        return 1;
}
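
For context, a minimal, self-contained sketch of this cudaHostAlloc zero-copy pattern might look like the following; the kernel scale, the element count n, and the cudaSetDeviceFlags(cudaDeviceMapHost) call (which some devices/toolkits need before mapped allocations can be used) are illustrative assumptions, not code from the real program.

#include <cstdio>
#include <cuda_runtime.h>

// Toy kernel: scales every element in place through the mapped (zero-copy) pointer.
__global__ void scale(double *data, int n, double factor)
{
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
                data[i] *= factor;
}

int main(void)
{
        const int n = 1024;                  // assumed size for the sketch
        double *TF = NULL, *devPtrTF = NULL;

        // Enable mapped (zero-copy) host memory before any allocations.
        cudaSetDeviceFlags(cudaDeviceMapHost);

        if (cudaHostAlloc((void **)&TF, n * sizeof(double), cudaHostAllocMapped) != cudaSuccess)
        {
                printf("cudaHostAlloc failed\n");
                return 1;
        }
        if (cudaHostGetDevicePointer((void **)&devPtrTF, TF, 0) != cudaSuccess)
        {
                printf("cudaHostGetDevicePointer failed\n");
                return 1;
        }

        for (int i = 0; i < n; ++i)          // host writes through TF
                TF[i] = (double)i;

        scale<<<(n + 255) / 256, 256>>>(devPtrTF, n, 2.0);  // device reads/writes through devPtrTF
        cudaDeviceSynchronize();             // host must wait before reading the results

        printf("TF[10] = %f\n", TF[10]);     // host sees the kernel's writes in the same memory

        cudaFreeHost(TF);
        return 0;
}

Note that the host only sees the kernel's writes after cudaDeviceSynchronize() (or some other synchronization with the kernel launch) has completed.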

Also, I think the following would be an equivalent approach.

TF = (double *)malloc(sizeof(double) * Nz * Ny * (Nx + 2L));

        if (cudaHostRegister(TF, sizeof(double) * Nz * Ny * (Nx + 2L), cudaHostRegisterMapped) != cudaSuccess)
{
        printf("\n CUDA error: cudaHostRegister TF error");
        return 1;
}


if (cudaHostGetDevicePointer((void **)&devPtrTF, TF, 0) != cudaSuccess)
{
        printf("\n CUDA error: cudaHostGetDevicePointer TF error");
        return 1;
}

Is this also correct?
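
For comparison, a minimal sketch of the cudaHostRegister path might look like the following; again the element count n is an assumption, and the cudaHostUnregister/free cleanup is included only to keep the sketch self-contained.

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

int main(void)
{
        const int n = 1024;                  // assumed element count for the sketch
        double *TF = NULL, *devPtrTF = NULL;

        // Allocate ordinary host memory, then page-lock and map that same range.
        TF = (double *)malloc(n * sizeof(double));
        if (TF == NULL)
                return 1;

        // cudaHostRegister takes the pointer itself (not its address), and the
        // registered size should match the allocation being mapped.
        if (cudaHostRegister(TF, n * sizeof(double), cudaHostRegisterMapped) != cudaSuccess)
        {
                printf("cudaHostRegister failed\n");
                return 1;
        }
        if (cudaHostGetDevicePointer((void **)&devPtrTF, TF, 0) != cudaSuccess)
        {
                printf("cudaHostGetDevicePointer failed\n");
                return 1;
        }

        // ... use TF on the host and devPtrTF in kernels, as in the cudaHostAlloc sketch above ...

        cudaHostUnregister(TF);              // undo the registration before freeing
        free(TF);
        return 0;
}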

Thanks for any help with this.

It should be…

Have you read this: CUDA C Best Practices Guide - Zero-copy Memory? It is a good place to start with many things in CUDA ;)

MK