Once cudaHostAlloc(...) has been called, UVA support seems to make it unnecessary to call cudaHostGetDevicePointer() to obtain a device pointer for the pinned host allocation before using it in a CUDA kernel.
However, canUseHostPointerForRegisteredMem = 0 seems to indicate that calling it is mandatory, unless canUseHostPointerForRegisteredMem refers only to pointers registered with the cudaHostRegister(...) function.
All 64-bit OS implementations of CUDA are automatically a UVA regime.
In a UVA regime, with one exception, any memory allocated using cudaHostAlloc or cudaMallocHost automatically has the property that the device pointer and the host pointer used to access that allocation are numerically the same and interchangeable.
Allocations via cudaHostRegister do not automatically have this property.
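To make the bullet points above concrete, here is a minimal sketch of how one might check the interchangeability claim for a cudaHostAlloc allocation. It is only an illustration: the kernel name `increment`, the `CHECK` macro, the buffer size, and the use of device 0 are my own assumptions, and I pass cudaHostAllocMapped explicitly so the example does not rely on UVA mapping the allocation implicitly.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper for this sketch: bail out of main() on any CUDA error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Illustrative kernel: adds 1 to every element.
__global__ void increment(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main() {
    const int dev = 0;   // assumption: use device 0
    const int n = 256;   // assumption: small illustrative buffer
    int uva = 0;

    CHECK(cudaSetDevice(dev));
    CHECK(cudaDeviceGetAttribute(&uva, cudaDevAttrUnifiedAddressing, dev));

    // Pinned, mapped host allocation.
    int *h = nullptr;
    CHECK(cudaHostAlloc((void **)&h, n * sizeof(int), cudaHostAllocMapped));
    for (int i = 0; i < n; ++i) h[i] = i;

    // Query the device pointer anyway, so the two values can be compared.
    int *d = nullptr;
    CHECK(cudaHostGetDevicePointer((void **)&d, h, 0));
    printf("UVA: %d, host and device pointers match: %s\n",
           uva, (d == h) ? "yes" : "no");

    // Under UVA the host pointer is passed directly; otherwise use d.
    int *kernel_arg = uva ? h : d;
    increment<<<(n + 127) / 128, 128>>>(kernel_arg, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    printf("h[0] = %d, h[%d] = %d\n", h[0], n - 1, h[n - 1]);

    CHECK(cudaFreeHost(h));
    return 0;
}
```

Calling cudaHostGetDevicePointer() here is purely diagnostic; on a UVA system the expectation from the points above is that it simply hands back the same address.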
For a pinned host allocation in a UVA regime, you don’t need to look at device properties to answer this question. We have established that via the bullet points above, extracted from the doc section I linked.
I would say so, because we don’t need to look at device properties to answer the question of whether the pointer is interchangeable for host and device usage.
Correct, with the one exception called out in the section I linked.
Right, we have now established in a couple of different ways that this could only apply to memory obtained via cudaHostRegister.
So, to sum up:
In a UVA context (a 64-bit process on a device of compute capability 2.0 or higher, i.e. cudaDevAttrUnifiedAddressing = 1), host and device pointers share the same virtual address space, and memory allocated using cudaHostAlloc or cudaMallocHost is accessible from both host and device through the same pointer.
The case that differs is a host pointer registered via cudaHostRegister(...).
In this case, two different scenarios exist:
canUseHostPointerForRegisteredMem = 1: same as above, the host pointer can be used directly on the device.
canUseHostPointerForRegisteredMem = 0: I have to call cudaHostGetDevicePointer(...) to obtain a pointer that the device can use to access the data.
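The two cudaHostRegister scenarios above could be handled along these lines. This is a rough sketch, not a definitive recipe: the kernel `scale`, the `CHECK` macro, the buffer size, and the choice of device 0 are assumptions made for illustration, and the buffer comes from plain malloc only to stand in for memory that was allocated elsewhere.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper for this sketch: bail out of main() on any CUDA error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Illustrative kernel: doubles every element.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int dev = 0;    // assumption: use device 0
    const int n = 1024;   // assumption: small illustrative buffer

    CHECK(cudaSetDevice(dev));
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, dev));

    // Ordinary pageable host memory, pinned and mapped after the fact.
    float *h = (float *)malloc(n * sizeof(float));
    if (!h) return 1;
    for (int i = 0; i < n; ++i) h[i] = 1.0f;
    CHECK(cudaHostRegister(h, n * sizeof(float), cudaHostRegisterMapped));

    // Scenario 1: the registered host pointer is usable on the device as-is.
    float *d = h;
    // Scenario 2: a separate device pointer must be obtained.
    if (!prop.canUseHostPointerForRegisteredMem) {
        CHECK(cudaHostGetDevicePointer((void **)&d, h, 0));
    }
    printf("canUseHostPointerForRegisteredMem = %d, pointers match: %s\n",
           prop.canUseHostPointerForRegisteredMem, (d == h) ? "yes" : "no");

    scale<<<(n + 255) / 256, 256>>>(d, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    printf("h[0] = %f\n", h[0]);

    CHECK(cudaHostUnregister(h));
    free(h);
    return 0;
}
```

Calling cudaHostGetDevicePointer() unconditionally should also be fine in both scenarios, so code that wants to avoid the property check can always take that path for registered memory.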