Hi,
A question about HostBufferSize in NPP calls:
There are a large number of NPP routines for calculating buffer sizes, with names like nppiDotProdGetBufferHostSize_32s64f_C3R_Ctx. Our understanding is that
- the word Host in the name indicates that the buffer size will be stored into a memory location on the host (CPU), pointed to by hpBufferSize. The Host is for hp.
- the extension _Ctx indicates that this function runs on the GPU and therefore that the buffer size will be stored asynchronously, by a mechanism like cudaMemcpyAsync.
- Therefore: some kind of synchronization is needed following the call to this routine before accessing the buffer size at hpBufferSize.
Is this the right understanding?
Thanks,
Arnoud.