I do frequent transfers between host and device. Previously I was using regular (pageable) host memory.
Now I have switched to page-locked memory and ran into problems.
When I copy from it with cudaMemcpy3D (into a 3D CUDA array), it crashes somewhere inside the driver's code, on the 3rd call.
In the spec I found:
Overlap of Data Transfer and Kernel Execution
Some devices of compute capability 1.1 and higher can perform copies between page-locked host memory and device memory concurrently with kernel execution. Applications may query this capability by calling cudaGetDeviceProperties() and checking the deviceOverlap property. This capability is currently supported only for memory copies that do not involve CUDA arrays or 2D arrays allocated through cudaMallocPitch() (see Section 3.2.1).
OK, but this is about asynchronous copies, and I am trying to do a synchronous one. Is that supported?