Clamping texture to zero / adding a zero border


I am using 3D textures extensively and often have to clamp the values outside the volume to zero.

In order to do so I set the border voxels to 0 and use the clamping access mode.

Unfortunately I also have to interface with third-party software in which the volumes do not have a zero border and are represented by a linear array of floats.

Thus I need to make the device array (a 3D cudaArray) 2 voxels larger in each dimension.

I understand I have to use cudaArray (device builtin) to bind to a texture<float, 3>.

Now, what is the fastest / correct way to copy my data from host to device?

Is there an interface to access the memory that a cudaArray uses to make partial copies (line by line)?

Or do I really have to construct an array on the host having the correct size (+2 in each dimension) and fill it on the host side first, before I can copy it as a whole to the device?

This takes a very long time and in some cases cancels out the benefit of using the GPU entirely (not to mention the extra host memory needed, which can be up to 2-3 GB).

Without a border, I use something like

  // Describe the unpadded host volume as a pitched pointer (row pitch in bytes).
  cudaExtent extent = make_cudaExtent(dim.x, dim.y, dim.z);
  cudaMemcpy3DParms aParms = {0};
  aParms.srcPtr   = make_cudaPitchedPtr(src, dim.x * sizeof(float), dim.x, dim.y);
  aParms.dstArray = this->m_cudaArr;
  aParms.extent   = extent;  // in elements, because a cudaArray takes part in the copy
  aParms.kind     = cudaMemcpyHostToDevice;
  cudaMemcpy3D(&aParms);
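
If a padded array is still wanted, cudaMemcpy3DParms also has a dstPos field, so the unpadded host volume can in principle be copied straight into the interior of a larger cudaArray without first building a padded copy on the host. The following is only a sketch under that assumption: the function name copyIntoInterior is hypothetical, the array is assumed to have been allocated with dim + 2 voxels per axis, and the border voxels are not written by this call, so they would still have to be zeroed by some other means.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: copies an unpadded host volume of size `dim` into the
// interior of a cudaArray that was allocated 2 voxels larger along each axis.
// The one-voxel border of the array is left untouched by this function.
cudaError_t copyIntoInterior(cudaArray_t paddedArr, const float* src, int3 dim)
{
    cudaMemcpy3DParms p = {};
    // make_cudaPitchedPtr takes a non-const void*, hence the const_cast;
    // the source is only read for a host-to-device copy.
    p.srcPtr   = make_cudaPitchedPtr(const_cast<float*>(src),
                                     dim.x * sizeof(float), dim.x, dim.y);
    p.dstArray = paddedArr;
    p.dstPos   = make_cudaPos(1, 1, 1);  // skip the border voxel on each axis
    p.extent   = make_cudaExtent(dim.x, dim.y, dim.z);  // in elements
    p.kind     = cudaMemcpyHostToDevice;
    return cudaMemcpy3D(&p);
}
```

With a call such as copyIntoInterior(m_cudaArr, src, dim), only the dim.x * dim.y * dim.z interior voxels are transferred, so no padded host buffer is needed.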

Have you looked at using cudaAddressModeBorder? That should do exactly what you are looking for without copying the array.
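
A minimal sketch of this suggestion, assuming the texture object API of later CUDA releases rather than the texture<float, 3> reference used in the question (with a texture reference, the equivalent setting is its addressMode[] field). The names makeBorderTexture3D and sampleKernel are illustrative and do not come from this thread.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: wraps an existing 3D cudaArray in a texture object
// whose out-of-range reads return zero (border addressing) on all three axes.
cudaError_t makeBorderTexture3D(cudaArray_t arr, cudaTextureObject_t* tex)
{
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaTextureDesc td = {};
    td.addressMode[0]   = cudaAddressModeBorder;
    td.addressMode[1]   = cudaAddressModeBorder;
    td.addressMode[2]   = cudaAddressModeBorder;
    td.filterMode       = cudaFilterModeLinear;     // trilinear interpolation
    td.readMode         = cudaReadModeElementType;  // return the stored floats
    td.normalizedCoords = 0;                        // coordinates in voxel units

    return cudaCreateTextureObject(tex, &res, &td, nullptr);
}

// Hypothetical kernel: samples the volume at one point per thread.
__global__ void sampleKernel(cudaTextureObject_t tex, const float3* pts,
                             float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = tex3D<float>(tex, pts[i].x, pts[i].y, pts[i].z);
}
```

The array passed in can be the unpadded one filled by the existing cudaMemcpy3D call, so neither a padded host copy nor a larger device array is required.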

Thanks for the answer!

I had not come across this address mode so far, since it is not mentioned anywhere in the documentation that comes with the toolkit.

Is there any documentation on how it works? Where do I specify the value I want the border to have?

regards Rolf

According to this thread the border colour is fixed at zero in CUDA at the moment.

I tried cudaAddressModeBorder in order to clamp the texture values to zero in the background outside the texture.
Since I could not find any documentation on how it works, I simply tried a couple of things and also referred to thread 184506.

I wrote a simple test program that performs linear interpolation of a small 1D texture.
cudaAddressModeBorder works with normalized and with non-normalized coordinates!
This is in contrast to what is stated elsewhere (thread 184506).
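
A test along these lines could look roughly like the sketch below. It is not the program from this post: it assumes the texture object API, and the names sample1D and makeBorderTexture1D are hypothetical. The `normalized` flag switches between the two coordinate modes, so the same data can be sampled both ways and compared.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: reads the 1D texture at each coordinate in xs.
__global__ void sample1D(cudaTextureObject_t tex, const float* xs,
                         float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = tex1D<float>(tex, xs[i]);
}

// Hypothetical helper: uploads `count` floats into a 1D cudaArray and creates
// a linearly filtered texture object with border addressing on top of it.
cudaError_t makeBorderTexture1D(const float* host, size_t count, bool normalized,
                                cudaArray_t* arr, cudaTextureObject_t* tex)
{
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaError_t err = cudaMallocArray(arr, &desc, count, 0);  // height 0: 1D array
    if (err != cudaSuccess) return err;

    // Copy a single row of count * sizeof(float) bytes into the array.
    err = cudaMemcpy2DToArray(*arr, 0, 0, host, count * sizeof(float),
                              count * sizeof(float), 1, cudaMemcpyHostToDevice);
    if (err != cudaSuccess) return err;

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = *arr;

    cudaTextureDesc td = {};
    td.addressMode[0]   = cudaAddressModeBorder;    // zero outside the data
    td.filterMode       = cudaFilterModeLinear;     // linear interpolation
    td.readMode         = cudaReadModeElementType;
    td.normalizedCoords = normalized ? 1 : 0;

    return cudaCreateTextureObject(tex, &res, &td, nullptr);
}
```

With linear filtering and non-normalized coordinates, texel i is centred at i + 0.5, so coordinates within half a texel of either end blend the edge value with the zero border, and coordinates further outside are expected to return 0.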

Thanks for alerting us to the missing documentation for cudaAddressModeBorder. I have notified the documentation team to get this fixed for a future CUDA release.