I want to allocate an H x W array on the device with unified memory. I will use cuMemcpy2D with it.
Unfortunately, if W is not a multiple of the required alignment (which would make the array naturally pitched, with no gaps between rows), the memcpy fails with an invalid-parameter error.
If I use cuMemAllocPitch, the memcpy works, but AFAIK this would NOT be unified memory and I cannot use the CUdeviceptr value as a pointer on the host.
In order to use cuMemAllocManaged, I will need to calculate the pitch corresponding to W myself, then multiply it by H (times the element size). That means I need to know how cuMemAllocPitch calculates the pitch.
On my Kepler, the pitch is the row size in bytes rounded up to a multiple of 512 bytes (determined by experiment). On other devices this could be different. Will CU_DEVICE_ATTRIBUTE_TEXTURE_PITCH_ALIGNMENT always give the right value?
My question is: How can I have an H x W array in unified memory if W is not properly aligned?
Also: Is linear device memory automatically managed by the unified memory system (UMS), or is there a way I can make it so?
An enhancement request: provide a cuMemAllocManagedPitch routine (and similarly for the Runtime API).
- There is no pitched unified memory (UM).
- Textures have no connection to UM.
I tried cuMemcpy2D to UM with an unaligned pitch and it was happy.
If for some reason I want a pitch anyway, I can always allocate the memory as Height rows of Pitch elements (Pitch = Width + padding) and make the kernel aware of the pitch when accessing it.
Regarding textures: if I bind a texref to a linear device memory address that happens to be in UM, that will be OK, right? I have to obey the constraints on address and pitch alignment. That way I can fill the memory using cuMemcpy2D from some other data on the host.
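The manual-pitch idea above can be sketched with the runtime API (a minimal sketch, error checking omitted; the 512-byte alignment and the kernel/variable names are illustrative assumptions, not from the thread):

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// The kernel indexes rows by the explicit pitch (in elements),
// not by the logical width W.
__global__ void fill(float *buf, size_t pitchElems, int W, int H) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < W && y < H)
        buf[y * pitchElems + x] = (float)(y * W + x);
}

int main() {
    const int W = 100, H = 64;            // W deliberately not aligned
    const size_t align = 512;             // assumed; device-dependent
    size_t pitchBytes = ((W * sizeof(float) + align - 1) / align) * align;
    size_t pitchElems = pitchBytes / sizeof(float);

    float *buf = nullptr;                 // managed: usable on host and device
    cudaMallocManaged(&buf, pitchBytes * H);

    dim3 block(32, 8), grid((W + 31) / 32, (H + 7) / 8);
    fill<<<grid, block>>>(buf, pitchElems, W, H);
    cudaDeviceSynchronize();              // required before host access on pre-Pascal

    printf("buf[1][0] = %f\n", buf[1 * pitchElems + 0]);
    cudaFree(buf);
    return 0;
}
```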
Answers to my two questions:
- It doesn't matter if W is not properly aligned, unless the memory is going to be bound to a texture reference. cuMemcpy2D and memset will always work.
- No, linear (cuMemAlloc[Pitch]) device memory is not managed by UMS, but this is not necessary.
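To illustrate the first answer, here is a minimal driver-API sketch of a cuMemcpy2D from the host into managed memory with a tight (unaligned) pitch; error checking is omitted and the sizes are illustrative assumptions:

```cuda
#include <cuda.h>
#include <cstdio>

int main() {
    cuInit(0);
    CUdevice dev;  cuDeviceGet(&dev, 0);
    CUcontext ctx; cuCtxCreate(&ctx, 0, dev);

    const int W = 100, H = 16;               // 400-byte rows, not 512-aligned
    const size_t pitch = W * sizeof(float);  // tight pitch, no padding

    CUdeviceptr um;                          // managed allocation
    cuMemAllocManaged(&um, pitch * H, CU_MEM_ATTACH_GLOBAL);

    float host[H][W];
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x)
            host[y][x] = (float)(y * W + x);

    CUDA_MEMCPY2D cp = {};
    cp.srcMemoryType = CU_MEMORYTYPE_HOST;
    cp.srcHost       = host;
    cp.srcPitch      = pitch;
    cp.dstMemoryType = CU_MEMORYTYPE_DEVICE; // UM is addressed as device memory
    cp.dstDevice     = um;
    cp.dstPitch      = pitch;
    cp.WidthInBytes  = pitch;
    cp.Height        = H;
    cuMemcpy2D(&cp);                         // succeeds despite the unaligned pitch

    cuCtxSynchronize();
    printf("um[1][0] = %f\n", ((float *)um)[1 * W]); // host read via the UM pointer
    cuMemFree(um);
    cuCtxDestroy(ctx);
    return 0;
}
```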
AFAIK, UM cannot be the backing store for a texture.