PyCUDA pass pointers to GPU memory

sam17 · December 30, 2020, 1:45pm

Hi all,

Is it possible in PyCUDA to pass a “pointer to GPU memory” from one Python thread to another? The second thread also calls a CUDA function through PyCUDA, and its input is the output of the CUDA function run by the first thread.

(I have a hard time explaining this in words, so I’ll add a small schematic)

The idea is that the first PyCUDA function returns a pointer to its result (which can be as large as 300k elements). The Python side passes that pointer to the next process, which hands it to the next PyCUDA function to use as its input data. I noticed that moving the data is one of the performance bottlenecks in my application.

I have been looking into the shared memory stuff, but I can’t seem to easily get my head around it.

Thanks in advance,
Sam

Robert_Crovella · December 30, 2020, 2:56pm

https://gist.github.com/lebedov/5179201
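
For readers who cannot open the link, here is a minimal sketch of the CUDA IPC approach, assuming PyCUDA's `mem_get_ipc_handle` and `IPCMemoryHandle` are what the example relies on (the `cuIpcGetMemHandle` error reported further down suggests so). The function names, the `multiprocessing` pipe and event, and the array contents are illustrative choices, not taken from the gist. It needs a CUDA GPU and a platform that supports CUDA IPC, and error handling is kept minimal.

```python
import multiprocessing as mp

N = 300_000  # element count mentioned in the question


def producer(conn, done):
    """Fill a device array and export an IPC handle to it."""
    import numpy as np
    import pycuda.driver as drv
    import pycuda.gpuarray as gpuarray

    drv.init()
    ctx = drv.Device(0).make_context()
    try:
        x_gpu = gpuarray.to_gpu(np.arange(N, dtype=np.float32))
        # Opaque handle that another process can open; this is the call that
        # fails with "operation not supported" where CUDA IPC is unavailable.
        handle = drv.mem_get_ipc_handle(x_gpu.ptr)
        conn.send((handle, x_gpu.shape, x_gpu.dtype))
        done.wait()  # keep x_gpu allocated until the consumer has finished
    finally:
        ctx.pop()


def consumer(conn, done):
    """Open the producer's allocation and read it without a host round trip."""
    import pycuda.driver as drv
    import pycuda.gpuarray as gpuarray

    drv.init()
    ctx = drv.Device(0).make_context()
    try:
        handle, shape, dtype = conn.recv()
        # Wrap the imported device memory in a GPUArray; any kernel in this
        # process could now take x_gpu as its input.
        x_gpu = gpuarray.GPUArray(shape, dtype,
                                  gpudata=drv.IPCMemoryHandle(handle))
        print(x_gpu.get()[:4])
    finally:
        done.set()
        ctx.pop()


def run_demo():
    """Start both processes; CUDA is only initialised inside the children."""
    a, b = mp.Pipe()
    done = mp.Event()
    procs = [mp.Process(target=producer, args=(a, done)),
             mp.Process(target=consumer, args=(b, done))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

Calling `run_demo()` from a script's `if __name__ == "__main__":` block starts the two processes. Only the small handle travels through the pipe; the array itself stays in GPU memory.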

sam17 · December 31, 2020, 2:05pm

Thanks for the answer.

This example seems to work on my desktop (GTX1060), but not on a Jetson Nano.

The Nano yields the error
<class 'pycuda._driver.LogicError'> : cuIpcGetMemHandle failed: operation not supported

Could it be possible this IPC stuff is not implemented on Jetson?

Greetings,
Sam

Robert_Crovella · December 31, 2020, 3:31pm

Correct, see here:

"IPC functionality is not supported on Tegra platforms."

My only suggestion would be to use Linux-based host IPC, which is not specific to CUDA and not something I would be able to help with.

You might also ask on the Jetson sub-forum. Since Jetson memory is unified, there usually is not a huge issue if you use e.g. host-pinned memory. I believe it may be possible to use host-pinned memory for host-based IPC, but I can’t really help with that and you might get better ideas on the Jetson sub-forum.
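
As an illustration of the host-pinned memory idea within a single process, here is a rough sketch using PyCUDA's `pagelocked_empty` with the `DEVICEMAP` flag, so a kernel works directly on pinned host memory instead of on a separate device copy. The `scale` kernel and the helper names are made up for this example. It has not been tried on a Jetson and does not address sharing between processes.

```python
def grid_size(n, block=256):
    """Number of thread blocks needed to cover n elements."""
    return (n + block - 1) // block


def scale_in_pinned_memory(n=300_000):
    """Run a kernel directly on mapped, page-locked host memory (needs a GPU)."""
    import numpy as np
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    drv.init()
    # MAP_HOST allows page-locked host memory to be mapped into device space
    ctx = drv.Device(0).make_context(flags=drv.ctx_flags.MAP_HOST)
    try:
        # NumPy array backed by pinned host memory that the GPU can address
        a = drv.pagelocked_empty(n, np.float32,
                                 mem_flags=drv.host_alloc_flags.DEVICEMAP)
        a[:] = 1.0

        mod = SourceModule("""
        __global__ void scale(float *x, int n)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) x[i] *= 2.0f;
        }
        """)
        scale = mod.get_function("scale")

        # Device-side address of the same pinned buffer
        dev_ptr = np.intp(a.base.get_device_pointer())
        scale(dev_ptr, np.int32(n),
              block=(256, 1, 1), grid=(grid_size(n), 1))
        ctx.synchronize()  # wait for the kernel before reading a on the host
        return float(a[0])
    finally:
        ctx.pop()
```

No explicit `memcpy_htod` or `memcpy_dtoh` appears here: the host array and the kernel argument refer to the same pinned allocation.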

sam17 · December 31, 2020, 3:49pm

I will start testing with some host-pinned memory, and post another question on the Jetson sub-forum.

Anyway, a major thank you, I learned a lot of new stuff through the information you supplied.

Robert_Crovella · December 31, 2020, 4:11pm

I may have been unclear. Host-pinned memory by itself does not allow for or facilitate host-based IPC.

So the starting point would be to see if you can get host-based IPC working correctly in your Python environment (I assume there should be plenty of online help for that). Then, you might see if you can get that host-based IPC working even when the underlying source memory is host-pinned. Hope that helps.

Again, asking on the Jetson forum may be a good idea before going down this avenue.
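
As a possible starting point for the first step (plain host-based IPC in Python, with no CUDA involved), here is a small sketch using the standard library's `multiprocessing.shared_memory` (Python 3.8+). The helper names and the values written are placeholders. Whether such a block can then be made host-pinned for CUDA on a Jetson is the open question above and is not shown here.

```python
from multiprocessing import shared_memory

ITEMSIZE = 4  # bytes per float32 element


def create_buffer(n):
    """Create a named host shared-memory block that holds n float32 values."""
    return shared_memory.SharedMemory(create=True, size=n * ITEMSIZE)


def write_result(name, n):
    """Producer side: attach by name and write n float32 values in place."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        # Views must be released before close(), hence the with-block
        with shm.buf[: n * ITEMSIZE] as raw, raw.cast("f") as view:
            for i in range(n):
                view[i] = 0.5 * i  # stand-in for the first function's output
    finally:
        shm.close()


def read_result(name, n):
    """Consumer side: attach by name and read the values back as a list."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        with shm.buf[: n * ITEMSIZE] as raw, raw.cast("f") as view:
            return view.tolist()
    finally:
        shm.close()
```

In a real setup, `write_result` and `read_result` would each run in their own `multiprocessing.Process`, with only the short `shm.name` string passed between them. The process that created the block calls `shm.close()` and `shm.unlink()` once everyone is done.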
