Why require pinned memory for async transfers

Hi,

I was wondering why pinned memory is needed for the card to do async transfers?

AFAIK, the kernel driver could take any __user pointer and use get_user_pages to temporarily lock the underlying pages and obtain the information needed to build a scatter-gather list.
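To make the idea concrete, here is a rough userspace sketch of only the splitting step: cutting an arbitrary buffer into per-page segments, which is the shape a scatter-gather list takes. The names `sg_entry` and `build_sg_list` are made up for illustration and are not part of any driver. A real driver would additionally pin each page (for example via get_user_pages) and store the bus address of the page rather than the virtual address used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scatter-gather entry: one contiguous piece of the buffer
 * that lies entirely inside a single page. */
struct sg_entry {
    uintptr_t page_base; /* page-aligned virtual address; a real driver
                            would store the bus address of the pinned page */
    size_t    offset;    /* offset of the data within that page */
    size_t    length;    /* number of bytes in this segment */
};

/* Split [addr, addr + len) into per-page segments.
 * Writes at most max_entries entries to out and returns the number of
 * entries the whole buffer requires (which may exceed max_entries). */
size_t build_sg_list(uintptr_t addr, size_t len, size_t page_size,
                     struct sg_entry *out, size_t max_entries)
{
    /* The masking below assumes page_size is a power of two. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    size_t n = 0;
    while (len > 0) {
        size_t offset = (size_t)(addr & (page_size - 1));
        size_t chunk = page_size - offset; /* bytes left in this page */
        if (chunk > len)
            chunk = len;

        if (n < max_entries) {
            out[n].page_base = addr - offset;
            out[n].offset = offset;
            out[n].length = chunk;
        }
        n++;
        addr += chunk;
        len -= chunk;
    }
    return n;
}
```

With 4 KiB pages, a 0x1200-byte buffer starting at 0x1F00 comes out as three entries: a 0x100-byte tail of the first page, one full page, and a 0x100-byte head of the third page. Only the first and last entries can be partial pages.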

Sylvain

I guess the DMA engine is not capable of scatter-gather access?

I thought that too, but that wasn’t true for a long time because get_user_pages didn’t behave as we needed (or so one of the Linux kernel driver engineers explained to me). I’m keenly aware of how useful this feature is, and we’re investigating it again for a future release.

I sure hope that’s not the reason. SG DMA engines are pretty simple, and … NVidia makes GPUs, which are orders of magnitude more complex :)

Mmm, interesting. Do you have more details about what was failing?

I used that technique on some custom hardware (a PCIe card with an FPGA) and it worked fine for us. Obviously that’s a pretty limited test set compared to NVidia’s wide range of cards and supported platforms, but still … it’s supposed to work, and if it doesn’t, it’s a kernel bug worth investigating.

I’d definitely be interested in testing / debugging this if needed.

Yeah, making all those things zero-copy would just be great …

Do you know if the change would be only in the open-source part of the kernel driver, or if some changes would be needed in the object files as well?

I tried scanning the nv.c and ns-osinterface.c but didn’t see much related to VM page mapping.

Sylvain

Good to hear! Thanks, Sylvain, for bringing this up!

It’s been on my radar for a while. :P

But without Sylvain we would not yet have heard of it. ;)

Thanks, Tim, for having all the good stuff on your radar and making it happen. :thanks:

No answer to the follow-up questions I asked above?

I just know that one of our very serious Linux kernel guys said the implementation of get_user_pages wasn’t enough for a while.

I assume this is not a change in the open source layer.

I don’t doubt him, I just hate when I don’t understand exactly why stuff I’d expect to work doesn’t :P

He doesn’t read this forum, does he?

And BTW, do other OSes (OS X / Windows) have or use a similar function that avoids the requirement for pre-pinned memory?
