Managed memory crash

I’m working on a CUDA-accelerated Maya plugin that’s been idle for a while. For some reason, all writes to managed memory are crashing, whether the memory was declared with __managed__ or allocated with cudaMallocManaged. Any hints as to why? This happens even when I run the one CUDA sample that uses managed memory (UnifiedMemoryStreams) inside Maya.

Are there known issues with applications using both CUDA and OpenCL? This version of Maya started using OpenCL. (I’ve disabled its GPU evaluation, so it shouldn’t be active, but it still initializes it.)

This is with CUDA 7.5.18, Windows 7, a GeForce GTX 750, and driver version 359.06.

What exactly is meant by “all writes to managed memory are crashing”? What are the specific symptoms? Segfaults? Can you show an example of the error message(s)?

It’s just a typical invalid page fault (it would be a segfault if I weren’t on Windows).

Is it a 64-bit OS? Is the plugin compiled as a 64-bit application?

Yes. (Maya doesn’t have a 32-bit build.)

I guess I need to install Maya 2015 (which didn’t use OpenCL, and where this worked the last time I was working on this project) to see if it seems like an OpenCL conflict. It’s impossible to search for anything about CUDA/OpenCL conflicts, since search results are flooded with “CUDA vs. OpenCL”. If that’s the case, I’d have no option other than rewriting in OpenCL. (I don’t have any evidence that this is the issue, I just can’t think of anything else that has changed.)

Well, it does happen in Maya 2015 as well now, not just 2016. I guess I should be relieved, since at least that means it’s not a weird conflict with OpenCL, but I have no idea what changed. I did have very strange hard system crashes last time I looked at this (https://devtalk.nvidia.com/default/topic/823146/diagnosing-cuda-causing-a-hard-system-crash/) which I never found a solution to (and is one reason this has been shelved for so long), but this is nothing like that.

Unhandled exception at 0x000007FEC9D3ADAC in maya.exe: 0xC0000006: In page error writing location 0x0000001A00000000 (status code 0xC0000022).

It just looks like a plain old invalid access. Other CUDA code is working fine; this only happens once I try to use managed memory. 0x0000001A00000000 seems to be the start of the page range where managed memory is being allocated.

Perhaps you are attempting to read/write managed memory, after having launched a kernel, without issuing a cudaDeviceSynchronize(). Perhaps you should try to create a standalone sample that demonstrates the managed memory crash that does not depend on Maya or being a plugin.
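For reference, a standalone test along the lines suggested above might look like the following minimal sketch. It is not taken from the plugin: the kernel name `increment`, the buffer size, and the `CHECK` macro are all hypothetical and chosen only for illustration. It allocates managed memory, writes to it from the host, runs a kernel, and calls cudaDeviceSynchronize() before the host touches the memory again.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): abort with a message on any runtime error.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s failed: %s\n", #call,           \
                         cudaGetErrorString(err_));                  \
            std::exit(EXIT_FAILURE);                                 \
        }                                                            \
    } while (0)

// Illustrative kernel: add 1 to every element.
__global__ void increment(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main()
{
    const int n = 1024;  // arbitrary size, assumed for the sketch
    int *data = nullptr;

    CHECK(cudaMallocManaged(&data, n * sizeof(int)));

    // Host write before any kernel has been launched.
    for (int i = 0; i < n; ++i) data[i] = i;

    increment<<<(n + 255) / 256, 256>>>(data, n);
    CHECK(cudaGetLastError());

    // On devices without concurrent managed access, the host must not
    // touch managed memory again until the device is idle.
    CHECK(cudaDeviceSynchronize());

    // Host write and reads after synchronization.
    data[0] = 42;
    std::printf("data[0] = %d, data[%d] = %d\n", data[0], n - 1, data[n - 1]);

    CHECK(cudaFree(data));
    return 0;
}
```

If this runs cleanly as a plain executable but the same code faults when loaded as a plugin, that would point at the host process or its environment rather than at the synchronization logic.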

I’m already synchronizing. I also tried writing to memory allocated with cudaMemAttachHost (which doesn’t care whether a kernel is running), and I tried the UnifiedMemoryStreams sample as a Maya plugin. It only happens inside Maya, but it didn’t happen before, during the primary development of this code.
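To make the cudaMemAttachHost case concrete, here is a minimal sketch, assuming a standalone program rather than the plugin itself. The names `fill`, `scratch`, and `hostSide` and the `CHECK` macro are hypothetical. Host-attached managed memory is not visible to the device until it is attached to a stream, so the host can write to it while a kernel that uses other memory is still running.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): abort with a message on any runtime error.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s failed: %s\n", #call,           \
                         cudaGetErrorString(err_));                  \
            std::exit(EXIT_FAILURE);                                 \
        }                                                            \
    } while (0)

// Illustrative kernel: set every element to a constant.
__global__ void fill(int *out, int n, int value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = value;
}

int main()
{
    const int n = 1 << 20;  // arbitrary size, assumed for the sketch
    const int block = 256;
    const int grid = (n + block - 1) / block;

    int *scratch = nullptr;   // ordinary device memory
    int *hostSide = nullptr;  // managed memory, initially attached to the host
    CHECK(cudaMalloc(&scratch, n * sizeof(int)));
    CHECK(cudaMallocManaged(&hostSide, n * sizeof(int), cudaMemAttachHost));

    // This kernel only touches 'scratch', never 'hostSide'.
    fill<<<grid, block>>>(scratch, n, 7);
    CHECK(cudaGetLastError());

    // No synchronization here: the buffer is attached to the host, so the
    // host may write to it even though a kernel may still be running.
    for (int i = 0; i < n; ++i) hostSide[i] = i;

    // Before the device may use the buffer, attach it to a stream.
    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));
    CHECK(cudaStreamAttachMemAsync(stream, hostSide, 0, cudaMemAttachSingle));
    CHECK(cudaStreamSynchronize(stream));

    fill<<<grid, block, 0, stream>>>(hostSide, n, 9);
    CHECK(cudaGetLastError());

    // The host may read the buffer again once its stream is idle.
    CHECK(cudaStreamSynchronize(stream));
    std::printf("hostSide[0] = %d\n", hostSide[0]);

    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(hostSide));
    CHECK(cudaFree(scratch));
    return 0;
}
```

A fault on the first host write loop in a test like this would rule out a missing synchronization as the cause, since no kernel ever has access to the buffer at that point.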

I’m probably going to try reinstalling Windows, to see if this is a weird different manifestation of the other strange behavior I was seeing before. (But if the solution is “reinstall Windows”, then I’m definitely going to have to plan a switch to OpenCL, since if I tell a user having this problem to reinstall Windows, they’ll laugh in my face…)

Well, it’s not happening on a Windows 10 install. No idea why. I think dropping CUDA is the only real option: every time I use it I hit some strange showstopper problem that’s impossible to debug. If I can’t even figure out what’s wrong on my own system, I’ll never be able to support remote users.