cudaMallocPitch crash on 280

I’m currently working on a very large CUDA application that allocates and frees millions of pitched buffers with cudaMallocPitch. I’m consistently hitting a crash inside cudaMallocPitch (unfortunately only after several hours of processing), with a stack trace ending with:

Program received signal SIGSEGV, Segmentation fault.
0x00007f4ecc468566 in ?? () from /usr/lib/libcuda.so.1
#1 0x00007f4ecc45bce3 in ?? () from /usr/lib/libcuda.so.1
#2 0x00007f4ecc45041a in ?? () from /usr/lib/libcuda.so.1
#3 0x00007f4ecc2040c3 in cudaMallocPitch () from /usr/local/cuda/lib/libcudart.so.2

I’m allocating a plane that is 2048x1556, and I should have plenty of CUDA memory available. Any ideas on how to chase this down? As I said, it takes HOURS and millions of mallocs before it happens, so it’s tricky to track down. Are there any known problems I should look out for? I’m running driver version 177.67 on a 280 card. Thanks.

Can you provide an app that reproduces this, along with an nvidia-bug-report.log?

Unfortunately I can’t share the app I’m currently seeing the problem in, as it contains a lot of proprietary information. I’ll see about creating a test app that reproduces it. I’ll also work on getting you the ‘nvidia-bug-report.log’: the app is running right now, and when it crashes I’ll send you the log. Thanks for the quick response.
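
For anyone wanting to build a similar standalone reproducer, here is a rough sketch of what the stress loop could look like. It is not the original application's code. The names stressPitchAlloc, PitchAlloc, PitchFree and StressResult are made up for illustration, and the allocator is passed in as callables so the loop itself compiles without the CUDA toolkit. The width in bytes is left to the caller, since the element size of the 2048x1556 plane is not stated above.

```cpp
#include <cstddef>
#include <cstdio>
#include <functional>

// Outcome of one stress run.
struct StressResult {
    std::size_t completed;  // alloc/free pairs that succeeded
    bool failed;            // true if an alloc or free reported an error
};

// Hypothetical allocator interface (an assumption, not a CUDA API).
// In a real reproducer these would wrap
//   cudaMallocPitch(ptr, pitch, widthBytes, height) and cudaFree(ptr),
// returning true when the call yields cudaSuccess.
using PitchAlloc = std::function<bool(void** ptr, std::size_t* pitch,
                                      std::size_t widthBytes,
                                      std::size_t height)>;
using PitchFree = std::function<bool(void* ptr)>;

// Repeatedly allocate and free one pitched plane, stopping at the first
// reported error. Progress is printed every `logEvery` iterations
// (0 disables it) so that a hard crash still leaves a last-known-good count.
StressResult stressPitchAlloc(const PitchAlloc& alloc, const PitchFree& release,
                              std::size_t widthBytes, std::size_t height,
                              std::size_t iterations, std::size_t logEvery)
{
    StressResult result{0, false};
    for (std::size_t i = 0; i < iterations; ++i) {
        void* ptr = nullptr;
        std::size_t pitch = 0;
        // Treat an error return, a null pointer, or a pitch narrower than
        // the requested row as a failed allocation.
        if (!alloc(&ptr, &pitch, widthBytes, height) ||
            ptr == nullptr || pitch < widthBytes) {
            std::fprintf(stderr, "allocation failed at iteration %zu\n", i);
            result.failed = true;
            return result;
        }
        if (!release(ptr)) {
            std::fprintf(stderr, "free failed at iteration %zu\n", i);
            result.failed = true;
            return result;
        }
        ++result.completed;
        if (logEvery != 0 && result.completed % logEvery == 0) {
            std::fprintf(stderr, "%zu iterations ok\n", result.completed);
            std::fflush(stderr);
        }
    }
    return result;
}
```

To run it against the GPU, one would pass small lambdas that call cudaMallocPitch and cudaFree and compare the result to cudaSuccess. Checking return codes will not catch a SIGSEGV inside the driver itself, but the periodic progress line at least shows roughly how many allocations succeeded before the crash, which may be useful to attach alongside the bug report log.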