Nsight Seg Fault

Hey guys,

I’m using CUDA for the first time, and I think I’m having trouble allocating enough memory for an array of objects. Each object has a size of 81772 (pretty huge, I know), and I want to create two arrays of these, both the same length, to pass to the kernel. With 32 elements per array, the program runs fine, albeit slowly. With 64, it creates the first array and then seg faults. With something like 128, it seg faults while creating the first array. What is my problem, and how do I fix it? I’m developing on a VM and running the code on a remote machine.

First off, determine whether you’re actually running out of VRAM. You can do this easily on paper by comparing the memory available with the total size of the arrays you want to allocate.
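
To make the on-paper check concrete, here is a minimal sketch of the arithmetic. It treats the 81772 figure from the question as a size in bytes, which is an assumption, and the helper names `bytes_needed` and `max_objects_per_array` are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Total bytes required for `num_arrays` arrays holding `count` objects each.
inline std::uint64_t bytes_needed(std::uint64_t object_size,
                                  std::uint64_t count,
                                  std::uint64_t num_arrays) {
    return object_size * count * num_arrays;
}

// Largest per-array element count that fits in `budget_bytes`
// when `num_arrays` arrays of equal length are allocated.
inline std::uint64_t max_objects_per_array(std::uint64_t budget_bytes,
                                           std::uint64_t object_size,
                                           std::uint64_t num_arrays) {
    assert(object_size > 0 && num_arrays > 0);  // avoid division by zero
    return budget_bytes / (object_size * num_arrays);
}
```

With those numbers, `bytes_needed(81772, 128, 2)` comes to 20,933,632 bytes, roughly 20 MB. That total can be compared against the free device memory, which `cudaMemGetInfo` reports at run time.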

If it truly is a memory issue, you’ll need to process the data in chunks. There are ways of optimizing this, but fundamentally you first have to figure out how to break your problem down into smaller pieces that fit in memory.
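
As a rough illustration of the chunking idea, the sketch below only plans the pieces on the host side. The `Chunk` struct and `plan_chunks` function are hypothetical names, and the chunk size is something you would pick from your own memory budget.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One contiguous piece of the full array: starting index and element count.
struct Chunk {
    std::size_t first;
    std::size_t count;
};

// Split `total` elements into pieces of at most `chunk_size` elements.
// The last piece may be shorter than the others.
inline std::vector<Chunk> plan_chunks(std::size_t total, std::size_t chunk_size) {
    assert(chunk_size > 0);  // a zero chunk size would never make progress
    std::vector<Chunk> chunks;
    for (std::size_t first = 0; first < total; first += chunk_size) {
        chunks.push_back({first, std::min(chunk_size, total - first)});
    }
    return chunks;
}
```

For each planned chunk you would then copy `count` objects starting at index `first` to the device with `cudaMemcpy`, launch the kernel on that piece, and copy the results back before moving on to the next chunk.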

The size of the objects is >82000 bytes. We have 8 GB of RAM, so I should be able to create quite a few of these objects (at least a couple hundred) while allocating 6 GB of memory to Nsight, yet I’m limited to somewhere between 64 and 128. I suspect it’s an Eclipse Nsight IDE limitation on the heap size.



82 KB is not a huge object. You should be able to allocate more than 20,000-30,000 such objects without a problem.

Please provide:
a. Source code and the build command for a minimal reproducible example.
b. In the post, mark the line at which the segfault occurs.
c. The OS, OS bitness, application bitness, NVIDIA GPU, NVIDIA driver version, and CUDA toolkit version used.

If your application does not perform proper CUDA error checking, then please add error checking to the code. A segfault is a CPU (host-side) error. It is most likely due to a bad pointer dereference, caused either by missing error handling or by a simple pointer mistake.
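
For reference, one common way to add the error checking is to wrap every runtime call in a macro. The sketch below is a minimal example, not your code: the `CUDA_CHECK` macro, the `Item` struct (a stand-in sized from the 81772 figure, assumed to be bytes), and the `touch` kernel are all hypothetical.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with file/line information if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error %s at %s:%d\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical stand-in for the large object from the question.
struct Item {
    char bytes[81772];
};

// Trivial kernel that writes one byte per object.
__global__ void touch(Item* items, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        items[i].bytes[0] = 1;
    }
}

int main() {
    const int n = 128;
    Item* d_items = nullptr;

    // Allocate on the device; never use d_items if this call fails.
    CUDA_CHECK(cudaMalloc(&d_items, n * sizeof(Item)));

    touch<<<(n + 63) / 64, 64>>>(d_items, n);
    CUDA_CHECK(cudaGetLastError());       // catches launch errors
    CUDA_CHECK(cudaDeviceSynchronize());  // catches errors during execution

    CUDA_CHECK(cudaFree(d_items));
    return 0;
}
```

If an allocation or launch fails, the program stops with the CUDA error string instead of carrying on with an invalid pointer, which makes it easier to tell a CUDA failure apart from a host-side pointer bug.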