Unable to profile application

When I run the profiler on my code, I get the following error:

Unable to profile application.
"The application being profiled received a signal."

I don’t know where this error comes from or how to fix it. My application works perfectly when I run it.

My app computes many FFTs without using cuFFT. When I decrease the number of FFTs to solve (<103), I can profile it. I don’t know why there is a threshold.
One more thing: I use a lot of registers, 255. (I know that is bad!!!)

I’ve tested on three cards and the result is the same:
K20C, M2090, GTX Titan
I use CUDA 5.5 with Nsight.

Thank you

Hello,

I found something.
When I profile with Nsight, I get the error message above, but when I use the nvprof command, I get this error:

Error: Application received signal 139

On Linux, 139 corresponds to a segmentation fault (SIGSEGV).
This is weird, because the program works when I launch it from Nsight, which runs it through a Makefile.
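
As a side note on where 139 comes from: a shell reports a process killed by a signal as 128 plus the signal number. Here is a minimal sketch of decoding it, assuming a POSIX-style shell (sh or bash) whose `kill -l` accepts a signal number; the variable names are only illustrative.

```shell
# Decode an exit status into the signal that terminated the process.
# Statuses above 128 mean "killed by signal (status - 128)".
status=139
if [ "$status" -gt 128 ]; then
    signum=$((status - 128))        # 139 - 128 = 11
    signame=$(kill -l "$signum")    # signal name without the SIG prefix
    echo "exit status $status = signal $signum (SIG$signame)"
fi
```

The same check works on `$?` right after running a program from the terminal.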

So the problem is here:

make -f myMakefile.mk run

which simply runs my program (/tmp/ramdrive/…/myProgram.run), and it works.

But when I run the command by itself in a terminal (/bin/bash):

/tmp/ramdrive/.../myProgram.run

I get a segmentation fault.

So this is not a profiler problem, only a difference between direct execution and execution through “make”.

The only difference between the two executions that I found is the shell: make uses /bin/sh and I use /bin/bash. I also tried changing the SHELL variable in the makefile, and the execution succeeds.

Does anybody know of a difference between direct execution and execution through make? Stack allocation? Heap allocation?

Thank you.
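
One way to look for such a difference is to snapshot what a process inherits in each context and compare the two. The sketch below is only an assumption about where to look (resource limits such as the stack size, and environment variables), not a confirmed cause; `dump_context` and the file names under /tmp are hypothetical.

```shell
# Hypothetical helper: write the resource limits and the environment of
# the current shell to the file given as the first argument.
dump_context() {
    {
        echo "== limits =="
        ulimit -a            # includes the stack size limit
        echo "== environment =="
        env | sort           # sorted so that two snapshots diff cleanly
    } > "$1"
}

# Snapshot taken from the interactive terminal.
dump_context /tmp/context_terminal.txt

# Run the same commands from a recipe line of the makefile, writing to a
# second file such as /tmp/context_make.txt, and then compare:
#   diff /tmp/context_terminal.txt /tmp/context_make.txt
```

If the two snapshots differ, for example in the stack limit or in a library search path, that would be the first thing to try reproducing by hand.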

I had a similar problem in my application. When I compiled and ran it from the command line, it ran perfectly, but when I ran it from Nsight or profiled it (with either Visual Profiler or nvprof), a lot of problems appeared, such as NaN in the results or memory errors. Likewise, when I decreased the size of the data, it ran OK. I think it is a memory management problem, but I am not sure which part of memory is involved. I am using Ubuntu 12.04 and a GeForce 660 with CUDA 5.5.

MLiu,
have you tried running your application with cuda-memcheck?