Computeprof crashes

I am using computeprof to profile my PGI-accelerated code and CUDA C code on Fermi hardware. The application crashes when I try to run the occupancy analysis. Here is the error I get:

*** glibc detected *** /usr/local/cuda/computeprof/bin/computeprof: malloc(): memory corruption: 0x0000000017638a20 ***

======= Backtrace: =========

/lib64/libc.so.6[0x3382e72fae]

/lib64/libc.so.6(__libc_malloc+0x6e)[0x3382e74cde]

/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN7QString17fromLatin1_helperEPKci+0x68)[0x2b75c10561c8]

/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN7QString16fromAscii_helperEPKci+0xa5)[0x2b75c1059055]

/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN7QString9fromAsciiEPKci+0xe)[0x2b75c105908e]

/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN8QVariantC1EPKc+0x32)[0x2b75c111ce12]

/usr/local/cuda/computeprof/bin/computeprof[0x440f07]

/usr/local/cuda/computeprof/bin/computeprof[0x4424d5]

/usr/local/cuda/computeprof/bin/computeprof[0x5111f6]

/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN11QMetaObject8activateEP7QObjectiiPPv+0x33f)[0x2b75c111140f]

/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN7QAction9triggeredEb+0x37)[0x2b75c039bd87]

/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN7QAction8activateENS_11ActionEventE+0x64)[0x2b75c039c0f4]

/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN12QMenuPrivate19activateCausedStackERK5QListI8QPointerI7QWidgetEEP7QActionNS7_11ActionEventEb+0x152)[0x2b75c07cf112]

/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN12QMenuPrivate14activateActionEP7QActionNS0_11ActionEventEb+0x1eb)[0x2b75c07d2b5b]

/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN7QWidget5eventEP6QEvent+0x3ae)[0x2b75c03f69de]

/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN5QMenu5eventEP6QEvent+0x5b)[0x2b75c07d086b]

/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN19QApplicationPrivate13notify_helperEP7QObjectP6QEvent+0xc0)[0x2b75c03a2b20]

/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN12QApplication6notifyEP7QObjectP6QEvent+0xbd2)[0x2b75c03a48d2]

/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN16QCoreApplication14notifyInternalEP7QObjectP6QEvent+0x9b)[0x2b75c10feb1b]

/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN19QApplicationPrivate14sendMouseEventEP7QWidgetP11QMouseEventS1_S1_PS1_R8QPointerIS0_E+0x2bc)[0x2b75c03a398c]

/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN9QETWidget19translateMouseEventEPK7_XEvent+0x3cd)[0x2b75c041414d]

/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN12QApplication15x11ProcessEventEP7_XEvent+0x131f)[0x2b75c0412fef]

/usr/local/cuda/computeprof/bin/libQtGui.so.4[0x2b75c04388a2]

/lib64/libglib-2.0.so.0(g_main_context_dispatch+0x1b4)[0x338462cdb4]

/lib64/libglib-2.0.so.0[0x338462fc0d]

/lib64/libglib-2.0.so.0(g_main_context_iteration+0x6e)[0x338463011e]

/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN20QEventDispatcherGlib13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE+0x70)[0x2b75c11293a0]

/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN23QGuiEventDispatcherGlib13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE+0x29)[0x2b75c0438299]

/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN10QEventLoop13processEventsE6QFlagsINS_17ProcessEventsFlagEE+0x43)[0x2b75c10fe153]

/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE+0xcb)[0x2b75c10fe31b]

/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN16QCoreApplication4execEv+0x9f)[0x2b75c110284f]

/usr/local/cuda/computeprof/bin/computeprof(_ZN11QMainWindow5eventEP6QEvent+0x2f3)[0x41bf23]

/lib64/libc.so.6(__libc_start_main+0xf4)[0x3382e1d994]

/usr/local/cuda/computeprof/bin/computeprof(_ZN18QStandardItemModel13insertColumnsEiiRK11QModelIndex+0x8a)[0x41bcfa]

Also, I was previously using cudaprof on the same code (on a Tesla C1070), and it showed profile results for global memory throughput and warp serialize.

The computeprof analysis options do not seem to be working as expected. Please advise.

We’ve been seeing this problem since May (when I started working on this stuff). It could have been a problem since before that…I am not sure.

But we have given up on using this tool. We spend less time profiling the code from the command line and doing some back-of-the-napkin estimates than we would fiddling around with something NVIDIA cannot seem to fix over multiple CUDA releases.

For what it’s worth, I’m also having this problem, so it would appear that it still hasn’t been fixed (or I’m missing the patch).

Matt

I was able to fix the problem by adding $NVIDIA_CUDA_SDK_LOCATION/bin/computeprof to LD_LIBRARY_PATH.
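
For anyone who wants to try the same workaround, here is a minimal sketch of one way to do it in a POSIX shell. The helper name `prepend_lib_dir` is made up for this example, and the directory is simply the one quoted in the post above; it may be different on your installation, so treat it as an assumption.

```shell
# Hypothetical helper: put a directory at the front of LD_LIBRARY_PATH,
# unless that directory is already listed there.
prepend_lib_dir() {
    dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;  # already present, leave the variable unchanged
        *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Directory as given in the post; only applied if the variable is set.
if [ -n "${NVIDIA_CUDA_SDK_LOCATION:-}" ]; then
    prepend_lib_dir "$NVIDIA_CUDA_SDK_LOCATION/bin/computeprof"
fi
```

Run this in the same shell session you launch computeprof from (or put it in your shell startup file), since the setting only affects processes started afterwards.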

Marc

Are you seeing a computeprof crash when using “Occupancy Analysis”? Did you try the CUDA 4.0 release? Please send more details if you still see the issue.

Sorry, I just saw this. I think I’m still having the issue. I’ll try it later today and get back to you.

Matt

I’m not able to reproduce it right now, so I guess I won’t be of much help, sorry.

Matt