I am using computeprof to profile my PGI-accelerated code and CUDA C code on Fermi hardware. computeprof itself crashes when I try to run the occupancy analysis. This is the error I get:
*** glibc detected *** /usr/local/cuda/computeprof/bin/computeprof: malloc(): memory corruption: 0x0000000017638a20 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3382e72fae]
/lib64/libc.so.6(__libc_malloc+0x6e)[0x3382e74cde]
/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN7QString17fromLatin1_helperEPKci+0x68)[0x2b75c10561c8]
/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN7QString16fromAscii_helperEPKci+0xa5)[0x2b75c1059055]
/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN7QString9fromAsciiEPKci+0xe)[0x2b75c105908e]
/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN8QVariantC1EPKc+0x32)[0x2b75c111ce12]
/usr/local/cuda/computeprof/bin/computeprof[0x440f07]
/usr/local/cuda/computeprof/bin/computeprof[0x4424d5]
/usr/local/cuda/computeprof/bin/computeprof[0x5111f6]
/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN11QMetaObject8activateEP7QObjectiiPPv+0x33f)[0x2b75c111140f]
/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN7QAction9triggeredEb+0x37)[0x2b75c039bd87]
/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN7QAction8activateENS_11ActionEventE+0x64)[0x2b75c039c0f4]
/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN12QMenuPrivate19activateCausedStackERK5QListI8QPointerI7QWidgetEEP7QActionNS7_11ActionEventEb+0x152)[0x2b75c07cf112]
/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN12QMenuPrivate14activateActionEP7QActionNS0_11ActionEventEb+0x1eb)[0x2b75c07d2b5b]
/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN7QWidget5eventEP6QEvent+0x3ae)[0x2b75c03f69de]
/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN5QMenu5eventEP6QEvent+0x5b)[0x2b75c07d086b]
/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN19QApplicationPrivate13notify_helperEP7QObjectP6QEvent+0xc0)[0x2b75c03a2b20]
/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN12QApplication6notifyEP7QObjectP6QEvent+0xbd2)[0x2b75c03a48d2]
/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN16QCoreApplication14notifyInternalEP7QObjectP6QEvent+0x9b)[0x2b75c10feb1b]
/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN19QApplicationPrivate14sendMouseEventEP7QWidgetP11QMouseEventS1_S1_PS1_R8QPointerIS0_E+0x2bc)[0x2b75c03a398c]
/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN9QETWidget19translateMouseEventEPK7_XEvent+0x3cd)[0x2b75c041414d]
/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN12QApplication15x11ProcessEventEP7_XEvent+0x131f)[0x2b75c0412fef]
/usr/local/cuda/computeprof/bin/libQtGui.so.4[0x2b75c04388a2]
/lib64/libglib-2.0.so.0(g_main_context_dispatch+0x1b4)[0x338462cdb4]
/lib64/libglib-2.0.so.0[0x338462fc0d]
/lib64/libglib-2.0.so.0(g_main_context_iteration+0x6e)[0x338463011e]
/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN20QEventDispatcherGlib13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE+0x70)[0x2b75c11293a0]
/usr/local/cuda/computeprof/bin/libQtGui.so.4(_ZN23QGuiEventDispatcherGlib13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE+0x29)[0x2b75c0438299]
/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN10QEventLoop13processEventsE6QFlagsINS_17ProcessEventsFlagEE+0x43)[0x2b75c10fe153]
/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE+0xcb)[0x2b75c10fe31b]
/usr/local/cuda/computeprof/bin/libQtCore.so.4(_ZN16QCoreApplication4execEv+0x9f)[0x2b75c110284f]
/usr/local/cuda/computeprof/bin/computeprof(_ZN11QMainWindow5eventEP6QEvent+0x2f3)[0x41bf23]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x3382e1d994]
/usr/local/cuda/computeprof/bin/computeprof(_ZN18QStandardItemModel13insertColumnsEiiRK11QModelIndex+0x8a)[0x41bcfa]
Also, I was previously using cudaprof for the same code (on a Tesla C1070), and it showed profile results for global memory throughput and warp serialize.
The computeprof analysis options do not seem to be working as expected. Please advise.