CUDA 3.2: mutex lock hangs indefinitely

I am not sure whether this is a bug in my code exposed by the new CUDA 3.2 or something else. I never saw it with CUDA 3.1. When built against the new CUDA 3.2 driver and toolkit, my application hangs at startup. A backtrace of all threads is below.

The one unusual thing the application does is make extensive use of threads and shared contexts:

new GLXContext for the OpenGL window (a)

new GLXContext for OpenGL off-screen rendering, sharing lists with "a" (b)

new GLXContext and CUDA context, sharing lists with "b"

All of these are guarded by a single X Display lock.

Compiled with g++ 4.5.

[codebox]

====== thread apply all bt

Thread 4 (Thread 0x7f5577351710 (LWP 13009)):

#0 __lll_lock_wait () at …/nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136

#1 0x00007f5582ad5104 in _L_lock_1024 () from /lib/libpthread.so.0

#2 0x00007f5582ad4f67 in __pthread_mutex_lock (mutex=0x7f55844f0120) at pthread_mutex_lock.c:82

#3 0x00007f5584297729 in ?? () from /usr/lib/libGL.so.1

#4 0x00007f558429a689 in ?? () from /usr/lib/libGL.so.1

#5 0x00007f558429aaba in ?? () from /usr/lib/libGL.so.1

#6 0x00007f557dfa698d in ?? () from /usr/lib/tls/libnvidia-tls.so.260.24

#7 0x00007f5584c92f6d in boost::detail::get_once_per_thread_epoch() () from /opt/boost_1_43_0/stage/lib/libboost_thread.so.1.43.0

#8 0x00007f5584c8c9a0 in T.1292 () from /opt/boost_1_43_0/stage/lib/libboost_thread.so.1.43.0

#9 0x00007f5584c8cd99 in boost::detail::set_current_thread_data(boost::detail::thread_data_base*) () from /opt/boost_1_43_0/stage/lib/libboost_thread.so.1.43.0

#10 0x00007f5584c8cf55 in thread_proxy () from /opt/boost_1_43_0/stage/lib/libboost_thread.so.1.43.0

#11 0x00007f55842998d3 in ?? () from /usr/lib/libGL.so.1

#12 0x00007f5582ad28ba in start_thread (arg=) at pthread_create.c:300

#13 0x00007f557e89702d in clone () at …/sysdeps/unix/sysv/linux/x86_64/clone.S:112

#14 0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f5576b50710 (LWP 13010)):

#0 __lll_lock_wait () at …/nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136

#1 0x00007f5582ad5104 in _L_lock_1024 () from /lib/libpthread.so.0

#2 0x00007f5582ad4f67 in __pthread_mutex_lock (mutex=0x7f55844f0120) at pthread_mutex_lock.c:82

#3 0x00007f5584297729 in ?? () from /usr/lib/libGL.so.1

#4 0x00007f558429a689 in ?? () from /usr/lib/libGL.so.1

#5 0x00007f558429aaba in ?? () from /usr/lib/libGL.so.1

#6 0x00007f557dfa698d in ?? () from /usr/lib/tls/libnvidia-tls.so.260.24

#7 0x00007f55847f18dd in operator new(unsigned long) () from /usr/local/lib/libfltk2.so

#8 0x00007f557f061459 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator const&) () from /usr/lib/libstdc++.so.6

#9 0x00007f557f0621eb in std::string::_Rep::_M_clone(std::allocator const&, unsigned long) () from /usr/lib/libstdc++.so.6

#10 0x00007f557f0622dc in std::string::reserve(unsigned long) () from /usr/lib/libstdc++.so.6

#11 0x00007f557f05b4f3 in std::basic_stringbuf<char, std::char_traits, std::allocator >::overflow(int) () from /usr/lib/libstdc++.so.6

#12 0x00007f557f05fb45 in std::basic_streambuf<char, std::char_traits >::xsputn(char const*, long) () from /usr/lib/libstdc++.so.6

#13 0x00007f557f056165 in std::basic_ostream<char, std::char_traits >& std::__ostream_insert<char, std::char_traits >(std::basic_ostream<char, std::char_traits >&, char const*, long) () from /usr/lib/libstdc++.so.6

#14 0x00007f557f0563ef in std::basic_ostream<char, std::char_traits >& std::operator<< <std::char_traits >(std::basic_ostream<char, std::char_traits >&, char const*) () from /usr/lib/libstdc++.so.6

#15 0x00000000008c7b5e in pt::Thread_Pool::start_thread (this=0x116a1f0) at /home/x/pt/src/util/Thread_Pool.cpp:144

#16 0x00000000008cbb69 in boost::_mfi::mf0<void, pt::Thread_Pool>::operator() (this=0x1265390, p=0x116a1f0) at /opt/boost_1_43_0/boost/bind/mem_fn_template.hpp:49

#17 0x00000000008cbcbe in boost::_bi::list1<boost::_bi::valuept::Thread_Pool* >::operator()<boost::_mfi::mf0<void, pt::Thread_Pool>, boost::_bi::list0> (this=0x12653a0, f=…, a=…) at /opt/boost_1_43_0/boost/bind/bind.hpp:253

#18 0x00000000008cbc6c in boost::_bi::bind_t<void, boost::_mfi::mf0<void, pt::Thread_Pool>, boost::_bi::list1<boost::_bi::valuept::Thread_Pool* > >::operator() (this=0x1265390) at /opt/boost_1_43_0/boost/bind/bind_template.hpp:20

#19 0x00000000008cbc2e in boost::detail::thread_data<boost::_bi::bind_t<void, boost::_mfi::mf0<void, pt::Thread_Pool>, boost::_bi::list1<boost::_bi::valuept::Thread_Pool* > > >::run (this=0x1265260) at /opt/boost_1_43_0/boost/thread/detail/thread.hpp:59

#20 0x00007f5584c8cf60 in thread_proxy () from /opt/boost_1_43_0/stage/lib/libboost_thread.so.1.43.0

#21 0x00007f55842998d3 in ?? () from /usr/lib/libGL.so.1

#22 0x00007f5582ad28ba in start_thread (arg=) at pthread_create.c:300

#23 0x00007f557e89702d in clone () at …/sysdeps/unix/sysv/linux/x86_64/clone.S:112

#24 0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f557634f710 (LWP 13011)):

#0 __lll_lock_wait () at …/nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136

#1 0x00007f5582ad5104 in _L_lock_1024 () from /lib/libpthread.so.0

#2 0x00007f5582ad4f67 in __pthread_mutex_lock (mutex=0x7f55844f0120) at pthread_mutex_lock.c:82

#3 0x00007f5584297729 in ?? () from /usr/lib/libGL.so.1

#4 0x00007f558429984e in ?? () from /usr/lib/libGL.so.1

#5 0x00007f5582ad28ba in start_thread (arg=) at pthread_create.c:300

#6 0x00007f557e89702d in clone () at …/sysdeps/unix/sysv/linux/x86_64/clone.S:112

#7 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f5585e4f780 (LWP 13008)):

#0 pthread_cond_wait@@GLIBC_2.3.2 () at …/nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162

#1 0x00007f5584c8d305 in boost::thread::join() () from /opt/boost_1_43_0/stage/lib/libboost_thread.so.1.43.0

#2 0x00000000008c967d in boost::thread_group::join_all (this=0x116a2f0) at /opt/boost_1_43_0/boost/thread/detail/thread_group.hpp:71

#3 0x00000000008c84e5 in pt::Thread_Pool::_run (this=0x116a1f0) at /home/x/pt/src/util/Thread_Pool.cpp:317

#4 0x00000000008c7f82 in pt::Thread_Pool::run (this=0x116a1f0) at /home/x/pt/src/util/Thread_Pool.cpp:196

#5 0x0000000000749fbe in pt::Environment_Runtime::run (this=0x7fff090b9db0) at /home/x/pt/src/main/Environment_Runtime.cpp:387

#6 0x00000000007c038f in main (argc=1, argv=0x7fff090ba208) at /home/x/pt/src/main/main.cpp:13

====== [/codebox]

The behaviour depends on how the app is launched:
strace ./app : runs OK
gdb ./app : runs OK
./app : freezes; I attached gdb and took the backtrace above

This looks like a linking issue and/or memory corruption, since the blocked threads do not call GL directly. I'm going to revert to CUDA 3.1 for now.

Edit: one more note. CUDA 3.1 shipped a package for Ubuntu 9.10, which ran fine on Debian stable/testing. CUDA 3.2 ships a package for Ubuntu 10.04.

What happens if you try the RedHat 5 build of the CUDA Toolkit?

Thanks,

Cliff

Tried the RedHat 5.5 build; same problem.

Okay… it was worth a try. Actually, given the stack traces and your description, it kind of sounds like an app bug… likely some kind of race condition in your lock acquisition since it works under the debugger. If you’re able to reproduce this in a small, standalone app, I’d be interested in taking a look at it.

Thanks,

Cliff

I found out what was wrong. The combination of installing a signal handler
sigset(SIGINT, &handler);
with multiple threads, multiple GL rendering contexts, and an FLTK window caused the app to freeze. I removed the sigset call and it works fine. The same code worked with CUDA 3.1.
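
For anyone who hits the same thing and still needs Ctrl-C handling, here is a minimal sketch, assuming the only goal is to request a clean shutdown. It is not the handler from the application above (that code was never posted), and it is not confirmed to avoid this particular freeze. It uses sigaction instead of the obsolescent sigset, and the handler does nothing except set a flag, so it never takes a lock or allocates memory. The names on_sigint, install_sigint_handler and stop_requested are made up for illustration.

```cpp
#include <csignal>   // sig_atomic_t
#include <cstddef>   // NULL
#include <signal.h>  // sigaction, sigemptyset (POSIX)

// Hypothetical flag: written only by the signal handler, read by worker loops.
static volatile sig_atomic_t g_stop_requested = 0;

// The handler only stores to a sig_atomic_t, which is async-signal-safe.
// It must not lock mutexes, call operator new, or touch iostreams.
extern "C" void on_sigint(int) {
    g_stop_requested = 1;
}

// Installs the handler for SIGINT; returns true on success.
bool install_sigint_handler() {
    struct sigaction sa = {};
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);   // block no additional signals while handling
    sa.sa_flags = SA_RESTART;   // restart interrupted system calls where possible
    return sigaction(SIGINT, &sa, NULL) == 0;
}

// Polled from ordinary (non-handler) code.
bool stop_requested() {
    return g_stop_requested != 0;
}
```

The idea would be to call install_sigint_handler() from main before any threads are started, and have each worker loop poll stop_requested() and exit on its own.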

Thanks for the update! Signal handling in a multithreaded application can indeed be quite tricky. It sounds possible that some unrelated change elsewhere in the system (perhaps the newer NVIDIA driver, perhaps something else) triggered a latent bug in your signal handling. Glad you got it figured out.

–Cliff
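
One common way to tame signals in a multithreaded program is to keep asynchronous handlers out of it entirely: block SIGINT before any threads are created (new threads inherit the mask) and let one dedicated thread collect it synchronously with sigwait. Below is a minimal sketch, assuming POSIX threads. The function names are hypothetical, and whether this would have avoided the freeze described in this thread is untested.

```cpp
#include <cstddef>    // NULL
#include <pthread.h>  // pthread_sigmask
#include <signal.h>   // sigset_t, sigemptyset, sigaddset, sigwait

// Blocks SIGINT in the calling thread. Threads created afterwards inherit
// this mask, so calling it first thing in main keeps asynchronous SIGINT
// delivery away from every worker thread.
bool block_sigint_in_this_thread() {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    return pthread_sigmask(SIG_BLOCK, &set, NULL) == 0;
}

// Waits synchronously until SIGINT is pending and consumes it.
// Meant to run in one dedicated thread. Because no handler runs, the code
// after this call may safely use locks, allocation, and I/O.
// Returns the signal number, or -1 if sigwait fails.
int wait_for_sigint() {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    int signo = 0;
    if (sigwait(&set, &signo) != 0) {
        return -1;
    }
    return signo;
}
```

With this layout the dedicated thread would call wait_for_sigint() and then ask the thread pool to shut down through whatever mechanism it already has. Build with -pthread.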
