While running a program I have under development, a CUDA process got stuck, and after a while I found a lot of kernel messages like the one below.
Is this worth a bug report?
BUG: soft lockup - CPU#1 stuck for 16s! [double_queue:1524]
CPU 1:
Modules linked in: nvidia(PU) nfs fscache nfs_acl autofs4 lockd sunrpc uio iw_cxgb3 cxgb3 cpufreq_ondemand acpi_cpufreq freq_table ib_srp rds ib_sdp ib_ipoib ipoib_helper rdma_ucm rdma_cm ib_ucm ib_uverbs ib_umad ib_cm iw_cm ib_addr ib_sa loop dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac ipv6 xfrm_nalgo crypto_api parport_pc lp parport mlx4_ib ib_mad ib_core mlx4_en joydev i2c_i801 igb 8021q mlx4_core serio_raw pcspkr shpchp dca i2c_core sg dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod usb_storage ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 1524, comm: double_queue Tainted: P 2.6.18-194.32.1.el5 #1
RIP: 0010:[<ffffffff896e10b3>] [<ffffffff896e10b3>] :nvidia:_nv022936rm+0x20/0x22
RSP: 0018:ffff810433c638d0 EFLAGS: 00000202
RAX: 00000000ffffffff RBX: ffff8107846b7330 RCX: 0000000000000040
RDX: ffffc20011680000 RSI: ffff8107a89ee000 RDI: ffff8107fdf56000
RBP: ffff810433c63918 R08: 0000000000000050 R09: ffff81043d7b2b80
R10: ffff81043d7b2b40 R11: 0000000000000050 R12: ffff81083e9ac840
R13: 0000000010008040 R14: ffff81083e9ac840 R15: ffff81083e9ac840
FS: 00002ae37b7895e0(0000) GS:ffff81010ee99440(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000406000 CR3: 0000000000201000 CR4: 00000000000006e0
Call Trace:
[<ffffffff89505f78>] :nvidia:_nv015002rm+0x148/0x190
[<ffffffff89505fe1>] :nvidia:_nv003211rm+0x21/0x52
[<ffffffff8950ce4d>] :nvidia:_nv015051rm+0x9e/0x14a
[<ffffffff89506203>] :nvidia:_nv003209rm+0x72/0x119
[<ffffffff89506306>] :nvidia:_nv014374rm+0x5c/0x71
[<ffffffff895bdbf1>] :nvidia:_nv020040rm+0x243/0x9df
[<ffffffff895a3a52>] :nvidia:_nv020037rm+0x4e/0x8c
[<ffffffff894ef700>] :nvidia:_nv013071rm+0xfd/0x1d2
[<ffffffff894ef5d6>] :nvidia:_nv013074rm+0x76/0xa3
[<ffffffff894c4ea4>] :nvidia:_nv013077rm+0xd44/0x10cd
[<ffffffff89206396>] :nvidia:_nv002388rm+0x404/0x485
[<ffffffff89203f3c>] :nvidia:_nv003713rm+0x1cd/0x770
[<ffffffff89203ee2>] :nvidia:_nv003713rm+0x173/0x770
[<ffffffff89202bcf>] :nvidia:_nv003711rm+0xc7/0xef
[<ffffffff89202c18>] :nvidia:_nv025316rm+0xe/0x13
[<ffffffff892030d5>] :nvidia:_nv003722rm+0x111/0x49d
[<ffffffff89202bcf>] :nvidia:_nv003711rm+0xc7/0xef
[<ffffffff89202c18>] :nvidia:_nv025316rm+0xe/0x13
[<ffffffff89202e4c>] :nvidia:_nv003717rm+0x1a8/0x320
[<ffffffff89202bcf>] :nvidia:_nv003711rm+0xc7/0xef
[<ffffffff89202c05>] :nvidia:_nv025318rm+0xe/0x13
[<ffffffff89646778>] :nvidia:_nv025098rm+0x58/0x7b
[<ffffffff896e3f3d>] :nvidia:_nv002329rm+0x144/0x18a
[<ffffffff896e962c>] :nvidia:rm_disable_adapter+0x8b/0xdf
[<ffffffff8970754b>] :nvidia:nv_kern_close+0x26b/0x410
[<ffffffff80012ad9>] __fput+0xd3/0x1bd
[<ffffffff80023c39>] filp_close+0x5c/0x64
[<ffffffff80038f19>] put_files_struct+0x63/0xae
[<ffffffff80015860>] do_exit+0x31c/0x911
[<ffffffff800491a7>] cpuset_exit+0x0/0x88
[<ffffffff8002b2ed>] get_signal_to_deliver+0x465/0x494
[<ffffffff8005ada1>] do_notify_resume+0x9c/0x7af
[<ffffffff89704a6a>] :nvidia:nv_kern_ioctl+0x382/0x393
[<ffffffff80066b88>] do_page_fault+0x4fe/0x874
[<ffffffff80062ff8>] thread_return+0x62/0xfe
[<ffffffff800421d7>] do_ioctl+0x21/0x6b
[<ffffffff8005d6dc>] retint_signal+0x3d/0x79