cuCtxCreate hangs forever on Tesla S1070

kalman · April 29, 2009, 3:46pm

Hi,
I’m experiencing a problem with my Tesla S1070: I can no longer use the first two devices. For example, a call to
cuCtxCreate hangs forever when used on either of the first two devices, but it works on the last two. Can anyone help
me identify the problem before I reboot my host?

Thanks
Gaetano Mendola
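
One way to find out which ordinals hang without blocking the shell is to run the context creation for each device in its own child process under a timeout. The following is a minimal sketch, not a tested diagnostic: the per-device probe program (something that calls cuInit and cuCtxCreate on one ordinal and exits) is hypothetical, and `stand_in_command` only simulates it with a sleeping Python child.

```python
import subprocess
import sys


def probe_devices(make_command, ordinals, timeout_s=10.0):
    """Run one probe process per device ordinal and classify the outcome.

    make_command(ordinal) returns an argv list. The real probe program is
    an assumption: a small binary that creates a context on that ordinal.
    """
    results = {}
    for ordinal in ordinals:
        try:
            done = subprocess.run(
                make_command(ordinal),
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            results[ordinal] = "ok" if done.returncode == 0 else "failed"
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child when the timeout expires.
            results[ordinal] = "hung"
    return results


def stand_in_command(ordinal, hang_below=2, hang_seconds=30):
    """Simulated probe (hypothetical): ordinals below hang_below sleep."""
    delay = hang_seconds if ordinal < hang_below else 0
    return [sys.executable, "-c", f"import time; time.sleep({delay})"]
```

With a real probe binary, `make_command` would return something like `["./ctx_probe", str(ordinal)]` (a made-up name). One caveat: if the child is stuck inside the driver in an uninterruptible state, killing it may not succeed, and the watchdog itself can then block while waiting for it.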

tmurray · April 29, 2009, 3:57pm

Anything about it in /var/log/messages? (maybe the cable came out)
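
A quick way to check the log is to filter it for lines written by the NVIDIA kernel module, which prefixes its messages with "NVRM". A small sketch, assuming the log lives at /var/log/messages as mentioned above and is readable by the current user (it often requires root):

```python
def nvidia_log_lines(lines, markers=("NVRM",)):
    """Return the lines that contain any marker, in their original order."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(marker in line for marker in markers)
    ]


def scan_log(path="/var/log/messages", markers=("NVRM",)):
    """Read a syslog file and keep only the driver-related lines."""
    # errors="replace" avoids a crash on stray non-UTF-8 bytes in the log.
    with open(path, encoding="utf-8", errors="replace") as log:
        return nvidia_log_lines(log, markers)
```

Calling `scan_log()` and printing the last few entries shows whether the driver reported anything around the time the hang started. The marker list is a parameter so other strings can be added if the driver on a given system logs differently.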

kalman · April 29, 2009, 4:03pm

The SDK application “deviceQuery” reports all devices, so I don’t think it can be a cable problem:

There are 5 devices supporting CUDA

Device 0: “Tesla T10 Processor”
  Major revision number: 1
  Minor revision number: 3
  Total amount of global memory: 4294705152 bytes
  Number of multiprocessors: 30
  Number of cores: 240
  Total amount of constant memory: 65536 bytes
  Total amount of shared memory per block: 16384 bytes
  Total number of registers available per block: 16384
  Warp size: 32
  Maximum number of threads per block: 512
  Maximum sizes of each dimension of a block: 512 x 512 x 64
  Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
  Maximum memory pitch: 262144 bytes
  Texture alignment: 256 bytes
  Clock rate: 1.44 GHz
  Concurrent copy and execution: Yes

Device 1: “Quadro NVS 290”
  Major revision number: 1
  Minor revision number: 1
  Total amount of global memory: 267714560 bytes
  Number of multiprocessors: 2
  Number of cores: 16
  Total amount of constant memory: 65536 bytes
  Total amount of shared memory per block: 16384 bytes
  Total number of registers available per block: 8192
  Warp size: 32
  Maximum number of threads per block: 512
  Maximum sizes of each dimension of a block: 512 x 512 x 64
  Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
  Maximum memory pitch: 262144 bytes
  Texture alignment: 256 bytes
  Clock rate: 0.92 GHz
  Concurrent copy and execution: Yes

Device 2: “Tesla T10 Processor”
  Major revision number: 1
  Minor revision number: 3
  Total amount of global memory: 4294705152 bytes
  Number of multiprocessors: 30
  Number of cores: 240
  Total amount of constant memory: 65536 bytes
  Total amount of shared memory per block: 16384 bytes
  Total number of registers available per block: 16384
  Warp size: 32
  Maximum number of threads per block: 512
  Maximum sizes of each dimension of a block: 512 x 512 x 64
  Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
  Maximum memory pitch: 262144 bytes
  Texture alignment: 256 bytes
  Clock rate: 1.44 GHz
  Concurrent copy and execution: Yes

Device 3: “Tesla T10 Processor”
  Major revision number: 1
  Minor revision number: 3
  Total amount of global memory: 4294705152 bytes
  Number of multiprocessors: 30
  Number of cores: 240
  Total amount of constant memory: 65536 bytes
  Total amount of shared memory per block: 16384 bytes
  Total number of registers available per block: 16384
  Warp size: 32
  Maximum number of threads per block: 512
  Maximum sizes of each dimension of a block: 512 x 512 x 64
  Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
  Maximum memory pitch: 262144 bytes
  Texture alignment: 256 bytes
  Clock rate: 1.44 GHz
  Concurrent copy and execution: Yes

Device 4: “Tesla T10 Processor”
  Major revision number: 1
  Minor revision number: 3
  Total amount of global memory: 4294705152 bytes
  Number of multiprocessors: 30
  Number of cores: 240
  Total amount of constant memory: 65536 bytes
  Total amount of shared memory per block: 16384 bytes
  Total number of registers available per block: 16384
  Warp size: 32
  Maximum number of threads per block: 512
  Maximum sizes of each dimension of a block: 512 x 512 x 64
  Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
  Maximum memory pitch: 262144 bytes
  Texture alignment: 256 bytes
  Clock rate: 1.44 GHz
  Concurrent copy and execution: Yes
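
In the listing above, ordinal 1 is the Quadro NVS 290 rather than one of the Tesla T10 processors, so it can help to refer to devices by name as well as by number when describing which ones hang. A small sketch that extracts the ordinal-to-name mapping from saved deviceQuery output; the regular expression is an assumption based only on the format shown in this post, and accepts straight or curly quotes:

```python
import re

# Matches header lines such as: Device 0: "Tesla T10 Processor"
DEVICE_LINE = re.compile(r'^\s*Device (\d+):\s*["“](.+?)["”]\s*$')


def device_names(text):
    """Map each device ordinal in deviceQuery output to its reported name."""
    names = {}
    for line in text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            names[int(match.group(1))] = match.group(2)
    return names
```

Given the output saved to a file, `device_names(open("devicequery.txt").read())` (the file name is made up) returns a dictionary that can be checked against the ordinals passed to cuCtxCreate.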

tmurray · April 29, 2009, 4:11pm

Not necessarily true, because I think devices are only enumerated at boot.

kalman · April 29, 2009, 4:30pm

After disconnecting the cable, “deviceQuery” hangs too. Anyway, I just rebooted the host and now everything works again. If it happens again I will let you know.

BTW, I have filled in the form for CUDA 2.2 but I haven’t received any response yet. How long does the approval usually take?

Regards

Gaetano
