CUDA Maximum number of streams / Bug?

Hi there,

I’m running CUDA 2.3 on Win XP and a GeForce 9600 GT.

I get incorrect results whenever the number of streams is greater than 8. To rule out a mistake in my own code and to make the problem reproducible, I tried the following:

[list=1]

[*]In the SDK example “simpleStreams”, change the line

[font=“Courier New”] int nstreams = 4;[/font]

to

[font=“Courier New”] int nstreams = 8;[/font]

(line 60, just below [font=“Courier New”]int main(…)[/font]).

[*]Run the program: the test passes.

[*]Change [font=“Courier New”]nstreams[/font] to any value greater than 8, for example:

[font=“Courier New”] int nstreams = 9;[/font]

[*]Run the program: the test fails (see below)!

[/list]

Question: Is there an official maximum number of streams that I can query? If not, this looks like a bug. In that case, how do I report it?

[i]Here’s the output for 8 and 9 streams:

[indent][font=“Courier New”][ simpleStreams ]

Device name : GeForce 9600 GT

CUDA Capable SM 1.1 hardware with 8 multi-processors

scale_factor = 1.0000

array_size = 16777216

memcopy: 19.59

kernel: 44.60

non-streamed: 64.28 (64.20 expected)

8 streams: 47.29 (47.05 expected with compute capability 1.1 or later)


Test PASSED

[ simpleStreams ]

Device name : GeForce 9600 GT

CUDA Capable SM 1.1 hardware with 8 multi-processors

scale_factor = 1.0000

array_size = 16777216

memcopy: 19.60

kernel: 44.60

non-streamed: 64.71 (64.20 expected)

9 streams: 76.96 (46.78 expected with compute capability 1.1 or later)


1863680: 0 50

Test FAILED[/font][/indent][/i]