Hi there,
I’m running CUDA 2.3 on Windows XP with a GeForce 9600 GT.
I get incorrect results whenever the number of streams is greater than 8. To make sure it’s not my fault, and to give you a way to reproduce it, I tried the following:
[list=1]
[*]In the SDK example “simpleStreams”, change the line
[font=“Courier New”] int nstreams = 4;[/font]
to
[font=“Courier New”] int nstreams = 8;[/font]
(line 60, just below [font=“Courier New”]int main(…)[/font]).
[*]Run the program: the test passes.
[*]Change [font=“Courier New”]nstreams[/font] to any value greater than 8, for example:
[font=“Courier New”] int nstreams = 9;[/font]
[*]Run the program: the test fails (see below)!
[/list]
Question: Is there an official maximum number of streams that I can query? If not, this looks like a bug, so how do I report it?
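To rule out a silently ignored API error, a check along the following lines could be added. This is only a sketch, not code from the SDK sample: [font=“Courier New”]count_creatable_streams[/font] and [font=“Courier New”]report_cuda_error[/font] are hypothetical helper names, and it assumes that a stream limit, if one exists, would show up as an error code from [font=“Courier New”]cudaStreamCreate[/font] or [font=“Courier New”]cudaGetLastError[/font], which may not be the case here.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper (not part of simpleStreams): tries to create `wanted`
// streams and returns how many cudaStreamCreate calls succeeded.
int count_creatable_streams(int wanted)
{
    std::vector<cudaStream_t> streams;
    for (int i = 0; i < wanted; ++i) {
        cudaStream_t s;
        cudaError_t err = cudaStreamCreate(&s);
        if (err != cudaSuccess) {
            std::printf("cudaStreamCreate failed for stream %d: %s\n",
                        i, cudaGetErrorString(err));
            break;
        }
        streams.push_back(s);
    }
    int created = static_cast<int>(streams.size());
    // Release every stream that was successfully created.
    for (size_t i = 0; i < streams.size(); ++i)
        cudaStreamDestroy(streams[i]);
    return created;
}

// Hypothetical helper: prints and returns false if the runtime has a pending
// error, e.g. right after the asynchronous copies and kernel launches.
bool report_cuda_error(const char *where)
{
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        std::printf("CUDA error at %s: %s\n", where, cudaGetErrorString(err));
        return false;
    }
    return true;
}
```

If [font=“Courier New”]count_creatable_streams(9)[/font] returns 9 and [font=“Courier New”]report_cuda_error[/font] stays quiet after the streamed loop, the wrong results are not coming from a reported API failure.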
[i]Here’s the output for 8 and 9 streams:
[indent][font=“Courier New”][ simpleStreams ]
Device name : GeForce 9600 GT
CUDA Capable SM 1.1 hardware with 8 multi-processors
scale_factor = 1.0000
array_size = 16777216
memcopy: 19.59
kernel: 44.60
non-streamed: 64.28 (64.20 expected)
8 streams: 47.29 (47.05 expected with compute capability 1.1 or later)
Test PASSED
[ simpleStreams ]
Device name : GeForce 9600 GT
CUDA Capable SM 1.1 hardware with 8 multi-processors
scale_factor = 1.0000
array_size = 16777216
memcopy: 19.60
kernel: 44.60
non-streamed: 64.71 (64.20 expected)
9 streams: 76.96 (46.78 expected with compute capability 1.1 or later)
1863680: 0 50
Test FAILED[/font][/indent][/i]