How many CUDA streams can be launched at a time?

I’m trying to use CUDA streams to process a huge database. Is there a hard limit on the number of streams that can be created?

When I launched 10000 streams, the run failed with "[run]Segmentation fault". With 5000 streams it worked fine.

Thank you

16, I think.

16 is the number of concurrent kernels possible on Fermi. The number of streams that can be created is much higher than that.

Are you checking the error code returned by the stream creation function? I bet it returns an error when it cannot create another stream. If it does not, that is a bug that should be reported to NVIDIA. I don’t recall any specific number listed in the documentation. The limit likely depends on the amount of memory available to the driver and differs from machine to machine.
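Checking the return value of every stream creation call could look like the sketch below. It is only an illustration: the count of 10000 is taken from the question, and the loop and the `streams` vector are assumptions, not the original poster's code.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

int main() {
    const int requested = 10000;  // stream count from the question
    std::vector<cudaStream_t> streams;
    streams.reserve(requested);

    for (int i = 0; i < requested; ++i) {
        cudaStream_t s;
        // cudaStreamCreate returns cudaSuccess or an error code.
        cudaError_t err = cudaStreamCreate(&s);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaStreamCreate failed at stream %d: %s\n",
                         i, cudaGetErrorString(err));
            break;  // stop here instead of using an invalid handle later
        }
        streams.push_back(s);
    }

    std::printf("created %zu of %d streams\n", streams.size(), requested);

    // Release only the streams that were actually created.
    for (cudaStream_t s : streams) {
        cudaStreamDestroy(s);
    }
    return 0;
}
```

If creation fails partway through, the printed index shows how many streams this particular machine and driver accepted.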

Actually, I have a fixed database size of 100000 int elements. Each stream processes one part of the database.

I split the data evenly and tried the following cases:

1 stream, processing 100000 elements (Failed: Segmentation fault)

2 streams, each processing 50000 elements (Success)

10 streams, each processing 10000 elements (Success)

100 streams, each processing 1000 elements (Success)

1000 streams, each processing 100 elements (Success)

5000 streams, each processing 20 elements (Success)

10000 streams, each processing 10 elements (Failed: Kernel execution failed:(9) invalid configuration argument. Error 255)

Only the two extremes failed, and I really don’t know why.
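"Invalid configuration argument" is the error CUDA reports when the launch parameters themselves are rejected, for example a grid with zero blocks or a block with more threads than the device allows. The launch code is not shown in this thread, so the sketch below is only a guess at how a small chunk size could lead there. The kernel `processChunk`, the block size of 256, and the use of a single stream are all assumptions made for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: increments each element of one chunk.
__global__ void processChunk(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main() {
    const int total = 100000;      // element count from the thread
    const int numStreams = 10000;  // the failing case with 10 elements each
    const int threads = 256;       // assumed block size
    const int chunk = (total + numStreams - 1) / numStreams;

    int *d_data = nullptr;
    if (cudaMalloc(&d_data, total * sizeof(int)) != cudaSuccess) return 1;
    cudaMemset(d_data, 0, total * sizeof(int));

    cudaStream_t stream;
    if (cudaStreamCreate(&stream) != cudaSuccess) return 1;

    // Truncating division gives 0 blocks once chunk < threads,
    // which is an invalid launch configuration.
    int truncatedBlocks = chunk / threads;
    // Ceiling division always gives at least one block for chunk > 0.
    int blocks = (chunk + threads - 1) / threads;
    std::printf("chunk=%d truncated=%d ceil=%d\n", chunk, truncatedBlocks, blocks);

    // Launch the first chunk only, to keep the sketch short.
    processChunk<<<blocks, threads, 0, stream>>>(d_data, chunk);

    // Launch-time errors (such as an invalid configuration) show up here.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));
    }

    // Errors raised while the kernel runs show up at synchronization.
    err = cudaStreamSynchronize(stream);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "execution failed: %s\n", cudaGetErrorString(err));
    }

    cudaStreamDestroy(stream);
    cudaFree(d_data);
    return 0;
}
```

Printing the computed grid and block sizes for each of the cases listed above, and checking the error after every launch, should show whether the launch configuration is what breaks at the two extremes. The segmentation fault in the single-stream case is a host-side crash, so it may have a different cause, such as a host buffer or index calculation.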