How many particles should I be able to get onto a C870?

Each particle requires
11 float4 variables
16 float variables
2 int variables
2 int2 variables

I also want to leave room for the largest possible kernel of 2 million instructions.

This is just a memory question, right? What does the 2 million instruction limit have to do with this?

I thought the instructions for a kernel were stored in global memory while that kernel was executing. If not, where are the kernel instructions stored? Either way, once the memory required for 2 million instructions is subtracted, how many particles should I be able to hold in the global memory of a C870 and still be able to do whatever I want with them?

Check the “Some Finding” section in the first post. It may be useful.

OK, I see what you are talking about, but you really don’t need to worry about the memory the instructions take up. 2 million instructions probably take up less space than the CUDA driver reserves for other things. The C870 has 1536 MB of memory, so even 16 MB of code (assuming roughly 8 bytes per instruction) is a trivial correction to the free space, about 1% of the total.

To get the best answer to this question, you should call cuMemGetInfo(), which reports the free and total device memory, to determine the exact free space available. Then you can divide that free space by your particle record size to find out how many particles you can store.
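As a rough sketch of that division, here is the arithmetic in host-side C++. The `Float4` and `Int2` structs are stand-ins I made up so the snippet compiles without the CUDA headers (in real code you would use CUDA's own `float4` and `int2`), and `particles_that_fit` is a hypothetical helper name. It also assumes each variable lives in its own array, so there is no padding between fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins for CUDA's float4 and int2 (assumption: same sizes, 16 and 8 bytes).
struct Float4 { float x, y, z, w; };
struct Int2   { std::int32_t x, y; };

static_assert(sizeof(Float4) == 16, "float4 stand-in should be 16 bytes");
static_assert(sizeof(Int2) == 8, "int2 stand-in should be 8 bytes");

// Bytes per particle from the list in the question:
// 11 float4 + 16 float + 2 int + 2 int2, with no padding between fields.
constexpr std::size_t kParticleBytes =
    11 * sizeof(Float4) + 16 * sizeof(float) +
    2 * sizeof(std::int32_t) + 2 * sizeof(Int2);

static_assert(kParticleBytes == 264, "176 + 64 + 8 + 16 bytes");

// Upper bound on the number of particle records that fit in free_bytes.
// In a real program free_bytes would be the value reported by cuMemGetInfo().
inline std::uint64_t particles_that_fit(std::uint64_t free_bytes,
                                        std::uint64_t record_bytes) {
    assert(record_bytes > 0);
    return free_bytes / record_bytes;  // integer division rounds down
}
```

With the full 1536 MB, `particles_that_fit(1536ull * 1024 * 1024, kParticleBytes)` gives 6,100,805 particles; holding back 16 MB for code gives 6,037,255. Treat both as upper bounds: the free space the driver actually reports will be lower than 1536 MB, and allocation alignment or any scratch buffers you need will eat into it further.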