To be up-front: I’m working with CUDA for my honours thesis. I’ve implemented my algorithm (an optimised version of the double description method), and it runs fine on my GT 240 using all 96 cores.
I’ve also got a single-threaded, CPU-only version of the algorithm, which runs fine as well.
What I want to do is test my algorithm with CUDA, but using only a subset of the cores. I’m hoping to run my code with 12/24/48/96 cores enabled, so I can get a sense of how well the algorithm scales. However, I don’t have any other NVIDIA devices lying around.
Is there any way to tell CUDA to limit itself to a certain number of cores, other than manually editing the program so that the algorithm only runs a certain number of threads at a time?
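For concreteness, here is a minimal sketch of what that manual approach might look like. It is not my actual code: `scaleKernel` is a hypothetical placeholder kernel, and the block size of 128 is an arbitrary assumption. The idea is to use a grid-stride loop so the kernel gives the same result for any grid size, then time it with progressively more blocks. Since a block executes on a single multiprocessor, launching k blocks occupies at most k multiprocessors. Whether the blocks actually land on k distinct multiprocessors is up to the hardware scheduler, so this is only a rough proxy for "number of cores enabled".

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real kernel: scales each element in place.
// The grid-stride loop lets any grid size cover all n elements.
__global__ void scaleKernel(float *data, int n, float factor)
{
    int stride = blockDim.x * gridDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        data[i] *= factor;
    }
}

int main()
{
    const int n = 1 << 20;
    const int threadsPerBlock = 128;  // assumed block size, chosen arbitrarily

    float *d_data = NULL;
    if (cudaMalloc((void **)&d_data, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    // Query how many multiprocessors device 0 has.
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Double the number of blocks each round, up to one block per multiprocessor.
    for (int blocks = 1; blocks <= prop.multiProcessorCount; blocks *= 2) {
        cudaEventRecord(start, 0);
        scaleKernel<<<blocks, threadsPerBlock>>>(d_data, n, 2.0f);
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        printf("%d block(s): %.3f ms\n", blocks, ms);
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return 0;
}
```

This only caps the amount of parallel work offered to the device. It does not switch any cores off, which is why I’m asking whether there is a more direct way.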