Scalability Analysis on GPU

I am trying to do a scalability analysis on my Quadro FX 5800, which has 240 cores, to see how the run time scales with the number of cores. This is a classic study in parallel computing.
I was wondering how the definition of a core fits into this.
And how can I run my code on different core counts, say 8, 12, 120, 180, or 240 cores?
My test case is a simple matrix multiplication.
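
To make clear what I would like to measure: the usual quantities in this kind of study are speedup and parallel efficiency. The notation below is my own assumption, not anything GPU-specific: T(p) is the measured run time of the matrix multiplication on p cores, and p_0 is the smallest core count tested (8 in the list above).

```latex
% Relative speedup and efficiency for a fixed problem size (strong scaling).
% T(p): measured run time on p cores; p_0: smallest core count tested (assumed baseline).
\[
  S(p) = \frac{T(p_0)}{T(p)},
  \qquad
  E(p) = \frac{p_0 \, T(p_0)}{p \, T(p)}
\]
% Ideal scaling corresponds to S(p) = p / p_0 and E(p) = 1.
```

With these definitions, each core setting gives one point on the speedup curve, which is why I need a way to control how many cores are used.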