Hello guys,
I’m totally new to CUDA programming, and I recently started writing my own CUDA programs, which simply multiply two 2048*2048 matrices. I wonder whether these programs can use all the stream processors on my GTX 580, so I’d like to find some useful APIs for checking my device’s working state, e.g. how many stream processors are working concurrently, what percentage of the memory on the GTX 580 (not on the PC) is in use, and so on.
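For reference, here is a minimal sketch of the kind of query I have in mind, assuming the runtime API calls `cudaGetDeviceProperties` and `cudaMemGetInfo` are the right starting point and that my card is device 0. The helper `percentUsed` is just a name I made up for illustration. As far as I can tell this only reports static properties and memory usage, not how many stream processors are busy at a given moment.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (my own name): percentage of device memory in use,
// computed from the free and total byte counts.
static double percentUsed(size_t freeBytes, size_t totalBytes) {
    if (totalBytes == 0) return 0.0;
    return 100.0 * (double)(totalBytes - freeBytes) / (double)totalBytes;
}

int main() {
    const int device = 0;  // assumption: the GTX 580 is device 0

    // Static properties: name, compute capability, number of multiprocessors.
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Device:              %s\n", prop.name);
    printf("Compute capability:  %d.%d\n", prop.major, prop.minor);
    printf("Multiprocessors:     %d\n", prop.multiProcessorCount);

    // Memory on the card itself (not host RAM), free and total, in bytes.
    err = cudaSetDevice(device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaSetDevice: %s\n", cudaGetErrorString(err));
        return 1;
    }
    size_t freeBytes = 0, totalBytes = 0;
    err = cudaMemGetInfo(&freeBytes, &totalBytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemGetInfo: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Device memory used:  %.1f%% (%zu of %zu bytes free)\n",
           percentUsed(freeBytes, totalBytes), freeBytes, totalBytes);
    return 0;
}
```

I would compile this with something like `nvcc query.cu -o query` (the file name is arbitrary) and run it while my matrix multiplication is active in another process to see the memory figure change.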
Looking forward to any helpful advice, thanks very much…