I am trying to run concurrent kernels on a Fermi card (concurrent kernel execution is a new feature in compute capability 2.0 and above). I would like to measure the time taken by each of those concurrent kernels. Is there a reliable way to do that?