I was trying to profile my code with the Visual Profiler to check its performance. However, it doesn't seem to work, neither for the standard examples nor for my own code. Here is what I do: I start a new project, go to Session, and under Launch I specify the exe I want to run (for the SDK samples I point to C:\CUDASDK\bin\win32\Release\XYZ.exe). For the working directory I use the directory where this .exe is located (the argument is --noprompt). When I launch my own .exe file (the one I want to test), it runs smoothly, i.e. it shows “Program run #1 completed” (and likewise for runs 2 and 3), and then on completion it says “Error in reading profiler output”. The standard examples in the SDK don't work either. Could anyone help me with this?
Which version of the CUDA toolkit are you using? Is it v3.2?
Can you send the output from the Visual Profiler “Output window” (you can select all, then copy and paste the text)? There may be some errors or warnings that help to identify what the issue is.
I suppose your setup is correct. Can you verify that the SDK examples run and complete successfully outside the profiler?
The “Error in reading profiler output” message can appear when the application does not terminate successfully and the profiler output file is either not generated or contains errors.
Thanks, ssatoor. I use v2.2 of the CUDA toolkit. This is what I get in the window underneath:
=== Start profiling for session 'Session1' ===
Start program 'C:/Documents and Settings/Raj/My Documents/Visual Studio 2008/Projects/Cuda001/Debug/Cuda001.exe' run #1 ...
This is the time 22544384.000000 and this is the last error (null)
Program run #1 completed.
Start program 'C:/Documents and Settings/Raj/My Documents/Visual Studio 2008/Projects/Cuda001/Debug/Cuda001.exe' run #2 ...
This is the time 22020096.000000 and this is the last error (null)
Program run #2 completed.
Start program 'C:/Documents and Settings/Raj/My Documents/Visual Studio 2008/Projects/Cuda001/Debug/Cuda001.exe' run #3 ...
This is the time 22020096.000000 and this is the last error (null)
Program run #3 completed.
Error in reading profiler output.
I am pretty sure the setup runs fine. I ran bandwidthtest.cu and it worked fine. Also, my program gives the correct output. It's only with the profiler that this error arises. Any comments?
From your description it looks like the profiler output is not getting written out to the file. Can you try adding a call to cudaThreadSynchronize() at the end?
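For reference, here is a minimal sketch of what that could look like with the runtime API. The kernel name `scale`, the buffer size, and the launch configuration are made up for illustration and are not taken from the program in this thread. The point is only that the host waits for the GPU and checks the returned status before main() returns.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, present only so the profiler has something to record.
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

int main()
{
    const int n = 1 << 20;  // illustrative size, not from the thread
    float *d_data = NULL;

    cudaError_t err = cudaMalloc((void **)&d_data, n * sizeof(float));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    // Kernel launches are asynchronous with respect to the host.
    scale<<<(n + 255) / 256, 256>>>(d_data, 2.0f, n);

    // Block until all outstanding GPU work has finished, so that the
    // application does not exit while work is still pending.
    // (On later toolkits than the ones discussed here, the replacement
    // call is cudaDeviceSynchronize().)
    err = cudaThreadSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_data);
        return 1;
    }

    cudaFree(d_data);
    return 0;  // clean exit code so the run counts as terminated successfully
}
```

Note that cudaGetErrorString() is what turns the cudaError_t status into printable text; the status value itself is an enum, not a string.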
Are you using the CUDA runtime API or the driver API?
I am surprised that you see the same issue even with the CUDA SDK examples.
CUDA v2.2 is quite old now. Can you move to a newer version, v3.2?
Thanks for the reply! I did try adding cudaThreadSynchronize() right before the end of the program (I already had it before starting and stopping the timers, too). It doesn't make a difference. I use the CUDA runtime API.
As for moving to a newer version, that would be my last resort.
Appreciate your time.