visual profiler with compute capability 1.0 cards?

CUDA_Moose · September 10, 2008, 11:56pm

Am I right that the visual profiler does not work with compute capability 1.0 cards? And is there a way to check that my reads/writes are coalesced on a 1.0 card? I tried setting CUDA_PROFILE=1 and checking the cubin file, but I don’t see anything in it about this. Thanks in advance.

E.D_Riedijk · September 11, 2008, 8:57am

Maybe you should read the release notes? The result of a profile is not in a .cubin file…
I have used the non-visual and the visual debugger on an 8800GTX (a 1.0 card) without any problems.

CUDA_Moose · September 11, 2008, 5:36pm

I know the profile results are supposed to be in cuda_profile.log. I have looked at this log file, but there is only gputime, cputime and occupancy information in there. How can I check coalescing? I assume you mean profiler, not debugger. The visual profiler won’t run my code. I’ve already read the release notes. Is there something I’m missing?

MisterAnderson42 · September 11, 2008, 6:22pm

Please read the docs as was suggested; there is nothing we can tell you that is not already spelled out there. There is a file called CUDA_Profiler_2.0.txt in the doc/ directory installed by the toolkit. In it you will find:

................

CUDA_PROFILE_CONFIG
    is used to specify a config file for enabling performance counters
    in the GPU. See the next section for configuration details.

Profiler Configuration
----------------------

This version of the Cuda profiler supports configuration options that allow
users to gather statistics about various events occurring in the GPU during execution.
These events are tracked with hardware counters on signals in the chip.

The followings options/signals are supported:

    timestamp
    ---------
    This option tells the profiler to log timestamps before kernel
    launches and memory operations so the user can do timeline analysis.

    gld_incoherent
    gld_coherent
    gst_incoherent
    gst_coherent
    --------------
    These options tell the profiler to record information about whether global
    memory loads/stores are coalesced (coherent) or non-coalesced (incoherent).

................

And much more. In short, you need to create a config file that lists the signals you want to record, set the proper environment variable and then read the numbers from cuda_profile.log.
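As a rough sketch of that workflow (not taken from the NVIDIA docs), the Python snippet below writes a config file listing the four coalescing signals and builds the environment for a profiled run. CUDA_PROFILE and CUDA_PROFILE_CONFIG are the variables mentioned in this thread; the file name profile_config.txt, the helper names and ./my_app are placeholders, and the one-signal-per-line layout is an assumption to check against CUDA_Profiler_2.0.txt.

```python
import os
import subprocess

# Signal names quoted from CUDA_Profiler_2.0.txt above.
COALESCING_SIGNALS = (
    "gld_incoherent",
    "gld_coherent",
    "gst_incoherent",
    "gst_coherent",
)


def write_profiler_config(path, signals=COALESCING_SIGNALS):
    """Write one signal name per line (assumed layout) and return the path."""
    with open(path, "w") as f:
        for name in signals:
            f.write(name + "\n")
    return path


def profiler_env(config_path, base_env=None):
    """Return a copy of the environment with the profiler variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_PROFILE"] = "1"
    env["CUDA_PROFILE_CONFIG"] = os.path.abspath(config_path)
    return env


def run_profiled(command, config_path):
    """Run command (a list such as ["./my_app"], a placeholder) with profiling on."""
    return subprocess.run(command, env=profiler_env(config_path), check=True)
```

Something like run_profiled(["./my_app"], write_profiler_config("profile_config.txt")) should then leave the counters in cuda_profile.log in the working directory, assuming the app does not change directories.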

You may also be interested in the visual profiler which automates this process for you with a GUI: http://forums.nvidia.com/index.php?showtopic=58283

E.D_Riedijk · September 11, 2008, 6:23pm

aaaarrrgghhh, profiler yes, it’s just 90% of my brain cells are burning all their cycles hoping for the debugger ;)

As far as I remember, you have to add the signals you want to have measured to a config file, and set an environment variable to the location of that config file. It should be in a text file on your system.

The visual profiler runs all code that the normal profiler does as far as I know, so what is going wrong with the visual profiler? (The command-line profiler is much, much more cumbersome to work with.)

CUDA_Moose · September 11, 2008, 9:48pm

When I try to run the visual profiler, I get "Error -91 in reading profiler output. Empty data for ‘gputime’ column in profiler output file", even though there are times recorded in cuda_profile.log.

Do I still need a config file with the visual profiler, or does the visual profiler set that up for me?

edit: It looks like the temporary profiler config file and the csv file are being created by the visual profiler. I set all the environment variables mentioned in CUDA_Profiler_2.0.txt. Now I get the error:

Error -88 in reading profiler output. Empty data for 'cputime' column in profiler output file.

Perhaps caused by the memcopy lines in the csv file?

method,gputime,cputime,occupancy,gld_incoherent,gld_coherent,gst_incoherent,gst_coherent
memcopy,2493.728
memcopy,74.944
memcopy,18.752

edit2: Seems like the visual profiler is confused by the memcopy lines, and my log file is partially corrupted, as some of the lines are incomplete. If I remove the memcopy lines and the broken lines from the csv file, I can import it into the visual profiler. How can I stop the profiler from outputting the memcopy lines?
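The manual cleanup described above could be scripted roughly like this. It is only a sketch of the workaround, not a profiler feature: the function name is made up, and it assumes a row is "broken" when its field count differs from the header or a field is empty.

```python
import csv


def clean_profile_csv(lines):
    """Return the header plus every complete, non-memcopy row.

    lines is any iterable of text lines, for example an open csv file.
    """
    rows = [row for row in csv.reader(lines) if row]  # skip blank lines
    if not rows:
        return []
    header = rows[0]
    kept = [header]
    for row in rows[1:]:
        if row[0] == "memcopy":
            continue  # the rows the visual profiler choked on
        if len(row) != len(header) or any(field == "" for field in row):
            continue  # truncated or partially written row
        kept.append(row)
    return kept
```

The kept rows can be written back out with csv.writer(...).writerows(kept) before importing the file into the visual profiler.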

E.D_Riedijk · September 12, 2008, 6:12am

Well, it would be even easier to just let the visual profiler do all the work. You will not have to set any environment variables or tweak any config files. Importing should indeed also work, but I have never tried it, so I also don’t know if there is an option to not generate memcopy output. I think there is no such option (and I would consider it a bug in the visual profiler if it cannot handle that).

Anyhow, if you use only the visual profiler, you should have no such problems, except for programs that change the current directory before running CUDA code. Then it would be smart to change back to the old directory before calling your kernel(s) as the visual profiler expects the files in that directory.

MisterAnderson42 · September 12, 2008, 12:23pm

A lot of people on the forums see this error. I’ve never seen it myself, though. I think it can be caused if your program requires that you press a key to exit or if the working directory changes, as E.D. Riedijk pointed out.

The visual profiler will generate the config for you. You can select more than 4 signals and it will rerun your app several times to measure groups of 4 signals independently. I find that this works best when the app always calls the same number of kernels in the same order so that the multiple runs can be merged line by line.
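To illustrate why the kernel order matters, here is a small Python sketch of the idea. It is not how the visual profiler is implemented; the function names and the row format (one dict per kernel launch with a "method" key) are assumptions made for the example.

```python
def group_signals(signals, group_size=4):
    """Split a list of signal names into groups of at most group_size."""
    return [signals[i:i + group_size] for i in range(0, len(signals), group_size)]


def merge_runs(runs):
    """Merge the rows of several runs line by line.

    Each run is a list of dicts, one per kernel launch. The merge only makes
    sense if every run launched the same kernels in the same order.
    """
    if not runs:
        return []
    expected = [row["method"] for row in runs[0]]
    for run in runs[1:]:
        if [row["method"] for row in run] != expected:
            raise ValueError("runs differ in kernel count or order")
    merged = []
    for rows in zip(*runs):
        combined = {}
        for row in rows:
            combined.update(row)  # add this run's counters to the launch
        merged.append(combined)
    return merged
```

If one run launches an extra kernel, or the launches come in a different order, the line-by-line pairing no longer lines up, which is what the ValueError stands in for here.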

Weird. I’ve never seen this before either. To check, I just downloaded the latest profiler 1.0 (I haven’t profiled since I upgraded to CUDA 2.0) and ran it on my app, which does lots of memcpys. I just set up the arguments and working directory, then clicked go, and everything worked with the memcpys.

I know it’s probably frustrating to hear me say “it worked for me”, but it does work for me with an extremely complicated app … Perhaps you could post a small test .cu file (to be compiled by nvcc -o test test.cu) that reproduces the problem. Then we could all try the same test case and narrow it down to the root cause.

Other ideas:

Maybe you have a mismatch between the CUDA profiler version and the version of CUDA? I don’t think the profiler format has changed since 1.1, but it may have. Try CUDA 2.0 and the latest profiler 1.0 download from http://www.nvidia.com/object/cuda_get.html .

Maybe it is a platform specific issue? I’m running on x86_64 linux. What platform are you on? I could switch to that platform and see if I get the same issue you do there.

MisterAnderson42 · September 12, 2008, 12:45pm

Does your application run on more than one GPU? I just tried the profiler on that for another post on the forums and found that the visual profiler doesn’t like the profile output from multi-GPU apps: it complained about a missing gputime column once and a missing occupancy column the second time.

CUDA_Moose · September 12, 2008, 5:26pm

I am running on Fedora 8 x86_64 linux. I am using the ver 1.0.11 profiler, 2.0 beta2 CUDA sdk and 177.67 Nvidia driver. My app is a multi-GPU application, so I guess that is the problem. Thanks. :)
