Problem with profiler on Mac OS X 10.5? Profiler only kind of works?

Hello,

I’m trying to get the profiler running and I’m getting odd results.

System: MacBook Pro, 10.5.4, NVIDIA GeForce 8600M GT, 256 MB.

I have the CUDA system installed and have implemented my kernels and integrated them into some test code. Now I want to make sure I’ve done all the coalescing correctly etc., so it’s time to use the profiler.

If I set my environment variables as such:
CUDA_PROFILE_CONFIG=/Users/sussillo/.cuda_profile_config
CUDA_PROFILE_CSV=1
CUDA_PROFILE_LOG=.cuda_profile_log.csv
CUDA_PROFILE=1

and the contents of the config file are:

more ~/.cuda_profile_config
gld_coherent
branch

then when I run my program (“>testger”) and then examine the profile log file, it’s empty.
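For anyone trying to reproduce this, the setup described above could be scripted roughly as follows. This is only a sketch assuming a bash-like shell. PROFILE_DIR is a hypothetical variable introduced here for illustration, and it defaults to a scratch directory instead of the home directory so that an existing config file is not overwritten. The variable names and counter names are the ones quoted in this post.

```shell
# Hypothetical location for the profiler config; a scratch directory by default.
PROFILE_DIR="${PROFILE_DIR:-${TMPDIR:-/tmp}}"

# One counter name per line, matching the config file contents shown above.
printf '%s\n' gld_coherent branch > "$PROFILE_DIR/.cuda_profile_config"

# Export the variables so that a program started from this shell inherits them.
export CUDA_PROFILE=1
export CUDA_PROFILE_CSV=1
export CUDA_PROFILE_LOG=.cuda_profile_log.csv
export CUDA_PROFILE_CONFIG="$PROFILE_DIR/.cuda_profile_config"

# Show what the program will see before running it (e.g. ./testger).
env | grep '^CUDA_PROFILE'
cat "$CUDA_PROFILE_CONFIG"
```

Printing the environment and the config file first at least rules out a typo in a variable name or path as the cause of an empty log.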

If I unset CUDA_PROFILE_CONFIG, run the program “>testger” then examine the profiler output:
$ more .cuda_profile_log.csv
method,gputime,cputime,occupancy
memcopy,1425.376
memcopy,8.928
_Z18my_cuda_syr_kernelfPfS_i,9638.784,9740.377,0.667
memcopy,2496.224

it works. So basically any reference to the config file gives no profiler output! But it’s really the global memory access information that I’m interested in.
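Even without counters, the timing log can be summarized per method. The sketch below assumes the CSV layout shown above (method name in the first column, GPU time in the second) and skips the header row plus any lines starting with `#`, which is an assumption about other profiler versions. The function name summarize_gputime is hypothetical.

```shell
# Hypothetical helper: total GPU time and share of total per method,
# read from a CSV profiler log whose first two columns are method,gputime.
summarize_gputime() {
  awk -F, '
    /^#/ || $1 == "method" { next }          # skip comment lines and the header row
    NF >= 2 { total[$1] += $2; sum += $2 }   # accumulate GPU time per method
    END {
      for (m in total)
        printf "%s %.3f %.1f%%\n", m, total[m], (sum > 0 ? 100 * total[m] / sum : 0)
    }
  ' "$1" | LC_ALL=C sort
}

# Example input: the log lines quoted above, written to a temporary file.
log="$(mktemp)"
cat > "$log" <<'EOF'
method,gputime,cputime,occupancy
memcopy,1425.376
memcopy,8.928
_Z18my_cuda_syr_kernelfPfS_i,9638.784,9740.377,0.667
memcopy,2496.224
EOF

summarize_gputime "$log"
```

On the four entries quoted above, this attributes roughly 71% of the logged GPU time to the kernel and the rest to the memory copies.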

Any ideas? (Sorry if this is newbie and I’m just too new to know.)

Thank you,
-David

Hello,

So I just tried to run the profiler on a brand-new Mac Pro with 2 brand new GeForce 8800 GT cards, and I got the exact same error as described above.

According to $CUDA_HOME/doc/CUDA_Profiler_1.1.txt, all I should have to do is set CUDA_PROFILE=1 and things should work, with no compiler options or extra libraries.

Can I get a ping from anybody who has successfully used the profiler on the Mac platform while also utilizing any of the following options? At least then I would know the problem is on my side.
timestamp
gld_incoherent
gld_coherent
gst_incoherent
gst_coherent
local_load
local_store
branch
divergent_branch
instructions
warp_serialize
cta_launched

Thank you,
-David

Looks like I finally found the answer to my question:

This is from the Beta release of the Mac OS X CUDA Visual Profiler.

I’m definitely interested to find out if the profiler counters will ever be supported on Mac, as it’s nearly impossible to speed up the kernel code without knowing exactly what is going on.

Regards,

-David

Yeah, the lack of counters is killing me. All the profiler tells me is that 95% of execution time is spent in my kernel and only 5% is transferring data. Great as far as it goes, but it doesn’t help me at all in figuring out why the kernel runs slowly. Does anyone have an idea of when we can expect a working profiler for OSX?

Why did no one tell me that these features are supported in cudaprof 1.3 in the CUDA 2.3 release? I had a bad install and thought the problem was inherent in the program. A lot of time was wasted.

To be clear, the more accurate GPU timestamps are not supported yet on OSX, but we have CPU timestamps and counters. The GPU timestamps (which are supported in Linux and Windows) are expected to be supported for Mac in the coming CUDA 3.0 release.

hi

I have read about your problem. I think you may have solved it by now.

If not, I can tell you that I have installed the CUDA profiler on a Mac and it gives results for all counters. Maybe you forgot to select the profiler counters, or maybe the profiler is not supported with your CUDA version. In that case, download an older version of the profiler and try it; it may work.

ok

Bye

Thanks

Khyati Shah