--ptxas-options=-v Equivalent for CUDA Fortran?

With my current CUDA Fortran program, I’m at the point now where counting registers, lmem, etc., has become important (though whether I can do anything with that info is up to my brain). To that end, I was wondering if there was an equivalent to “--ptxas-options=-v” for pgfortran?

At the moment, I’m compiling with -Mcuda=keepbin and reading the resulting .bin (cf. .cubin) file to see this, but I was wondering if there was a more…elegant way to get this data (a la the nvcc option).

Thanks,
Matt

Hi Matt,

The short answer is no. However, we have been discussing how to cleanly pass options to the back-end Nvidia tools such as the ptxas assembler. I’ve sent a note to Michael to see where his team stands on this, but he’s out of the office this week. I’ll post a reply once I hear back from him.

Thanks,
Mat

Hi Matt,

FYI, as of 10.5, the output of “-Minfo=accel” will include the information from “--ptxas-options=-v”.

For example:

mat_times_vec:
    194, Generating copyout(y(:,:,:,:))
         Generating copyin(x(:,:,:,:))
         Generating compute capability 1.3 binary
    195, Loop is parallelizable
    196, Loop is parallelizable
    197, Loop is parallelizable
    198, Loop is parallelizable
         Accelerator kernel generated
        195, !$acc do parallel
        196, !$acc do parallel, vector(4)
        197, !$acc do vector(4)
        198, !$acc do vector(16)
             CC 1.3 : 56 registers; 24 shared, 136 constant, 0 local memory bytes; 25% occupancy
    207, Loop is parallelizable
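
If you want to collect these numbers across many kernels, the "CC" line can be picked out of the -Minfo=accel output with a small script. The following is only a sketch: the pattern is modeled on the single line shown above, the names `CC_LINE` and `parse_cc_line` are made up for illustration, and other compiler versions may format the line differently.

```python
import re

# Pattern modeled on the one "CC 1.3 : ..." line in the example output;
# this is an assumption, not a documented format.
CC_LINE = re.compile(
    r"CC\s+(?P<cc>\d+\.\d+)\s*:\s*(?P<registers>\d+) registers;\s*"
    r"(?P<shared>\d+) shared,\s*(?P<constant>\d+) constant,\s*"
    r"(?P<local>\d+) local memory bytes;\s*(?P<occupancy>\d+)% occupancy"
)

def parse_cc_line(line):
    """Return the resource counts from one 'CC' line, or None if it does not match."""
    m = CC_LINE.search(line)
    if m is None:
        return None
    # Every field except the compute capability is an integer.
    info = {k: int(v) for k, v in m.groupdict().items() if k != "cc"}
    info["cc"] = m.group("cc")
    return info

sample = ("             CC 1.3 : 56 registers; 24 shared, 136 constant, "
          "0 local memory bytes; 25% occupancy")
usage = parse_cc_line(sample)
print(usage)
```

Lines that do not match (such as "Loop is parallelizable") return None, so the function can be applied to every line of the compiler output.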

Thanks,
Mat

Hi Mat,
How can we get the same output (e.g. the number of registers per kernel, the shared memory used per block…) in CUDA Fortran? Is this only available with the Accelerator model?

Thanks,
Tuan

You can add ptxinfo to your -Mcuda= string to get a similar analysis:

ptxas info    : Compiling entry function 'irrad'
ptxas info    : Used 63 registers, 16012+0 bytes lmem, 36+16 bytes smem, 7536 bytes cmem[0], 468 bytes cmem[1]

(NB: I don’t have access to a Fermi yet, so I don’t know if it works with it.)
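
If you save the compiler output to a file, the "ptxas info" lines can be turned into a per-kernel table with a few lines of Python. This is a rough sketch based only on the two lines shown above; `parse_ptxas_info` is a hypothetical helper, and the "a+b" pairs are kept as tuples rather than interpreted.

```python
import re

def parse_ptxas_info(text):
    """Map each entry function name to its register count and memory usage."""
    kernels = {}
    current = None
    for line in text.splitlines():
        entry = re.search(r"Compiling entry function '([^']+)'", line)
        if entry:
            current = entry.group(1)
            continue
        used = re.search(r"Used (\d+) registers", line)
        if used and current is not None:
            stats = {"registers": int(used.group(1)), "cmem": {}}
            # "16012+0 bytes lmem" is stored as the tuple (16012, 0).
            for a, b, kind in re.findall(r"(\d+)(?:\+(\d+))? bytes (lmem|smem)", line):
                stats[kind] = (int(a), int(b or 0))
            # "7536 bytes cmem[0]" is stored as cmem[0] = 7536.
            for size, bank in re.findall(r"(\d+) bytes cmem\[(\d+)\]", line):
                stats["cmem"][int(bank)] = int(size)
            kernels[current] = stats
            current = None
    return kernels

sample = (
    "ptxas info    : Compiling entry function 'irrad'\n"
    "ptxas info    : Used 63 registers, 16012+0 bytes lmem, 36+16 bytes smem, "
    "7536 bytes cmem[0], 468 bytes cmem[1]\n"
)
kernels = parse_ptxas_info(sample)
print(kernels["irrad"])
```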

Hi Mat,
I tried, but the compiler doesn’t recognize the -Mcuda=ptxinfo value. I’m using Fortran 10.5 with a Tesla C1060. The one that is available is -Mcuda=keepptx, yet I can’t find any related information in the generated file or in the output during compilation.

Thanks,
Tuan

Ah. 10.5 doesn’t have ptxinfo, I think. You can find that information instead by using “keepbin” in the -Mcuda string.

Then, if you look at the outputted .bin file, you’ll see lmem, etc. inside that text file.

The .bin is actually a binary file, not a text file. I did open it, but I can’t find any line containing lmem. Any suggestions, Matt?

Tuan

Hmm. This suggests to me that you are using the CUDA 3.x toolkit and not the 2.3 (or lower) toolkit, since 3.x now uses ELF for .cubin/.bin rather than plain text (hence the need for ptxinfo).

Have you changed your default CUDA version?

Alternatively, do you get plain-text bins if you use:

-Mcuda=2.3,keepgpu,keepbin,keepptx,...

I think this will force pgfortran to use the 2.3 toolkit.
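
A quick way to tell which kind of .bin you got is to look at the first four bytes, since every ELF file starts with 0x7F followed by "ELF". Here is a minimal sketch; `is_elf` and the file names are made up for illustration, and the two files written below are stand-ins, not real compiler output.

```python
import os
import tempfile

ELF_MAGIC = b"\x7fELF"  # first four bytes of any ELF file

def is_elf(path):
    """Return True if the file at `path` starts with the ELF magic number."""
    with open(path, "rb") as f:
        return f.read(4) == ELF_MAGIC

# Demonstration with two stand-in files (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    elf_path = os.path.join(tmp, "kernel_elf.bin")
    txt_path = os.path.join(tmp, "kernel_txt.bin")
    with open(elf_path, "wb") as f:
        f.write(ELF_MAGIC + b"\x00" * 12)
    with open(txt_path, "w") as f:
        f.write("lmem = 0\n")
    results = (is_elf(elf_path), is_elf(txt_path))
print(results)
```

If `is_elf` returns True for your kept .bin, searching it for lmem as text won’t work, which matches what Tuan saw.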

You’re right, I’m using CUDA 3.0. To switch back to 2.3, I need to recompile some shared libraries which are required for the build. Thanks a lot, Matt.

Tuan

You’re welcome. That said, there have been a lot of improvements since PGI 10.5, so if you can upgrade to 10.8, I think it’d be worth it.