Outputting the CUDA C the accelerator constructs?

TheMatt · July 13, 2009, 2:55pm

My question is more in the vein of “help me continue to learn CUDA” than “get this to work”. Namely, I was wondering whether there is a pgfortran compiler option that makes the compiler output the CUDA code generated by the accelerator.

Following Dr Wolfe’s videos on this site, I constructed the Matrix Multiplication Driver/Kernel pairing that he demonstrates and tested it. I used the -Minfo=all,accel -ta=nvidia options, which provide quite useful and interesting information about how the accelerator is working. I just wondered whether there is a further compiler option that writes out not only the !$acc do parallel, vector(16) directives but the CUDA calls themselves, so I could see the shared memory, register, etc. transformations.

MatColgrove · July 13, 2009, 10:52pm

Hi TheMatt,

At this time we have not made this option available, but we are considering it. The problem is not so much exposing the generated CUDA code as the follow-up question: “can I then modify the generated CUDA code and have my application use the modified version?” That is technically challenging since CUDA does not have a linker.

Thanks,
Mat

TheMatt · July 14, 2009, 4:16pm

Mat,

Heh, I hadn’t even thought of that. Rather, I was just thinking of learning what certain Accelerator options do, etc, in terms of CUDA. If I wanted to take the next step as you state, it’d be more toward thinking about converting my Fortran code into pure CUDA C to see if I can squeeze more performance out. That way, I’d have a starting point.

And, as I said, learning “better” CUDA through your Accelerator logic, which has more expert minds behind it.

Matt

MatColgrove · July 14, 2009, 6:20pm

Hi Matt,

We’ve decided that we’ll add a flag to the next release (9.0-3) that will allow the user to keep the intermediate CUDA code. It will just be the generated kernel, but will give you at least a starting point.

Thanks,
Mat

TheMatt · October 6, 2009, 2:37pm

Mat, I can’t seem to find the option for this in 9.0-4, so could you post it? Now that I’m starting to look at/use CUDA Fortran and having to remap my brain, this could be useful to me.

MatColgrove · October 6, 2009, 3:13pm

Sure, it’s “-ta=nvidia,gpufile”, which creates a “.gpu” file containing the generated CUDA code.

Also, while the CUDA executable code is normally embedded in your application’s binary, when “gpufile” is used the CUDA binary is placed in a separate “.bin” file. The “.bin” file must be located in the same directory as the application in order to run.
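
As a rough sketch, a build line might look like the one below. It only combines flags already mentioned in this thread. The source file matmul.f90 and the output name matmul are placeholders, not names from any real project, and the exact names of the generated files are not assumed here.

```shell
# matmul.f90 and matmul are hypothetical names; substitute your own.
# -Minfo=all,accel reports what the accelerator did;
# -ta=nvidia,gpufile keeps the generated CUDA code in a ".gpu" file.
pgfortran -Minfo=all,accel -ta=nvidia,gpufile matmul.f90 -o matmul

# Look for the kept intermediate files next to the executable.
ls *.gpu *.bin
```

If you move the executable afterwards, copy the “.bin” file along with it.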

Mat

TheMatt · October 7, 2009, 3:16pm

Ooh, thanks. A quick try at this seems to indicate that Dr Wolfe, you, the team, are much more clever than I am at CUDA. What the compiler does is nothing like what I was thinking of doing in CUDA Fortran!
