Is it possible to call a CUDA kernel from PGI compiled code?

PaulPa · November 25, 2012, 1:57pm

Hi,

I’m just curious whether this statement from PGI’s OpenACC FAQ is still valid:

“PGI is working on the design of a feature to allow you to call kernel functions written in CUDA or PTX or other languages directly from your C or Fortran program. We will announce this feature when it is available.”

If this is still up-to-date, could you please tell me when this feature is likely
to be added?

I’m looking forward to this feature since it would allow me to “hand tune”
some “hot” kernels.

Best,
Paul

MatColgrove · November 26, 2012, 7:34pm

Hi Paul,

This is part of the proposed OpenACC 2.0 spec. See “PGI 2019 Now Available | OpenACC” and http://www.openacc.org/sites/default/files/Proposed%20Additions%20for%20OpenACC%202.pdf. In particular, see the “routine” directive.

We expect to have these features implemented by the middle of next year.

Mat

PaulPa · November 26, 2012, 7:54pm

Hi Mat,

as always, thank you :)

EDIT: A closer look at the OpenACC_2.pdf reveals that we might be talking about two different things. You are talking about routines that can be called within an OpenACC region, right?

I was talking about a CUDA kernel that can be called outside of an acc region.
E.g.:

some_cuda_kernel<<<num_blocks, threads_per_block>>>(in_out);
// do something before the acc region
#pragma acc parallel deviceptr(in_out)
{
    // do something with in_out
}
// copy in_out back to the host C code

Best,
Paul

MatColgrove · November 26, 2012, 10:10pm

Hi Paul,

“You are talking about routines that can be called within an OpenACC region, right?”

Correct. I’ll see if we can make the FAQ clearer that it’s about calling CUDA device functions from within an OpenACC compute region.
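For readers unfamiliar with the in-region case, here is a minimal sketch of what the “routine” directive allows, assuming the syntax of the OpenACC 2.0 spec as it was later finalized. The function names `square_plus_one` and `apply_helper` are hypothetical, and the helper is an ordinary C function rather than a hand-written CUDA device function. A compiler without OpenACC support ignores the pragmas and runs the loop on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper. "routine seq" asks an OpenACC compiler to also
 * build a device version that each thread calls sequentially. */
#pragma acc routine seq
float square_plus_one(float x)
{
    return x * x + 1.0f;
}

/* Hypothetical caller: applies the helper to every element inside an
 * OpenACC compute region. copy(v[0:n]) moves the array to the device
 * before the loop and back to the host afterwards. */
void apply_helper(float *v, int n)
{
    assert(v != NULL && n >= 0);

    #pragma acc parallel loop copy(v[0:n])
    for (int i = 0; i < n; ++i)
        v[i] = square_plus_one(v[i]);
}
```

Note that this covers only calls made from inside a compute region. It does not launch a CUDA kernel from host code, which is the other case discussed in this thread.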

“I was talking about a CUDA kernel that can be called outside of an acc region.”

With Fortran you can do this now by using PGI CUDA Fortran.

As for C, the problem is in NVIDIA’s header files, which need some changes before PGCC can be used as a host compiler. We’ve asked several times, but they haven’t made them yet. Feel free to put in a request to NVIDIA. I’m not sure it will help, but it wouldn’t hurt.

In the meantime, what you need to do is have your PGCC-compiled code call your NVCC-compiled CUDA C code.

Mat

PaulPa · November 27, 2012, 9:46pm

My project is structured as follows:

main.c
compute.c (this file issues calls to the CUDA runtime API)
kernels.cu

I can compile this project with Intel’s compiler for the *.c files and NVIDIA’s nvcc for the *.cu file, then link everything with Intel’s compiler.

If I try exactly the same with the PGI compiler, I receive the following error while compiling compute.c:

PGC-F-0249-#error --  --- !!! UNKNOWN COMPILER: please provide a CUDA compatible definition for '__align__' !!! --- (/usr/local/cuda/5.0.35/include/host_defines.h: 128)

That’s what you are talking about, right?

“In the meantime, what you need to do is have your PGCC compiled code call your NVCC compiled CUDA C code.”

Please correct me if I’m mistaken: my compute.c file is not allowed to include cuda_runtime.h or make CUDA API calls, so I have to move all CUDA API calls to a separate file that is compiled with nvcc?

Best,
Paul

MatColgrove · November 27, 2012, 10:12pm

“That’s what you are talking about, right?”

Yes. The CUDA header files are configured to use GNU or MSC (I’m guessing icc sets some macros making it appear like gcc).

“Please correct me if I’m mistaken, my compute.c file is not allowed to include the cuda_runtime.h and make CUDA API calls? So I have to move all CUDA API calls to a different file which needs to be compiled with the nvcc?”

That’s correct. All CUDA API calls and headers need to go in a separate source file, which can then be compiled by NVCC.
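Seen from the C side, the split might look roughly like the sketch below. The wrapper name `launch_scale`, its signature, and the caller `compute` are assumptions made for illustration and are not taken from the project in this thread. In a real build the wrapper would be defined in kernels.cu, declared `extern "C"` there, and compiled by nvcc. A host-only stand-in replaces it here so that the sketch builds with any C compiler. The point to notice is that the C file includes no CUDA header at all.

```c
/* compute.c: built by the host C compiler; no CUDA header is included. */
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical wrapper (name and signature are assumptions for this sketch).
 * In the real project it would be defined in kernels.cu, declared
 * extern "C" there so nvcc emits an unmangled C symbol, and it would
 * contain every CUDA runtime call and the <<<...>>> kernel launch.
 */
int launch_scale(float *data, float factor, size_t n);

/* Host-side caller: it sees only the plain C prototype above. */
int compute(float *data, size_t n)
{
    assert(data != NULL);
    return launch_scale(data, 2.0f, n);
}

/*
 * Host-only stand-in so this sketch links without nvcc. Remove it once
 * the nvcc-compiled kernels.cu object supplies the real launch_scale.
 */
int launch_scale(float *data, float factor, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        data[i] *= factor;
    return 0; /* 0 signals success, matching the value of cudaSuccess */
}
```

With the real wrapper in place, kernels.cu would be compiled to an object file with `nvcc -c`, the *.c files with the host compiler, and the final link would also need the CUDA runtime library (typically `-lcudart` plus the matching library path for the installation).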

Mat

PaulPa · November 28, 2012, 7:29am

Thanks for the clarification. It would be nice if either PGI or NVIDIA could do something about this in the future, since the current workaround is somewhat inconvenient.

Best, Paul

MatColgrove · November 28, 2012, 3:48pm

“It would be nice if either PGI or NVIDIA could do something about this in the future”

Please send in a request to NVIDIA. We keep asking, but maybe having a few users ask as well would help.

Mat
