Functions inside __global__ functions question

Logical_American · March 13, 2012, 4:08am

While I can compile a straight piece of code

int main() {

    int gcd(int a) {
        int b = a;
        return b;
    }

    return 0;
}

with nvcc or gcc (running Ubuntu 10.04) when the file is named myfile.c, things fall apart when I compile the same thing as a __global__ function in a file named myfile.cu:

__global__ void myfunction()
{
    int gcd(int a) {
        int b = a;
        return b;
    }
}

and nvcc keeps asking for a missing ; on the int gcd line, as if expecting a function prototype. If I prototype it, nvcc then says that I cannot call a host function from a __global__ function.

So just how do we get functions working inside the one big __global__ program in .cu source files?

Randall

njuffa · March 13, 2012, 10:04am

__global__ functions are functions running on the device that are called from the host. A __global__ function can call as many __device__ functions as it desires. An unadorned function name is assumed to refer to a host function, and host functions cannot be invoked from the device. This is why in your code the call to gcd() is flagged as an error.
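A minimal sketch of the qualifier rule just described, assuming a helper that should be callable from both sides: marking it __host__ __device__ asks nvcc to compile it for the host and for the device. The names twice, twice_kernel and twice_on_device are hypothetical, and error handling is kept short.

```cuda
#include <cuda_runtime.h>

// Compiled twice: once for the host, once for the device.
__host__ __device__ int twice(int a)
{
    return 2 * a;
}

// Kernel (device code launched from the host) calling the helper.
__global__ void twice_kernel(int a, int *out)
{
    *out = twice(a);
}

// Hypothetical host wrapper: runs the kernel with a single thread and
// copies the result back. Returns 0 on success, -1 on any CUDA error.
int twice_on_device(int a, int *result)
{
    int *out_d = 0;
    if (cudaMalloc((void **)&out_d, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    twice_kernel<<<1, 1>>>(a, out_d);
    cudaError_t err = cudaMemcpy(result, out_d, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(out_d);
    return (err == cudaSuccess) ? 0 : -1;
}
```

With only __device__ on twice, the kernel would still compile, but a direct call to twice() from host code would be rejected.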

You might want to read the CUDA Programming Guide and look at the example apps in the CUDA SDK.

Logical_American · March 14, 2012, 2:18am

You are correct. I did read version 4 of the programming guide and did compile all the example applications in the (latest) CUDA SDK. Furthermore, I started playing around with the Makefile to see exactly what was happening there. I also tried putting the __device__ attribute in front of both the function's forward declaration and the function itself, but I still cannot get make or nvcc to compile it.

I did notice that cppIntegration uses an external function inside the __global__ body, but it uses the extern "C" style and the function is located inside a .cpp code unit. I tried this method of construction, but it failed too.

I also noticed that when lots of functions were compiled, the compiler ran for about an hour and then stopped with a fatal message about an external call to CUPRINTF, even though that is a legitimate, working CUDA function. I am not sure why the compiler says something is illegal when it obviously is not.

So I ripped out all the CUPRINTF code, realizing that the compiler was going to flag it anyway and reject it and I didn’t want to take that risk.

My goal is to simply get a function inside a global function to work and compile.

And yes, I did read the guide and made the examples and ran them and even took one apart and modified the <<<>>> call to see exactly what was going on there. And yes, I ran 1000 threads on one device.

Are there any compiling guides?? I did a cursory check through all the pdf files, and corrected the main documentation.html file in the SDK so that it correctly references the correct location for the pdf files cited.

njuffa · March 14, 2012, 3:00am

I am not sure where you are stuck. Here is a minimalist example that may help:

#include <stdio.h>
#include <stdlib.h>

__device__ float square (float a);
__global__ void sqr_kernel (float x, float *res);

int main (void)
{
    float arg, res = 0.0f;
    float *res_d;

    if (cudaSuccess != cudaMalloc ((void**)&res_d, sizeof(res_d[0]))) {
        fprintf (stderr, "!!!! allocation of res_d failed\n");
        return EXIT_FAILURE;
    }
    arg = 5.0f;
    sqr_kernel<<<1,1>>>(arg, res_d);
    if (cudaSuccess !=
        cudaMemcpy (&res, res_d, sizeof(res), cudaMemcpyDeviceToHost)) {
        fprintf (stderr, "!!!! device->host copy of result failed\n");
        return EXIT_FAILURE;
    }
    printf ("The square of %.f is %.f\n", arg, res);
    if (cudaSuccess != cudaFree (res_d)) {
        fprintf (stderr, "!!!! de-allocation of res_d failed\n");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}

__global__ void sqr_kernel (float x, float *res)
{
    *res = square (x);
}

__device__ float square (float a)
{
    return a * a;
}

I put the above in a file test.cu and compiled with

nvcc -o test test.cu

When run, the program prints

The square of 5 is 25

[Later:]

Looking at your original code again, it seems what you are trying to do is define a function nested inside another function. This is not supported by standard C and standard C++, and thus not supported by CUDA. gcc may allow nested functions in C/C++ as a proprietary extension, but this is not something I am familiar with.
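For reference, GCC documents nested functions as a GNU C extension that GNU C++ does not support, and nvcc compiles .cu files as C++, which would explain why the same code passes in myfile.c and fails in myfile.cu. The sketch below shows two ways to restructure the original gcd example, under stated assumptions: option 1 moves the helper to file scope, and option 2 uses a lambda, which assumes a toolkit with C++11 support (CUDA 7.0 or later, newer than the one discussed in this thread). The out parameter and the run_myfunction wrapper are hypothetical additions so that the result can be observed.

```cuda
#include <cuda_runtime.h>

// Option 1: the former nested function, moved to file scope and
// marked __device__ so that the kernel may call it.
__device__ int gcd(int a)
{
    int b = a;
    return b;
}

__global__ void myfunction(int a, int *out)
{
    out[0] = gcd(a);

    // Option 2 (assumes C++11 support in nvcc): a lambda is the closest
    // standard C++ analogue of a function local to another function.
    auto local_gcd = [](int x) { int b = x; return b; };
    out[1] = local_gcd(a);
}

// Hypothetical host wrapper: launches one thread and copies both
// results back. Returns 0 on success, -1 on any CUDA error.
int run_myfunction(int a, int result[2])
{
    int *out_d = 0;
    if (cudaMalloc((void **)&out_d, 2 * sizeof(int)) != cudaSuccess) {
        return -1;
    }
    myfunction<<<1, 1>>>(a, out_d);
    cudaError_t err = cudaMemcpy(result, out_d, 2 * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(out_d);
    return (err == cudaSuccess) ? 0 : -1;
}
```

Option 1 works on any toolkit version and keeps the helper reusable from other kernels in the same file; the lambda is only visible inside the kernel that defines it.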

Logical_American · March 14, 2012, 3:05am

Here’s what I did. I had to convert all the function bodies to .cu files and all the header files to .cuh files. At the definition line of each function I added the __device__ prefix, which made that function available to the __global__ function calling it.

Next I had to fix the duplicate-definition errors during compilation, so I used the traditional #ifndef PROGRAM_FILE_NAME include guard for the preprocessor. This fixed the issues with files that referenced the same .cuh header more than once.
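A minimal sketch of such an include guard, assuming a hypothetical header myfile.cuh that defines a __device__ helper inline so its body is visible wherever the header is included. The guard macro MYFILE_CUH, the kernel and the host wrapper are illustrative names, and the two files are shown back to back in one listing.

```cuda
// ---- myfile.cuh (hypothetical header) ----
#ifndef MYFILE_CUH
#define MYFILE_CUH

// Everything between #ifndef and #endif is seen only once per
// translation unit, however many times the header is included.
__device__ inline int gcd(int a)
{
    int b = a;
    return b;
}

#endif // MYFILE_CUH

// ---- main.cu (hypothetical); it would contain #include "myfile.cuh" ----
#include <cuda_runtime.h>

__global__ void gcd_kernel(int a, int *out)
{
    *out = gcd(a);
}

// Hypothetical host wrapper. Returns 0 on success, -1 on any CUDA error.
int gcd_on_device(int a, int *result)
{
    int *out_d = 0;
    if (cudaMalloc((void **)&out_d, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    gcd_kernel<<<1, 1>>>(a, out_d);
    cudaError_t err = cudaMemcpy(result, out_d, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(out_d);
    return (err == cudaSuccess) ? 0 : -1;
}
```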

Then I still couldn’t get things to work, but noticed that changing the include file name from “myfile.cuh” to “myfile.cu” in the CUDA file calling that function might fix the problem with external references. This DID fix the problem; apparently the nvcc compiler wants everything to be visible at once, rather than going through the header file. I am not sure how scoping is accomplished during the compile phase.
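This behaviour is consistent with a toolkit that compiles device code one translation unit at a time, so a __device__ function has to be defined in the same translation unit as the kernel that calls it. Separate compilation of device code was added in CUDA 5.0 through the -rdc=true option of nvcc. The sketch below assumes such a toolkit; the file names, the cube helper and the host wrapper are hypothetical, and the files are shown back to back in one listing.

```cuda
#include <cuda_runtime.h>

// ---- helper.cu (hypothetical): the definition ----
__device__ int cube(int a)
{
    return a * a * a;
}

// ---- helper.cuh (hypothetical), included by main.cu: declaration only ----
extern __device__ int cube(int a);

// ---- main.cu (hypothetical): sees only the declaration above ----
__global__ void cube_kernel(int a, int *out)
{
    *out = cube(a);
}

// Hypothetical host wrapper. Returns 0 on success, -1 on any CUDA error.
int cube_on_device(int a, int *result)
{
    int *out_d = 0;
    if (cudaMalloc((void **)&out_d, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    cube_kernel<<<1, 1>>>(a, out_d);
    cudaError_t err = cudaMemcpy(result, out_d, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(out_d);
    return (err == cudaSuccess) ? 0 : -1;
}

// Build for the split layout, assuming CUDA 5.0 or later (older
// toolkits in that range may also need an -arch flag for a GPU that
// supports relocatable device code):
//   nvcc -rdc=true -c helper.cu main.cu
//   nvcc helper.o main.o -o app
```

On a toolkit without separate compilation, keeping the definition visible in the calling file, as described above, remains the workable approach.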

It would be nice to add a section in the Programming Guide on compiling and linking and how to call functions and such as this doesn’t seem too visible.

For example, I did a simple cut and paste from the programming guide, but it wouldn’t compile. It was only when I moved inside the SDK and started using the Makefile that things finally compiled.

Thanks for your help.

Logical_American · March 14, 2012, 3:12am

Quote:
“Looking at your original code again, it seems what you are trying to do is define a function nested inside another function. This is not supported by standard C and standard C++, and thus not supported by CUDA. gcc may allow nested functions in C/C++ as a proprietary extension, but this is not something I am familiar with.”

Apparently gcc allows this, as it passes without complaint.

But yes, you are correct that this is not supported by standard C and C++. I only tried this construction while trying to get the main.cu file to compile, and had put everything into one huge file (thousands of lines long) since I was still learning the nuances of compiling and linking with nvcc.

Right now I have all the code broken up into modules, as is usual practice, and nvcc likes that much better.
