ERROR: EXTERNAL CALLS NOT SUPPORTED

chrismc · July 4, 2008, 1:25pm

I have a CUDA program with one kernel and several device functions called by the kernel. It all compiles (variables all declared and aligned in function declarations and specifications etc).

But I get the following error for just one of the device functions:

“error : external calls not supported (found non-inlined call to _Z12 functionname ifPfS_S_S_S_S…”

What does this mean and how can the error be rectified?

Would it be better to ditch the device functions and have one massive kernel?

E.D_Riedijk · July 4, 2008, 7:22pm

Well, it looks like one call to a device function did not get inlined, and all functions should get inlined. Why? I have no clue. If you add -keep, you might see a reason in the .ptx file.

MisterAnderson42 · July 7, 2008, 1:01pm

Are the device functions in the same compilation unit as the kernel? I would guess that if you compiled multiple .cu files separately with nvcc, then the device functions would be “external”.

chrismc · July 10, 2008, 9:35am

Yes, all device functions are in the one .cu file with the kernel and host main function.

How should I

(1) declare and define a device function?
(2) call a device function from the kernel?

Why should a function be inlined explicitly?

MisterAnderson42 · July 10, 2008, 12:25pm

OK, I guess my intuition was incorrect. We’ll really need to see a minimal code that produces this error to help you further.

To (1) and (2): just like in normal C

__device__ float myfunc(float a)
    {
    return a*2.0f;
    }

__global__ void mykernel(...)
    {
    ....
    float c = myfunc(d);
    ....
    }

The GPU doesn’t have the full stack system that normal CPUs do for passing arguments to functions. There is a call instruction, so functions are not strictly required to be inlined, but inlining opens up a lot of opportunities for register optimization, so nvcc inlines all device function calls by default.
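For reference, a complete minimal program along these lines might look like the sketch below. The names twice, scaleKernel and doubleOnGpu are made up for illustration, and error checking is left out to keep it short.

```cuda
#include <cuda_runtime.h>

// Device function: callable only from GPU code. It is defined before the
// kernel and in the same .cu file, so nvcc can see its body and inline it.
__device__ float twice(float a)
{
    return a * 2.0f;
}

// Kernel: each thread doubles one element of data.
__global__ void scaleKernel(float *data, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)
        data[idx] = twice(data[idx]);
}

// Hypothetical host helper: copies one value to the GPU, runs the kernel
// on it, and copies the result back. No error checking (sketch only).
float doubleOnGpu(float x)
{
    float *d = nullptr;
    cudaMalloc((void **)&d, sizeof(float));
    cudaMemcpy(d, &x, sizeof(float), cudaMemcpyHostToDevice);
    scaleKernel<<<1, 32>>>(d, 1);
    cudaMemcpy(&x, d, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d);
    return x;
}
```

Calling doubleOnGpu from main is enough to exercise both the kernel and the device function.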

alphedra · December 17, 2008, 3:29pm

Hi,

I have run into exactly the same issue. Did you ever find a solution? I am calling device functions from my kernel and I get the same error message, without any other compilation errors.

Thanks

snowlxm · June 3, 2009, 12:20am

I also have the same problem.

What should I do? Does someone have a solution?

dlmeetei · June 3, 2009, 4:06am

Can you please post a code segment that reproduces the error?

panegyrize · June 17, 2009, 5:58pm

My code fails to compile with the same error. I put the kernel in one file (myKernel.cu) and the functions it calls in another file (myFunc.cpp). In myFunc.cpp, I add __device__ and __host__ before those functions since they are called by both the GPU and the CPU. Then the same error comes up.

//file1: myFunc.cpp
__device__ __host__ int myFunc1( int *a )
{
    return 2 * (*a);
}
....

//file2: myKernel.cu
__global__ void theKernel()
{
    ...
    myFunc1( ... );
    ...
}

I tried adding “extern” before myFunc1, but it did not work. Is there any compile configuration trick I should know about?
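One arrangement that should avoid the problem, assuming the toolchain cannot link device code across files (which is the situation described in this thread): move the shared function into a header and include it from the .cu file, so that the kernel and the function end up in the same compilation unit. A .cpp file is handed to the host compiler, which knows nothing about __device__. The file name myFunc.h, the MYFUNC_HD macro and the kernel arguments below are made up for illustration, and the two parts are shown in one listing so that it compiles on its own.

```cuda
// ---- would live in myFunc.h (hypothetical file name) ----
// Hide the CUDA qualifiers from a plain host compiler, in case a .cpp file
// includes this header too. __CUDACC__ is defined only when nvcc compiles.
#ifdef __CUDACC__
#define MYFUNC_HD __host__ __device__
#else
#define MYFUNC_HD
#endif

// Built for both CPU and GPU. 'inline' avoids duplicate-symbol link errors
// when several source files include the header.
MYFUNC_HD inline int myFunc1(const int *a)
{
    return 2 * (*a);
}

// ---- would live in myKernel.cu, after: #include "myFunc.h" ----
__global__ void theKernel(const int *in, int *out)
{
    // The definition is visible in this compilation unit, so nvcc can
    // inline the call instead of treating it as external.
    *out = myFunc1(in);
}
```

Later CUDA toolkits (5.0 onward, for compute capability 2.0 and up) can also link device code across files when built with nvcc -rdc=true, but that was not an option for the toolchains discussed in this thread.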

lsubrama · August 15, 2009, 10:44pm

I have the same error too, but in my case the error pops up when I try to create a structure variable in the device function.

[codebox]
__device__ float waveletTransformMethod2(int i, int j, TILE_C *tileMother, WAVELET_C *wave) {
    float valToReturn = 0.0;
    TILE_C tileForTransform;
    createTileC(j, i, &tileForTransform);
    ............
[/codebox]

The problem crops up at the line TILE_C tileForTransform… any ideas?

Thanks

MashRinx · August 19, 2009, 8:06pm

This is a total guess here as I’m not really an accomplished C programmer and even less so a CUDA one, but maybe it’s because this device function is not aware of the TILE_C type because the type declaration is not in the .cu file, even though it is being passed into the function as a parameter?

Probably way off, but just in case…

sidzonline85 · August 20, 2009, 7:30am

Well, I’m not getting the same error message as you guys, but when I call a device function from the kernel, the compiler complains: ‘identifier (function name) is undefined’.

Can someone help me out? :)

lsubrama · August 20, 2009, 8:50pm

Can you elaborate on where you have declared your functions, are they all in a single file, or in multiple files??

Posting your code will be helpful.

sidzonline85 · August 21, 2009, 5:49am

Hey, I figured out the problem after I posted about it. I hadn’t written the __device__ function prototype before writing the function body. I wasn’t sure __device__ functions required a declaration, but once I wrote the prototype, the code compiled. Thanks for your reply.

PS: the above holds only for functions with the __device__ qualifier and not for kernels (__global__).
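A small sketch of the ordering issue, with made-up names (addOne, incrementKernel): as in ordinary C++, a function has to be declared before the first call to it, so a __device__ function whose body sits below the kernel needs a prototype above it.

```cuda
// Prototype. Without this line the call inside incrementKernel is rejected
// with an "identifier is undefined" error, because the body comes later.
__device__ int addOne(int x);

__global__ void incrementKernel(int *data, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)
        data[idx] = addOne(data[idx]);
}

// Definition, placed after the kernel that uses it. It is still in the same
// .cu file, so the compiler can inline the call.
__device__ int addOne(int x)
{
    return x + 1;
}
```

Moving the definition above the kernel would work just as well and makes the prototype unnecessary.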

holovib · November 16, 2010, 11:20am

I had exactly the same problem and my solution was: right-click on the project -> Properties -> Runtime API -> GPU.

There are settings for GPU-Architecture(1), GPU-Architecture(2) and GPU-Architecture(3). They were set to

-sm 1.0 (GPU-Architecture(1))
-sm 2.0 (GPU-Architecture(2))
-0 (GPU-Architecture(3)).

I have only one GPU in my machine, with CUDA compute capability 1.2.

So I set GPU-Architecture(2) to 0 and GPU-Architecture(1) to sm 1.2. Now it works and the problem is solved.

Gabe_S · December 12, 2010, 11:12pm

I was experiencing this error with Visual Studio 2010 and Visual Studio 2008 when trying to compile samples from “CUDA by Example” that include __device__ calls. I am using Parallel Nsight 1.5 and was always getting the “EXTERNAL CALLS NOT SUPPORTED” error when compiling. After a long and frustrating voyage of trial and error, the response above gave me the idea that there may be differences between the debugging capability of __global__ kernels and __device__ functions, such as the one below:

__device__ int julia( int x, int y ) {
    const float scale = 1.5;
    float jx = scale * (float)(DIM/2 - x)/(DIM/2);
    float jy = scale * (float)(DIM/2 - y)/(DIM/2);

    cuComplex c(-0.8, 0.156);
    cuComplex a(jx, jy);

    int i = 0;
    for (i=0; i<200; i++) {
        a = a * a + c;
        if (a.magnitude2() > 1000)
            return 0;
    }

    return 1;
}

Basically, I disabled Generate GPU debug information for the files where __device__ functions exist, and things work. To disable it, right-click on your code file, then Properties -> Configuration Properties, and in VS 2010 go to CUDA C++ -> Device -> Generate GPU debug information -> No.

In VS 2008, it is under Runtime API -> GPU -> Generate GPU debug information -> No.

yiguro · February 14, 2011, 7:29pm

Try adding __device__ to the constructor line inside the struct:

struct cuComplex {
    float   r;
    float   i;
    cuComplex( float a, float b ) : r(a), i(b)  {}  //This line

so that it reads:

    __device__ cuComplex( float a, float b ) : r(a), i(b)  {}
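Spelled out a little more, assuming the struct is only ever used from device code as in the julia function above: every member that GPU code calls (the constructor, magnitude2 and the operators) needs the qualifier, since a member without it is a host function. The struct body below is a sketch written for illustration rather than a copy of the book's listing, and magnitudeKernel is a made-up name.

```cuda
struct cuComplex {
    float r;
    float i;

    // Without __device__ these members are host-only and cannot be called
    // from a __device__ function or a kernel.
    __device__ cuComplex(float a, float b) : r(a), i(b) {}

    // Squared magnitude: r^2 + i^2.
    __device__ float magnitude2() const { return r * r + i * i; }

    // Complex multiplication.
    __device__ cuComplex operator*(const cuComplex &a) const
    {
        return cuComplex(r * a.r - i * a.i, i * a.r + r * a.i);
    }

    // Complex addition.
    __device__ cuComplex operator+(const cuComplex &a) const
    {
        return cuComplex(r + a.r, i + a.i);
    }
};

// Hypothetical kernel that exercises the members: one step of a = a*a + c,
// followed by the squared magnitude of the result.
__global__ void magnitudeKernel(float *out)
{
    cuComplex c(-0.8f, 0.156f);
    cuComplex a(1.0f, 2.0f);
    a = a * a + c;
    *out = a.magnitude2();
}
```

If the same struct also has to be constructed on the CPU, marking the members __host__ __device__ instead is the usual variation.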

errold32 · April 19, 2011, 7:22am

Thanks yiguro

fienefie · May 23, 2011, 4:03am

I am having a similar problem. Can you explain further, preferably with some code?
