Identifier __hiloint2double is undefined

BHC · June 29, 2010, 1:36pm

I am trying to use this function in my kernel, and I am getting a compiler error that it is undefined.

double A;
int2 i;
...
A = __hiloint2double(i.y, i.x);

Do I need to include a special header or something to use this function?

BHC · June 30, 2010, 2:49pm

The problem is resolved… well sort of.

I set up my machine to dual boot Windows 7 and CentOS.

This problem affects my Windows 7 setup but not CentOS. Since I didn’t do anything fancy during the setup process, my conclusion is that this is some sort of an issue with the CUDA toolkit running on Windows 7.

Edit:
The trick seems to be telling nvcc to compile for the correct architecture. When I remove the -arch sm_20 from my compilation command, the problem returns, even on CentOS. I do not recall if I was using this flag on Windows 7, so it may not be an OS issue after all.

rolyluo · August 19, 2010, 8:51am

Hello

I have the same problem when using CUDA 3.0 on Red Hat Enterprise Linux:
error: identifier "__hiloint2double" is undefined

How did you solve it? Is it in the CUDA Toolkit library?

BHC · August 19, 2010, 1:21pm

Did you read my edit? If you have a device with compute capability 2.0, add “-arch sm_20” to your nvcc command or makefile. For me, the problem comes back when I remove this flag and goes away when I add it.

If your device has a lower compute capability, then I don’t think you can use this function. You’ll have to search for it and copy the source code into your application.

(begin speculation)

My suspicion is that this function was added at the same time the instruction set was changed to support native execution of double precision arithmetic. To facilitate backward compatibility (new code running on old devices), nvcc automatically compiles for one of the 1.X architectures, unless the arch flag is set. If you want the benefits of native double precision, or access to this function, you have to give up backward compatibility and compile for architecture 2.0.

(/end speculation)

Please post back if this fix also works for you.

tera · August 19, 2010, 1:57pm

[/end speculation]
__hiloint2double was indeed introduced together with double precision support in compute capability 1.3. Thus you’ll have to specify a compute capability or architecture >= 1.3 to nvcc as the default is 1.0.
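
A minimal sketch of the point above, assuming a toolkit and device with double precision support. The names combine_kernel and combine_on_device are hypothetical and not from this thread. The build line in the comment uses the sm_13 target discussed here; later toolkits dropped such old targets, so the flag would need adjusting there.

```cuda
#include <cuda_runtime.h>

// Build with an architecture of 1.3 or higher, for example (toolkits of this era):
//   nvcc -arch=sm_13 example.cu      (example.cu is a placeholder file name)

// Hypothetical kernel: builds a double from its high and low 32-bit words.
__global__ void combine_kernel(int hi, int lo, double *out)
{
    *out = __hiloint2double(hi, lo);
}

// Hypothetical host helper: launches the kernel once and returns the result.
// Error checking is omitted for brevity; on failure the result stays 0.0.
double combine_on_device(int hi, int lo)
{
    double *d_out = NULL;
    double h_out = 0.0;
    cudaMalloc((void **)&d_out, sizeof(double));
    combine_kernel<<<1, 1>>>(hi, lo, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}
```

For instance, combine_on_device(0x3FF00000, 0) should give 1.0, since 0x3FF0000000000000 is the IEEE 754 bit pattern of 1.0.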

rolyluo · August 20, 2010, 3:16am

I modified the common.mk from the SDK and used it as my makefile. It seems to include the “sm_20” flag by default. Here is the relevant part:

# Compiler-specific flags (by default, we always use sm_10 and sm_20), unless we use the SMVERSION template

GENCODE_SM10 := -gencode=arch=compute_10,code="sm_10,compute_10"
GENCODE_SM20 := -gencode=arch=compute_20,code="sm_20,compute_20"

But it does not work.

BTW, there is a flag “SMVERSIONFLAGS” in common.mk, so I added this to my makefile:
SMVERSIONFLAGS := -arch=sm_20

That does not work either.

BHC · August 20, 2010, 5:05am

Thank you. I was hoping to get some feedback on those comments.

BHC · August 20, 2010, 5:07am

Looks like your makefile is generating the flags for two architectures - 1.0 and 2.0. Which one is it compiling for? There should be some conditional logic somewhere in the makefile that selects between GENCODE_SM10 and GENCODE_SM20. Later on you specify sm_20 in the SMVERSIONFLAGS, but if the GENCODE_SM10 flags are getting paired with -arch=sm_20, it seems reasonable that nvcc would still produce the error.

I would suggest temporarily ditching the makefile and calling nvcc directly from the command line until you get this working. It will take a lot of time to figure out all the necessary nvcc args, but it will be worth it in the end because you’ll come away with a clear understanding of the compilation step. (Refer to your_cuda_install_directory/doc/nvcc_2.0.pdf for more help with nvcc.)
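
One way to check which architecture the device code was built for is the __CUDA_ARCH__ macro, which nvcc defines during each device compilation pass (200 for compute_20, 100 for compute_10, and so on). With several -gencode entries the device code is compiled once per listed architecture, so a single pass targeting 1.0 is enough to hit the undefined identifier. The sketch below is only an illustration; report_arch and compiled_arch are hypothetical names.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: stores the architecture this device pass was compiled for.
__global__ void report_arch(int *out)
{
#ifdef __CUDA_ARCH__            // undefined during the host compilation pass
    *out = __CUDA_ARCH__;       // e.g. 200 when the pass targets compute_20
#endif
}

// Hypothetical host helper: returns the value seen by the kernel that ran,
// or 0 if no kernel could be launched (error checking omitted for brevity).
int compiled_arch(void)
{
    int *d_out = NULL;
    int h_out = 0;
    cudaMalloc((void **)&d_out, sizeof(int));
    report_arch<<<1, 1>>>(d_out);
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}
```

A returned value below 130 would mean the code that ran was built for an architecture without __hiloint2double.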

rolyluo · August 20, 2010, 12:25pm

Thank you for your advice.

When I compile only with GENCODE_SM20 := -gencode=arch=compute_20,code="sm_20,compute_20" and not the SM10 one, it works.

But there is another problem at runtime. The error is:

Cuda error in file ‘texture.h’ in line 20 : invalid texture reference.

I use int2 to fetch doubles through a texture. It worked in CUDA 2.3, but it fails in CUDA 3.1.

The code in my texture.h is:

texture<int2, 1>  tex_x_double;

void bind_vec(double *x)
{
   CUDA_SAFE_CALL(cudaBindTexture(NULL, tex_x_double, x));
}

__inline__ __device__ double fetch_vec(const int &i, double *x)
{
	int2 v = tex1Dfetch(tex_x_double, i);
	return __hiloint2double(v.y, v.x);//avoid banks conflict
}
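
The argument order in fetch_vec follows from how a double looks when it is read as an int2: with the little-endian layout used by CUDA devices, .x holds the low 32 bits and .y the high 32 bits, hence __hiloint2double(v.y, v.x). The sketch below checks that round trip through plain global memory instead of a texture, so it says nothing about the invalid texture reference error itself. The names rebuild_kernel and roundtrip_on_device are hypothetical.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: reads one double as an int2 and rebuilds it.
__global__ void rebuild_kernel(const int2 *raw, double *out)
{
    int2 v = raw[0];                       // .x = low word, .y = high word
    out[0] = __hiloint2double(v.y, v.x);
}

// Hypothetical host helper: sends a double to the device, views it as int2
// there, and returns the rebuilt value (error checking omitted for brevity).
double roundtrip_on_device(double value)
{
    double *d_in = NULL;
    double *d_out = NULL;
    double result = 0.0;
    cudaMalloc((void **)&d_in, sizeof(double));
    cudaMalloc((void **)&d_out, sizeof(double));
    cudaMemcpy(d_in, &value, sizeof(double), cudaMemcpyHostToDevice);
    rebuild_kernel<<<1, 1>>>(reinterpret_cast<const int2 *>(d_in), d_out);
    cudaMemcpy(&result, d_out, sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

If the words were passed in the other order, the rebuilt value would generally differ from the input.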

BHC · August 20, 2010, 4:15pm

I’m very glad to hear you got past the compilation problem. It’s also good that you got the architecture flag set correctly, since a wrong setting could have caused suboptimal performance.

I suggest you create a new thread about the texture problem. I’m not exactly an expert on using textures, and I’m also extremely busy today trying to meet a deadline, so I won’t get to look at this until late tonight. If you create a new thread, more people will see it.

Thanks again for posting back that you got it working.
