constant memory which is device-side only (avoiding cudaMemcpyToSymbol)

zjw518 · June 15, 2017, 3:28pm

I have some numerical constants (constant over the life of the application, the same values called by each thread every time a kernel is called) which I would like to reside in the device-side constant cache. Namely, I do not need them on the host, and would like to avoid using cudaMemcpyToSymbol. (Note that some of these constants are arrays which functions need the pointer to, so #define’ing is no good.)

If I set my constants as such:

__device__ __constant__ constNumber[4] = {1,2,3,4};

something goes wrong - the values, as implemented in the kernel, seem to be zero. So I’m guessing the above is the wrong way to go about this.

If there is a way to use constant and managed together, that would be appreciated as well. TIA!

Robert_Crovella · June 15, 2017, 4:51pm

what you have posted wouldn’t compile, there is no variable type (int, float, etc.) specification:

However if we fix that, what you have posted seems to work for me:

$ cat t152.cu
#include <stdio.h>

__device__ __constant__  int constNumber[4] = {1,2,3,4};

__global__ void t(){

  printf("%d\n", constNumber[3]);
}

int main(){

  t<<<1,1>>>();
  cudaDeviceSynchronize();
}
$ nvcc -arch=sm_61 -o t152 t152.cu
$ cuda-memcheck ./t152
========= CUDA-MEMCHECK
4
========= ERROR SUMMARY: 0 errors
$

This is static initialization of a device variable, and subject to some limitations, I believe it should work. Initialization of an array (or maybe it was array-of-struct) this way might be problematic, I think I’ve tried that before and you effectively need a C++11 style initialization, IIRC

zjw518 · June 15, 2017, 4:58pm

Right about the type - I wasn’t copying from my code, of course.

I get illegal memory access errors when trying to refer to constNumber from within a kernel. I forgot to mention in the OP that I was wanting to avoid passing pointers to these constants to each kernel, and then passing to each function called by the kernel, etc. This is not necessary when declaring constNumber as device managed - the device functions/kernels are aware of those addresses already. So I was hoping to avoid breaking this while still holding the values in the constant cache.

Thank you!

Robert_Crovella · June 15, 2017, 5:18pm

Not sure what you’re talking about. The example I gave doesn’t involve passing pointers to the constants to the kernel, nor does it throw any illegal memory access errors.

zjw518 · June 15, 2017, 5:22pm

I wasn’t using your code; I was doing as I said, which was using these values in device and global functions. This is where I get the memory error.

njuffa · June 15, 2017, 5:22pm

Maybe I am misunderstanding the use case, but statically initialized constant data has worked in CUDA from the very beginning. Here are a couple of relevant snippets from the CUDA 6.5 math library (inside the file math_functions_dbl_ptx3.h):

/* 1152 bits of 2/PI for Payne-Hanek style argument reduction. */
static __constant__ unsigned long long int __cudart_i2opi_d [] = {
  0x6bfb5fb11f8d5d08ULL,
  0x3d0739f78a5292eaULL,
  0x7527bac7ebe5f17bULL,
  0x4f463f669e5fea2dULL,
  0x6d367ecf27cb09b7ULL,
  0xef2f118b5a0a6d1fULL,
  0x1ff897ffde05980fULL,
  0x9c845f8bbdf9283bULL,
  0x3991d639835339f4ULL,
  0xe99c7026b45f7e41ULL,
  0xe88235f52ebb4484ULL,
  0xfe1deb1cb129a73eULL,
  0x06492eea09d1921cULL,
  0xb7246e3a424dd2e0ULL,
  0xfe5163abdebbc561ULL,
  0xdb6295993c439041ULL,
  0xfc2757d1f534ddc0ULL,
  0xa2f9836e4e441529ULL,
};

// [...]

#pragma unroll 1
  for (q = (idx-1); q < min(18,idx+3); q++) {
    p = __internal_umad64wide (__cudart_i2opi_d[q], ia, p.y);
    result[q-(idx-1)] = p.x;
  }

zjw518 · June 15, 2017, 5:29pm

I apologize - no memory access errors. (I’m on shoddy wifi so files may not have transferred as I thought.)

But the original issue posed in the OP still stands. To be concrete:

I declare constant values via

__device__ __constant__ int constNumbers[4] = {1,2,3,4};

which I then want to use directly with the kernel

__global__ void kernel()
{
    // do stuff with constNumbers such as
    int x  = constNumbers[1];
    printf("%d",x);
}

But the output of the kernel is as if constNumbers[1] = 0.

njuffa · June 15, 2017, 5:35pm

In #2 txbob already showed that the stripped-down version of your code works just fine. This would seem to suggest you have a bug somewhere in your code.

BTW, I am not sure what the addition of device to constant is supposed to do, because constant is on the device already.

zjw518 · June 15, 2017, 5:39pm

You both are correct; I apologize for misreading. I am not sure how it can be a bug in my code, for all I do is replace a managed keyword with constant which causes the issue. I will look into it.

And the device was there because the variables were originally device managed – good to know it’s not needed.

zjw518 · June 15, 2017, 6:08pm

I resolved my issue; I forgot that I call a host device function exactly once from the host which uses those values. I suppose this was worth it for a 7% speed increase…

The compiler tells me that constant memory can’t be managed - is there any way to have the same variable name for both constant and host-side memory?

Thank you again for your help!

Topic		Replies	Views
Device constant memory from shared object CUDA Programming and Performance	1	6416	June 9, 2009
__constant__ and __device__ memory access CUDA Programming and Performance	4	5946	April 10, 2012
Tables not correct when using __constant__ CUDA Programming and Performance	12	14827	July 28, 2010
Constant variabiles scope CUDA Programming and Performance	5	1777	June 9, 2009
Constant Memory problem CUDA Programming and Performance	12	1370	January 30, 2015
constant memory usage problem unexpected behavior using constant memory CUDA Programming and Performance	15	9704	April 1, 2009
cudaMemcpyToSymbol not working CUDA Programming and Performance	6	3398	December 22, 2017
Constant memory problem in CUDA 1.0? CUDA Programming and Performance	5	12122	February 26, 2008
copy device memory to constant memory CUDA Programming and Performance	4	12483	November 11, 2008
__constant__ stopped working CUDA Programming and Performance	20	4525	October 17, 2009

__constant__ memory which is device-side only (avoiding cudaMemcpyToSymbol)

Related topics

constant memory which is device-side only (avoiding cudaMemcpyToSymbol)