Difference in the treatment of pointers to host memory in kernel arguments between Pascal and Volta

Hello,

From what I understand, passing a pointer argument to a CUDA kernel with the runtime API just copies the value of the pointer (the address it points to) to the GPU, the same way any other argument is passed by value. I noticed by accident that the following piece of code works on Volta, to my surprise.

#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Prints its scalar arguments, then writes through v.
__global__ void myKernel(int a, int b, int *v)
{
    printf("Hello a:%d b:%d \n", a, b);
    *v = 1;
}

int main(int argc, char **argv)
{
    int v = 0;
    // &v is the address of a host stack variable, not a device allocation.
    myKernel<<<1,1>>>(1, 2, &v);
    if (cudaSuccess != cudaDeviceSynchronize()) {
        printf(" Error \n");
        exit(1);
    }
    fflush(stdout);
    printf(" after myKernel: v:%d \n", v);

    return 0;
}

It does not work on Pascal, as I would expect. Can someone explain whether this is expected behavior on Volta, and where in the documentation it is explained?

Hi moralessilva2,
Is it possible that your Volta box is running a recent Linux kernel (>= 4.14) while the Pascal box has an older version?

According to EVERYTHING YOU NEED TO KNOW ABOUT UNIFIED MEMORY, slides 46-47, “Heterogeneous Memory Management” should be available on Linux starting from kernel 4.14, and allows exactly what you posted.

I haven’t been able to test it myself, as I’m still waiting for a kernel update on one of the Pascal or Volta boxes I have access to…
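
One way to check whether a given box supports this kind of access, as far as I know, is to query the device attribute cudaDevAttrPageableMemoryAccess from the runtime API. The following is only a minimal sketch: the helper name pageableAccessSupported is made up for illustration, and it assumes the current device is the one the kernel would run on.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the CUDA API).
// Returns 1 if the device reports that it can access pageable host memory
// directly, 0 if it reports that it cannot, and -1 if the query failed.
static int pageableAccessSupported(int device)
{
    int value = 0;
    cudaError_t err = cudaDeviceGetAttribute(
        &value, cudaDevAttrPageableMemoryAccess, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceGetAttribute failed: %s\n",
                cudaGetErrorString(err));
        return -1;
    }
    return value != 0;
}

int main(void)
{
    int device = 0;
    cudaError_t err = cudaGetDevice(&device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDevice failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int supported = pageableAccessSupported(device);
    if (supported < 0) {
        return 1;
    }
    printf("device %d: pageable host memory access %s\n",
           device, supported ? "reported" : "not reported");
    return 0;
}
```

If the attribute comes back as 0, passing the address of a host stack variable to a kernel, as in the code posted above, should not be expected to work on that system.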

Your code doesn’t work for me on Volta.

Run your code with cuda-memcheck.

Also, are you running on an IBM Power9 system?
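
Alongside cuda-memcheck, it may help to print the runtime's own error text instead of a bare " Error " message, so that the outputs from the two machines can be compared. This is a sketch of the posted program with that change; reportCudaError is a hypothetical helper introduced here, not something from the original code or the CUDA API.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helper: prints the name and description of a CUDA error.
// Returns 0 when err is cudaSuccess and 1 otherwise.
static int reportCudaError(const char *what, cudaError_t err)
{
    if (err == cudaSuccess) {
        return 0;
    }
    fprintf(stderr, "%s failed: %s (%s)\n",
            what, cudaGetErrorName(err), cudaGetErrorString(err));
    return 1;
}

// Same kernel as in the original post.
__global__ void myKernel(int a, int b, int *v)
{
    printf("Hello a:%d b:%d \n", a, b);
    *v = 1;
}

int main(void)
{
    int v = 0;

    // &v is still a host stack address, exactly as in the original post.
    myKernel<<<1, 1>>>(1, 2, &v);

    // Errors detected at launch time are reported by cudaGetLastError.
    if (reportCudaError("kernel launch", cudaGetLastError())) {
        return 1;
    }
    // Errors raised while the kernel runs surface at synchronization.
    if (reportCudaError("cudaDeviceSynchronize", cudaDeviceSynchronize())) {
        return 1;
    }

    printf(" after myKernel: v:%d \n", v);
    return 0;
}
```

On a system where the GPU cannot reach pageable host memory, I would expect the failure to show up at the synchronize call rather than at launch, but that is worth confirming on the actual hardware.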