Hi,
I am trying to test the two functions cuMemGetAccess and cuMemSetAccess, but nvcc reports undefined references to them.
My CUDA version is 10.2, and I can see the declarations of both functions in the header file /usr/local/cuda-10.2/targets/aarch64-linux/include/cuda.h. I tried both #include <cuda.h> and #include "/usr/local/cuda-10.2/targets/aarch64-linux/include/cuda.h", but neither solves the issue.
Have I missed anything? Any help is appreciated.
Hi,
Please use #include <cuda.h>.
Then add /usr/local/cuda-10.2/include to the include path and -lcudart to the linked libraries.
Below is an example that also uses the driver API, for your reference:
/usr/local/cuda-10.2/samples/0_Simple/matrixMul
Thanks.
Thanks for the reply.
Now my compilation command is this:
nvcc my_program.cu -O0 -g -lcudart -I/usr/local/cuda-10.2/include/ -o my_program
But the problem still exists. Could you explain more about the solution?
Hi,
Could you attach the source of my_program.cu so we can check it?
Thanks.
This is my source code:
#include <stdlib.h>
#include <time.h>
#include <stdio.h>
#include <thread>
#include <assert.h>
#include <sys/mman.h>
#include <errno.h>
#include <string.h>
#include <cuda.h>

__global__ void gpuAccess(char * tmp) {
    tmp[0] = 'A';
}

void cpuAccess(char * tmp) {
    tmp[0] = 'B';
}

int main() {
    char * tmp;
    cudaMallocManaged((void**)&tmp, sizeof(char)*10000*4096);
    for(int i = 0; i < 10000; ++i) {
        cpuAccess(tmp+i*4096);
        unsigned long long flag;
        cuMemGetAccess(&flag, 0, (CUdeviceptr)(tmp+i*4096));
        // %llu matches unsigned long long
        fprintf(stderr, "flag = %llu\n", flag);
        gpuAccess<<<1, 1>>>(tmp+i*4096);
        cudaDeviceSynchronize();
    }
    return 0;
}
Hi,
Sorry for the mistake.
Since you are using the CUDA driver API, the library to link is -lcuda (driver API) rather than -lcudart (runtime API).
$ nvcc my_program.cu -O0 -g -lcuda -I/usr/local/cuda-10.2/include/ -o my_program
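To check the link step in isolation, a minimal sketch along the following lines may help. It calls only a few driver API functions (cuInit, cuDriverGetVersion, cuDeviceGetCount, cuGetErrorName), all of which live in libcuda. The file name driver_check.cu and the helper check() are hypothetical names chosen for illustration; they are not part of your program. If this builds with -lcuda but gives undefined references without it, the original problem was the missing driver library.

```cuda
// driver_check.cu: hypothetical file name, used for illustration only.
// Assumed build command, mirroring the one above:
//   nvcc driver_check.cu -lcuda -I/usr/local/cuda-10.2/include/ -o driver_check
#include <cstdio>
#include <cuda.h>

// Hypothetical helper: returns true on success, otherwise prints the
// symbolic name of the driver API error and returns false.
static bool check(CUresult res, const char *what) {
    if (res == CUDA_SUCCESS) return true;
    const char *name = nullptr;
    cuGetErrorName(res, &name);  // leaves name as NULL for unrecognized codes
    fprintf(stderr, "%s failed: %s\n", what, name ? name : "unknown error");
    return false;
}

int main() {
    // The driver API has to be initialized explicitly before other cu* calls.
    if (!check(cuInit(0), "cuInit")) return 1;

    int version = 0;
    if (!check(cuDriverGetVersion(&version), "cuDriverGetVersion")) return 1;
    printf("driver version: %d\n", version);

    int count = 0;
    if (!check(cuDeviceGetCount(&count), "cuDeviceGetCount")) return 1;
    printf("devices: %d\n", count);
    return 0;
}
```

The same pattern of checking the returned CUresult can be applied to cuMemGetAccess in your loop, so that a failing call is reported instead of printing an uninitialized flag.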
Thanks.