I tried to use `__CUDA_ARCH__`, but I read that this macro is only defined in the device portion of the code.
After that, I came across this code on GitHub: https://github.com/divijverma/NAMD/blob/dce41777d099e647965a8dc5e3fec1925027110f/src/DeviceCUDAkernel.cu
Is there a better way to achieve this? I ask because I would like to determine, from host code, whether the GPU supports unified memory. If it does, allocations would use cudaMallocManaged; otherwise they would fall back to cudaMalloc and cudaMemcpy.
Example of what I would like to do:
int main() {
// IF CUDA >= 6.0 && COMPUTE CAPABILITY >= 3.0
// USE cudaMallocManaged
// ELSE
// USE cudaMallocs && cudaMemcpys
// END IF
return 0;
}
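One way to sketch this (an assumption on my part, not from the linked NAMD code) is to make the decision at runtime with cudaGetDeviceProperties: the cudaDeviceProp::managedMemory field, added in CUDA 6.0, reports whether the device supports cudaMallocManaged, and the compile-time CUDART_VERSION macro guards against toolkits that predate the managed memory API entirely. A minimal sketch:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

int main() {
    int device = 0;
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        fprintf(stderr, "No usable CUDA device found\n");
        return 1;
    }

#if CUDART_VERSION >= 6000
    // managedMemory is nonzero when the device supports cudaMallocManaged
    // (which in practice implies compute capability >= 3.0).
    bool useManaged = prop.managedMemory != 0;
#else
    // Toolkits older than 6.0 have no managed memory API at all.
    bool useManaged = false;
#endif

    size_t bytes = 1024 * sizeof(float);
    float *devData = nullptr;

    if (useManaged) {
#if CUDART_VERSION >= 6000
        // Single pointer, accessible from both host and device.
        cudaMallocManaged(&devData, bytes);
        // ... use devData on host and in kernels directly ...
        cudaFree(devData);
#endif
    } else {
        // Classic path: separate host/device buffers with explicit copies.
        float *hostData = (float *)malloc(bytes);
        cudaMalloc(&devData, bytes);
        cudaMemcpy(devData, hostData, bytes, cudaMemcpyHostToDevice);
        // ... launch kernels on devData, then copy results back ...
        cudaMemcpy(hostData, devData, bytes, cudaMemcpyDeviceToHost);
        cudaFree(devData);
        free(hostData);
    }
    return 0;
}
```

The compile-time `#if CUDART_VERSION >= 6000` and the runtime `managedMemory` check address different things: the former decides whether the cudaMallocManaged call can even appear in the binary, the latter whether the GPU actually present at run time supports it.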