I'm new to CUDA and want to check whether my streams actually run concurrently, but nsys-ui shows no visualization graph for them.
OK, I found the problem.
I was running a kernel like this:
__global__ void kernel_1(double sum){
    for(int i = 0 ; i < N ; ++i ){
        sum += tan(0.1) * tan(0.1);
    }
}
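nvcc optimizes the kernel body away, because sum is passed by value and the result is never written anywhere, so the kernel has no observable output.
After changing the code to: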
__global__ void kernel_1(double * sum){
    for(int i = 0 ; i < N ; ++i ){
        *sum += tan(0.1) * tan(0.1);
    }
}
It works.
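For anyone who wants to reproduce this, here is a minimal sketch of a complete program that launches the fixed kernel on two streams so the overlap can be looked for in the nsys timeline. The value of N, the number of streams, the <<<1, 1>>> launch configuration and the one-buffer-per-stream layout are all my assumptions, since the post above does not show them. A single thread per launch is used on purpose: with more threads, the unsynchronized *sum += would be a data race.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 1000000  // assumed loop count; the original post does not give N

// Same kernel as above: writing through the pointer gives the loop a
// visible side effect, so the compiler has to keep it.
__global__ void kernel_1(double *sum) {
    for (int i = 0; i < N; ++i) {
        *sum += tan(0.1) * tan(0.1);
    }
}

int main() {
    const int kStreams = 2;          // assumed number of streams
    cudaStream_t streams[kStreams];
    double *d_sum[kStreams];         // one output buffer per stream (assumption)

    // Create the streams and zero-initialize each output buffer.
    for (int s = 0; s < kStreams; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&d_sum[s], sizeof(double));
        cudaMemset(d_sum[s], 0, sizeof(double));
    }

    // Launch one single-thread kernel per stream; kernels in different
    // non-default streams are allowed to overlap on the device.
    for (int s = 0; s < kStreams; ++s) {
        kernel_1<<<1, 1, 0, streams[s]>>>(d_sum[s]);
    }

    // Report launch errors, then wait for all streams to finish.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Copy each result back and release the resources.
    for (int s = 0; s < kStreams; ++s) {
        double h_sum = 0.0;
        cudaMemcpy(&h_sum, d_sum[s], sizeof(double), cudaMemcpyDeviceToHost);
        printf("stream %d: sum = %f\n", s, h_sum);
        cudaFree(d_sum[s]);
        cudaStreamDestroy(streams[s]);
    }
    return 0;
}
```

Assuming the file is saved as streams.cu (a made-up name), something like `nvcc -o streams streams.cu` followed by `nsys profile ./streams` should produce a report that can be opened in nsys-ui. Whether the two kernels actually overlap depends on the GPU and on how long each kernel runs, so N may need to be increased to see it clearly.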