Hi guys. I've been stuck on this error for a couple of days. I have a Mac with a GeForce 320M, and I'm running the following code:
__device__ float d_endAgents;
float h_endAgents;
void callingKernel(dim3 dimBlock, dim3 dimThread){
h_endAgents = -1.0;
cudaMemset(&d_endAgents,-1.0,sizeof(float));
kernel<<< dimBlock, dimThread >>> ();
cudaThreadSynchronize();
cudaMemcpyFromSymbol(&h_endAgents, "d_endAgents", sizeof(float),0, cudaMemcpyDeviceToHost);
printf("\n[%f]", h_endAgents);
}
__global__ void kernel(){
d_endAgents = -100.0;
}
This code prints 0 as the value of d_endAgents. I've tried many different ways to implement it, but all of them show the same anomaly. What can I do to get the correct value back?
Sorry about any grammar mistakes
tera
January 4, 2012, 7:49pm
2
You can't use the & operator in host code to obtain device addresses. You need to call cudaGetSymbolAddress() instead to obtain the address.
In the same way, you cannot write
cudaMemcpy(&h_endAgents, &d_endAgents, sizeof(float),cudaMemcpyDeviceToHost);
but need to use
cudaMemcpyFromSymbol(&h_endAgents, "d_endAgents", sizeof(float),0, cudaMemcpyDeviceToHost);
instead.
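To make that concrete, here is a minimal sketch of both routes: copying to and from the symbol directly, and asking the runtime for a device pointer with cudaGetSymbolAddress(). It assumes a recent toolkit (CUDA 5.0 or later), where the symbol itself is passed instead of the string "d_endAgents" and cudaDeviceSynchronize() replaces cudaThreadSynchronize(). The helper name runAndReadBack and the <<<1, 16>>> launch are only illustrative. Note also that cudaMemset() fills bytes, so it cannot store a float such as -1.0; the sketch uses cudaMemcpyToSymbol() for the initial value.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__device__ float d_endAgents;

__global__ void kernel() {
    d_endAgents = -100.0f;
}

// Hypothetical helper (not from the original post): initialises the
// device symbol, runs the kernel, and returns the value read back.
float runAndReadBack() {
    float h_endAgents = -1.0f;

    // Route 1: copy a real float to the symbol. cudaMemset would only
    // fill individual bytes, which cannot represent -1.0f.
    cudaMemcpyToSymbol(d_endAgents, &h_endAgents, sizeof(float));

    // Route 2: ask the runtime for a device pointer to the symbol.
    // This pointer (unlike &d_endAgents taken in host code) is valid
    // for cudaMemcpy and cudaMemset.
    float *d_ptr = NULL;
    cudaGetSymbolAddress((void **)&d_ptr, d_endAgents);

    kernel<<<1, 16>>>();
    cudaDeviceSynchronize();

    cudaMemcpy(&h_endAgents, d_ptr, sizeof(float), cudaMemcpyDeviceToHost);
    return h_endAgents;
}

int main() {
    printf("\n[%f]", runAndReadBack());
    return 0;
}
```

The final cudaMemcpy() through d_ptr could equally be a cudaMemcpyFromSymbol(&h_endAgents, d_endAgents, sizeof(float)) call; both read the same device variable.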
After making some modifications since yesterday, I've found the problem. When I used the dim3 variables to specify the number of threads and blocks, the kernel wasn't launching properly for some reason. I changed the launch to <<< 1, 16 >>> and it worked. I'm still investigating why. Thanks, tera.
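One way to investigate a launch that silently does nothing is to check the error status right after the launch and again after synchronizing. The sketch below is only an illustration: the helper name launchChecked is made up, the kernel is empty, and it assumes a toolkit that provides cudaDeviceSynchronize() (older ones use cudaThreadSynchronize()). The second call in main deliberately requests 2048 threads per block, which is above the per-block limit of the GPUs I know of (512 or 1024), so it should report an error instead of running.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void kernel() {}

// Hypothetical helper: launches the kernel with the given configuration
// and returns the first error the runtime reports, if any.
cudaError_t launchChecked(dim3 dimBlock, dim3 dimThread) {
    kernel<<<dimBlock, dimThread>>>();

    // Errors in the launch configuration show up here.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));
        return err;
    }

    // Errors raised while the kernel runs show up after synchronizing.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
    }
    return err;
}

int main() {
    launchChecked(dim3(1), dim3(16));    // the configuration that worked
    launchChecked(dim3(1), dim3(2048));  // too many threads per block
    return 0;
}
```

If the dim3 values passed to callingKernel are printed alongside the error string, it should become clear whether the original configuration was simply out of range for the device.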