I’m trying to write a simulation of individual particles that interact with a medium with certain properties (the particles do not interact with each other, only with the medium under test). I want to run a number of these simulations in parallel and then write the combined results to a file. At present I’m having trouble getting my head around how to use shared memory. Part of the results is an average of the sum of interactions by the particles, but I don’t know how to compute it. Here is an example of what I’m trying to do:
Host code:
// particle.cu
// includes, system
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <math.h>
// includes, project
#include <cutil.h>
#include <tmp_kernel.cu>
int main (int argc, char** argv)
{
    float h_total;
    dim3 threads (3, 1, 1);
    dim3 grid (1, 1);
    h_total = 0;
    particleSim <<< grid, threads >>> (&h_total);
    printf ("h_total = %.2f\n", h_total);
}
Kernel file:
// particle_kernel.cu
__shared__ float total;
__global__ void
particleSim(float *h_total)
{
    total = *h_total;
    __syncthreads();
    total++;
    __syncthreads();
    *h_total = total;
}
I can compile the program and run it, but the output is 0.00, and I expected at least 3.00 (one increment per thread). I need to figure this out because I’ll be using an overall sum to compute the average at the end of the simulation.
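An output of 0.00 is consistent with two problems in the example: `&h_total` is a host address, which a kernel generally cannot dereference, and `total++` executed by several threads on the same `__shared__` variable is a race. A minimal sketch of one common alternative follows. It assumes a GPU that supports `atomicAdd` on `float` (compute capability 2.0 or later); the names `particleSimSum` and `runParticleSum` are hypothetical, and the `1.0f` stands in for one particle's contribution.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread adds its contribution to a single total in global (device)
// memory. atomicAdd serializes the updates so no increment is lost.
__global__ void particleSimSum(float *d_total)
{
    atomicAdd(d_total, 1.0f);  // 1.0f is a placeholder contribution
}

// Hypothetical helper: launches one block of n_threads threads and returns
// the accumulated total. Error checking is omitted for brevity.
float runParticleSum(int n_threads)
{
    float h_total = 0.0f;
    float *d_total = NULL;

    // The kernel needs a device pointer, so allocate device memory and
    // copy the starting value into it.
    cudaMalloc((void **)&d_total, sizeof(float));
    cudaMemcpy(d_total, &h_total, sizeof(float), cudaMemcpyHostToDevice);

    particleSimSum<<<1, n_threads>>>(d_total);

    // This blocking copy waits for the kernel to finish before reading.
    cudaMemcpy(&h_total, d_total, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_total);
    return h_total;
}

int main()
{
    float total = runParticleSum(3);
    printf("h_total = %.2f\n", total);
    return 0;
}
```

The average can then be computed on the host by dividing the returned total by the number of particles.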
Unlike the example I’ve shown, I thought the way to do this was to use:
CUDA_SAFE_CALL (cudaMemcpyToSymbol (total, &h_total,
sizeof (float), cudaMemcpyHostToDevice));
from inside particle.cu (host code), but I get an error saying that the variable has not been declared, although it is clearly a global variable (in the sense that it is declared outside the main function).
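For comparison, here is a hedged sketch of how `cudaMemcpyToSymbol` is typically used. It targets variables declared `__device__` or `__constant__` rather than `__shared__`, and the symbol has to be visible in the file that makes the call. In the current runtime API the fourth parameter is a byte offset and the copy direction is the fifth. The names `d_total`, `particleSimSymbol` and `runSymbolSum` are hypothetical, and `atomicAdd` on `float` again assumes compute capability 2.0 or later.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// A __device__ variable lives in global memory and is visible to every
// thread in every block, unlike a __shared__ variable.
__device__ float d_total;

__global__ void particleSimSymbol()
{
    atomicAdd(&d_total, 1.0f);  // placeholder contribution per thread
}

// Hypothetical helper: seeds the device total with `start`, runs one block
// of n_threads threads, and reads the total back. No error checking.
float runSymbolSum(float start, int n_threads)
{
    float h_total = start;

    // Arguments: symbol, source, size in bytes, byte offset, direction.
    cudaMemcpyToSymbol(d_total, &h_total, sizeof(float), 0,
                       cudaMemcpyHostToDevice);

    particleSimSymbol<<<1, n_threads>>>();

    cudaMemcpyFromSymbol(&h_total, d_total, sizeof(float), 0,
                         cudaMemcpyDeviceToHost);
    return h_total;
}

int main()
{
    printf("h_total = %.2f\n", runSymbolSum(0.0f, 3));
    return 0;
}
```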
I just need to know how to use a variable that can be accessed by all threads in the simulation for keeping a running total. There may be a better way to do this, but it’s not clear to me from what I’ve read in the API or in the examples. The only example that looks something like this is the Monte Carlo simulation, but it does not pass the results back to the host.
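Shared memory is visible only to the threads of one block, so it is usually used as per-block scratch space rather than as a simulation-wide total. A sketch of that pattern, under stated assumptions: each block sums its particles' values in shared memory, then one thread per block adds the partial sum to a global total. The per-particle interaction counts are supplied as an input array here purely for illustration; `particleSimReduce`, `averageInteractions` and `THREADS_PER_BLOCK` are hypothetical names, and `atomicAdd` on `float` assumes compute capability 2.0 or later.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define THREADS_PER_BLOCK 128  // assumed; must be a power of two

__global__ void particleSimReduce(const float *d_interactions, int n,
                                  float *d_total)
{
    // One slot per thread, shared only within this block.
    __shared__ float partial[THREADS_PER_BLOCK];

    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    // Threads past the end of the data contribute zero.
    partial[tid] = (i < n) ? d_interactions[i] : 0.0f;
    __syncthreads();

    // Tree reduction: halve the number of active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride)
            partial[tid] += partial[tid + stride];
        __syncthreads();
    }

    // One atomic update per block instead of one per thread.
    if (tid == 0)
        atomicAdd(d_total, partial[0]);
}

// Hypothetical helper: returns the mean of n values. No error checking.
float averageInteractions(const float *h_interactions, int n)
{
    float *d_interactions = NULL, *d_total = NULL;
    float h_total = 0.0f;
    int blocks = (n + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;

    cudaMalloc((void **)&d_interactions, n * sizeof(float));
    cudaMalloc((void **)&d_total, sizeof(float));
    cudaMemcpy(d_interactions, h_interactions, n * sizeof(float),
               cudaMemcpyHostToDevice);
    cudaMemcpy(d_total, &h_total, sizeof(float), cudaMemcpyHostToDevice);

    particleSimReduce<<<blocks, THREADS_PER_BLOCK>>>(d_interactions, n,
                                                     d_total);

    cudaMemcpy(&h_total, d_total, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_interactions);
    cudaFree(d_total);
    return h_total / n;
}

int main()
{
    float h_interactions[300];
    for (int k = 0; k < 300; ++k)
        h_interactions[k] = 2.0f;  // made-up sample data
    printf("average = %.2f\n", averageInteractions(h_interactions, 300));
    return 0;
}
```

Because floating-point addition is not associative, the order of the atomic updates can change the last digits of the total for general inputs; whether that matters depends on the precision the simulation needs.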
I could use some suggestions.
Craig