Particles Sample: questions about the particles sample of the GPU Computing SDK

Hi everybody,

First of all I want to introduce myself: I am a 25-year-old student at a German university, and I am currently working on a term paper about the particle simulation written by Simon Green. My task is to modify the program. I have already started to get acquainted with CUDA and think I have a rough understanding of it.

I use Windows 7 with VS 2008 Pro and a GeForce 8800 GTX to start with. My problem is that I cannot use Parallel Nsight because my graphics card is too old, so I can only debug the C/C++ code and have only limited access to the CUDA part.

I have several questions regarding the source code, so I would really like to know if there is anybody here who has already worked with this program. To come to the point: I do not want finished code, but I would very much appreciate some help.

PS: Please excuse my English; it is still quite far from perfect.

Take a look at the Nvidia CUDA SDK examples. I think the package has a particle simulation example. :)

Maybe what I wrote was a bit misunderstood: I was referring to the CUDA SDK particles example. I have to modify this program and I am looking for somebody who might be able to help me.

Here are the first two of several questions:

  1. Where in the code is the actual call to the function “update”, which calls “integrateSystem”, which in turn calls the CUDA kernel “integrate”? I was able to find “update” but cannot find its calling routine. I would like to add a variable “simulation_time” to the program so that it can operate according to the actual elapsed simulation time. Therefore I would like to know where to set up this variable and then find a way to get it into the kernel.

  2. The second question is about the start of the program. I was not able to pinpoint the place in the code where I can modify the starting positions of the first particles. I would also like to alter the geometry and position of the sphere of particles that you can add at any time.

I hope it is now a little bit clearer.

Try using “find in files” (ctrl+shift+f) in Visual Studio. The “update” method is called from the “display()” function in “particles.cpp”. It already supports a variable time step (there’s a slider for this).

I am sorry, I already knew about the display() function. What I was looking for is the routine that calls display(). I used Ctrl+Shift+F to look for “integrate”, “integrateSystem” and “update”, but for some reason I could not find the calling routine for “display”. But maybe I do not have to use it after all; here is why I thought I would:

One of my goals is to make the surrounding cube vibrate, first of all only in one direction, probably along the x-axis. Therefore I would like to use a sine function to vary the position of the cube walls according to the elapsed simulation time. That is also why I wanted a variable which holds the overall simulation time. As I understand it, “timestep” only contains the time between two simulation steps, so it would not help with my particular problem. I wanted to start “simtime” at zero, update it every simulation step, and move the cube walls along the x-axis accordingly. Hence I would pass “simtime” from the top function down to the “integrate” routine (see the sketch below). The goal is to shake the particles after they have settled on the ground.

Later on I may find a way to pass the impulse from the wall on to the particles; I just have not figured out how to do that yet. But in the meantime the particles will bounce off the wall and get no rest, because the wall is moving and “hits” them.
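To make it a little more concrete, here is roughly what I have in mind; just a sketch with names I made up myself (“simtime”, “wallAmplitude”, “wallOmega”, “collideMovingWallsX”), not working code from the sample:

    // Host side (wherever update() is called): accumulate the total
    // simulated time and hand it down together with the step, e.g.
    //     simtime += timestep;
    //     psystem->update(timestep, simtime);   // would need an extra parameter
    //
    // Device side (in the integration step in particles_kernel.cu):
    // displace the x-walls of the cube with a sine of the simulation time.
    __device__ void collideMovingWallsX(float3 &pos, float3 &vel, float simtime,
                                        float wallAmplitude, float wallOmega,
                                        float particleRadius, float boundaryDamping)
    {
        float wallOffset = wallAmplitude * sinf(wallOmega * simtime);

        // same clamp-and-damp reflection the sample already does for the
        // fixed walls, only the wall position now oscillates around +/-1
        if (pos.x >  1.0f + wallOffset - particleRadius) {
            pos.x =  1.0f + wallOffset - particleRadius;
            vel.x *= boundaryDamping;
        }
        if (pos.x < -1.0f + wallOffset + particleRadius) {
            pos.x = -1.0f + wallOffset + particleRadius;
            vel.x *= boundaryDamping;
        }
    }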

In that code, the display() function is used as a callback function for the display window, which ends up being handled internally by GLUT. If you look in main() you can see where a call to glutDisplayFunc() is used to register display() as the window’s callback function.
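In outline the setup looks something like this (simplified and from memory, not the exact code from particles.cpp):

    #include <GL/glut.h>

    void display();   // advances the simulation by one step and renders a frame
    void idle();      // defined elsewhere in particles.cpp

    int main(int argc, char **argv)
    {
        glutInit(&argc, argv);
        glutInitDisplayMode(GLUT_RGB | GLUT_DEPTH | GLUT_DOUBLE);
        glutInitWindowSize(640, 480);
        glutCreateWindow("CUDA Particles");

        // register display() as the redraw callback and idle() as the
        // "nothing else to do" callback, then hand control to GLUT
        glutDisplayFunc(display);
        glutIdleFunc(idle);

        glutMainLoop();   // from here on, GLUT decides when display() runs
        return 0;
    }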

Thanks for the help. I think I can use that information.

Did I get it right that it is not foreseeable when “display()” is triggered? So you do not know how often the window is re-rendered per second, say? How do you know it is triggered often enough that you can be sure the simulation is computed properly? Are “timestep” or “iterations” responsible for triggering “display()”?

One little note: yes, I know the FPS are computed and I can see that the simulation is indeed rendered sufficiently. I just figure I have not got the whole picture of the simulation yet.

No. display() gets called to advance the state of the solution by one timestep every time GLUT receives an event indicating that the window needs to be redrawn. The primary source of those events is the idle() routine, which is registered as the routine GLUT will call whenever there is nothing else to do. If you look at the code of idle(), you will see that it calls glutPostRedisplay(), which will trigger a call to display().

So the basic operation of the simulation is to advance the simulation and render the result (by a call to display()) every time GLUT reports that the event loop is idle (i.e. there is nothing else to do). Interpret it as “advance the solution by the fixed timestep increment and display the results as fast as you can”.
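Stripped down to the essentials, the loop behaves something like this (names paraphrased; psystem and timestep are the sample’s globals; this is not verbatim SDK code):

    // idle(): registered with glutIdleFunc(); runs whenever the event queue
    // is empty and simply asks for another redraw.
    void idle()
    {
        glutPostRedisplay();          // queues a redraw event for the window
    }

    // display(): registered with glutDisplayFunc(); runs once per redraw event.
    void display()
    {
        psystem->update(timestep);    // always the same fixed increment,
                                      // no matter how long the frame takes
        // ... render the particle positions, draw the sliders, etc. ...
        glutSwapBuffers();
    }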

First of all, thanks for the reply. I was not notified by email, so I did not notice it earlier.

So if I understood that correctly, a timestep value of 0.5 can only be interpreted as 0.5 seconds if the GPU actually needs 0.5 seconds to compute the next iteration of the simulation?

I know I have already asked a lot, but there is at least one more question left: in particles_kernel.cu the function “collideD” calls “collideSpheres” (calculating the collision between a particle and the cursor sphere). What makes me wonder is the fact that “collideSpheres” is called with a zero velocity for the cursor sphere. Yet if I run the simulation, I can see that particles bounce off the larger sphere faster when I move the cursor with more speed. Does that have something to do with the “spring force”?
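For reference, this is roughly how I read the collision response; a paraphrase from memory, not the verbatim code, and it assumes the float3 operators from cutil_math.h:

    // Spring-dashpot response between two spheres; the real collideSpheres
    // reads its spring/damping/shear constants from the SimParams struct.
    __device__ float3 collideSpheresSketch(float3 posA, float3 posB,
                                           float3 velA, float3 velB,
                                           float radiusA, float radiusB,
                                           float spring, float damping)
    {
        float3 relPos = posB - posA;
        float  dist = length(relPos);
        float  collideDist = radiusA + radiusB;

        float3 force = make_float3(0.0f, 0.0f, 0.0f);
        if (dist < collideDist) {
            float3 norm   = relPos / dist;
            float3 relVel = velB - velA;

            // the spring term depends only on the overlap (collideDist - dist),
            // so it pushes particles away even with a zero velocity passed in
            force  = -spring * (collideDist - dist) * norm;
            // the dashpot term uses the relative velocity
            force += damping * relVel;
        }
        return force;
    }

If that reading is right, then moving the cursor quickly simply makes the particles overlap the cursor sphere more deeply within one step, so the spring term alone would already produce the larger bounce even though the velocity argument is zero.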

Install CUDA 3.0; it has an emulation mode, so you can debug it on the CPU.

No, the timestep really is 0.5 seconds irrespective of GPU speed. If the GPU takes less than 0.5 s per frame, it just means that the simulation is running “faster than real time”. I solve time-dependent partial differential equations where typical problems take a week of wall-clock time to integrate a few seconds of simulation time. How long it takes to obtain the solution has no bearing on the time steps which the simulation integrates. This SDK example should not be viewed any differently.
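A tiny illustration of the distinction (made-up numbers, nothing from the sample):

    #include <cstdio>

    int main()
    {
        const float timestep = 0.5f;          // fixed simulation step
        float simTime  = 0.0f;                // what the physics "sees"
        float wallTime = 0.0f;                // what your watch sees

        for (int frame = 0; frame < 4; ++frame) {
            const float measuredFrameTime = 0.01f;  // pretend each frame takes 10 ms
            simTime  += timestep;                   // always +0.5 simulation seconds
            wallTime += measuredFrameTime;
        }
        printf("simulated %.1f s in %.2f s of wall-clock time\n", simTime, wallTime);
        return 0;
    }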

Thanks again. I think I got it now.

I know it is another topic and has really little to do with my previous questions, but I am having trouble with global memory. I want to read some parameters for the simulation from a file so I do not have to change them in the code and recompile every time. For that reason I wrote some routines to read some floats and ints into a struct; let’s call the struct “blo”. This struct is stored globally on the CPU side: I used “extern blo* foo” in a .h file and “blo* foo” in the corresponding .cpp file, so I can access “foo” from every function in every .cpp file. I allocated the necessary memory in my main function using “foo = NULL; foo = (blo*)malloc(sizeof(blo));”.

Now here is the problem: I want to use the whole struct in some of the kernels of the particles sample too. I tried cudaMemcpyToSymbol(foo2, foo, sizeof(blo)), but it did not work. I have already checked Google and the forum. One of the problems is that I cannot include “particles_kernel.cu” or “particleSystem.cu” in “particles.cpp”, because I get errors from the compiler and/or linker. So now I want to write another small piece of code whose only job is to copy the value of foo into a foo2 lying on the GPU; again, foo2 should be globally accessible.
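What I have in mind looks roughly like this; a sketch with my own names (blo, foo, foo2, copyBloToDevice) and a made-up header blo.h, following the same pattern the sample seems to use for its SimParams (constant memory should be fine since my parameters are read-only):

    // In particles_kernel.cu, next to the existing kernels so they can read it:
    #include "blo.h"                          // my definition of struct blo
    __constant__ blo foo2;                    // GPU-side copy of the parameters

    // In particleSystem.cu (which, if I see it correctly, already pulls in the
    // kernel file), a small host wrapper that plain .cpp code can call:
    extern "C" void copyBloToDevice(const blo *hostFoo)
    {
        cudaMemcpyToSymbol(foo2, hostFoo, sizeof(blo));
    }

    // In particles.cpp: no .cu includes needed, only the declaration
    //     extern "C" void copyBloToDevice(const blo *hostFoo);
    //     foo = (blo *)malloc(sizeof(blo));
    //     readParametersFromFile(foo);       // my own file-reading routines
    //     copyBloToDevice(foo);              // push the struct to the GPU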

Maybe some of you have an idea how to get this working.

It is a little late, but nevertheless: Happy New Year!