From the CUDA 1.0 documentation,
I use the Driver API to solve a PDE on a 1024 x 1024 grid and plot the results
by attaching a buffer to an OpenGL framebuffer.
The profiler gives:
method=[ basicKernel ] gputime=[ 51.968 ] cputime=[ 4325.000 ] occupancy=[ 1.000 ]
I have determined that the cputime is accounted for by the kernel plus the OpenGL commands.
My display routine is listed under MY CODE at the end of this post. If I comment out
everything but runCuda, the profiler returns:
method=[ basicKernel ] gputime=[ 51.360 ] cputime=[ 62.000 ] occupancy=[ 1.000 ]
which indicates that the extra cputime reported comes entirely from the OpenGL calls. What is
confusing is that each line of the profiler output should be associated with a single
kernel invocation. But the cputime reported seems to be related to the time
between successive kernel invocations. Can anybody shed any light on this?
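As a rough sanity check, the two profiler lines can be compared directly. The sketch below is only illustrative: the struct and function names are made up for this post, and it assumes both profiler fields are in the same unit (microseconds, as far as I can tell).

```c
#include <assert.h>

/* One line of profiler output. The unit is assumed to be microseconds. */
typedef struct {
    double gputime;
    double cputime;
} ProfileSample;

/* CPU time reported beyond the GPU execution time for a single sample. */
static double cpu_overhead(ProfileSample s)
{
    assert(s.gputime >= 0.0 && s.cputime >= 0.0); /* sanity check on the input */
    return s.cputime - s.gputime;
}

/* Extra cputime that shows up when the OpenGL calls are left enabled. */
static double opengl_share(ProfileSample with_gl, ProfileSample kernel_only)
{
    return with_gl.cputime - kernel_only.cputime;
}
```

With the numbers from the two runs, opengl_share gives 4325.000 - 62.000 = 4263.000 and cpu_overhead for the kernel-only run gives 62.000 - 51.360 = 10.640, so almost all of the reported cputime disappears together with the OpenGL calls.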
Thanks.
Gordon
MY CODE
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
runCuda(); // kernel invocation
// set view matrix
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glTranslatef(0.0, 0.0, translate_z);
glRotatef(rotate_x, 1.0, 0.0, 0.0);
glRotatef(rotate_y, 0.0, 1.0, 0.0);
// render from the vbo
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glVertexPointer(4, GL_FLOAT, 0, 0);
glEnableClientState(GL_VERTEX_ARRAY);
glColor4f(1.0, 0.0, 0.0, 0.5);
glDrawArrays(GL_POINTS, 0, mesh_width * mesh_height);
glDisableClientState(GL_VERTEX_ARRAY);
glutSwapBuffers();
glutPostRedisplay();
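
One way to cross-check the profiler would be to time each part of the display routine on the host. This is a minimal sketch in plain C, assuming a C11 compiler; time_section_us and stand_in_work are hypothetical names, and the stand-in only takes the place of runCuda or the OpenGL block.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Wall-clock microseconds spent inside fn(), measured with C11 timespec_get. */
static double time_section_us(void (*fn)(void))
{
    struct timespec t0, t1;
    assert(fn != NULL);
    timespec_get(&t0, TIME_UTC);
    fn();
    timespec_get(&t1, TIME_UTC);
    return (double)(t1.tv_sec - t0.tv_sec) * 1e6
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e3;
}

/* Hypothetical stand-in for the section being timed (not the real routine). */
static void stand_in_work(void)
{
    volatile double acc = 0.0;
    for (int i = 0; i < 100000; ++i)
        acc += (double)i * 0.5;
}
```

In the real routine the timed function would probably need to end with a synchronizing call (for example cuCtxSynchronize() after the kernel launch and glFinish() after the draw calls), since both APIs may return to the host before the GPU work has finished.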