I have a CUDA application that changes the position and orientation of 3D models. I have been following the same scheme as the SDK particles sample, using a VBO to render the models as points. Now I want to render more complex models, and I don't want to download the data from the GPU, apply the transformations to the model on the CPU, and then send the model back to the GPU.
You could write to 3 float4 VBOs from CUDA (each holding one row of a 3x4 matrix), then read these in the vertex shader as textures, construct the matrix there, and use it to transform your points to world space.
But unless you have a huge number of objects it’s probably easier and not much slower to just read the transforms back to the CPU.
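To make the three-row layout concrete, here is a rough CPU-side sketch in plain C++ of what the CUDA kernel would write and what the vertex shader would read back. It is only an illustration under assumptions: `Float4`, `TransformBuffers`, `writeTransform` and `applyTransform` are hypothetical names, the orientation is assumed to be stored as a quaternion, and `std::vector` stands in for the mapped VBOs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Same layout as CUDA's float4, redefined so the sketch builds without CUDA headers.
struct Float4 { float x, y, z, w; };

// Hypothetical stand-in for the three float4 VBOs: one buffer per matrix row,
// one entry per object.
struct TransformBuffers {
    std::vector<Float4> row0, row1, row2;
    explicit TransformBuffers(std::size_t n) : row0(n), row1(n), row2(n) {}
};

// What one CUDA thread would do for object i: turn an orientation quaternion
// (x, y, z, w) and a position into the three rows of a 3x4 matrix.
void writeTransform(TransformBuffers& buf, std::size_t i, Float4 q, Float4 pos) {
    assert(i < buf.row0.size());
    // Normalize so the rotation part stays orthonormal.
    const float len = std::sqrt(q.x*q.x + q.y*q.y + q.z*q.z + q.w*q.w);
    assert(len > 0.0f);
    q = {q.x/len, q.y/len, q.z/len, q.w/len};

    const float xx = q.x*q.x, yy = q.y*q.y, zz = q.z*q.z;
    const float xy = q.x*q.y, xz = q.x*q.z, yz = q.y*q.z;
    const float wx = q.w*q.x, wy = q.w*q.y, wz = q.w*q.z;

    // Each row is (rotation row, translation component).
    buf.row0[i] = {1.0f - 2.0f*(yy + zz), 2.0f*(xy - wz), 2.0f*(xz + wy), pos.x};
    buf.row1[i] = {2.0f*(xy + wz), 1.0f - 2.0f*(xx + zz), 2.0f*(yz - wx), pos.y};
    buf.row2[i] = {2.0f*(xz - wy), 2.0f*(yz + wx), 1.0f - 2.0f*(xx + yy), pos.z};
}

// What the vertex shader would do: fetch the three rows for object i and take
// one dot product per row. p.w should be 1 so the translation is applied.
Float4 applyTransform(const TransformBuffers& buf, std::size_t i, Float4 p) {
    assert(i < buf.row0.size());
    auto dot = [&p](const Float4& r) {
        return r.x*p.x + r.y*p.y + r.z*p.z + r.w*p.w;
    };
    return {dot(buf.row0[i]), dot(buf.row1[i]), dot(buf.row2[i]), p.w};
}
```

In a real setup the body of `writeTransform` would live in a `__global__` kernel writing through pointers to the mapped VBOs, and `applyTransform` would be the shader side. How the per-object index reaches the shader is left open here.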