Is it possible to split one CUDA program into separate parts?

Hi there,
I have gotten advice from this forum, and I always appreciate it! :)

I am currently trying to write a program that performs a huge matrix calculation,
with one twist:
I want to divide the flow of a single CUDA .cu program into three parts, as below:

  1. Copy host data to the device

  2. While loop: calculate the huge matrix

  3. Free the device memory

The reason I want to do this is that copying large host data to the device is a bothersome procedure.
If I write the program as a single .cu file, I have to copy, calculate, and free on every execution.
It would be really convenient to copy the host data to the device only once.
Once that is done, I would just run the calculation repeatedly, changing some parameters such as the matrix size or indices.
Is it possible to do it this way?

Thank you in advance.


If you mean calling a kernel many times from inside that while loop, then this is a very common approach. See the sample code in the SDK, e.g. the particles or n-body samples.
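A minimal sketch of that pattern, in case it helps: the data is copied to the device once, a kernel is launched repeatedly from a while loop with changing parameters, and the memory is freed at the end. Everything named here (scaleKernel, uploadOnce, compute, downloadResult, release, and the choice of halving the element count each pass) is made up for illustration and stands in for your real matrix calculation; it is not taken from the SDK samples.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s\n",                  \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical stand-in for the real calculation:
// multiplies the first `count` elements by `factor`.
__global__ void scaleKernel(float* data, int count, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) {
        data[i] *= factor;
    }
}

// Part 1: allocate device memory and copy the host data, once.
float* uploadOnce(const std::vector<float>& host)
{
    float* dev = nullptr;
    size_t bytes = host.size() * sizeof(float);
    CUDA_CHECK(cudaMalloc(&dev, bytes));
    CUDA_CHECK(cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice));
    return dev;
}

// Part 2: one calculation pass on data that is already on the device.
void compute(float* dev, int count, float factor)
{
    if (count <= 0) return;
    int threads = 256;
    int blocks = (count + threads - 1) / threads;
    scaleKernel<<<blocks, threads>>>(dev, count, factor);
    CUDA_CHECK(cudaGetLastError());
}

// Copy the result back to the host (this also waits for earlier kernels).
void downloadResult(std::vector<float>& host, const float* dev)
{
    size_t bytes = host.size() * sizeof(float);
    CUDA_CHECK(cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost));
}

// Part 3: free the device memory.
void release(float* dev)
{
    CUDA_CHECK(cudaFree(dev));
}

int main()
{
    const int n = 1 << 20;
    std::vector<float> host(n, 1.0f);

    float* dev = uploadOnce(host);      // copy host -> device only once

    int count = n;
    int iter = 0;
    while (iter < 3) {                  // repeated calculation, no re-copy
        compute(dev, count, 2.0f);
        count /= 2;                     // change a parameter between passes
        ++iter;
    }

    downloadResult(host, dev);
    release(dev);                       // free only after the loop is done

    std::printf("first = %f, last = %f\n", host[0], host[n - 1]);
    return 0;
}
```

The device pointer returned by uploadOnce stays valid until cudaFree is called, so the three parts can live in separate functions (or separate source files linked into the same program) as long as they run in the same process and the pointer is passed between them.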

Dear kbam,

Thank you for the comment.
I am going to look for it in the sample code! Thank you!


Dear kbam,

Good morning!
I searched through the examples in the CUDA sample browser.
Yeah, the structure of nBody is what I want to make. :)
The result changes as the variables are changed.

I will try to build the same structure as nBody.
Thank you!