Extracting OpenCL initialization from a loop


I’m trying to replace an important piece of code in my program with its OpenCL equivalent.

This is roughly what I do on each loop iteration:



    1. get the current image (host)

    2. initialize OpenCL (get the platform and the device, create the context and the queue)

    3. allocate GPU memory (for the image on the device)

    4. copy image from host to device memory

    5. compile the kernel

    6. launch the kernel

    7. copy resulting image from device to host memory

    8. release OpenCL objects

    9. do something with the resulting image


Do you think that moving steps 2, 3, 5, and 8 outside of the loop will save a lot of time?
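To make the question concrete, here is a minimal C sketch of the two loop structures I am comparing. The functions are hypothetical stubs that only count how often they are called; the comments name the OpenCL calls each stub would wrap in the real program. Moving step 3 out of the loop assumes the image size is the same on every iteration.

```c
#include <stdio.h>

/* Counts how often each group of steps runs (hypothetical bookkeeping). */
typedef struct {
    int init_calls;    /* step 2: platform, device, context, queue */
    int alloc_calls;   /* step 3: device buffer allocation */
    int build_calls;   /* step 5: kernel compilation */
    int release_calls; /* step 8: releasing OpenCL objects */
    int frame_calls;   /* steps 4, 6, 7: upload, launch, download */
} StepCounts;

/* Hypothetical stubs standing in for the real OpenCL work. */

/* Would wrap clGetPlatformIDs, clGetDeviceIDs, clCreateContext,
   clCreateCommandQueue. */
static void init_opencl(StepCounts *c) { c->init_calls++; }

/* Would wrap clCreateBuffer. */
static void alloc_buffers(StepCounts *c) { c->alloc_calls++; }

/* Would wrap clCreateProgramWithSource, clBuildProgram, clCreateKernel. */
static void build_kernels(StepCounts *c) { c->build_calls++; }

/* Would wrap clEnqueueWriteBuffer, clEnqueueNDRangeKernel,
   clEnqueueReadBuffer. */
static void process_frame(StepCounts *c) { c->frame_calls++; }

/* Would wrap clReleaseKernel, clReleaseProgram, clReleaseMemObject,
   clReleaseCommandQueue, clReleaseContext. */
static void release_opencl(StepCounts *c) { c->release_calls++; }

/* Current structure: every step runs on every iteration. */
StepCounts run_naive(int frames) {
    StepCounts c = {0, 0, 0, 0, 0};
    for (int i = 0; i < frames; i++) {
        init_opencl(&c);
        alloc_buffers(&c);
        build_kernels(&c);
        process_frame(&c);
        release_opencl(&c);
    }
    return c;
}

/* Proposed structure: steps 2, 3, 5 run once before the loop,
   step 8 runs once after it. */
StepCounts run_hoisted(int frames) {
    StepCounts c = {0, 0, 0, 0, 0};
    init_opencl(&c);
    alloc_buffers(&c);
    build_kernels(&c);
    for (int i = 0; i < frames; i++) {
        process_frame(&c);
    }
    release_opencl(&c);
    return c;
}

/* Prints the counters so the two structures can be compared. */
void print_counts(const char *label, StepCounts c) {
    printf("%s: init=%d alloc=%d build=%d frames=%d release=%d\n",
           label, c.init_calls, c.alloc_calls, c.build_calls,
           c.frame_calls, c.release_calls);
}
```

With this layout, `run_hoisted(100)` performs the build step once instead of 100 times, while the per-frame work (steps 4, 6, 7) is unchanged.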

Is the kernel really recompiled on every loop iteration, or is the compiler smart enough to compile the code only once?

This question is crucial because I actually have a few kernels.

Thank you in advance!