clEnqueueReadBuffer takes too long

I’m trying to use OpenCL to speed up building trees on the GPU.
My problem concerns the transfer from the GPU’s memory to RAM.
How can I speed this up?
With 8 million float4 elements, I spend 200 ms building the tree but 30 s transferring it to RAM.
Is that normal?

Thank you in advance, Emmanuel.

30 seconds? That’s not right… It should be more like 50ms.
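
To put rough numbers on that: a float4 is four 32-bit floats, so 8 million of them is 128 MB. Here is a minimal sketch in plain C of the arithmetic (the helper names are made up for illustration and are not part of OpenCL). It only converts the timings quoted in this thread into effective bandwidth.

```c
#include <stdio.h>

/* Size in bytes of one OpenCL float4: four 32-bit floats. */
#define FLOAT4_BYTES 16ULL

/* Total bytes occupied by `count` float4 elements. */
unsigned long long buffer_bytes(unsigned long long count) {
    return count * FLOAT4_BYTES;
}

/* Effective bandwidth in MB/s (1 MB = 1e6 bytes) for `bytes` moved in `seconds`. */
double bandwidth_mb_per_s(unsigned long long bytes, double seconds) {
    return (double)bytes / seconds / 1e6;
}

/* Hypothetical helper: prints the figures discussed in this thread. */
void print_thread_figures(void) {
    unsigned long long bytes = buffer_bytes(8000000ULL);
    printf("buffer size: %llu bytes\n", bytes);
    printf("at 30 s:  %.2f MB/s\n", bandwidth_mb_per_s(bytes, 30.0));
    printf("at 50 ms: %.0f MB/s\n", bandwidth_mb_per_s(bytes, 0.050));
}
```

Calling print_thread_figures() from main shows that 30 s corresponds to about 4.27 MB/s, while 50 ms corresponds to 2560 MB/s. One thing that may be worth checking (a guess about your setup, not a diagnosis): kernel launches are asynchronous, so a blocking read issued right after the kernel also waits for the kernel to finish, and the 30 s may include the tree build itself.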

Thanks, I’m going to try to find my (probable) mistake!
