Does memory have to be aligned to 64-byte addresses in order to get fast access?
CUDA offers cudaMalloc3D to allocate 3D arrays at such addresses.
How is this done in OpenCL, or does it happen automatically?
Would an Image3D be the way to do it?
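To make clear what kind of layout I mean: as far as I understand, cudaMalloc3D hands back a pitch, i.e. a row size in bytes that has been padded so each row starts at an aligned address. Below is a minimal sketch of that padding done by hand, which is what I would fall back to with a plain OpenCL buffer. The helper names `padded_pitch` and `pitched_offset` are my own, and the 64-byte value is only my assumption from above, not something taken from either runtime.

```c
#include <stddef.h>

/* Round a row size in bytes up to the next multiple of `alignment`.
   `alignment` must be a power of two (64 is only the value assumed
   in the question, not a documented requirement). */
static size_t padded_pitch(size_t row_bytes, size_t alignment)
{
    return (row_bytes + alignment - 1) & ~(alignment - 1);
}

/* Byte offset of element (x, y, z) inside a pitched 3D buffer.
   `pitch` is the padded row size in bytes, `height` the number of
   rows per slice. */
static size_t pitched_offset(size_t x, size_t y, size_t z,
                             size_t elem_size, size_t pitch, size_t height)
{
    return (z * height + y) * pitch + x * elem_size;
}
```

For a row of 100 floats (400 bytes), this gives a pitch of 448 bytes, so the buffer would need 448 * height * depth bytes and the kernel would have to index it with the pitch instead of the raw width.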