I have the following issue: I want to do a simple vector-by-float multiplication in my OpenCL kernel. But as soon as I call the function that does the operation, I get a CL_OUT_OF_RESOURCES error from clEnqueueNDRangeKernel. How can this happen with such a small function? I’m only using three new float variables in it. If I comment out the line where I call vec_mul_float, everything works fine! The vec_mul_float function and the Vector3D struct are defined as follows:
typedef struct Vector3D {
    float x, y, z;
} Vector3D;

Vector3D
vec_mul_float (const Vector3D v, const float a) {
    Vector3D ret_vec = {a * v.x, a * v.y, a * v.z};
    return ret_vec;
}
Ah, I think it is :) I tested a few things until I tracked down the bug. I was trying to access the components of a struct in global memory directly. Now I first copy the values of the global memory struct into registers and then work with the copy.
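Roughly, the workaround looks like the sketch below. It is written so that it also compiles as plain C: the GLOBAL macro is an assumption of this sketch that expands to __global inside an OpenCL kernel and to nothing elsewhere. The scale_from_global helper and its parameters are hypothetical names used only for illustration, not part of the real kernel.

```c
/* GLOBAL is a stand-in macro (an assumption of this sketch): it becomes
   __global when compiled as OpenCL C and disappears in plain C. */
#ifdef __OPENCL_VERSION__
#define GLOBAL __global
#else
#define GLOBAL
#endif

typedef struct Vector3D {
    float x, y, z;
} Vector3D;

Vector3D vec_mul_float(const Vector3D v, const float a) {
    Vector3D ret_vec = {a * v.x, a * v.y, a * v.z};
    return ret_vec;
}

/* Hypothetical helper: scales the i-th vector of an array in global memory. */
Vector3D scale_from_global(GLOBAL const Vector3D *vecs, int i, float a) {
    /* Copy the struct out of global memory into a private variable first... */
    Vector3D v = vecs[i];
    /* ...then pass only the private copy to the function. */
    return vec_mul_float(v, a);
}
```

Only the one struct that is actually needed gets copied here, not the whole buffer.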
But isn’t there a way to access structs in global memory directly? For example, I have an array of light structs in global memory, declared like this:
__global Light* lights;
To get, for example, the color of a light, which is itself a struct, I can copy the color values into a new variable like this:
Color light_color = {lights[0].color.r, lights[0].color.g, lights[0].color.b};
but I want to use the color as follows:
col = col_add_col(col, lights[0].color);
Otherwise I would have to copy ALL of the global memory into registers, which I think would kill performance. Any ideas?