Greetings,
I’ve got a kernel that writes to a __global array at fixed (constant)
indices, but the compiler (PTX) doesn’t like it and won’t build the
program. If I do the same writes to a __local array, it builds just
fine. Here’s some example code:
    __kernel void test_kernel1( __global float *some_data, __global float cov_matrix[9] )
    {
        // does not compile
        cov_matrix[0] = 0.0f;
        cov_matrix[1] = 0.0f;
        cov_matrix[2] = 0.0f;
        cov_matrix[3] = 0.0f;
        cov_matrix[4] = 0.0f;
        cov_matrix[5] = 0.0f;
        cov_matrix[6] = 0.0f;
        cov_matrix[7] = 0.0f;
        cov_matrix[8] = 0.0f; // <- compile error here
    }
This yields the following message:
But if I try to build this:
    __kernel void test_kernel2( __global float *some_data )
    {
        __local float cov_matrix[9];
        // compiles
        cov_matrix[0] = 0.0f;
        cov_matrix[1] = 0.0f;
        cov_matrix[2] = 0.0f;
        cov_matrix[3] = 0.0f;
        cov_matrix[4] = 0.0f;
        cov_matrix[5] = 0.0f;
        cov_matrix[6] = 0.0f;
        cov_matrix[7] = 0.0f;
        cov_matrix[8] = 0.0f;
    }
it compiles without any problems.
For the life of me, I can’t figure out what’s causing this. I’m using
the 190.29 driver on 32-bit Ubuntu 9.10. Any ideas? BTW, I tried to
upgrade to the 195.36.15 driver, but it was a disaster on my machine,
so I uninstalled the new driver and reinstalled the old one. There
might be some residual 195.36.15 badness left over from that.
Thanks in advance!