of which only fd is used in the test kernel.
When compiled like this, many tests fail.
If I write it as: __constant__ float4 fd[4]; __constant__ float4 f1[8]; __constant__ float4 f2[8]; __constant__ float4 f3[8]; __constant__ float4 f4[8];
then all tests pass.
If I comment out one or more of [f1,f2,f3,f4], then all tests pass as well.
I’m using the CUDA 3.0 beta SDK and driver 195.30 on 64-bit Fedora 11. I see the same issue on 64-bit Ubuntu 9.10.
I also tried different drivers, but that didn’t help. Maybe it’s a CUDA 3.0 beta or a PEBKAC problem :-)
I’d really appreciate it if someone could tell me why the first case doesn’t work.
Without looking at your code, my guess is that you have an out-of-bounds index somewhere into that fd constant memory array. When you declare other constant memory storage after it, your code is probably reading out of bounds, but into something “safe”, so no errors occur. When you declare fd last, it sits at the end of your legal constant memory block, which makes your code read out of bounds into somewhere it isn’t allowed, and it fails.
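One way to test that guess is to route every read of fd through a bounds check and count the violations. The following is only a minimal sketch, not the original poster's code: FD_LEN, checked_fd, test_kernel and count_fd_violations are hypothetical names, and the kernel body is a stand-in that deliberately indexes fd by thread id.

```cuda
#include <cuda_runtime.h>

#define FD_LEN 4  // assumed length, matching "float4 fd[4]" from the post

__constant__ float4 fd[FD_LEN];

// Hypothetical helper: return fd[i] when i is in range; otherwise count the
// violation and return zeros instead of reading past the array.
__device__ float4 checked_fd(int i, int *violations)
{
    if (i < 0 || i >= FD_LEN) {
        atomicAdd(violations, 1);
        return make_float4(0.0f, 0.0f, 0.0f, 0.0f);
    }
    return fd[i];
}

// Hypothetical stand-in for the real test kernel: each thread reads fd[tid],
// so any thread with tid >= FD_LEN would have read out of bounds.
__global__ void test_kernel(float *out, int n, int *violations)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        out[tid] = checked_fd(tid, violations).x;
    }
}

// Launch n threads in one block and return how many out-of-bounds reads of
// fd were attempted, or -1 if a CUDA call failed.
int count_fd_violations(int n)
{
    float *d_out = 0;
    int *d_violations = 0;
    int h_violations = 0;

    if (cudaMalloc((void **)&d_out, n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc((void **)&d_violations, sizeof(int)) != cudaSuccess) {
        cudaFree(d_out);
        return -1;
    }
    cudaMemset(d_violations, 0, sizeof(int));

    test_kernel<<<1, n>>>(d_out, n, d_violations);

    // The blocking copy also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(&h_violations, d_violations, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    cudaFree(d_violations);
    return err == cudaSuccess ? h_violations : -1;
}
```

In a real project the idea would be to replace each fd[...] access in the failing kernel with the checked helper; a nonzero count would support the out-of-bounds explanation, while a zero count would point elsewhere.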
The problem is not in the cudaMemcpyToSymbol call. According to the reference manual, you can pass either a character string naming the symbol or the symbol itself.
I double-checked it by uploading the data to the GPU and then downloading it again into an array initialized to all zeroes, and the returned data is correct.
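For anyone who wants to repeat that check, a round trip along these lines should do it. This is a sketch under assumptions rather than the code from the project: fd_round_trip_ok is a hypothetical helper name, and it passes the symbol itself (not a string) to cudaMemcpyToSymbol and cudaMemcpyFromSymbol.

```cuda
#include <cstring>
#include <cuda_runtime.h>

__constant__ float4 fd[4];

// Hypothetical helper: upload host_src (4 float4 values) into fd, download it
// into a zero-initialized buffer, and compare the two byte for byte.
bool fd_round_trip_ok(const float4 *host_src)
{
    float4 host_back[4];
    memset(host_back, 0, sizeof(host_back));

    // Host -> constant memory (default offset 0, cudaMemcpyHostToDevice).
    if (cudaMemcpyToSymbol(fd, host_src, sizeof(host_back)) != cudaSuccess) {
        return false;
    }
    // Constant memory -> host (default offset 0, cudaMemcpyDeviceToHost).
    if (cudaMemcpyFromSymbol(host_back, fd, sizeof(host_back)) != cudaSuccess) {
        return false;
    }
    return memcmp(host_src, host_back, sizeof(host_back)) == 0;
}
```

Note that a successful round trip only shows the upload is correct; it says nothing about which indices the kernel uses when it reads fd.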
The reason I’m allocating the float constant array as ints is because I wanted to fill the array in this trimmed down template project with the exact same values as my larger project where I first encountered this bug.