What’s ABI?
I think I’m encountering a compiler bug, but I’m not sure what else to do to troubleshoot it. Any advice would be appreciated.
Building my OpenCL kernel fails with an error and the following build log:
ptxas application ptx input, line 41; error : Module-scoped variables in .local state space are not allowed with ABI
ptxas application ptx input, line 42; error : Module-scoped variables in .local state space are not allowed with ABI
ptxas fatal : Ptx assembly aborted due to errors
I was able to save the result from CL_PROGRAM_BINARIES, which is a text file that I think is the resulting PTX code. The lines around 41 and 42 are here:
.visible .const .align 4 .b8 eulerEqn_instances[24] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 51, 51, 179, 63, 0, 0, 0, 0, 0, 0, 0, 0};
.visible .const .align 4 .b8 hyperscheme_wave2d_instances[8] = {102, 102, 102, 63, 0, 0, 128, 63};
.local .align 1 .b8 hyperscheme_wave2d_step$face_dims[4]; <=== Line 41
.local .align 1 .b8 hyperscheme_wave2d_step$face_is_lower[4]; <=== Line 42
.entry sequenceKernel_00(
.param .align 16 .b8 sequenceKernel_00_param_0[16],
And here is the related OpenCL code, which declares those variables as automatic arrays inside a function:
void hyperscheme_wave2d_step( const int instance,
                              HyperEqnSet_q_group_const_ptrs global_q,
                              HyperEqnSet_q_group_ptrs global_q_new,
                              const global_auxVarArrays_HyperEqnSet_ptrs * const auxVarArrays,
                              const GridDataStruct * pGridDataStruct,
                              real dt,
                              const workDomain_t * const pWorkDomain,
                              const ArrayDescriptorStruct * p_q_array_in_shape,
                              const ArrayDescriptorStruct * p_q_array_out_shape,
                              real * suggested_dt_out )
{
// wave2dScheme
local_q_group_HyperEqnSet qG[1+2*NUM_FACES+NUM_T_CELLS] /*center cell and 1st surrounding circle*/, correctiveFlux[NUM_FACES], qNew, local_qL, local_qR, secondOut_qR, total_face_contribution;
local_aux_group_HyperEqnSet auxVarG[1+2*NUM_FACES+NUM_T_CELLS], local_auxVarL, local_auxVarR, secondOut_auxVarR;
FaceRotationMatrix face_rotationMatrix;
FaceRotationMatrix face_rotationMatrix_Inverse; // the reference x-y coordinate system
FaceRotationMatrix transverseface_rotationMatrix;
const char face_dims[] = {0, 1, 0, 1}; // dimension each face lies in.
const char face_is_lower[] = {1, 1, 0, 0}; // face is the lower face in that dimension
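One workaround I'm considering, as a sketch only: since the two arrays are just small lookup tables, I could compute the values instead of storing them, so the compiler has no automatic const array to move to module scope. This assumes the arrays are only ever read with a face index in 0..3 (the rest of the function isn't shown here). The helper names face_dim_of and face_is_lower_of are hypothetical, and I haven't verified that this actually avoids the ptxas error on this driver. It's written as plain C so it can be checked on the host; I'd expect the same functions to be usable from the kernel source.

```c
/* Hypothetical replacements for the two lookup tables.
 * Assumes face indices 0..3, matching the original arrays:
 *   face_dims     = {0, 1, 0, 1}
 *   face_is_lower = {1, 1, 0, 0}
 */

/* Dimension each face lies in: even faces are in dimension 0, odd faces in dimension 1. */
char face_dim_of(int face)
{
    return (char)(face & 1);
}

/* 1 if the face is the lower face in its dimension (faces 0 and 1), else 0. */
char face_is_lower_of(int face)
{
    return (char)(face < 2);
}
```

Inside hyperscheme_wave2d_step, a read such as face_dims[f] would then become face_dim_of(f), and face_is_lower[f] would become face_is_lower_of(f).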
Finally, here is the driver version and hardware that I am working with:
CL_PLATFORM_VERSION: OpenCL 1.1 CUDA 4.2.1
CL_PLATFORM_NAME: NVIDIA CUDA
CL_PLATFORM_VENDOR: NVIDIA Corporation
CL_PLATFORM_EXTENSIONS: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
OpenCL Device 0:
CL_DEVICE_NAME: Tesla C2050
CL_DEVICE_TYPE: GPU
CL_DRIVER_VERSION: 295.59