OpenCL + Compiler Optimization

Hi all,

I’m trying to optimize OpenCL code using a C++ source-to-source compiler. For simple C++ code without OpenCL calls, my source-to-source translator works fine. However, when I attempt to translate C++ code that uses OpenCL headers and function calls, I get several errors, as shown below.

“/usr/lib64/gcc/x86_64-suse-linux/4.1.2/include/xmmintrin.h”, line 309: error:
identifier “__builtin_ia32_movss” is undefined
return (__m128) __builtin_ia32_movss ((__v4sf) __A,
^

“/usr/lib64/gcc/x86_64-suse-linux/4.1.2/include/xmmintrin.h”, line 319: error:
identifier “__builtin_ia32_cmpordss” is undefined
return (__m128) __builtin_ia32_cmpordss ((__v4sf)__A, (__v4sf)__B);
^

“/usr/lib64/gcc/x86_64-suse-linux/4.1.2/include/xmmintrin.h”, line 325: error:
identifier “__builtin_ia32_cmpunordss” is undefined
return (__m128) __builtin_ia32_cmpunordss ((__v4sf)__A, (__v4sf)__B);
^

“/usr/lib64/gcc/x86_64-suse-linux/4.1.2/include/xmmintrin.h”, line 335: error:
identifier “__builtin_ia32_cmpeqps” is undefined
return (__m128) __builtin_ia32_cmpeqps ((__v4sf)__A, (__v4sf)__B);

I did some googling and found that the same errors occur with CUDA, where the suggested solution is to run gcc and nvcc separately at the compile stage and then link the object files to create an executable. However, in the case of OpenCL, the compiler used is just a C++ compiler (g++ in my case), so I do not know how to overcome the above errors.

I tried using options such as -msse and -msse3, but that did not help. I have PATH, LD_LIBRARY_PATH, etc. set up correctly. I can compile OpenCL code into an executable using a common_opencl.mk file, and run it. The problem occurs only when I attempt to convert OpenCL C++ code into an abstract syntax tree (AST).

Could someone clarify why the above errors occur? Is there a known fix? If this is not the correct forum, could someone suggest where I could get some help?

Thanks,
Poornima