Runtime Error when executing OpenCL Samples

Hello,

I have installed “CUDA Toolkit 3.0” and the correct driver, and successfully built all OpenCL samples. Running “oclDeviceQuery” yields:

oclDeviceQuery.exe Starting…

OpenCL SW Info:

CL_PLATFORM_NAME: NVIDIA CUDA
CL_PLATFORM_VERSION: OpenCL 1.0 CUDA 3.0.1
OpenCL SDK Version: 4954966

OpenCL Device Info:

1 devices found supporting OpenCL:


Device GeForce 9500 GT

CL_DEVICE_NAME: GeForce 9500 GT
CL_DEVICE_VENDOR: NVIDIA Corporation
CL_DRIVER_VERSION: 195.17
CL_DEVICE_TYPE: CL_DEVICE_TYPE_GPU
CL_DEVICE_MAX_COMPUTE_UNITS: 4
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 512 / 512 / 64
CL_DEVICE_MAX_WORK_GROUP_SIZE: 512
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1625 MHz
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 128 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 511 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: local
CL_DEVICE_LOCAL_MEM_SIZE: 16 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_IMAGE_SUPPORT: 1
CL_DEVICE_MAX_READ_IMAGE_ARGS: 128
CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 8
CL_DEVICE_SINGLE_FP_CONFIG: INF-quietNaNs round-to-nearest round-to-zero round-to-inf fma

CL_DEVICE_IMAGE 2D_MAX_WIDTH 8192
2D_MAX_HEIGHT 8192
3D_MAX_WIDTH 2048
3D_MAX_HEIGHT 2048
3D_MAX_DEPTH 2048

CL_DEVICE_EXTENSIONS: cl_khr_byte_addressable_store
cl_khr_gl_sharing
cl_nv_compiler_options
cl_nv_device_attribute_query
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics

CL_DEVICE_COMPUTE_CAPABILITY_NV: 1.1
CL_DEVICE_REGISTERS_PER_BLOCK_NV: 8192
CL_DEVICE_WARP_SIZE_NV: 32
CL_DEVICE_GPU_OVERLAP_NV: CL_TRUE
CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV: CL_TRUE
CL_DEVICE_INTEGRATED_MEMORY_NV: CL_FALSE
CL_DEVICE_PREFERRED_VECTOR_WIDTH_ CHAR 1, SHORT 1, INT 1, FLOAT 1, DOUBLE 1

oclDeviceQuery, Platform Name = NVIDIA CUDA, Platform Version = OpenCL 1.0 CUDA 3.0.1, SDK Version = 4954966, NumDevs = 1, Device = GeForce 9500 GT

System Info:

Local Time/Date = 08:37:43, 11/27/2009
CPU Name: Intel® Core™2 Quad CPU Q9300 @ 2.50GHz

# of CPU processors: 4

Linux version 2.6.31-15-generic-pae (buildd@rothera) (gcc version 4.4.1 (Ubuntu 4.4.1-4ubuntu8) ) #50-Ubuntu SMP Tue Nov 10 16:12:10 UTC 2009

TEST PASSED

The “oclBandwidthTest” sample passes as well. Every other sample program fails with an error at runtime, for example “oclVectorAdd”:

./bin/linux/release/oclVectorAdd Starting…

# of float elements per Array = 11444777

Global Work Size = 11444992
Local Work Size = 256

# of Work Groups = 44707

Allocate and Init Host Mem…
clGetPlatformID…
clGetDeviceIDs…
clCreateContext…
clCreateCommandQueue…
clCreateBuffer…
oclLoadProgSource (VectorAdd.cl)…
clCreateProgramWithSource…
Error in clCreateProgramWithSource, Line 154 in file oclVectorAdd.cpp !!!

Starting Cleanup…

oclVectorAdd.exe Exiting.

Or program “oclParticles”:


clCreateCommandQueue…

…loading BitonicSort_b.cl

!!! Error # 0 (CL_SUCCESS) at line 42 , in file src/oclBitonicSort_launc !!!

Exiting…

Does anyone have an explanation for the source of these errors? Did I miss a crucial step in the installation process?

Yours sincerely,
Marcus Große

OpenCL kernel files (*.cl) have to reside in the current directory when you launch the samples. (The first two samples worked because they do not launch any kernels.) For example, you could copy the corresponding *.cl files to, say, the /tmp directory, change the current directory to /tmp, and then launch the samples again.
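To illustrate why the working directory matters, here is a minimal sketch, assuming the sample opens its kernel file through a relative path such as "VectorAdd.cl". The helper name loadKernelSource is hypothetical and is not the SDK's oclLoadProgSource; it only shows that a relative path is resolved against the current directory, so the load fails when the .cl file is somewhere else, and the later clCreateProgramWithSource call then has no source to work with.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper, not part of the NVIDIA SDK: reads a kernel file
// into `source`. A relative `path` is resolved against the current
// working directory of the process, not the directory of the executable.
// Returns false and leaves `source` untouched if the file cannot be opened.
bool loadKernelSource(const std::string& path, std::string& source) {
    std::ifstream file(path.c_str(), std::ios::in | std::ios::binary);
    if (!file) {
        return false;  // typical cause: launched from the wrong directory
    }
    std::ostringstream contents;
    contents << file.rdbuf();  // copy the whole file into the string stream
    source = contents.str();
    return true;
}
```

Calling loadKernelSource("VectorAdd.cl", src) and printing a message when it returns false makes a missing kernel file visible before any OpenCL call is made.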

Thanks for your reply. It solved my problems.

Hello,

I have the same problem as Marcus with VectorAdd: I am trying to run oclVectorAdd (an SDK example) on our Windows machine with OpenCL and CUDA Toolkit 3.1. As you suggested, I copied the .cl file into the temp folder, changed to the temp folder on the command line, and then executed the .exe file from the SDK. That does not work; I still get the same error. I also tried copying the .exe and .cl files into the same folder, but without success.

Any suggestions? I really would like to run a non-trivial OpenCL-program on our machine.

Thanks,

Sandra
