Hello,
I have installed “CUDA Toolkit 3.0” together with the correct driver and successfully built all OpenCL samples. Running “oclDeviceQuery” yields:
oclDeviceQuery.exe Starting…
OpenCL SW Info:
CL_PLATFORM_NAME: NVIDIA CUDA
CL_PLATFORM_VERSION: OpenCL 1.0 CUDA 3.0.1
OpenCL SDK Version: 4954966
OpenCL Device Info:
1 devices found supporting OpenCL:
Device GeForce 9500 GT
CL_DEVICE_NAME: GeForce 9500 GT
CL_DEVICE_VENDOR: NVIDIA Corporation
CL_DRIVER_VERSION: 195.17
CL_DEVICE_TYPE: CL_DEVICE_TYPE_GPU
CL_DEVICE_MAX_COMPUTE_UNITS: 4
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 512 / 512 / 64
CL_DEVICE_MAX_WORK_GROUP_SIZE: 512
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1625 MHz
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 128 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 511 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: local
CL_DEVICE_LOCAL_MEM_SIZE: 16 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_IMAGE_SUPPORT: 1
CL_DEVICE_MAX_READ_IMAGE_ARGS: 128
CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 8
CL_DEVICE_SINGLE_FP_CONFIG: INF-quietNaNs round-to-nearest round-to-zero round-to-inf fma
CL_DEVICE_IMAGE 2D_MAX_WIDTH 8192
2D_MAX_HEIGHT 8192
3D_MAX_WIDTH 2048
3D_MAX_HEIGHT 2048
3D_MAX_DEPTH 2048
CL_DEVICE_EXTENSIONS: cl_khr_byte_addressable_store
cl_khr_gl_sharing
cl_nv_compiler_options
cl_nv_device_attribute_query
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
CL_DEVICE_COMPUTE_CAPABILITY_NV: 1.1
CL_DEVICE_REGISTERS_PER_BLOCK_NV: 8192
CL_DEVICE_WARP_SIZE_NV: 32
CL_DEVICE_GPU_OVERLAP_NV: CL_TRUE
CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV: CL_TRUE
CL_DEVICE_INTEGRATED_MEMORY_NV: CL_FALSE
CL_DEVICE_PREFERRED_VECTOR_WIDTH_ CHAR 1, SHORT 1, INT 1, FLOAT 1, DOUBLE 1
oclDeviceQuery, Platform Name = NVIDIA CUDA, Platform Version = OpenCL 1.0 CUDA 3.0.1, SDK Version = 4954966, NumDevs = 1, Device = GeForce 9500 GT
System Info:
Local Time/Date = 08:37:43, 11/27/2009
CPU Name: Intel® Core™2 Quad CPU Q9300 @ 2.50GHz
# of CPU processors: 4
Linux version 2.6.31-15-generic-pae (buildd@rothera) (gcc version 4.4.1 (Ubuntu 4.4.1-4ubuntu8) ) #50-Ubuntu SMP Tue Nov 10 16:12:10 UTC 2009
TEST PASSED
The “oclBandwidthTest” sample passes as well. Every other sample program fails with an error at runtime, for example “oclVectorAdd”:
./bin/linux/release/oclVectorAdd Starting…
# of float elements per Array = 11444777
Global Work Size = 11444992
Local Work Size = 256
# of Work Groups = 44707
Allocate and Init Host Mem…
clGetPlatformID…
clGetDeviceIDs…
clCreateContext…
clCreateCommandQueue…
clCreateBuffer…
oclLoadProgSource (VectorAdd.cl)…
clCreateProgramWithSource…
Error in clCreateProgramWithSource, Line 154 in file oclVectorAdd.cpp !!!
Starting Cleanup…
oclVectorAdd.exe Exiting.
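As a sanity check on the launch parameters printed in this log, the global work size appears to be the element count rounded up to a multiple of the local work size. The sketch below (Python, not the sample's own code; `round_up` is a hypothetical helper) reproduces the numbers, which suggests the work sizes themselves are consistent:

```python
def round_up(value, multiple):
    """Round `value` up to the next multiple of `multiple`."""
    return ((value + multiple - 1) // multiple) * multiple

num_elements = 11444777  # float elements per array, from the log
local_size = 256         # local work size, from the log

global_size = round_up(num_elements, local_size)
num_groups = global_size // local_size

# Compare with "Global Work Size" and "Work Groups" in the log.
print(global_size, num_groups)
```

Both values agree with the log (11444992 and 44707).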
Or the “oclParticles” sample:
…
clCreateCommandQueue…
…loading BitonicSort_b.cl
!!! Error # 0 (CL_SUCCESS) at line 42 , in file src/oclBitonicSort_launc !!!
Exiting…
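Both failures occur right after a kernel source file is loaded (VectorAdd.cl and BitonicSort_b.cl). One possible check, sketched here in Python under the unverified assumption that the problem is related to locating those files, is whether they are readable from the directory the sample is started in. `find_kernel_source` is a hypothetical helper and the search directory is a guess, not something the samples are known to use:

```python
import os

def find_kernel_source(name, search_dirs):
    """Return the first existing, non-empty path for `name`, or None."""
    for directory in search_dirs:
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            return path
    return None

if __name__ == "__main__":
    # File names come from the logs; searching only the current
    # working directory is an assumption made for illustration.
    for name in ("VectorAdd.cl", "BitonicSort_b.cl"):
        hit = find_kernel_source(name, [os.getcwd()])
        print(name, "->", hit if hit else "not found")
```

If a file is reported as not found, starting the sample from a different working directory would be a reasonable next experiment.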
Does anyone have an explanation for the source of these errors? Did I miss a crucial step in the installation process?
Yours sincerely,
Marcus Große