I would describe this release as brilliant, if a tiny bit flaky still. The CUDA nbody would not run at first, and I had to reload the CUDA kext after a hard restart. Then it all seemed to settle down and I have been able to run several examples in both CUDA and OpenCL zones. The migration of the standard examples to ocl is a massive help - thanks Nvidia!
FYI, I include the output of oclDeviceQuery on my sys. This recognizes both 285s on an 08 Mac Pro (one is 1G Mac version, other is 2G injected PC) as well as the CPU. First note it is fine with the PC card. I think what gives me a buzz is the fact that the CPU and GPU are there on a more or less equal basis…I have never seen anything like this before!
I have two questions for Nvidia,
Q1. how do I control the target device as under CUDA, where I would say e.g. ./nbody --device=0, when I want to target the same under OpenCL. So e.g. I want to run oclNbody on each of my 3 devices controlling which goes where?
Q2. What is the story with double precision on the GPU?
oclDeviceQuery.exe Starting…
OpenCL SW Info:
CL_PLATFORM_NAME: Apple
CL_PLATFORM_VERSION: OpenCL 1.0 (Jul 15 2009 23:07:32)
OpenCL SDK Version: 1.2.0.16
OpenCL Device Info:
of devices supporting OpenCL = 3:
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_NAME: GeForce GTX 285
CL_DRIVER_VERSION: CLH 1.0
CL_DEVICE_TYPE: CL_DEVICE_TYPE_GPU
CL_DEVICE_MAX_COMPUTE_UNITS: 240
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 512 / 512 / 64
CL_DEVICE_MAX_WORK_GROUP_SIZE: 512
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1476 MHz
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_IMAGE_SUPPORT: 1
CL_DEVICE_MAX_READ_IMAGE_ARGS: 128
CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 8
CL_DEVICE_IMAGE_MAX_WIDTH: 2d width 8192, 2d height 8192, 3d width 2048, 3d height 2048, 3d depth 2048
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 256 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 1024 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: local
CL_DEVICE_LOCAL_MEM_SIZE: 16 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_EXTENSIONS:
cl_khr_byte_addressable_store
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_APPLE_gl_sharing
cl_APPLE_SetMemObjectDestructor
cl_APPLE_ContextLoggingFunctions
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
CL_DEVICE_PREFERRED_VECTOR_WIDTH: char 1, short 1, int 1, long 1, float 1, double 0
CL_DEVICE_VENDOR: NVIDIA
CL_DEVICE_NAME: GeForce GTX 285
CL_DRIVER_VERSION: CLH 1.0
CL_DEVICE_TYPE: CL_DEVICE_TYPE_GPU
CL_DEVICE_MAX_COMPUTE_UNITS: 240
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 512 / 512 / 64
CL_DEVICE_MAX_WORK_GROUP_SIZE: 512
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1476 MHz
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_IMAGE_SUPPORT: 1
CL_DEVICE_MAX_READ_IMAGE_ARGS: 128
CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 8
CL_DEVICE_IMAGE_MAX_WIDTH: 2d width 8192, 2d height 8192, 3d width 2048, 3d height 2048, 3d depth 2048
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 512 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 2048 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: local
CL_DEVICE_LOCAL_MEM_SIZE: 16 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_EXTENSIONS:
cl_khr_byte_addressable_store
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_APPLE_gl_sharing
cl_APPLE_SetMemObjectDestructor
cl_APPLE_ContextLoggingFunctions
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
CL_DEVICE_PREFERRED_VECTOR_WIDTH: char 1, short 1, int 1, long 1, float 1, double 0
CL_DEVICE_VENDOR: Intel
CL_DEVICE_NAME: Intel® Xeon® CPU E5462 @ 2.80GHz
CL_DRIVER_VERSION: 1.0
CL_DEVICE_TYPE: CL_DEVICE_TYPE_CPU
CL_DEVICE_MAX_COMPUTE_UNITS: 8
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 1 / 1 / 1
CL_DEVICE_MAX_WORK_GROUP_SIZE: 1
CL_DEVICE_MAX_CLOCK_FREQUENCY: 2800 MHz
CL_DEVICE_ADDRESS_BITS: 64
CL_DEVICE_IMAGE_SUPPORT: 1
CL_DEVICE_MAX_READ_IMAGE_ARGS: 128
CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 8
CL_DEVICE_IMAGE_MAX_WIDTH: 2d width 8192, 2d height 8192, 3d width 2048, 3d height 2048, 3d depth 2048
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 3072 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 12288 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: global
CL_DEVICE_LOCAL_MEM_SIZE: 16 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_EXTENSIONS:
cl_khr_fp64
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_khr_byte_addressable_store
cl_APPLE_gl_sharing
cl_APPLE_SetMemObjectDestructor
cl_APPLE_ContextLoggingFunctions
CL_DEVICE_PREFERRED_VECTOR_WIDTH: char 16, short 8, int 4, long 2, float 4, double 2
System Info:
TEST PASSED…
Press to Quit…