I was wondering if anyone could give me some hints regarding memory management in OpenCL on Nvidia cards. My problem is how to determine whether the device memory is full at buffer creation time. While the OpenCL 1.1 specification states that clCreateBuffer returns CL_OUT_OF_RESOURCES if the device memory is full, the Nvidia OpenCL implementation does not seem to conform to this: I ran some tests and was able to overcommit the device memory at buffer creation. From what I have read, this is possible because the actual device allocation is delayed until a kernel is executed on the buffer. Once I run a kernel on the overcommitted buffer, an error message is sent to my notification function. So currently I see only two ways of determining whether the device memory is full:
1. Making some manual checks, i.e. ensuring that the combined buffer sizes are smaller than CL_DEVICE_GLOBAL_MEM_SIZE and that each allocation is smaller than CL_DEVICE_MAX_MEM_ALLOC_SIZE. However, device memory fragmentation might be a problem here: it could prevent a buffer from fitting on the device even though it theoretically should.
2. Checking whether the execution of a kernel fails on the given buffer object. However, this option seems a bit “suboptimal” for my taste :)
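To make the manual check (the first option) concrete, here is a minimal sketch of the bookkeeping I have in mind. The `mem_budget` struct and the `budget_reserve` / `budget_release` helpers are hypothetical names of my own, not part of OpenCL; the two limits are assumed to have been read beforehand via clGetDeviceInfo with CL_DEVICE_GLOBAL_MEM_SIZE and CL_DEVICE_MAX_MEM_ALLOC_SIZE. It only tracks sizes, so it cannot see fragmentation or memory used by anything outside this bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type, not part of OpenCL. The two limits are assumed
 * to come from clGetDeviceInfo (CL_DEVICE_GLOBAL_MEM_SIZE and
 * CL_DEVICE_MAX_MEM_ALLOC_SIZE), queried elsewhere. */
typedef struct {
    uint64_t global_mem_size; /* total device global memory in bytes */
    uint64_t max_alloc_size;  /* largest single allocation in bytes */
    uint64_t committed;       /* bytes already accounted for by our buffers */
} mem_budget;

/* Returns true and records the allocation if it passes both size checks;
 * returns false and leaves the budget unchanged otherwise. */
static bool budget_reserve(mem_budget *b, uint64_t size)
{
    assert(b != NULL);
    if (size == 0 || size > b->max_alloc_size)
        return false; /* violates the per-allocation limit */
    if (b->committed > b->global_mem_size)
        return false; /* inconsistent bookkeeping, refuse defensively */
    if (size > b->global_mem_size - b->committed)
        return false; /* would exceed total memory (overflow-safe compare) */
    b->committed += size;
    return true;
}

/* Call when the corresponding buffer has been released. */
static void budget_release(mem_budget *b, uint64_t size)
{
    assert(b != NULL && size <= b->committed);
    b->committed -= size;
}
```

The idea would be to call `budget_reserve` right before clCreateBuffer and skip the creation if it returns false, then call `budget_release` after the buffer is released. A successful reservation is still no guarantee that the later device allocation succeeds.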
So my question is: is there another - more elegant - way of determining at buffer creation time whether the given memory block fits onto the device?