cudaMallocHost fails with out of memory error

I seem to have found another weird quirk in CUDA 1.1. My program has just started getting out-of-memory errors from cudaMallocHost, but only intermittently. Is there some way to monitor how much memory is available for these allocations? Is it possible I have hit some other limit? I have several buffers allocated this way, but they total less than 4MB. I do free and reallocate them from time to time; could that be going wrong somewhere? How do I troubleshoot the problem?
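For reference, here is a minimal sketch of the kind of checking that might help narrow this down. It is not my actual code. The wrapper names (`checkedMallocHost`, `checkedFreeHost`), the `g_pinnedBytes` counter, and the 1 MB buffer size are all made up for illustration. The idea is to look for an error left over from an earlier call before each allocation (runtime calls can return error codes from previous asynchronous launches), and to keep a running total of pinned bytes so a leak in the free/reallocate cycle would show up.

```cuda
#include <cstdio>
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical counter: pinned bytes currently allocated through the wrappers below.
static size_t g_pinnedBytes = 0;

// Allocate pinned host memory and log enough context to diagnose a failure.
static void* checkedMallocHost(size_t bytes, const char* tag)
{
    // Report any error left over from earlier CUDA calls or kernel launches,
    // so it is not mistaken for a failure of this allocation.
    cudaError_t pending = cudaGetLastError();
    if (pending != cudaSuccess)
        fprintf(stderr, "[%s] error pending before alloc: %s\n",
                tag, cudaGetErrorString(pending));

    void* p = NULL;
    cudaError_t err = cudaMallocHost(&p, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "[%s] cudaMallocHost(%lu bytes) failed: %s "
                        "(%lu pinned bytes already tracked)\n",
                tag, (unsigned long)bytes, cudaGetErrorString(err),
                (unsigned long)g_pinnedBytes);
        return NULL;
    }
    g_pinnedBytes += bytes;
    return p;
}

// Free pinned host memory and update the running total only if the free succeeded.
static void checkedFreeHost(void* p, size_t bytes, const char* tag)
{
    if (p == NULL)
        return;
    cudaError_t err = cudaFreeHost(p);
    if (err != cudaSuccess) {
        fprintf(stderr, "[%s] cudaFreeHost failed: %s\n",
                tag, cudaGetErrorString(err));
        return;
    }
    g_pinnedBytes -= bytes;
}

int main()
{
    const size_t bytes = 1 << 20;  // 1 MB, an arbitrary size for this sketch

    // Mimic the free-and-reallocate pattern described above.
    for (int i = 0; i < 100; ++i) {
        void* buf = checkedMallocHost(bytes, "loop");
        if (buf == NULL)
            return 1;
        checkedFreeHost(buf, bytes, "loop");
    }

    printf("pinned bytes still tracked: %lu\n", (unsigned long)g_pinnedBytes);
    return 0;
}
```

If the tracked total keeps growing, that would point at a missing or failing cudaFreeHost. If an error is already pending before the allocation, the out-of-memory report may really belong to something that ran earlier.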