Problem with allocating big buffer

Hi all,

I’ve bought a GTX 460 2 GB card and am working with CUDA.

I’ve got a problem: I can’t allocate a buffer bigger than 700 MB. If I allocate small buffers instead, for example 8 KB each, I manage to fill all of the video memory. But I just can’t allocate a single buffer bigger than 700 MB.

At first I thought this was some kind of driver bug (i.e. the driver allocating a piece of memory for another application somewhere in the middle). But that seems too silly for a driver.

So what could the reasons be?

  1. Can a piece of video memory be bad and marked as broken by the driver, so that the driver doesn’t allocate it? How can I check this?
  2. I’m using OpenGL interoperability. Could that be the cause? Although the first thing I do is allocate this big buffer, and only then do I allocate OpenGL resources, so it doesn’t look like the cause.
  3. Or maybe it’s a bug in the driver?
  4. Some other CUDA restriction?
  5. Anything else?

P.S. Windows Aero is disabled, and all programs except Visual Studio are closed…

Thanks.

I also think so.

It’s not a driver bug, but rather a Windows feature.

It’s a limit of the Windows display driver model (WDDM), which has some weird and mostly unknown way of calculating the maximum buffer size. As far as I know, it has something to do with Windows implementing virtual memory on the video card to ensure that it has enough memory for compositing the display. For me the limit is ~900 MB; I ran into it when I couldn’t allocate a 1.5 GB buffer on a 4 GB Tesla, which definitely isn’t used for anything else (no graphics output at all).
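
The limit differs between machines (about 700 MB in the first post, about 900 MB here), so it may be worth measuring it directly. A rough sketch, assuming that success of a single `cudaMalloc` is monotonic in the requested size; the helper name `largest_single_alloc` and the 1 MiB resolution are made up for this example:

```cuda
#include <cassert>
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: binary-search the largest size (in bytes) that one
// cudaMalloc call accepts right now, to within `resolution` bytes.
// Returns 0 if the device cannot be queried.
size_t largest_single_alloc(size_t resolution = size_t(1) << 20) {
    assert(resolution > 0);

    size_t free_bytes = 0, total_bytes = 0;
    if (cudaMemGetInfo(&free_bytes, &total_bytes) != cudaSuccess) {
        cudaGetLastError();        // clear the recorded error
        return 0;
    }

    size_t lo = 0;                 // largest size known to succeed
    size_t hi = total_bytes + 1;   // smallest size assumed to fail
    while (hi - lo > resolution) {
        size_t mid = lo + (hi - lo) / 2;
        void* p = nullptr;
        if (cudaMalloc(&p, mid) == cudaSuccess) {
            cudaFree(p);           // only probing, so release immediately
            lo = mid;
        } else {
            cudaGetLastError();    // clear the recorded allocation error
            hi = mid;
        }
    }
    return lo;
}
```

Calling `largest_single_alloc()` once at startup, before any other allocation, gives a number to compare against the free memory that `cudaMemGetInfo` reports.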

You should be able to allocate multiple buffers whose total size is close to 2 GB, if that helps.
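
A minimal sketch of that approach, assuming each buffer stays below whatever per-allocation limit applies on your machine; `alloc_in_chunks` and `free_chunks` are hypothetical helper names, not CUDA API functions:

```cuda
#include <cassert>
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: allocate `total_bytes` of device memory as several
// buffers of at most `chunk_bytes` each. Returns an empty vector on failure.
std::vector<void*> alloc_in_chunks(size_t total_bytes, size_t chunk_bytes) {
    assert(chunk_bytes > 0);
    std::vector<void*> chunks;
    size_t remaining = total_bytes;
    while (remaining > 0) {
        size_t n = remaining < chunk_bytes ? remaining : chunk_bytes;
        void* p = nullptr;
        if (cudaMalloc(&p, n) != cudaSuccess) {
            cudaGetLastError();                  // clear the recorded error
            for (void* q : chunks) cudaFree(q);  // roll back earlier chunks
            chunks.clear();
            return chunks;
        }
        chunks.push_back(p);
        remaining -= n;
    }
    return chunks;
}

// Release every buffer returned by alloc_in_chunks.
void free_chunks(std::vector<void*>& chunks) {
    for (void* p : chunks) cudaFree(p);
    chunks.clear();
}
```

For example, `alloc_in_chunks(size_t(1536) << 20, size_t(512) << 20)` would request 1.5 GB as three 512 MB buffers.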

There is a program called GPU-Z that can show you exactly how much free memory you have on the card (among other things).
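
The same numbers can be read from inside the program with the runtime call `cudaMemGetInfo`. A small sketch (the wrapper name `query_device_memory` is made up here). Note that it reports total free memory, which, as this thread shows, can be more than a single allocation is able to obtain:

```cuda
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical wrapper: fetch free and total device memory in bytes and
// print them in MiB. Returns false if the runtime query fails.
bool query_device_memory(size_t* free_bytes, size_t* total_bytes) {
    assert(free_bytes != nullptr && total_bytes != nullptr);
    cudaError_t err = cudaMemGetInfo(free_bytes, total_bytes);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaMemGetInfo failed: %s\n",
                     cudaGetErrorString(err));
        return false;
    }
    std::printf("free: %zu MiB, total: %zu MiB\n",
                *free_bytes >> 20, *total_bytes >> 20);
    return true;
}
```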

Linux does allow you to allocate the full memory, by the way, if that is any help.

As tmurray pointed out in another thread, if you have a Tesla (not GeForce, sadly), you can use the “TCC” driver and avoid all the Windows graphics driver limitations and overhead.

Thanks guys, now I understand.
I think the way to go for me is multiple buffers, although that complicates things…
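
One possible way to keep the complication contained is to hide the chunks behind a small index translation, so kernels still see one logical array. This is only a sketch under assumptions: all chunks hold the same number of `float` elements, the chunk pointers have already been copied into a device-side pointer table, and the names `ChunkedView`, `chunks_needed`, `chunk_index`, `chunk_offset`, and `fill` are invented for illustration:

```cuda
#include <cassert>
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical view over one logical array stored as equally sized chunks
// (the last chunk may be only partly used).
struct ChunkedView {
    float* const* chunks;    // device array with one device pointer per chunk
    size_t elems_per_chunk;  // capacity of every chunk, in elements
    size_t total_elems;      // logical length of the whole array
};

// Host side: how many chunks are needed to hold `total_elems` elements.
inline size_t chunks_needed(size_t total_elems, size_t elems_per_chunk) {
    assert(elems_per_chunk > 0);
    return (total_elems + elems_per_chunk - 1) / elems_per_chunk;
}

// Which chunk holds logical element i.
__host__ __device__ inline size_t chunk_index(size_t i, size_t elems_per_chunk) {
    return i / elems_per_chunk;
}

// Position of logical element i inside its chunk.
__host__ __device__ inline size_t chunk_offset(size_t i, size_t elems_per_chunk) {
    return i % elems_per_chunk;
}

// Example kernel: a grid-stride loop that writes `value` to every logical
// element, looking up the owning chunk for each index.
__global__ void fill(ChunkedView v, float value) {
    size_t stride = static_cast<size_t>(gridDim.x) * blockDim.x;
    size_t i = static_cast<size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    for (; i < v.total_elems; i += stride) {
        float* chunk = v.chunks[chunk_index(i, v.elems_per_chunk)];
        chunk[chunk_offset(i, v.elems_per_chunk)] = value;
    }
}
```

The cost is one division and one modulo per access, plus an extra pointer load from the table.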