Hi!
I would like to allocate the largest possible chunk of CUDA global memory.
As far as I know, the framebuffer needs some of the global memory, so I cannot allocate all of it to my application.
So I did something like this:
Using cudaGetDeviceProperties() I found the total global memory on the installed card: TotalGlobalMemory.
Next I decided to set aside 100 MB for the framebuffer: FramebufferMemorySize = 100 * 1024 * 1024.
So, MemorySizeToAllocate = TotalGlobalMemory - FramebufferMemorySize;
I then call cudaMalloc(&pCudaMemory, MemorySizeToAllocate).
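In code it is roughly this (a simplified sketch of the steps above; most error handling omitted):

    #include <cuda_runtime.h>
    #include <stdio.h>

    int main(void)
    {
        // Step 1: query the total global memory on device 0.
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);
        size_t TotalGlobalMemory = prop.totalGlobalMem;

        // Step 2: reserve a fixed 100 MB for the framebuffer (my guess).
        size_t FramebufferMemorySize = 100 * 1024 * 1024;
        size_t MemorySizeToAllocate = TotalGlobalMemory - FramebufferMemorySize;

        // Step 3: grab everything that is left in one cudaMalloc.
        void *pCudaMemory = NULL;
        cudaError_t err = cudaMalloc(&pCudaMemory, MemorySizeToAllocate);
        if (err != cudaSuccess)
            printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));

        /* ... kernels work out of pCudaMemory ... */

        cudaFree(pCudaMemory);
        return 0;
    }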
This works fine on my development machine (a 9800 GTX driving two 24" monitors).
But when I run the same code on a different machine (a GTX 280 driving one 30" monitor and one 24" monitor), either the cudaMalloc fails or I see scribbling over a few pixels at the top of the screen.
If I change the FramebufferMemorySize to 200 * 1024 * 1024, the application runs on the second machine without a hitch.
Does this have anything to do with the different monitor sizes (i.e., a larger framebuffer), or is there something bigger at play here?
How do I determine what FramebufferMemorySize should be so that my code can run on any configuration?
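Would querying the free memory at startup be the better approach? I'm imagining something like the sketch below, but I'm not sure cudaMemGetInfo() does what I expect here, and the 32 MB margin is a completely arbitrary guess on my part:

    #include <cuda_runtime.h>
    #include <stdio.h>

    int main(void)
    {
        size_t freeMem = 0, totalMem = 0;

        // Ask the runtime how much global memory is currently unused.
        // Presumably this already accounts for the framebuffer, WPF, etc.
        cudaMemGetInfo(&freeMem, &totalMem);

        // Leave a small safety margin on top of whatever is in use
        // (32 MB here is a guess, not a recommendation).
        size_t margin = 32 * 1024 * 1024;
        size_t toAllocate = (freeMem > margin) ? freeMem - margin : 0;

        void *pCudaMemory = NULL;
        cudaError_t err = cudaMalloc(&pCudaMemory, toAllocate);
        printf("free=%zu total=%zu requested=%zu -> %s\n",
               freeMem, totalMem, toAllocate, cudaGetErrorString(err));

        cudaFree(pCudaMemory);
        return 0;
    }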
Also, I am using WPF for my application's UI. I have heard that WPF uses graphics card memory as well. If that is true, how do I make sure that WPF and my CUDA allocations do not overlap?
I also plan to run this on a CUDA-enabled laptop that might have only 128 MB of global memory. Taking 100 MB out of that leaves me with practically nothing.
Let me know if I am not clear or if you need more information from my side.
All help appreciated!
Thank you!!