This is probably a rhetorical question, but… Is there a way to control where a cudaMalloc allocates data relative to offset zero in GPU RAM?
The problem is that I have very big buffers to allocate on the GPU. Consider the following scenario:
- Allocate 1.5GB pointerA.
- Allocate 700MB pointerB.
- Allocate 700MB pointerC.
- Allocate 700MB pointerD.
- Allocate some various small size pointers.
For a C1060 that should fit, however depending on the positions of the arrays in the 4GB address space, it might fail.
Is there a way to ensure this fits into memory, other than making the arrays smaller by dividing them into chunks?
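In case it helps frame the question, here is a minimal sketch of the scenario using the CUDA runtime API. It assumes nothing beyond standard calls: `cudaMemGetInfo` reports the free/total device memory before each request, and the buffers are requested largest-first, which in my understanding reduces (but does not eliminate) the chance that fragmentation leaves no contiguous 1.5 GB region. The sizes and the `reportFree` helper are just for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative helper: print free/total device memory before an allocation.
static void reportFree(const char *label) {
    size_t freeB = 0, totalB = 0;
    cudaMemGetInfo(&freeB, &totalB);
    printf("%s: %zu MB free of %zu MB total\n", label, freeB >> 20, totalB >> 20);
}

int main() {
    const size_t MB = 1024u * 1024u;
    void *pA = NULL, *pB = NULL, *pC = NULL, *pD = NULL;

    // Request the largest buffer first: a fresh heap is least fragmented,
    // so the 1.5 GB block has the best chance of finding a contiguous hole.
    reportFree("before A (1.5 GB)");
    if (cudaMalloc(&pA, 1536 * MB) != cudaSuccess) { printf("A failed\n"); return 1; }

    reportFree("before B (700 MB)");
    if (cudaMalloc(&pB, 700 * MB) != cudaSuccess) { printf("B failed\n"); return 1; }

    reportFree("before C (700 MB)");
    if (cudaMalloc(&pC, 700 * MB) != cudaSuccess) { printf("C failed\n"); return 1; }

    reportFree("before D (700 MB)");
    if (cudaMalloc(&pD, 700 * MB) != cudaSuccess) { printf("D failed\n"); return 1; }

    // ... small allocations and kernels would go here ...

    cudaFree(pD); cudaFree(pC); cudaFree(pB); cudaFree(pA);
    return 0;
}
```

Even with largest-first ordering, `cudaMemGetInfo` can report more free memory than the largest allocatable contiguous block, which is exactly the failure mode I am worried about.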