I’m studying GPUs for my MSc degree and I need to learn a fast way to transfer a large amount of data (about 9 GB, I suppose) from host memory to GPU memory. I think the PCI bus bandwidth and the heap size of the application are the bottlenecks. Is that correct? My current implementation is as follows:
- My database is an integer array stored on the hard drive;
- To copy this database to the GPU, I read as many bytes as I can fit in the application's heap memory, copy them to the GPU, and repeat this step until the whole array is in GPU memory;
- I’m not using any parallelism to copy the data to the GPU.
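The steps above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the function name `upload_sync`, the parameters `path`, `n` and `chunk`, and the use of `malloc`/`fread` for the host side are hypothetical placeholders, and `d_dst` is assumed to be a device buffer already allocated with `cudaMalloc` and large enough for `n` integers.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: reads `n` ints from the file at `path` in chunks of
// `chunk` ints (chunk > 0) and copies each chunk to `d_dst` with a blocking
// cudaMemcpy. The host buffer is ordinary pageable memory from malloc.
// Returns true if every read and copy succeeded.
bool upload_sync(const char* path, int* d_dst, size_t n, size_t chunk) {
    FILE* f = std::fopen(path, "rb");
    if (!f) return false;

    int* h_buf = static_cast<int*>(std::malloc(chunk * sizeof(int)));
    bool ok = (h_buf != nullptr);

    for (size_t off = 0; ok && off < n; off += chunk) {
        size_t len = std::min(chunk, n - off);  // last chunk may be shorter
        // Read one chunk from disk, then copy it; the two steps never overlap.
        ok = std::fread(h_buf, sizeof(int), len, f) == len &&
             cudaMemcpy(d_dst + off, h_buf, len * sizeof(int),
                        cudaMemcpyHostToDevice) == cudaSuccess;
    }

    std::free(h_buf);
    std::fclose(f);
    return ok;
}
```

In this form the disk read and the transfer over the bus happen strictly one after the other, so the total time is roughly the sum of both.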
Considering my case and the trade-offs involved, are there any optimizations I should implement to get the best bandwidth when transferring data to GPU memory? Does pinned memory improve performance with this amount of data? Can I use parallelism to copy more than one chunk of the array to GPU memory at a time? Is it possible to increase the heap size with nvcc? Has anybody had any of these problems?
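To make the pinned-memory and parallelism questions concrete, here is a minimal sketch of the kind of variant being asked about, not a measured or definitive solution. It assumes two pinned host buffers allocated with `cudaMallocHost` and two streams, so that reading the next chunk from disk can overlap with the `cudaMemcpyAsync` of the previous one. The name `upload_pinned_async` and its parameters are hypothetical, and `d_dst` is again assumed to be a device buffer of at least `n` integers. Whether this actually helps depends on whether the disk or the bus is the slower side.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <cstdio>

// Hypothetical helper: double-buffered upload using pinned host memory.
// Chunk k uses buffer/stream k % 2, so the host can fill one buffer while
// the other is still being transferred. Requires chunk > 0.
bool upload_pinned_async(const char* path, int* d_dst, size_t n, size_t chunk) {
    FILE* f = std::fopen(path, "rb");
    if (!f) return false;

    int* h_buf[2] = {nullptr, nullptr};
    cudaStream_t stream[2] = {nullptr, nullptr};
    bool ok = true;
    for (int i = 0; i < 2; ++i) {
        // Pinned (page-locked) memory is needed for truly asynchronous copies.
        ok = ok && cudaMallocHost(reinterpret_cast<void**>(&h_buf[i]),
                                  chunk * sizeof(int)) == cudaSuccess;
        ok = ok && cudaStreamCreate(&stream[i]) == cudaSuccess;
    }

    size_t k = 0;
    for (size_t off = 0; ok && off < n; off += chunk, ++k) {
        int b = static_cast<int>(k % 2);
        size_t len = std::min(chunk, n - off);  // last chunk may be shorter
        // Wait for the earlier copy out of this buffer before overwriting it,
        // then read the next chunk and queue its transfer without blocking.
        ok = cudaStreamSynchronize(stream[b]) == cudaSuccess &&
             std::fread(h_buf[b], sizeof(int), len, f) == len &&
             cudaMemcpyAsync(d_dst + off, h_buf[b], len * sizeof(int),
                             cudaMemcpyHostToDevice, stream[b]) == cudaSuccess;
    }

    // Make sure every queued copy has finished before releasing the buffers.
    ok = (cudaDeviceSynchronize() == cudaSuccess) && ok;

    for (int i = 0; i < 2; ++i) {
        if (h_buf[i]) cudaFreeHost(h_buf[i]);
        if (stream[i]) cudaStreamDestroy(stream[i]);
    }
    std::fclose(f);
    return ok;
}
```

Note that the pinned buffers here only need to hold two chunks, not the whole 9 GB, so the chunk size can be chosen independently of the application's heap limit.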
Thanks to all.