memory distribution question

captainp · February 22, 2008, 7:33am

I have about 10,000 8×8 matrices to invert. I implemented the function first and am now working on optimizations. The first optimization I am looking at is changing from generic mallocs to cudaMallocPitch. I have a question about cudaMallocPitch, which I seem to have gotten wrong, because the way I tried it doesn't output the correct results. First question: how do you lay out host memory so that it is in the correct form for a cudaMemcpy2D copy to the device? Second: if I am going to have to copy the array from global memory to shared memory, what is the best way to structure the shared memory to optimize performance? Any help would be greatly appreciated.
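For concreteness, here is a minimal sketch of the setup being asked about, not a definitive answer. It assumes each 8×8 float matrix is stored as one 64-float "row" of a pitched allocation, so the host array can stay an ordinary contiguous buffer whose pitch is simply the row width in bytes. The names `CHECK`, `transposeEach`, `N_MAT` and `DIM` are made up for illustration, and the per-matrix transpose is only a placeholder for the real inversion. The shared-memory tile has one extra column of padding, a common trick for reducing bank conflicts on column-wise reads; whether it actually helps here would need to be measured.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define N_MAT 10000   // number of matrices (from the post)
#define DIM   8       // each matrix is DIM x DIM

// Hypothetical helper: abort on any CUDA runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e = (call);                                       \
        if (e != cudaSuccess) {                                       \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(e));                           \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// One block per matrix, one thread per element (DIM*DIM = 64 threads).
// The transpose is only a stand-in for the real inversion.
__global__ void transposeEach(float *d, size_t pitch)
{
    __shared__ float tile[DIM][DIM + 1];   // +1 column of padding

    // Rows of a pitched allocation are 'pitch' BYTES apart, not elements.
    float *mat = (float *)((char *)d + blockIdx.x * pitch);

    int r = threadIdx.x / DIM;
    int c = threadIdx.x % DIM;
    tile[r][c] = mat[threadIdx.x];         // consecutive threads read consecutive floats
    __syncthreads();                       // whole matrix is now in shared memory
    mat[threadIdx.x] = tile[c][r];         // column-wise read from shared memory
}

int main()
{
    const size_t rowBytes = DIM * DIM * sizeof(float);   // one matrix per row

    // Host side: plain contiguous storage, so the host pitch equals rowBytes.
    std::vector<float> h(N_MAT * DIM * DIM), out(h.size());
    for (size_t i = 0; i < h.size(); ++i) h[i] = (float)(i % 97);

    float *d = nullptr;
    size_t dPitch = 0;   // chosen by the runtime; may be larger than rowBytes
    CHECK(cudaMallocPitch((void **)&d, &dPitch, rowBytes, N_MAT));

    // Arguments: dst, dst pitch, src, src pitch, width in BYTES, height in rows.
    CHECK(cudaMemcpy2D(d, dPitch, h.data(), rowBytes,
                       rowBytes, N_MAT, cudaMemcpyHostToDevice));

    transposeEach<<<N_MAT, DIM * DIM>>>(d, dPitch);
    CHECK(cudaGetLastError());

    CHECK(cudaMemcpy2D(out.data(), rowBytes, d, dPitch,
                       rowBytes, N_MAT, cudaMemcpyDeviceToHost));
    CHECK(cudaFree(d));

    // Compare against a transpose computed on the host.
    size_t bad = 0;
    for (size_t m = 0; m < N_MAT; ++m)
        for (int r = 0; r < DIM; ++r)
            for (int c = 0; c < DIM; ++c)
                if (out[m * DIM * DIM + r * DIM + c] !=
                    h[m * DIM * DIM + c * DIM + r])
                    ++bad;
    printf("mismatches: %zu\n", bad);
    return bad != 0;
}
```

This should build with something like `nvcc sketch.cu -o sketch`. The point to notice is that both pitches and the width passed to cudaMemcpy2D are in bytes, and that only the device pointer is indexed with the pitch returned by cudaMallocPitch; mixing up bytes and element counts in either place is a frequent cause of wrong results.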
