cudaMemset question

eyalhir74 · October 29, 2008, 2:14pm

Hi,
I'm allocating a float array of 84,044,410 elements using cudaMalloc, for a total of 336,177,640 bytes (~320 MB).
I then zero its contents using cudaMemset:
CUDA_SAFE_CALL( cudaMemset( pDeviceOutput1, 0, iRawDataSize * sizeof( float ) ) ); //iRawDataSize == 84,044,410
This takes ~40 ms. Is that plausible? Reasonable? Is there a better or faster way to do this?

thanks in advance
eyal

MisterAnderson42 · October 29, 2008, 4:56pm

Let’s see:
(336177640 bytes / 40e-3 seconds) / (1024^3 bytes/GiB) ≈ 7.83 GiB/s

What hardware are you running on? That is roughly 1/10th of the bandwidth available on an 8800 GTX.

That said, I do recall another user on the forums finding a similar performance problem with cudaMemset before. You could simply write a kernel that writes zeros to all the floats in a fully coalesced manner to get the full bandwidth of the device.
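
For reference, a minimal sketch of what such a kernel might look like. The names zeroFloats and zeroFloatsOnDevice are hypothetical, and the launch configuration (256 threads per block, at most 4096 blocks) is an assumption rather than a tuned value. Consecutive threads write consecutive floats, which is what makes the stores coalesced, and the grid-stride loop lets a fixed-size grid cover an array of any length.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread starts at its global index and advances by
// the total number of threads in the grid (a grid-stride loop).
// Neighbouring threads touch neighbouring floats, so the writes coalesce.
__global__ void zeroFloats(float *data, size_t n)
{
    size_t stride = (size_t)blockDim.x * gridDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        data[i] = 0.0f;
}

// Hypothetical host wrapper: launches the kernel and waits for it to finish.
// The block and grid sizes are assumptions; tune them for your device.
cudaError_t zeroFloatsOnDevice(float *d_data, size_t n)
{
    if (n == 0)
        return cudaSuccess;

    const int threads = 256;
    const size_t needed = (n + threads - 1) / threads;   // blocks for one element per thread
    const int blocks = (int)(needed < 4096 ? needed : 4096);  // cap; the loop covers the rest

    zeroFloats<<<blocks, threads>>>(d_data, n);

    cudaError_t err = cudaGetLastError();   // catches launch-configuration errors
    if (err != cudaSuccess)
        return err;
    return cudaDeviceSynchronize();         // catches errors raised during execution
}
```

With the buffer from the original post, the call would be zeroFloatsOnDevice(pDeviceOutput1, iRawDataSize). To compare it against cudaMemset, time both with cudaEventRecord / cudaEventElapsedTime around each call; whether the kernel actually wins depends on the driver and hardware, so measure before switching.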

eyalhir74 · October 29, 2008, 5:35pm

Hi,

Thanks for the response. I'm using the GeForce GTX 280.

I’ll try using a kernel.

thanks

eyal
