I have the following code:
// Timing block: measures per-GPU buffer allocation and host->device upload for
// two devices using host-side clock().
// NOTE(review): clock() measures CPU time, not wall-clock time; prefer
// gettimeofday()/cudaEvent timing for device work. Also, no CUDA return codes
// are checked anywhere below — a failed cudaMalloc or cudaMemcpy would go
// unnoticed. Wrap each call in an error-checking macro.
time_start_block = clock();
// NOTE(review): the first CUDA runtime call on a device triggers one-time
// driver/context initialization. With GPU persistence mode off (common on
// older multi-GPU Tesla S1070 hosts), that initialization can take tens of
// seconds per device — the measured ~45 s is presumably dominated by context
// creation here and at the cudaSetDevice(1) below, not by the allocations or
// copies themselves. Confirm by timing a warm-up cudaFree(0) on each device
// before this block.
cudaSetDevice(0);
// Device 0 buffers: an N x N matrix plus two length-N vectors
// (N assumed to be defined earlier in the file — TODO confirm).
float *matrix_device_0;
cudaMalloc((void **) &matrix_device_0, N * N * sizeof(float) );
float *phi_device_0;
cudaMalloc((void **) &phi_device_0, N * sizeof(float) );
float *vector_TEMP_device_0;
cudaMalloc((void **) &vector_TEMP_device_0, N * sizeof(float) );
// Synchronous upload of matrix_A to device 0 (blocks until the copy completes).
cudaMemcpy(matrix_device_0, matrix_A, N * N * sizeof(float), cudaMemcpyHostToDevice);
// Switch to device 1; same allocation pattern, but this matrix receives
// matrix_B. First runtime call on device 1 pays its context-creation cost too.
cudaSetDevice(1);
float *matrix_device_1;
cudaMalloc((void **) &matrix_device_1, N * N * sizeof(float) );
float *phi_device_1;
cudaMalloc((void **) &phi_device_1, N * sizeof(float) );
float *vector_TEMP_device_1;
cudaMalloc((void **) &vector_TEMP_device_1, N * sizeof(float) );
cudaMemcpy(matrix_device_1, matrix_B, N * N * sizeof(float), cudaMemcpyHostToDevice);
time_end_block = clock();
// Elapsed CPU seconds for the whole setup. The cudaMemcpy calls are
// synchronous, so they are fully included in the interval; any context
// initialization triggered above is included as well.
time_dif_block = (float)(time_end_block-time_start_block) / CLOCKS_PER_SEC;
The measured execution time is approximately 45 seconds (Tesla S1070).
Why does it take so long?