A thrust allocate error?

batman216 · November 20, 2022, 6:15am

I am using the NVIDIA HPC SDK (2022) to compile the following code, whose basic purpose is to sum an N*M matrix into a vector of length N.

#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/execution_policy.h>
#include <thrust/reduce.h>
#include <thrust/fill.h>

constexpr unsigned int N = 2048, M = 2048;

int main(int argc, char* argv[]) {

	thrust::device_vector<double> g_vec1(N*M);
	thrust::device_vector<double> g_vec2(N);
	thrust::fill(thrust::device, g_vec1.begin(),g_vec1.end(),1.);

	
	thrust::device_vector<thrust::
			device_vector<double>::iterator> g_it_vec(N);

	for (int i=0; i<N; i++)
		g_it_vec[i] = g_vec1.begin() + i*M;

			
	thrust::transform(g_it_vec.begin(),g_it_vec.end(),g_vec2.begin(),
		[](const auto& it) {
			return thrust::reduce(thrust::device,
				it, it+M,0.);});

}

When I run this code on a 3080 Ti, an error occurs when M > 2048 (or M > 1024 when I use complex values):

temporary_buffer::allocate: get_temporary_buffer failed
…
temporary_buffer::allocate: get_temporary_buffer failed
terminate called after throwing an instance of ‘thrust::system::system_error’
what(): transform: failed to synchronize: cudaErrorLaunchFailure: unspecified launch failure
Aborted (core dumped)

How did this happen? Is it related to the maximum of 1024 threads per block?
Or is there a standard way to reduce a matrix (2D array)?

Robert_Crovella · November 21, 2022, 10:16pm

I think you’ve been given the explanation in your cross-posting.
