Complex structs in CUDA

Hello,
I am a beginner in CUDA. I am trying to copy a complex struct containing pointers and arrays to the device, but I get illegal memory accesses when I use the arrays. I am using cudaMalloc() and cudaMemcpy().
How can I correctly copy the struct to the device without errors?

This is the struct:

struct solver { 
	int* DB, nVars, nClauses, mem_used, mem_fixed, mem_max, maxLemmas, nLemmas,
		* buffer, nConflicts, * model, * reason, * falseStack, * _false, * first,
		* forced, * processed, * assigned, * next, * prev, head, res, fast, slow;
};

This is the code where I get the errors (lines 3, 6 and 7):

__device__
int propagate(struct solver* S, int* _reason, int* _processed) {
	int forced = S->reason[abs(*S->processed)];    
	while (S->processed < S->assigned) {  
		printf("processed loop\n");
		int lit = *(S->processed++);        
		int* watch = &S->first[lit];
                ...

Copying a struct of arrays to device memory is no different from copying independent arrays: each pointer member of the struct needs its own cudaMalloc, plus a cudaMemcpy for its data. Could you post the code you are trying to implement?

Here is an example:

typedef struct fCUB_t {
    float *       d_minScorePerBlock {};
    float *       d_minScoreSorted {};
    unsigned int *d_minComboPerBlock {};
    unsigned int *d_minComboSorted {};
    void *        d_temp_storage     = NULL;
    size_t        temp_storage_bytes = 0;
} cubData;
cubData cubStruct {};

checkCudaErrors( cudaMalloc( reinterpret_cast<void **>( &cubStruct.d_minScorePerBlock ),
                                     numBlocksRequired * sizeof( float ) ) );
checkCudaErrors( cudaMalloc( reinterpret_cast<void **>( &cubStruct.d_minComboPerBlock ),
                                     numBlocksRequired * sizeof( unsigned int ) ) );
checkCudaErrors( cudaMalloc( reinterpret_cast<void **>( &cubStruct.d_minScoreSorted ),
                                     numBlocksRequired * sizeof( float ) ) );
checkCudaErrors( cudaMalloc( reinterpret_cast<void **>( &cubStruct.d_minComboSorted ),
                                     numBlocksRequired * sizeof( unsigned int ) ) );

Also check your code with cuda-memcheck (compute-sanitizer in recent CUDA toolkits), as it will help you narrow down the illegal accesses.
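
The example above only allocates the members. A kernel that takes a pointer to the struct, as propagate() does, also needs the struct itself in device memory with its pointer members holding device addresses. Below is a minimal sketch of that deep copy. MiniSolver, sumReason and runDeepCopyDemo are hypothetical names invented for illustration; MiniSolver is a reduced stand-in for struct solver, not the original code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical, reduced stand-in for `struct solver`: one scalar, one array.
struct MiniSolver {
    int  nVars;
    int* reason;  // points to nVars ints
};

// Reads the array through the struct pointer, the way propagate() does.
__global__ void sumReason(const MiniSolver* S, int* out) {
    int total = 0;
    for (int i = 0; i < S->nVars; ++i) total += S->reason[i];
    *out = total;
}

// Returns the sum computed on the device, or -1 if any CUDA call fails.
int runDeepCopyDemo() {
    int h_values[4] = {1, 2, 3, 4};
    MiniSolver h_S = {4, h_values};
    size_t nbytes = h_S.nVars * sizeof(int);

    int* d_reason = nullptr;
    MiniSolver* d_S = nullptr;
    int* d_out = nullptr;
    int result = -1;

    // 1. Allocate and fill the array that the member points to.
    bool ok = cudaMalloc((void**)&d_reason, nbytes) == cudaSuccess;
    ok = ok && cudaMemcpy(d_reason, h_S.reason, nbytes, cudaMemcpyHostToDevice) == cudaSuccess;

    // 2. Stage a host copy whose pointer member holds the DEVICE address.
    MiniSolver staged = h_S;
    staged.reason = d_reason;

    // 3. Copy the patched struct itself to the device.
    ok = ok && cudaMalloc((void**)&d_S, sizeof(MiniSolver)) == cudaSuccess;
    ok = ok && cudaMemcpy(d_S, &staged, sizeof(MiniSolver), cudaMemcpyHostToDevice) == cudaSuccess;
    ok = ok && cudaMalloc((void**)&d_out, sizeof(int)) == cudaSuccess;

    if (ok) {
        sumReason<<<1, 1>>>(d_S, d_out);
        ok = cudaGetLastError() == cudaSuccess;
        // This blocking copy also waits for the kernel to finish.
        ok = ok && cudaMemcpy(&result, d_out, sizeof(int), cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(d_out);
    cudaFree(d_S);
    cudaFree(d_reason);
    return ok ? result : -1;
}

int main() {
    printf("sum = %d\n", runDeepCopyDemo());
    return 0;
}
```

The same three steps would be repeated for every pointer member. If a member points into the middle of another array instead of owning its own allocation (an assumption, the thread does not say whether processed or assigned do), it has to be rebuilt on the device side as the device base pointer plus the same offset.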

Thank you a lot!

Actually I am doing this:

...

	gpuErrchk(cudaMalloc((void**)&_reason, nbytes));
	gpuErrchk(cudaMalloc((void**)&_processed, nbytes));

	gpuErrchk(cudaMemcpy(_reason, S.reason, nbytes, cudaMemcpyHostToDevice));
	gpuErrchk(cudaMemcpy(_processed, S.processed, nbytes, cudaMemcpyHostToDevice));

        ...

The problem is that the arrays _reason and _processed do not have the correct values.

That looks correct to me. I suggest starting with a smaller example that just performs cudaMemcpy to confirm you’re doing everything correctly. You might find one at the NVIDIA Developer blog https://devblogs.nvidia.com/.
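
As a starting point, a round-trip test could look like the sketch below: copy a host array to the device, copy it straight back, and compare. roundTripMatches is a hypothetical helper and the test data is made up. The thread does not show how nbytes is computed, so here it is assumed to be the element count times sizeof(int).

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Copies n ints to the device and back, then compares them with the source.
// Returns false if any CUDA call fails or any element differs.
bool roundTripMatches(const int* h_src, int n) {
    size_t nbytes = n * sizeof(int);
    std::vector<int> h_back(n, 0);
    int* d_buf = nullptr;

    bool ok = cudaMalloc((void**)&d_buf, nbytes) == cudaSuccess;
    ok = ok && cudaMemcpy(d_buf, h_src, nbytes, cudaMemcpyHostToDevice) == cudaSuccess;
    ok = ok && cudaMemcpy(h_back.data(), d_buf, nbytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(d_buf);

    // Stop at the first mismatch.
    for (int i = 0; ok && i < n; ++i) ok = (h_back[i] == h_src[i]);
    return ok;
}

int main() {
    int reason[5] = {0, 3, -1, 7, 2};  // made-up test data
    printf("round trip %s\n", roundTripMatches(reason, 5) ? "matches" : "differs");
    return 0;
}
```

If the round trip matches but the kernel still sees wrong values, the copy itself is probably fine, and the thing to check is whether the kernel reads the new device buffers or still dereferences the host pointers stored inside the struct.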

Also, you might want to look at managed memory https://devblogs.nvidia.com/maximizing-unified-memory-performance-cuda/.
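
For comparison, here is a minimal managed-memory sketch, assuming a GPU and driver that support unified memory. ManagedSolver, negateReason and runManagedDemo are hypothetical names; the struct is a reduced stand-in, not the original solver. Both the struct and its array come from cudaMallocManaged, so the same pointers are valid on the host and on the device and no pointer patching is needed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical, reduced stand-in for `struct solver`.
struct ManagedSolver {
    int  nVars;
    int* reason;
};

// Modifies the array in place through the struct pointer.
__global__ void negateReason(ManagedSolver* S) {
    for (int i = 0; i < S->nVars; ++i) S->reason[i] = -S->reason[i];
}

// Returns the last array element after the kernel ran, or 0 on failure.
int runManagedDemo() {
    ManagedSolver* S = nullptr;
    if (cudaMallocManaged((void**)&S, sizeof(ManagedSolver)) != cudaSuccess) return 0;

    S->nVars = 4;
    S->reason = nullptr;
    if (cudaMallocManaged((void**)&S->reason, S->nVars * sizeof(int)) != cudaSuccess) {
        cudaFree(S);
        return 0;
    }

    // Fill the array directly from host code: 1, 2, 3, 4.
    for (int i = 0; i < S->nVars; ++i) S->reason[i] = i + 1;

    negateReason<<<1, 1>>>(S);
    // Wait for the kernel before touching managed memory on the host again.
    bool ok = cudaDeviceSynchronize() == cudaSuccess;
    int last = ok ? S->reason[S->nVars - 1] : 0;

    cudaFree(S->reason);
    cudaFree(S);
    return last;
}

int main() {
    printf("last element = %d\n", runManagedDemo());
    return 0;
}
```

This trades explicit copies for on-demand page migration, which is usually simpler to get right at first; the linked article covers the performance side.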