How to reset __device__ array? cudaMemset does not seem to work

The problem is the following:

I want to set an array to zero before using it.

[codebox]
#include <stdio.h>
#include <math.h>
#include "cuda.h"

void checkCUDAError(const char *msg)
{
  cudaError_t err = cudaGetLastError();
  if( cudaSuccess != err) {
    fprintf(stderr, "Cuda error: %s: %s.\n", msg, cudaGetErrorString( err) );
    exit(EXIT_FAILURE);
  }
}

__device__ int phi[7][7][7];

__global__ void kernel()
{ }

int main(){
  cudaMemset( phi,  0  , sizeof(phi));
  checkCUDAError("cudaMemset");

  kernel<<<1,1>>>();
  checkCUDAError("Calling kernel");
}
[/codebox]

The cudaMemset call itself does not report an error, but the kernel launch that follows it does.

If cudaMemset is commented out, the kernel runs.

I want every element of phi[7][7][7] to be 0 (or 0.0f if the array is float).

Any clues?

TFM. cudaMemset, like the C standard library function it is modelled after, takes a byte-sized value and copies it into however many bytes you specify, starting at the destination memory address. So both the value and the size arguments you are passing to the function are probably incorrect.
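To make the byte-wise behaviour concrete, here is a minimal sketch (not from the original posts) that fills a small cudaMalloc'd buffer and reads the first int back. The helper names check and first_word_after_memset are made up for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call failed.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "Cuda error: %s: %s.\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Hypothetical helper: fill a 4-int device buffer byte by byte with
// byte_value, copy it back to the host and return the first int.
unsigned int first_word_after_memset(int byte_value)
{
    unsigned int *d_buf = NULL;
    unsigned int h_buf[4] = {0, 0, 0, 0};

    check(cudaMalloc((void **)&d_buf, sizeof(h_buf)), "cudaMalloc");
    // The third argument is a count of BYTES, not of elements.
    check(cudaMemset(d_buf, byte_value, sizeof(h_buf)), "cudaMemset");
    check(cudaMemcpy(h_buf, d_buf, sizeof(h_buf), cudaMemcpyDeviceToHost),
          "cudaMemcpy");
    check(cudaFree(d_buf), "cudaFree");
    return h_buf[0];
}

int main()
{
    printf("fill with 0 -> 0x%08x\n", first_word_after_memset(0));
    printf("fill with 1 -> 0x%08x\n", first_word_after_memset(1));
    return 0;
}
```

Filling with 1 sets every byte to 0x01, so each 4-byte int reads back as 0x01010101 rather than 1. Filling with 0 gives all-zero bytes, which is the value 0 for an int and also the bit pattern of 0.0f for a float, so zeroing is the one case where the byte-wise fill does what you would naively expect.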

I tried every variation of the cudaMemset call I could think of.

The kernel call still gives errors or does not run.

When I declare __device__ int phi[7][7][7], is this in device memory?

What possibilities are there? You pass a byte to fill the memory with, the number of bytes to fill, and the starting address. That is how it works. Which leads to another interesting question…

Page 106 of the 2.3 programming guide says it is. This almost guarantees that the address being passed to cudaMemset is wrong. You will probably have to call cudaGetSymbolAddress(), and maybe cudaGetSymbolSize(), to make what you are trying to do work, if it is actually possible at all (I don't know; I have not tried it).

You are right, cudaGetSymbolAddress gets the job done.

Here is the working code:

[codebox]
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "cuda.h"

void checkCUDAError(const char *msg)
{
  cudaError_t err = cudaGetLastError();
  if( cudaSuccess != err) {
    fprintf(stderr, "Cuda error: %s: %s.\n", msg, cudaGetErrorString( err) );
    exit(EXIT_FAILURE);
  }
}

__device__ float phi[7][7][7];

__global__ void kernel()
{ }

int main(){
  unsigned char *d_addr = NULL;

  /* phi is a device symbol: ask the runtime for its device address first */
  cudaGetSymbolAddress( (void**)&d_addr, "phi");
  /* cudaMemset fills bytes; all-zero bytes are also the bit pattern of 0.0f */
  cudaMemset( d_addr, 0, sizeof(phi));
  checkCUDAError("cudaMemset");

  kernel<<<1,1>>>();
  cudaThreadSynchronize();
  checkCUDAError("Calling kernel");

  cudaThreadExit();
}
[/codebox]

This works in both emulation and device code.
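A note for anyone reading this on a newer toolkit: passing the symbol name as a string was removed in CUDA 5.0, and cudaThreadSynchronize() / cudaThreadExit() were deprecated in favour of cudaDeviceSynchronize() / cudaDeviceReset(). Below is a sketch of the same approach under those assumptions. It passes the symbol itself and uses cudaGetSymbolSize() as suggested earlier in the thread. The check helper, the function name zero_phi_and_count_nonzero, and the fill-then-read-back steps are additions for illustration only, not part of the original post.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call failed.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "Cuda error: %s: %s.\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

__device__ float phi[7][7][7];

__global__ void kernel() { }

// Hypothetical helper: dirty phi, zero it through its symbol address,
// run the kernel, then count how many elements are still nonzero.
int zero_phi_and_count_nonzero()
{
    static float h_phi[7][7][7];
    float *flat = &h_phi[0][0][0];
    const int n = 7 * 7 * 7;

    // Put nonzero data in phi first so the reset is actually observable.
    for (int i = 0; i < n; ++i) flat[i] = 1.0f;
    check(cudaMemcpyToSymbol(phi, h_phi, sizeof(h_phi)), "cudaMemcpyToSymbol");

    // Look up the device address and size of the symbol, then fill with zero bytes.
    void *d_addr = NULL;
    size_t bytes = 0;
    check(cudaGetSymbolAddress(&d_addr, phi), "cudaGetSymbolAddress");
    check(cudaGetSymbolSize(&bytes, phi), "cudaGetSymbolSize");
    check(cudaMemset(d_addr, 0, bytes), "cudaMemset");

    kernel<<<1, 1>>>();
    check(cudaGetLastError(), "Calling kernel");
    check(cudaDeviceSynchronize(), "cudaDeviceSynchronize");

    // Copy phi back to the host and verify.
    check(cudaMemcpyFromSymbol(h_phi, phi, sizeof(h_phi)), "cudaMemcpyFromSymbol");
    int nonzero = 0;
    for (int i = 0; i < n; ++i) {
        if (flat[i] != 0.0f) ++nonzero;
    }
    return nonzero;
}

int main()
{
    printf("nonzero elements after reset: %d\n", zero_phi_and_count_nonzero());
    check(cudaDeviceReset(), "cudaDeviceReset");
    return 0;
}
```

If all you need is zeros, another option is to keep a zeroed host array and push it with cudaMemcpyToSymbol(), which avoids the address lookup at the cost of a host-to-device copy.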

Thank you.

Jam1,

I compiled your implementation of cuPrintf on Linux by commenting out timeSetEvent(), but it outputs nothing. Do you know whether there is a Linux equivalent of this function? Thanks!

This is from here:

http://forums.nvidia.com/index.php?showtopic=161849