Memory location persistence: cudaMalloc and classes

kegatut · September 17, 2008, 9:06pm

Hey Folks,

I am working on a proof of concept that will hopefully provide excellent gains by using GPUs for image analysis. I’m finding some strange behavior that doesn’t seem to make sense to me.

I want to allocate device memory once and reuse it for every image, so that I don't pay the allocation cost each time an image is analyzed. My code is equivalent to this:

MyClass.h

class MyClass
{
public:
    MyClass();
    ~MyClass();
    int AnalyzeImage(unsigned char * image);

private:
    unsigned char * m_ucDevImage;
};

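MyClass.cpp

#include "MyClass.h"
#include "GPUAnalyze.h"

MyClass::MyClass()
{
    // Allocate the device buffer once; it is reused for every image.
    GPUInit(&m_ucDevImage);
}

MyClass::~MyClass()
{
    GPUDestroy(m_ucDevImage);
}

int MyClass::AnalyzeImage(unsigned char * image)
{
    // Pass the status back so the function has a defined return value.
    return GPUAnalyze(image, m_ucDevImage);
}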

GPUAnalyze.h

void GPUInit(unsigned char ** img);
void GPUDestroy(unsigned char * img);
int GPUAnalyze(unsigned char * img, unsigned char * devImg);

GPUAnalyze.cu

#include <cstdio>
#include "GPUAnalyze.h"

// IMG_SIZE and CUDAErrorDetails() are not shown in this post.
// IMG_SIZE must not exceed the 1024*1024 bytes allocated in GPUInit().

void GPUInit(unsigned char ** img)
{
    cudaMalloc((void**)img, 1024*1024);
}

void GPUDestroy(unsigned char * img)
{
    cudaFree(img);
}

int GPUAnalyze(unsigned char * img, unsigned char * devImg)
{
    if (cudaMemcpy(devImg, img, IMG_SIZE, cudaMemcpyHostToDevice) != cudaSuccess)
    {
        printf("Unable to copy image to device\n");
        CUDAErrorDetails(cudaGetLastError());
    }

    // analyze image here

    return 0;
}

The first time GPUAnalyze is called, the cudaMemcpy works fine. All subsequent calls fail with cudaErrorInvalidDevicePointer. If I use a global unsigned char * in place of the MyClass::m_ucDevImage member, everything works. The value of MyClass::m_ucDevImage does not appear to change between calls to GPUAnalyze(). Any thoughts are appreciated. Thanks a bunch!

-Bryan

theMarix · September 21, 2008, 9:03am

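I used a similar approach in some other code of mine, so the concept should work. Are you checking the error status after the kernel has completed? Kernel launches are asynchronous, so a failure inside the kernel can be reported by a later call such as the next cudaMemcpy.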
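
To illustrate why the check has to come right after the kernel, here is a toy model in plain C++. It is only a sketch: `fakeLaunch`, `fakeSynchronize`, `fakeMemcpy`, `check` and `analyzeOnce` are hypothetical stand-ins invented for this example, not CUDA functions, and the model simplifies how the real runtime stores and clears errors. It shows one thing: if nothing checks the status between the launch and the next copy, the copy gets blamed for the kernel's failure.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of an asynchronous runtime, for illustration only.
// None of these names belong to CUDA.
enum Status { kSuccess = 0, kLaunchFailure = 1 };

// Error raised by work that has not been synchronized yet.
static Status g_pending = kSuccess;

// Models a kernel launch: returns immediately, a failure is only recorded.
void fakeLaunch(bool kernelFails) {
    if (kernelFails) g_pending = kLaunchFailure;
}

// Models synchronizing and then querying the last error
// (cudaThreadSynchronize() + cudaGetLastError() in the thread).
// The toy model reports the recorded error and clears it.
Status fakeSynchronize() {
    Status s = g_pending;
    g_pending = kSuccess;
    return s;
}

// Models a blocking copy, which also surfaces earlier asynchronous errors.
Status fakeMemcpy() {
    return fakeSynchronize();
}

// Remembers which step each failure was attributed to.
static std::vector<std::string> g_failures;

Status check(Status s, const std::string& step) {
    if (s != kSuccess) g_failures.push_back(step);
    return s;
}

// Runs one launch followed by the next image's upload. Returns the name of
// the step blamed for the first failure, or an empty string on success.
std::string analyzeOnce(bool kernelFails, bool checkAfterKernel) {
    g_failures.clear();
    g_pending = kSuccess;
    fakeLaunch(kernelFails);
    if (checkAfterKernel) check(fakeSynchronize(), "kernel");
    check(fakeMemcpy(), "memcpy");
    return g_failures.empty() ? std::string() : g_failures.front();
}
```

In this model, `analyzeOnce(true, false)` attributes the failure to the copy, while `analyzeOnce(true, true)` attributes it to the kernel. That is the distinction the question is trying to establish.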

kegatut · September 23, 2008, 2:27am

Yep. I call cudaThreadSynchronize() and then check the return value of cudaGetLastError(). I’m currently working around it (hooray globals!). Thanks for the assurance that I’m not barking up the wrong tree.

-Bryan

theMarix · September 23, 2008, 6:02am

One thing you might want to be careful about is copying that class. A copy gives you two objects that hold the same device pointer. When the first one is destroyed, its destructor frees the device memory and leaves the other with an invalid pointer.
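
One way to rule out that hazard is to make the owning class non-copyable. The sketch below assumes a C++11 or later compiler. `DeviceImage`, `fakeDeviceAlloc`, `fakeDeviceFree` and `liveAllocationsAfterDemo` are hypothetical names introduced for this example; the two `fake` helpers stand in for cudaMalloc and cudaFree so the code builds without the CUDA toolkit. It is not the code from the original post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>
#include <utility>

// Stand-ins for cudaMalloc/cudaFree (hypothetical, not CUDA API).
// The counter tracks how many buffers are currently allocated.
static int g_liveAllocations = 0;

static unsigned char* fakeDeviceAlloc(std::size_t bytes) {
    unsigned char* p = static_cast<unsigned char*>(std::malloc(bytes));
    if (p != nullptr) ++g_liveAllocations;
    return p;
}

static void fakeDeviceFree(unsigned char* p) {
    if (p != nullptr) {
        --g_liveAllocations;
        std::free(p);
    }
}

// Owns one buffer. Copying is disabled so two objects can never free the
// same pointer; moving transfers ownership instead.
class DeviceImage {
public:
    explicit DeviceImage(std::size_t bytes) : m_ptr(fakeDeviceAlloc(bytes)) {}
    ~DeviceImage() { fakeDeviceFree(m_ptr); }

    DeviceImage(const DeviceImage&) = delete;
    DeviceImage& operator=(const DeviceImage&) = delete;

    DeviceImage(DeviceImage&& other) noexcept : m_ptr(other.m_ptr) {
        other.m_ptr = nullptr;  // the moved-from object owns nothing
    }
    DeviceImage& operator=(DeviceImage&& other) noexcept {
        if (this != &other) {
            fakeDeviceFree(m_ptr);
            m_ptr = other.m_ptr;
            other.m_ptr = nullptr;
        }
        return *this;
    }

    unsigned char* get() const { return m_ptr; }

private:
    unsigned char* m_ptr;
};

// An accidental copy is now a compile-time error.
static_assert(!std::is_copy_constructible<DeviceImage>::value,
              "DeviceImage must not be copyable");

// Returns the number of buffers still allocated once the objects are gone.
int liveAllocationsAfterDemo() {
    {
        DeviceImage a(1024 * 1024);
        DeviceImage b(std::move(a));  // ownership moves; 'a' holds nullptr
        assert(a.get() == nullptr);
        assert(g_liveAllocations == 1);
    }  // 'b' frees the buffer exactly once here
    return g_liveAllocations;
}
```

With copying deleted, passing a `DeviceImage` by value no longer compiles, so the double free described above cannot happen by accident. Passing by reference, or moving, keeps a single owner for the device buffer.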
