Hi folks,
I’m trying to create a C++ class library for signal processing. I have a class called SignalGPU that holds a pointer to device memory, where an array of numeric values is stored. That pointer (ssignal) is initialized in the SignalGPU constructor:
template<class T> SignalGPU<T>::SignalGPU(T* sgn, int s)
{
    this->size = s;
    HANDLE_ERROR(cudaMalloc(&this->ssignal, this->size * sizeof(T)));
    HANDLE_ERROR(cudaMemcpy(this->ssignal, sgn, this->size * sizeof(T), cudaMemcpyHostToDevice));
}
I have also overloaded the + operator:
template<class T> SignalGPU<T>* SignalGPU<T>::operator+(const ISignal<T>& other) const
{
    T* out;
    HANDLE_ERROR(cudaMalloc(&out, this->size * sizeof(T)));
    CuAddition<T><<<977,1024>>>(this->ssignal, other.ssignal, this->size, out);
    SignalGPU<T>* newSignal = SignalGPU<T>::GetDeviceCopy(out, this->size);
    HANDLE_ERROR(cudaFree(out));
    return newSignal;
}
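The launch above hard-codes 977 blocks of 1024 threads (1,000,448 threads in total), whatever this->size is. Below is a minimal host-side sketch of deriving the block count from the element count instead. blocksFor and surplusThreads are hypothetical helpers, not part of the library shown here, and the 1,000,000-element size is only a guess at where 977 came from.

```cpp
// Hypothetical helper: number of blocks needed so that
// blocks * threadsPerBlock >= n (a ceiling division).
constexpr int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Hypothetical helper: threads launched past the end of the data.
// The kernel must make these threads return without touching memory.
constexpr int surplusThreads(int n, int threadsPerBlock)
{
    return blocksFor(n, threadsPerBlock) * threadsPerBlock - n;
}

// Compile-time checks of the arithmetic.
static_assert(blocksFor(1000000, 1024) == 977, "1,000,000 elements need 977 blocks");
static_assert(blocksFor(1024, 1024) == 1, "an exact multiple needs no extra block");
static_assert(blocksFor(1025, 1024) == 2, "one element over needs one more block");
static_assert(surplusThreads(1000000, 1024) == 448, "448 threads fall past the end");
```

With this, the launch in operator+ would read CuAddition<T><<<blocksFor(this->size, 1024), 1024>>>(...), and the kernel would return early for any index >= size. Kernel launches are asynchronous, so an error raised inside the kernel is reported by a later runtime call, which here is cudaFree. Checking cudaGetLastError() and cudaDeviceSynchronize() right after the launch would report it at the launch site.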
The code compiles but the program fails with this message:
an illegal memory access was encountered in …/src/DataTypes/SignalGPU.cu at line200
Line 200 is the HANDLE_ERROR(cudaFree(out)); call.
I ran cuda-memcheck and the output is:
========= Invalid global read of size 8
========= at 0x000001c8 in /home/daniel/cuda-work/SAR/Debug/…/src/DataTypes/SignalGPU.cu:9:void CuAddition(double*, double*, int, double*)
========= by thread (671,0,0) in block (10,0,0)
========= Address 0x000154f8 is out of bounds
========= Saved host backtrace up to driver entry point at kernel launch time
========= Host Frame:/usr/lib/libcuda.so (cuLaunchKernel + 0x331) [0x138251]
…
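One way to read the report above is to work out which byte offset the faulting thread was reading. Below is a minimal sketch, assuming the kernel uses the usual blockIdx.x * blockDim.x + threadIdx.x indexing (the kernel body is not shown, so this is an assumption) and 8-byte double elements. globalIndex and byteOffset are hypothetical helper names.

```cpp
#include <cstdint>

// Global element index of a thread in a 1-D launch
// (assumed indexing scheme: blockIdx.x * blockDim.x + threadIdx.x).
constexpr std::uint64_t globalIndex(std::uint64_t blockIdxX,
                                    std::uint64_t blockDimX,
                                    std::uint64_t threadIdxX)
{
    return blockIdxX * blockDimX + threadIdxX;
}

// Byte offset of that element from the start of its array.
constexpr std::uint64_t byteOffset(std::uint64_t index, std::uint64_t elemSize)
{
    return index * elemSize;
}

// Values taken from the report: thread (671,0,0) in block (10,0,0),
// 1024 threads per block, "read of size 8".
static_assert(globalIndex(10, 1024, 671) == 10911, "element index of the faulting thread");
static_assert(byteOffset(10911, 8) == 0x154f8, "offset matches the reported address");
```

The offset equals the reported address 0x000154f8 exactly. That would be consistent with one of the two input pointers being 0 (null) when the kernel ran, rather than with an index running past the end of a valid buffer. Since the fault is a read, other.ssignal is the pointer I would inspect first, for example by printing it on the host just before the launch.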
I think the real problem is in passing the other signal's pointer to the kernel. Perhaps a device memory address cannot be shared between different objects.
I have searched extensively and found the following topic.
However, I could not get that solution to work. Do you think this is the problem?
Thank you,
Tomás