Hello all,
I posted a similar question on this forum earlier, but mistakenly put it in the wrong group, so I apologize to anyone who has already read it.
I’m trying to write a CUDA kernel that can read from different arrays in constant memory. The array to operate on is specified by the host code. The purpose is to implement a filter that can use multiple window functions. In my application, I will have to switch between 4 different windows very often, so I’m trying to avoid passing the windows across the PCI bus every time I run the filter.
My approach is to pass a pointer to the kernel for the particular array that is required. I have written test code below that doesn’t print the expected values.
[codebox]#include <stdio.h>
#include <cutil.h>
//Allocate constant memory, constantA - constantD
//shortened for readability
#define SIZE 10
__device__ __constant__ float constantA[SIZE];
//Return pointers to the constant memory
float* getA(){
return constantA;
}
//kernel copies data from constant memory to global memory
__global__ void getConstant(float* d_ptr, const float* constant){
d_ptr[threadIdx.x] = constant[threadIdx.x];
}
int main(){
//Allocate memory
float* h_data = new float[SIZE];
float* h_constantA = new float[SIZE];
float* d_data;
CUDA_SAFE_CALL( cudaMalloc( (void **)&d_data, SIZE*sizeof(float)) );
//Initialize memory
for(int i = 0; i < SIZE; i++)
h_constantA[i] = i;
//Copy memory to constant space on the GPU
CUDA_SAFE_CALL( cudaMemcpyToSymbol(reinterpret_cast<const char*> (getA()), h_constantA, SIZE*sizeof(float)) );
dim3 block(1);
dim3 thread(SIZE);
getConstant<<<block, thread>>>(d_data, getA());
CUDA_SAFE_CALL( cudaMemcpy(h_data, d_data, SIZE*sizeof(float), cudaMemcpyDeviceToHost) );
for (int i = 0; i < SIZE; i++)
printf("%f ",h_data[i]);
}
[/codebox]
If I change the kernel to the code below, it works fine, but it doesn’t allow me to select which window to use.
[codebox]__global__ void getConstant(float* d_ptr, const float* constant){
d_ptr[threadIdx.x] = constantA[threadIdx.x];
}
[/codebox]
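One possible workaround, shown only as a minimal sketch and not a confirmed fix: keep all the windows in a single two-dimensional `__constant__` array and pass an integer index to the kernel instead of a pointer, so the kernel always names the constant symbol directly. The names `windows`, `NUM_WINDOWS`, and `which` are hypothetical and not part of the test code above, and error checking is left out for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define SIZE 10
#define NUM_WINDOWS 4   // hypothetical: one row per window function

// All windows live in one constant-memory array (hypothetical name).
__constant__ float windows[NUM_WINDOWS][SIZE];

// The kernel selects a window by index rather than by pointer.
__global__ void getConstant(float* d_ptr, int which){
    d_ptr[threadIdx.x] = windows[which][threadIdx.x];
}

int main(){
    // Fill each window with distinguishable test values.
    float h_windows[NUM_WINDOWS][SIZE];
    for (int w = 0; w < NUM_WINDOWS; w++)
        for (int i = 0; i < SIZE; i++)
            h_windows[w][i] = 100.0f * w + i;

    // Upload all windows once; later launches only change the index.
    cudaMemcpyToSymbol(windows, h_windows, sizeof(h_windows));

    float* d_data;
    cudaMalloc((void**)&d_data, SIZE * sizeof(float));

    int which = 2;  // window chosen by the host for this launch
    getConstant<<<1, SIZE>>>(d_data, which);

    // Copy the selected window back and print it.
    float h_data[SIZE];
    cudaMemcpy(h_data, d_data, sizeof(h_data), cudaMemcpyDeviceToHost);
    for (int i = 0; i < SIZE; i++)
        printf("%f ", h_data[i]);
    printf("\n");

    cudaFree(d_data);
    return 0;
}
```

With this layout the windows cross the PCI bus only once, and switching windows costs one integer kernel argument per launch.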
Does anyone have any ideas on how to make this work?
Regards,
Steve