I’ve got a problem. I’d like to pass a vector of structures from the host to the device, do some computation on it there, and then pass it back to the host.
I’m using a std::vector of a custom structure. Here is my code:
#include <iostream>
#include <string>
#include <vector>
#include <cuda.h>
using namespace std;

//STRUCTURE DEFINITION
struct prgm_structure
{
    string name;
    int fitness;
};

//VECTOR DEFINITION
typedef vector<prgm_structure> prgm_vector;

//KERNEL FUNCTION
__global__ void calcul_fitness_GPU_Device(prgm_vector* population)
{
    int fitness;
    fitness = population[threadIdx.x].fitness;
}

//MAIN FUNCTION
int main()
{
    //Creation and initialisation of the vector "population_Host"
    prgm_vector population_Host;
    initialisation_population(population_Host);

    //Allocation and load of the vector "population_Device"
    prgm_vector* population_Device;
    int size = population_Host.size() * sizeof(prgm_structure);
    cudaMalloc((void**)&population_Device, size);
    cudaMemcpy(population_Device, &population_Host, size, cudaMemcpyHostToDevice);

    //Call of the kernel function
    dim3 dimBlock(16);
    dim3 dimGrid(1);
    calcul_fitness_GPU_Device<<<dimGrid, dimBlock>>>(population_Device);
}
This code doesn’t compile. I can’t access the elements of the vector inside the kernel. I get an error at this line in the kernel:
fitness = population[threadIdx.x].fitness;
The error is:
If I change this line to:
fitness = population.at(threadIdx.x).fitness;
The error becomes:
And if I change it again to:
fitness = population->at(threadIdx.x).fitness;
Then the error becomes:
I don’t understand, and I’m out of ideas…
Is it possible to pass a vector of structures to a device kernel? If so, does anybody know how to correct my code?
A std::vector, unless you use a custom allocator, only holds host memory. Use a pointer instead, or use the vector class in the CUPP project:
//VECTOR DEFINITION
typedef vector<prgm_structure> prgm_vector; // for host memory only!!!

//KERNEL FUNCTION
//Takes a raw pointer to the device copy of the elements, not a std::vector
__global__ void calcul_fitness_GPU_Device(prgm_structure* population)
{
    int fitness = population[threadIdx.x].fitness;
}

//MAIN FUNCTION
int main()
{
    //Creation and initialisation of the vector "population_Host"
    prgm_vector population_Host;
    initialisation_population(population_Host);

    //Allocation and load of the array "population_Device"
    prgm_structure* population_Device;
    size_t size = population_Host.size() * sizeof(prgm_structure);
    cudaMalloc((void**)&population_Device, size);
    //Copy from the first element, not from the vector object itself
    cudaMemcpy(population_Device, &population_Host[0], size, cudaMemcpyHostToDevice);

    //Call of the kernel function
    dim3 dimBlock(16);
    dim3 dimGrid(1);
    calcul_fitness_GPU_Device<<<dimGrid, dimBlock>>>(population_Device);

    //Release the device memory
    cudaFree(population_Device);
}
You should also make sure the kernel does not access memory outside what you allocated: the launch above always uses 16 threads, whatever the size of the vector.
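One common way to stay inside the allocation is to pass the element count to the kernel, have each thread compare its index against it, and round the number of blocks up. Here is a minimal host-side C++ sketch of that arithmetic. `blocks_for` and `in_bounds` are hypothetical helper names (not part of the CUDA API), and the guard is written as a plain function only to show the condition a kernel would test before touching population[i].

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of blocks needed so that
// blocks * threads_per_block >= n (ceiling division).
std::size_t blocks_for(std::size_t n, std::size_t threads_per_block) {
    assert(threads_per_block > 0);
    return (n + threads_per_block - 1) / threads_per_block;
}

// Hypothetical helper: the check a kernel thread would make on its
// global index (blockIdx.x * blockDim.x + threadIdx.x in CUDA terms)
// before reading or writing element global_index of an n-element array.
bool in_bounds(std::size_t global_index, std::size_t n) {
    return global_index < n;
}
```

With 16 threads per block, a population of 17 elements would need 2 blocks, and the threads with global indices 17 to 31 would fail the check and do nothing.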
The address of the first element in the vector, not the address of the vector itself, is the pointer to the underlying array that you can pass to cudaMemcpy. In *rocha’s code, it’s this part:
cudaMemcpy(population_Device, &population_Host[0], size, cudaMemcpyHostToDevice);
Note that it takes the address of element [0].
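To see why the address of element [0] is the right source pointer, here is a small host-only C++ sketch. It uses std::memcpy into a raw byte buffer as a stand-in for the device allocation, so it runs without a GPU. `fitness_record`, `payload_bytes` and `fitness_at` are hypothetical names made up for the illustration, and the struct is deliberately a plain one with two ints.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical plain struct used only for this illustration.
struct fitness_record {
    int id;
    int fitness;
};

// Copy the vector's elements, starting at &v[0], into a raw byte buffer.
// The buffer plays the role of the device memory in the thread's code.
std::vector<unsigned char> payload_bytes(const std::vector<fitness_record>& v) {
    std::vector<unsigned char> raw(v.size() * sizeof(fitness_record));
    if (!v.empty()) {
        std::memcpy(raw.data(), &v[0], raw.size());
    }
    return raw;
}

// Read element i back out of the raw buffer, the way indexing a
// fitness_record* on the device side would.
int fitness_at(const std::vector<unsigned char>& raw, std::size_t i) {
    fitness_record r;
    std::memcpy(&r, raw.data() + i * sizeof(fitness_record), sizeof r);
    return r.fitness;
}
```

For a non-empty vector, &v[0] is the same pointer as v.data(), and the elements follow each other contiguously from there. The vector object itself (&v) only holds bookkeeping such as pointers and sizes, which is why copying from it does not transfer the elements.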
Now it works. No need to pass the whole vector: the address of its first element is enough. If you know the address of the first element, you can access the following ones…
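A related point worth checking: prgm_structure contains a std::string, which is not trivially copyable and refers to host memory for its characters, so a byte-wise copy gives the device a usable fitness but not a usable name. If the name is needed on the device, one possible approach is a fixed-size character array. The sketch below is host-only C++ under that assumption; `prgm_structure_pod`, `to_pod` and the 32-byte limit are hypothetical choices for the illustration, not something from the posts above.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// The structure from the question.
struct prgm_structure {
    std::string name;
    int fitness;
};

// Hypothetical fixed-size variant that is safe to copy byte by byte.
struct prgm_structure_pod {
    char name[32];  // assumed maximum length, including the final '\0'
    int fitness;
};

static_assert(!std::is_trivially_copyable<prgm_structure>::value,
              "a struct holding std::string is not safe to copy as raw bytes");
static_assert(std::is_trivially_copyable<prgm_structure_pod>::value,
              "the fixed-size variant can be copied as raw bytes");

// Convert one element; names longer than 31 characters are truncated.
prgm_structure_pod to_pod(const prgm_structure& s) {
    prgm_structure_pod out{};  // zero-initialised, so name stays terminated
    std::strncpy(out.name, s.name.c_str(), sizeof(out.name) - 1);
    out.fitness = s.fitness;
    return out;
}
```

A vector of the fixed-size variant could then be copied with the same &v[0] pattern as before.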