Copy partial struct to constant memory

wanderine · March 1, 2011, 6:12pm

I’m trying to do 4D convolution with 14 different filters at the same time; each 4D filter has 9 x 9 x 9 x 9 elements. Stored as floats, that is 14 x 9^4 x 4 bytes, about 367 KB, which far exceeds the 64 KB of constant memory, so I can’t keep all the filters there at once. My plan is instead to store 2D slices (9 x 9) of the 4D filters and update them as I do the convolution.

I define a struct as

struct float14
{
float a, b, c, d, e, f, g, h, i, j, k, l, m, n;
};

__device__ __constant__ float14 c_Filters[9][9];

such that I can do something like c_Filters[y][x].a, c_Filters[y][x].b, etc.

However, I haven’t figured out how to copy a slice of the filter to constant memory. I guess it should be something like this (each filter is stored as [x + y * FILTER_W + z * FILTER_W * FILTER_H + t * FILTER_W * FILTER_H * FILTER_D]):

cudaMemcpyToSymbol(&c_Filters, h_Filter_1[z * FILTER_W * FILTER_H + t * FILTER_W * FILTER_H * FILTER_D], 9 * 9 * sizeof(float), 0, cudaMemcpyHostToDevice);
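One possible way to fill the struct array is sketched below. It assumes the 14 filters live in separate host arrays, each indexed as described above. Because c_Filters interleaves the 14 values per (x, y) position, copying 9 * 9 * sizeof(float) bytes from a single filter would fill only the first 324 of its 4536 bytes, and with values from the wrong positions. The sketch therefore gathers the slice into array-of-structs order on the host and copies it in one call. The names packSlice, uploadSlice, and NUM_FILTERS, and the constant definitions, are hypothetical and not from the original post.

```cuda
#include <cstddef>
#include <cstring>
#include <cuda_runtime.h>

// Assumed values: the post uses FILTER_W/H/D without defining them.
constexpr int FILTER_W = 9, FILTER_H = 9, FILTER_D = 9;
constexpr int NUM_FILTERS = 14;  // hypothetical name

struct float14
{
    float a, b, c, d, e, f, g, h, i, j, k, l, m, n;
};
// The packing below relies on the struct being 14 floats with no padding.
static_assert(sizeof(float14) == NUM_FILTERS * sizeof(float),
              "float14 must be 14 tightly packed floats");

__constant__ float14 c_Filters[FILTER_H][FILTER_W];

// Gather the (z, t) slice of all 14 filters into array-of-structs order.
// filters[f] points to filter f, stored as x + y*W + z*W*H + t*W*H*D.
void packSlice(const float* const filters[NUM_FILTERS], int z, int t,
               float14 slice[FILTER_H][FILTER_W])
{
    const size_t base = (size_t)z * FILTER_W * FILTER_H
                      + (size_t)t * FILTER_W * FILTER_H * FILTER_D;
    for (int y = 0; y < FILTER_H; ++y) {
        for (int x = 0; x < FILTER_W; ++x) {
            float tmp[NUM_FILTERS];
            for (int f = 0; f < NUM_FILTERS; ++f)
                tmp[f] = filters[f][base + x + y * FILTER_W];
            // Member a receives filter 0, b filter 1, ..., n filter 13.
            std::memcpy(&slice[y][x], tmp, sizeof(float14));
        }
    }
}

// Pack one slice on the host and overwrite the whole constant array.
cudaError_t uploadSlice(const float* const filters[NUM_FILTERS], int z, int t)
{
    float14 slice[FILTER_H][FILTER_W];
    packSlice(filters, z, t, slice);
    // The symbol is passed directly (no & and no string name); offset and
    // direction default to 0 and cudaMemcpyHostToDevice.
    return cudaMemcpyToSymbol(c_Filters, slice, sizeof(slice));
}
```

Calling uploadSlice(filters, z, t) before each kernel launch that needs a new slice would replace all 81 structs at once. An alternative design would be 14 separate __constant__ float arrays of 9 x 9 elements, each filled directly from one filter with its own cudaMemcpyToSymbol call and no host-side packing.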

Can someone help me?

How is the data for an array of structs with 14 elements each stored? Is it element + x * NUMBER_OF_ELEMENTS + y * NUMBER_OF_ELEMENTS * FILTER_W?
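As far as I know, a C/C++ array of structs is stored one struct after another, with the members of each struct in declaration order. For a struct of 14 floats with no padding, that corresponds to the indexing guessed above. The host-side sketch below checks this at compile time rather than taking it on faith; flatIndex and offsetOfC are hypothetical helper names, and the constants are assumed values.

```cuda
#include <cstddef>

// Assumed values, matching the 9 x 9 slice of 14 filters in the question.
constexpr int NUMBER_OF_ELEMENTS = 14;
constexpr int FILTER_W = 9, FILTER_H = 9;

struct float14
{
    float a, b, c, d, e, f, g, h, i, j, k, l, m, n;
};

// Members are laid out in declaration order; these asserts confirm that
// the compiler adds no padding between or after them.
static_assert(sizeof(float14) == NUMBER_OF_ELEMENTS * sizeof(float),
              "float14 has unexpected padding");
static_assert(offsetof(float14, a) == 0, "a is the first float");
static_assert(offsetof(float14, b) == 1 * sizeof(float), "b follows a");
static_assert(offsetof(float14, n) == 13 * sizeof(float), "n is the last float");

// Float index of member `element` (0 = a ... 13 = n) of filters[y][x]
// inside a float14 filters[FILTER_H][FILTER_W] array.
constexpr size_t flatIndex(int element, int x, int y)
{
    return static_cast<size_t>(element)
         + static_cast<size_t>(x) * NUMBER_OF_ELEMENTS
         + static_cast<size_t>(y) * NUMBER_OF_ELEMENTS * FILTER_W;
}

// Byte offset of filters[y][x].c from the start of the array, measured
// on a real array so it can be compared against flatIndex(2, x, y).
size_t offsetOfC(int x, int y)
{
    static float14 filters[FILTER_H][FILTER_W];
    const char* start = reinterpret_cast<const char*>(&filters[0][0]);
    const char* member = reinterpret_cast<const char*>(&filters[y][x].c);
    return static_cast<size_t>(member - start);
}
```

If the asserts compile, offsetOfC(x, y) should equal flatIndex(2, x, y) * sizeof(float) for every position, so a host buffer filled in that order can be copied into the struct array byte for byte.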