Problem with an IPlugin layer in TensorRT 2.1

Hello,

I want to implement Caffe's ArgMax layer as an IPlugin in TensorRT 2.1, and I wrote enqueue() like this:
int enqueue(int batchSize, const void* const* inputs, void** outputs, void*, cudaStream_t stream) override
{
    const float* bottom_data = reinterpret_cast<const float*>(inputs[0]);
    float* top_data = reinterpret_cast<float*>(outputs[0]);
    int dim, axis_dist, top_k, axis, num_top_axes, num;
    dim = 12; axis_dist = 360 * 480; top_k = 1; axis = 1; num_top_axes = 4;
    num = dim * 360 * 480 / dim;
    std::vector<std::pair<float, int> > bottom_data_vector(dim);
    for (int i = 0; i < num; ++i) {
        for (int j = 0; j < dim; ++j) {
            bottom_data_vector[j] = std::make_pair(
                bottom_data[(i / axis_dist * dim + j) * axis_dist + i % axis_dist], j);
        }

        std::partial_sort(
            bottom_data_vector.begin(), bottom_data_vector.begin() + top_k,
            bottom_data_vector.end(), std::greater<std::pair<float, int> >());

        for (int j = 0; j < top_k; ++j) {
            top_data[(i / axis_dist * top_k + j) * axis_dist + i % axis_dist]
                = bottom_data_vector[j].first;
        }
    }
    return 0;
}

However, every time execution reaches the line:
"bottom_data_vector[j] = std::make_pair(
bottom_data[(i / axis_dist * dim + j) * axis_dist + i % axis_dist], j);"
it fails with: Segmentation fault (core dumped).
The size of the bottom data is 12*360*480*sizeof(float), so what could cause this problem? Thanks.

Hi,

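The tensors passed to enqueue() are GPU buffers, so host C++ code can't dereference them directly. That is why reading bottom_data on the CPU causes the segmentation fault.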

There are two solutions:
1. Memcpy: copy the input buffer from the GPU to the CPU, process it there, then copy the result back to the GPU output buffer.
2. CUDA: implement the layer as a CUDA kernel so the data is processed on the GPU.
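
As a rough illustration of option 1, here is a minimal sketch of the "process on the CPU" step. The helper maxOverChannelsHost and its parameters are hypothetical names (not part of TensorRT or Caffe), and it assumes the [batch][channels][spatial] layout and top_k = 1 from the question. It only works on host memory, so the device-to-host and host-to-device copies outlined in the trailing comment would have to surround it inside enqueue().

```cpp
#include <cstddef>

// Hypothetical helper (not a TensorRT API): reduces over the channel axis.
// `in`  is laid out as [batch][channels][spatial] in HOST memory.
// `out` is laid out as [batch][1][spatial] in HOST memory.
// If outMaxVal is true the maximum value is written (as in the question's
// code); otherwise the index of the winning channel is written as a float.
void maxOverChannelsHost(const float* in, float* out,
                         int batch, int channels, int spatial,
                         bool outMaxVal)
{
    for (int n = 0; n < batch; ++n) {
        for (int s = 0; s < spatial; ++s) {
            // Pointer to channel 0 of this batch item and spatial position.
            const float* p =
                in + static_cast<std::size_t>(n) * channels * spatial + s;
            float best = p[0];
            int bestIdx = 0;
            for (int c = 1; c < channels; ++c) {
                const float v = p[static_cast<std::size_t>(c) * spatial];
                if (v > best) {
                    best = v;
                    bestIdx = c;
                }
            }
            out[static_cast<std::size_t>(n) * spatial + s] =
                outMaxVal ? best : static_cast<float>(bestIdx);
        }
    }
}

// Rough outline of the surrounding copies inside enqueue(), shown as comments
// so this block compiles without the CUDA toolkit. hostIn and hostOut are
// assumed to be std::vector<float> buffers of the right sizes:
//
//   cudaMemcpyAsync(hostIn.data(), inputs[0], inBytes,
//                   cudaMemcpyDeviceToHost, stream);
//   cudaStreamSynchronize(stream);   // wait until the input is on the host
//   maxOverChannelsHost(hostIn.data(), hostOut.data(),
//                       batchSize, 12, 360 * 480, true);
//   cudaMemcpyAsync(outputs[0], hostOut.data(), outBytes,
//                   cudaMemcpyHostToDevice, stream);
//   cudaStreamSynchronize(stream);   // keep hostOut alive until the copy ends
```

The round trip through host memory stalls the stream on every call, so this is mainly useful for checking correctness. Option 2 avoids the copies altogether.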

Thanks.