TensorRT FasterRCNN with pruning throws malloc error

Hi,
I am trying to speed up the Faster R-CNN model using channel pruning. I am using the model from this repo: https://github.com/yihui-he/channel-pruning/releases/tag/faster-RCNN-2X4X. I changed the input section of the .prototxt as shown below:

input: "data"
input_shape {
  dim: 2
  dim: 3
  dim: 224
  dim: 224
}

input: "im_info"
input_shape {
  dim: 2
  dim: 1
  dim: 1
  dim: 3
}

and defined the custom layers as follows:

layer {
  bottom: "rpn_cls_score"
  top: "rpn_cls_score_reshape"
  name: "ReshapeCTo2"
  type: "IPlugin"
  #name: "rpn_cls_score_reshape"
  #type: "Reshape"
  #reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }
}

#========= RoI Proposal ============

layer {
  name: "rpn_cls_prob"
  type: "Softmax"
  bottom: "rpn_cls_score_reshape"
  top: "rpn_cls_prob"
}
layer {
  name: 'ReshapeCTo18'
  type: 'IPlugin'
  bottom: 'rpn_cls_prob'
  top: 'rpn_cls_prob_reshape'
  #reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }
}
layer {
  name: "RPROIFused"
  type: "IPlugin"
  bottom: 'rpn_cls_prob_reshape'
  bottom: 'rpn_bbox_pred'
  bottom: 'conv5_3'
  bottom: 'im_info'
  top: 'rois'
  top: 'pool5'
  #python_param {
  #  module: 'rpn.proposal_layer'
  #  layer: 'ProposalLayer'
  #  param_str: "'feat_stride': 16"
  #}
}
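
For context, the commented-out reshape_param lines follow Caffe's Reshape convention, where a 0 copies the corresponding input dimension and a -1 is inferred from the remaining element count. Below is a minimal sketch of that arithmetic, so the shapes the two reshape plugins are expected to produce can be checked by hand. The 18-channel input on a 14x14 feature map is an illustrative assumption, and caffeReshape and Shape4 are hypothetical names, not part of Caffe or TensorRT.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical 4-D shape type in (N, C, H, W) order.
using Shape4 = std::array<long, 4>;

// Resolve a Caffe-style reshape spec against an input shape:
//   0  -> copy the input dimension at the same index
//   -1 -> infer this dimension from the remaining element count (at most one)
//   >0 -> use the value as given
Shape4 caffeReshape(const Shape4& in, const Shape4& spec) {
    long total = 1;
    for (long d : in) total *= d;

    Shape4 out{};
    long known = 1;     // product of all resolved output dimensions
    int inferred = -1;  // index of the -1 entry, if any
    for (std::size_t i = 0; i < out.size(); ++i) {
        if (spec[i] == -1) {
            if (inferred != -1) throw std::invalid_argument("at most one -1 is allowed");
            inferred = static_cast<int>(i);
            continue;
        }
        if (spec[i] < -1) throw std::invalid_argument("invalid reshape dimension");
        out[i] = (spec[i] == 0) ? in[i] : spec[i];
        known *= out[i];
    }

    if (inferred != -1) {
        if (known == 0 || total % known != 0)
            throw std::invalid_argument("element count is not divisible");
        out[inferred] = total / known;
    } else if (known != total) {
        throw std::invalid_argument("element count mismatch");
    }
    return out;
}
```

With the assumed input of (2, 18, 14, 14), the spec { 0, 2, -1, 0 } resolves to (2, 2, 126, 14), and applying { 0, 18, -1, 0 } to that result restores (2, 18, 14, 14).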

I also commented out the ROIPooling layer.
The engine builds successfully, but deserializing it fails with this error: “corrupted double-linked list: 0x000000002911c230 *** Aborted (core dumped)”. Can someone help me figure out this error? What are the possible causes, and how can I debug it?
Also, the reported engine size is “Serial Engine Size = 533552608”.
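
One possible cause worth ruling out (an assumption, not something confirmed here) is a custom plugin whose serialize and deserialize code disagree on the number of bytes. Writing or reading past the end of a buffer can corrupt the heap and lead to this kind of glibc abort. Below is a minimal sketch of a bounds-checked round trip. PluginState, serializedSize, writeField and readField are hypothetical helpers for illustration and are not part of the TensorRT API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the state a custom plugin stores in the engine.
struct PluginState {
    int channels;
    int height;
    int width;
};

// Byte count the serialized state occupies; the size a plugin reports
// must match what its serialize step actually writes.
std::size_t serializedSize() { return 3 * sizeof(int); }

// Copy one value into the buffer and advance the cursor,
// refusing to write past the end of the buffer.
template <typename T>
void writeField(char*& cursor, const char* end, const T& value) {
    if (static_cast<std::size_t>(end - cursor) < sizeof(T))
        throw std::runtime_error("serialize: write past end of buffer");
    std::memcpy(cursor, &value, sizeof(T));
    cursor += sizeof(T);
}

// Read one value from the buffer and advance the cursor,
// refusing to read past the end of the buffer.
template <typename T>
T readField(const char*& cursor, const char* end) {
    if (static_cast<std::size_t>(end - cursor) < sizeof(T))
        throw std::runtime_error("deserialize: read past end of buffer");
    T value;
    std::memcpy(&value, cursor, sizeof(T));
    cursor += sizeof(T);
    return value;
}

void serialize(const PluginState& s, char* buffer, std::size_t length) {
    char* cursor = buffer;
    const char* end = buffer + length;
    writeField(cursor, end, s.channels);
    writeField(cursor, end, s.height);
    writeField(cursor, end, s.width);
    assert(cursor == end);  // wrote exactly the reported size
}

PluginState deserialize(const char* buffer, std::size_t length) {
    const char* cursor = buffer;
    const char* end = buffer + length;
    PluginState s{};
    s.channels = readField<int>(cursor, end);
    s.height = readField<int>(cursor, end);
    s.width = readField<int>(cursor, end);
    assert(cursor == end);  // consumed exactly the bytes provided
    return s;
}
```

Serializing into a std::vector<char> of serializedSize() bytes and deserializing it again should return the same three values, while a buffer that is one byte short makes deserialize throw instead of silently reading out of bounds.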

I am using CUDA 8.0 and TensorRT 3.0.0, and the platform is a TX2.