TensorRT 4.0.0.3 INT8 reformat bug?

When attempting INT8 calibration (using either the Legacy or the Entropy calibrator), I get the following error
whenever my input images are larger than 2880x2880:

INFO: --------------- Timing (9)
ERROR: reformat.cu (745) - Cuda Error in NCHWToNCQHW4: 9
ERROR: reformat.cu (745) - Cuda Error in NCHWToNCQHW4: 9

CUDA error code 9 (cudaErrorInvalidConfiguration, "invalid configuration argument") suggests that a kernel launch dimension, such as a grid or block dimension, is too large for the device.
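
To illustrate the kind of limit I suspect is being hit, here is a rough sketch that checks a launch configuration against the documented CUDA limits for compute capability 3.0 and later (1024 threads per block, block dimensions up to 1024 x 1024 x 64, grid dimensions up to 2^31-1 x 65535 x 65535). The function name and the example values are my own and purely hypothetical; I do not know what grid and block sizes the NCHWToNCQHW4 kernel actually uses.

```python
# Hypothetical helper, not part of TensorRT or the CUDA toolkit.
# Limits are the documented values for compute capability 3.0 and later.
MAX_THREADS_PER_BLOCK = 1024
MAX_BLOCK_DIM = (1024, 1024, 64)
MAX_GRID_DIM = (2**31 - 1, 65535, 65535)


def launch_config_violations(grid, block):
    """Return the reasons a (grid, block) launch would be rejected, if any."""
    problems = []
    # Each block dimension has its own upper bound.
    for axis, value, limit in zip("xyz", block, MAX_BLOCK_DIM):
        if not 1 <= value <= limit:
            problems.append(f"block.{axis}={value} outside [1, {limit}]")
    # The product of the block dimensions is bounded separately.
    threads = block[0] * block[1] * block[2]
    if threads > MAX_THREADS_PER_BLOCK:
        problems.append(
            f"{threads} threads per block exceeds {MAX_THREADS_PER_BLOCK}"
        )
    # Grid dimensions: x may be very large, y and z are capped at 65535.
    for axis, value, limit in zip("xyz", grid, MAX_GRID_DIM):
        if not 1 <= value <= limit:
            problems.append(f"grid.{axis}={value} outside [1, {limit}]")
    return problems


# Made-up example: a grid whose y dimension grows past the 65535 cap.
print(launch_config_violations((1, 70000, 1), (256, 1, 1)))
```

A launch that violates any of these limits fails with exactly this "invalid configuration" error, which would be consistent with a kernel whose grid size scales with the input image dimensions.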

For all smaller input sizes, calibration completes without errors and the INT8 inference results are good.