TRT5 fails to run SSD 640*640, but SSD 512*512 works

My code is based on sampleUffSSD.cpp.
sampleUffSSD.cpp runs fine with SSD when the input is 300*300. When I change the input shape of my TensorFlow PB from 300*300 to 512*512 (of course I modify …), it works and the output is correct, but when I change the input shape from 512*512 to 640*640, it has a problem.
It runs for a few cycles without producing any result and then prints “cuda fail: 77cuda”.
Is it possible that plugins such as NMS, PriorBox_TRT, FlattenConcat_TRT, etc. can’t support such a large input shape?

I found my mistake: I had written a wrong plugin name. After I corrected the name, it works well.
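
For anyone who hits the same thing, here is a minimal sketch of the kind of check that would have caught my typo earlier. It only compares the plugin names you wrote against a list of names you expect to exist. The helper `findUnknownPluginNames` and both name lists are hypothetical and supplied by the caller; this is not part of sampleUffSSD.cpp and it does not query TensorRT itself.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not a TensorRT API): returns every name in `used`
// that does not appear in `known`. Both lists come from the caller.
std::vector<std::string> findUnknownPluginNames(
    const std::vector<std::string>& used,
    const std::set<std::string>& known)
{
    std::vector<std::string> unknown;
    for (const std::string& name : used)
    {
        // The comparison is exact and case-sensitive, so a one-letter typo
        // in a plugin name is reported here.
        if (known.find(name) == known.end())
        {
            unknown.push_back(name);
        }
    }
    return unknown;
}
```

You could call it with the plugin names from your own conversion config as `used` and the names your build actually registers as `known`, and print whatever comes back before building the engine. An empty result means every name matched.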