If BuilderFlag::kFP16 is on, is an fp32 input automatically cast to fp16 for an fp16 plugin?
I have a pure fp16 plugin at the top of a network, so the following code is needed to feed in the input:
config->setFlag(BuilderFlag::kFP16); // fp16 on
..... // intermediate code
// Question here: should in_features_ be fp32 or fp16?
GPU_CHECK(cudaMemcpyAsync(in_buffers_[0], in_features_, size_in, cudaMemcpyDeviceToDevice, stream_));
context_->enqueueV2(in_buffers_, stream_, nullptr); // expect FP16 running
Do I need to explicitly convert in_features_ to fp16 before the data transfer?
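To make clear what I mean by an explicit conversion, here is a minimal host-side sketch. It is only an illustration under assumptions: `float_to_half_bits` and `convert_to_half` are hypothetical helper names, not TensorRT API, and since my data already lives on the GPU, the real version would presumably be a small kernel calling `__float2half` from `cuda_fp16.h` instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of TensorRT): convert one IEEE-754 binary32
// value to binary16 bits, rounding to nearest, ties to even.
std::uint16_t float_to_half_bits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));  // safe type punning

    const std::uint32_t sign = (bits >> 16) & 0x8000u;
    const std::uint32_t exp = (bits >> 23) & 0xFFu;
    std::uint32_t mant = bits & 0x007FFFFFu;

    if (exp == 0xFFu) {  // Inf or NaN: keep NaN-ness by setting a mantissa bit
        return static_cast<std::uint16_t>(sign | 0x7C00u | (mant ? 0x0200u : 0u));
    }

    const int e = static_cast<int>(exp) - 127 + 15;  // rebias exponent for fp16
    if (e >= 0x1F) {  // too large for fp16: overflow to infinity
        return static_cast<std::uint16_t>(sign | 0x7C00u);
    }

    if (e <= 0) {  // result is an fp16 subnormal or zero
        if (e < -10) {
            return static_cast<std::uint16_t>(sign);  // underflow to signed zero
        }
        mant |= 0x00800000u;  // restore the implicit leading 1
        const std::uint32_t shift = static_cast<std::uint32_t>(14 - e);
        std::uint32_t half_mant = mant >> shift;
        const std::uint32_t rem = mant & ((1u << shift) - 1u);
        const std::uint32_t halfway = 1u << (shift - 1u);
        if (rem > halfway || (rem == halfway && (half_mant & 1u))) {
            ++half_mant;  // round to nearest even
        }
        return static_cast<std::uint16_t>(sign | half_mant);
    }

    // Normal case: keep the top 10 mantissa bits and round on the lower 13.
    std::uint32_t half = (static_cast<std::uint32_t>(e) << 10) | (mant >> 13);
    const std::uint32_t rem = mant & 0x1FFFu;
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u))) {
        ++half;  // a carry into the exponent field is the correct result
    }
    return static_cast<std::uint16_t>(sign | half);
}

// Hypothetical helper: convert a host buffer of fp32 values into fp16 bit
// patterns, e.g. before uploading it to a binding that expects half data.
std::vector<std::uint16_t> convert_to_half(const float* src, std::size_t count) {
    assert(src != nullptr || count == 0);
    std::vector<std::uint16_t> dst(count);
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = float_to_half_bits(src[i]);
    }
    return dst;
}
```

If a conversion like this is required, the copy size would also change, because the fp16 buffer is `count * sizeof(std::uint16_t)` bytes, half the size of the fp32 one.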
Thanks!