Are 8 channels required for the input and output image for best performance?

I read in the Convolutional Layers User Guide (NVIDIA Deep Learning Performance Documentation) that, for best performance, I must use 8 channels for the input and output image (autoencoder model in TensorFlow). How can I transform an image from 3 channels to 8 channels in TensorFlow?
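For reference, one possible way to do the conversion asked about here is to zero-pad the channel axis with tf.pad. This is only a sketch under assumptions: the images are taken to be in NHWC layout, the helper name pad_channels and the 32x32 batch are made up for illustration, and whether the padding actually improves performance would have to be measured on the real model.

```python
import tensorflow as tf


def pad_channels(images, target_channels=8):
    """Zero-pad the last (channel) axis of an NHWC batch up to target_channels.

    Hypothetical helper for illustration; assumes a static channel dimension.
    """
    channels = images.shape[-1]
    if channels is None or channels > target_channels:
        raise ValueError("channel count must be known and <= target_channels")
    # One [before, after] pair per axis: only the channel axis gets padding.
    paddings = [[0, 0], [0, 0], [0, 0], [0, target_channels - channels]]
    return tf.pad(images, paddings)


# Illustrative batch: 2 RGB images of 32x32 pixels (shape is an assumption).
batch = tf.random.uniform([2, 32, 32, 3])
padded = pad_channels(batch)      # shape (2, 32, 32, 8), extra channels are zero
restored = padded[..., :3]        # slice the original 3 channels back out
print(padded.shape, restored.shape)
```

For an autoencoder, the same slicing could be applied to an 8-channel output to recover a 3-channel image, with the loss computed only on the first 3 channels.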

Please share the model, script, profiler, and performance output, if you have not already, so that we can help you better.
Alternatively, you can try running your model with the trtexec command.

When measuring model performance, make sure you consider the latency and throughput of the network inference only, excluding the data pre- and post-processing overhead.
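As a rough illustration of timing only the inference call, here is a minimal Python sketch. The names benchmark and fake_infer are hypothetical, and fake_infer is just a stand-in for the real model call. It assumes the inference function blocks until the result is ready; with asynchronous GPU execution, the timed call would need to wait for completion for the numbers to be meaningful.

```python
import statistics
import time


def benchmark(infer, batch, warmup=10, runs=100):
    """Time only the infer(batch) call; the batch is prepared by the caller."""
    # Warm-up runs are not timed, so one-off setup costs do not skew the result.
    for _ in range(warmup):
        infer(batch)

    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(batch)
        latencies.append(time.perf_counter() - start)

    mean_s = statistics.mean(latencies)
    return {
        "mean_latency_ms": mean_s * 1e3,
        "median_latency_ms": statistics.median(latencies) * 1e3,
        "throughput_per_s": len(batch) / mean_s,
    }


def fake_infer(batch):
    """Stand-in for a real model call (assumption); does some dummy work."""
    return [sum(range(10_000)) + x for x in batch]


# Pre-processing happens here, outside the timed region.
prepared_batch = [float(i) for i in range(8)]
stats = benchmark(fake_infer, prepared_batch)
print(stats)
```

Any post-processing of the outputs would likewise happen after benchmark returns, so it stays out of the reported latency and throughput.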
Please refer to the link below for more details: