How to set the size of block and grid for Kernel when using CUDA 'render to frameBuffer' simulating OpenGL, etc?

How should I set the block and grid sizes for a kernel when using CUDA to “render to a frameBuffer”, simulating OpenGL, etc.?
Should I set the block to (1,1,1) and the grid to (imageWidth, imageHeight, 1)?

To a first-order approximation, I don’t think the question, or an answer to it, would be any different from a generic “how do I set CUDA grid/block dimensions?” question. And that might be the most frequently asked CUDA question of all time, so with a bit of searching you can find many commentaries. A basic starting point could be to pick a block size of, say, (32,32) and then pick a grid size based on the dimensions of your “framebuffer”: (width/32, height/32), rounded up so that partial tiles at the right and bottom edges are still covered, with a bounds check in the kernel for the threads that fall outside the image. This assumes one thread per output point, a common CUDA thread strategy/paradigm in image processing.
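The starting point described above can be sketched roughly as follows. This is only an illustration, not a tuned solution: `divUp`, `renderKernel`, the `uchar4` pixel format, the gradient fill, and the 1920x1080 size are all assumptions made up for the example, not something from the question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: integer ceiling division, so edge tiles are not dropped.
__host__ __device__ inline int divUp(int a, int b) { return (a + b - 1) / b; }

// Hypothetical kernel: one thread per output pixel, filling a simple gradient.
__global__ void renderKernel(uchar4* fb, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;

    // Threads in the partial tiles at the right/bottom edges do nothing.
    if (x >= width || y >= height) return;

    fb[y * width + x] = make_uchar4(255 * x / width, 255 * y / height, 0, 255);
}

int main()
{
    const int width = 1920, height = 1080;   // assumed framebuffer size

    uchar4* fb = nullptr;
    cudaMalloc(&fb, sizeof(uchar4) * width * height);

    dim3 block(32, 32);                       // 1024 threads per block
    dim3 grid(divUp(width, block.x), divUp(height, block.y));

    renderKernel<<<grid, block>>>(fb, width, height);

    cudaError_t err = cudaGetLastError();     // launch configuration errors
    if (err == cudaSuccess) err = cudaDeviceSynchronize();  // execution errors

    printf("grid = (%u, %u), status = %s\n", grid.x, grid.y, cudaGetErrorString(err));

    cudaFree(fb);
    return 0;
}
```

For the assumed 1920x1080 image this launches a 60 x 34 grid; the last row of blocks hangs partly off the bottom of the image, which is what the bounds check is for. A block of (16,16) or (32,8) would work with the same code and may be worth trying if the kernel uses many registers.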

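As for a block of (1,1,1) and a grid of (imageWidth, imageHeight, 1): gosh, no, never do that in any CUDA setting where you care about performance. Threads execute in warps of 32, so a one-thread block leaves 31 of the 32 lanes in its warp idle. Develop your CUDA expertise to the point where you understand why that matters before trying to proceed to a more finely tuned solution for “render to frameBuffer”, whatever that may mean.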