Give me the formula for calculating the blocks per grid in case of tiling

Suppose I need ROWS*COLS threads for tiled matrix multiplication.

The threads-per-block is given as -

dim3 threads_per_blocks(BLOCK_1, BLOCK_2, BLOCK_3);

and the tile dimensions are given as TILE_1*TILE_2.

Give me the formula for calculating:

dim3 blocks_per_grid(?, ?, ?);

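If one block processes one tile of the output matrix, the number of blocks in each grid dimension equals the number of tiles the output matrix has in that dimension. Divide the matrix extent by the tile extent and round up, so that partial tiles at the right and bottom edges still get a block; the unused third dimension is 1.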
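
As a rough illustration of that rule, here is a minimal host-side sketch. It rests on assumptions the question does not settle: that TILE_1 is the tile height (rows) and TILE_2 the tile width (columns), and that grid x runs over columns and grid y over rows, which is a common convention but not the only one. `Dim3`, `ceil_div`, and `blocks_per_grid_for` are hypothetical names; `Dim3` only stands in for CUDA's `dim3` so the snippet compiles without the CUDA toolkit.

```cpp
#include <cassert>

// Stand-in for CUDA's dim3 (hypothetical, for illustration only).
struct Dim3 {
    unsigned x, y, z;
};

// Integer ceiling division: the smallest q such that q * b >= a, for b > 0.
// Written as quotient plus a remainder check so a + b cannot overflow.
inline unsigned ceil_div(unsigned a, unsigned b) {
    assert(b > 0);
    return a / b + (a % b != 0 ? 1u : 0u);
}

// Number of blocks per grid dimension when one block handles one output tile.
// Assumption: tile_1 = tile height (rows), tile_2 = tile width (cols),
// grid x indexes tile columns, grid y indexes tile rows.
inline Dim3 blocks_per_grid_for(unsigned rows, unsigned cols,
                                unsigned tile_1, unsigned tile_2) {
    return Dim3{ceil_div(cols, tile_2),  // tiles across the width
                ceil_div(rows, tile_1),  // tiles down the height
                1u};                     // a 2D output needs no third dimension
}
```

Under the same assumptions, the CUDA form would be `dim3 blocks_per_grid((COLS + TILE_2 - 1) / TILE_2, (ROWS + TILE_1 - 1) / TILE_1, 1);`. Because of the rounding up, blocks along the edges may cover indices outside the matrix, so the kernel should check `row < ROWS && col < COLS` before reading or writing.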