Convert kernel code to a recursive kernel

Dear all,
How can I convert this kernel into a recursive kernel? The kernel must call itself many times with different values of rec_num and threadnum.
The values of rec_num must change as follows: 4, 8, 16, 32, 64, 128, 256, 512, 1024.
The values of threadnum must change as follows: 256, 128, 64, 32, 16, 8, 4, 2, 2.

The array that I want to merge (already copied to the device), in which each pair of adjacent elements is sorted, is passed as the parameter a.

Another, initially empty array (also on the device) is passed as the temp parameter.
The first call looks as follows:
threadsMerge<<<numOfBlocks, 256>>>(dev_a, dev_temp, 4, 256);
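
To make the sequence of calls I am after concrete, here is a rough sketch that drives it from the host with a plain loop instead of recursion. It is only an illustration under several assumptions: mergeSchedule and runAllPasses are hypothetical helper names, the int element type of a and temp is a guess, the block size is kept at 256 for every pass, and threadnum is simply halved and clamped at 2 to reproduce the list above. Whether a and temp have to be swapped between passes depends on the kernel body and is not handled here.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper: lists the (rec_num, threadnum) pair for each pass.
// rec_num doubles from 4 to 1024; threadnum halves from 256 but never drops
// below 2, which reproduces the sequence 256, 128, ..., 4, 2, 2.
std::vector<std::pair<int, int>> mergeSchedule() {
    std::vector<std::pair<int, int>> passes;
    int threadnum = 256;
    for (int rec_num = 4; rec_num <= 1024; rec_num *= 2) {
        passes.push_back({rec_num, threadnum});
        threadnum = std::max(threadnum / 2, 2);
    }
    return passes;
}

#ifdef __CUDACC__  // the launch part is only compiled by nvcc
// Assumed signature of the existing kernel; the element type is a guess.
__global__ void threadsMerge(int *a, int *temp, int rec_num, int threadnum);

// Hypothetical host driver: one kernel launch per pass, in order.
void runAllPasses(int *dev_a, int *dev_temp, int numOfBlocks) {
    for (const auto &pass : mergeSchedule()) {
        threadsMerge<<<numOfBlocks, 256>>>(dev_a, dev_temp,
                                           pass.first, pass.second);
    }
    // Launches on the same stream run in order; wait once for the last pass.
    cudaDeviceSynchronize();
}
#endif
```

A kernel that really launches itself on the device would instead need CUDA dynamic parallelism (compute capability 3.5 or higher, compiled with -rdc=true); the host loop above is just the simpler alternative.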