Dynamic parallelism and streams

If I execute a parent kernel on a non-default stream, do its child kernels automatically execute on the parent's stream?

No. Streams created on the host cannot be used on the device, and vice versa.
You can find more information on CDP1 here: CUDA C++ Programming Guide

I didn’t understand the answer.
I create the stream on the host and launch the parent kernel with it. Are the child kernels executed on the default stream or on the parent's stream?

Neither of those streams is used. The default stream within a kernel is different from the default stream of the host.
But whatever stream you use inside the kernel, the host-created stream does not progress until the parent kernel and all its child kernels have completed. (To be more precise, the parent kernel implicitly waits for all of its child kernels.)
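For completeness, a minimal sketch of launching the child into a stream created inside the kernel. Note that device-side stream creation requires the cudaStreamNonBlocking flag; the child kernel body and the compile flag comment are assumptions for illustration:

__global__ void child() { printf("child\n"); }

__global__ void parent()
{
    // Device-side streams must be created with cudaStreamNonBlocking.
    cudaStream_t s;
    cudaStreamCreateWithFlags(&s, cudaStreamNonBlocking);
    child<<<1, 1, 0, s>>>();
    cudaStreamDestroy(s);  // resources are released once work in s finishes
    // The parent still implicitly waits for child() before it completes.
}

(Dynamic parallelism needs relocatable device code, e.g. nvcc -rdc=true.)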

__global__ void child()
{
    printf("child\n");
}

__global__ void parent()
{
    printf("parent\n");
    child<<<1, 1>>>();
}

int main(void)
{
    cudaStream_t myStream;
    cudaStreamCreate(&myStream);
    parent<<<1, 1, 0, myStream>>>();
    cudaStreamSynchronize(myStream);
    return 0;
}

Will all GPU operations (parent and child) be on myStream?

The parent runs in myStream; the child runs in the device-side default stream of thread block 0.

cudaStreamSynchronize(myStream) waits for both parent and child.

Thank you for your patience

OK, I think I understand.

What is the meaning of "kernel stream"? Is there a kernel stream which is not the default one?
Can I create a stream within the kernel?

Assuming that this is my case:

cudaStream_t myStream1, myStream2;
cudaStreamCreate(&myStream1);
cudaStreamCreate(&myStream2);
parent<<<1, 1, 0, myStream1>>>(data1);
parent<<<1, 1, 0, myStream2>>>(data2);
cudaDeviceSynchronize();

The parent kernels may execute in parallel, each one on its own stream.
But may the children of the parent 1 call also run in parallel with the children of the parent 2 call?
Is there any connection between the device-side default stream of the parent 1 call and that of the parent 2 call?

The section of the programming guide that I linked above answers some of your questions.
There is also this NVIDIA blog post: CUDA Dynamic Parallelism API and Principles | NVIDIA Technical Blog
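In short: each launching thread block gets its own device-side default stream, so the children of the two parent calls have no connection and may run concurrently. A hedged sketch of the scenario above (the kernel bodies and the data1/data2 layout are assumptions):

__global__ void child(int *data) { data[threadIdx.x] += 1; }

__global__ void parent(int *data)
{
    // This launch goes into *this* block's default stream. Blocks from
    // different parent launches have different default streams, so these
    // children may run concurrently with the other parent's children.
    child<<<1, 32>>>(data);
}

int main(void)
{
    int *data1, *data2;
    cudaMalloc(&data1, 32 * sizeof(int));
    cudaMalloc(&data2, 32 * sizeof(int));
    cudaMemset(data1, 0, 32 * sizeof(int));
    cudaMemset(data2, 0, 32 * sizeof(int));

    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);
    parent<<<1, 1, 0, s1>>>(data1);
    parent<<<1, 1, 0, s2>>>(data2);
    cudaDeviceSynchronize();  // waits for both parents and all their children

    cudaFree(data1);
    cudaFree(data2);
    return 0;
}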