How does the CPU launch a CUDA kernel?

Hi all,

I am pretty new here. Just started coding in CUDA :rolleyes:

I have a few lower level questions that I am curious about:

  1. How does the CPU assign work to the GPU? Does it send the GPU an address for the instructions and data, which the GPU fetches later? Or does it send all the instructions and data at once?
    If the full code is sent, what happens if the code I wrote exceeds the GPU memory size?
    If the GPU only gets the addresses of the instructions, where does it fetch them from?

  2. Is “pinned memory” used to store both data and instructions, or data only?

  3. What are the places inside the GPU where the code resides?

Good question. “Ocelot” might have some clues. Read the original Ocelot papers (by Gregory Diamos and others). I think they explain how the EXE file stores the CUDA executable code, among other things.

http://www.cercs.gatech.edu/tech-reports/tr2009/git-cercs-09-18.pdf
http://www.gdiamos.net/papers/ocelot-nvidia-research.pdf

You may find it in one of these papers, going by my vague memory.

Yes. I found this link very useful. Now I understand what is going on.

Thank you!

You can run “nvcc -cuda x.cu” to generate the complete .CPP file. Look at the constructors, binaries, etc. in there if you need detailed info. Alternatively, you can work out the same thing from Ocelot’s source code.
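If you want something small to try that command on, here is a minimal sketch of what x.cu could contain. The names fill_index and run_fill_index and the block size are made up for illustration and are not from the posts above. The exact name and contents of the generated file may differ between toolkit versions, so treat this only as a starting point for poking around.

```cuda
// x.cu: hypothetical minimal input for "nvcc -cuda x.cu".
#include <cuda_runtime.h>

// Trivial kernel: each thread writes its own global index into out[].
__global__ void fill_index(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = i;
    }
}

// Host side: allocate device memory, launch the kernel, copy results back.
// Returns 0 on success and 1 on any CUDA error. Assumes n > 0.
int run_fill_index(int *host, int n)
{
    int *dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(int)) != cudaSuccess) {
        return 1;
    }

    const int threads = 128;                        // illustrative block size
    const int blocks = (n + threads - 1) / threads; // enough blocks to cover n

    // This launch is the part the generated host code has to implement.
    fill_index<<<blocks, threads>>>(dev, n);

    cudaError_t err = cudaGetLastError();           // catches launch errors
    if (err == cudaSuccess) {
        // cudaMemcpy waits for the kernel to finish before copying.
        err = cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    }

    cudaFree(dev);
    return err == cudaSuccess ? 0 : 1;
}
```

There is no main() here because -cuda only translates the file and does not link anything. Add a main() that calls run_fill_index if you want to build and run it as a normal program.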
