NVRTC without installed toolkit

Hello, I need to distribute NVRTC in a NuGet package, without the option of installing the CUDA toolkit. In the end I need a tool that can create PTX code, with all the necessary components delivered by me. So there is

  • no toolkit installed
  • no Visual Studio installed.

With that I’m using clang to create the syntax tree. Everything works fine, but I have some code that uses FLT_MAX or std::numeric_limits<float>::max(). So I added the Cxx includes, but NVRTC complains that __config_site is not available. The file is there, but it has the extension .in.

Said differently, what is the best option to deliver nvrtc without installation?

If it were me, I would say the best option would be to deliver code that does not depend on various headers, or else to include those headers and all necessary dependencies in my distribution.

Whether or not the redistribution of “extras” is covered by appropriate EULAs I cannot say. If there is concern there it simply biases towards the first choice – deliver code that does not depend on various headers.

FLT_MAX should be easy enough to work around without pulling in a whole set of dependent headers.

This may also be of interest in the context of making nvrtc easier to work with; I’m not suggesting it’s a direct answer to anything you’ve asked here.

@Robert_Crovella: Thanks for your feedback. I took a closer look at what I’m doing. So, to lay it out in a bit more detail:

  • I’m using clang to get the syntax tree. From this syntax tree I create C# wrappers for our libraries
    This means I have to distribute clang and the llvm/clang headers. The question is whether I really need the CUDA/clang headers delivered by LLVM, or whether I could simply use the CUDA headers from the toolkit. It seems clang currently needs them, but I don’t have any information on this.

  • On the other side I’m using NVRTC to create the PTX code
    Originally we did this with NVCC/Visual Studio but there always was a problem with different Visual Studio versions and different CUDA versions. No one ever did this right.

This is the reason why we bundled this as an internal design-time package where all prerequisites are delivered as we wish, meaning we want a well-defined CUDA toolkit version and a well-defined compiler. With the NVRTC solution we don’t even need to define the compiler anymore, and no installation is necessary on the build servers.

(Note: regarding your concern about the EULA, this is an internal package; it does not get distributed.)

The only thing I’m currently struggling with is clang with -nocudainc. It seems to me that the syntax tree cannot be created.