Hello, we want predictable host-side latency when calling cuDNN APIs (sacrificing a little GPU time is acceptable), so we would like to disable RTC (compiling CUDA kernels at runtime). Is there any way to do that?
Hi @csdncannon ,
Each engine has a behavior note, which includes a bit that says whether or not the engine uses RTC. You can filter engines based on that note.
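To illustrate the filtering idea, here is a minimal, self-contained sketch. It does not call cuDNN: `BehaviorNote`, `EngineInfo`, `uses_rtc`, and `without_rtc` are hypothetical stand-ins for whatever you collect from each engine. In the Backend API the note to look for should be `CUDNN_BEHAVIOR_NOTE_RUNTIME_COMPILATION`, read through the engine's `CUDNN_ATTR_ENGINE_BEHAVIOR_NOTE` attribute; please check the exact query calls against the documentation for your cuDNN version.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for cuDNN's behavior-note enum. In cuDNN itself the
// relevant value is CUDNN_BEHAVIOR_NOTE_RUNTIME_COMPILATION.
enum class BehaviorNote { RuntimeCompilation, Other };

// Hypothetical record holding the notes already queried from one engine.
struct EngineInfo {
    std::string name;
    std::vector<BehaviorNote> notes;
};

// True if the engine advertises that it compiles kernels at runtime.
bool uses_rtc(const EngineInfo& engine) {
    return std::find(engine.notes.begin(), engine.notes.end(),
                     BehaviorNote::RuntimeCompilation) != engine.notes.end();
}

// Keep only engines without the runtime-compilation note, preserving the
// original (for example, heuristic-ranked) order.
std::vector<EngineInfo> without_rtc(const std::vector<EngineInfo>& engines) {
    std::vector<EngineInfo> kept;
    std::copy_if(engines.begin(), engines.end(), std::back_inserter(kept),
                 [](const EngineInfo& e) { return !uses_rtc(e); });
    return kept;
}
```

Preserving the input order matters if the candidate list came from a heuristic ranking, since the first surviving engine is then still the best-ranked one that avoids RTC.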
Thanks
Thanks a lot. So we need to switch to the cuDNN Backend API. It seems that TensorFlow and PyTorch currently still don't accommodate many of the fusion patterns supported by the Backend API. Do you have any updates on making the Backend API more widely adopted?
@csdncannon, we are actually pushing adoption of our Frontend, as it offers a level of abstraction that shortens development time and provides greater flexibility.