Cross-compile a Riva/TRT model for different CUDA compute capabilities

Hardware: GPU (T4 and RTX 3090)
Operating System: Amazon Linux 2
Riva Version: 1.8.0b0

Is it possible to riva-build and riva-deploy a model compiled for a different TRT target architecture?
I have a local RTX 3090 (compute capability 8.6) and would like to deploy onto a T4 (AWS g4dn instance, compute capability 7.5).
Currently I have to spin up a g4dn instance and build there before deploying; otherwise, I cannot use a model built on my own machine.
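
For reference, this is how I confirm the mismatch between the two machines (a minimal sketch, assuming pycuda is installed; it is not part of Riva):

```python
# Minimal sketch (assumes pycuda; not part of Riva): print the local GPU's
# compute capability to compare against the intended deployment target.
import pycuda.driver as cuda

cuda.init()
dev = cuda.Device(0)
major, minor = dev.compute_capability()
print(f"{dev.name()}: compute capability {major}.{minor}")
# An RTX 3090 reports 8.6; a T4 (g4dn instance) reports 7.5.
```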

Is this a Riva or TRT limitation?

This is a TensorRT limitation. The deploy process optimizes beyond compute capability alone: it also takes into account the amount of memory available for tactics (i.e., the candidate ways to compute a layer), the SM count, and other properties of the specific GPU. The process you describe, building on the same GPU model you deploy to, is our recommended approach.
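
To illustrate the failure mode, here is a hedged sketch using the TensorRT Python API (`model.plan` is a hypothetical engine file serialized on the RTX 3090): deserializing it on the T4 does not produce a usable engine.

```python
# Sketch (assumes the tensorrt Python package; "model.plan" is a hypothetical
# engine file serialized on the RTX 3090). Loading it on a T4 fails because
# tactic selection was tuned to the build GPU's SM count and memory.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
with open("model.plan", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
if engine is None:
    print("Engine failed to deserialize; rebuild it on the target GPU.")
```

This is why rebuilding on the g4dn instance works while copying the engine built on the 3090 does not.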

We’re internally tracking this request for more flexible deployment mechanisms. Thanks for the feedback.
