Generally speaking, different GPU architectures are not binary compatible. It is therefore best practice to build so-called “fat” binaries that contain machine code for each GPU architecture one intends to support with a given product. In addition, for forward compatibility, the fat binary should include code in PTX, a portable intermediate virtual machine language. The PTX can be JIT-compiled at runtime into machine code for the GPU currently in use.
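To make the fat-binary idea concrete, here is a minimal sketch of a build-script helper that assembles the `-gencode` options one would pass to `nvcc`. The helper name `fatbin_flags` and its parameters are hypothetical, made up for illustration; only the `arch=compute_XY,code=sm_XY` (machine code) and `arch=compute_XY,code=compute_XY` (PTX) option syntax comes from nvcc itself. Which architectures to list is a per-product decision.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of any CUDA tool): builds the nvcc
// -gencode options for a fat binary.
//   sass_archs: compute capabilities to embed machine code for,
//               written without the dot (75 means 7.5).
//   ptx_arch:   compute capability to embed PTX for, typically the
//               newest one in the list, to allow JIT on future GPUs.
std::vector<std::string> fatbin_flags(const std::vector<int>& sass_archs,
                                      int ptx_arch) {
    std::vector<std::string> flags;
    for (int cc : sass_archs) {
        const std::string n = std::to_string(cc);
        flags.push_back("-gencode");
        // code=sm_XY requests machine code for that architecture.
        flags.push_back("arch=compute_" + n + ",code=sm_" + n);
    }
    const std::string p = std::to_string(ptx_arch);
    flags.push_back("-gencode");
    // code=compute_XY requests that PTX be embedded.
    flags.push_back("arch=compute_" + p + ",code=compute_" + p);
    return flags;
}
```

For example, `fatbin_flags({75, 86}, 86)` yields the options for machine code for compute capabilities 7.5 and 8.6 plus PTX for 8.6. The final PTX entry is the part that provides forward compatibility.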
When a CUDA-accelerated app starts, the CUDA runtime first looks for machine code matching the architecture of the GPU currently in use. If none can be found, it looks for PTX that can be JIT-compiled. If that isn’t found either, it gives up, leading to the sort of error you encountered.
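The lookup order can be sketched as a small model. This is a simplified illustration, not the actual CUDA runtime logic: all type and function names (`Arch`, `FatBinary`, `select_image`) are hypothetical. It assumes the documented compatibility rules that machine code built for compute capability X.y runs on devices X.z with z >= y, and that PTX built for a given compute capability can be JIT-compiled for any equal or higher one. It ignores architecture-specific targets.

```cpp
#include <vector>

// Hypothetical model types, for illustration only.
struct Arch { int cc_major; int cc_minor; };

enum class Image { MachineCode, JitFromPtx, None };

struct FatBinary {
    std::vector<Arch> sass;  // architectures with embedded machine code
    std::vector<Arch> ptx;   // architectures with embedded PTX
};

// Simplified model of the selection order described above.
Image select_image(const FatBinary& fb, Arch gpu) {
    // 1. Machine code: same major version, minor not newer than the GPU.
    for (const Arch& a : fb.sass) {
        if (a.cc_major == gpu.cc_major && a.cc_minor <= gpu.cc_minor)
            return Image::MachineCode;
    }
    // 2. PTX: any compute capability not newer than the GPU can be JIT-compiled.
    for (const Arch& a : fb.ptx) {
        if (a.cc_major < gpu.cc_major ||
            (a.cc_major == gpu.cc_major && a.cc_minor <= gpu.cc_minor))
            return Image::JitFromPtx;
    }
    // 3. Nothing usable: the app fails with an error.
    return Image::None;
}
```

In this model, a binary that carries only machine code for 6.1 and 7.5 gives `Image::None` on an 8.6 GPU, while the same binary with PTX for 7.5 added gives `Image::JitFromPtx`.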
Check whether the app vendor has a new version of this app available that supports the Ampere architecture (compute capability 8.x; specifically 8.6 for the RTX 3060 Ti). You might also want to file an enhancement request with the vendor to include PTX in their fat binaries, since PTX apparently isn’t present, which is why the app lacks forward compatibility.