Unfortunately, OptiX Prime is not compatible with Ampere GPUs on recent display drivers, not even in the OptiX SDK 6.5.0 release.
That API has been removed from the OptiX 7 SDKs because it does not make use of the RTX ray tracing hardware units.
The only feasible future-proof solution would be to port the OptiX Prime application over to the OptiX 7 API. That should actually not be too difficult because of the limited features the OptiX Prime API offered.
The OptiX SDKs have contained an example named optixRaycasting for quite some time now (even in OptiX 6 versions) which demonstrates the approach described below.
OptiX Prime applications would only handle the ray-triangle intersection part with it, within a similarly limited acceleration structure hierarchy (one instance level over triangle geometry). Everything around that, meaning ray generation and shading calculations, would happen outside of it, usually in native CUDA kernels. That part can be reused completely when just implementing the ray-triangle intersection with OptiX 7 instead.
OptiX Prime only supported a completely flat hierarchy (triangle geometry only) or a single-level hierarchy (instances of triangle geometry). Both are fully hardware-accelerated cases in OptiX 7 on RTX boards (e.g. look for OPTIX_TRAVERSABLE_GRAPH_FLAG_ALLOW_SINGLE_LEVEL_INSTANCING inside the OptiX 7 docs).
Building the same kind of acceleration structure and defining the geometric primitives would need to be changed inside the host code. On the device side, the ray generation program takes your ray query data and shoots the rays. There would only need to be one closest hit program because all it does is return the hit results from triangles. A miss program wouldn’t be needed since a miss can be covered by the default initialization of the hit result (a negative t_hit indicates a miss).
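To illustrate the "default initialization instead of a miss program" idea, here is a minimal host-side sketch. The HitResult struct, its field names, and the helper functions are hypothetical and not taken from the OptiX SDK. In a real application the buffer would live in device memory, and the default values would be written there (for example by the ray generation program before tracing) rather than by a std::vector constructor.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical hit record; field names are illustrative, not an OptiX type.
struct HitResult {
    float t     = -1.0f;  // hit distance; a negative value means "miss"
    int   triId = -1;     // index of the hit triangle, -1 when nothing was hit
    float u     = 0.0f;   // barycentric coordinates of the hit point
    float v     = 0.0f;
};

// A record that was never overwritten by a closest hit program is a miss.
inline bool isMiss(const HitResult& hit) {
    return hit.t < 0.0f;
}

// One default-initialized (miss) record per ray query.
inline std::vector<HitResult> makeHitBuffer(std::size_t rayCount) {
    return std::vector<HitResult>(rayCount);
}

// Counts the records that a closest hit program has filled in.
inline std::size_t countHits(const std::vector<HitResult>& hits) {
    std::size_t count = 0;
    for (const HitResult& hit : hits) {
        if (!isMiss(hit)) {
            ++count;
        }
    }
    return count;
}
```

With that layout, only the closest hit program ever writes to a record, and the shading kernel can simply test isMiss() on each entry.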
The benefit of using the OptiX 7 API would be full RTX hardware acceleration of the BVH traversal and ray-triangle intersection, and additionally you could handle other primitive types, have a more flexible scene hierarchy, fully custom ray query and hit result data, and some more options.
Once that is working, using the whole ray pipeline by also moving the ray generation and shading calculations into OptiX 7 device code would allow you to increase the performance even more.
Meanwhile is there a way to programmatically find if Optix5.1 is not compatible on current device, so I can error appropriately instead of crashing?
You could query the CUDA device properties with the CUDA Runtime API, or the device attributes with the CUDA Driver API, for the streaming multiprocessor version (compute capability) and reject architectures that are too new.
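A rough sketch of such a check, kept free of CUDA headers so the decision logic stands on its own. The function names and the threshold constant are hypothetical. The threshold assumes that Ampere GPUs report compute capability 8.x, so everything with major version 8 or higher gets rejected. The major and minor values would come from cudaGetDeviceProperties (the major and minor fields of cudaDeviceProp) or from cuDeviceGetAttribute with CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR and CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR.

```cpp
#include <string>

// Assumption: Ampere reports compute capability 8.x, so major version 8
// is the first architecture on which OptiX Prime must be rejected.
constexpr int kFirstUnsupportedMajor = 8;

// Returns true when the device architecture is older than Ampere.
inline bool isOptixPrimeSupported(int ccMajor) {
    return ccMajor < kFirstUnsupportedMajor;
}

// Returns an empty string when the device is usable, otherwise an error
// message the application can report instead of crashing.
inline std::string describeSupport(int ccMajor, int ccMinor) {
    if (isOptixPrimeSupported(ccMajor)) {
        return std::string();
    }
    return "OptiX Prime is not supported on compute capability " +
           std::to_string(ccMajor) + "." + std::to_string(ccMinor);
}
```

Run that check once per device before creating the OptiX Prime context, and skip or report any device for which describeSupport() returns a non-empty message.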