TensorRT Options for Deploying/Serving a Model to Be Called by Multiple PCs

I have created a TensorRT-optimized image classification model and would like to deploy it to our own locally hosted, GPU-enabled server so that multiple desktop client workstations can make inference calls to it in a production setting. The only ways I have found to deploy a TensorRT model this way are DGX-1, DGX Station, or the NVIDIA GPU Cloud. All of these options cost tens or hundreds of thousands of dollars, which is far beyond our budget. Is there a cost-effective way to deploy TensorRT models to our own dedicated server?
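For context, here is roughly what I already have working on the server itself. This is a minimal sketch adapted from NVIDIA's standard TensorRT Python samples (it uses the older bindings API); the engine filename `classifier.engine` and the single input/single output binding layout are placeholders for my actual model:

```python
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize the engine file produced by the TensorRT optimization step.
# "classifier.engine" is a placeholder name.
with open("classifier.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()

# Page-locked host buffers and device buffers, assuming one input
# binding (index 0) and one output binding (index 1).
h_input = cuda.pagelocked_empty(trt.volume(engine.get_binding_shape(0)),
                                dtype=np.float32)
h_output = cuda.pagelocked_empty(trt.volume(engine.get_binding_shape(1)),
                                 dtype=np.float32)
d_input = cuda.mem_alloc(h_input.nbytes)
d_output = cuda.mem_alloc(h_output.nbytes)
stream = cuda.Stream()

def infer(image: np.ndarray) -> np.ndarray:
    """Run one preprocessed image through the engine and return class scores."""
    np.copyto(h_input, image.ravel())
    cuda.memcpy_htod_async(d_input, h_input, stream)
    context.execute_async(bindings=[int(d_input), int(d_output)],
                          stream_handle=stream.handle)
    cuda.memcpy_dtoh_async(h_output, d_output, stream)
    stream.synchronize()
    return h_output.copy()
```

What I am missing is the serving layer in front of this, so that the client workstations can send images over the network rather than each needing a GPU and a local copy of the engine.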