Feature Proposal: Enable Deterministic Algorithms in the Triton Inference Server PyTorch Backend

Is your feature request related to a problem? Please describe.

For many use cases, such as scientific research, debugging, or regulated environments, having reproducible model outputs is a critical requirement. The PyTorch backend currently lacks a direct mechanism to enforce deterministic algorithm execution, which can lead to slight variations in output even with the same inputs.

Describe the solution you’d like

We propose introducing a new model configuration parameter to enable deterministic operations within the PyTorch backend. When set, the backend would call at::Context::setDeterministicAlgorithms(true) during model loading or execution, directing PyTorch to select deterministic implementations where they exist and to raise an error for operations that have no deterministic variant.

This would provide users with a simple and effective way to guarantee reproducibility for their models served via Triton.

Example config.pbtxt:

parameters: {
  key: "DETERMINISTIC",
  value: { string_value: "true" }
}
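For illustration, the backend would need to interpret the parameter's string_value as a boolean. The helper below is a hypothetical sketch in Python (the actual backend is implemented in C++, and the function name and accepted spellings are assumptions, not part of the PR); it only shows one plausible normalization of the string.

```python
def parse_bool_parameter(value: str, default: bool = False) -> bool:
    """Interpret a Triton model-config string parameter as a boolean.

    Hypothetical helper for illustration only: the real backend parses
    parameters in C++. Unrecognized values fall back to `default`.
    """
    normalized = value.strip().lower()
    if normalized in ("true", "1"):
        return True
    if normalized in ("false", "0"):
        return False
    return default


# The DETERMINISTIC parameter from config.pbtxt arrives as the string "true".
deterministic = parse_bool_parameter("true")
```

With this in place, the backend could gate the setDeterministicAlgorithms call on the parsed flag, defaulting to the current (non-deterministic) behavior when the parameter is absent.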

Describe alternatives you’ve considered
N/A

Additional context

I’ve opened a small PR to add this feature and would appreciate a review: Enable configurable deterministic algorithm flag by yhna940 · Pull Request #150 · triton-inference-server/pytorch_backend · GitHub