Is your feature request related to a problem? Please describe.
For many use cases, such as scientific research, debugging, or regulated environments, reproducible model outputs are a critical requirement. The PyTorch backend currently lacks a direct mechanism to enforce deterministic algorithm execution, which can lead to run-to-run variations in output even for identical inputs.
Describe the solution you’d like
We propose introducing a new model configuration parameter to enable deterministic operations within the PyTorch backend. When set, the backend would call `at::globalContext().setDeterministicAlgorithms(...)` during model loading or execution, ensuring that PyTorch uses deterministic algorithms wherever they are available.
This would provide users with a simple and effective way to guarantee reproducibility for their models served via Triton.
Example `config.pbtxt`:

```
parameters: {
  key: "DETERMINISTIC",
  value: { string_value: "true" }
}
```
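For illustration, here is a minimal Python sketch of how the backend might interpret this parameter. The helper names (`parse_deterministic_flag`, `apply_determinism`) and the dict layout are assumptions for illustration, modeled on how Triton's Python-side code parses `model_config` JSON; the actual PR implements this in the C++ backend via the `at::Context` API.

```python
def parse_deterministic_flag(model_config: dict) -> bool:
    """Read the proposed DETERMINISTIC parameter from a Triton model
    config that has been parsed into a dict (hypothetical layout,
    mirroring the parameters block in config.pbtxt)."""
    params = model_config.get("parameters", {})
    value = params.get("DETERMINISTIC", {}).get("string_value", "false")
    return value.strip().lower() in ("true", "1")

def apply_determinism(enabled: bool) -> None:
    """Apply the flag via PyTorch's Python API; the C++ backend would
    call at::globalContext().setDeterministicAlgorithms(...) instead."""
    if enabled:
        import torch  # imported lazily so the parser is torch-free
        torch.use_deterministic_algorithms(True)
```

Note that `torch.use_deterministic_algorithms(True)` makes PyTorch raise an error for operations that have no deterministic implementation, which is why exposing this as an opt-in parameter (rather than a default) seems appropriate.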
Describe alternatives you’ve considered
N/A
Additional context
I’ve opened a small PR to add this feature and would appreciate a review: triton-inference-server/pytorch_backend#150 (Enable configurable deterministic algorithm flag, by yhna940).