SVTProtocol - Difference between implementation and algo 2 in paper

We are setting up a federated learning system with differential privacy and have noticed that the implementation of the SVTProtocol is not consistent with Algorithm 2 in the paper Li et al., Privacy-preserving Federated Brain Tumour Segmentation, arXiv:1910.00962. The implementation differs from the paper in the following points:

  1. Calculation of the threshold (Algo 2, Line 5): the paper uses Lap(s/eps_2), the implementation Lap(s/eps_1).
  2. Calculation of eps_2: the paper proposes eps_2 = ((2qs)**(2/3)) * eps_1, while the implementation uses eps_2 = ((2q)**(2/3)) * eps_1.
  3. Calculation of the threshold-comparison noise (Algo 2, Line 8): the paper uses Lap(2qs/eps_1), the implementation Lap(2qs/eps_2).
  4. Calculation of the noisy answer (Algo 2, Line 9): the paper uses Lap(qs/eps_3), the implementation Lap(s/eps_3).
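For reference, the paper-side formulas in points 1–4 can be collected into a single SVT-style selection step. This is a minimal, illustrative sketch using the paper's naming; `svt_select` and all parameter values are our own invention, not the Clara Train implementation:

```python
import numpy as np

def svt_select(answers, threshold, q, s, eps_1, eps_2, eps_3, rng=None):
    """Illustrative SVT-style selection step, following the paper's naming.

    answers   : clipped query answers, each with sensitivity s
    threshold : release threshold
    q         : fraction of answers to release
    eps_1/2/3 : split of the privacy budget

    This is a hypothetical sketch of the quoted formulas, not the
    Clara Train SVTProtocol.
    """
    rng = rng or np.random.default_rng()
    # Algo 2, Line 5 (paper): noisy threshold = threshold + Lap(s / eps_2)
    rho = threshold + rng.laplace(scale=s / eps_2)
    released = []
    for i, a in enumerate(answers):
        # Algo 2, Line 8 (paper): per-answer comparison noise ~ Lap(2qs / eps_1)
        nu = rng.laplace(scale=2 * q * s / eps_1)
        if a + nu >= rho:
            # Algo 2, Line 9 (paper): release the answer with Lap(qs / eps_3) noise
            released.append((i, a + rng.laplace(scale=q * s / eps_3)))
        if len(released) >= max(1, int(q * len(answers))):
            break
    return released
```

The loop stops once roughly a q-fraction of answers has been released, which is how the sparse vector technique caps the number of above-threshold outputs.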

The following questions arise for us now:

  1. Is the implementation correct, and what is the reasoning behind the changes?
  2. What guarantee does the implemented algorithm have? Is the (eps_1 + eps_2 + eps_3)-DP guarantee from the paper still valid?

Thank you for your support!

All the best,

Hi Iwan,

Appreciate your interest in Clara Train and FL. I had a conversation with the authors of the paper and have some feedback.

First of all, eps1, eps2, eps3, q, s are all free parameters that control the trade-off between the DP level and model performance. As long as they are positive values, they shouldn’t break the DP procedure.

That said, issue #2 is indeed a typo in the paper; it should be eps1 = ((2q)**(2/3)) * eps2 (this is a heuristic solution from page 6 of the paper, so that there is one less free parameter to tune).
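As a quick numeric check, the corrected heuristic fixes eps1 once eps2 and q are chosen. The values below are purely illustrative, not recommended settings:

```python
# Heuristic from the reply, in the code's naming: eps1 = (2q)^(2/3) * eps2.
# q and eps2 are illustrative values only.
q, eps2 = 0.1, 0.5
eps1 = (2 * q) ** (2 / 3) * eps2  # one fewer free parameter to tune
```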

Issues #1, #3 and #4 are just naming-convention discrepancies: eps1 and eps2 in the paper are eps2 and eps1 in the code, respectively, and the code's eps3 corresponds to the paper's eps3 divided by q. The latter decouples q and eps_3 when constructing the last Laplace distribution, which makes manual debugging slightly easier.
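To make the renaming concrete, here is a small consistency check showing that each Laplace scale in the paper's naming matches the corresponding scale in the code's naming (all numeric values are hypothetical):

```python
# Paper naming (illustrative values only)
eps1_p, eps2_p, eps3_p, q, s = 0.3, 0.5, 1.0, 0.1, 2.0

# Code naming, per the mapping described above
eps1_c = eps2_p        # paper's eps2 is the code's eps1
eps2_c = eps1_p        # paper's eps1 is the code's eps2
eps3_c = eps3_p / q    # code's eps3 is the paper's eps3 divided by q

# Algo 2, Line 5: Lap(s/eps2) in the paper == Lap(s/eps1) in the code
assert s / eps2_p == s / eps1_c
# Algo 2, Line 8: Lap(2qs/eps1) in the paper == Lap(2qs/eps2) in the code
assert 2 * q * s / eps1_p == 2 * q * s / eps2_c
# Algo 2, Line 9: Lap(qs/eps3) in the paper == Lap(s/eps3) in the code
assert abs(q * s / eps3_p - s / eps3_c) < 1e-12
```

Since every noise scale is identical under both conventions, the two versions draw from the same distributions, and the DP guarantee from the paper carries over (with the epsilons relabelled accordingly).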

Proper choice of these parameters still depends heavily on the actual learning rate, the underlying training task, the optimization method and the network architecture. For parameter tuning and debugging, it is helpful to experiment on a relevant validation set by visualizing the distribution of the model weights and monitoring the actual thresholds.
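A minimal debugging aid along those lines might look like the following. The Gaussian stand-in for the weight updates and the threshold value are assumptions for illustration; in practice you would inspect the real updates from a validation run:

```python
import numpy as np

# Synthetic stand-in for a layer's weight updates (assumption: roughly
# zero-mean with small spread; replace with real updates in practice).
rng = np.random.default_rng(0)
updates = rng.normal(0.0, 0.01, size=10_000)

threshold = 0.02  # hypothetical candidate for the SVT release threshold
share_above = np.mean(np.abs(updates) > threshold)
print(f"fraction of |updates| above the threshold: {share_above:.3f}")
```

Plotting a histogram of `updates` next to the candidate threshold gives a quick sense of how many values the protocol would consider releasing at a given setting of q.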