# NeurIPS 2023 Outstanding Paper: Privacy auditing in just one run

NeurIPS 2023 just wrapped up, and one of the two outstanding paper awards went to Privacy Auditing with One (1) Training Run, by Thomas Steinke, Milad Nasr, and Matthew Jagielski. The main result of this paper is a method for auditing the (differential) privacy guarantees of an algorithm, but much faster and more practically than previous methods. In this post, we’ll dive into what this all means.

In case you’re new to this: by now, it has been well established that ML models can leak information about their training data.
This has recently been demonstrated in a spectacular fashion for large language models and diffusion models, showing that these models are prone to *regurgitating* elements from their training dataset verbatim.
Beyond these models, training data leakage can occur to a variety of degrees in other statistical settings.
This can of course be problematic if the training data contains sensitive personal information that we do not wish to disclose.
It may may also be relevant to other adjacent considerations, including copyright infringement, which we don’t delve into here.

While there have been a number of heuristic proposals for how to deal with such problems, only one method has stood the test of time: differential privacy (DP). Roughly speaking, an algorithm (e.g., a model’s training procedure) is differentially private if its output has limited dependence (in some precise sense) on any single datapoint. This has many convenient implications: if a training procedure is differentially private, the resulting model is very unlikely to spit out training data, it is hard to predict whether a particular datapoint was in its training dataset, etc. This strong notion of privacy has been adopted by a number of organizations, including Google, Microsoft, and the US Census Bureau in the 2020 US Census. Differential privacy is a quantitative guarantee, parameterized by a value \(\varepsilon \geq 0\): the smaller \(\varepsilon\) is, the stronger the privacy protection (albeit at the cost of utility).

In order to say an algorithm is differentially private, we have to *prove* it.
By analyzing the algorithm, we obtain an *upper bound* on the value of \(\varepsilon\), i.e., a guarantee that the algorithm satsfies *at least* some prescribed level of privacy.
And we can be confident in this guarantee without running a single line of code!
A rich line of work studies a differentially private analogue of stochastic gradient descent (which includes per-example gradient clipping followed by Gaussian noise addition), providing tighter and tighter upper bounds on the value of \(\varepsilon\).

Is there any way to empirically *audit* the privacy of an algorithm?
Provided a purportedly private procedure, is there an algorithm we can run to *lower bound* the value of \(\varepsilon\)?
This would discover that the procedure enjoys privacy no better than some particular level.
There’s many reasons one might want to audit an algorithm’s privacy guarantees:

- We can see if our privacy proof is tight: if we prove and audit matching values of \(\varepsilon\), then we know that neither can be improved.
- We can see if our privacy proof is
*wrong*: if we audit a value of \(\varepsilon\) that is*greater*than the value we prove, then we know there was a bug in our privacy proof. - If we’re unable to rigorously prove an algorithm is private, auditing gives some heuristic measure of how private the algorithm is (though this is not considered best practice in settings where privacy is paramount: auditing only lower bounds \(\varepsilon\), the true value may be much higher).

There is a long line of work on this question from the perspective of *membership inference attacks*.
In a membership inference attack, we consider training a model on either a) some training dataset, or b) the same training dataset but with the inclusion of one extra datapoint (sometimes called a *canary*).
If we can correctly guess whether the canary was or was not in the training set, then we say the membership inference attack was successful.
However, recall that differential privacy limits the dependence on individual datapoints: if an algorithm is private, it means that membership inference attacks should not be very successful.
Conversely, if an attack *is* very successful, then it say the algorithm is quantitatively *not* so private.
In other words, such membership inference attacks serve as an auditing for the privacy of the algorithm.

An important technical point is that differential privacy is a *probabilistic* guarantee.
A single membership inference attack success or failure may happen by chance: in order to make conclusions about the privacy level of a procedure, we need to run the attack several times in order to estimate the *rate* of success.
Since for machine learning models, each attack corresponds to one training run, this can quickly result in prohibitive overheads.
As one extreme example, one work trains 250,000 models to audit a proposed private training algorithm, revealing a bug in its privacy proof.
While these are small models (CNNs trained on MNIST), and the authors admit their auditing was overkill (they *only* needed to train 1,000 models), in modern settings, even a *single* extra training run is prohibitively expensive, thus rendering such privacy auditing methods impractical.

Here’s where the work of Steinke, Nasr, and Jagielski comes in: it performs privacy auditing with just one (1) training run.
This could even be the same as your actual training run, thus incurring minimal overhead with respect to the standard training pipeline.
Their method does this by randomly inserting *multiple* canaries into the dataset rather than just a single one, and privacy is audited by trying to guess which canaries were and were not trained on.
If one can correctly guess the status of many canaries, this implies that the procedure is not very private.
The analysis of this framework is the tricky part, and gets quite technical.
While textbook analysis of the addition/removal of multiple canaries would rely on a property of differential privacy known as “group privacy,” this turns out to be lossy.
Instead, the authors appeal to connections between differential privacy and generalization: they show that if you add multiple canaries i.i.d. for a single run, this behaves similarly to having multiple runs each with a single canary.

In short, this work is a breakthrough in privacy auditing. It allows us to substantially reduce the computational overhead, from prohibitive to essentially negligible. Up to this point, privacy auditing has mostly been employed by those with a surplus of compute: I’m excited to see how this work will make it more accessible to the GPU-poor. Congratulations to Thomas, Milad, and Matthew on their fantastic result!

[cite this]