Open Problem: Avoiding the Union Bound for Multiple Queries
Background: Perhaps the best-studied problem in differential privacy is answering multiple counting queries. The standard approach is to add independent, appropriately calibrated (Laplace or Gaussian) noise to each query result and apply a composition theorem. To bound the maximum error over the query answers, one takes a union bound over the independent noise samples. However, this is not optimal. The problem is to identify the optimal method (up to constant factors).
Problem 1: Is there a randomized algorithm \(M : \{0,1\}^{n \times k} \rightarrow \mathbb{R}^k\) that is differentially private^{1} and satisfies \[ \forall x \in \{0, 1\}^{n \times k} \quad \mathbb{E}\left[ \left\| M(x) - \sum_{i=1}^n x_i \right\|_\infty \right] \leq c \sqrt{k} \] for some constant \( c > 0 \) depending only on the privacy parameters?^{2}
Adding independent Gaussian noise to each coordinate/query yields \(c \sqrt{k \log k}\) in place of \(c \sqrt{k}\) above. Steinke and Ullman [SU17] showed that the union bound can almost be avoided, obtaining \(c \sqrt{k \log \log k}\) using correlated noise. Their algorithm is nonetheless based on independent Gaussian noise, with the added step of using the exponential mechanism to identify high-error answers and correct them.
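For concreteness, here is a minimal NumPy sketch of the baseline Gaussian mechanism (the function name, the \(\sigma\) calibration, and the fixed seeds are our illustrative choices, not part of the problem statement):

```python
import numpy as np

def gaussian_mechanism(data, sigma, seed=0):
    """Baseline sketch: add independent Gaussian noise to each of the
    k column sums. `data` is the n-by-k binary matrix x; `sigma` must be
    calibrated to the privacy parameters by the caller (e.g. for rho-zCDP,
    sigma = sqrt(k / (2 * rho)), since the L2-sensitivity of the sum
    is sqrt(k)). The seed is fixed only to make the example reproducible.
    """
    true_answers = data.sum(axis=0)  # the k counting-query answers
    rng = np.random.default_rng(seed)
    return true_answers + rng.normal(0.0, sigma, size=true_answers.shape)

# The maximum error over the k coordinates concentrates around
# sigma * sqrt(2 * log k) -- exactly the union-bound overhead that
# Problem 1 asks whether correlated noise can remove.
n, k = 1000, 256
x = np.random.default_rng(1).integers(0, 2, size=(n, k))
sigma = np.sqrt(k / 2)  # calibrated for 1-zCDP
answers = gaussian_mechanism(x, sigma)
max_err = float(np.abs(answers - x.sum(axis=0)).max())
```

Per-coordinate error here is \(O(\sigma) = O(\sqrt{k})\), but the maximum over \(k\) independent Gaussians picks up the extra \(\sqrt{\log k}\) factor.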
Note that an \(\Omega(\sqrt{k})\) lower bound is known [BUV18, SU17]. By [BDKT12] it suffices to consider mechanisms \(M\) that add instance-independent noise. That is, \(M(x) = \sum_{i=1}^n x_i + Z\) where \(Z\) is some fixed noise distribution over \(\mathbb{R}^k\) that is independent of \(x\).
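To see where the extra \(\sqrt{\log k}\) in the baseline comes from (a standard fact about Gaussian maxima, included here for completeness): if \(Z \sim \mathcal{N}(0, \sigma^2 I_k)\) with \(\sigma = \Theta(\sqrt{k})\) calibrated to the \(\ell_2\)-sensitivity \(\sqrt{k}\) of the sum, then \[ \mathbb{E}\big[\|Z\|_\infty\big] = \Theta\big(\sigma \sqrt{\log k}\big) = \Theta\big(\sqrt{k \log k}\big). \] In the instance-independent formulation above, Problem 1 thus asks whether some correlated choice of \(Z\) compatible with differential privacy achieves \(\mathbb{E}[\|Z\|_\infty] \leq c\sqrt{k}\), shaving the \(\sqrt{\log k}\) factor entirely.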
Reward: For a positive solution, an all-you-can-eat sushi dinner at a sushi restaurant of your choice. If the solution is an efficiently sampleable distribution with a closed-form density, alcohol will be included. For a negative solution, alcohol only.^{3}
Other related work: [DSSUV15, Vad17, AS18]. A very recent work by Ganesh and Zhao [GZ20] improves the best upper bound from \(c\sqrt{k \log \log k}\) to \(c\sqrt{k \log \log \log k}\).
Submitted by Thomas Steinke and Jonathan Ullman on April 9, 2019.

Specifically, \(M\) is either 1-zCDP [BS16] with \(c\) an absolute constant or, for every \(\delta > 0\), there is an \(M_\delta\) that is \((1, \delta)\)-DP with \(c = c(\delta) = c' \cdot \sqrt{\log (1/\delta)}\) for an absolute constant \(c'\). ↩

Here \(x_i \in \{0, 1\}^k \subset \mathbb{R}^k\) denotes the data vector of individual \(i\) and \(x = (x_1, x_2, \dots, x_n) \in \{0,1\}^{n \times k}\). For simplicity, we only consider expected error; high-probability error bounds are an immediate consequence. ↩

Restaurant need not necessarily be all-you-can-eat. Maximum redeemable value US$500. ↩