Exercise 2.21: Concentration and Data Compression
(a)
Set \(N = e^{n R}\) and \begin{equation} Y_j = \frac1n \sum_{i=1}^n \ind(X_i \neq z_i^j) \quad\text{with}\quad \ind(X_i \neq z_i^j) \overset{\text{i.i.d.}}{\sim} \Ber(\tfrac12). \end{equation} Then, for \(\delta < \tfrac12\), a union bound and the Chernoff bound for the binomial lower tail (derived below) give \begin{equation} \P(d(X) \le \delta) = \P(\min_{j \in [N]} Y_j \le \delta) \le N \P(Y_1 \le \delta) \le e^{n (R - \KL(\delta, 1/2))} \to 0 \end{equation} since \(R < \KL(\delta, 1/2)\) by assumption.
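For completeness, here is a standard derivation of the Chernoff bound used above; it assumes only that \(n Y_1 \sim \mathrm{Bin}(n, \tfrac12)\), as in the setup. For any \(\lambda > 0\), Markov's inequality gives \begin{equation} \P(Y_1 \le \delta) = \P\bigl(e^{-\lambda n Y_1} \ge e^{-\lambda n \delta}\bigr) \le e^{\lambda n \delta} \, \E\bigl[e^{-\lambda n Y_1}\bigr] = \left( e^{\lambda \delta} \, \frac{1 + e^{-\lambda}}{2} \right)^n, \end{equation} and optimising over \(\lambda\) (the minimiser is \(\lambda = \log\frac{1 - \delta}{\delta} > 0\) for \(\delta < \tfrac12\)) yields \(\P(Y_1 \le \delta) \le e^{-n \KL(\delta, 1/2)}\).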
(b)
(i)
Let \(V\) be a random variable such that \(V \in \set{0} \cup [1, \infty)\). Then, by Cauchy–Schwarz, \begin{equation} \E[V]^2 = \E[V \ind(V \ge 1)]^2 \le \E[V^2] \P(V \ge 1) \implies \P(V \ge 1) \ge \frac{\E[V]^2}{\E[V^2]}. \end{equation} Set \(V_j = \ind(Y_j \le \delta)\) and \(V = \sum_{j=1}^N V_j\). Then \(V \in \{0, 1, \ldots, N\} \subseteq \set{0} \cup [1, \infty)\), so the bound applies to \(V\).
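For completeness, here is a sketch of how the hint's route via \(V\) would conclude, assuming (as in (ii)) that the \(Y_j\) are independent given \(X\). Writing \(p = \P(Y_1 \le \delta)\), we have \(\E[V] = N p\) and, since the \(V_j\) are indicators, \begin{equation} \E[V^2] = \sum_{j} \E[V_j] + \sum_{j \neq k} \E[V_j V_k] = N p + N(N - 1) p^2 \le N p + (N p)^2, \end{equation} so \begin{equation} \P(\min_{j \in [N]} Y_j \le \delta) = \P(V \ge 1) \ge \frac{(N p)^2}{N p + (N p)^2} = \frac{N p}{1 + N p} \to 1 \end{equation} whenever \(N p \to \infty\), which is exactly what (ii) verifies.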
(ii)
Fix \(X\) and observe that, conditionally, \((Y_j)_{j=1}^N\) are independent. Then \begin{equation} \P(\min_{j \in [N]} Y_j \le \delta) = 1 - \P(\text{all } Y_j > \delta) = 1 - (1 - \P(Y_1 \le \delta))^N. \end{equation} Using \(1 - x \le e^{-x}\) for all \(x \ge 0\), we find \begin{equation} \P(\min_{j \in [N]} Y_j \le \delta) \ge 1 - \exp(-N \P(Y_1 \le \delta)). \end{equation} Hence, it suffices to show that \(N \P(Y_1 \le \delta) \to \infty\). Indeed, by the method-of-types lower bound on the binomial tail (derived below), \begin{equation} N \P(Y_1 \le \delta) \ge \frac{N}{n + 1} e^{-n \KL(\delta, 1/2)} = \frac{1}{n + 1} e^{n (R - \KL(\delta, 1/2))} \to \infty \end{equation} since \(R > \KL(\delta, 1/2)\) by assumption. We can generalise the result to random \(X\) by observing that
\[\P(\min_{j \in [N]} Y_j \le \delta \cond X) \to 1\] almost surely, taking the expectation with respect to \(X\), and interchanging limit and expectation, which is justified by bounded convergence.
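The lower bound \(\P(Y_1 \le \delta) \ge \frac{1}{n+1} e^{-n \KL(\delta, 1/2)}\) used above is the standard method-of-types estimate; here is a sketch, assuming for simplicity that \(n \delta\) is an integer. Since \(n Y_1 \sim \mathrm{Bin}(n, \tfrac12)\), \begin{equation} \P(Y_1 \le \delta) \ge \P(n Y_1 = n \delta) = \binom{n}{n \delta} 2^{-n} \ge \frac{2^{n h(\delta)}}{n + 1} \, 2^{-n} = \frac{1}{n + 1} e^{-n \KL(\delta, 1/2)}, \end{equation} where \(h\) is the binary entropy in bits, using the bound \(\binom{n}{k} \ge \frac{1}{n+1} 2^{n h(k/n)}\) and the identity \(\KL(\delta, 1/2) = (1 - h(\delta)) \log 2\). (For non-integer \(n \delta\), apply this with \(\lfloor n \delta \rfloor / n\) in place of \(\delta\); the exponent changes by at most \(O(1)\), which does not affect the divergence above.)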
Notes
- The exercise actually sets \(N = 2^{n R}\), but the above results carry over after replacing \(e\) with \(2\) throughout, i.e. measuring \(\KL(\delta, 1/2)\) in bits rather than nats.
- The hints of the exercise suggest proving the results via \(V\), which the main argument above does not do; the sketch following (b)(i) indicates how that route would conclude.