HDS

Exercise 5.5: Gaussian and Rademacher Complexity

chapter 5

(a)

If $w_i \sim \Normal(0, 1)$, then $w_i = \e_i r_i$ where $\e_i \sim \Rad(\tfrac12)$ and $r_i$ is a half-Gaussian random variable. Consider \begin{equation} \phi(r) = \E[\,\sup_{\theta \in \Tb}\,\lra{\theta, r \had \e}], \end{equation} which we note is a convex function. Then, by Jensen’s inequality, \begin{equation} \G(\Tb) = \E[\phi(r)] \ge \phi(\E[r]) = \E[r_1] \phi(1) = \sqrt{\frac 2 \pi} \Rc(\Tb). \end{equation} Equality is achieved at $\set{-1, 1}$.

(b)

Consider the function \begin{equation} \psi_r(\theta) = \frac{r \had \theta}{\max_{i \in [d]}\abs{r_i}}, \end{equation} which we note is a contraction. Then, assuming the contraction inequality for the Rademacher complexity, we have \begin{equation} \Rc(\psi_r(\Tb)) \le \Rc(\Tb) \implies \phi(r) \le \parens{\max_{i \in [d]}\,\abs{r_i}} \Rc(\Tb). \end{equation} Take expectations to conclude: \begin{equation} \G(\Tb) = \E[\phi(r)] \le \E\sbrac{\max_{i \in [d]}\,\abs{r_i}} \Rc(\Tb) \le 2 \sqrt{\log d}\, \Rc(\Tb). \end{equation}

Published on 8 April 2021.