Exercise 2.7: Bennett's Inequality
(a)
Consider \(X\) with zero mean and \(\abs{X} \le b\). We have \begin{equation} \E[(\lambda X)^n] \le \E[\lambda^2 X^2] (\lambda b)^{n-2} = \lambda^2 \sigma^2 (\lambda b)^{n-2}. \end{equation} Therefore, \begin{equation} \E[e^{\lambda X}] = 1 + \sum_{n=2}^\infty \frac{1}{n!} \E[(\lambda X)^n] \le 1 + \lambda^2 \sigma^2\sum_{n=2}^\infty \frac{1}{n!} (\lambda b)^{n - 2} = 1 + \frac{\lambda^2 \sigma^2}{(\lambda b)^2}\parens{ \sum_{n=0}^\infty \frac{1}{n!} (\lambda b)^n - 1 - \lambda b }. \end{equation} Defining \(f(x) = (e^x - 1 - x)/x^2\) and using that \(1 + x\le e^x\) for \(x \ge 0\), we hence have that \begin{equation} \E[e^{\lambda X}] \le \exp\parens{\lambda^2 \sigma^2 f(\lambda b)}. \end{equation}
(b)
Set \(X = \sum_{i=1}^n X_i\) and \(\overline{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n \sigma_i^2.\) For \(\lambda > 0\), we have \begin{equation} \log \P(X \ge n\delta) \le -\lambda n \delta + \sum_{i=1}^n \log \E[e^{\lambda X_i}] \le - \lambda n\delta + n \lambda^2 \overline{\sigma}^2 f(\lambda b). \end{equation} Reorganising, we find \begin{equation} \log \P(X \ge n\delta) \le -\frac{n\overline{\sigma}^2}{b^2}\parens{ \frac{b \delta}{\overline{\sigma}^2} (\lambda b) - e^{\lambda b} + \lambda b + 1 }. \end{equation} Consider \begin{equation} h(t) = \inf_{x > 0}\,(tx - e^x + x + 1). \end{equation} Setting the derivative to zero gives \begin{equation} t - e^x + 1 = 0 \implies x = \log(t + 1) > 0 \implies h(t) = (t + 1) \log(t + 1) - t. \end{equation} This gives \begin{equation} \P(X \ge n\delta) \le \exp\parens{ -\frac{n\overline{\sigma}^2}{b^2}h\parens{ \frac{b \delta}{\overline{\sigma}^2} } }. \end{equation}
(c)
To show that Bennett’s inequality is as least as good as Bernstein’s inequality, we need to show that \begin{equation} - \frac{\sigma^2}{b^2} h\parens{ \frac{b t}{\sigma^2} } \le - \frac{t^2}{2(\sigma^2 + bt)} = -\frac{\sigma^2}{b^2} \frac{(bt)^2/\sigma^4}{2(1 + bt/\sigma^2)} =: -\frac{\sigma^2}{b^2}g\parens{ \frac{bt}{\sigma^2} }. \end{equation} Thus, we desire \begin{equation} h(t) = (t + 1)\log(t + 1) - t \ge \frac{t^2}{2(1 + t)} = g(t). \end{equation} To show this, we apply the Fundamental theorem of calculus twice, noting that \(h(0)=g(0)\), \(h'(0)=g'(0)\), and \begin{equation} h^{\prime\prime}(t) = \frac{1}{1+t} \ge \frac{1}{(1+t)^3} = g^{\prime\prime}(t). \end{equation}