Exercise 3.15: Bounds for Suprema of Non-Negative Functions

chapter 3

Let \((V_i)_{i=1}^n\) be an i.i.d. sequence of random variables and \(\F\) a class of functions taking values in \([0, 1]\). Consider

\begin{equation} Z = \sup_{f \in \F} \sum_{i=1}^n f(V_i). \end{equation}

Without loss of generality, assume that \(\F = \set{f^1, \ldots, f^m}.\) Define

\begin{equation} X_i = (f^1(V_i), \ldots, f^m(V_i)) \end{equation}

and consider the function

\begin{equation} Z(X) = \max_{j \in [m]} \sum_{i=1}^n X^j_i. \end{equation}

(a)

Let \(Y_k(X)\) be equal to \(X\) except with the \(k^{\text{th}}\). This eliminates one term in the sum of \(Z\), hence can only decrease it, so

\begin{equation} Z(X) \ge Z(Y_k(X)) \iff Z(X) - Z(Y_k(X)) \ge 0. \end{equation}

(b)

By tensorisation of the entropy, we have

\begin{equation} \Hb(e^{\lambda Z_k(X)}) \le \E\sbrac{ \sum_{i=1}^n \Hb(e^{\lambda Z_i(X)} \cond X_{-i}) } \end{equation}

where \(Z_i(X)\) is the function \(X_i \to Z(X)\) with all other elements fixed. Therefore, by the variational representation of the entropy,

\begin{equation} \Hb(e^{\lambda Z_k(X)}) \le \E\sbrac{ \sum_{i=1}^n \E[\psi(\lambda(Z(X) - Z(Y_i(X)))) e^{\lambda Z(X)} \cond X_{-i}] } \end{equation}

where \(\psi(u) = e^{-u} - 1 + u\).

(c)

Let \(\Ab_l\) be the collection of realisations of \(X\) that achieves the maximum at index \(l\). Without loss of generality, assume \((\Ab_l)_{l=1}^m\) to be disjoint. Then, if \(X \in \Ab_l\),

\begin{equation} Z(X) - Z(Y_k(X)) = \sum_{i=1}^n X^l_i - Z(Y_k(X)) \le X^l_k \end{equation}

by lower bounding the maximum of \(Z(Y_k(X))\) by index \(l\). Therefore, multiply by \(\lambda \ge 0\) and sum against \(\ind(X \in \Ab_l)\) to find

\begin{equation} 0 \le \lambda (Z(X) - Z(Y_k(X))) \le \lambda \sum_{l = 1}^m \ind (X \in \Ab_l) X^l_k. \end{equation}

(d)

For \(\alpha \le 1\), observe that that \(\psi(\alpha x) \le \alpha \psi(x)\). Hence, using that

\begin{equation} \sum_{l = 1}^m \ind (X \in \Ab_l) X^l_k \le 1 \end{equation}

because every function in \(\F\) takes values in \([0, 1]\), we obtain

\begin{equation} 0 \le \psi(\lambda(Z(X) - Z(Y_k(X)))) \le \psi(\lambda) \sum_{l = 1}^m \ind (X \in \Ab_l) X^l_k, \end{equation}

also using that \(\psi\) is increasing.

(e)

Combining the previous parts, we find

\begin{equation} \Hb(e^{\lambda Z_k(X)}) \le \psi(\lambda) \E\sbrac{ \sum_{l = 1}^m\ind (X \in \Ab_l) \sum_{i=1}^nX^l_i e^{\lambda Z(X)} } = \psi(\lambda) \E[Z(X) e^{\lambda Z(X)}], \end{equation}

by definition of \(\Ab_l\).

(f)

Part (e) gives that

\begin{equation} \lambda \varphi’(\lambda) - \varphi(\lambda) \log \varphi(\lambda) \le \psi(\lambda) \varphi’(\lambda). \end{equation}

Divide by \(\varphi(\lambda) > 0\):

\begin{equation} \lambda (\log \varphi)’(\lambda) - \log \varphi(\lambda) \le \psi(\lambda) (\log \varphi)’(\lambda). \end{equation}

Hence, rearranging,

\begin{equation} (\log \varphi)’(\lambda) \le \frac{1}{\lambda - \psi(\lambda)} \log \varphi(\lambda) = \frac{e^{\lambda}}{e^{\lambda} - 1} \log \varphi(\lambda), \end{equation}

using that \(\lambda - \psi(\lambda) > 0\). Because not \(Z(X) = 0\) a.s., it holds that \(\log \varphi(\lambda) > 0\), so this differential inequality can be rewritten as

\begin{equation} (\log \log \varphi)’(\lambda) \le (\log(\exp - 1))’(\lambda). \end{equation}

Therefore,

\begin{equation} \parens{\log \frac{\log \varphi}{\exp - 1}}’(\lambda) \le 0, \end{equation}

and integrating gives, for \(\lambda > \lambda_0 > 0\),

\begin{equation} \log \frac{\log \varphi(\lambda)}{e^{\lambda} - 1} \le \log \frac{\log \varphi(\lambda_0)}{e^{\lambda_0} - 1} \to \log \varphi’(0) = \log \E[Z(X)], \end{equation}

by taking \(\lambda_0 \downarrow 0\) and using L’Hopital’s rule. We conclude that

\begin{equation} \log \varphi(\lambda) \le (e^{\lambda} - 1) \E[Z(X)] \end{equation}

for all \(\lambda \ge 0\).

Published on 10 October 2020.