HDS

Exercise 15.3: Properties of the Kullback–Leibler Divergence

chapter 15

(a)

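Let $\P \ll \Q$. Since $x \mapsto x \log x$ is strictly convex, Jensen’s Inequality gives \begin{equation} \KL(\P, \Q) = \E_\P \log \frac{\sd\P}{\sd\Q} = \E_\Q \frac{\sd\P}{\sd\Q} \log \frac{\sd\P}{\sd\Q} \ge \E_\Q\sbrac{\frac{\sd\P}{\sd\Q}} \log \E_\Q\sbrac{\frac{\sd\P}{\sd\Q}} = 0 \end{equation} with equality if and only if $\frac{\sd\P}{\sd\Q}$ is constant $\Q$-a.s., that is, if and only if $\P = \Q$. Alternatively, let $A_1 \cup \cdots \cup A_n$ be a partition of the probability space into measurable sets. If $\Q(A_i) = 0$, then $\P(A_i) = 0$ by $\P \ll \Q$, so such sets contribute nothing and we may assume that $\Q(A_i) > 0$ for all $i$. Therefore, applying Jensen’s Inequality under each conditional measure $\Q(\,\cdot \mid A_i)$, \begin{align} \KL(\P, \Q) &= \sum_{i=1}^n \Q(A_i) \,\E_\Q \frac{\ind_{A_i}}{\Q(A_i)} \frac{\sd\P}{\sd\Q} \log \frac{\sd\P}{\sd\Q} \newline &\ge \sum_{i=1}^n \Q(A_i) \,\E_\Q\sbrac{ \frac{\ind_{A_i}}{\Q(A_i)} \frac{\sd\P}{\sd\Q}} \log \E_\Q\sbrac{ \frac{\ind_{A_i}}{\Q(A_i)} \frac{\sd\P}{\sd\Q}} \newline &= \sum_{i=1}^n \P(A_i) \,\log \frac{\P(A_i)}{\Q(A_i)}, \end{align} where the last step uses $\E_\Q\sbrac{\ind_{A_i} \frac{\sd\P}{\sd\Q}} = \P(A_i)$. In particular, if $\P(A) > \Q(A)$ for some measurable set $A$, then applying this bound to the partition $A \cup A^c$ gives $\KL(\P, \Q) > 0$, because the right-hand side is then the KL divergence between two distinct Bernoulli distributions.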
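As a numerical sanity check of the partition bound, here is a minimal sketch for discrete distributions. The helper `kl`, the example vectors `p` and `q`, and the choice of partition are illustrative assumptions, not part of the exercise.

```python
import math

def kl(p, q):
    """KL(p, q) for discrete distributions given as lists; assumes p << q."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:  # terms with p_i = 0 contribute 0 (convention 0 log 0 = 0)
            total += pi * math.log(pi / qi)
    return total

# Hypothetical example distributions on a four-point space.
p = [0.5, 0.2, 0.2, 0.1]
q = [0.25, 0.25, 0.25, 0.25]

# Coarsen to the partition A_1 = {0, 1}, A_2 = {2, 3}.
blocks = [(0, 1), (2, 3)]
p_coarse = [sum(p[i] for i in b) for b in blocks]
q_coarse = [sum(q[i] for i in b) for b in blocks]

kl_full = kl(p, q)                  # KL(P, Q)
kl_coarse = kl(p_coarse, q_coarse)  # sum_i P(A_i) log(P(A_i) / Q(A_i))

print(kl_full, kl_coarse)
```

Here `kl_full` should be at least `kl_coarse`, and `kl_coarse` should be strictly positive because $\P(A_1) > \Q(A_1)$.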

(b)

Let $(\lambda_i)_{i=1}^n \subseteq [0, 1]$ sum to one. WLOG, assume that $\P = \sum_{i=1}^n \lambda_i \P_i \ll \Q$, since otherwise some $\P_i$ with $\lambda_i > 0$ is not absolutely continuous with respect to $\Q$ and the bound is trivial. Then \begin{equation} \frac{\sd\P}{\sd\Q} = \sum_{i=1}^n \lambda_i \frac{\sd\P_i}{\sd\Q}, \end{equation} so, using Jensen’s Inequality for the convex function $x \mapsto x \log x$, \begin{equation} \KL(\P, \Q) = \E_\Q \frac{\sd\P}{\sd\Q} \log \frac{\sd\P}{\sd\Q} \le \sum_{i=1}^n \lambda_i \, \E_\Q \frac{\sd\P_i}{\sd\Q} \log \frac{\sd\P_i}{\sd\Q} = \sum_{i=1}^n \lambda_i \KL(\P_i, \Q). \end{equation} For convexity in the second argument, let $\mu = \tfrac12 (\P + \Q)$. Then still \begin{equation} \frac{\sd\P}{\sd\mu} = \sum_{i=1}^n \lambda_i \frac{\sd\P_i}{\sd\mu}, \end{equation} so, by Jensen’s Inequality again, this time for the convex function $x \mapsto -\log x$, \begin{equation} \KL(\Q, \P) = -\E_\Q \log \frac{\sd \P/\sd \mu}{\sd \Q/ \sd \mu} \le -\sum_{i=1}^n \lambda_i \, \E_\Q \log \frac{\sd \P_i/\sd \mu}{\sd \Q/ \sd \mu} = \sum_{i=1}^n \lambda_i \KL(\Q, \P_i). \end{equation}
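The two convexity inequalities can be checked numerically on a small example. The following is a sketch under assumed inputs: the helper `kl`, the distributions `p1`, `p2`, `q`, and the weight `lam` are all hypothetical choices for illustration.

```python
import math

def kl(p, q):
    """KL(p, q) for discrete distributions given as lists; assumes p << q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example: two components P_1, P_2 and a reference Q.
p1 = [0.7, 0.2, 0.1]
p2 = [0.1, 0.3, 0.6]
q = [0.3, 0.4, 0.3]
lam = 0.3  # mixture weight lambda_1; lambda_2 = 1 - lam

# Mixture P = lam * P_1 + (1 - lam) * P_2.
mix = [lam * a + (1 - lam) * b for a, b in zip(p1, p2)]

# Convexity in the first argument: KL(P, Q) <= sum_i lambda_i KL(P_i, Q).
first_lhs = kl(mix, q)
first_rhs = lam * kl(p1, q) + (1 - lam) * kl(p2, q)

# Convexity in the second argument: KL(Q, P) <= sum_i lambda_i KL(Q, P_i).
second_lhs = kl(q, mix)
second_rhs = lam * kl(q, p1) + (1 - lam) * kl(q, p2)

print(first_lhs <= first_rhs, second_lhs <= second_rhs)
```

Note that the weights $\lambda_i$ appear on the right-hand side of both inequalities; dropping them in the second bound would give a strictly weaker statement.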

(c)

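This follows directly from the observation that \begin{equation} \frac{\sd(\P_1 \otimes \P_2)}{\sd(\Q_1 \otimes \Q_2)} = \frac{\sd \P_1}{\sd \Q_1} \frac{\sd \P_2}{\sd \Q_2}, \end{equation} since taking logarithms and integrating against $\P_1 \otimes \P_2$ gives $\KL(\P_1 \otimes \P_2, \Q_1 \otimes \Q_2) = \KL(\P_1, \Q_1) + \KL(\P_2, \Q_2)$.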
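The additivity over product measures can also be verified on a discrete example. This is a minimal sketch: the helper `kl` and the marginals `p1`, `q1`, `p2`, `q2` are assumed purely for illustration.

```python
import math
from itertools import product

def kl(p, q):
    """KL(p, q) for discrete distributions given as lists; assumes p << q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical marginals on a two-point and a three-point space.
p1, q1 = [0.6, 0.4], [0.5, 0.5]
p2, q2 = [0.2, 0.5, 0.3], [0.3, 0.3, 0.4]

# Product measures, flattened in the same (row-major) order for P and Q.
p_prod = [a * b for a, b in product(p1, p2)]
q_prod = [a * b for a, b in product(q1, q2)]

lhs = kl(p_prod, q_prod)         # KL(P_1 x P_2, Q_1 x Q_2)
rhs = kl(p1, q1) + kl(p2, q2)    # KL(P_1, Q_1) + KL(P_2, Q_2)

print(abs(lhs - rhs))
```

The printed difference should be zero up to floating-point rounding.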

Published on 9 September 2021.