Exercise 15.4: More Properties of Shannon Entropy
Recall the definition \begin{equation} H(X \cond Y) = -\E \log \frac{\sd \P_{X \cond Y}}{\sd \mu}, \end{equation} where the expectation is over the joint law of $(X, Y)$, and $H(X)$ and $H(X, Y)$ are defined analogously.
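As a sanity check (assuming, for concreteness, that $\mu$ is the counting measure on a countable alphabet, so that the densities above are probability mass functions $p$), the definition reduces to the familiar discrete formula: \begin{equation} H(X \cond Y) = -\sum_{x, y} p(x, y) \log p(x \cond y). \end{equation}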
(a)
Use non-negativity of the mutual information: \begin{equation} H(X) - H(X \cond Y) = \E \log \frac{\sd \P_{X \cond Y} / \sd \mu}{\sd \P_{X} / \sd \mu} = \E_Y \KL(\P_{X \cond Y}, \P_{X}) \ge 0, \end{equation} where the second equality follows from the tower rule: conditioning on $Y$, the inner expectation of the log-ratio is exactly $\KL(\P_{X \cond Y}, \P_{X})$. Hence $H(X \cond Y) \le H(X)$.
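In the same discrete setting as above, the middle quantity is the usual mutual information, often written $I(X; Y)$: \begin{equation} \E \log \frac{\sd \P_{X \cond Y} / \sd \mu}{\sd \P_{X} / \sd \mu} = \sum_{x, y} p(x, y) \log \frac{p(x \cond y)}{p(x)} = I(X; Y). \end{equation}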
(b)
Follows from a direct computation, using that the joint density factorizes as $\frac{\sd \P_{X, Y}}{\sd (\mu \otimes \mu)}(x, y) = \frac{\sd \P_{X}}{\sd \mu}(x) \, \frac{\sd \P_{Y \cond X}}{\sd \mu}(y)$: \begin{equation} H(X) + H(Y \cond X) = -\E \log \left( \frac{\sd \P_{X}}{\sd \mu} \, \frac{\sd \P_{Y \cond X}}{\sd \mu} \right) = -\E \log \frac{\sd \P_{X, Y}}{\sd (\mu \otimes \mu)} = H(X, Y). \end{equation}
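In the discrete setting this factorization is just the chain rule $p(x, y) = p(x) \, p(y \cond x)$ for the joint pmf, so that \begin{equation} -\E \log p(X, Y) = -\E \log p(X) - \E \log p(Y \cond X). \end{equation}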
(c)
Apply (a), with the roles of $X$ and $Y$ exchanged, to the right-hand side of (b).
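Spelled out, this gives subadditivity: \begin{equation} H(X, Y) = H(X) + H(Y \cond X) \le H(X) + H(Y). \end{equation}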