The Hawkes process has attracted attention in recent years for its ability to describe the behavior of online information cascades. Here, we present a fully tractable approach to describing analytically the distribution of the number of events in a Hawkes process, which, in contrast to purely empirical studies or simulation-based models, enables the effect of process parameters on cascade dynamics to be analyzed. We show that the presented theory also allows predictions to be made regarding the future distribution of events after a given number of events have been observed during a time window. Our results are derived through a novel differential-equation approach to obtaining the governing equations of a general branching process. We confirm our theoretical findings through extensive simulations of such processes and apply them to empirical data obtained from threads of an online opinion board. This work provides the groundwork for more complete analyses of the self-exciting processes that govern the spreading of information through many communication platforms, including the potential to predict cascade dynamics within confidence limits.
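As a rough illustration of the link between a Hawkes process and a branching process, the following minimal sketch simulates the total number of events in a single cascade and compares it with a known closed form. It relies on the standard cluster representation, in which each event independently triggers a Poisson number of direct offspring whose mean is the branching ratio. This is not the differential-equation method of this work, and it ignores event times entirely, so it covers only the total cascade size over an unbounded window. The function names, the single seed event, and the branching ratio of 0.5 are assumptions chosen for illustration.

```python
import math
import random


def poisson_sample(rng, mean):
    """Draw a Poisson variate with Knuth's method (adequate for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def cascade_size(rng, branching_ratio, max_events=10**6):
    """Total number of events in one cascade started by a single seed event.

    Hypothetical helper: generations are drawn one at a time, and the
    max_events cap only guards against runaway (near-critical) cascades.
    """
    total = 1
    active = 1  # events whose direct offspring have not been drawn yet
    while active > 0 and total < max_events:
        children = sum(poisson_sample(rng, branching_ratio) for _ in range(active))
        total += children
        active = children
    return total


def borel_pmf(n, branching_ratio):
    """Borel distribution: P(total size = n) for Poisson offspring, mean in (0, 1)."""
    log_p = (
        -branching_ratio * n
        + (n - 1) * math.log(branching_ratio * n)
        - math.lgamma(n + 1)
    )
    return math.exp(log_p)


rng = random.Random(0)
mu = 0.5  # assumed subcritical branching ratio, for illustration only
sizes = [cascade_size(rng, mu) for _ in range(20000)]

# Print empirical frequencies next to the Borel probabilities.
for n in range(1, 6):
    empirical = sizes.count(n) / len(sizes)
    print(n, round(empirical, 4), round(borel_pmf(n, mu), 4))
```

In this simplified setting the empirical frequencies should approach the Borel probabilities as the number of simulated cascades grows, and the mean cascade size should approach 1 / (1 - mu). Restricting the count to a finite time window, as considered in this work, requires the event times and hence the excitation kernel, which this sketch leaves out.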