Marzen, S. E., Crutchfield, J. P.
Scientific explanation often requires inferring maximally predictive features from a given data set. Unfortunately, the collection of minimal maximally predictive features for most stochastic processes is uncountably infinite. In such cases, one compromises and instead seeks nearly maximally predictive features. Here, we derive upper bounds on the rates at which the number and the coding cost of nearly maximally predictive features scale with desired predictive power. The rates are determined by the fractal dimensions of a process's mixed-state distribution. These results, in turn, show how widely used finite-order Markov models can fail as predictors and that mixed-state predictive features can offer a substantial improvement.
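To make the notion of a mixed state concrete, the following is a minimal sketch, not the authors' method. It assumes a hypothetical two-state, binary-output hidden Markov model with toy parameters chosen only for illustration. A mixed state is taken here to be the distribution over hidden states conditioned on an observed word, obtained by a Bayesian update from the stationary distribution. The names `T`, `PI`, `update`, and `mixed_states` are all illustrative assumptions.

```python
# Hypothetical toy HMM (parameters are an assumption, not from the paper).
# T[x][i][j] = Pr(emit symbol x and move to state j | current state i).
T = {
    0: [[0.5, 0.5], [0.0, 0.5]],
    1: [[0.0, 0.0], [0.5, 0.0]],
}

# Stationary distribution of T[0] + T[1] for this toy model.
PI = (0.5, 0.5)


def update(mu, x):
    """Bayesian update of mixed state mu after observing symbol x.

    Returns None when x has zero probability from mu.
    """
    n = len(mu)
    unnorm = [sum(mu[i] * T[x][i][j] for i in range(n)) for j in range(n)]
    total = sum(unnorm)
    if total == 0.0:
        return None
    return tuple(v / total for v in unnorm)


def mixed_states(pi, length, digits=12):
    """Distinct mixed states reached by allowed words of up to `length` symbols.

    States are compared after rounding to `digits` decimal places.
    """
    seen = {tuple(round(v, digits) for v in pi)}
    frontier = [tuple(pi)]
    for _ in range(length):
        next_frontier = []
        for mu in frontier:
            for x in T:
                new = update(mu, x)
                if new is None:
                    continue
                key = tuple(round(v, digits) for v in new)
                if key not in seen:
                    seen.add(key)
                    next_frontier.append(new)
        frontier = next_frontier
    return seen


if __name__ == "__main__":
    for length in (1, 2, 5, 10):
        print(length, len(mixed_states(PI, length)))
```

For this toy model the set of distinct mixed states keeps growing as longer words are considered, so no finite set of them is exactly maximally predictive. A model with uncountably many mixed states, the generic case the abstract describes, would need a coarse-graining of this set instead of an exact enumeration.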