When presented with complex rhythmic auditory stimuli, humans are able to track underlying temporal structure (e.g., a “beat”), both covertly and with their movements. This capacity goes far beyond that of a simple entrained oscillator, drawing on contextual and enculturated timing expectations and adjusting rapidly to perturbations in event timing, phase, and tempo. Previous modeling work has described how entrainment to rhythms may be shaped by event timing expectations, but has shed little light on any underlying computational principles that could unify the phenomenon of expectation-based entrainment with other brain processes. Inspired by the predictive processing framework, we propose that the problem of rhythm tracking is naturally characterized as a problem of continuously estimating an underlying phase and tempo based on precise event times and their correspondence to timing expectations. We present two inference problems formalizing this insight: PIPPET (Phase Inference from Point Process Event Timing) and PATIPPET (Phase and Tempo Inference). Variational solutions to these inference problems resemble previous “Dynamic Attending” models of perceptual entrainment, but introduce new terms representing the dynamics of uncertainty and the influence of expectations in the absence of sensory events. These terms allow us to model multiple characteristics of covert and motor human rhythm tracking not addressed by other models, including sensitivity of error corrections to inter-event interval and perceived tempo changes induced by event omissions. We show that positing these novel influences in human entrainment yields a range of testable behavioral predictions. Guided by recent neurophysiological observations, we attempt to align the phase inference framework with a specific brain implementation.
We also explore the potential of this normative framework to guide the interpretation of experimental data and to serve as a building block for even richer predictive processing and active inference models of timing.