Hidden Semi-Markov Model
The Hidden Semi-Markov Model (HSMM) is the natural relaxation of one specific, much-criticised assumption of the HMM. Under a plain first-order HMM the probability of leaving a state is the same in every period, which forces the time spent in a state (the sojourn or dwell time) to follow a geometric distribution. The geometric distribution is memoryless and its probability mass function is strictly decreasing in duration, so the single most likely duration of any regime is one period, however long the mean duration is. For market regimes that plainly persist for months or years, that is an unrealistic structural artefact of the First-Order Memory Assumption rather than a feature of the data. The HSMM replaces the implicit geometric law with an explicit dwell-time distribution attached to each state (Poisson, negative binomial, Gamma, or a non-parametric form), chosen freely to match observed regime-length behaviour. In every other respect it behaves like an HMM, and a standard estimation route represents the HSMM as an HMM on an expanded state space so that the usual EM and Viterbi machinery still applies, at higher computational cost.
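The contrast between the two dwell-time laws can be made concrete with a small sketch. It is illustrative only: the helper names (`geometric_pmf`, `shifted_poisson_pmf`, `hazards`, `dwell_pmf_from_hazards`) and the parameter values are assumptions, chosen so that both laws have a mean dwell of 12.5 periods. The sub-state chain is a simplified, single-state version of the expanded-state-space idea that ignores emissions and transitions between regimes.

```python
import math

MAX_LEN = 30         # truncation horizon in periods (illustrative)
STAY_PROB = 0.92     # hypothetical HMM self-transition probability
POISSON_RATE = 11.5  # hypothetical rate; dwell time = 1 + Poisson(rate)


def geometric_pmf(stay_prob, max_len):
    """P(dwell = r), r = 1..max_len, when the exit probability is constant."""
    return [(1 - stay_prob) * stay_prob ** (r - 1) for r in range(1, max_len + 1)]


def shifted_poisson_pmf(rate, max_len):
    """P(dwell = r) = Poisson(r - 1; rate), r = 1..max_len, so dwell >= 1."""
    return [math.exp(-rate) * rate ** (r - 1) / math.factorial(r - 1)
            for r in range(1, max_len + 1)]


def hazards(pmf):
    """Exit probability in period r, given the state has already lasted r periods."""
    out, survival = [], 1.0
    for p in pmf:
        out.append(p / survival if survival > 1e-12 else 1.0)
        survival -= p
    return out


def dwell_pmf_from_hazards(haz):
    """Dwell-time pmf of a chain of sub-states: sub-state r exits the regime
    with probability haz[r - 1], otherwise it advances to sub-state r + 1."""
    pmf, alive = [], 1.0
    for h in haz:
        pmf.append(alive * h)
        alive *= 1 - h
    return pmf


def mode(pmf):
    """Most likely dwell time (1-based)."""
    return max(range(len(pmf)), key=lambda i: pmf[i]) + 1


geo = geometric_pmf(STAY_PROB, MAX_LEN)
poi = shifted_poisson_pmf(POISSON_RATE, MAX_LEN)
geo_hazard = hazards(geo)
poi_hazard = hazards(poi)

print("geometric mode:", mode(geo), "| shifted Poisson mode:", mode(poi))
print("geometric hazard constant:",
      all(abs(h - (1 - STAY_PROB)) < 1e-9 for h in geo_hazard))
print("shifted Poisson hazard rising:",
      all(a < b for a, b in zip(poi_hazard, poi_hazard[1:])))
```

Under these assumptions the geometric law has a constant exit hazard and its mode at one period, while the shifted Poisson has a rising hazard and its mode close to the mean. Feeding either hazard sequence through the sub-state chain recovers the original pmf up to the truncation at `MAX_LEN`, which is the mechanism that lets HMM machinery carry a non-geometric dwell time.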
The HSMM is an established model, not a fringe one. Jan Bulla and Ingo Bulla introduced it to financial time series in 2006 (“Stylized facts of financial time series and hidden semi-Markov models,” Computational Statistics & Data Analysis), motivated by the observation that the autocorrelation function of squared returns implied by a fitted HMM decays too fast to match the slow decay seen in real daily return series. The geometric sojourn assumption causes that mismatch, and a flexible dwell-time distribution can largely remove it. HSMMs and the closely related explicit-duration / segment models are used across ecology, biology, environmental science and econometrics; in finance they sit alongside the Statistical Jump Model and inhomogeneous-transition variants as one of the principled ways to add realism to regime models.
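Why an HMM's implied autocorrelation dies out quickly can be seen in the two-state case, where the autocorrelation of squared returns at lag k is proportional to the k-th power of p11 + p22 - 1. The sketch below computes that implied ACF directly. It assumes zero-mean Gaussian returns within each state; the function names and parameter values are illustrative placeholders, not estimates from Bulla & Bulla.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def hmm_acf_squared_returns(p11, p22, sigma1, sigma2, max_lag):
    """Implied ACF of squared returns at lags 1..max_lag for a stationary
    two-state HMM with zero-mean Gaussian emissions (an assumption)."""
    P = [[p11, 1 - p11], [1 - p22, p22]]
    total_exit = (1 - p11) + (1 - p22)
    pi = [(1 - p22) / total_exit, (1 - p11) / total_exit]  # stationary distribution
    second = [sigma1 ** 2, sigma2 ** 2]                    # E[X^2 | state]
    mean_sq = sum(pi[i] * second[i] for i in range(2))
    # Gaussian fourth moment: E[X^4 | state] = 3 * sigma^4
    var_sq = sum(pi[i] * 3 * second[i] ** 2 for i in range(2)) - mean_sq ** 2

    acf, Pk = [], P
    for _ in range(max_lag):
        # E[X_t^2 * X_{t+k}^2], using independence of returns given the states
        cross = sum(pi[i] * second[i] * Pk[i][j] * second[j]
                    for i in range(2) for j in range(2))
        acf.append((cross - mean_sq ** 2) / var_sq)
        Pk = matmul2(Pk, P)
    return acf


P11, P22 = 0.98, 0.95  # hypothetical daily stay probabilities
acf = hmm_acf_squared_returns(P11, P22, sigma1=0.7, sigma2=2.0, max_lag=20)
decay = P11 + P22 - 1  # second eigenvalue of the transition matrix

for lag in (1, 5, 10, 20):
    print(lag, round(acf[lag - 1], 4))
```

With these placeholder values the implied ACF shrinks by the same factor `decay` at every lag, so its shape is purely geometric. That exponential shape is what struggles to match the slow decay observed in squared daily returns.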
The critical evidence for this vault, however, is that greater realism did not translate into greater profitability. Baitinger & Hoch 2024 ran a controlled contest, the standard HMM versus the HSMM, on S&P 500 returns inside a Regime-Based Asset Allocation strategy. The HSMM did outperform the HMM in-sample, as its richer parameterisation would lead one to expect; but that advantage “largely disappears in out-of-sample applications,” and the authors conclude the simpler HMM “may be equally suitable for regime-based investment strategies.” Two companion findings point the same way: increasing the number of hidden states does not reliably help, and strategies on daily data beat those on monthly data. The pattern, extra parameters buying in-sample fit that does not survive out-of-sample, is the textbook signature of Overfitting in Quantitative Trading. The HSMM therefore appears here as a precise cautionary example: relaxing the geometric-duration assumption is statistically well-motivated, but the added flexibility is, on the available evidence, spent on in-sample noise rather than on tradeable structure.
The honest reading is twofold. As a statistical model the HSMM is sound and addresses a genuine HMM shortcoming. As a route to a better trading model it is, so far, unsubstantiated: Baitinger & Hoch’s result is direct evidence that added complexity in this family did not survive the out-of-sample test, which argues for parsimony over sophistication. That result extends the Dacco and Satchell 1999 lesson (knowing the true model does not rescue you if classification is noisy) into “fitting a richer model does not rescue you either, because the richness is absorbed by noise.” Any HSMM profitability claim should be graded against whether its in-sample edge demonstrably persists out-of-sample; on the one direct comparison in this vault, it does not.
Hidden Semi-Markov Model [contradicts] First-Order Memory Assumption
Hidden Semi-Markov Model [part-of] Regime Classification
Baitinger & Hoch 2024 [contradicts] Hidden Semi-Markov Model
Hidden Semi-Markov Model [relates] Overfitting in Quantitative Trading
Connections
- Hidden Markov Model Regime Detection — relates, source: https://www.sciencedirect.com/science/article/abs/pii/S0167947306002374
- Baitinger & Hoch 2024 — reports_underperformance, source: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4796238
- First-Order Memory Assumption — contradicts, source: https://arxiv.org/html/2405.13553v1
- Regime-Based Asset Allocation — tests_strategy, source: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4796238
- Overfitting in Quantitative Trading — suffers_overfitting_risk, source: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4796238
- Statistical Jump Model — relates, source: https://arxiv.org/abs/1909.05800
- Baum-Welch Estimation — uses_dataset, source: https://arxiv.org/html/2405.13553v1
Sources
- Bulla, J. & Bulla, I. (2006). Stylized facts of financial time series and hidden semi-Markov models. Computational Statistics & Data Analysis 51(4), 2192-2209. https://www.sciencedirect.com/science/article/abs/pii/S0167947306002374
- Koslik, J.-O. (2024). Hidden semi-Markov models with inhomogeneous state dwell-time distributions. arXiv:2405.13553. https://arxiv.org/html/2405.13553v1
- Baitinger, E. & Hoch, L. (2024). Simplicity versus Complexity: A Comparative Analysis of HMM and HSMM for Regime-Based Asset Allocation. SSRN Working Paper No. 4796238. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4796238
- Chiappa, S. (2019). Explicit-Duration Markov Switching Models. arXiv:1909.05800. https://arxiv.org/abs/1909.05800