Open Questions — Markov Trading Model Profitability
The standing research backlog. Questions move to Answered as P5 rounds close them; new questions are appended as research surfaces them.
Core questions (from the research brief)
- Is the popular claim that “a Markov trading model predicts future price action or regime from the current state alone, ignoring older data” technically accurate for (a) simple Markov chains, (b) HMMs, (c) Markov regime-switching models, (d) MDPs? (Partly addressed in the seed round: each model note states its memory assumption, but a consolidated answer is still needed.)
- Which Markov-based approaches have actually been tested in trading, and on which markets/instruments?
- Do they produce positive returns after realistic transaction costs and slippage?
- Are returns robust out-of-sample, across market regimes, and across instruments?
- Are Markov models better used for regime detection, signal filtering, risk control, or direct trade generation?
- What evidence exists from peer-reviewed papers, working papers, reproducible code, or live/production use (as opposed to backtests)?
- Where do Markov models fail: non-stationarity, overfitting, state-definition, transition instability, latency, costs, regime misclassification?
- What is the fairest overall conclusion: profitable model, useful component, or mostly academic toy?
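On the first core question above, the memory assumption can be made concrete. The sketch below is a minimal illustration, not a consolidated answer. In an observable first-order Markov chain the next-state distribution depends only on the last state. In an HMM the hidden state is Markov, but the forecast of the next observation depends on the filtered belief, which is computed from the whole observation history. The matrices A, B and PRIOR, their values, and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical two-state example; every number here is an illustrative assumption.
A = [[0.9, 0.1],    # A[i][j] = P(next state j | current state i)
     [0.2, 0.8]]
B = [[0.7, 0.3],    # B[i][k] = P(observation k | hidden state i)
     [0.1, 0.9]]
PRIOR = [0.5, 0.5]  # initial distribution over hidden states


def chain_next_state_dist(history):
    """Observable first-order Markov chain: only the last state matters."""
    return A[history[-1]]


def hmm_filter(observations):
    """Forward filter: P(hidden state at time t | observations up to t)."""
    belief = PRIOR[:]
    for t, obs in enumerate(observations):
        if t > 0:
            # Predict step: push the belief through the transition matrix.
            belief = [sum(belief[i] * A[i][j] for i in range(2)) for j in range(2)]
        # Update step: weight each state by the likelihood of the observation.
        unnorm = [belief[j] * B[j][obs] for j in range(2)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
    return belief


def hmm_next_obs_dist(observations):
    """P(next observation | all observations so far)."""
    belief = hmm_filter(observations)
    pred = [sum(belief[i] * A[i][j] for i in range(2)) for j in range(2)]
    return [sum(pred[j] * B[j][k] for j in range(2)) for k in range(2)]
```

Calling chain_next_state_dist([0, 0, 1]) and chain_next_state_dist([1, 1, 1]) returns the same row of A, because both histories end in state 1. hmm_next_obs_dist gives different forecasts for those same two histories even though they end in the same observation, so "ignores older data" holds for the hidden state but not for a forecast made from observations alone.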
Questions raised during the seed round
- Live evidence gap. Every seed-round source is a backtest, simulation, or academic study. Is there any credible disclosure of Markov-family models used in live production trading (named funds, broker/exchange material)?
- Costs methodology. What cost and slippage assumptions do the better studies use, and how sensitive are their conclusions to them? (See Transaction Costs and Slippage.)
- Overfitting quantification. Does the Markov-trading literature apply formal multiple-testing corrections (White’s Reality Check, Hansen’s SPA test, the deflated Sharpe ratio)? (See Overfitting in Quantitative Trading, Data-Snooping Bias.)
- Regime vs alpha. Can regime labels be converted into directional alpha that beats a benchmark after costs, or is risk control the only robust use? (See Regime Classification.)
- Reproducibility. How much of the RL-trading and HMM-trading literature releases code, and do independent replications confirm the headline results?
- Statistical jump model. The Statistical Jump Model beat the HMM in one 2024 study. Has this finding been replicated elsewhere? (Partly addressed in round 10; see Answered.)
- Baseline comparison. Which studies compare against honest baselines (buy-and-hold, moving-average crossover, ARIMA/GARCH, momentum) rather than weak strawmen?
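For the costs-methodology question above, one simple way to probe sensitivity is to re-score the same position series under several cost assumptions and to compute the cost level at which the gross P&L is fully consumed. The sketch below assumes a proportional one-way cost charged on turnover and a strategy that starts flat. The function names, the cost model, and the sample numbers are hypothetical and are not taken from any study in the backlog.

```python
def net_returns(gross_returns, positions, cost_bps):
    """Per-period strategy returns after a proportional cost on turnover.

    gross_returns[t] is the asset return over period t, positions[t] is the
    position held over period t (for example -1, 0 or +1), and cost_bps is an
    assumed one-way cost in basis points per unit of turnover.
    """
    cost = cost_bps / 10_000.0
    net = []
    prev = 0.0  # assumption: the strategy starts flat
    for r, pos in zip(gross_returns, positions):
        turnover = abs(pos - prev)
        net.append(pos * r - turnover * cost)
        prev = pos
    return net


def breakeven_cost_bps(gross_returns, positions):
    """Cost in bps at which summed (uncompounded) P&L is zero.

    Returns None when the position never changes, since no cost is incurred.
    """
    gross_pnl = sum(pos * r for r, pos in zip(gross_returns, positions))
    prev, turnover = 0.0, 0.0
    for pos in positions:
        turnover += abs(pos - prev)
        prev = pos
    if turnover == 0:
        return None
    return gross_pnl / turnover * 10_000.0


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    returns = [0.01, -0.005, 0.02, 0.0]
    positions = [1, 1, 0, 1]
    for bps in (0, 5, 10, 20):
        print(bps, sum(net_returns(returns, positions, bps)))
    print("breakeven bps:", breakeven_cost_bps(returns, positions))
```

A strategy whose breakeven cost sits close to a realistic spread plus slippage for its instrument is fragile to the cost assumption. High-turnover regime-switching rules are the natural candidates to check this way.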
Questions raised during round 10 (recent developments)
- Do 2024-2025 advances change the verdict? Reviewed in Recent Developments 2024-2025: neural regime models, the FinRL benchmark ecosystem, generative/world-model market simulation, and continued jump-model refinement. Finding: tooling is better and evaluation more honest, but the verdict does not change; the recurring weaknesses (non-stationarity, overfitting, costs, no live evidence) all survive.
- Synthetic-data validation. Generative world models (VAE/GAN/diffusion) train policies inside a generated market — does validating the synthetic distribution against the live future just re-introduce the non-stationarity problem the technique was meant to escape? Likely yes; flagged as a subtler form of Data-Snooping Bias.
- FinRL live trading. FinRL Contests evaluate on withheld historical out-of-sample data, not disclosed live capital. Is there any FinRL-based strategy with audited live results? Not found — the Live Regime-Model Evidence Gap remains open for RL too.
Answered
- Statistical jump model replication. Partly addressed (round 10): the Statistical Jump Model line (Shu, Yu and Mulvey 2024; Aydınhan, Kolm, Mulvey and Shu 2024) now has open-source code (jumpmodels) and continues into 2025 dynamic-factor-allocation work. It remains within the same Princeton/Kolm research network, and its consistent, defensible claim is still downside-risk reduction, not benchmark-beating alpha. It still awaits fully independent replication.