Sim-to-Real Gap
The sim-to-real gap is the discrepancy between a trading agent’s performance in a historical-data simulation and its performance in a live market. Backtests typically assume the agent’s own orders do not move prices (no market impact), that orders fill instantly at the close or mid-price, and that costs are a simple fixed rate — assumptions that break in live trading, especially at institutional volume. Sun Wang An 2021 notes that learning by directly interacting with the real market is risky and impractical, so reinforcement-learning trading is trained almost entirely offline on historical data; this is why the literature is dominated by backtests and has almost no disclosed live-trading evidence.
The gap is the reinforcement-learning-specific instance of the broader Backtest-to-Live Performance Gap. The danger is not merely that reported returns shrink; it is that the agent optimises against the simulator itself. If the environment misprices liquidity, the policy that maximises simulated reward is one that exploits that mispricing, so the cost model is part of what is learned, not a footnote applied afterwards. Abbade and Reali Costa 2026 demonstrate this directly: replacing a flat 10 bp fee with an Almgren–Chriss market-impact model in open-source RL backtest environments cut one agent’s turnover from 19% to 1% and another’s out-of-sample return from 34% to 25%. Different cost assumptions produce different policies, not just different P&L. An agent that learned to trade frequently because the simulator made trading nearly free has, by construction, no live edge.
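To make the contrast between the two cost assumptions concrete, the sketch below compares a flat 10 bp fee with a simplified Almgren–Chriss-style cost for a single order executed in one period. It is an illustration only, not the implementation used by Abbade and Reali Costa 2026: the names `ImpactParams`, `flat_fee_cost`, `impact_cost` and `cost_in_bps`, the coefficient values, and the 50.0 share price are all hypothetical. The impact cost assumes linear temporary and permanent impact, so total cost grows with the square of order size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpactParams:
    """Hypothetical coefficients, chosen only for illustration."""
    epsilon: float = 0.01  # fixed cost per share (e.g. half the spread), in currency
    eta: float = 1e-6      # temporary impact: price concession per (share / period)
    gamma: float = 1e-7    # permanent impact: price shift per share traded
    tau: float = 1.0       # length of the execution period


def flat_fee_cost(shares: float, price: float, fee_bps: float = 10.0) -> float:
    """Cost under a flat proportional fee: a fixed fraction of traded notional."""
    return abs(shares) * price * fee_bps / 1e4


def impact_cost(shares: float, p: ImpactParams = ImpactParams()) -> float:
    """Simplified Almgren-Chriss-style cost of one order filled in one period.

    fixed     = epsilon * n          (linear in size)
    temporary = eta * n^2 / tau      (linear impact on the trading rate n / tau)
    permanent = 0.5 * gamma * n^2    (linear permanent impact, averaged over the fill)
    """
    n = abs(shares)
    fixed = p.epsilon * n
    temporary = p.eta * n * n / p.tau
    permanent = 0.5 * p.gamma * n * n
    return fixed + temporary + permanent


def cost_in_bps(cost: float, shares: float, price: float) -> float:
    """Express a currency cost as basis points of traded notional."""
    return 1e4 * cost / (abs(shares) * price)


# Compare the two models across order sizes at an assumed price of 50.0.
PRICE = 50.0
for size in (100, 10_000, 1_000_000):
    flat = cost_in_bps(flat_fee_cost(size, PRICE), size, PRICE)
    impact = cost_in_bps(impact_cost(size), size, PRICE)
    print(f"size={size:>9,}  flat={flat:6.2f} bps  impact={impact:8.2f} bps")
```

Under the flat fee the cost per unit of notional is the same at every size, so a simulated agent pays no penalty for trading large or often. Under the impact model the per-unit cost rises with order size, which is the kind of pressure that can push a learned policy toward lower turnover.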
Aggregate live evidence corroborates the gap. Buczynski, Cuzzolin and Sahakian’s review of 27 ML market-forecasting experiments found a literature claiming >90% accuracy but “conspicuously lacking in high-profile success cases”, while prominent ML-driven funds (Aidyia, Sentient Technologies) liquidated within months of launch and the Eurekahedge AI Hedge Fund Index underperformed the S&P 500 over fifteen years of real-money trading. Marcos López de Prado’s The 10 Reasons Most Machine Learning Funds Fail gives the practitioner mechanism: research-through-backtesting and single-path walk-forward testing manufacture simulated edges that do not survive deployment.
For this vault the sim-to-real gap is why a Markov-decision-process or reinforcement-learning agent with a strong simulated Sharpe ratio cannot be graded above weak-to-moderate unless it is evaluated with realistic costs and, critically, is backed by Live Trading Evidence or independent replication. No source reviewed provides a disclosed live track record for a standalone Markov/MDP/RL trading agent, so the gap remains uncrossed for every such claim in the vault.
Connections
- Reinforcement Learning Trading Policy — lacks_live_evidence, trained offline; no disclosed live record, source: https://arxiv.org/abs/2109.13851
- Markov Decision Process Trading Model — lacks_live_evidence, simulator-trained policies, source: https://arxiv.org/abs/2109.13851
- Backtest-to-Live Performance Gap — part-of, the RL-specific case of the general gap, source: https://pmc.ncbi.nlm.nih.gov/articles/PMC8019690/
- Abbade and Reali Costa 2026 — includes_costs, cost model changes the learned policy, source: https://arxiv.org/html/2603.29086
- Transaction Costs and Slippage — relates, omitted costs are a primary driver of the gap, source: https://pmc.ncbi.nlm.nih.gov/articles/PMC8019690/
- The 10 Reasons Most Machine Learning Funds Fail — supports, practitioner account of simulated-to-live failure, source: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3104816
- AI Hedge Fund Index Underperformance — reports_underperformance, live ML-fund evidence is negative, source: https://www.ig.com/za/prime/insights/articles/has-artificial-intelligences-impact-on-hedge-funds-been-overhype-241121
- Live Trading Evidence — lacks_live_evidence, no live Markov/RL track record exists, source: https://pmc.ncbi.nlm.nih.gov/articles/PMC8019690/
- Non-Stationarity — relates, regime shifts the simulator never saw widen the gap, source: https://arxiv.org/abs/2109.13851
Sim-to-Real Gap [part-of] Backtest-to-Live Performance Gap
Transaction Costs and Slippage [causes] Sim-to-Real Gap
Abbade and Reali Costa 2026 [supports] Sim-to-Real Gap
Sim-to-Real Gap [contradicts] Live Trading Evidence
Sources
- Sun, S., Wang, R., & An, B. (2021). “Reinforcement Learning for Quantitative Trading.” arXiv 2109.13851. https://arxiv.org/abs/2109.13851
- Buczynski, W., Cuzzolin, F., & Sahakian, B. (2021). “A review of machine learning experiments in equity investment decision-making.” International Journal of Data Science and Analytics, 11(3), 221–242. https://pmc.ncbi.nlm.nih.gov/articles/PMC8019690/
- López de Prado, M. (2018). “The 10 Reasons Most Machine Learning Funds Fail.” Journal of Portfolio Management, 44(6), 120–133. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3104816
- IG Prime (2024). “Has artificial intelligence’s impact on hedge funds been overhyped?” https://www.ig.com/za/prime/insights/articles/has-artificial-intelligences-impact-on-hedge-funds-been-overhype-241121