The 10 Reasons Most Machine Learning Funds Fail
“The 10 Reasons Most Machine Learning Funds Fail” by Marcos López de Prado (The Journal of Portfolio Management, 2018, 44(6), 120–133; first SSRN draft December 2017) is the vault’s primary practitioner authority on the Backtest-to-Live Performance Gap. Written from direct industry observation — “I have seen many faces come and go, firms started and shut down” — it argues that the high failure rate of quantitative and ML funds is driven by ten avoidable, recurring mistakes rather than bad luck.
Several of the ten map directly onto the backtest-to-live gap. Pitfall #2, “research through backtesting”, is the practice of running an ML algorithm, backtesting its predictions, and repeating until a nice-looking backtest appears: “it does not matter if the backtest is a walk-forward out-of-sample; the fact that we are repeating a test over and over on the same data will likely lead to a false discovery.” Pitfall #9 attacks walk-forward backtesting specifically: it tests only a single historical path and is as easy to overfit as a walk-backward test, so a good walk-forward result is not by itself evidence of an edge. Pitfall #10, backtest overfitting, formalises the point: a researcher who runs many trials and keeps the configuration with the highest estimated Sharpe ratio will find an attractive one even when the underlying series is a martingale with no exploitable edge. Unless the number of trials is disclosed, it is impossible to compute a Probability of Backtest Overfitting or a Deflated Sharpe Ratio, and the backtest is uninterpretable.
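The selection effect behind Pitfall #10 can be illustrated with a small simulation. This is a minimal sketch, not the paper's own code: it assumes i.i.d. Gaussian daily returns with zero mean (so every simulated strategy has a true Sharpe ratio of zero), 252 trading days per year, and a hypothetical helper name `max_sharpe_under_null`.

```python
import numpy as np


def max_sharpe_under_null(n_trials: int, n_days: int = 252, seed: int = 0):
    """Return (median, max) annualised Sharpe over n_trials zero-edge strategies.

    Assumption: each trial is one year of i.i.d. normal daily returns with
    zero mean, i.e. there is nothing to discover in any of them.
    """
    rng = np.random.default_rng(seed)
    # Each row is one "strategy"; the true expected return is exactly zero.
    returns = rng.normal(loc=0.0, scale=0.01, size=(n_trials, n_days))
    # Annualised Sharpe estimate per trial (sample std, 252 trading days).
    sharpe = returns.mean(axis=1) / returns.std(axis=1, ddof=1) * np.sqrt(252)
    return float(np.median(sharpe)), float(sharpe.max())


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        median_sr, best_sr = max_sharpe_under_null(n)
        print(f"trials={n:5d}  median SR={median_sr:+.2f}  best SR={best_sr:+.2f}")
```

Because the annualised Sharpe estimate over one year of data has a standard error of roughly 1, the best of many zero-edge trials will typically look like a strong strategy while the median stays near zero. That is why the trial count has to be reported alongside the headline figure.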
López de Prado’s recommended fixes (the meta-strategy paradigm, fractional differentiation to keep features stationary while preserving memory, Combinatorial Purged Cross-Validation to generate a distribution of backtest paths, and explicit trial-count disclosure) are the methodological standards the vault uses to grade Markov-model claims. The paper carries a negative profitability grade because its finding about quant strategies is one of systematic failure: it documents why a positive backtest, even a walk-forward one, is not evidence of tradeable profitability. For the vault’s research question it is decisive support for treating backtest-only Markov-model claims, which is all of them collected so far, as weak by default, and for treating the absence of a live track record (Live Trading Evidence) as the expected outcome when these pitfalls go uncontrolled.
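How Combinatorial Purged Cross-Validation turns one history into several backtest paths can be sketched by enumerating its train/test splits. This is a simplified illustration under stated assumptions: the observations are assumed to be already partitioned into `n_groups` contiguous groups, the purging and embargo steps are omitted entirely, and the function names `cpcv_splits` and `n_backtest_paths` are hypothetical. The path count uses the identity (k/N)·C(N, k) = C(N−1, k−1) for N groups with k test groups.

```python
from itertools import combinations
from math import comb


def cpcv_splits(n_groups: int, n_test_groups: int):
    """Yield (train_groups, test_groups) for every choice of test groups.

    Simplification: purging and embargo of observations near the
    train/test boundaries are NOT applied here.
    """
    groups = range(n_groups)
    for test in combinations(groups, n_test_groups):
        train = tuple(g for g in groups if g not in test)
        yield train, test


def n_backtest_paths(n_groups: int, n_test_groups: int) -> int:
    """Number of complete backtest paths that the splits can be arranged into.

    Each group lands in the test set of C(N-1, k-1) splits, which equals
    (k / N) * C(N, k).
    """
    return comb(n_groups - 1, n_test_groups - 1)


if __name__ == "__main__":
    splits = list(cpcv_splits(6, 2))
    print(len(splits), "splits,", n_backtest_paths(6, 2), "paths")
```

With 6 groups and 2 test groups this gives 15 splits that recombine into 5 backtest paths, so performance can be reported as a distribution rather than as the single number a walk-forward test produces.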
Connections
- Marcos López de Prado — proposes_model, author, source: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3104816
- Backtest-to-Live Performance Gap — supports, practitioner account of why funds fail live, source: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3104816
- Overfitting in Quantitative Trading — supports, pitfalls #2, #9 and #10 are forms of overfitting, source: https://www.garp.org/hubfs/Whitepapers/a1Z1W0000054x6lUAA.pdf
- Out-of-Sample Backtesting — relates, argues a single walk-forward path is easily overfit, source: https://www.garp.org/hubfs/Whitepapers/a1Z1W0000054x6lUAA.pdf
- Combinatorial Purged Cross-Validation — proposes_model, recommended fix for single-path backtesting, source: https://www.garp.org/hubfs/Whitepapers/a1Z1W0000054x6lUAA.pdf
- Probability of Backtest Overfitting — relates, needs disclosed trial count, source: https://www.garp.org/hubfs/Whitepapers/a1Z1W0000054x6lUAA.pdf
- Reinforcement Learning Trading Policy — suffers_overfitting_risk, ML trading strategies are the paper’s subject, source: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3104816
The 10 Reasons Most Machine Learning Funds Fail [supports] Backtest-to-Live Performance Gap
The 10 Reasons Most Machine Learning Funds Fail [defines] Overfitting in Quantitative Trading
Marcos López de Prado [proposes_model] Combinatorial Purged Cross-Validation
Sources
- López de Prado, M. (2018). “The 10 Reasons Most Machine Learning Funds Fail.” The Journal of Portfolio Management, 44(6), 120–133. SSRN DOI 10.2139/ssrn.3104816. SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3104816; full text: https://www.garp.org/hubfs/Whitepapers/a1Z1W0000054x6lUAA.pdf; abstract: https://jpm.pm-research.com/content/44/6/120.abstract