Combinatorial Purged Cross-Validation
Combinatorial Purged Cross-Validation (CPCV) is a model-evaluation method developed by Marcos López de Prado to obtain honest out-of-sample performance estimates for financial machine-learning models. Standard k-fold cross-validation leaks information in finance because observations are not IID — serially correlated features and overlapping label horizons mean that splitting adjacent points across folds lets a model appear skilful even with irrelevant features. CPCV fixes this with purging (removing training observations whose label window overlaps a test label) and embargoing (dropping a small buffer of training observations after each test fold).
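The purging and embargoing rules described above can be sketched for a single contiguous test block as follows. This is a minimal illustration, not López de Prado's reference implementation: the function name `purged_embargoed_train`, the integer "bar number" timestamps, and the fixed three-bar label horizon are all assumptions made for the example.

```python
import numpy as np

def purged_embargoed_train(t0, t1, test_idx, embargo=0):
    """Training indices left after purging and embargoing around one test block.

    t0, t1   : label start and end times per observation (hypothetical integer bars)
    test_idx : indices of the (contiguous) test block
    embargo  : number of bars to drop immediately after the test block's labels end
    """
    t0, t1 = np.asarray(t0), np.asarray(t1)
    test_idx = np.asarray(test_idx)
    test_start = t0[test_idx].min()   # first bar covered by any test label
    test_end = t1[test_idx].max()     # last bar covered by any test label

    is_test = np.zeros(len(t0), dtype=bool)
    is_test[test_idx] = True

    # Purge: the label window [t0, t1] intersects the test span.
    overlaps = (t0 <= test_end) & (t1 >= test_start)
    # Embargo: the observation starts within `embargo` bars after the test span.
    embargoed = (t0 > test_end) & (t0 <= test_end + embargo)

    return np.flatnonzero(~is_test & ~overlaps & ~embargoed)

# Toy data: 20 observations, each label looks 3 bars ahead (assumed horizon).
t0 = np.arange(20)
t1 = t0 + 3
train = purged_embargoed_train(t0, t1, test_idx=np.arange(8, 12), embargo=2)
print(train.tolist())
```

In this toy case the test span covers bars 8 to 14, so observations 5 to 14 are purged (their label windows touch that span) and observations 15 and 16 fall under the two-bar embargo.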
Unlike walk-forward analysis, which tests a single historical path, CPCV partitions the series into N groups and, for every way of choosing k of those groups as the test set, trains on the remaining N−k groups (after purging and embargoing). This gives C(N,k) train/test combinations, from which φ[N,k]=(k/N)·C(N,k) distinct backtest paths can be reconstructed, each covering the whole series once. The result is a distribution of Sharpe ratios rather than one likely-overfit point estimate. That distribution is what the Deflated Sharpe Ratio and the Probability of Backtest Overfitting are computed on. It appears in this vault as the methodological gold standard against which a Markov-model backtest’s rigour is judged; a study using only a single train/test split or one walk-forward path is, by this benchmark, weak evidence.
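The combination and path counts can be checked with a short sketch. The helper names (`cpcv_splits`, `n_paths`, `assign_paths`) are hypothetical, and the rule used to assign test groups to paths (the j-th time a group appears in a test set goes to path j) is one simple convention assumed here, not necessarily the one any particular library uses.

```python
from itertools import combinations
from math import comb

def cpcv_splits(n_groups, k_test):
    """All C(N, k) choices of k test groups out of N."""
    return list(combinations(range(n_groups), k_test))

def n_paths(n_groups, k_test):
    """phi[N, k] = (k / N) * C(N, k); always an integer, equal to C(N-1, k-1)."""
    return k_test * comb(n_groups, k_test) // n_groups

def assign_paths(n_groups, k_test):
    """paths[j][g] = index of the split whose test forecast for group g goes to path j."""
    splits = cpcv_splits(n_groups, k_test)
    paths = [[None] * n_groups for _ in range(n_paths(n_groups, k_test))]
    seen = [0] * n_groups  # how many times each group has been a test group so far
    for s, test_groups in enumerate(splits):
        for g in test_groups:
            paths[seen[g]][g] = s
            seen[g] += 1
    return paths

# Example with N = 6 groups and k = 2 test groups per split.
splits = cpcv_splits(6, 2)
paths = assign_paths(6, 2)
print(len(splits), n_paths(6, 2))
```

With N = 6 and k = 2 there are 15 splits and 5 paths: each group is a test group in 5 of the 15 splits, so every path receives exactly one out-of-sample forecast per group.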
- Marcos López de Prado [defines] Combinatorial Purged Cross-Validation
- Combinatorial Purged Cross-Validation [supports] Out-of-Sample Backtesting
- Combinatorial Purged Cross-Validation [opposes] Overfitting in Quantitative Trading
Connections
- Out-of-Sample Backtesting — replication_available, source: https://en.wikipedia.org/wiki/Purged_cross-validation
- Marcos López de Prado — proposes_model, source: https://en.wikipedia.org/wiki/Purged_cross-validation
- Deflated Sharpe Ratio — relates, source: https://www.garp.org/hubfs/Whitepapers/a1Z1W0000054x6lUAA.pdf
- Overfitting in Quantitative Trading — contradicts, source: https://www.garp.org/hubfs/Whitepapers/a1Z1W0000054x6lUAA.pdf