Calmar Ratio

The Calmar ratio is a risk-adjusted performance measure equal to a strategy’s annualised return divided by its maximum drawdown, the worst peak-to-trough loss over the measurement window. A higher value means more return earned per unit of worst-case loss. It was created by the California fund manager Terry W. Young and first published in 1991 in the trade journal Futures; the name is an acronym of his firm and newsletter, CALifornia Managed Accounts Reports. Young defined it as a slightly modified Sterling ratio: average annual return over the last 36 months divided by the maximum drawdown over those 36 months, recalculated monthly rather than yearly. His argument was that a monthly-updated figure changes gradually and smooths out a manager’s over- and under-achievement periods more readily than the Sharpe or Sterling ratios. It is closely related to, but not identical with, the older MAR ratio, which uses all data from inception rather than a rolling 36 months.
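The definition above can be sketched in a few lines of Python. This is a minimal illustration, not Young's published procedure: it assumes simple monthly returns expressed as fractions, uses a compound annualised return where Young speaks of "average annual return", and the function names and the `window` and `periods_per_year` parameters are hypothetical.

```python
def max_drawdown(returns):
    """Worst peak-to-trough loss of the compounded equity curve, as a positive fraction."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def annualised_return(returns, periods_per_year=12):
    """Compound growth over the sample, rescaled to one year (an assumption of this sketch)."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth ** (periods_per_year / len(returns)) - 1.0


def calmar_ratio(returns, periods_per_year=12, window=36):
    """Annualised return over the trailing window divided by the maximum drawdown in that window."""
    recent = list(returns)[-window:]
    drawdown = max_drawdown(recent)
    if drawdown == 0.0:
        return float("nan")  # undefined when the window contains no loss at all
    return annualised_return(recent, periods_per_year) / drawdown


# Illustrative data only: a 20% dip that is recovered, ending the year up 10%.
example_returns = [0.10, -0.20, 0.25] + [0.0] * 9
example_calmar = calmar_ratio(example_returns)
```

Passing a full return history with a very large `window` gives the inception-to-date variant that the MAR ratio uses.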

The metric’s appeal is that maximum drawdown is intuitive and tangible: it answers “what is the worst loss I had to sit through?”, which makes the Calmar ratio popular for evaluating CTAs, hedge funds and systematic trading strategies. But that same simplicity is its weakness, and the weakness matters for how this vault grades backtests. The Calmar ratio ignores volatility and the path of losses: a deep but brief drawdown and a shallow but prolonged one can produce a similar ratio, and it says nothing about recovery time. More importantly for backtest assessment, maximum drawdown is a single extreme order statistic and is therefore highly sample-dependent. Extending a history can only keep the worst loss the same or make it larger, which tends to lower the Calmar ratio, and shifting the window can move it in either direction. Comparing Calmar ratios computed over different periods is therefore close to meaningless; the metric is only informative when the comparison window is held fixed and it is read alongside Sharpe-type and downside-deviation measures.
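Both weaknesses are easy to reproduce on toy data. The sketch below uses synthetic monthly returns invented for illustration, and all function and variable names are hypothetical. It builds two paths with the same total return and the same maximum drawdown but very different recovery times, then tracks how the maximum drawdown of one history evolves as more of it is included.

```python
def equity_curve(returns):
    """Compounded equity after each period, starting from 1.0."""
    curve, equity = [], 1.0
    for r in returns:
        equity *= 1.0 + r
        curve.append(equity)
    return curve


def max_drawdown(returns):
    """Worst peak-to-trough loss as a positive fraction."""
    peak, worst = 1.0, 0.0
    for equity in equity_curve(returns):
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def periods_underwater(returns, tol=1e-12):
    """Number of periods that end below the running peak."""
    peak, count = 1.0, 0
    for equity in equity_curve(returns):
        peak = max(peak, equity)
        if equity < peak - tol:
            count += 1
    return count


def calmar(returns, periods_per_year=12):
    """Compound annualised return over the sample divided by its maximum drawdown."""
    growth = equity_curve(returns)[-1]
    annual = growth ** (periods_per_year / len(returns)) - 1.0
    return annual / max_drawdown(returns)


# Same 20% loss and same final equity; only the recovery time differs.
quick_recovery = [-0.20, 0.25] + [0.0] * 9 + [0.10]
slow_recovery = [-0.20] + [0.0] * 9 + [0.25, 0.10]

same_ratio = abs(calmar(quick_recovery) - calmar(slow_recovery)) < 1e-12
underwater = (periods_underwater(quick_recovery), periods_underwater(slow_recovery))

# Growing a history from a fixed start can only keep or raise the maximum drawdown.
history = [0.02, -0.05, 0.03, 0.04, -0.15, 0.06, 0.05, 0.03]
drawdown_by_length = [max_drawdown(history[:n]) for n in range(1, len(history) + 1)]
```

Here `same_ratio` is true even though one path spends one month below its peak and the other ten, and `drawdown_by_length` never decreases as the sample grows.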

In this vault the Calmar ratio is the headline profitability claim for the discrete-Markov family. It is the metric by which Wilinski 2019 — the only peer-reviewed paper in the vault that reports actual trading profit for a Markov Chain Trading Model — judges its heterogeneous rolling-window chain to have produced “good results of profit according to the Calmar criterion” on EUR USD Currency Pair hourly data and WIG20 Index daily data, for both first- and second-order chains. Wiliński cites external Calmar thresholds (attributed to Main, 2015) to call the values “excellent.”

That claim is graded weak within this vault, and the Calmar ratio itself is part of the reason. The reported figures come from a strategy whose three hyper-parameters — window length, number of windows, number of intervals — were tuned by machine learning to maximise predictive efficiency, with no clearly disclosed out-of-sample period, no transaction costs, and no buy-and-hold or random-walk benchmark. A Calmar ratio computed on a parameter-optimised in-sample simulation measures how well the optimiser found a low-drawdown path through the training data, not how the strategy would behave on unseen data after costs. The Calmar ratio is a legitimate, well-defined risk metric; presented this way it is a parameter-fitted in-sample number, not evidence of tradeable profitability. The proper bar — out-of-sample return over out-of-sample drawdown, net of realistic costs, against a benchmark — is the same one applied to every backtest in this vault.
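As a rough sketch of that bar, the comparison might be set up as follows. Nothing here comes from Wilinski 2019: the return series are synthetic, the flat per-trade cost is a simplifying assumption, and `grade_backtest`, `net_returns`, `split` and `cost_per_trade` are hypothetical names chosen for illustration.

```python
def max_drawdown(returns):
    """Worst peak-to-trough loss as a positive fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def calmar(returns, periods_per_year):
    """Compound annualised return divided by maximum drawdown; NaN if undefined."""
    if not returns:
        return float("nan")
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    annual = growth ** (periods_per_year / len(returns)) - 1.0
    drawdown = max_drawdown(returns)
    return annual / drawdown if drawdown > 0.0 else float("nan")


def net_returns(gross_returns, traded, cost_per_trade):
    """Subtract a flat cost in every period where a trade occurred (assumed cost model)."""
    return [r - cost_per_trade if t else r for r, t in zip(gross_returns, traded)]


def grade_backtest(gross_returns, traded, benchmark_returns, split,
                   cost_per_trade, periods_per_year):
    """Report the in-sample figure next to the one that should actually be judged."""
    net = net_returns(gross_returns, traded, cost_per_trade)
    return {
        "in_sample_gross": calmar(gross_returns[:split], periods_per_year),
        "out_of_sample_net": calmar(net[split:], periods_per_year),
        "benchmark_out_of_sample": calmar(benchmark_returns[split:], periods_per_year),
    }


# Synthetic monthly data: the first four periods stand in for the tuning sample.
strategy_gross = [0.03, -0.01, 0.02, 0.03, 0.01, -0.04, 0.02, 0.01]
traded_flags = [True] * len(strategy_gross)
buy_and_hold = [0.01, 0.00, -0.02, 0.01, 0.01, -0.03, 0.02, 0.01]

report = grade_backtest(strategy_gross, traded_flags, buy_and_hold,
                        split=4, cost_per_trade=0.002, periods_per_year=12)
```

Only `out_of_sample_net`, read against `benchmark_out_of_sample`, speaks to tradeable profitability; `in_sample_gross` is the kind of number the paragraph above treats as weak evidence.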

Calmar Ratio [defines] Maximum Drawdown
Wilinski 2019 [reports_profitability] Calmar Ratio
Calmar Ratio [relates] Markov Chain Trading Model

Connections

Sources