Differential Sharpe Ratio

The Differential Sharpe Ratio (DSR) is an online, incremental approximation of the Sharpe ratio, proposed by John Moody and collaborators in Moody Wu Liao Saffell 1998 and used throughout Moody and Saffell 2001. The ordinary Sharpe ratio is computed over a fixed window, which makes it unsuitable as a per-step reward for online learning. The DSR solves this by maintaining exponential moving averages of returns, A_t = A_{t-1} + eta·deltaA_t with deltaA_t = R_t - A_{t-1}, and of squared returns, B_t = B_{t-1} + eta·deltaB_t with deltaB_t = R_t^2 - B_{t-1}, where R_t is the trading return at time t. D_t is then defined as the first-order rate of change of the Sharpe ratio with respect to the EMA adaptation rate eta, evaluated at eta = 0. Concretely, D_t = (B_{t-1}·deltaA_t - 0.5·A_{t-1}·deltaB_t) / (B_{t-1} - A_{t-1}^2)^(3/2), so that S_t ≈ S_{t-1} + eta·D_t. In this sense D_t is “the influence of the trading return at time t on the cumulative Sharpe ratio.”
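The recursion above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the class name `DifferentialSharpe`, the default `eta`, the initial moment estimates `a0` and `b0`, and the zero fallback when the variance estimate is degenerate are all assumptions made for the example. It takes deltaA_t = R_t - A_{t-1} and deltaB_t = R_t^2 - B_{t-1}.

```python
import math


class DifferentialSharpe:
    """Online Differential Sharpe Ratio tracker (illustrative sketch)."""

    def __init__(self, eta=0.01, a0=0.0, b0=1e-4):
        # eta: EMA adaptation rate; a0, b0: assumed initial moment estimates.
        self.eta = eta
        self.a = a0  # EMA of returns, A_{t-1}
        self.b = b0  # EMA of squared returns, B_{t-1}

    def sharpe(self):
        # Sharpe ratio implied by the current moment estimates.
        var = self.b - self.a ** 2
        return self.a / math.sqrt(var) if var > 0 else 0.0

    def update(self, r):
        """Consume return r, compute D_t from the old EMAs, then advance them."""
        delta_a = r - self.a
        delta_b = r * r - self.b
        var = self.b - self.a ** 2
        if var > 1e-12:
            d = (self.b * delta_a - 0.5 * self.a * delta_b) / var ** 1.5
        else:
            d = 0.0  # assumed fallback while the variance estimate is degenerate
        self.a += self.eta * delta_a
        self.b += self.eta * delta_b
        return d


# Usage: one reward per step, with no need to wait for the episode to end.
tracker = DifferentialSharpe(eta=0.01)
rewards = [tracker.update(r) for r in (0.01, -0.005, 0.002)]
```

D_t is computed from A_{t-1} and B_{t-1} before the averages are advanced, which matches the indices in the formula. For small eta, the change in `sharpe()` across one `update` call should be close to eta·D_t.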

It is the canonical reward function for Recurrent Reinforcement Learning Trading and remains widely used in modern deep RL trading because it gives a stable, differentiable, risk-adjusted reward at every timestep without waiting for an episode to end. A downside-only variant, the Differential Downside Deviation Ratio (D3R), is the online analogue of the Sortino ratio and was used in Moody & Saffell’s USD/GBP study. As an evaluation/reward construct the DSR is well established. The open question the vault tracks is whether optimising it produces genuine out-of-sample, post-cost alpha: see the skeptical reading in Reinforcement Learning Trading Policy and the modest net result in Borrageiro Firoozye Barucca 2022.
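The D3R can be sketched the same way. The note does not give its formula, so the closed form below is an assumption: it is obtained by applying the same first-order expansion in eta to the ratio A_t / sqrt(DD2_t), where DD2_t is an EMA of squared negative returns. The class name, defaults, and degenerate-case fallback are hypothetical, and the result should be checked against Moody and Saffell 2001 before use.

```python
import math


class DifferentialDownsideDeviation:
    """Online downside-deviation analogue of the DSR (illustrative sketch)."""

    def __init__(self, eta=0.01, a0=0.0, dd2_0=1e-4):
        # eta: EMA adaptation rate; a0, dd2_0: assumed initial estimates.
        self.eta = eta
        self.a = a0        # EMA of returns
        self.dd2 = dd2_0   # EMA of squared negative returns

    def ratio(self):
        # Downside deviation ratio implied by the current estimates.
        return self.a / math.sqrt(self.dd2) if self.dd2 > 0 else 0.0

    def update(self, r):
        """Consume return r, compute the differential from the old EMAs, then advance them."""
        neg = min(r, 0.0)  # only losses contribute to downside deviation
        if self.dd2 > 1e-12:
            dd = math.sqrt(self.dd2)
            # First-order derivative of a / sqrt(dd2) with respect to eta at eta = 0.
            d = (self.dd2 * (r - 0.5 * self.a) - 0.5 * self.a * neg * neg) / dd ** 3
        else:
            d = 0.0  # assumed fallback while the downside estimate is degenerate
        self.a += self.eta * (r - self.a)
        self.dd2 += self.eta * (neg * neg - self.dd2)
        return d
```

For a positive return the expression reduces to (R_t - 0.5·A_{t-1}) / DD_{t-1}, so gains are rewarded without adding to the risk term. Losses enter both the numerator and the downside estimate.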

Moody Wu Liao Saffell 1998 [defines] Differential Sharpe Ratio
Differential Sharpe Ratio [relates] Recurrent Reinforcement Learning Trading

Connections

Sources