Moody Wu Liao Saffell 1998

“Performance Functions and Reinforcement Learning for Trading Systems and Portfolios” by John Moody, Lizhong Wu, Yuansong Liao and Matthew Saffell was published in the Journal of Forecasting (vol. 17, pp. 441-470, 1998) and also as a chapter in Decision Technologies for Computational Finance. It is the earliest full recurrent reinforcement learning (RRL) trading paper and the direct predecessor of Moody and Saffell 2001. It proposes training trading systems and portfolios by optimising objective functions that directly measure trading performance (profit, the Sharpe ratio, and the newly proposed Differential Sharpe Ratio, DSR) instead of training a supervised forecaster on labelled data. It reports that DSR-trained long/short traders behave more consistently than profit-maximising traders, and that both beat forecasters trained to minimise mean squared error (MSE).
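The note does not spell out the Differential Sharpe Ratio, so the following is only a minimal sketch of the form usually quoted in the RRL literature: exponential moving estimates A and B of the first and second moments of the returns, with D_t = (B_{t-1} ΔA_t - ½ A_{t-1} ΔB_t) / (B_{t-1} - A_{t-1}²)^{3/2}, where ΔA_t = R_t - A_{t-1} and ΔB_t = R_t² - B_{t-1}. The function name, the adaptation rate `eta`, the starting values `a0` and `b0`, and the choice to return 0.0 while the variance estimate is not yet positive are all assumptions made for illustration, not details taken from the paper.

```python
def differential_sharpe_ratio(returns, eta=0.01, a0=0.0, b0=0.0):
    """Return one DSR value per period for a sequence of trading returns.

    a and b are exponential moving estimates of the first and second
    moments of the returns; eta, a0 and b0 are illustrative assumptions.
    """
    a, b = a0, b0
    dsr = []
    for r in returns:
        delta_a = r - a          # update direction for the first moment
        delta_b = r * r - b      # update direction for the second moment
        variance = b - a * a     # variance estimate from the previous step
        if variance > 0:
            d = (b * delta_a - 0.5 * a * delta_b) / variance ** 1.5
        else:
            d = 0.0              # assumed guard: DSR is undefined here
        dsr.append(d)
        # Moving-average updates, applied after D_t is computed so that
        # D_t only uses the estimates from step t-1.
        a += eta * delta_a
        b += eta * delta_b
    return dsr


if __name__ == "__main__":
    # Hypothetical per-period returns, for illustration only.
    print(differential_sharpe_ratio([0.01, 0.02, -0.01], eta=0.1))
```

Because each D_t depends only on the current return and the previous moment estimates, it can serve as a per-step reward for online training, which is the role the DSR plays in place of a batch Sharpe ratio.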

This paper is the source of the much-quoted claim of out-of-sample predictability in the monthly S&P 500 stock index over the 25-year period 1970-1994. The vault grades the profitability evidence as weak: the results are single-group simulated backtests on data ending in 1994, with no cross-validation, no reported drawdown statistics, no robustness testing and no released code. The paper defines a method and reports an encouraging backtest rather than a substantiated tradeable edge.

Moody Wu Liao Saffell 1998 [defines] Differential Sharpe Ratio
Moody Wu Liao Saffell 1998 [precedes] Moody and Saffell 2001

Connections

Sources