After out-of-sample testing and walk-forward optimization, we can be reasonably confident that our GBPJPY trend following strategy is robust.
For this final article on robustness testing, we will learn more about this strategy’s risk profile using StrategyQuant’s Monte Carlo simulator. In this instance, Monte Carlo simulation will be used for informational purposes only; there will be no filtering/shortlisting of strategies, although you have the option to do so in StrategyQuant.
Monte Carlo Simulation
Because random noise contaminates every price dataset, your backtest statistics will never be exactly replicated in live trading. Monte Carlo simulations use repeated random sampling to estimate the probabilities of obtaining different backtest outcomes, giving you an idea of how your strategy’s performance might deteriorate in live trading.
There are two common trading applications of Monte Carlo simulations:
- Randomizing your backtest’s trade sequence
- Running a fresh backtest with randomized changes in prices/strategy parameters
Both these options are available in StrategyQuant. Here I’ll use the GBPJPY trend strategy to demonstrate the use of the trade sequence randomizer.
Why Randomize Your Trade Sequence?
Randomizing your backtest’s trade sequence can provide a more reliable estimate of the drawdowns you will encounter in live trading.
Every backtest, no matter how reliable, only represents a single run of your strategy over a certain historical period. Suppose you flip a fair coin 100 times and get 53 heads. If you flip it another 100 times, you are not likely to get exactly 53 heads again.
By randomizing the sequence of trades, the Monte Carlo method simulates multiple equity curves from your single backtest. Since drawdown is affected by the sequence of losing trades, these multiple equity curves can be used to produce a drawdown distribution that is more reliable than a single drawdown value.
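To make the idea concrete, here is a minimal Python sketch of trade sequence randomization. It is not StrategyQuant’s implementation; the function names and the short list of per-trade profits are hypothetical and only illustrate how shuffling one backtest yields many maximum drawdown values.

```python
import random

def max_drawdown(pnl):
    """Largest peak-to-trough drop of the cumulative equity curve."""
    equity = peak = worst = 0.0
    for p in pnl:
        equity += p
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst

def shuffled_drawdowns(pnl, runs=1000, seed=0):
    """Shuffle the trade order `runs` times and record each max drawdown."""
    rng = random.Random(seed)
    trades = list(pnl)          # copy so the caller's list is untouched
    results = []
    for _ in range(runs):
        rng.shuffle(trades)     # same trades, new order
        results.append(max_drawdown(trades))
    return results

# Hypothetical per-trade profits in dollars (not the GBPJPY backtest).
example_trades = [40, -15, -10, 85, -20, -20, 30, -25, 60, -10]
drawdowns = shuffled_drawdowns(example_trades, runs=1000)
print("original order:", max_drawdown(example_trades))
print("worst simulated:", max(drawdowns))
```

The original backtest contributes only the first printed value; the list returned by `shuffled_drawdowns` is the raw material for the drawdown distribution.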
Before we go into more detail about these distributions, a brief explanation of confidence levels is required.
Confidence Intervals for Monte Carlo Simulation
The GBPJPY trend following strategy above produced a 372-trade backtest over the 2003-2020 period. By rearranging the sequence of trades, we can arrive at 372! (372*371*370…*3*2*1) different equity curves. This is an astronomical number which no computer can process in a reasonable time!
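For a sense of scale, the short Python check below counts the digits in 372!. It uses only the standard library; the variable names are illustrative.

```python
import math

n_trades = 372

# Exact value of 372! as a Python integer, then its digit count.
permutations = math.factorial(n_trades)
digits = len(str(permutations))

# Cross-check with the log-gamma function: log10(n!) = lgamma(n + 1) / ln(10).
log10_estimate = math.lgamma(n_trades + 1) / math.log(10)

print("digits in 372!:", digits)
print("log10(372!) is about", round(log10_estimate, 1))
```

Even at billions of equity curves per second, enumerating a number with several hundred digits is out of reach, which is why sampling is the practical route.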
We can achieve a compromise by simulating only a few hundred/thousand different equity curves, and then using confidence levels to quantify the uncertainty caused by this simplification.
Confidence levels are a concept borrowed from statistics, and are used to measure the degree of uncertainty in a sampling method. A confidence level refers to the probability that the sampled results contain the true value of a certain parameter.
In trading, we are usually concerned with profits and drawdowns. The typical result from a Monte Carlo simulation would look as follows:
If we use a 95% confidence level, it means there is a 5% probability that the actual profit will be smaller than $672, and that the drawdown will be larger than $220. The higher the confidence level, the more the metrics will deteriorate, but the higher the probability that those metrics will encompass your future performance.
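One simple way to read a value at a given confidence level from a set of simulated results is an empirical percentile. The sketch below uses the nearest-rank method; this is an assumption made for illustration, since the exact calculation StrategyQuant performs is not shown in this article, and the function name is hypothetical.

```python
import math

def at_confidence(values, level_pct, higher_is_worse):
    """Nearest-rank value that `level_pct` percent of simulations do not breach.

    higher_is_worse=True  -> e.g. drawdown: level_pct% of runs are <= result
    higher_is_worse=False -> e.g. profit:   level_pct% of runs are >= result
    """
    if not values:
        raise ValueError("values must not be empty")
    # Order from best to worst so the rank counts the 'not worse' runs.
    ordered = sorted(values, reverse=not higher_is_worse)
    rank = max(1, math.ceil(level_pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical simulated results: 100 drawdowns and 100 net profits.
sim_drawdowns = list(range(1, 101))
sim_profits = list(range(1, 101))

print(at_confidence(sim_drawdowns, 95, higher_is_worse=True))
print(at_confidence(sim_profits, 95, higher_is_worse=False))
```

Raising `level_pct` moves the result toward the worst simulated run, which matches the deterioration described above.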
Setting Up StrategyQuant’s Monte Carlo Simulation
The setup options for our Monte Carlo simulation are shown below:
1000 different equity curves will be simulated for our strategy. This should provide sufficiently reliable results without much computational burden. StrategyQuant offers two sampling methods: Exact and Resampling.
In the exact sampling method (also known as sampling without replacement), each trade from the original backtest of 372 trades can only be sampled once. This preserves the strategy’s probability distribution, or its performance profile.
The resampling method (sampling with replacement) allows each trade to be sampled more than once. This will alter your strategy’s probability distribution, and may be preferable if you expect market conditions to change drastically in future, or your original backtest only contains a small basket of trades.
Since our strategy was developed over 16 years, and 372 trades is a decent number, let’s stick with the exact sampling method.
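The difference between the two sampling methods can be sketched in a few lines of Python. The function names and trade values below are hypothetical, and this is only an illustration of sampling with and without replacement, not StrategyQuant’s code.

```python
import random

def exact_sample(pnl, rng):
    """Sampling without replacement: every trade appears exactly once."""
    return rng.sample(pnl, k=len(pnl))

def resample(pnl, rng):
    """Sampling with replacement: a trade may appear several times or not at all."""
    return rng.choices(pnl, k=len(pnl))

rng = random.Random(42)
trades = [40, -15, -10, 85, -20, -20, 30, -25, 60, -10]  # hypothetical

exact = exact_sample(trades, rng)
boot = resample(trades, rng)

print("original net profit:  ", sum(trades))
print("exact net profit:     ", sum(exact))   # always equals the original
print("resampled net profit: ", sum(boot))    # usually differs
```

Because the exact method only reorders the same trades, every simulated equity curve ends at the same net profit; only the path, and therefore the drawdown, changes.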
Monte Carlo Simulation Results
The characteristic ‘straw broom’ charts produced from the simulations are shown below, together with a table showing performance metrics at various confidence levels.
The blue equity curve reflects the original backtest. Notice how the overall net profit remains the same for all the equity curves; this is a consequence of sampling without replacement.
Ideally, you want the multiple equity curves to be grouped closely together, indicating consistency across the runs. Looking at the 95% confidence level, we can infer that there is a 5% probability that future drawdowns will be larger than $311.
This is over twice as large as the drawdown in the original backtest. If you are using historical drawdowns to determine your strategy’s capital allocation, using the Monte Carlo-simulated drawdowns can help you determine a more conservative value. Capital allocation is discussed in more detail in the Portfolio Composition article.
Note that Monte Carlo simulation is probabilistic in nature, so the equity curves and metrics will vary slightly every time you run the test.
Limitations of Monte Carlo Simulation
While Monte Carlo simulation can be a great tool for anticipating future risks, it has certain limitations. The following is a non-exhaustive list:
Overfitting Cannot Be Detected
Monte Carlo simulations assume that the input trades from your backtest reflect your strategy’s true performance; only the sequence of trades is altered. If your backtest is the result of an overfitted strategy, your Monte Carlo performance metrics will be artificially good.
This is an example of the popular saying ‘garbage in, garbage out.’ It is therefore best to input trades obtained from out-of-sample testing.
Similarly, if your original backtest only contains a handful of trades or covers a brief historical period, your estimations will lose their predictive value when market conditions change.
Serial Correlations Are Not Preserved
Serial correlations may exist in some strategies, whereby the outcome of a particular trade may affect the outcome of subsequent trades.
This is especially prevalent if you are a trend-follower, considering that a large trend is likely to be followed by a period of consolidation. Your trades will thus tend to follow a cyclical pattern – one large winner is likely to be followed by a string of smaller losers.
Due to its random sampling from your original backtest trades, Monte Carlo simulation cannot capture such trade dependencies.
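If you want a rough check on whether your own trade list shows this kind of dependency, one common measure is the lag-1 autocorrelation of the per-trade results. The Python sketch below is an illustrative assumption on my part, not a StrategyQuant feature, and the function name is hypothetical.

```python
def lag1_autocorrelation(pnl):
    """Correlation between each trade's result and the next trade's result.

    Values near 0 suggest little trade-to-trade dependency; clearly negative
    values suggest winners tend to be followed by losers (and vice versa).
    """
    n = len(pnl)
    if n < 2:
        return 0.0
    mean = sum(pnl) / n
    variance = sum((p - mean) ** 2 for p in pnl)
    if variance == 0:
        return 0.0              # all trades identical: no measurable dependency
    covariance = sum((pnl[i] - mean) * (pnl[i + 1] - mean) for i in range(n - 1))
    return covariance / variance

# Hypothetical trade results in dollars.
example_trades = [120, -20, -15, -25, 140, -30, -10, -20, 110, -15]
print(round(lag1_autocorrelation(example_trades), 3))
```

A single number like this is only a first look; with a few hundred trades, small values can easily arise by chance.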
Since 1000 simulations is a sizeable number, the effects of serial correlations should be minimal. Nonetheless, if you feel that serial correlations are important, you may consider doing Monte Carlo simulations on equity curve segments instead, whereby each segment would preserve the series of trades in the original backtest.
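The segment idea can be sketched as a block shuffle: cut the trade list into consecutive segments, shuffle the segments, and keep the trade order inside each one. The Python below is a minimal sketch under that assumption; the function name and the segment length of 5 are hypothetical choices, not StrategyQuant settings.

```python
import random

def block_shuffle(pnl, block_size, rng):
    """Shuffle consecutive segments of trades, preserving order within each."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    blocks = [pnl[i:i + block_size] for i in range(0, len(pnl), block_size)]
    rng.shuffle(blocks)                      # reorder whole segments only
    return [p for block in blocks for p in block]

rng = random.Random(7)
trades = [120, -20, -15, -25, 140, -30, -10, -20, 110, -15]  # hypothetical
print(block_shuffle(trades, block_size=5, rng=rng))
```

A larger `block_size` preserves more of the original trade-to-trade pattern but leaves fewer distinct orderings, so the choice is a trade-off between the two.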
Market Returns Are Assumed to Be Normally Distributed
This assumption is used when computing the performance metrics at each confidence level. If your backtest contains a number of unusually large winners/losers, your metrics may be less accurate.
This can be mitigated if your backtest contains a large number of trades, but regardless, it is best to treat your Monte Carlo results as estimations. A $1000 drawdown at the 100% confidence level does not mean your future drawdowns will never exceed $1000; there is no computational method/mathematical model that can entirely replicate the sophistication of the markets.
Monte Carlo simulation randomizes your original backtest’s trade sequence, thereby creating multiple equity curves, each with a different maximum drawdown.
This allows a drawdown distribution to be generated, which can help you arrive at a conservative capital allocation for the strategy.
Alternatively, you can use Monte Carlo simulations to randomize prices and strategy parameters, thereby creating an out-of-sample test.
Through these three articles on robustness testing, I hope I have demystified some of the common test methods available in today’s commercial software. The importance of strategy robustness cannot be overstated. In today’s rapidly changing markets, robustness should be the foremost concern of every developer.
The final trading strategy can be downloaded here.
In the next stage of our development workflow, we will conclude our individual strategy development by running tick-precision backtests on them.
After the Monte Carlo test, I didn’t see you do a slippage test. I believe it is one of the harshest tests, so I think you should not skip it. Any comment? Thanks!
For trend following with a larger expectancy, a few pips of slippage won’t make much of a difference. Besides, I added 1 pip of constant slippage to the backtest.