Understanding your backtest report is an essential part of being a successful strategy developer.
Here I explain what the numbers mean, and how you can make use of each metric during strategy development.
Every automated forex trader has combed through an MT4 backtest report at some point. The optimizer reports, in particular, will likely occupy much of your attention.
Do you really need to pay attention to every metric?
My aging eyes say no. Surely some metrics are more important than others.
But before figuring out which metrics we should pay attention to, we first need to know what to look out for.
At the minimum, a good metric should satisfy the following two conditions:
1. It compares return to risk
Risk management is often a central focus of successful trading.
Drawdown is often used to quantify risk, and is defined as a peak-to-trough equity drop. Numerous measures of drawdown exist, the most common of which is the maximum drawdown obtained over the backtest period.
Considering the importance of this metric, MT4 provides the maximum drawdown in both dollar and percentage terms. We’ll go through the differences later.
2. It is normalized
A normalized metric allows convenient comparison across different markets and strategies.
Such metrics usually contain no units. Examples would be the Sharpe ratio and profit factor.
MT4 Backtest Performance Metrics
Net profit fails both of the above conditions.
Although profit and drawdown usually have an inverse relationship, some estimate of the strategy’s historical drawdown is required when evaluating your strategy’s live performance.
If you enter a prolonged drawdown, what are the chances that your strategy will recover? Is it broken?
If you consider profits in isolation, there is no way of answering these questions in real-time.
Backtests provide the benefit of hindsight. When you’re trading live, you need to trade through the drawdowns to realize the eventual profits.
While the total number of trades is not a performance metric per se, it is crucial for the statistical significance of your backtest. The more trades you have, the more reliable your backtest statistics will be. You can use the standard error to quantify the uncertainty arising from a small number of trades.
When selecting your trading timeframe and backtest period, the number of trades in your backtest should be a key consideration. Longer backtest periods are always favourable since they produce more trades and cover more market conditions.
As a first pass, I consider any backtest containing fewer than 250 trades to be highly suspicious.
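As a rough illustration of how the standard error shrinks with more trades, here is a minimal Python sketch. The per-trade profits are made-up numbers, and the function name is only for illustration; the standard error of the average trade is taken as the sample standard deviation divided by the square root of the number of trades.

```python
import math
import statistics


def expectancy_with_standard_error(trade_profits):
    """Return (mean profit per trade, standard error of that mean)."""
    n = len(trade_profits)
    if n < 2:
        raise ValueError("need at least two trades to estimate the spread")
    mean = statistics.mean(trade_profits)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    std_err = statistics.stdev(trade_profits) / math.sqrt(n)
    return mean, std_err


# Hypothetical per-trade results in dollars, repeated to mimic larger samples.
sample = [30.0, -20.0, 45.0, -25.0, 10.0]
for repeats in (4, 50):
    trades = sample * repeats
    mean, std_err = expectancy_with_standard_error(trades)
    print(f"{len(trades):4d} trades: mean = {mean:.2f}, standard error = {std_err:.2f}")
```

The average trade is the same in both cases, but the uncertainty around it is much smaller for the larger sample. If the standard error is comparable to the average trade itself, the backtest cannot really distinguish the strategy from one that breaks even.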
Profit factor is defined as the gross profits divided by gross losses.
The relative balance between profits and losses provides some indication of your strategy’s risk, but tells you nothing about the sequence of losing trades. Having a string of losses in tight succession could decimate your account.
The two equity curves below both have a profit factor of 1.6. The first only has a 10% drawdown, whereas the second has a 30% drawdown due to a losing streak early in the backtest.
Nonetheless, the profit factor is normalized and is a good choice when doing quick comparisons between strategies.
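To make the point about trade sequence concrete, here is a small Python sketch with two hypothetical trade lists. Both contain exactly the same wins and losses, so the profit factor is identical, but the order differs. The drawdown here is measured on closed-trade profit only, starting from zero, which is a simplification of what MT4 reports.

```python
def profit_factor(trade_profits):
    """Gross profits divided by gross losses."""
    gross_profit = sum(p for p in trade_profits if p > 0)
    gross_loss = -sum(p for p in trade_profits if p < 0)
    if gross_loss == 0:
        return float("inf")
    return gross_profit / gross_loss


def max_drawdown(trade_profits):
    """Largest peak-to-trough drop of the cumulative closed-trade profit."""
    equity = peak = worst = 0.0
    for p in trade_profits:
        equity += p
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


# Hypothetical trades: same wins and losses, different order.
alternating = [80.0, -50.0] * 4          # wins and losses interleaved
clustered = [-50.0] * 4 + [80.0] * 4     # all the losses come first

for name, trades in (("alternating", alternating), ("clustered", clustered)):
    print(name, "profit factor:", profit_factor(trades),
          "max drawdown:", max_drawdown(trades))
```

Both lists give a profit factor of 1.6, yet the clustered sequence has a drawdown four times larger.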
The expected payoff, also known as the expectancy or the average trade, is the net profit divided by the number of trades.
It suffers from the same drawbacks as net profit.
I do pay attention to the expected payoff though; it should comfortably cover transaction costs in the form of spread, slippage and commissions. A common development pitfall is to select short indicator lookback periods, leading to high trading frequency and a low expectancy.
If your backtest expectancy is only 1-2 pips, any unexpected increase in trading costs could turn a profitable strategy into a loser.
Check out the GBPNZD scalping strategy below. The backtest on the left only factors in a 2-pip spread, producing an expectancy of about 2.5 pips.
I then backtested with an additional 2-pip spread and a $7 commission per lot, which effectively destroyed the strategy.
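The effect of extra costs on a thin expectancy can be sketched in a few lines of Python. The pip value of $10 per standard lot is a placeholder assumption (the real figure for GBPNZD depends on the NZD exchange rate), and the $7 commission is assumed to be charged once per round-turn trade.

```python
def net_expectancy_pips(gross_expectancy_pips, extra_spread_pips=0.0,
                        commission_per_lot=0.0, pip_value_per_lot=10.0):
    """Expectancy per trade, in pips, after additional trading costs.

    pip_value_per_lot is an assumed dollar value of one pip for one lot;
    it is used only to convert the commission from dollars into pips.
    """
    commission_pips = commission_per_lot / pip_value_per_lot
    return gross_expectancy_pips - extra_spread_pips - commission_pips


# Roughly the scenario described above: 2.5 pips of expectancy,
# then 2 more pips of spread and a $7 commission per lot.
result = net_expectancy_pips(2.5, extra_spread_pips=2.0, commission_per_lot=7.0)
print(f"net expectancy: {result:.2f} pips")
```

Under these assumptions the expectancy turns slightly negative, which is consistent with the strategy falling apart once realistic costs are included.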
Drawdown ($) and Drawdown (%)
MT4’s backtest report displays the maximum drawdown, taking into account both closed-trade equity and floating profits/losses.
If you have a $500 drawdown from your previously closed trades, and your open positions have another $500 loss, MT4 will take your total drawdown to be $1000.
Sometimes you will see a green line hovering around the blue equity curve when inspecting your MT4 backtest report. That is the open-trade equity curve.
So should you use dollar or percentage drawdown?
This depends on whether you applied variable position sizing to your backtest. If you backtest with a fixed lot size, use dollar drawdown. If any sort of variable sizing scheme was applied, percentage drawdown will be more appropriate.
If you use percentage drawdown with a fixed lot backtest, your percentage drawdown values will shrink as your account balance grows, even when the dollar drawdown stays the same.
When I develop my strategies, I backtest exclusively with fixed lots, so dollar drawdown is more appropriate. After applying variable position sizing during portfolio composition, I switch over to percentage drawdown.
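Here is a minimal Python sketch of why percentage drawdown misleads on a fixed-lot backtest. The equity values are hypothetical, and the function is my own simplified calculation on a list of equity points, not a reproduction of MT4's internal computation.

```python
def max_drawdowns(equity_curve):
    """Return (max dollar drawdown, max percentage drawdown) of an equity curve."""
    peak = equity_curve[0]
    max_dd_dollars = 0.0
    max_dd_percent = 0.0
    for equity in equity_curve:
        peak = max(peak, equity)
        drop = peak - equity
        max_dd_dollars = max(max_dd_dollars, drop)
        if peak > 0:
            # Percentage is measured relative to the peak before the drop.
            max_dd_percent = max(max_dd_percent, 100.0 * drop / peak)
    return max_dd_dollars, max_dd_percent


# Hypothetical fixed-lot account: the same $1,000 drop, early and late.
early = [10_000, 9_000, 10_500]
late = [20_000, 19_000, 20_500]

print("early:", max_drawdowns(early))
print("late: ", max_drawdowns(late))
```

The dollar drawdown is $1,000 in both cases, but it registers as 10% on the smaller balance and only 5% on the larger one, even though the strategy behaved identically.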
While maximum drawdown is the most commonly available drawdown metric, it is not necessarily the most reliable, since it captures a single value over your entire backtest.
Alternative drawdown metrics, such as the average drawdown and the Ulcer Index, are discussed here.
What Metrics Should You Use?
None of the metrics above meets the two conditions when used in isolation.
I instead recommend using the Profit ($) / Drawdown ($) ratio, because it:
- Compares returns to risk
- Is normalized and allows easy comparison across strategies and markets. Volatile markets tend to produce larger profits and drawdowns; the normalization takes care of this.
The easiest way to get this ratio is to export your MT4 backtest report and run the computation in Excel.
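If you prefer a script to a spreadsheet, the same ratio can be sketched in Python. This assumes you have already extracted the per-trade profits from the report into a list; the numbers below are hypothetical. The drawdown is computed on closed trades only, so it may come out smaller than MT4's figure, which also counts floating losses.

```python
def profit_to_drawdown(trade_profits):
    """Net profit divided by the maximum closed-trade drawdown."""
    equity = peak = max_dd = 0.0
    for p in trade_profits:
        equity += p
        peak = max(peak, equity)
        max_dd = max(max_dd, peak - equity)
    if max_dd == 0:
        return float("inf")
    return equity / max_dd


# Hypothetical per-trade profits in dollars.
trades = [100.0, -40.0, 60.0, -80.0, 120.0, -30.0, 90.0]
# The same strategy on a market that moves three times as much.
volatile_trades = [3 * p for p in trades]

print(profit_to_drawdown(trades))
print(profit_to_drawdown(volatile_trades))
```

Both calls return the same ratio, which is the normalization at work: scaling every trade by the same factor scales profit and drawdown together.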
Can You Use More Than One Metric?
Of course you can. No metric is perfect, and adding a complementary metric can address some of its pitfalls.
For example, the profit/drawdown ratio is great for measuring risk-adjusted returns, but it doesn’t always paint a complete picture of the strategy’s profitability.
You may be lucky and get a high profit/drawdown because your backtest did not capture a lengthy string of losing trades. For example, the GBPNZD scalping strategy backtest above had a profit/drawdown of 6, but a profit factor of only 1.13.
One way of removing the effects of luck is to run a Monte Carlo simulation to randomize your backtest trade sequence.
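A bare-bones version of such a simulation might look like the Python sketch below. It simply shuffles the order of the backtest trades many times and records the maximum drawdown of each shuffled sequence. The trade list is hypothetical, and a fixed random seed is used only to make the run repeatable. Shuffling assumes the trades are independent of each other, which is not always true.

```python
import random


def max_drawdown(trade_profits):
    """Largest peak-to-trough drop of the cumulative closed-trade profit."""
    equity = peak = worst = 0.0
    for p in trade_profits:
        equity += p
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def monte_carlo_drawdowns(trade_profits, runs=1000, seed=0):
    """Max drawdown of `runs` random reorderings of the trades, sorted ascending."""
    rng = random.Random(seed)
    shuffled = list(trade_profits)  # work on a copy, leave the input untouched
    results = []
    for _ in range(runs):
        rng.shuffle(shuffled)
        results.append(max_drawdown(shuffled))
    return sorted(results)


# Hypothetical backtest: 20 wins of $80 and 20 losses of $50.
trades = [80.0, -50.0] * 20
drawdowns = monte_carlo_drawdowns(trades)

median = drawdowns[len(drawdowns) // 2]
p95 = drawdowns[int(0.95 * (len(drawdowns) - 1))]
print(f"backtest order: {max_drawdown(trades):.0f}")
print(f"median: {median:.0f}, 95th percentile: {p95:.0f}")
```

Net profit is the same for every reordering, so dividing it by the median or 95th percentile drawdown gives a profit/drawdown ratio that depends less on the one lucky sequence your backtest happened to produce.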
Alternatively, you can create a composite metric consisting of the profit/drawdown and profit factor.
Some manipulation is required since the metrics are measuring different quantities.
This can be done by normalizing each metric between the values of 0 and 1. I will illustrate this using the optimizer report above.
- For each metric of interest, obtain the minimum and maximum value from the backtests.
- The highest metric will be normalized to 1, while the lowest will be normalized to 0, using the formula: normalized value = (value - minimum) / (maximum - minimum)
- After repeating the above for all metrics of interest, sum up all the normalized values to get your composite metric.
If you use a metric where a smaller value reflects better performance (stagnation is an example), you’ll need to amend the formula above to: normalized value = (maximum - value) / (maximum - minimum)
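The normalization steps can be sketched in Python as follows. The metric values are hypothetical stand-ins for the rows of an optimizer report, and the function name and `higher_is_better` flag are my own choices for illustration.

```python
def min_max_normalize(values, higher_is_better=True):
    """Scale values to the 0-1 range so that 1 is always the best result."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All backtests tie on this metric, so it cannot rank them.
        return [0.0 for _ in values]
    if higher_is_better:
        return [(v - lo) / (hi - lo) for v in values]
    # For metrics such as stagnation, the smallest value should score 1.
    return [(hi - v) / (hi - lo) for v in values]


# Hypothetical optimizer results, one entry per backtest.
profit_drawdown = [6.0, 3.5, 2.0, 4.0]
profit_factor = [1.13, 1.60, 1.25, 1.45]

composite = [
    a + b
    for a, b in zip(min_max_normalize(profit_drawdown),
                    min_max_normalize(profit_factor))
]
best = max(range(len(composite)), key=composite.__getitem__)
print("composite scores:", [round(c, 3) for c in composite])
print("best backtest index:", best)
```

In this made-up example, the first backtest has the highest profit/drawdown but the weakest profit factor, so the composite score favours the second backtest instead.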
MT4 provides a decent selection of performance metrics, but I recommend exporting your report to Excel and computing the profit/drawdown.
This ratio provides a better assessment of your strategy’s risk-adjusted performance. I frequently use this metric during development, along with the number of trades in the backtest.
Once you have finished optimizing your strategy, you can load your MT4 backtest report into QuantAnalyzer. You will have a larger selection of metrics, including stagnation and the Sharpe ratio.
What are your favourite performance metrics? Let me know in the comments below!