How Many Trades Should Your Backtest Have?

Feb 9, 2021

We all want a large sample of trades in our backtests, but practical limitations such as data availability often get in the way.
Here I’ll explain why 30 trades is insufficient, and how you can use standard error to quantify the uncertainty arising from a small sample size.

Browsing through the MQL5 Marketplace is a fun way to discover the many types of trading algorithms in existence.

I have seen backtests containing anywhere from 5 to 5000 trades.

So how many trades (or sample size) does your backtest really need?

And if you suspect your backtest has insufficient trades, can you still make use of its results?

The Importance of Sample Size

A large number of trades increases the statistical significance of your backtest results.

In essence, this means you can be more confident that your results reflect your strategy’s true performance, rather than chance.

Since your backtest is the crucial ‘scorecard’ that accompanies your strategy from inception to live trading, you need to be sure you can trust it.

In addition, targeting a larger sample size often forces you to backtest your strategy over a longer historical period. This lets you evaluate your strategy over different market conditions, and gives you a better idea of its robustness.

Are 30 Trades Enough?

When it comes to statistical significance, the number 30 gets plenty of attention.

When you backtest your strategy, you are attempting to characterize its probability distribution, as statisticians like to say.

30 trades is usually sufficient if you’re trying to verify a distribution you have already characterized.

For example, you have a basket of 30 live trades, and you want to see how these compare to your backtest performance.

You could use a Student’s t-test or a chi-square test to check whether both sets of trades are consistent with the same distribution. I demonstrate the use of these simple statistical tests during the forward testing phase of strategy development.
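As a rough illustration of such a comparison, the sketch below computes Welch's t-statistic (a variant of the t-test that does not assume equal variances) for two sets of trade results. The profit figures and the function name are hypothetical and shortened for readability. The sketch stops at the t-statistic; a proper p-value needs the t-distribution, which a statistics package can supply.

```python
import math
import statistics

def welch_t_statistic(sample_a, sample_b):
    """Welch's t-statistic for the difference between two sample means."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance uses the sample (n - 1) formula
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    se_diff = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / se_diff

# Hypothetical per-trade profits in dollars (not taken from any real backtest)
backtest_profits = [12, -8, 25, -10, 18, 30, -15, 22, -5, 9, 14, -12]
live_profits = [10, -9, 20, -14, 16, -6, 11, -11]

t_stat = welch_t_statistic(backtest_profits, live_profits)
print(f"Welch t-statistic: {t_stat:.2f}")
```

As a rule of thumb, a t-statistic far from zero (beyond roughly ±2) hints that the two averages differ by more than chance alone would explain.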

However, if it’s your first attempt at characterizing the distribution, 30 trades is woefully insufficient.

I’ll illustrate this with a common analogy in probability theory.

The Sock Analogy

Imagine you receive a large barrel of socks with the following label: 50% black, 50% white socks.

You start drawing socks from the barrel, one at a time. After 30 draws, you have 17 black socks and 13 white socks.

You conclude that the label is correct.

Suppose you now receive an unlabelled barrel of socks, and you have absolutely no idea what’s inside.

Would 30 draws allow you to confidently describe the contents of the barrel? Probably not!
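To put rough numbers on this, the sketch below uses the binomial formula to ask how likely a 17 black / 13 white result is for barrels with different true compositions. It assumes the barrel is large enough that each draw is effectively independent; the function name is my own.

```python
from math import comb

def prob_black_count(n_draws, n_black, p_black):
    """Binomial probability of exactly n_black black socks in n_draws draws."""
    return comb(n_draws, n_black) * p_black**n_black * (1 - p_black)**(n_draws - n_black)

# How likely is a 17 black / 13 white result for different barrels?
for p in (0.4, 0.5, 0.6, 0.7):
    likelihood = prob_black_count(30, 17, p)
    print(f"{p:.0%} black barrel -> P(17 black in 30 draws) = {likelihood:.3f}")
```

A 60% black barrel produces the 17/13 result at least as readily as a 50% black one, so 30 draws cannot tell the two apart.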

Your strategy’s average profit/loss, win rate, stagnation etc., are all important metrics that your backtest should tell you.

It’s quite impossible to characterize a whole bunch of metrics with such a small sample size. Just look at the wall of socks below!

So How Many Trades? The Short Answer…

The more the merrier.

Obviously this answer is not particularly useful or actionable. You could face practical limitations regarding sample size for the following reasons:

  • It is difficult to find quality historical data before the year 2000
  • You trade on the higher timeframes
  • You allocate some data for out-of-sample testing

It could be a mistake to discard a promising strategy simply because it has too few trades. After all, successful trading is about making the right trade-offs.

We need a way to quantify the deterioration of backtest results arising from a small sample size.

Fortunately, statisticians have solved this dilemma for us, using a concept called standard error.

I’ll explain standard error below, then demonstrate its application using two very different backtests.

What is Standard Error?

Standard error measures the accuracy of your sampling process. It helps you gauge how reliable your backtest results are.

You can apply the standard error to any statistic, but for our trading purposes, we’ll use the mathematical expectancy (also called the average trade or the mean) of the backtest.

A trader backtesting a strategy is like a statistician sampling a population to determine some underlying parameter. To better understand what standard error means, let’s first discuss how it is used in statistics.

A Short Statistics Excursion

Imagine you want to determine the average height of 30-year-old males in a country. This height would be the parameter of interest. Its true value is unknown, but you hope to get a good estimate of it through the sampling process.

So you go about collecting 5 samples of data throughout the country, each consisting of 10 data points. You get the following plot:

To get the standard error, you do the following:

  1. Compute the mean height from each of the 5 samples
  2. Compute the standard deviation of these 5 means. Excel’s STDEV function can help with this.

The standard error is equal to the standard deviation of these sample means.

Standard error measures the sample-to-sample variability of the means, and tells you how far a sample mean is likely to deviate from the true population mean. The smaller the standard error, the more representative the sample is likely to be of the overall population.
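The two steps above can be sketched in a few lines of Python. The height data here is simulated: the population mean of 175 cm and standard deviation of 7 cm are assumptions chosen purely for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Simulated data: 5 samples of 10 heights (cm) each, drawn from an assumed
# population with mean 175 cm and standard deviation 7 cm.
samples = [[rng.gauss(175, 7) for _ in range(10)] for _ in range(5)]

# Step 1: compute the mean height of each sample
sample_means = [statistics.mean(sample) for sample in samples]

# Step 2: standard deviation of the 5 means (sample formula, like Excel's STDEV)
standard_error = statistics.stdev(sample_means)

print("Sample means:", [round(m, 1) for m in sample_means])
print("Standard error:", round(standard_error, 2))
```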

Coming back to our trading context, a small standard error means our backtest expectancy will be close to the ‘true’ value that we would obtain if we could backtest over infinite data.

How Do We Calculate Standard Error?

In trading, we don’t have the luxury of having multiple samples as shown above. We only have one, and that’s our backtest.

Fortunately, standard error can be estimated using the simple formula:

Standard error = standard deviation of individual trade profits / √(number of trades)

The more trades your backtest has, the smaller the standard error. 
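The estimate divides the standard deviation of the trade results by the square root of the number of trades, so the standard error shrinks slowly: quadrupling the trade count only halves it. A small sketch, using an assumed standard deviation of $100 per trade and a hypothetical helper name:

```python
import math

def standard_error_from_sd(std_dev, n_trades):
    """Standard error of the mean, given a standard deviation and a trade count."""
    return std_dev / math.sqrt(n_trades)

ASSUMED_STD_DEV = 100.0  # hypothetical per-trade standard deviation in dollars

for n_trades in (30, 290, 1000, 5000):
    se = standard_error_from_sd(ASSUMED_STD_DEV, n_trades)
    print(f"{n_trades:>5} trades -> standard error = ${se:.2f}")
```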

To get the standard deviation value, you will need the profit/loss from each individual trade in your backtest. You can easily get this value by exporting your MT4 backtest report to Excel.

  1. Apply Excel’s data filter function to the Type column. Remove all rows that have empty cells in the Profit column.
  2. Use Excel’s STDEV function to calculate the standard deviation of the individual trade profits.
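If you prefer a script to a spreadsheet, the same calculation can be sketched in Python. The list of trade profits below is hypothetical; in practice it would hold the Profit column from your exported report. `statistics.stdev` uses the sample (n − 1) formula, as Excel's STDEV does.

```python
import math
import statistics

def standard_error(trade_profits):
    """Standard error of the average trade: sample std dev / sqrt(number of trades)."""
    n_trades = len(trade_profits)
    if n_trades < 2:
        raise ValueError("need at least two trades to estimate a standard deviation")
    return statistics.stdev(trade_profits) / math.sqrt(n_trades)

# Hypothetical per-trade profits in dollars
profits = [10, -5, 20, -10, 15]

expectancy = statistics.mean(profits)
print(f"Expectancy: ${expectancy:.2f}")
print(f"Standard error: ${standard_error(profits):.2f}")
```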

How Do We Use Standard Error?

Thanks to the central limit theorem, it is usually safe to assume that your backtest’s expectancy (the average of many trades) is approximately normally distributed, even when the individual profits/losses are not. In other words, it follows the famous ‘bell curve’ shown below:

In a normal distribution,

  • 68.3% of values lie within ±1 standard deviation of the mean
  • 95.4% of values lie within ±2 standard deviations of the mean
  • 99.7% of values lie within ±3 standard deviations of the mean

If you have a backtest with an expectancy of $100 and a standard error of $20, you can make use of the above information to estimate the following:

  • You can be 68.3% confident that your strategy’s true expectancy lies between $80 and $120 (±1 standard error)
  • You can be 95.4% confident that your strategy’s true expectancy lies between $60 and $140 (±2 standard errors)
  • You can be 99.7% confident that your strategy’s true expectancy lies between $40 and $160 (±3 standard errors)
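The three ranges above are simply the expectancy plus or minus a multiple of the standard error. A minimal sketch that reproduces them (the function name is my own):

```python
def expectancy_interval(expectancy, standard_error, n_standard_errors):
    """Return (low, high) bounds: expectancy +/- n_standard_errors * standard_error."""
    margin = n_standard_errors * standard_error
    return (expectancy - margin, expectancy + margin)

# Expectancy of $100 and standard error of $20, as in the example above
for k, confidence in ((1, "68.3%"), (2, "95.4%"), (3, "99.7%")):
    low, high = expectancy_interval(100, 20, k)
    print(f"{confidence} confident: ${low:.0f} to ${high:.0f}")
```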

That’s standard error in a nutshell for you. Some statistical rigour was sacrificed to arrive at the statements above, but trading is a practical moneymaking endeavour, not an academic exercise.

Examples of Standard Error Application

I’ll use two backtests to demonstrate the application of standard error. Both are from trend following strategies trading 0.1 lots throughout the backtest.

The first strategy trades on the 15-minute timeframe.

The second strategy trades on the 4-hour timeframe.

At first glance, the second strategy seems far more promising. Expectancy and profit factor are significantly higher, although the backtest only has 290 trades. Let’s see if that causes problems.

I did the following for each strategy:

  1. Compute the standard error using the procedure described above
  2. Compute the expectancy ± 2*standard errors

Fortunately, both backtests still yield a positive expectancy after subtracting 2 standard errors.

But look at how similar their ‘adjusted’ expectancies are. Standard error has exposed the backtest uncertainty arising from a small sample of trades for the H4 strategy.
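The effect can be reproduced with made-up numbers. The figures below are hypothetical and are not the results of the two backtests discussed here; only the 290-trade count echoes the H4 example. They are chosen to show how a higher raw expectancy can shrink to roughly the same ‘adjusted’ value once 2 standard errors are subtracted.

```python
import math

def adjusted_expectancy(expectancy, std_dev, n_trades, n_standard_errors=2):
    """Expectancy minus a multiple of its standard error (std_dev / sqrt(n_trades))."""
    standard_error = std_dev / math.sqrt(n_trades)
    return expectancy - n_standard_errors * standard_error

# Hypothetical backtests: (label, expectancy $, per-trade std dev $, number of trades)
backtests = [
    ("M15 strategy", 5.0, 60.0, 3000),
    ("H4 strategy", 25.0, 190.0, 290),
]

adjusted = {}
for label, expectancy, std_dev, n_trades in backtests:
    adjusted[label] = adjusted_expectancy(expectancy, std_dev, n_trades)
    print(f"{label}: raw ${expectancy:.2f}, adjusted ${adjusted[label]:.2f}")
```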

This is an unfortunate reality of trading on the higher timeframes. Such strategies typically contain larger wins and losses, giving a larger standard deviation of individual trade results. Compound this with the smaller number of trades, and you get a standard error that can really erode your backtest expectancy.

Suppose you’re trading a daily trend following strategy that was developed on a small sample of trades, and it starts underperforming. Market conditions could have changed, or perhaps your backtest uncertainty is playing out in real time.

Wrapping Up

A large sample of trades minimizes the effects of luck, and helps ensure you’re discovering the true performance profile of your strategy.

Rather than establish a minimum acceptable number of trades, consider using the standard error to quantify your backtest’s uncertainty arising from a small sample of trades.

If your backtest’s expectancy is still positive after subtracting twice the standard error, it’s likely your strategy will be profitable over the long term.

Unfortunately for part-time retail traders, backtests of higher-timeframe strategies often contain large standard errors. Trading on the lower timeframes could alleviate this. (Hint: Algorithmic trading will come in handy.)

How many trades do you like to see in your backtest? Let me know in the comments!

Comments

13 Comments

  1. John Gimenez

    Hello,

    Great article and work! I am trying to figure out if I followed correctly your instructions to calculate your application of standard error on my backtest and live test.
    BACKTEST (10 years): 893 trades, standard error(standard deviation 4.057/893 = 0.004, payoff ratio 1.53
    Expectancy -3*SE: 1.53 – 0.012 = 1.38
    LIVE TEST (3 months): 30 trades, standard error(standard deviation 3.15/30 = 0.105, payoff ratio 1.50
    Expectancy -3*SE: 1.50 – 0.315 = 1.18
    Did I do this right? Does it look like a good system?
    Thanks a lot for your valuable feedback.

    • Wayne

      Hi John! Glad you enjoyed the article. Here’s my 2 cents:

      1) To find the standard error, you need to divide the standard deviation by the square root of the number of trades (not simply the number of trades). For your backtest example above, you should divide by square root of 893, giving you a standard error of 0.14.

      2) I recommend applying standard error to your backtest results, as a way to determine how reliable your backtest statistics are. If you do this for your live results, you will usually get a very large standard error because of the small number of trades. If you want to verify whether your live results match your backtest, you can apply simple statistical tests like the t-test or chi-square test. I detail the steps in my Incubation article: https://tradingtact.com/forward-testing/

      3) Is it a good system? Using your backtest results, Expectancy – 3*SE = 1.53 – 0.41 = 1.12. Looks decent to me. If your live results match your backtest results, you should be doing good.

      On a side note, I don’t think payoff ratio = expectancy. Payoff ratio is the average winner/average loser.

      Hope that helped!

  2. Joseph Multon

    What if you do not know the expected value of your strategy and that is what you are trying to find out? Is it just what you think you should be getting per trade?

    • Wayne

      Hello Joseph,

      The expected value (Net profit/# trades) is a very common metric and I believe every backtest engine should compute it for you. In MT4, it is called the expected payoff.

      I wouldn’t rely on gut feeling because expectations often differ from reality in trading.

  3. Rajiv

    Hey Wayne, Awesome work. Thanks a lot. It was really helpful in solving the dilemma. I wanted to know if I have only 25 trades in backtest as I trade on weekly charts, is this standard error metric still helpful.
    Thanks

    • Wayne

      Hi Rajiv, standard error would still be helpful. Such a small sample size would produce a large standard error, which illustrates the high level of uncertainty in your backtest.

  4. Rajiv

    Thanks a lot Wayne for not only giving a good content but also responding so promptly…I would like to take your help in future for my system developments and also will be happy to spread a good word 🙂

    I tested almost 290 markets on 20-30 years data yet most of them are producing only 20-80 trades because of system parameters being designed for longer holding time or other reasons like data being available from 2021 such a case of Robinhood

    These are my results on 2H timeframe, However I am sticking to your logic of being positive after Expectance-2*SE
    Example: Trades 35/PF 3.2/GtPR 2.2/Expectancy 2.17%/SE 0.91%
    2.17%- 1.82%( 0.91 SE*2)=0.35%

    Pls advise
    Thanks

    • Wayne

      I think it would be more appropriate to use $ for expectancy. If it’s still positive after subtracting 2SE, you should be good to go.

      • Pankaj

        Thanks a lot for your help, Wayne.🙏

  5. David

    Hello Wayne,

    Thank you for all the useful content and value that you are providing.

    I would like to ask:
    – do you include commision+swap when calculating expectancy and STDEV of P/L ? And what is your reasoning behind it ?

    Thanks for the answer

    • Wayne

      Hi David,
      I always include spread, slippage and commissions when backtesting. These are unavoidable costs of trading, so it’s best to include them always.
      Strategyquant does not have a swap input, so I don’t include that for now. I am quite conservative with my spread and slippage (2 pip spread and 1 pip slippage for liquid ECN markets), so that usually compensates for the lack of swap.

  6. Anonymous

    this is a tricky topic to cover. good attempt. i find that higher trade numbers as you say come with intraday strategies and so particulary in fx dont always mean its robust, but having said this id look for around 1000. then i like to see them also perform on other markets, which really gives you an indication of robustness as i have some strategies giving 5000 trades when looking at a basket of assets.

    • Wayne

      thanks for your input!

