
Feb 9, 2021

###### Here I’ll explain why 30 trades is insufficient, and how you can use standard error to quantify the uncertainty arising from a small sample size.

Browsing through the MQL5 Marketplace is a fun way to discover the many types of trading algorithms in existence.

I have seen backtests containing anywhere from 5 to 5000 trades.

So how many trades (or sample size) does your backtest really need?

And if you suspect your backtest has insufficient trades, can you still make use of its results?

# The Importance of Sample Size

A large number of trades increases the statistical significance of your backtest results.

In essence, this means you can be confident that your results are a true reflection of your strategy’s performance, and are not due to chance.

Since your backtest is the crucial ‘scorecard’ that accompanies your strategy from inception to live trading, you need to be sure you can trust it.

In addition, targeting a larger sample size often forces you to backtest your strategy over a longer historical period. This lets you evaluate your strategy over different market conditions, and gives you a better idea of its robustness.

When it comes to statistical significance, the number 30 gets plenty of attention.

When you backtest your strategy, you are attempting to characterize its probability distribution, as statisticians like to say.

30 trades is usually sufficient if you’re trying to verify a distribution you have already characterized.

For example, you have a basket of 30 live trades, and you want to see how these compare to your backtest performance.

You could use a Student’s t-test or a chi-square test to verify that both sets of trades come from the same distribution. I demonstrate the use of these simple statistical tests during the forward testing phase of strategy development.

However, if it’s your first attempt at characterizing the distribution, 30 trades is woefully insufficient.

I’ll illustrate this with a common analogy in probability theory.

## The Sock Analogy

Imagine you receive a large barrel of socks with the following label: 50% black, 50% white socks.

You start drawing socks from the barrel, one at a time. After 30 draws, you have 17 black socks and 13 white socks.

You conclude that the label is correct.

Suppose you now receive an unlabelled barrel of socks, and you have no idea what’s inside.

Would 30 draws allow you to confidently describe the contents of the barrel? Probably not!
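To see why 30 draws say so little about an unlabelled barrel, here is a minimal sketch that asks how likely the same 17 black / 13 white result would be under several different barrel compositions. The candidate compositions are hypothetical, and the sketch assumes each draw is independent (reasonable when the barrel is large compared to the 30 socks drawn).

```python
from math import comb


def prob_black_count(n_draws, n_black, p_black):
    """Binomial probability of exactly n_black black socks in n_draws draws."""
    n_white = n_draws - n_black
    return comb(n_draws, n_black) * p_black**n_black * (1 - p_black) ** n_white


# Hypothetical barrel compositions (fraction of black socks), for illustration only
candidate_fractions = [0.4, 0.5, 0.6, 0.7]

# Probability of seeing 17 black socks in 30 draws under each composition
likelihoods = {p: prob_black_count(30, 17, p) for p in candidate_fractions}

for p, prob in likelihoods.items():
    print(f"{p:.0%} black barrel -> P(17 black in 30 draws) = {prob:.3f}")
```

Under these assumptions, a 60% black barrel produces the 17/13 split slightly more often than a 50% black barrel does, so the 30 draws cannot tell the two apart.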

Your strategy’s average profit/loss, win rate, stagnation etc., are all important metrics that your backtest should tell you.

It’s quite impossible to characterize a whole bunch of metrics with such a small sample size. Just look at the wall of socks below!

So how many trades does your backtest need? The more the merrier.

Obviously this answer is not particularly useful or actionable. You could face practical limitations regarding sample size for the following reasons:

• It is difficult to find quality historical data before the year 2000
• You trade on the higher timeframes
• You allocate some data for out-of-sample testing

We need a way to quantify the deterioration of backtest results arising from a small sample size.

Fortunately, statisticians have solved this dilemma for us, using a concept called standard error.

I’ll explain standard error below, then demonstrate its application using two very different backtests.

# What is Standard Error?

Standard error measures how precisely a statistic computed from a sample estimates the true underlying value. It helps you gauge how reliable your backtest results are.

You can apply the standard error to any statistic, but for our trading purposes, we’ll use the mathematical expectancy (also called the average trade or the mean) of the backtest.

A trader backtesting a strategy is like a statistician sampling a population to determine some underlying parameter. To better understand what standard error means, let’s first discuss how it is used in statistics.

## A Short Statistics Excursion

Imagine you want to determine the average height of 30-year-old males in a country. This height would be the parameter of interest. Its true value is unknown, but you hope to get a good estimate of it through the sampling process.

So you go about collecting 5 samples of data throughout the country, each consisting of 10 data points. You get the following plot:

To get the standard error, you do the following:

1. Compute the mean height from each of the 5 samples
2. Compute the standard deviation of these 5 means. Excel’s STDEV function can help with this.

The standard error is equal to the standard deviation of these sample means.

Standard error measures the sample-to-sample variability of the means, and tells you how far a sample mean is likely to deviate from the true population mean. The smaller the standard error, the more closely a sample mean is likely to match the population mean.
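The two-step procedure for the height example can be sketched in a few lines of Python. The population mean and spread used to generate the data (175 cm and 7 cm) are assumed values for illustration only, not real survey figures.

```python
import random
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of heights in cm (assumed values, not real data)
TRUE_MEAN_CM = 175.0
TRUE_STD_CM = 7.0

# Collect 5 samples of 10 data points each
samples = [
    [rng.gauss(TRUE_MEAN_CM, TRUE_STD_CM) for _ in range(10)]
    for _ in range(5)
]

# Step 1: compute the mean height of each sample
sample_means = [mean(sample) for sample in samples]

# Step 2: standard deviation of the 5 means (sample standard deviation,
# the same quantity Excel's STDEV function returns)
standard_error = stdev(sample_means)

print("Sample means:", [round(m, 1) for m in sample_means])
print("Standard error:", round(standard_error, 2))
```

With only 5 sample means, this estimate of the standard error is itself quite noisy, so expect it to change noticeably if you change the seed.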

Coming back to our trading context, a small standard error means our backtest expectancy will be close to the ‘true’ value that we would obtain if we could backtest over infinite data.

## How Do We Calculate Standard Error?

In trading, we don’t have the luxury of having multiple samples as shown above. We only have one, and that’s our backtest.

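Fortunately, standard error can be estimated using the simple formula:

$$\text{Standard error} = \frac{\text{standard deviation of individual trade profits}}{\sqrt{\text{number of trades}}}$$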

To get the standard deviation value, you will need the profit/loss from each individual trade in your backtest. You can easily get this value by exporting your MT4 backtest report to Excel.

1. Apply Excel’s data filter function to the Type column. Remove all rows that have empty cells in the Profit column.
2. Use Excel’s STDEV function to calculate the standard deviation of the individual trade profits.
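If you prefer a script to a spreadsheet, the same calculation can be sketched in Python: take the standard deviation of the individual trade profits and divide it by the square root of the number of trades. The trade profits below are made-up numbers, and `backtest_standard_error` is a hypothetical helper name, not part of MT4 or any library.

```python
from math import sqrt
from statistics import mean, stdev


def backtest_standard_error(trade_profits):
    """Standard error of the average trade: sample std dev / sqrt(number of trades)."""
    n_trades = len(trade_profits)
    if n_trades < 2:
        raise ValueError("need at least 2 trades to compute a standard deviation")
    # statistics.stdev is the sample standard deviation, like Excel's STDEV
    return stdev(trade_profits) / sqrt(n_trades)


# Hypothetical profit/loss per trade in dollars (illustration only)
trade_profits = [120.0, -45.0, 80.0, -60.0, 200.0, -30.0, 15.0, -75.0, 140.0, -20.0]

expectancy = mean(trade_profits)  # average trade
standard_error = backtest_standard_error(trade_profits)

print(f"Trades: {len(trade_profits)}")
print(f"Expectancy: {expectancy:.2f}")
print(f"Standard error: {standard_error:.2f}")
```

In practice you would replace the hard-coded list with the Profit column exported from your backtest report.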

# How Do We Use Standard Error?

Thanks to the central limit theorem, it is usually safe to assume that the average trade of your backtest (the sample mean) is approximately normally distributed, even if the individual profits/losses are not. In other words, the expectancy follows the famous ‘bell curve’ shown below:

In a normal distribution,

• 68.3% of values lie within ±1 standard deviation of the mean
• 95.4% of values lie within ±2 standard deviations of the mean
• 99.7% of values lie within ±3 standard deviations of the mean

If you have a backtest with an expectancy of \$100 and a standard error of \$20, you can make use of the above information to estimate the following:

• You can be 68.3% confident that your strategy’s true expectancy lies between \$80 and \$120 (±1 standard error)
• You can be 95.4% confident that your strategy’s true expectancy lies between \$60 and \$140 (±2 standard errors)
• You can be 99.7% confident that your strategy’s true expectancy lies between \$40 and \$160 (±3 standard errors)
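The three ranges above are simple arithmetic, and the coverage percentages come from the normal distribution. Here is a minimal sketch that reproduces both, using the \$100 expectancy and \$20 standard error from the example; the function names are hypothetical.

```python
from math import erf, sqrt


def normal_coverage(k):
    """Fraction of a normal distribution within ±k standard deviations of the mean."""
    return erf(k / sqrt(2))


def expectancy_band(expectancy, standard_error, k):
    """Lower and upper bounds of expectancy ± k standard errors."""
    return expectancy - k * standard_error, expectancy + k * standard_error


expectancy = 100.0      # backtest expectancy in dollars (from the example)
standard_error = 20.0   # standard error in dollars (from the example)

bands = {k: expectancy_band(expectancy, standard_error, k) for k in (1, 2, 3)}

for k, (low, high) in bands.items():
    coverage = normal_coverage(k)
    print(f"±{k} SE: {low:.0f} to {high:.0f} dollars, coverage {coverage:.1%}")
```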

That’s standard error in a nutshell for you. Some statistical rigour was sacrificed to arrive at the statements above, but trading is a practical moneymaking endeavour, not an academic exercise.

# Examples of Standard Error Application

I’ll use two backtests to demonstrate the application of standard error. Both are from trend following strategies trading 0.1 lots throughout the backtest.

The first strategy trades on the 15-minute timeframe.

The second strategy trades on the 4-hour timeframe.

At first glance, the second strategy seems far more promising. Expectancy and profit factor are significantly higher, although the backtest only has 290 trades. Let’s see if that causes problems.

I did the following for each strategy:

1. Compute the standard error using the procedure described above
2. Compute the expectancy ± 2*standard errors

Fortunately, both backtests still yield a positive expectancy after subtracting 2 standard errors.

But look at how similar their ‘adjusted’ expectancies are. Standard error has exposed the backtest uncertainty arising from a small sample of trades for the H4 strategy.

This is an unfortunate reality of trading on the higher timeframes. Such strategies typically contain larger wins and losses, giving a larger standard deviation of individual trade results. Compound this with the smaller number of trades, and you get a standard error that can really erode your backtest expectancy.
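To make the effect concrete, here is a small sketch comparing two backtests. All of the numbers are hypothetical and are NOT the results of the two strategies discussed in this article; they were chosen only to show how a larger standard deviation combined with fewer trades inflates the standard error.

```python
from math import sqrt


def adjusted_expectancy(expectancy, trade_std, n_trades, k=2.0):
    """Return (expectancy - k * standard error, standard error)."""
    standard_error = trade_std / sqrt(n_trades)
    return expectancy - k * standard_error, standard_error


# Hypothetical lower-timeframe backtest: small average trade, many trades
m15_adjusted, m15_se = adjusted_expectancy(expectancy=5.0, trade_std=40.0, n_trades=1600)

# Hypothetical higher-timeframe backtest: large average trade, large swings, few trades
h4_adjusted, h4_se = adjusted_expectancy(expectancy=30.0, trade_std=230.0, n_trades=290)

print(f"M15: SE = {m15_se:.2f}, expectancy - 2*SE = {m15_adjusted:.2f}")
print(f"H4:  SE = {h4_se:.2f}, expectancy - 2*SE = {h4_adjusted:.2f}")
```

In this made-up case the higher-timeframe backtest starts with six times the expectancy, yet ends up with roughly the same adjusted figure once two standard errors are subtracted.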

Suppose you’re trading a daily trend following strategy that was developed on a small sample of trades, and it starts underperforming. Market conditions could have changed, or perhaps your backtest uncertainty is playing out in real time.

# Wrapping Up

A large sample of trades minimizes the effects of luck, and helps ensure you’re discovering the true performance profile of your strategy.

Rather than establish a minimum acceptable number of trades, consider using the standard error to quantify your backtest’s uncertainty arising from a small sample of trades.

If your backtest’s expectancy is still positive after subtracting twice the standard error, it’s likely your strategy will be profitable over the long term.

Unfortunately for part-time retail traders, backtests of higher-timeframe strategies often contain large standard errors. Trading on the lower timeframes could alleviate this. (Hint: Algorithmic trading will come in handy.)

How many trades do you like to see in your backtest? Let me know in the comments!


1. Hello,

Great article and work! I am trying to figure out if I followed correctly your instructions to calculate your application of standard error on my backtest and live test.
BACKTEST (10 years): 893 trades, standard error(standard deviation 4.057/893 = 0.004, payoff ratio 1.53
Expectancy -3*SE: 1.53 – 0.012 = 1.38
LIVE TEST (3 months): 30 trades, standard error(standard deviation 3.15/30 = 0.105, payoff ratio 1.50
Expectancy -3*SE: 1.50 – 0.315 = 1.18
Did I do this right? Does it look like a good system?
Thanks a lot for your valuable feedback.

• Hi John! Glad you enjoyed the article. Here’s my 2 cents:

1) To find the standard error, you need to divide the standard deviation by the square root of the number of trades (not simply the number of trades). For your backtest example above, you should divide by square root of 893, giving you a standard error of 0.14.

2) I recommend applying standard error to your backtest results, as a way to determine how reliable your backtest statistics are. If you do this for your live results, you will usually get a very large standard error because of the small number of trades. If you want to verify whether your live results match your backtest, you can apply simple statistical tests like the t-test or chi-square test. I detail the steps in my Incubation article: https://tradingtact.com/forward-testing/

3) Is it a good system? Using your backtest results, Expectancy – 3*SE = 1.53 – 0.41 = 1.12. Looks decent to me. If your live results match your backtest results, you should be doing good.

On a side note, I don’t think payoff ratio = expectancy. Payoff ratio is the average winner/average loser.

Hope that helped!

2. What if you do not know the expected value of your strategy and that is what you are trying to find out? Is it just what you think you should be getting per trade?

• Hello Joseph,

The expected value (Net profit/# trades) is a very common metric and I believe every backtest engine should compute it for you. In MT4, it is called the expected payoff.

I wouldn’t rely on gut feeling because expectations often differ from reality in trading.

3. Hey Wayne, Awesome work. Thanks a lot. It was really helpful in solving the dilemma. I wanted to know if I have only 25 trades in backtest as I trade on weekly charts, is this standard error metric still helpful.
Thanks

• Hi Rajiv, standard error would still be helpful. Such a small sample size would produce a large standard error, which illustrates the high level of uncertainty in your backtest.

4. Thanks a lot Wayne for not only giving a good content but also responding so promptly…I would like to take your help in future for my system developments and also will be happy to spread a good word 🙂

I tested almost 290 markets on 20-30 years data yet most of them are producing only 20-80 trades because of system parameters being designed for longer holding time or other reasons like data being available from 2021 such a case of Robinhood

These are my results on 2H timeframe, However I am sticking to your logic of being positive after Expectance-2*SE
Example: Trades 35/PF 3.2/GtPR 2.2/Expectancy 2.17%/SE 0.91%
2.17%- 1.82%( 0.91 SE*2)=0.35%

Thanks

• I think it would be more appropriate to use \$ for expectancy. If it’s still positive after subtracting 2SE, you should be good to go.

• Thanks a lot for your help, Wayne.🙏

5. Hello Wayne,

Thank you for all the useful content and value that you are providing.

– do you include commision+swap when calculating expectancy and STDEV of P/L ? And what is your reasoning behind it ?

• Hi David,
I always include spread, slippage and commissions when backtesting. These are unavoidable costs of trading, so it’s best to include them always.
Strategyquant does not have a swap input, so I don’t include that for now. I am quite conservative with my spread and slippage (2 pip spread and 1 pip slippage for liquid ECN markets), so that usually compensates for the lack of swap.

6. this is a tricky topic to cover. good attempt. i find that higher trade numbers as you say come with intraday strategies and so particulary in fx dont always mean its robust, but having said this id look for around 1000. then i like to see them also perform on other markets, which really gives you an indication of robustness as i have some strategies giving 5000 trades when looking at a basket of assets.
