GUARDLABS
GuardLabs · Technical note

Backtesting a Trading Strategy in Python Without Look-Ahead Bias

Look-ahead bias occurs when your backtest uses information that would not have been available at the time of the trade. This leads to unrealistically high backtest performance that fails in live trading. To prevent this in Python, you must align your trading signals so that a signal generated at the close of bar t is only applied to the returns of bar t+1.

The Golden Rule: Shift Your Signals

In a vectorized pandas backtest, the simplest way to eliminate look-ahead bias is to call .shift(1) on your signal series. The position held during bar t+1 is then decided using only data available at the close of bar t, so a signal never earns the return of the bar that produced it.
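
To make the effect concrete, here is a minimal sketch on six made-up closing prices (the values and the "long after an up day" rule are assumptions chosen for illustration, not part of the strategy below). The same signal is applied once without a shift and once with .shift(1):

```python
import pandas as pd

# Hypothetical closing prices for six bars (illustrative values only).
close = pd.Series([100.0, 102.0, 101.0, 104.0, 103.0, 106.0])
returns = close.pct_change()

# Toy rule: long whenever the bar closed higher than the previous one.
# The value for bar t is only known once bar t has closed.
signal = (returns > 0).astype(int)

# Biased: bar t's signal is multiplied by bar t's own return.
biased = (1 + (returns * signal).fillna(0)).prod() - 1

# Unbiased: bar t's signal is multiplied by bar t+1's return.
unbiased = (1 + (returns * signal.shift(1)).fillna(0)).prod() - 1

print(f"Without shift: {biased:.2%}")
print(f"With shift(1): {unbiased:.2%}")
```

The unshifted version captures every up day by construction, which is exactly the kind of result that cannot be reproduced live.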

Python Implementation

The following script fetches historical data for SPY, implements a simple Moving Average Crossover strategy, applies the shift to prevent look-ahead bias, and calculates the cumulative returns.

import numpy as np
import pandas as pd
import yfinance as yf

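# 1. Download historical data
# auto_adjust=True is passed explicitly so 'Close' is adjusted for dividends
# and splits no matter which default the installed yfinance version uses
data = yf.download("SPY", start="2020-01-01", end="2023-12-31", auto_adjust=True)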

# Clean multi-index columns if present (common in newer yfinance versions)
if isinstance(data.columns, pd.MultiIndex):
    data.columns = data.columns.droplevel(1)

df = data[['Close']].copy()

# 2. Calculate indicators
df['SMA_fast'] = df['Close'].rolling(window=20).mean()
df['SMA_slow'] = df['Close'].rolling(window=50).mean()

# 3. Generate raw signals (1 = Long, 0 = Cash)
# This signal is calculated at the end of day t
# While the slow SMA is still NaN (warm-up), the comparison is False, so the signal is 0
df['Raw_Signal'] = np.where(df['SMA_fast'] > df['SMA_slow'], 1, 0)

# 4. ELIMINATE LOOK-AHEAD BIAS
# Shift the signal by 1 day so day t's signal is applied to day t+1's return
# (the first row becomes NaN because no earlier signal exists)
df['Executed_Signal'] = df['Raw_Signal'].shift(1)

# 5. Calculate returns
df['Market_Returns'] = df['Close'].pct_change()
df['Strategy_Returns'] = df['Market_Returns'] * df['Executed_Signal']

# 6. Calculate cumulative performance
df['Cum_Market_Returns'] = (1 + df['Market_Returns']).cumprod() - 1
df['Cum_Strategy_Returns'] = (1 + df['Strategy_Returns']).cumprod() - 1

# Drop NaN values for clean output
df.dropna(inplace=True)

# Print final performance
final_market = df['Cum_Market_Returns'].iloc[-1] * 100
final_strategy = df['Cum_Strategy_Returns'].iloc[-1] * 100

print(f"Final Market Return: {final_market:.2f}%")
print(f"Final Strategy Return: {final_strategy:.2f}%")
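
One way to sanity-check a pipeline like this is to rewrite the "future" and confirm that earlier signals do not move. The sketch below is an assumption-laden illustration rather than part of the script: the helper names executed_signal, leaky_signal and has_lookahead are hypothetical, and the prices are a seeded synthetic random walk so that no download is needed.

```python
import numpy as np
import pandas as pd

def executed_signal(close: pd.Series, fast: int = 20, slow: int = 50) -> pd.Series:
    """Position held during each bar, decided from data up to the prior close."""
    raw = (close.rolling(fast).mean() > close.rolling(slow).mean()).astype(int)
    return raw.shift(1).fillna(0)

def leaky_signal(close: pd.Series) -> pd.Series:
    """Deliberately broken rule that peeks at the next bar's close."""
    return (close.shift(-1) > close).astype(int)

def has_lookahead(signal_fn, close: pd.Series, cut: int) -> bool:
    """True if changing prices from row `cut` onward alters any earlier signal."""
    baseline = signal_fn(close).iloc[:cut]
    for factor in (0.5, 2.0):  # push the future both down and up
        tampered = close.copy()
        tampered.iloc[cut:] = tampered.iloc[cut:] * factor
        if not baseline.equals(signal_fn(tampered).iloc[:cut]):
            return True
    return False

# Synthetic random-walk prices (hypothetical data, seeded for repeatability).
rng = np.random.default_rng(0)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 300))))

print(has_lookahead(executed_signal, close, cut=200))  # False
print(has_lookahead(leaky_signal, close, cut=200))     # True
```

A check like this only catches leakage through the price series it tampers with, so it complements the shift rather than replacing it.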

Code Walkthrough

  • Raw Signal Generation: df['Raw_Signal'] compares the 20-day and 50-day moving averages at the close of day t.
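  • The Shift: df['Raw_Signal'].shift(1) moves the signal down by one row, so day t's signal lines up with day t+1's return. Because Market_Returns is measured close to close, this models a position entered at (or just before) the close of day t and held through the close of day t+1.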
  • Return Calculation: We multiply the daily return of day t+1 (Market_Returns) by the signal generated on day t (Executed_Signal).
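
Which price the fill is assumed to happen at depends on how the return series is defined, not only on the size of the shift. The following sketch uses hypothetical open and close prices and a made-up signal (all values are assumptions for illustration) to line up three common timing conventions:

```python
import pandas as pd

# Hypothetical prices for five bars (illustrative values only).
px = pd.DataFrame({
    "Open":  [100.0, 101.0, 103.0, 102.0, 105.0],
    "Close": [101.0, 102.5, 102.0, 104.0, 106.0],
})
signal = pd.Series([1, 1, 0, 1, 1])  # each value is known at that bar's close

close_ret = px["Close"].pct_change()  # row t: close[t-1] -> close[t]
open_ret = px["Open"].pct_change()    # row t: open[t-1]  -> open[t]

# (a) Fill at the close that produced the signal: hold close[t] -> close[t+1].
at_same_close = close_ret * signal.shift(1)

# (b) Fill at the next bar's close: hold close[t+1] -> close[t+2].
at_next_close = close_ret * signal.shift(2)

# (c) Fill at the next bar's open: hold open[t+1] -> open[t+2].
at_next_open = open_ret * signal.shift(2)

print(pd.DataFrame({
    "same_close": at_same_close,
    "next_close": at_next_close,
    "next_open": at_next_open,
}))
```

None of the three uses future information; they differ in how much delay between signal and fill the backtest assumes.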

Common Pitfalls to Avoid

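  • Using Close Prices for Execution and Signals Simultaneously: If you calculate a signal using today's close and also collect today's return, your backtest has traded on information it did not yet have. A one-bar shift on close-to-close returns removes that look-ahead, but it still assumes a fill at the same close that produced the signal, which is only approximately achievable (for example, with an order submitted just before the close). For a more conservative baseline, shift by two bars (a fill at the next day's close) or trade at the next day's open using open-to-open returns.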
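  • Technical Indicator Leakage: Indicators must be computed from past and current bars only. The defaults of .rolling() and .pct_change() look backward, but options such as .rolling(center=True), .shift(-1), a negative periods argument to .pct_change(), or back-filling gaps with .bfill() pull future data into the current row.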
  • Data Min-Max Scaling: If you normalize features (common in machine learning strategies), fit your scalers on the training split only and reuse those fitted parameters on the test split. Fitting on the full dataset leaks the future minimum and maximum into earlier observations.
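
The scaling point can be sketched with plain pandas. This is a minimal illustration under stated assumptions: the feature is synthetic and seeded, the 70/30 split is arbitrary, and min-max scaling is written out by hand rather than taken from a library.

```python
import numpy as np
import pandas as pd

# Synthetic feature with an upward drift (hypothetical data, seeded).
rng = np.random.default_rng(42)
feature = pd.Series(np.linspace(0, 10, 200) + rng.normal(0, 0.5, 200))

# Chronological split: no shuffling, the test set is strictly later in time.
split = int(len(feature) * 0.7)
train, test = feature.iloc[:split], feature.iloc[split:]

# Fit the scaling parameters on the training window only.
lo, hi = train.min(), train.max()
train_scaled = (train - lo) / (hi - lo)
test_scaled = (test - lo) / (hi - lo)  # can leave [0, 1]; that is expected

# Leaky alternative, shown only for contrast: parameters see the test set.
leaky_scaled = (feature - feature.min()) / (feature.max() - feature.min())

print(f"train range: {train_scaled.min():.2f} to {train_scaled.max():.2f}")
print(f"test max:    {test_scaled.max():.2f}")
```

Test values outside the 0 to 1 range are a sign that the split is honest: the model sees later data on the scale it would have known at training time.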

Disclaimer: Past performance is not indicative of future results. Backtesting is a tool for hypothesis testing, not a guarantee of profitability.


Published 2026-06-22