
Realistic Backtesting: Transaction Costs, Slippage, and Walk-Forward Optimization

Comprehensive framework for realistic backtesting including transaction costs, dynamic slippage modeling, and walk-forward optimization to prevent overfitting.

Author: Research Team
Published: October 20, 2024
Tags: Backtesting, Transaction Costs, Methodology, Walk-Forward


Abstract

Most backtest failures in live trading stem from unrealistic assumptions about costs, slippage, and optimization methodology. This paper details our comprehensive backtesting framework designed to minimize the gap between simulated and live performance.

The Backtest-to-Live Gap

Common reasons strategies fail live:

Factor            | Typical Impact                | Our Approach
Transaction costs | -20% to -50% of gross returns | Full cost modeling
Slippage          | -10% to -30% of gross returns | Dynamic slippage
Market impact     | Variable, often ignored       | Size-based impact model
Overfitting       | Strategy fails completely     | Walk-forward testing
Look-ahead bias   | Inflated win rates            | Point-in-time data

Transaction Cost Modeling

Fee Structure

We model complete exchange fee structures:

python
from dataclasses import dataclass

@dataclass
class FeeModel:
    maker_fee: float = 0.001        # 0.1% for limit orders
    taker_fee: float = 0.002        # 0.2% for market orders
    funding_rate: float = 0.0001    # 0.01% per 8h period for perpetuals
    withdrawal_fee: float = 0.0005  # Network fees

Round-Trip Cost Calculation

python
def calculate_round_trip_cost(
    entry_type: str,       # 'maker' or 'taker'
    exit_type: str,        # 'maker' or 'taker'
    position_size: float,  # Position notional in quote currency
    funding_periods: int,  # Number of 8h funding periods held
    fee_model: FeeModel
) -> float:
    """
    Calculate total round-trip transaction costs for a position,
    expressed in quote-currency terms.
    """
    entry_fee = (
        fee_model.maker_fee if entry_type == 'maker'
        else fee_model.taker_fee
    )
    exit_fee = (
        fee_model.maker_fee if exit_type == 'maker'
        else fee_model.taker_fee
    )
    total_funding = fee_model.funding_rate * funding_periods

    # Combined rate scaled by the position notional
    return (entry_fee + exit_fee + total_funding) * position_size

Realistic Cost Assumptions

For crypto trading:

  • Conservative: 0.5% round trip (taker + taker + 2% annual funding)
  • Moderate: 0.3% round trip (maker + taker + 1% annual funding)
  • Optimistic: 0.15% round trip (maker + maker, minimal funding)

We default to conservative assumptions.
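
To see how these assumptions translate into actual numbers, the short example below uses the FeeModel and calculate_round_trip_cost defined above; the $10,000 position and three 8-hour funding periods are illustrative assumptions.

python
fees = FeeModel()

# Conservative: market orders in and out, held through 3 funding periods
conservative = calculate_round_trip_cost(
    entry_type='taker', exit_type='taker',
    position_size=10_000, funding_periods=3, fee_model=fees
)

# Optimistic: limit orders both ways, negligible funding
optimistic = calculate_round_trip_cost(
    entry_type='maker', exit_type='maker',
    position_size=10_000, funding_periods=0, fee_model=fees
)

print(f"Conservative: ${conservative:.2f}")  # $43.00, i.e. 0.43% of notional
print(f"Optimistic:   ${optimistic:.2f}")    # $20.00, i.e. 0.20% of notional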

Slippage Modeling

Slippage is the difference between expected and actual execution price.

Fixed Slippage (Naive)

Many backtests assume a fixed slippage figure (e.g., 0.1%). This is unrealistic, because actual slippage depends on:

  • Order size relative to book depth
  • Market volatility
  • Time of day
  • Order type

Dynamic Slippage Model

python
def estimate_slippage(
    order_size: float,
    side: str,
    orderbook: OrderBook,
    volatility: float
) -> float:
    """
    Estimate slippage based on order size and market conditions.
    """
    # Base slippage from spread
    spread = orderbook.best_ask - orderbook.best_bid
    spread_slippage = spread / 2

    # Impact slippage from order size
    if side == 'buy':
        available_liquidity = sum(orderbook.asks[:10].volume)
    else:
        available_liquidity = sum(orderbook.bids[:10].volume)

    liquidity_ratio = order_size / available_liquidity
    impact_slippage = orderbook.mid_price * liquidity_ratio * 0.01

    # Volatility adjustment (baseline 2% daily vol)
    vol_adjustment = 1 + (volatility / 0.02)

    return (spread_slippage + impact_slippage) * vol_adjustment

Market Impact Model (Square Root Law)

For larger orders, we use the square root market impact model:

Impact = σ × √(Q / V) × π

Where:

  • σ = daily price volatility
  • Q = order size
  • V = average daily volume
  • π = permanent impact coefficient (~0.1 for crypto)
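
For concreteness, here is a minimal sketch of this formula in code; the order size, daily volume, and volatility used in the example call are illustrative assumptions, not calibrated figures.

python
def square_root_impact(
    daily_volatility: float,          # sigma: daily price volatility (fraction)
    order_size: float,                # Q: order size (base-asset units)
    daily_volume: float,              # V: average daily volume (same units)
    impact_coefficient: float = 0.1   # permanent impact coefficient
) -> float:
    """Estimated price impact as a fraction of price (square root law)."""
    return daily_volatility * (order_size / daily_volume) ** 0.5 * impact_coefficient

# Illustrative call: 3% daily vol, order equal to 0.1% of daily volume
impact = square_root_impact(daily_volatility=0.03, order_size=1_000, daily_volume=1_000_000)
print(f"Estimated impact: {impact:.4%}")  # roughly 0.0095% of price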

Walk-Forward Optimization

Walk-forward testing prevents overfitting by simulating how the strategy would be developed and deployed in real time.

The Process

Timeline:
[----Train 1----][Test 1][----Train 2----][Test 2]...

1. Train on historical data (e.g., 6 months)
2. Optimize parameters on training set
3. Test on unseen future data (e.g., 1 month)
4. Slide window forward
5. Repeat

Implementation

python
def walk_forward_test(
    strategy: Strategy,
    data: pd.DataFrame,
    train_months: int = 6,
    test_months: int = 1,
    overlap: bool = False
) -> List[TestResult]:
    """
    Perform walk-forward backtesting.
    """
    results = []

    # Calculate window sizes
    total_days = len(data)
    train_days = train_months * 30
    test_days = test_months * 30

    current_start = 0
    while current_start + train_days + test_days <= total_days:
        # Define windows
        train_end = current_start + train_days
        test_end = train_end + test_days

        train_data = data.iloc[current_start:train_end]
        test_data = data.iloc[train_end:test_end]

        # Optimize on training data
        best_params = strategy.optimize(train_data)

        # Test on out-of-sample data
        strategy.set_params(best_params)
        result = strategy.backtest(test_data)

        results.append(TestResult(
            train_period=(current_start, train_end),
            test_period=(train_end, test_end),
            params=best_params,
            performance=result
        ))

        # Slide window
        if overlap:
            current_start += test_days
        else:
            current_start = test_end

    return results

Walk-Forward Efficiency Ratio

A robust strategy should show consistent performance between training and test windows, which we summarize with the walk-forward efficiency ratio:

python
def walk_forward_efficiency(results: List[TestResult]) -> float:
    """
    Calculate walk-forward efficiency ratio.

    WFE = Average Test Sharpe / Average Train Sharpe

    Good:       WFE > 0.5
    Acceptable: WFE 0.3-0.5
    Poor:       WFE < 0.3 (likely overfit)

    Assumes each TestResult exposes the Sharpe ratio achieved on its
    training window and on its out-of-sample test window.
    """
    train_sharpes = [r.train_sharpe for r in results]
    test_sharpes = [r.test_sharpe for r in results]

    return np.mean(test_sharpes) / np.mean(train_sharpes)

Point-in-Time Data

The Problem

Many data sources retroactively update historical data:

  • Earnings restatements
  • Dividend adjustments
  • Split adjustments
  • Delisting handling

Using restated data creates look-ahead bias.

Our Approach

We maintain point-in-time databases:

  • Data stored as it appeared at each moment
  • No retroactive updates
  • Survivorship-bias-free (includes delisted assets)
  • Timestamps for all data points
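
As a sketch of what a point-in-time lookup can look like in practice (the column names and pandas-based layout here are hypothetical, not a description of our production schema), each record carries both the time it refers to and the time we learned it, and queries filter on the latter:

python
import pandas as pd

def as_of_view(records: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    """
    Return the data exactly as it was known at `as_of`.

    Assumes each row has an `event_time` (what the data point refers to) and a
    `knowledge_time` (when it became available), with revisions stored as new
    rows rather than overwriting old ones.
    """
    known = records[records['knowledge_time'] <= as_of]
    # For each event, keep only the latest revision known at `as_of`
    return (
        known.sort_values('knowledge_time')
             .groupby('event_time', as_index=False)
             .last()
    )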

Monte Carlo Simulation

To understand performance distribution, we run Monte Carlo simulations:

python
def monte_carlo_simulation(
    strategy: Strategy,
    data: pd.DataFrame,
    num_simulations: int = 1000,
    shuffle_method: str = 'bootstrap'
) -> MonteCarloResults:
    """
    Run Monte Carlo simulation to estimate performance distribution.
    """
    results = []

    for _ in range(num_simulations):
        if shuffle_method == 'bootstrap':
            # Resample with replacement
            sampled_data = data.sample(frac=1, replace=True)
        elif shuffle_method == 'block_bootstrap':
            # Resample blocks to preserve autocorrelation
            sampled_data = block_resample(data, block_size=20)

        result = strategy.backtest(sampled_data)
        results.append(result)

    sharpes = [r.sharpe for r in results]

    return MonteCarloResults(
        mean_sharpe=np.mean(sharpes),
        std_sharpe=np.std(sharpes),
        percentile_5_sharpe=np.percentile(sharpes, 5),
        percentile_95_sharpe=np.percentile(sharpes, 95),
        probability_positive=sum(1 for s in sharpes if s > 0) / num_simulations
    )
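
The block_resample helper referenced above is not shown; a minimal sketch of a block bootstrap (sampling contiguous chunks with replacement so that short-range autocorrelation is preserved) could look like this:

python
import numpy as np
import pandas as pd

def block_resample(data: pd.DataFrame, block_size: int = 20) -> pd.DataFrame:
    """Resample contiguous blocks of rows with replacement."""
    n = len(data)
    if n <= block_size:
        return data.copy()
    n_blocks = int(np.ceil(n / block_size))
    # Random starting index for each block
    starts = np.random.randint(0, n - block_size + 1, size=n_blocks)
    blocks = [data.iloc[s:s + block_size] for s in starts]
    # Concatenate and trim back to the original length
    return pd.concat(blocks).iloc[:n].reset_index(drop=True)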

Our Backtesting Pipeline

Complete pipeline for strategy evaluation:

python
def evaluate_strategy(strategy: Strategy) -> EvaluationReport:
    """
    Complete strategy evaluation pipeline.
    """
    # 1. Load point-in-time data
    data = load_pit_data(strategy.asset, strategy.timeframe)

    # 2. Walk-forward optimization
    wf_results = walk_forward_test(strategy, data)

    # 3. Calculate metrics with realistic costs
    for result in wf_results:
        result.apply_transaction_costs(FeeModel())
        result.apply_slippage_model(DynamicSlippage())

    # 4. Monte Carlo simulation
    mc_results = monte_carlo_simulation(strategy, data)

    # 5. Calculate efficiency ratio
    wfe = walk_forward_efficiency(wf_results)

    # 6. Generate report
    return EvaluationReport(
        walk_forward_results=wf_results,
        monte_carlo_results=mc_results,
        walk_forward_efficiency=wfe,
        recommendation=(
            'PASS' if wfe > 0.3 and mc_results.probability_positive > 0.7
            else 'FAIL'
        )
    )

Conclusion

Realistic backtesting requires:

  1. Comprehensive cost modeling: All fees, funding, and spreads
  2. Dynamic slippage: Based on order size and market conditions
  3. Walk-forward testing: Prevents overfitting
  4. Point-in-time data: No look-ahead bias
  5. Monte Carlo simulation: Understand performance distribution

Strategies that pass our framework have a much higher probability of live trading success.


For more on our methodology, see our audit framework.
