ADR 014: Paper Trading and Backtesting Architecture

Status

Accepted (2026-01-21)

Context

Arbiter-Bot is a statistical arbitrage engine where strategy effectiveness must be validated before deploying real capital. The challenges are:

Risk-free evaluation - Test strategies without financial exposure
Historical validation - Backtest against recorded market data
Performance comparison - Compare simulated vs actual execution
Realistic simulation - Arbitrage opportunities are ephemeral; simplistic simulation leads to overfitting

Current state: The engine has a --dry-run mode that signs orders but doesn't submit them. This validates API integration but doesn't track paper positions, record fills, or calculate performance metrics.

This ADR covers simulation and backtesting infrastructure. Performance monitoring is covered in ADR-012 and low-latency optimizations in ADR-013.

Decision

Implement a multi-fidelity simulation framework with paper trading and backtesting capabilities.

Simulation Fidelity Levels

Level	Name	Use Case	Fill Logic	Latency
1	Basic	Quick strategy validation	Instant fill at mid-price	None
2	Realistic	Paper trading	Order book crossing, partial fills	Configurable
3	HFT	Latency-sensitive analysis	Queue position, market impact	Measured

Level 3 is out of scope for initial implementation but the architecture supports future extension.

1. Clock Abstraction

Decouple time from system clock to enable deterministic replay:

use chrono::{DateTime, Duration, Utc};
use std::sync::atomic::{AtomicI64, Ordering};

/// Unified time source for real-time and simulated execution
pub trait Clock: Send + Sync {
    /// Current timestamp
    fn now(&self) -> DateTime<Utc>;

    /// Advance time (no-op for RealClock). chrono::Duration is a signed, fixed-length time span.
    fn advance(&self, duration: Duration);

    /// Whether this is simulated time
    fn is_simulated(&self) -> bool;
}

/// Production clock using system time
pub struct RealClock;

impl Clock for RealClock {
    fn now(&self) -> DateTime<Utc> {
        Utc::now()
    }

    fn advance(&self, _duration: Duration) {
        // No-op for real time
    }

    fn is_simulated(&self) -> bool {
        false
    }
}

/// Simulated clock for backtesting
pub struct SimulatedClock {
    current: AtomicI64,  // Milliseconds since epoch
}

impl SimulatedClock {
    pub fn new(start: DateTime<Utc>) -> Self {
        Self {
            current: AtomicI64::new(start.timestamp_millis()),
        }
    }
}

impl Clock for SimulatedClock {
    fn now(&self) -> DateTime<Utc> {
        let ms = self.current.load(Ordering::SeqCst);
        DateTime::from_timestamp_millis(ms).unwrap()
    }

    fn advance(&self, duration: Duration) {
        self.current.fetch_add(duration.num_milliseconds(), Ordering::SeqCst);
    }

    fn is_simulated(&self) -> bool {
        true
    }
}

Rationale:

- Trait abstraction allows seamless switching between modes
- AtomicI64 enables lock-free time updates during replay
- advance() method supports event-driven time progression
- Precision note: SimulatedClock stores milliseconds since epoch. For sub-millisecond timing scenarios (HFT Level 3), consider storing nanoseconds since epoch in the same AtomicI64, which covers dates up to roughly the year 2262. The standard library has no stable AtomicU128.
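As a rough illustration of the precision note, the sketch below keeps nanoseconds since epoch in an AtomicI64 and shows that replaying the same gaps yields the same timestamps. It is std-only and simplified: `NanoClock`, `now_ns`, `advance_ns`, and `replay_gaps` are hypothetical names, and timestamps are plain `i64` values rather than `chrono::DateTime`.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Hypothetical nanosecond-precision simulated clock (not part of the ADR's API).
pub struct NanoClock {
    current_ns: AtomicI64, // nanoseconds since the Unix epoch
}

impl NanoClock {
    pub fn new(start_ns: i64) -> Self {
        Self { current_ns: AtomicI64::new(start_ns) }
    }

    /// Current simulated time in nanoseconds since the epoch.
    pub fn now_ns(&self) -> i64 {
        self.current_ns.load(Ordering::SeqCst)
    }

    /// Advance simulated time; takes &self so the clock can be shared via Arc.
    pub fn advance_ns(&self, delta_ns: i64) {
        self.current_ns.fetch_add(delta_ns, Ordering::SeqCst);
    }
}

/// Replays a list of inter-event gaps and returns the timestamp seen after each one.
/// The result depends only on the inputs, never on the system clock.
pub fn replay_gaps(start_ns: i64, gaps_ns: &[i64]) -> Vec<i64> {
    let clock = NanoClock::new(start_ns);
    gaps_ns
        .iter()
        .map(|gap| {
            clock.advance_ns(*gap);
            clock.now_ns()
        })
        .collect()
}
```

Calling `replay_gaps(1_000, &[250, 750, 1])` twice returns the same vector both times, which is the property deterministic backtests rely on.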

2. SimulatedExchangeClient

Implements the existing ExchangeClient trait with simulated fills:

use std::sync::{Arc, Mutex};

use async_trait::async_trait;
use uuid::Uuid;

use crate::market::client::ExchangeClient;

pub struct SimulatedExchangeClient {
    /// Matching engine for fill simulation (Mutex because match_order takes &mut self)
    matching_engine: Mutex<MatchingEngine>,
    /// Paper positions (PositionTracker guards its own state with RwLock)
    position_tracker: PositionTracker,
    /// Clock source
    clock: Arc<dyn Clock>,
    /// Configuration
    config: SimulationConfig,
}

#[async_trait]
impl ExchangeClient for SimulatedExchangeClient {
    async fn submit_order(&self, order: Order) -> Result<OrderResponse, ExchangeError> {
        // Simulate network latency if configured (this is a wall-clock delay)
        if let Some(latency) = self.config.simulated_latency {
            tokio::time::sleep(latency).await;
        }

        // Match order against order book. The lock guard is a temporary that is
        // dropped at the end of this statement, so it is never held across an .await.
        // Assumes ExchangeError implements From<MatchError>.
        let fill = self
            .matching_engine
            .lock()
            .expect("matching engine lock poisoned")
            .match_order(&order)?;

        // Update paper positions
        self.position_tracker.record_fill(&fill);

        Ok(OrderResponse {
            order_id: Uuid::new_v4().to_string(),
            status: fill.status,
            filled_qty: fill.filled_quantity,
            avg_price: fill.average_price,
            timestamp: self.clock.now(),
        })
    }

    async fn cancel_order(&self, order_id: &str) -> Result<(), ExchangeError> {
        self.matching_engine
            .lock()
            .expect("matching engine lock poisoned")
            .cancel_order(order_id)
    }

    async fn get_positions(&self) -> Result<Vec<Position>, ExchangeError> {
        Ok(self.position_tracker.get_all_positions())
    }
}

Rationale:

- Implements same trait as production clients for seamless substitution
- Configurable latency injection for realistic simulation
- Delegates to MatchingEngine for fill logic
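The substitution point can be illustrated with a deliberately reduced sketch. The trait below is a synchronous stand-in with integer prices, not the real async ExchangeClient, and `PaperClient`, `LiveClientStub`, and `execute_leg` are hypothetical names chosen for the example. The point it shows is that calling code depends only on the trait, so the same function runs against either implementation.

```rust
/// Simplified stand-in for the ExchangeClient trait (synchronous, prices in cents).
pub trait ExchangeClient {
    /// Returns the fill price in cents, or a rejection message.
    fn submit_order(&self, quantity: u64) -> Result<u64, String>;
    /// Short label used in log lines.
    fn name(&self) -> &'static str;
}

/// Hypothetical paper client: fills every order at a fixed price.
pub struct PaperClient {
    pub fill_price_cents: u64,
}

impl ExchangeClient for PaperClient {
    fn submit_order(&self, _quantity: u64) -> Result<u64, String> {
        Ok(self.fill_price_cents)
    }

    fn name(&self) -> &'static str {
        "paper"
    }
}

/// Hypothetical stand-in for a production client; it rejects everything here.
pub struct LiveClientStub;

impl ExchangeClient for LiveClientStub {
    fn submit_order(&self, _quantity: u64) -> Result<u64, String> {
        Err("live trading disabled in this sketch".to_string())
    }

    fn name(&self) -> &'static str {
        "live"
    }
}

/// Execution logic sees only the trait, so it is identical in every mode.
pub fn execute_leg(client: &dyn ExchangeClient, quantity: u64) -> String {
    match client.submit_order(quantity) {
        Ok(price) => format!("{}: filled {} @ {}c", client.name(), quantity, price),
        Err(reason) => format!("{}: rejected ({})", client.name(), reason),
    }
}
```

Passing `&PaperClient { fill_price_cents: 52 }` or `&LiveClientStub` to `execute_leg` changes the outcome without changing the caller.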

3. Matching Engine

Simulates order matching with configurable fidelity:

use std::collections::HashMap;
use std::sync::{Arc, RwLock};

use rust_decimal::Decimal;

pub struct MatchingEngine {
    /// Current order book state (from replay or live feed)
    orderbook: Arc<RwLock<OrderBook>>,
    /// Fidelity level
    fidelity: FidelityLevel,
    /// Pending orders (for partial fill simulation)
    pending_orders: HashMap<String, Order>,
}

#[derive(Clone, Copy)]
pub enum FidelityLevel {
    /// Instant fill at mid-price
    Basic,
    /// Cross order book, partial fills
    Realistic,
}

impl MatchingEngine {
    pub fn match_order(&mut self, order: &Order) -> Result<Fill, MatchError> {
        match self.fidelity {
            FidelityLevel::Basic => self.match_basic(order),
            FidelityLevel::Realistic => self.match_realistic(order),
        }
    }

    /// Level 1: Instant fill at mid-price
    fn match_basic(&self, order: &Order) -> Result<Fill, MatchError> {
        let book = self.orderbook.read().expect("order book lock poisoned");
        let mid_price = book.mid_price().ok_or(MatchError::NoLiquidity)?;

        Ok(Fill {
            order_id: order.id.clone(),
            // market_id and side are read by PositionTracker::record_fill
            // (assumes Order exposes market_id)
            market_id: order.market_id.clone(),
            side: order.side,
            filled_quantity: order.quantity,
            average_price: mid_price,
            status: OrderStatus::Filled,
            fills: vec![FillLeg {
                price: mid_price,
                quantity: order.quantity,
            }],
        })
    }

    /// Level 2: Cross order book with partial fills.
    /// The book is only read: simulated fills do not consume displayed liquidity.
    fn match_realistic(&mut self, order: &Order) -> Result<Fill, MatchError> {
        let book = self.orderbook.read().expect("order book lock poisoned");
        let mut remaining = order.quantity;
        let mut fills = Vec::new();
        let mut total_cost = Decimal::ZERO;

        // Walk the book. asks_iter/bids_iter are assumed to yield
        // (&Decimal, &Decimal) pairs, best price first, with a shared iterator type.
        let levels = match order.side {
            Side::Buy => book.asks_iter(),
            Side::Sell => book.bids_iter(),
        };

        for (price, available) in levels {
            if remaining.is_zero() {
                break;
            }

            // Check limit price (dereference: price is a reference, limit is a value)
            if let Some(limit) = order.limit_price {
                let crosses = match order.side {
                    Side::Buy => *price <= limit,
                    Side::Sell => *price >= limit,
                };
                if !crosses {
                    break;
                }
            }

            let fill_qty = remaining.min(*available);
            fills.push(FillLeg { price: *price, quantity: fill_qty });
            total_cost += *price * fill_qty;
            remaining -= fill_qty;
        }

        let filled_qty = order.quantity - remaining;
        if filled_qty.is_zero() {
            return Err(MatchError::NoLiquidity);
        }

        let status = if remaining.is_zero() {
            OrderStatus::Filled
        } else {
            OrderStatus::PartiallyFilled
        };

        Ok(Fill {
            order_id: order.id.clone(),
            market_id: order.market_id.clone(),
            side: order.side,
            filled_quantity: filled_qty,
            // Volume-weighted average over all legs
            average_price: total_cost / filled_qty,
            status,
            fills,
        })
    }
}

Rationale:

- Level 1 (Basic) for quick iteration and sanity checks
- Level 2 (Realistic) for meaningful paper trading with slippage
- Partial fill support prevents overly optimistic backtests
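To make the slippage point concrete, here is a small worked sketch of the Level 2 book walk for a buy order. It is std-only and uses integer cents and whole quantities instead of Decimal; `Level`, `WalkResult`, and `walk_asks` are hypothetical names, and the book contents are made-up illustration values.

```rust
/// One price level: (price in cents, available quantity).
pub type Level = (u64, u64);

/// Outcome of walking the book: filled quantity and total cost in cents.
#[derive(Debug, PartialEq)]
pub struct WalkResult {
    pub filled: u64,
    pub cost_cents: u64,
}

/// Walks ask levels (best price first) for a buy order, honoring an optional limit.
pub fn walk_asks(asks: &[Level], quantity: u64, limit_cents: Option<u64>) -> WalkResult {
    let mut remaining = quantity;
    let mut cost_cents = 0;

    for &(price, available) in asks {
        if remaining == 0 {
            break;
        }
        // Stop at the first level priced above the buyer's limit
        if let Some(limit) = limit_cents {
            if price > limit {
                break;
            }
        }
        let take = remaining.min(available);
        cost_cents += price * take;
        remaining -= take;
    }

    WalkResult { filled: quantity - remaining, cost_cents }
}
```

With asks of 100 @ 52c, 100 @ 53c, and 200 @ 55c, a market buy of 150 fills 100 @ 52c plus 50 @ 53c for 7850c, an average of about 52.33c. If the best bid were 50c, a Level 1 mid-price fill would have charged 51c per unit (7650c), so the basic model understates the cost by 200c. With a 52c limit the same order fills only 100 units, which is the partial-fill case.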

4. Position Tracker

Manages paper positions with PnL calculation:

use std::collections::HashMap;
use std::sync::{Arc, RwLock};

use chrono::{DateTime, Utc};
use rust_decimal::Decimal;

pub struct PositionTracker {
    /// Positions by market ID (RwLock for thread-safe concurrent access)
    positions: RwLock<HashMap<MarketId, Position>>,
    /// Trade history for PnL calculation
    trades: RwLock<Vec<Trade>>,
    /// Clock for timestamps
    clock: Arc<dyn Clock>,
}

#[derive(Clone)]
pub struct Position {
    pub market_id: MarketId,
    pub side: Side,
    pub quantity: Decimal,
    pub average_entry_price: Decimal,
    pub realized_pnl: Decimal,
    pub unrealized_pnl: Decimal,
    pub opened_at: DateTime<Utc>,
    pub last_updated: DateTime<Utc>,
}

impl PositionTracker {
    /// Takes &self: the RwLocks provide the interior mutability.
    pub fn record_fill(&self, fill: &Fill) {
        // An empty fill would divide by zero in the average-price update
        if fill.filled_quantity.is_zero() {
            return;
        }

        let now = self.clock.now();
        let mut positions = self.positions.write().expect("positions lock poisoned");
        let position = positions
            .entry(fill.market_id.clone())
            .or_insert_with(|| Position::new(fill.market_id.clone()));

        // PnL realized by this fill alone (zero when opening or adding)
        let mut trade_pnl = Decimal::ZERO;

        // Update position based on fill
        if fill.side == position.side || position.quantity.is_zero() {
            // Opening or adding to position
            if position.quantity.is_zero() {
                position.opened_at = now;
            }
            let total_cost = position.average_entry_price * position.quantity
                + fill.average_price * fill.filled_quantity;
            let total_qty = position.quantity + fill.filled_quantity;
            position.average_entry_price = total_cost / total_qty;
            position.quantity = total_qty;
            position.side = fill.side;
        } else {
            // Reducing/closing position
            let close_qty = fill.filled_quantity.min(position.quantity);
            trade_pnl = match position.side {
                Side::Buy => (fill.average_price - position.average_entry_price) * close_qty,
                Side::Sell => (position.average_entry_price - fill.average_price) * close_qty,
            };
            position.realized_pnl += trade_pnl;
            position.quantity -= close_qty;

            // If we crossed through, handle the remainder as new position
            if fill.filled_quantity > close_qty {
                let remainder = fill.filled_quantity - close_qty;
                position.side = fill.side;
                position.quantity = remainder;
                position.average_entry_price = fill.average_price;
                position.opened_at = now;
            }
        }

        position.last_updated = now;

        // Record the per-fill PnL, not the position's cumulative realized PnL;
        // PerformanceMetrics sums trade.pnl and would otherwise double count.
        self.trades.write().expect("trades lock poisoned").push(Trade {
            timestamp: now,
            market_id: fill.market_id.clone(),
            side: fill.side,
            quantity: fill.filled_quantity,
            price: fill.average_price,
            pnl: trade_pnl,
        });
    }

    pub fn update_unrealized_pnl(&self, market_id: &MarketId, current_price: Decimal) {
        let mut positions = self.positions.write().expect("positions lock poisoned");
        if let Some(position) = positions.get_mut(market_id) {
            position.unrealized_pnl = match position.side {
                Side::Buy => (current_price - position.average_entry_price) * position.quantity,
                Side::Sell => (position.average_entry_price - current_price) * position.quantity,
            };
        }
    }

    /// Snapshot of all positions (used by SimulatedExchangeClient::get_positions)
    pub fn get_all_positions(&self) -> Vec<Position> {
        let positions = self.positions.read().expect("positions lock poisoned");
        positions.values().cloned().collect()
    }
}

Rationale:

- Tracks both sides of arbitrage trades
- Calculates realized PnL on position close
- Updates unrealized PnL from market data
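The position arithmetic can be checked by hand with a reduced sketch. It follows the same open/add, reduce/close, and cross-through branches, but uses integer cents and quantities so the numbers are exact; `PaperPosition` and `apply_fill` are hypothetical names, and the integer division in the average-price update is a simplification that a Decimal implementation would not need.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Side {
    Buy,
    Sell,
}

/// Hypothetical single-market paper position in integer cents.
#[derive(Debug, PartialEq)]
pub struct PaperPosition {
    pub side: Side,
    pub quantity: i64,
    pub avg_entry_cents: i64,
    pub realized_pnl_cents: i64,
}

impl PaperPosition {
    pub fn flat() -> Self {
        Self { side: Side::Buy, quantity: 0, avg_entry_cents: 0, realized_pnl_cents: 0 }
    }

    /// Applies a fill and returns the PnL realized by this fill alone.
    pub fn apply_fill(&mut self, side: Side, quantity: i64, price_cents: i64) -> i64 {
        if quantity <= 0 {
            return 0; // ignore empty fills; avoids dividing by zero below
        }
        if self.quantity == 0 || side == self.side {
            // Opening or adding: quantity-weighted average entry price
            let total_cost = self.avg_entry_cents * self.quantity + price_cents * quantity;
            self.quantity += quantity;
            self.avg_entry_cents = total_cost / self.quantity; // integer division (sketch only)
            self.side = side;
            return 0;
        }

        // Reducing or closing the existing position
        let close_qty = quantity.min(self.quantity);
        let pnl = match self.side {
            Side::Buy => (price_cents - self.avg_entry_cents) * close_qty,
            Side::Sell => (self.avg_entry_cents - price_cents) * close_qty,
        };
        self.realized_pnl_cents += pnl;
        self.quantity -= close_qty;

        if quantity > close_qty {
            // Crossed through flat: the remainder opens a position on the other side
            self.side = side;
            self.quantity = quantity - close_qty;
            self.avg_entry_cents = price_cents;
        }
        pnl
    }
}
```

Buying 100 @ 40c and then 100 @ 50c gives an average entry of 45c. Selling 150 @ 55c realizes (55 - 45) × 150 = 1500c and leaves 50 long. Selling 80 @ 43c then closes the last 50 for (43 - 45) × 50 = -100c and opens a 30-unit short at 43c.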

5. Historical Data Storage

SQLite for trade/position persistence, with optional Parquet for tick data:

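use std::path::Path;
use std::str::FromStr;

use chrono::{DateTime, Utc};
use rusqlite::{params, Connection};
use rust_decimal::Decimal;

pub struct TradeStorage {
    conn: Connection,
}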

impl TradeStorage {
    pub fn new(path: &Path) -> Result<Self, StorageError> {
        let conn = Connection::open(path)?;

        conn.execute_batch(r#"
            CREATE TABLE IF NOT EXISTS trades (
                id INTEGER PRIMARY KEY,
                timestamp TEXT NOT NULL,
                market_id TEXT NOT NULL,
                side TEXT NOT NULL,
                quantity TEXT NOT NULL,
                price TEXT NOT NULL,
                pnl TEXT NOT NULL
            );

            CREATE TABLE IF NOT EXISTS market_data (
                id INTEGER PRIMARY KEY,
                timestamp TEXT NOT NULL,
                market_id TEXT NOT NULL,
                bid_price TEXT,
                ask_price TEXT,
                bid_size TEXT,
                ask_size TEXT
            );

            CREATE INDEX IF NOT EXISTS idx_trades_timestamp ON trades(timestamp);
            CREATE INDEX IF NOT EXISTS idx_market_data_timestamp ON market_data(timestamp);
        "#)?;

        Ok(Self { conn })
    }

    pub fn record_trade(&self, trade: &Trade) -> Result<(), StorageError> {
        self.conn.execute(
            "INSERT INTO trades (timestamp, market_id, side, quantity, price, pnl) VALUES (?1, ?2, ?3, ?4, ?5, ?6)",
            params![
                trade.timestamp.to_rfc3339(),
                trade.market_id.as_str(),
                trade.side.to_string(),
                trade.quantity.to_string(),
                trade.price.to_string(),
                trade.pnl.to_string(),
            ],
        )?;
        Ok(())
    }

    pub fn query_trades(&self, from: DateTime<Utc>, to: DateTime<Utc>) -> Result<Vec<Trade>, StorageError> {
        let mut stmt = self.conn.prepare(
            "SELECT timestamp, market_id, side, quantity, price, pnl FROM trades WHERE timestamp >= ?1 AND timestamp <= ?2 ORDER BY timestamp"
        )?;

        let trades = stmt.query_map(params![from.to_rfc3339(), to.to_rfc3339()], |row| {
            Ok(Trade {
                timestamp: DateTime::parse_from_rfc3339(&row.get::<_, String>(0)?)
                    .unwrap()
                    .with_timezone(&Utc),
                market_id: MarketId::new(row.get(1)?),
                side: Side::from_str(&row.get::<_, String>(2)?).unwrap(),
                quantity: Decimal::from_str(&row.get::<_, String>(3)?).unwrap(),
                price: Decimal::from_str(&row.get::<_, String>(4)?).unwrap(),
                pnl: Decimal::from_str(&row.get::<_, String>(5)?).unwrap(),
            })
        })?;

        trades.collect::<Result<Vec<_>, _>>().map_err(Into::into)
    }
}

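Rationale:

- SQLite for simplicity and portability (no external database)
- RFC 3339 timestamps for human readability; they also sort chronologically as text, provided every row is written with the same UTC offset and the same fractional-second precision
- Decimal stored as text to preserve precision
- Indexed for time-range queries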
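The sorting point depends on how the timestamp strings are formatted, which the following std-only sketch illustrates. `sorted_as_text` is a hypothetical helper that orders strings byte by byte, the same way SQLite compares TEXT values under its default BINARY collation; the timestamps are made-up examples.

```rust
/// Returns the input sorted as plain text (byte-wise), which is how a TEXT
/// column is ordered under SQLite's default BINARY collation.
pub fn sorted_as_text(stamps: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = stamps.iter().map(|s| s.to_string()).collect();
    out.sort();
    out
}

/// Same offset and same precision: text order equals chronological order.
pub fn uniform_example() -> Vec<String> {
    sorted_as_text(&[
        "2026-01-21T10:00:01.000+00:00",
        "2026-01-21T09:59:59.250+00:00",
        "2026-01-21T10:00:00.500+00:00",
    ])
}

/// Mixed precision with a `Z` suffix: the later instant sorts first,
/// because '.' compares lower than 'Z'.
pub fn mixed_precision_example() -> Vec<String> {
    sorted_as_text(&["2026-01-21T10:00:00Z", "2026-01-21T10:00:00.500Z"])
}
```

In the mixed case the half-second-later timestamp lands before the earlier one, so a `timestamp >= ?1` range query could misbehave. Writing every row with one fixed format, for example chrono's `to_rfc3339_opts` with a fixed `SecondsFormat`, avoids this.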

6. Data Replayer

Replay historical market data for backtesting:

use std::sync::Arc;
use std::time::Instant;

use chrono::{DateTime, Utc};

pub struct DataReplayer {
    storage: Arc<TradeStorage>,
    clock: Arc<SimulatedClock>,
    speed: f64,  // 1.0 = real-time, 10.0 = 10x faster, f64::INFINITY (or <= 0.0) = no pacing
}

impl DataReplayer {
    /// Assumes the SimulatedClock was created at `from`.
    pub async fn replay(
        &self,
        from: DateTime<Utc>,
        to: DateTime<Utc>,
        handler: impl Fn(MarketData) + Send,
    ) -> Result<ReplayStats, ReplayError> {
        // Start the wall-clock timer before any work so actual_duration is meaningful
        let started = Instant::now();
        let events = self.storage.query_market_data(from, to)?;
        let mut last_timestamp = from;
        let mut event_count = 0;

        for event in events {
            // Advance simulated clock
            let elapsed = event.timestamp - last_timestamp;
            self.clock.advance(elapsed);

            // Apply speed factor for real-time pacing (optional)
            if self.speed.is_finite() && self.speed > 0.0 {
                // Negative gaps (out-of-order events) fall back to a zero sleep
                let sleep_duration = elapsed.to_std().unwrap_or_default();
                let adjusted = sleep_duration.div_f64(self.speed);
                tokio::time::sleep(adjusted).await;
            }

            last_timestamp = event.timestamp;
            event_count += 1;
            // The loop owns `event`, so it can be moved into the handler without cloning
            handler(event);
        }

        Ok(ReplayStats {
            events_replayed: event_count,
            time_range: to - from,
            actual_duration: started.elapsed(),
        })
    }
}

Rationale:

- Configurable replay speed (instant for backtests, slower for visualization)
- Clock advances event-by-event for determinism
- Handler pattern allows flexible processing
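The speed factor reduces to one small calculation, sketched here in isolation so it can be tested without a runtime. `pacing_delay` is a hypothetical helper, and treating non-finite or non-positive speeds as "no pacing" is an assumption that mirrors the guard in the replay loop.

```rust
use std::time::Duration;

/// Wall-clock delay for a gap of `elapsed` market time at the given replay speed.
/// Non-finite or non-positive speeds mean "no pacing" (replay as fast as possible).
pub fn pacing_delay(elapsed: Duration, speed: f64) -> Duration {
    if speed.is_finite() && speed > 0.0 {
        // try_from_secs_f64 rejects values that overflow Duration instead of
        // panicking; extremely slow speeds saturate at Duration::MAX.
        Duration::try_from_secs_f64(elapsed.as_secs_f64() / speed).unwrap_or(Duration::MAX)
    } else {
        Duration::ZERO
    }
}
```

A 10 second gap replayed at 10x waits one second, at 2x a 3 second gap waits 1.5 seconds, and with `f64::INFINITY` the delay is zero, which is the instant mode a backtest would use.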

7. Performance Metrics

Calculate strategy performance metrics:

pub struct PerformanceMetrics {
    trades: Vec<Trade>,
    equity_curve: Vec<(DateTime<Utc>, Decimal)>,
    risk_free_rate: Decimal,  // Annualized
}

impl PerformanceMetrics {
    pub fn from_trades(trades: Vec<Trade>, initial_capital: Decimal) -> Self {
        let mut equity = initial_capital;
        let mut equity_curve = vec![(trades.first().map(|t| t.timestamp).unwrap_or_else(Utc::now), equity)];

        for trade in &trades {
            equity += trade.pnl;
            equity_curve.push((trade.timestamp, equity));
        }

        Self {
            trades,
            equity_curve,
            risk_free_rate: Decimal::new(5, 2),  // 5% default
        }
    }

    /// Sharpe ratio (annualized)
    pub fn sharpe_ratio(&self) -> Option<Decimal> {
        let returns = self.daily_returns();
        if returns.len() < 2 {
            return None;
        }

        let mean = returns.iter().sum::<Decimal>() / Decimal::from(returns.len());
        let variance = returns.iter()
            .map(|r| (r - mean).powi(2))
            .sum::<Decimal>() / Decimal::from(returns.len() - 1);
        let std_dev = variance.sqrt()?;

        if std_dev.is_zero() {
            return None;
        }

        let daily_rf = self.risk_free_rate / Decimal::from(252);
        let excess_return = mean - daily_rf;
        let sharpe = (excess_return / std_dev) * Decimal::from(252).sqrt()?;

        Some(sharpe)
    }

    /// Maximum drawdown (percentage of the running peak)
    pub fn max_drawdown(&self) -> Decimal {
        let mut peak = Decimal::ZERO;
        let mut max_dd = Decimal::ZERO;

        for (_, equity) in &self.equity_curve {
            if *equity > peak {
                peak = *equity;
            }
            // No positive peak yet: dividing by a zero peak would panic
            if peak.is_zero() {
                continue;
            }
            let dd = (peak - *equity) / peak;
            if dd > max_dd {
                max_dd = dd;
            }
        }

        max_dd * Decimal::from(100)  // Return as percentage
    }

    /// Win rate (percentage of profitable trades)
    pub fn win_rate(&self) -> Decimal {
        if self.trades.is_empty() {
            return Decimal::ZERO;
        }

        let wins = self.trades.iter().filter(|t| t.pnl > Decimal::ZERO).count();
        Decimal::from(wins) / Decimal::from(self.trades.len()) * Decimal::from(100)
    }

    /// Profit factor (gross profit / gross loss)
    pub fn profit_factor(&self) -> Option<Decimal> {
        let gross_profit: Decimal = self.trades.iter()
            .filter(|t| t.pnl > Decimal::ZERO)
            .map(|t| t.pnl)
            .sum();
        let gross_loss: Decimal = self.trades.iter()
            .filter(|t| t.pnl < Decimal::ZERO)
            .map(|t| t.pnl.abs())
            .sum();

        if gross_loss.is_zero() {
            return None;
        }

        Some(gross_profit / gross_loss)
    }

    /// Total PnL
    pub fn total_pnl(&self) -> Decimal {
        self.trades.iter().map(|t| t.pnl).sum()
    }

    /// Trade count
    pub fn trade_count(&self) -> usize {
        self.trades.len()
    }

    fn daily_returns(&self) -> Vec<Decimal> {
        // Group equity by day and calculate daily returns
        let mut daily: HashMap<NaiveDate, Decimal> = HashMap::new();
        for (ts, equity) in &self.equity_curve {
            daily.insert(ts.date_naive(), *equity);
        }

        let mut dates: Vec<_> = daily.keys().cloned().collect();
        dates.sort();

        dates.windows(2)
            .filter_map(|w| {
                let prev = daily.get(&w[0])?;
                let curr = daily.get(&w[1])?;
                if prev.is_zero() {
                    None
                } else {
                    Some((curr - prev) / prev)
                }
            })
            .collect()
    }
}

Rationale:

- Standard financial metrics for strategy evaluation
- Annualized Sharpe ratio for comparability
- Max drawdown for risk assessment
- Profit factor for edge quantification
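Two of these metrics are easy to verify by hand, which the sketch below does with plain `f64` values instead of Decimal. `max_drawdown_pct` and `profit_factor` here are hypothetical free functions written for illustration, and the equity and PnL series are made-up numbers.

```rust
/// Maximum drawdown as a percentage of the running peak.
pub fn max_drawdown_pct(equity: &[f64]) -> f64 {
    let mut peak = 0.0_f64;
    let mut max_dd = 0.0_f64;
    for &e in equity {
        if e > peak {
            peak = e;
        }
        // Only measure drawdown once a positive peak exists
        if peak > 0.0 {
            max_dd = max_dd.max((peak - e) / peak);
        }
    }
    max_dd * 100.0
}

/// Gross profit divided by gross loss; None when there are no losing trades.
pub fn profit_factor(pnls: &[f64]) -> Option<f64> {
    let gross_profit: f64 = pnls.iter().filter(|p| **p > 0.0).sum();
    let gross_loss: f64 = pnls.iter().filter(|p| **p < 0.0).map(|p| p.abs()).sum();
    if gross_loss == 0.0 {
        None
    } else {
        Some(gross_profit / gross_loss)
    }
}
```

For an equity curve of 100, 120, 90, 110 the peak is 120 and the trough after it is 90, so the maximum drawdown is 30 / 120 = 25%. For per-trade PnL of +10, -5, +20, -10 the profit factor is 30 / 15 = 2.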

Architecture Diagram

┌────────────────────────────────────────────────────────────────────────────┐
│                              Mode Selection                                 │
│  ┌────────────┐    ┌────────────┐    ┌────────────┐                        │
│  │   Live     │    │   Paper    │    │  Backtest  │                        │
│  │  Trading   │    │  Trading   │    │            │                        │
│  └─────┬──────┘    └─────┬──────┘    └─────┬──────┘                        │
└────────┼─────────────────┼─────────────────┼───────────────────────────────┘
         │                 │                 │
         ▼                 ▼                 ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ ExchangeClient  │ │ Simulated       │ │ Simulated       │
│ (Polymarket/    │ │ ExchangeClient  │ │ ExchangeClient  │
│  Kalshi)        │ │ + RealClock     │ │ + SimulatedClock│
└────────┬────────┘ └────────┬────────┘ └────────┬────────┘
         │                   │                   │
         │                   │                   │
         ▼                   ▼                   ▼
┌────────────────────────────────────────────────────────────────────────────┐
│                           ExecutionActor                                    │
│  ┌──────────────────────────────────────────────────────────┐             │
│  │                    Saga State Machine                     │             │
│  │  Pending → Leg1Init → Leg1Filled → Leg2Init → Completed  │             │
│  └──────────────────────────────────────────────────────────┘             │
└────────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌────────────────────────────────────────────────────────────────────────────┐
│                           PositionTracker                                   │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                        │
│  │  Positions  │  │   Trades    │  │    PnL      │                        │
│  └─────────────┘  └─────────────┘  └─────────────┘                        │
└────────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌────────────────────────────────────────────────────────────────────────────┐
│                           TradeStorage (SQLite)                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                        │
│  │   trades    │  │ market_data │  │  positions  │                        │
│  └─────────────┘  └─────────────┘  └─────────────┘                        │
└────────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌────────────────────────────────────────────────────────────────────────────┐
│                         PerformanceMetrics                                  │
│  ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐                  │
│  │  Sharpe   │ │ Max DD    │ │ Win Rate  │ │  Profit   │                  │
│  │  Ratio    │ │           │ │           │ │  Factor   │                  │
│  └───────────┘ └───────────┘ └───────────┘ └───────────┘                  │
└────────────────────────────────────────────────────────────────────────────┘
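The top of the diagram amounts to a small lookup from mode to components, sketched below. `Mode`, `ClientKind`, `ClockKind`, `components_for`, and `orders_reach_exchange` are hypothetical names used only for illustration, and pairing live trading with the real clock is an assumption based on RealClock being described as the production clock.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Mode {
    Live,
    Paper,
    Backtest,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ClientKind {
    /// Real exchange client (Polymarket/Kalshi)
    Production,
    /// SimulatedExchangeClient
    Simulated,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ClockKind {
    /// RealClock (system time)
    Real,
    /// SimulatedClock (driven by replayed events)
    Simulated,
}

/// Components each mode places in front of the shared ExecutionActor.
pub fn components_for(mode: Mode) -> (ClientKind, ClockKind) {
    match mode {
        Mode::Live => (ClientKind::Production, ClockKind::Real),
        Mode::Paper => (ClientKind::Simulated, ClockKind::Real),
        Mode::Backtest => (ClientKind::Simulated, ClockKind::Simulated),
    }
}

/// Only the production client sends orders to an exchange.
pub fn orders_reach_exchange(mode: Mode) -> bool {
    components_for(mode).0 == ClientKind::Production
}
```

Everything below the client layer (ExecutionActor, PositionTracker, TradeStorage, PerformanceMetrics) is shared, so the mode only decides which client and clock are injected.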

Implementation Phases

Phase	Scope	Deliverables	Dependencies
1	Core Simulation	Clock trait, SimulatedExchangeClient, MatchingEngine	None
2	Position Tracking	PositionTracker, PnL calculation	Phase 1
3	Historical Data	TradeStorage, DataReplayer, DataRecorder	Phase 2
4	Analytics	PerformanceMetrics, Report generation	Phase 3

Consequences

Positive

Risk-free strategy evaluation - Test before deploying capital
Historical validation - Backtest against real market conditions
Performance tracking - Quantify strategy edge with standard metrics
Seamless integration - SimulatedExchangeClient implements existing trait
Deterministic replay - Same data produces same results

Negative

Simulation bias - Even Level 2 can't capture all real-world effects
Storage overhead - Historical data requires disk space
Complexity - Multiple code paths for live/paper/backtest modes
Overfitting risk - Backtesting can lead to curve-fitted strategies

Risks

Risk	Mitigation
Overfit to historical data	Use walk-forward analysis; validate on out-of-sample data
Simulation diverges from reality	Compare paper vs live performance; alert on divergence
Storage grows unbounded	Implement data retention policies; archive old data
Clock drift in paper trading	Use monotonic clock; log clock skew warnings
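The walk-forward mitigation can be sketched as a window generator over chronologically ordered samples: each window is tuned on a training slice and evaluated on the slice that immediately follows it. `walk_forward_windows` is a hypothetical helper, and sliding by one test window per step is one common choice rather than something this ADR prescribes.

```rust
use std::ops::Range;

/// Rolling (train, test) index windows over `n` chronologically ordered samples.
/// Each test range starts where its train range ends, so no test sample is ever
/// seen during the tuning step that precedes it.
pub fn walk_forward_windows(
    n: usize,
    train: usize,
    test: usize,
) -> Vec<(Range<usize>, Range<usize>)> {
    let mut windows = Vec::new();
    if train == 0 || test == 0 {
        return windows;
    }

    let mut start = 0;
    while start + train + test <= n {
        let train_range = start..start + train;
        let test_range = start + train..start + train + test;
        windows.push((train_range, test_range));
        start += test; // slide forward by one test window
    }
    windows
}
```

With 10 samples, a training length of 4, and a test length of 2, this yields three windows: train 0..4 / test 4..6, train 2..6 / test 6..8, and train 4..8 / test 8..10.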

Alternatives Considered

Alternative 1: External Backtesting Framework (Barter-rs)

Pro: Mature, feature-rich, well-tested
Con: Different architecture; requires adapting our actor model
Decision: Build internal solution that integrates with existing ExchangeClient trait

Alternative 2: Record-Replay Only (No Simulation)

Pro: Simpler; replays actual API responses
Con: Can't simulate new strategies; limited to historical execution
Decision: Support both simulation (for what-if) and replay (for debugging)

Alternative 3: Cloud-Based Backtesting Service

Pro: Scalable; offloads computation
Con: Latency; data privacy; vendor lock-in
Decision: Local execution for speed and privacy

Alternative 4: Always Level 3 Fidelity

Pro: Most accurate simulation
Con: Complex; requires queue position tracking; diminishing returns for arb strategies
Decision: Start with Level 1-2; architect for Level 3 extension

References

Barter-rs - Event-driven Rust backtesting framework
HFTBacktest - High-frequency trading backtester
Freqtrade - Python trading bot with paper trading
Hummingbot - Open-source market making bot
NautilusTrader - Institutional-grade backtesting
ADR-012: Performance Monitoring
ADR-013: Low-Latency Optimizations