ADR 007: Execution State Machine (Saga Pattern)
Status
Accepted
Context
Cross-platform arbitrage requires placing orders on both Polymarket (blockchain-based) and Kalshi (a traditional exchange) as close to simultaneously as possible. True atomicity across the two venues is impossible because each confirms orders independently:
- Polymarket: transaction finality depends on the Polygon blockchain (~2 s block time)
- Kalshi: REST API with independent order matching
If Leg 1 succeeds but Leg 2 fails, the system is left holding an unhedged directional position that must be managed. This "legging risk" is the primary execution hazard in cross-platform arbitrage.
Decision
Implement a Saga Pattern using an explicit state machine to manage distributed transaction state and handle compensation for partial failures.
State Machine
enum ArbState {
    Pending,
    Leg1Initiated(OrderId),
    Leg1Filled(FillDetails),
    Leg2Initiated(OrderId, FillDetails), // Preserves Leg 1 details for recovery
    Completed,
    Failed(Reason),
    Compensating(HedgeStrategy),
    Compensated,
}
State Transitions
Pending
   │
   ▼ (submit leg 1)
Leg1Initiated
   │
   ├─▶ Failed (leg 1 rejected/timeout)
   │
   ▼ (leg 1 confirmed)
Leg1Filled
   │
   ▼ (submit leg 2)
Leg2Initiated
   │
   ├─▶ Compensating (leg 2 failed) ──▶ Compensated
   │
   ▼ (leg 2 confirmed)
Completed
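The transitions above can be sketched as a single pure function over (state, event) pairs. This is a minimal illustration, not the project's implementation: the `Event` enum, the `transition` function, and the bodies of `OrderId`, `FillDetails`, `Reason`, and `HedgeStrategy` are assumptions made so the example compiles on its own. Only the `ArbState` variants come from this ADR.

```rust
// Hypothetical supporting types: the ADR names them but does not define them.
#[derive(Debug, Clone, PartialEq)]
pub struct OrderId(pub String);

#[derive(Debug, Clone, PartialEq)]
pub struct FillDetails {
    pub contracts: u64,
    pub price_cents: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Reason {
    Rejected,
    Timeout,
}

#[derive(Debug, Clone, PartialEq)]
pub enum HedgeStrategy {
    MarketDump,
    LimitChase,
    HoldForManual,
}

// State enum as given in this ADR, with derives added for the example.
#[derive(Debug, Clone, PartialEq)]
pub enum ArbState {
    Pending,
    Leg1Initiated(OrderId),
    Leg1Filled(FillDetails),
    Leg2Initiated(OrderId, FillDetails),
    Completed,
    Failed(Reason),
    Compensating(HedgeStrategy),
    Compensated,
}

// Hypothetical events, one per labelled edge in the diagram.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Leg1Submitted(OrderId),
    Leg1Confirmed(FillDetails),
    Leg1Failed(Reason),
    Leg2Submitted(OrderId),
    Leg2Confirmed,
    Leg2Failed(HedgeStrategy),
    HedgeDone,
}

/// Returns the next state, or hands back the rejected (state, event) pair
/// when the diagram has no edge for that combination.
pub fn transition(state: ArbState, event: Event) -> Result<ArbState, (ArbState, Event)> {
    use ArbState::*;
    match (state, event) {
        (Pending, Event::Leg1Submitted(id)) => Ok(Leg1Initiated(id)),
        (Leg1Initiated(_), Event::Leg1Confirmed(fill)) => Ok(Leg1Filled(fill)),
        (Leg1Initiated(_), Event::Leg1Failed(reason)) => Ok(Failed(reason)),
        // The Leg 1 fill is carried forward so it survives a Leg 2 failure.
        (Leg1Filled(fill), Event::Leg2Submitted(id)) => Ok(Leg2Initiated(id, fill)),
        (Leg2Initiated(..), Event::Leg2Confirmed) => Ok(Completed),
        (Leg2Initiated(..), Event::Leg2Failed(strategy)) => Ok(Compensating(strategy)),
        (Compensating(_), Event::HedgeDone) => Ok(Compensated),
        (state, event) => Err((state, event)),
    }
}
```

Rejecting unknown pairs instead of ignoring them means an out-of-order confirmation surfaces as an error rather than silently moving the saga into the wrong state. In this sketch the caller picks the `HedgeStrategy` while the Leg 1 fill is still available in `Leg2Initiated`.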
Key Safety Properties
- Leg 1 fill details preserved: Leg2Initiated carries the Leg 1 FillDetails, so the full Leg 1 position is known if Leg 2 fails
- Compensation strategies: market dump, limit chase, or hold for manual intervention
- Audit trail: every state transition is logged with a timestamp
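As an illustration of the audit-trail property, the sketch below appends one timestamped record per state transition. `TransitionRecord` and `AuditTrail` are hypothetical names, states are stored as plain strings to keep the example self-contained, and an in-memory `Vec` stands in for whatever durable log the real system uses.

```rust
use std::time::SystemTime;

/// One entry per state transition (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct TransitionRecord {
    pub at: SystemTime,
    pub from: String,
    pub to: String,
}

/// Append-only, in-memory trail; a real system would persist this durably.
#[derive(Debug, Default)]
pub struct AuditTrail {
    records: Vec<TransitionRecord>,
}

impl AuditTrail {
    /// Stamps the transition with the current wall-clock time and appends it.
    pub fn record(&mut self, from: &str, to: &str) {
        self.records.push(TransitionRecord {
            at: SystemTime::now(),
            from: from.to_string(),
            to: to.to_string(),
        });
    }

    /// Read-only view of the trail, in insertion order.
    pub fn records(&self) -> &[TransitionRecord] {
        &self.records
    }
}
```

Exposing only `record` and a read-only slice keeps the trail append-only from the caller's point of view, which is what a post-mortem needs.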
Alternatives Considered
| Approach | Pros | Cons | Verdict |
|---|---|---|---|
| Saga Pattern | Explicit failure handling, auditable | Complex state machine | Chosen |
| Fire-and-forget | Simple | Unmanaged exposure risk | Rejected |
| 2PC (Two-Phase Commit) | Theoretical atomicity | Impossible across blockchain/REST | Rejected |
| Optimistic Execution | Lower latency | No recovery path on failure | Rejected |
Consequences
Positive
- Explicit handling of all failure modes
- Position exposure always known and manageable
- Full audit trail for post-mortem analysis
- Compensation logic can be tested in isolation
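To illustrate testing compensation logic in isolation: if strategy selection is a pure function, it can be checked without touching either venue. The policy below is entirely hypothetical (this ADR does not specify how a strategy is chosen), as are `choose_hedge` and its parameters. Only the three strategy names come from this document.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum HedgeStrategy {
    MarketDump,
    LimitChase,
    HoldForManual,
}

/// Hypothetical policy, for illustration only:
/// - no bid on the book: hold for manual intervention
/// - bid within `max_slippage_cents` of the entry price: market dump
/// - otherwise: chase with limit orders
pub fn choose_hedge(
    entry_price_cents: u64,
    best_bid_cents: Option<u64>,
    max_slippage_cents: u64,
) -> HedgeStrategy {
    match best_bid_cents {
        None => HedgeStrategy::HoldForManual,
        // saturating_sub treats a bid above the entry price as zero slippage.
        Some(bid) if entry_price_cents.saturating_sub(bid) <= max_slippage_cents => {
            HedgeStrategy::MarketDump
        }
        Some(_) => HedgeStrategy::LimitChase,
    }
}

#[test]
fn picks_strategy_from_book_state() {
    assert_eq!(choose_hedge(45, None, 2), HedgeStrategy::HoldForManual);
    assert_eq!(choose_hedge(45, Some(44), 2), HedgeStrategy::MarketDump);
    assert_eq!(choose_hedge(45, Some(40), 2), HedgeStrategy::LimitChase);
}
```

Because the function takes plain values and returns a plain enum, the test needs no mocks for Polymarket or Kalshi.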
Negative
- Execution logic significantly more complex
- Higher latency due to sequential leg confirmation
- More code paths to test and maintain
Neutral
- Requires monitoring dashboard for stuck compensations
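One way such a dashboard could flag stuck compensations is an age check over sagas currently in the Compensating state. The sketch below assumes a hypothetical `ActiveCompensation` record and a caller-supplied threshold. Neither is defined in this ADR.

```rust
use std::time::{Duration, Instant};

/// A saga currently in the Compensating state (hypothetical record).
pub struct ActiveCompensation {
    pub arb_id: u64,
    pub started_at: Instant,
}

/// Returns the ids of compensations that have been running longer than `max_age`.
pub fn stuck_compensations(
    active: &[ActiveCompensation],
    now: Instant,
    max_age: Duration,
) -> Vec<u64> {
    active
        .iter()
        // saturating_duration_since yields zero if started_at is after `now`.
        .filter(|c| now.saturating_duration_since(c.started_at) > max_age)
        .map(|c| c.arb_id)
        .collect()
}
```

Passing `now` in as a parameter, rather than calling `Instant::now()` inside the function, keeps the check deterministic under test.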
References
- Saga Pattern - Pattern documentation
- Architecture: Saga Pattern - Implementation details
- FRS Section 3.2.3 - Execution safety requirements
Linked Requirements
- NFR-ARCH-004: Implement saga pattern for distributed transactions
- FR-ARB-012: Implement Saga Pattern for distributed transactions
- FR-ARB-013: Automated Hedge Logic if Leg 2 fails