
Market Discovery

Automated discovery and matching of market pairs between Polymarket and Kalshi.

Overview

Market discovery automates the process of finding equivalent markets across both platforms using a five-phase matching pipeline:

| Phase | Technique | Purpose |
|-------|-----------|---------|
| 1 | Text Similarity | Fast initial filtering using Jaccard + Levenshtein |
| 2 | Fingerprint Matching | Structured field comparison (entity, date, threshold) |
| 3 | Embedding Matching | Semantic similarity via vector embeddings |
| 4 | LLM Verification | AI reasoning for uncertain cases |
| 5 | Feedback Learning | Continuous improvement from human decisions |
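Phase 1's text filter can be approximated with Jaccard similarity over word tokens. The sketch below is illustrative only (the pipeline also blends in Levenshtein distance, omitted here) and is not the engine's actual implementation:

```rust
use std::collections::HashSet;

/// Jaccard similarity over lowercased word tokens (Phase 1 sketch).
/// The real pipeline also incorporates Levenshtein distance.
fn jaccard(a: &str, b: &str) -> f64 {
    let (la, lb) = (a.to_lowercase(), b.to_lowercase());
    let ta: HashSet<&str> = la.split_whitespace().collect();
    let tb: HashSet<&str> = lb.split_whitespace().collect();
    let inter = ta.intersection(&tb).count() as f64;
    let union = ta.union(&tb).count() as f64;
    if union == 0.0 { 1.0 } else { inter / union }
}

fn main() {
    let s = jaccard(
        "Will BTC reach $100k by 2026?",
        "Bitcoin reaches $100,000 by 2026?",
    );
    // Low token overlap: this is why later phases (fingerprints,
    // embeddings) are needed to confirm matches like this one.
    println!("jaccard = {s:.2}");
}
```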

The system:

  1. Scans both platforms for active markets
  2. Generates structured fingerprints for each market
  3. Uses hybrid scoring (fingerprint + embedding + text) to find matches
  4. Escalates uncertain cases to LLM verification
  5. Presents candidates for human review and approval
  6. Learns from decisions to improve future matching

Human Approval Required

All discovered market pairs require human confirmation before use in trading. This is a safety-critical requirement (FR-MD-003) to prevent automated mapping errors like confusing "Trump" with "Trump Jr."

Prerequisites

Build with Discovery Feature

Market discovery is an optional feature. Build with the discovery feature flag:

cargo build --manifest-path arbiter-engine/Cargo.toml --features discovery

No Credentials Required

Discovery commands don't require exchange credentials. They use public market data APIs:

  • Polymarket: Gamma API (public)
  • Kalshi: /v2/markets endpoint (public)

Quick Start

1. Run a Discovery Scan

Scan both platforms for matching markets:

cargo run --manifest-path arbiter-engine/Cargo.toml --features discovery -- \
  --discover-markets

This fetches markets from both platforms, runs similarity matching, and stores candidates in the discovery database (discovery.db by default).

2. Review Pending Candidates

List candidates awaiting review:

cargo run --manifest-path arbiter-engine/Cargo.toml --features discovery -- \
  --list-candidates --status pending

Example output:

Found 3 candidate(s):

1. ID: 550e8400-e29b-41d4-a716-446655440000
   Status: Pending
   Polymarket: Will BTC reach $100k by 2026? (poly-btc-100k)
   Kalshi: Bitcoin reaches $100,000 by 2026? (KXBTC-100K-2026)
   Similarity: 85.2%

2. ID: 6ba7b810-9dad-11d1-80b4-00c04fd430c8
   Status: Pending
   Polymarket: Will there be a government shutdown in 2026? (poly-shutdown-2026)
   Kalshi: Government shutdown exceeding 24 hours in 2026? (KXGOV-SHUTDOWN-2026)
   Similarity: 72.1%
   Warnings:
      - Settlement criteria may differ: "shutdown" vs "shutdown exceeding 24 hours"

3. Approve or Reject Candidates

Approve (No Warnings)

cargo run --manifest-path arbiter-engine/Cargo.toml --features discovery -- \
  --approve-candidate 550e8400-e29b-41d4-a716-446655440000

Approve with Warning Acknowledgment

If a candidate has semantic warnings, you must explicitly acknowledge them:

cargo run --manifest-path arbiter-engine/Cargo.toml --features discovery -- \
  --approve-candidate 6ba7b810-9dad-11d1-80b4-00c04fd430c8 \
  --acknowledge-warnings

Reject with Reason

Rejections require a documented reason for the audit trail:

cargo run --manifest-path arbiter-engine/Cargo.toml --features discovery -- \
  --reject-candidate 6ba7b810-9dad-11d1-80b4-00c04fd430c8 \
  --reason "Different settlement criteria - Polymarket uses announcement, Kalshi uses actual duration"

Workflow

graph TB
    A[Run Discovery Scan] --> B[Candidates Generated]
    B --> C{Review Candidate}
    C -->|No Warnings| D[Approve]
    C -->|Has Warnings| E{Acknowledge?}
    E -->|Yes| F[Approve with Acknowledgment]
    E -->|No| G[Reject with Reason]
    C -->|Invalid Match| G
    D --> H[Verified Mapping Created]
    F --> H
    G --> I[Logged to Audit Trail]
    H --> J[Available for Trading]

Understanding Similarity Scores

Hybrid Scoring Algorithm (Phases 2-3)

The matching algorithm uses a hybrid approach combining multiple signals:

| Component | Weight | Description |
|-----------|--------|-------------|
| Fingerprint Score | 50% | Structured field matching (entity, date, threshold, outcome) |
| Embedding Similarity | 40% | Semantic vector similarity |
| Text Similarity | 10% | Jaccard + Levenshtein fallback |

final_score = 0.50 × fingerprint + 0.40 × embedding + 0.10 × text
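As a minimal sketch (assuming each component score is already normalized to [0, 1], and using the documented default weights), the blend is a plain weighted sum:

```rust
/// Hybrid match score (Phases 2-3 sketch). Weights mirror the
/// documented defaults; component scores are assumed in [0, 1].
fn hybrid_score(fingerprint: f64, embedding: f64, text: f64) -> f64 {
    0.50 * fingerprint + 0.40 * embedding + 0.10 * text
}

fn main() {
    // Example: strong fingerprint, good embedding, weak text overlap.
    let score = hybrid_score(0.90, 0.85, 0.40);
    println!("final_score = {score:.3}");
}
```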

Fingerprint Field Weights (Phase 2)

| Field | Weight | Comparison Method |
|-------|--------|-------------------|
| Entity | 30% | Exact or alias match (e.g., "BTC" → "Bitcoin") |
| Date | 25% | Year/quarter/month overlap with ±7 day tolerance |
| Threshold | 20% | Numeric comparison with 5% tolerance |
| Outcome | 15% | Binary vs multi-choice structure |
| Source | 10% | Resolution source match |
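The field weights and the 5% threshold tolerance can be sketched as follows. This is illustrative, not the engine's code; the `FieldScores` struct and function names are invented for the example, and each per-field score is assumed in [0, 1]:

```rust
/// Per-field fingerprint comparison results (illustrative struct).
struct FieldScores {
    entity: f64,
    date: f64,
    threshold: f64,
    outcome: f64,
    source: f64,
}

/// Combine per-field scores using the documented default weights.
fn fingerprint_score(s: &FieldScores) -> f64 {
    0.30 * s.entity + 0.25 * s.date + 0.20 * s.threshold
        + 0.15 * s.outcome + 0.10 * s.source
}

/// Numeric thresholds match within the documented 5% relative tolerance.
fn thresholds_match(a: f64, b: f64) -> bool {
    (a - b).abs() <= 0.05 * a.abs().max(b.abs())
}

fn main() {
    // Everything matches except the resolution source.
    let s = FieldScores { entity: 1.0, date: 1.0, threshold: 1.0, outcome: 1.0, source: 0.0 };
    println!("fingerprint = {:.2}", fingerprint_score(&s));
    println!("$100k vs $99k within tolerance: {}", thresholds_match(100_000.0, 99_000.0));
}
```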

Decision Thresholds

| Score Range | Decision | Action |
|-------------|----------|--------|
| ≥ 0.85 | Auto-Approve | High confidence, minimal review |
| 0.70-0.85 | Escalate | LLM verification or human review |
| 0.60-0.70 | Review | Requires careful human inspection |
| < 0.60 | Auto-Reject | Below matching threshold |
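The decision bands above map to a simple branch on the final score. A minimal sketch (the `Decision` enum and boundaries follow the documented defaults; names are illustrative):

```rust
/// Decision bands for a hybrid match score (sketch of the
/// documented thresholds; not the engine's actual types).
#[derive(Debug, PartialEq)]
enum Decision {
    AutoApprove,
    Escalate,
    Review,
    AutoReject,
}

fn decide(score: f64) -> Decision {
    if score >= 0.85 {
        Decision::AutoApprove
    } else if score >= 0.70 {
        Decision::Escalate
    } else if score >= 0.60 {
        Decision::Review
    } else {
        Decision::AutoReject
    }
}

fn main() {
    // 85.2% similarity, as in the example candidate above.
    println!("{:?}", decide(0.852)); // AutoApprove
}
```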

Score Examples

| Score | Typical Matches |
|-------|-----------------|
| 90-100% | Identical entities, dates, and thresholds |
| 75-90% | Same event, minor differences (e.g., date tolerance) |
| 60-75% | Similar events, semantic warnings likely |
| <60% | Not matched (different events) |

Semantic Warnings

The system detects potential settlement differences:

| Warning Type | Example |
|--------------|---------|
| Conditional Language | "if X happens" vs "X will happen" |
| Time Thresholds | "shutdown" vs "shutdown exceeding 24 hours" |
| Resolution Source | "OPM announcement" vs "actual event" |
| Outcome Definitions | "reaches $100k" vs "closes above $100k" |

Always Review Warnings

Semantic warnings indicate potential differences in how markets resolve. Approving mismatched markets can result in:

  • One leg winning while the other loses
  • Significant financial loss
  • Unhedgeable positions

Phase 2: Fingerprint Matching

Fingerprint matching extracts structured fields from market titles and descriptions for precise comparison.

Market Fingerprint Structure

MarketFingerprint {
    entity: "Trump"              # Primary entity (person, crypto, event)
    event_type: Acquisition      # Type: PriceTarget, Election, Acquisition, etc.
    metric: { threshold: 100000 } # Numeric threshold if applicable
    resolution_window: 2026-01   # Resolution date/period
    outcome_type: Binary         # Binary or MultiOutcome
}

Entity Extraction

The system automatically extracts entities from market titles:

| Entity Type | Examples | Pattern |
|-------------|----------|---------|
| Person | Trump, Biden, Harris | Named individuals |
| Crypto | Bitcoin, BTC, ETH | Cryptocurrencies with alias resolution |
| PriceTarget | $100k, $50,000 | Numeric values with currency |
| Date | Q2 2026, June 2026 | Temporal references |
| Event | Super Bowl, Fed Meeting | Known event types |

Alias Resolution

Common aliases are automatically resolved:

| Alias | Canonical Name |
|-------|----------------|
| BTC | Bitcoin |
| ETH | Ethereum |
| 45 | Donald Trump |
| Fed | Federal Reserve |
| Pro Football Championship | Super Bowl |
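Alias resolution amounts to a canonicalizing lookup before entity comparison. A minimal sketch, assuming a simple exact-match table seeded with the examples above (function names are illustrative; the real store also tracks per-alias confidence):

```rust
use std::collections::HashMap;

/// Build a small alias table mirroring the documented examples.
fn default_aliases() -> HashMap<&'static str, &'static str> {
    [
        ("BTC", "Bitcoin"),
        ("ETH", "Ethereum"),
        ("45", "Donald Trump"),
        ("Fed", "Federal Reserve"),
        ("Pro Football Championship", "Super Bowl"),
    ]
    .into_iter()
    .collect()
}

/// Resolve an entity to its canonical name; unknown entities pass through.
fn canonical(aliases: &HashMap<&'static str, &'static str>, entity: &str) -> String {
    aliases
        .get(entity)
        .map(|s| s.to_string())
        .unwrap_or_else(|| entity.to_string())
}

fn main() {
    let aliases = default_aliases();
    println!("{}", canonical(&aliases, "BTC")); // Bitcoin
    println!("{}", canonical(&aliases, "Nvidia")); // Nvidia (no alias)
}
```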

View Fingerprints (Debug)

# Show fingerprint for a specific market
cargo run --features discovery -- --show-fingerprint --ticker "KXBTC-100K-2026"

# Output:
# Fingerprint for KXBTC-100K-2026:
#   Entity: Bitcoin
#   Event Type: PriceTarget
#   Threshold: $100,000 (direction: Above)
#   Resolution: 2026-12-31
#   Outcome: Binary

Phase 3: Embedding-Based Matching

Embeddings capture semantic similarity that fingerprints may miss. "Super Bowl" and "Pro Football Championship" have zero word overlap but high embedding similarity.

Embedding Pipeline

graph LR
    A[Market Title] --> B[Embedder]
    B --> C[Vector 256-dim]
    C --> D[VectorStore]
    D --> E[Nearest Neighbors]
    E --> F[Similar Markets]

Cosine Similarity

Embeddings are compared using cosine similarity:

similarity = dot(embedding_A, embedding_B) / (||A|| × ||B||)

Values range from -1 (opposite) to 1 (identical).
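A straightforward implementation of that formula, with a guard for zero-length vectors (this is a sketch of the standard computation, not the engine's code):

```rust
/// Cosine similarity between two equal-length embedding vectors
/// (Phase 3 sketch). Returns 0.0 if either vector has zero norm,
/// to avoid division by zero.
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "embedding dimensions must match");
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

fn main() {
    // Identical direction -> 1.0; orthogonal -> 0.0; opposite -> -1.0.
    println!("{:.2}", cosine_similarity(&[1.0, 0.0], &[1.0, 0.0]));
    println!("{:.2}", cosine_similarity(&[1.0, 0.0], &[0.0, 1.0]));
}
```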

Configuration

| Variable | Description | Default |
|----------|-------------|---------|
| DISCOVERY_EMBEDDING_DIM | Embedding dimension | 256 |
| DISCOVERY_EMBEDDING_BATCH_SIZE | Batch size for generation | 100 |

Phase 4: LLM Verification

For uncertain cases (score 0.60-0.85), the system can invoke LLM verification for human-level reasoning.

Escalation Rules

Cases are escalated to LLM when:

| Trigger | Condition |
|---------|-----------|
| Uncertain Score | Fingerprint score between 0.60 and 0.85 |
| Warnings Present | Semantic warnings detected |
| High Value | Market volume > $10,000 |
| Conflicting Signals | High entity match but low date match |
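Put together, the escalation check reduces to an OR over the triggers. A minimal sketch using the documented defaults (the conflicting-signals trigger is folded into `has_warnings` here for brevity; names are illustrative):

```rust
/// Whether a candidate is escalated to LLM review (Phase 4 sketch).
/// Uses the documented defaults: uncertain band 0.60-0.85 and a
/// $10,000 volume trigger. Conflicting-signal detection is omitted.
fn should_escalate(score: f64, has_warnings: bool, volume_usd: f64) -> bool {
    let uncertain = score >= 0.60 && score < 0.85;
    uncertain || has_warnings || volume_usd > 10_000.0
}

fn main() {
    // Confident score, no warnings, small market: no escalation needed.
    println!("{}", should_escalate(0.90, false, 500.0)); // false
    // Uncertain score always escalates.
    println!("{}", should_escalate(0.72, false, 500.0)); // true
}
```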

Escalation Tiers

| Tier | Model | Cost | Use Case |
|------|-------|------|----------|
| None | - | $0 | Score ≥ 0.85, no warnings |
| Haiku | Claude Haiku | ~$0.001 | Initial screening |
| Sonnet | Claude Sonnet | ~$0.01 | Complex resolution analysis |
| Human | Manual | - | LLM uncertain or conflicts detected |

Cost Management

LLM verification has a configurable daily budget:

# Set daily budget (default: $50/day)
export DISCOVERY_LLM_BUDGET=50.00

LLM Response Format

{
  "equivalent": true,
  "confidence": 0.92,
  "reasoning": "Both markets resolve based on BTC/USD spot price reaching $100,000",
  "warnings": [],
  "resolution_differences": []
}

Phase 5: Learning from Human Feedback

Every approval/rejection decision improves future matching accuracy.

Decision Logging

All decisions are logged with full context:

{
  "candidate_id": "550e8400-e29b-41d4-a716-446655440000",
  "decision": "approved",
  "fingerprint_score": 0.82,
  "embedding_score": 0.88,
  "llm_confidence": null,
  "escalation_level": "None",
  "category": "crypto",
  "entity_corrections": null
}

Alias Learning

When you approve a match with entity differences, the system learns new aliases:

# If you approve a match where:
#   Kalshi: "45 wins election"
#   Polymarket: "Trump wins election"
#
# The system learns: "45" → "Donald Trump"

Aliases are stored with confidence scores that increase with each confirmation.

Weight Optimization

The system periodically optimizes fingerprint field weights based on approval patterns:

# Initial weights:
entity: 0.30, date: 0.25, threshold: 0.20, outcome: 0.15, source: 0.10

# After 100 decisions, optimized weights might become:
entity: 0.35, date: 0.28, threshold: 0.18, outcome: 0.12, source: 0.07

Training Data Export

Export decisions for model training:

# Export to JSONL format for training
cargo run --features discovery -- --export-training-data --output training.jsonl

Configuration

Environment Variables

Core Settings

| Variable | Description | Default |
|----------|-------------|---------|
| DISCOVERY_SCAN_INTERVAL_SECS | Auto-scan interval | 3600 |
| DISCOVERY_SIMILARITY_THRESHOLD | Minimum match score | 0.6 |
| DISCOVERY_DB_PATH | Database file path | discovery.db |

Phase 2-3: Scoring

| Variable | Description | Default |
|----------|-------------|---------|
| DISCOVERY_AUTO_APPROVE_THRESHOLD | Score for auto-approval | 0.85 |
| DISCOVERY_AUTO_REJECT_THRESHOLD | Score for auto-rejection | 0.40 |
| DISCOVERY_FINGERPRINT_WEIGHT | Weight for fingerprint score | 0.50 |
| DISCOVERY_EMBEDDING_WEIGHT | Weight for embedding score | 0.40 |
| DISCOVERY_TEXT_WEIGHT | Weight for text similarity | 0.10 |

Phase 4: LLM Verification

| Variable | Description | Default |
|----------|-------------|---------|
| DISCOVERY_LLM_ENABLED | Enable LLM verification | false |
| DISCOVERY_LLM_BUDGET | Daily budget in USD | 50.00 |
| DISCOVERY_ESCALATION_LOW | Lower escalation threshold | 0.60 |
| DISCOVERY_ESCALATION_HIGH | Upper escalation threshold | 0.85 |

CLI Options

Discovery Commands

| Flag | Description |
|------|-------------|
| --discover-markets | Run a discovery scan |
| --list-candidates | List match candidates |
| --discovery-db <path> | Custom database path |
| --status <filter> | Filter: pending, approved, rejected, all |

Approval Commands

| Flag | Description |
|------|-------------|
| --approve-candidate <uuid> | Approve a candidate |
| --reject-candidate <uuid> | Reject a candidate |
| --acknowledge-warnings | Required to approve candidates with warnings |
| --reason <text> | Required when rejecting |

Debug Commands (Phase 2-5)

| Flag | Description |
|------|-------------|
| --show-fingerprint | Display fingerprint for a market |
| --test-match --kalshi <ticker> --poly <id> | Test matching between two markets |
| --evaluate-matching | Run evaluation on golden set |
| --export-training-data | Export decisions for training |

Database

Discovery data is stored in SQLite:

discovery.db
├── discovered_markets    # Cached market data from both platforms
├── candidates            # Match candidates with status
├── match_decisions       # Decision logging with scores (Phase 5)
├── learned_aliases       # Entity aliases from corrections (Phase 5)
├── embeddings            # Market embeddings (Phase 3)
└── audit_log             # All approval/rejection decisions

Decision Logging Schema (Phase 5)

CREATE TABLE match_decisions (
    id TEXT PRIMARY KEY,
    candidate_id TEXT NOT NULL,
    decision TEXT NOT NULL,          -- 'approved', 'rejected'
    fingerprint_score REAL,
    embedding_score REAL,
    llm_confidence REAL,
    escalation_level TEXT,           -- 'None', 'Haiku', 'Sonnet', 'Human'
    category TEXT,
    rejection_reason TEXT,
    entity_corrections TEXT,         -- JSON: {"old": "BTC", "new": "Bitcoin"}
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);

Custom Database Path

cargo run --manifest-path arbiter-engine/Cargo.toml --features discovery -- \
  --list-candidates --discovery-db /path/to/custom.db

Audit Trail

All decisions are logged for compliance:

{
  "timestamp": "2026-01-22T15:30:00Z",
  "action": "approve",
  "candidate_id": "550e8400-e29b-41d4-a716-446655440000",
  "polymarket_id": "poly-btc-100k",
  "kalshi_id": "KXBTC-100K-2026",
  "similarity_score": 0.852,
  "semantic_warnings": [],
  "acknowledged_warnings": false,
  "session_id": "abc123"
}

Best Practices

Regular Discovery

Run discovery scans regularly to find new market opportunities:

# Cron job example: scan every hour
0 * * * * cd /path/to/arbiter-bot && cargo run --features discovery -- --discover-markets

Review Before Major Events

Before high-impact events (elections, economic announcements), review pending candidates to ensure mappings are accurate.

Document Rejections

Always provide a clear rejection reason. Documented reasons:

  • Help future reviewers understand why a pair was rejected
  • Improve the matching algorithm over time
  • Maintain the compliance audit trail

Use Demo Environment First

Test the discovery workflow with --kalshi-demo to verify your review process before using production mappings.

Troubleshooting

"Discovery commands require the 'discovery' feature"

Rebuild with the feature flag:

cargo build --manifest-path arbiter-engine/Cargo.toml --features discovery

"Cannot approve: candidate has semantic warnings"

You must explicitly acknowledge warnings:

cargo run --features discovery -- \
  --approve-candidate <uuid> --acknowledge-warnings

"Rejection requires a reason"

Provide a reason with --reason:

cargo run --features discovery -- \
  --reject-candidate <uuid> --reason "Your reason here"

"Candidate not found"

Verify the UUID is correct:

cargo run --features discovery -- --list-candidates --status all