Deploying to AWS us-east-1

How we built infrastructure-as-code with Terraform for deploying our trading system to AWS, including ECS Fargate, Aurora PostgreSQL, and ElastiCache Redis.

Why us-east-1?

Both Polymarket and Kalshi have infrastructure in the US East region. Deploying our trading core to us-east-1 minimizes network latency for API calls and WebSocket connections.

Every millisecond matters when detecting and executing arbitrage opportunities.

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                          us-east-1                               │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────────┐                                            │
│  │   CloudFront    │                                            │
│  └────────┬────────┘                                            │
│           │                                                      │
│  ┌────────▼────────┐     ┌──────────────────────────────────┐  │
│  │       ALB       │     │         Private Subnets           │  │
│  │   (public)      │     │  ┌───────────┐  ┌────────────┐   │  │
│  └────────┬────────┘     │  │  Trading  │  │  Telegram  │   │  │
│           │              │  │   Core    │  │    Bot     │   │  │
│           │              │  │ (4 vCPU)  │  │ (0.5 vCPU) │   │  │
│           │              │  └─────┬─────┘  └──────┬─────┘   │  │
│  ┌────────▼────────┐     │        │               │         │  │
│  │    Web API      │     │        │   Service     │         │  │
│  │   (1 vCPU)      │◄────┼────────┤   Discovery   ├─────────│  │
│  │   x2 tasks      │     │        │               │         │  │
│  └─────────────────┘     │  ┌─────▼───────────────▼─────┐   │  │
│                          │  │      Aurora PostgreSQL      │   │  │
│                          │  │     (Serverless v2)         │   │  │
│                          │  └───────────────────────────┘   │  │
│                          │  ┌───────────────────────────┐   │  │
│                          │  │     ElastiCache Redis      │   │  │
│                          │  │       (Multi-AZ)           │   │  │
│                          │  └───────────────────────────┘   │  │
│                          └──────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Terraform Module Structure

We organized infrastructure into reusable modules:

infrastructure/terraform/
├── main.tf           # Root module, wires everything together
├── variables.tf      # Input variables
├── outputs.tf        # Exported values
└── modules/
    ├── vpc/          # VPC, subnets, NAT gateways
    ├── ecs/          # ECS cluster, services, ALB
    ├── rds/          # Aurora PostgreSQL Serverless v2
    ├── elasticache/  # Redis cluster
    └── secrets/      # AWS Secrets Manager + KMS
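
The root main.tf mostly wires module outputs into module inputs. As a sketch (the output and variable names here are illustrative, not the modules' exact interfaces):

module "ecs" {
  source = "./modules/ecs"

  project_name       = var.project_name
  environment        = var.environment
  vpc_id             = module.vpc.vpc_id
  private_subnet_ids = module.vpc.private_subnet_ids
  public_subnet_ids  = module.vpc.public_subnet_ids
}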

VPC Module

Multi-AZ setup with public and private subnets:

module "vpc" {
  source = "./modules/vpc"

  project_name       = var.project_name
  environment        = var.environment
  vpc_cidr           = "10.0.0.0/16"
  availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
}

ECS tasks run in the private subnets and the ALB in the public subnets; NAT gateways give the tasks outbound internet access to the exchange APIs.
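
A simplified sketch of what the module can look like inside, with per-AZ subnets carved out via cidrsubnet (resource and variable names here are illustrative):

resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
}

resource "aws_subnet" "public" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true
}

resource "aws_subnet" "private" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index + 100)
  availability_zone = var.availability_zones[count.index]
}

# NAT gateways sit in the public subnets and give the private subnets
# outbound access to the exchange APIs.
resource "aws_eip" "nat" {
  count  = length(var.availability_zones)
  domain = "vpc"
}

resource "aws_nat_gateway" "main" {
  count         = length(var.availability_zones)
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
}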

ECS Module

Three services with different resource profiles:

Service        CPU       Memory   Count   Purpose
Trading Core   4 vCPU    8 GB     1       Arbitrage detection
Telegram Bot   0.5 vCPU  1 GB     1       User interface
Web API        1 vCPU    2 GB     2       REST/gRPC access

Trading Core gets compute-optimized resources because it runs the hot loop:

resource "aws_ecs_task_definition" "trading_core" {
  family                   = "${local.name_prefix}-trading-core"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = 4096  # 4 vCPU
  memory                   = 8192  # 8 GB

  container_definitions = jsonencode([{
    name  = "trading-core"
    image = var.trading_core_image

    secrets = [
      { name = "POLY_PRIVATE_KEY", valueFrom = "..." },
      { name = "KALSHI_PRIVATE_KEY", valueFrom = "..." }
    ]
  }])
}
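
The task definition is attached to a Fargate service in the private subnets. As a sketch (the cluster, subnet, and security group references here are illustrative):

resource "aws_ecs_service" "trading_core" {
  name            = "${local.name_prefix}-trading-core"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.trading_core.arn
  desired_count   = 1
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = var.private_subnet_ids     # illustrative variable
    security_groups  = [var.trading_core_sg_id]   # illustrative variable
    assign_public_ip = false
  }
}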

Secrets Management

Credentials are stored in AWS Secrets Manager with KMS encryption:

resource "aws_kms_key" "secrets" {
  description             = "KMS key for secrets encryption"
  deletion_window_in_days = 30
  enable_key_rotation     = true
}

resource "aws_secretsmanager_secret" "exchange_credentials" {
  name       = "${local.name_prefix}/exchange-credentials"
  kms_key_id = aws_kms_key.secrets.arn
}

ECS tasks have IAM permissions to read secrets at startup. Secrets never touch disk.
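
As a sketch, the policy attached to the task execution role can look like this (the role name here is illustrative):

# Allow the ECS task execution role to fetch the exchange credentials
# and decrypt them with the dedicated KMS key.
data "aws_iam_policy_document" "read_secrets" {
  statement {
    actions   = ["secretsmanager:GetSecretValue"]
    resources = [aws_secretsmanager_secret.exchange_credentials.arn]
  }

  statement {
    actions   = ["kms:Decrypt"]
    resources = [aws_kms_key.secrets.arn]
  }
}

resource "aws_iam_role_policy" "task_execution_read_secrets" {
  name   = "read-exchange-secrets"
  role   = aws_iam_role.task_execution.id   # illustrative role name
  policy = data.aws_iam_policy_document.read_secrets.json
}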

Database: Aurora Serverless v2

Auto-scaling PostgreSQL for variable workloads:

resource "aws_rds_cluster" "main" {
  cluster_identifier = "${local.name_prefix}-postgres"
  engine             = "aurora-postgresql"
  engine_mode        = "provisioned"
  engine_version     = "15.4"
  database_name      = "arbiter"

  serverlessv2_scaling_configuration {
    min_capacity = 0.5   # Floor of 0.5 ACU when idle (not zero)
    max_capacity = 16    # Scale up under load
  }
}

Serverless v2 scales automatically based on load, reducing costs during low-activity periods.
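
One detail that is easy to miss: a Serverless v2 cluster still needs at least one cluster instance, created with the special db.serverless instance class. A sketch, building on the cluster above (the identifier is illustrative):

# Serverless v2 capacity is attached to the cluster via a "db.serverless" instance.
resource "aws_rds_cluster_instance" "main" {
  cluster_identifier = aws_rds_cluster.main.id
  identifier         = "${local.name_prefix}-postgres-1"
  instance_class     = "db.serverless"
  engine             = aws_rds_cluster.main.engine
  engine_version     = aws_rds_cluster.main.engine_version
}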

GitHub Actions CI/CD

Two workflows handle CI and deployment:

CI Workflow (ci.yml)

Runs on every push:

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo fmt --check
      - run: cargo clippy -- -D warnings

  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo test --all-features

  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo build --release

  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo install cargo-audit
      - run: cargo audit

Deploy Workflow (deploy.yml)

Triggered by version tags:

on:
  push:
    tags: ['v*']

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production
    steps:
      - name: Build and push images
        run: |
          docker build -t $ECR_REPO:$TAG ./arbiter-engine
          docker push $ECR_REPO:$TAG

      - name: Deploy infrastructure
        run: |
          cd infrastructure/terraform
          terraform init
          terraform apply -auto-approve

      - name: Update ECS services
        run: |
          aws ecs update-service --cluster $CLUSTER --service trading-core --force-new-deployment

Security Considerations

Layer       Protection
Network     Private subnets, security groups
Secrets     KMS encryption, IAM policies
Database    RLS, encrypted at rest
Container   ECR image scanning
API         JWT authentication, rate limiting

Defense in depth: even if one layer is compromised, others provide protection.
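
The Container row relies on ECR scanning images as the deploy workflow pushes them. A sketch of that repository configuration (the repository name is illustrative):

resource "aws_ecr_repository" "trading_core" {
  name                 = "arbiter/trading-core"
  image_tag_mutability = "IMMUTABLE"   # version tags can't be overwritten

  image_scanning_configuration {
    scan_on_push = true                # scan every image as it lands
  }
}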

Cost Optimization

Component     Strategy
ECS           Fargate Spot for non-critical services
Aurora        Serverless v2 scales down to a 0.5 ACU floor when idle
NAT Gateway   Single NAT gateway for dev environments
Secrets       Rotation reduces the breach window

Production uses dedicated NAT gateways per AZ for high availability.
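
One way to express that split is to make the NAT gateway count depend on the environment. A sketch that adapts the NAT gateway resource from the VPC sketch above (the single_nat_gateway variable is illustrative, and the matching aws_eip count changes the same way):

variable "single_nat_gateway" {
  description = "Use one shared NAT gateway (dev) instead of one per AZ (prod)"
  type        = bool
  default     = false
}

locals {
  nat_gateway_count = var.single_nat_gateway ? 1 : length(var.availability_zones)
}

resource "aws_nat_gateway" "main" {
  count         = local.nat_gateway_count
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
}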

Verification

# Validate Terraform configuration
terraform validate

# Plan changes
terraform plan -out=tfplan

# Apply infrastructure
terraform apply tfplan

# Verify services are running
aws ecs describe-services --cluster arbiter-prod-cluster --services trading-core

Lessons Learned

  1. Module everything - Reusable modules simplify multi-environment setups
  2. Secrets rotation - Build in rotation from day one
  3. Serverless v2 - Aurora's new mode is genuinely useful
  4. Service discovery - ECS service discovery via AWS Cloud Map simplifies internal communication (see the sketch after this list)
  5. Tag-based deploys - Version tags make rollback straightforward
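
For the service discovery point, a sketch of the Cloud Map namespace and one service registration (the arbiter.internal name and the vpc_id output are illustrative):

resource "aws_service_discovery_private_dns_namespace" "internal" {
  name = "arbiter.internal"
  vpc  = module.vpc.vpc_id
}

resource "aws_service_discovery_service" "trading_core" {
  name = "trading-core"   # reachable as trading-core.arbiter.internal

  dns_config {
    namespace_id = aws_service_discovery_private_dns_namespace.internal.id

    dns_records {
      ttl  = 10
      type = "A"
    }
  }
}

Each ECS service then points at its registration through a service_registries block.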

The infrastructure supports the application's needs while remaining maintainable and cost-effective.