Closing ADR Gaps: Nonce Management, Risk Controls, and Key Rotation¶
Completing the remaining implementation gaps across ADRs 004, 005, 007, and 009 with thread-safe nonce management, risk manager actor, compensation executor, and key rotation support.
The Gap Analysis¶
After implementing the core architecture, a review revealed several gaps between documented ADRs and actual implementation:
| ADR | Gap Identified | Resolution |
|---|---|---|
| 004 | No thread-safe nonce management for Polymarket | NonceManager with atomics |
| 005 | No risk management actor | RiskManagerActor with message protocol |
| 007 | No compensation executor | CompensationExecutor with retry strategies |
| 009 | No key rotation support | KeyRotationManager with zero-downtime rotation |
Nonce Management (ADR-004)¶
Polymarket orders require monotonically increasing nonces. In a concurrent environment, this needs careful handling.
The Problem¶
// WRONG: Race condition
let nonce = self.nonce + 1;
self.nonce = nonce; // Another thread could read same value
The Solution¶
pub struct NonceManager {
nonces: RwLock<HashMap<String, Arc<AtomicU64>>>,
}
impl NonceManager {
pub async fn next_nonce(&self, address: &str) -> U256 {
let address_lower = address.to_lowercase();
// Get or create atomic counter for this address
let counter = {
let nonces = self.nonces.read().await;
if let Some(counter) = nonces.get(&address_lower) {
counter.clone()
} else {
drop(nonces);
let mut nonces = self.nonces.write().await;
let counter = Arc::new(AtomicU64::new(
Utc::now().timestamp_millis() as u64
));
nonces.insert(address_lower.clone(), counter.clone());
counter
}
};
// Atomic increment - guaranteed unique
U256::from(counter.fetch_add(1, Ordering::SeqCst))
}
}
Key properties:
- Atomic increment: fetch_add is a single CPU instruction
- Case-insensitive: Ethereum addresses normalized to lowercase
- Timestamp initialization: Prevents collisions after restart
Risk Manager Actor (ADR-005)¶
The actor model requires all state mutation through message passing. Risk checks are a natural fit.
Message Protocol¶
pub enum RiskMessage {
CheckRisk {
user_id: UserId,
opportunity: Opportunity,
respond_to: oneshot::Sender<Result<(), RiskViolation>>,
},
RecordFill {
user_id: UserId,
fill: FillDetails,
},
// ... other messages
}
Actor Implementation¶
impl RiskManagerActor {
pub async fn run(mut self) {
while let Some(msg) = self.receiver.recv().await {
match msg {
RiskMessage::CheckRisk { user_id, opportunity, respond_to } => {
let result = self.check_risk(&user_id, &opportunity);
let _ = respond_to.send(result);
}
RiskMessage::RecordFill { user_id, fill } => {
self.record_fill(&user_id, &fill);
}
}
}
}
}
Risk checks include: - Open position limits (per-user, per-market) - Exposure limits (max capital at risk) - Daily loss limits with cooldown periods - Order rate limiting
Compensation Executor (ADR-007)¶
The saga pattern requires compensation when Leg 2 fails after Leg 1 succeeds.
Strategy Selection¶
pub enum HedgeStrategy {
Hold(String), // Hold position, manual intervention
DumpLeg1, // Market sell Leg 1 immediately
RetryLeg2, // Retry original Leg 2
LimitChaseLeg2, // Chase price with limit orders
}
impl HedgeCalculator {
pub fn select_strategy(
leg1_fill: &FillDetails,
leg2_intent: Option<&Leg2Intent>,
retry_count: u32,
config: &HedgeConfig,
) -> HedgeStrategy {
match retry_count {
0 => HedgeStrategy::RetryLeg2,
1..=2 => HedgeStrategy::LimitChaseLeg2,
_ if config.allow_market_fallback => HedgeStrategy::DumpLeg1,
_ => HedgeStrategy::Hold("Max retries exceeded".into()),
}
}
}
Execution with Retries¶
impl CompensationExecutor {
pub async fn execute(&self, leg1_fill: &FillDetails, ...) -> CompensationResult {
let mut retry_count = 0;
loop {
let strategy = HedgeCalculator::select_strategy(..., retry_count, ...);
let hedge_order = HedgeCalculator::calculate(&strategy, leg1_fill);
match self.execute_hedge_order(&hedge_order).await {
Ok(fill) => return CompensationResult::Success(fill),
Err(_) if retry_count < self.config.max_retries => {
retry_count += 1;
continue;
}
Err(e) => return CompensationResult::Failed { reason: e, ... },
}
}
}
}
Key Rotation (ADR-009)¶
Zero-downtime key rotation requires careful version management.
Rotation Workflow¶
1. Add new key version (v2)
2. Activate v2 for new encryptions
3. Old credentials still decrypt with v1
4. Re-encrypt all credentials to v2
5. Retire v1 (disable for decrypt)
6. Remove v1
Implementation¶
pub struct KeyRotationManager {
stores: RwLock<HashMap<u32, Arc<CredentialStore>>>,
versions: RwLock<HashMap<u32, KeyVersionInfo>>,
active_version: RwLock<u32>,
}
impl KeyRotationManager {
pub fn encrypt(&self, user_id: &str, credential_id: &str, plaintext: &[u8])
-> Result<VersionedCredential, KeyRotationError>
{
let version = *self.active_version.read().unwrap();
let store = self.stores.read().unwrap()
.get(&version).cloned()
.ok_or(KeyRotationError::NoKeysAvailable)?;
let encrypted = store.encrypt(user_id, plaintext)?;
Ok(VersionedCredential {
key_version: version,
encrypted,
user_id: user_id.to_string(),
})
}
pub fn decrypt_versioned(&self, versioned: &VersionedCredential)
-> Result<Vec<u8>, KeyRotationError>
{
// Try recorded version first
if let Some(store) = self.stores.read().unwrap().get(&versioned.key_version) {
if let Ok(plaintext) = store.decrypt(&versioned.user_id, &versioned.encrypted) {
return Ok(plaintext);
}
}
// Try other active versions (migration fallback)
for (&version, info) in self.versions.read().unwrap().iter() {
if version == versioned.key_version || !info.active_for_decrypt {
continue;
}
// ... try decrypt with other versions
}
Err(KeyRotationError::NoKeysAvailable)
}
}
Security Scan Results¶
All new code passed security scanning:
| Issue Type | Count | Status |
|---|---|---|
| Hardcoded secrets | 0 | Pass |
| SQL injection | 0 | Pass |
| Command injection | 0 | Pass |
| Unsafe unwrap in prod | 3 | Reviewed (RwLock acceptable) |
The unwrap() calls on RwLock are acceptable because:
1. They only fail if a thread panicked while holding the lock
2. At that point the system is already in a bad state
3. This is idiomatic Rust for lock acquisition
Test Coverage¶
All implementations follow TDD with comprehensive tests:
test market::nonce::tests::test_concurrent_nonce_uniqueness ... ok
test actors::risk::tests::test_risk_check_within_limits ... ok
test execution::compensation::tests::test_compensation_retries ... ok
test security::key_rotation::tests::test_full_rotation_workflow ... ok
test result: ok. 198 passed; 0 failed
Conclusion¶
Closing these gaps ensures the architecture matches documentation:
- ADR-004: Thread-safe nonce management prevents order collisions
- ADR-005: Risk actor enforces limits through message passing
- ADR-007: Compensation executor implements full hedge strategy suite
- ADR-009: Key rotation enables zero-downtime credential key changes
All changes tracked via GitHub issues #18-21 and verified by council review.