Failover PSP Routing

Failover routing raises payment success rates by automatically rerouting a transaction to a backup payment service provider (PSP) when the current provider fails.

Failover Strategy Overview

Transaction Request
            │
            ▼
┌────────────────────────┐
│  Primary PSP (Stripe)  │
│                        │
│  Success Rate: 95%     │
│  Response: 150ms       │
└───────────┬────────────┘
            │
            ▼ (if failure)
┌────────────────────────┐
│  Backup PSP (Adyen)    │
│                        │
│  Success Rate: 92%     │
│  Response: 200ms       │
└───────────┬────────────┘
            │
            ▼ (if failure)
┌────────────────────────┐
│  Tertiary PSP (PayPal) │
│                        │
│  Success Rate: 88%     │
│  Response: 300ms       │
└────────────────────────┘
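
The chain above can be sketched as a loop that tries each provider in order and stops at the first success. This is a minimal sketch rather than the platform's implementation: `chargeWithFailover`, the `Attempt` callback, and the `ChargeOutcome` shape are hypothetical names introduced here, and the callback stands in for a real PSP API call. The soft/hard distinction described under Failure Types is ignored to keep it short.

```typescript
// Hypothetical callback: resolves to true when the PSP accepts the charge.
type Attempt = (psp: string) => Promise<boolean>;

interface ChargeOutcome {
  succeededWith: string | null; // PSP that processed the payment, if any
  tried: string[];              // PSPs attempted, in order
}

// Walks the chain in order; a rejected or false attempt moves on to the next PSP.
async function chargeWithFailover(chain: string[], attempt: Attempt): Promise<ChargeOutcome> {
  const tried: string[] = [];
  for (const psp of chain) {
    tried.push(psp);
    try {
      if (await attempt(psp)) {
        return { succeededWith: psp, tried };
      }
    } catch {
      // A thrown error counts as a failed attempt; fall through to the next PSP.
    }
  }
  return { succeededWith: null, tried };
}
```

Calling `chargeWithFailover(["stripe", "adyen", "paypal"], attempt)` would only reach PayPal if both Stripe and Adyen fail.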

Failover Decision Matrix

Routing Factors

Factor            Weight   Description
Success Rate      40%      Historical transaction success
Response Time     25%      Average API response time
Cost              20%      Transaction processing fees
Geographic Fit    10%      Regional optimization
Feature Support   5%       Required feature availability

Dynamic Scoring Algorithm

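// Normalization bounds (assumed values; tune them to your PSP pool).
const maxResponseTime = 1000; // ms
const maxFee = 0.05; // fee as a fraction of the transaction amount

// Weighted sum of the routing factors; the weights match the table above and sum to 1.
function calculatePSPScore(psp: PSP, transaction: Transaction): number {
  const successScore = psp.successRate * 0.4; // successRate as a fraction (0.95, not 95)
  const speedScore = Math.max(0, 1 - psp.avgResponseTime / maxResponseTime) * 0.25;
  const costScore = Math.max(0, 1 - psp.transactionFee / maxFee) * 0.2;
  const geoScore = psp.getGeographicScore(transaction.region) * 0.1;
  // Number() turns a boolean result into 0 or 1 before weighting.
  const featureScore = Number(psp.supportsFeatures(transaction.requirements)) * 0.05;

  return successScore + speedScore + costScore + geoScore + featureScore;
}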
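
To see the weights in action, the sketch below scores three providers with the success rates and response times from the overview diagram and sorts them. The `PspStats` shape, the normalization bounds (`MAX_RESPONSE_MS`, `MAX_FEE_RATE`), and the fee, geographic, and feature values are placeholder assumptions, not measured figures.

```typescript
interface PspStats {
  name: string;
  successRate: number;   // fraction, e.g. 0.95
  avgResponseMs: number;
  feeRate: number;       // fee as a fraction of the amount (placeholder below)
  geoScore: number;      // 0..1
  featureScore: number;  // 0..1
}

const WEIGHTS = { success: 0.4, speed: 0.25, cost: 0.2, geo: 0.1, feature: 0.05 };
const MAX_RESPONSE_MS = 1000; // assumed normalization bound
const MAX_FEE_RATE = 0.05;    // assumed normalization bound

function scorePsp(p: PspStats): number {
  const speed = Math.max(0, 1 - p.avgResponseMs / MAX_RESPONSE_MS);
  const cost = Math.max(0, 1 - p.feeRate / MAX_FEE_RATE);
  return (
    p.successRate * WEIGHTS.success +
    speed * WEIGHTS.speed +
    cost * WEIGHTS.cost +
    p.geoScore * WEIGHTS.geo +
    p.featureScore * WEIGHTS.feature
  );
}

// Highest score first; the input array is left untouched.
function rankPsps(psps: PspStats[]): PspStats[] {
  return [...psps].sort((a, b) => scorePsp(b) - scorePsp(a));
}

const sample: PspStats[] = [
  { name: "paypal", successRate: 0.88, avgResponseMs: 300, feeRate: 0.03, geoScore: 1, featureScore: 1 },
  { name: "stripe", successRate: 0.95, avgResponseMs: 150, feeRate: 0.03, geoScore: 1, featureScore: 1 },
  { name: "adyen", successRate: 0.92, avgResponseMs: 200, feeRate: 0.03, geoScore: 1, featureScore: 1 },
];
const ranking = rankPsps(sample).map((p) => p.name);
```

Because the placeholder fees are identical, the ranking here is decided by success rate and latency alone: stripe, then adyen, then paypal.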

Failure Detection

Failure Types

Soft Failures (Retry-able)

  • Network timeouts
  • Rate limiting (429)
  • Temporary server errors (5xx)
  • PSP maintenance windows
  • Temporary account issues

Hard Failures (No Retry)

  • Invalid API credentials (401)
  • Insufficient funds (402)
  • Card declined (402)
  • Fraud detection blocks
  • Permanent account suspension

Detection Mechanisms

  • HTTP status code analysis
  • Response time monitoring
  • Error pattern recognition
  • Health check failures
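
HTTP status code analysis can be sketched as a small classifier that maps a response to the soft and hard categories listed above. `classifyFailure` and `PspResult` are hypothetical names, and treating every status not listed as soft as a hard failure is an assumption of this sketch.

```typescript
type FailureKind = "soft" | "hard";

interface PspResult {
  status: number;     // HTTP status code returned by the PSP
  timedOut?: boolean; // set by the caller when no response arrived in time
}

// Returns null for a successful response, otherwise the failure kind.
function classifyFailure(result: PspResult): FailureKind | null {
  if (result.timedOut) return "soft";                            // network timeout
  if (result.status >= 200 && result.status < 300) return null;  // success
  if (result.status === 429 || result.status >= 500) return "soft"; // rate limit, server error
  return "hard"; // 401, 402, and (by assumption) everything else
}
```

A soft result would feed the retry logic, while a hard result would skip retries on that PSP.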

Intelligent Retry Logic

Exponential Backoff

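class RetryStrategy {
  // If attemptNumber starts at 0, the delays are roughly 1 s, 2 s, 4 s, ... up to maxDelay.
  calculateDelay(attemptNumber: number): number {
    const baseDelay = 1000; // 1 second
    const maxDelay = 30000; // 30 seconds
    const jitter = Math.random() * 0.1; // up to 10% added jitter

    const delay = Math.min(
      baseDelay * Math.pow(2, attemptNumber),
      maxDelay
    );

    // Re-apply the cap so jitter cannot push the delay past maxDelay.
    return Math.min(delay * (1 + jitter), maxDelay);
  }
}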

Retry Limits

  • Maximum 3 retries per PSP
  • Total timeout of 30 seconds
  • Immediate failover for hard failures
  • Circuit breaker activation after 5 consecutive failures
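
The retry limits can be combined with exponential backoff into a precomputed schedule. The sketch below is one possible reading: `backoffDelayMs` and `retrySchedule` are hypothetical helpers, the random source is injectable so the result can be made deterministic, and the 30-second budget counts only waiting time, not the duration of the requests themselves.

```typescript
// Exponential backoff with up to 10% jitter, capped at 30 seconds.
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  const baseDelay = 1000;
  const maxDelay = 30000;
  const capped = Math.min(baseDelay * Math.pow(2, attempt), maxDelay);
  return Math.min(capped * (1 + random() * 0.1), maxDelay);
}

// Delays (in ms) to wait before each retry, stopping once the budget would be exceeded.
function retrySchedule(
  maxRetries: number = 3,
  budgetMs: number = 30000,
  random: () => number = Math.random,
): number[] {
  const delays: number[] = [];
  let total = 0;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const delay = backoffDelayMs(attempt, random);
    if (total + delay > budgetMs) break; // the next wait would exceed the total timeout
    delays.push(delay);
    total += delay;
  }
  return delays;
}
```

With jitter disabled (`random` returning 0), three retries wait 1000, 2000, and 4000 ms, which fits well inside the 30-second budget.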

Circuit Breaker Pattern

Circuit States

  • Closed: Normal operation, requests pass through
  • Open: PSP marked as failed, requests bypass it and go to the next provider
  • Half-Open: Testing PSP recovery, limited requests

Circuit Breaker Configuration

circuit_breaker:
  failure_threshold: 5   # Failures to open circuit
  timeout: 60s           # Time before half-open attempt
  success_threshold: 3   # Successes to close circuit
  window_size: 100       # Sliding window for failure counting
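
A minimal state machine for the three circuit states might look like the sketch below. It is illustrative only: the `CircuitBreaker` class and `BreakerConfig` shape are hypothetical, the clock is injectable for testing, and failures are counted consecutively, so the sliding `window_size` setting is not modeled.

```typescript
type CircuitState = "closed" | "open" | "half-open";

interface BreakerConfig {
  failureThreshold: number; // failures to open the circuit
  timeoutMs: number;        // wait before a half-open attempt
  successThreshold: number; // successes to close the circuit
}

const DEFAULT_BREAKER_CONFIG: BreakerConfig = {
  failureThreshold: 5,
  timeoutMs: 60000,
  successThreshold: 3,
};

class CircuitBreaker {
  private state: CircuitState = "closed";
  private failures = 0;
  private successes = 0;
  private openedAt = 0;
  private readonly config: BreakerConfig;
  private readonly now: () => number;

  constructor(config: BreakerConfig = DEFAULT_BREAKER_CONFIG, now: () => number = Date.now) {
    this.config = config;
    this.now = now;
  }

  // Moves open -> half-open once the timeout has elapsed, then reports the state.
  getState(): CircuitState {
    if (this.state === "open" && this.now() - this.openedAt >= this.config.timeoutMs) {
      this.state = "half-open";
      this.successes = 0;
    }
    return this.state;
  }

  allowRequest(): boolean {
    return this.getState() !== "open";
  }

  recordSuccess(): void {
    if (this.getState() === "half-open") {
      this.successes += 1;
      if (this.successes >= this.config.successThreshold) this.state = "closed";
    }
    this.failures = 0; // any success breaks a run of consecutive failures
  }

  recordFailure(): void {
    if (this.getState() === "half-open") {
      this.trip(); // any failure while probing reopens the circuit
      return;
    }
    this.failures += 1;
    if (this.failures >= this.config.failureThreshold) this.trip();
  }

  private trip(): void {
    this.state = "open";
    this.openedAt = this.now();
    this.failures = 0;
  }
}
```

The router would call `allowRequest()` before sending traffic to a PSP and report each outcome through `recordSuccess()` or `recordFailure()`.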

Real-Time Health Monitoring

Health Metrics

  • Success rate (5-minute rolling average)
  • Average response time
  • Error distribution by type
  • Throughput per second
  • Queue depth and processing lag
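
The 5-minute rolling success rate can be computed from timestamped outcomes. The sketch below assumes a simple in-memory sample list; `RollingSuccessRate` is a hypothetical name, and timestamps are passed in explicitly (in milliseconds) so the behavior is easy to verify.

```typescript
interface Sample {
  at: number;  // timestamp in ms
  ok: boolean; // whether the transaction succeeded
}

class RollingSuccessRate {
  private samples: Sample[] = [];
  private readonly windowMs: number;

  constructor(windowMs: number = 5 * 60 * 1000) {
    this.windowMs = windowMs;
  }

  record(ok: boolean, at: number): void {
    this.samples.push({ at, ok });
  }

  // Fraction of successful samples in the window ending at `now`; null when there is no data.
  rate(now: number): number | null {
    this.samples = this.samples.filter((s) => now - s.at < this.windowMs); // drop expired samples
    if (this.samples.length === 0) return null;
    const okCount = this.samples.filter((s) => s.ok).length;
    return okCount / this.samples.length;
  }
}
```

Returning `null` for an empty window lets the caller distinguish "no traffic" from "0% success", which matters when a PSP has just been taken out of rotation.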

Monitoring Dashboard

PSP Health Dashboard
┌──────────────────────────────────────────┐
│ PSP         Status    Success   AvgTime  │
├──────────────────────────────────────────┤
│ Stripe      ✓ UP      95.2%     142ms    │
│ Adyen       ✓ UP      92.8%     198ms    │
│ PayPal      ⚠ SLOW    88.1%     485ms    │
│ Craftgate   ✗ DOWN    45.3%     timeout  │
└──────────────────────────────────────────┘
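
The status column could be derived from the two metrics next to it. The thresholds in this sketch are assumptions chosen only so that the sample rows above map to their displayed statuses; the source does not state the real cut-offs, and `classifyStatus` is a hypothetical helper.

```typescript
type PspStatus = "UP" | "SLOW" | "DOWN";

// Assumed thresholds, not documented values.
const DOWN_BELOW_SUCCESS_RATE = 0.5;
const SLOW_ABOVE_MS = 400;

// avgResponseMs is null when requests are timing out.
function classifyStatus(successRate: number, avgResponseMs: number | null): PspStatus {
  if (avgResponseMs === null || successRate < DOWN_BELOW_SUCCESS_RATE) return "DOWN";
  if (avgResponseMs > SLOW_ABOVE_MS) return "SLOW";
  return "UP";
}
```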

Automated Recovery

Recovery Triggers

  • Circuit breaker half-open state
  • Gradual traffic increase to recovered PSP
  • Automatic primary PSP restoration
  • Load balancing adjustment

Recovery Process

  1. Detection: Health checks detect PSP recovery
  2. Validation: Small percentage of traffic routed for testing
  3. Gradual Ramp: Incrementally increase traffic percentage
  4. Full Recovery: PSP restored to full capacity
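
The validation and gradual ramp steps can be modeled as a stepwise traffic share. The step values below are assumptions, as are the helper names `nextTrafficShare` and `routeToRecovering`; dropping back to zero on an unhealthy check is one possible policy, not necessarily the one used here.

```typescript
// Assumed ramp steps, in percent of traffic sent to the recovering PSP.
const RAMP_STEPS = [5, 25, 50, 100];

// Advances to the next step while the PSP stays healthy; resets to 0 otherwise.
function nextTrafficShare(currentPercent: number, healthy: boolean): number {
  if (!healthy) return 0;
  const next = RAMP_STEPS.find((step) => step > currentPercent);
  return next === undefined ? 100 : next;
}

// Decides whether a single request goes to the recovering PSP.
function routeToRecovering(sharePercent: number, random: () => number = Math.random): boolean {
  return random() * 100 < sharePercent;
}
```

Starting from 0, four consecutive healthy checks would move the share through 5, 25, 50, and 100 percent.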

Geographic Failover

Regional PSP Mapping

regional_preferences:
  north_america:
    primary: [stripe, authorize_net]
    secondary: [braintree, square]
  europe:
    primary: [adyen, stripe]
    secondary: [checkout_com, klarna]
  asia_pacific:
    primary: [adyen, stripe]
    secondary: [regional_providers]
  mena:
    primary: [craftgate, adyen]
    secondary: [stripe, paypal]

Cross-Region Failover

  • Automatic geographic expansion during outages
  • Latency-aware routing decisions
  • Compliance-aware regional restrictions
  • Currency conversion handling
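
One way to combine the regional mapping with cross-region expansion is to build an ordered candidate list: the home region's providers first, then providers from other regions, with a compliance check applied to each. In this sketch `candidatesFor` and the `ComplianceCheck` hook are hypothetical, and latency-aware ordering and currency conversion are left out.

```typescript
type Region = "north_america" | "europe" | "asia_pacific" | "mena";

interface RegionalPreference {
  primary: string[];
  secondary: string[];
}

// Mirrors the regional_preferences mapping above.
const regionalPreferences: Record<Region, RegionalPreference> = {
  north_america: { primary: ["stripe", "authorize_net"], secondary: ["braintree", "square"] },
  europe: { primary: ["adyen", "stripe"], secondary: ["checkout_com", "klarna"] },
  asia_pacific: { primary: ["adyen", "stripe"], secondary: ["regional_providers"] },
  mena: { primary: ["craftgate", "adyen"], secondary: ["stripe", "paypal"] },
};

// Hypothetical compliance hook: return false to exclude a PSP from a region.
type ComplianceCheck = (psp: string, region: Region) => boolean;

function candidatesFor(region: Region, isAllowed: ComplianceCheck = () => true): string[] {
  const ordered: string[] = [];
  const add = (prefs: RegionalPreference): void => {
    for (const psp of [...prefs.primary, ...prefs.secondary]) {
      // Skip duplicates and anything the compliance check rejects.
      if (ordered.indexOf(psp) === -1 && isAllowed(psp, region)) ordered.push(psp);
    }
  };
  add(regionalPreferences[region]); // home region first
  for (const other of Object.keys(regionalPreferences) as Region[]) {
    if (other !== region) add(regionalPreferences[other]); // cross-region expansion
  }
  return ordered;
}
```

For `mena`, the list starts with craftgate, adyen, stripe, paypal and only then reaches providers listed for other regions.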

Configuration Management

Dynamic Configuration

  • Real-time PSP priority updates
  • Feature flag-controlled routing
  • A/B testing for routing strategies
  • Emergency routing overrides

Configuration Example

{
  "routing_strategy": "intelligent",
  "psp_priorities": {
    "default": ["stripe", "adyen", "paypal"],
    "high_risk": ["adyen", "stripe"],
    "low_cost": ["stripe", "craftgate", "paypal"]
  },
  "failover_settings": {
    "max_retries_per_psp": 3,
    "total_timeout_seconds": 30,
    "circuit_breaker_threshold": 5
  }
}
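
A consumer of this configuration needs to resolve a priority list for each transaction profile. The sketch below assumes that an unknown profile falls back to `default` and that an emergency override, when supplied, replaces the configured list; `RoutingConfig` and `prioritiesFor` are hypothetical names.

```typescript
interface RoutingConfig {
  routing_strategy: string;
  psp_priorities: { [profile: string]: string[] | undefined };
  failover_settings: {
    max_retries_per_psp: number;
    total_timeout_seconds: number;
    circuit_breaker_threshold: number;
  };
}

// Same values as the configuration example above.
const routingConfig: RoutingConfig = {
  routing_strategy: "intelligent",
  psp_priorities: {
    default: ["stripe", "adyen", "paypal"],
    high_risk: ["adyen", "stripe"],
    low_cost: ["stripe", "craftgate", "paypal"],
  },
  failover_settings: {
    max_retries_per_psp: 3,
    total_timeout_seconds: 30,
    circuit_breaker_threshold: 5,
  },
};

// Returns the priority list for a profile, falling back to "default".
// A non-empty `override` models an emergency routing override and wins.
function prioritiesFor(config: RoutingConfig, profile: string, override?: string[]): string[] {
  if (override !== undefined && override.length > 0) return override;
  return config.psp_priorities[profile] ?? config.psp_priorities["default"] ?? [];
}
```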

Performance Optimization

Caching Strategies

  • PSP capability caching
  • Health status caching
  • Routing decision caching
  • Configuration caching
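
Each of these caches can be built on a small time-to-live (TTL) map. The sketch below is a generic illustration with a hypothetical `TtlCache` class and an injectable clock; suitable TTL values are not given in the source and would need to be chosen per cache.

```typescript
class TtlCache<K, V> {
  private readonly entries = new Map<K, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined when missing or expired.
  get(key: K): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it so the caller refetches
      return undefined;
    }
    return entry.value;
  }

  set(key: K, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Health status probably warrants a much shorter TTL than PSP capabilities, since a stale "UP" entry would delay failover.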

Async Processing

  • Non-blocking PSP selection
  • Parallel health checks
  • Background metric collection
  • Asynchronous retry processing
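
Parallel health checks can be sketched with `Promise.all` and a per-check timeout, so one slow provider does not delay the others. `checkAllProviders`, `withTimeout`, and the `HealthCheck` type are hypothetical, and the 2-second default timeout is an assumption.

```typescript
type HealthCheck = () => Promise<boolean>;

// Rejects if `promise` has not settled within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("health check timed out")), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Runs every check concurrently; a timeout or thrown error counts as unhealthy.
async function checkAllProviders(
  checks: Record<string, HealthCheck>,
  timeoutMs: number = 2000,
): Promise<Record<string, boolean>> {
  const names = Object.keys(checks);
  const results = await Promise.all(
    names.map((name) =>
      withTimeout(Promise.resolve().then(checks[name]), timeoutMs).catch(() => false),
    ),
  );
  const health: Record<string, boolean> = {};
  names.forEach((name, i) => { health[name] = results[i]; });
  return health;
}
```

Because each check is wrapped individually, the total wait is bounded by the timeout rather than by the sum of all check durations.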

Next Steps