Failover PSP Routing
Failover routing automatically sends a payment to a backup payment service provider (PSP) when the preferred provider fails, which keeps payment success rates high during provider outages and degradations.
Failover Strategy Overview
    Transaction Request
             │
             ▼
┌──────────────────────────┐
│  Primary PSP (Stripe)    │
│                          │
│  Success Rate: 95%       │
│  Response: 150ms         │
└────────────┬─────────────┘
             │
             ▼ (if failure)
┌──────────────────────────┐
│  Backup PSP (Adyen)      │
│                          │
│  Success Rate: 92%       │
│  Response: 200ms         │
└────────────┬─────────────┘
             │
             ▼ (if failure)
┌──────────────────────────┐
│  Tertiary PSP (PayPal)   │
│                          │
│  Success Rate: 88%       │
│  Response: 300ms         │
└──────────────────────────┘
Failover Decision Matrix
Routing Factors
| Factor | Weight | Description |
|---|---|---|
| Success Rate | 40% | Historical transaction success |
| Response Time | 25% | Average API response time |
| Cost | 20% | Transaction processing fees |
| Geographic Fit | 10% | Regional optimization |
| Feature Support | 5% | Required feature availability |
Dynamic Scoring Algorithm
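// maxResponseTime and maxFee are normalization ceilings, e.g. the worst values among the candidate PSPs.
function calculatePSPScore(
  psp: PSP,
  transaction: Transaction,
  maxResponseTime: number,
  maxFee: number
): number {
  const successScore = psp.successRate * 0.4; // assumes successRate is a fraction in [0, 1]
  const speedScore = (1 - psp.avgResponseTime / maxResponseTime) * 0.25;
  const costScore = (1 - psp.transactionFee / maxFee) * 0.2;
  const geoScore = psp.getGeographicScore(transaction.region) * 0.1;
  // Assumes supportsFeatures returns a boolean; map it to 1 or 0 before weighting.
  const featureScore = (psp.supportsFeatures(transaction.requirements) ? 1 : 0) * 0.05;
  return successScore + speedScore + costScore + geoScore + featureScore;
}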
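To make the weighting concrete, here is a minimal, self-contained sketch that ranks three providers with the weights from the Routing Factors table. The `PspStats` shape, the `rankPsps` helper, the fee figures, and the geographic scores are assumptions made for illustration; only the success rates and response times come from the overview diagram.

```typescript
// Hypothetical shape for illustration; the real PSP type may differ.
interface PspStats {
  name: string;
  successRate: number;       // fraction in [0, 1]
  avgResponseTimeMs: number;
  feePercent: number;        // illustrative figure, not actual pricing
  geoScore: number;          // fraction in [0, 1] for the transaction's region
  supportsRequired: boolean; // true if all required features are available
}

// Lower is better for time and fee, so invert against the worst candidate.
const norm = (value: number, max: number): number => (max > 0 ? 1 - value / max : 1);

function scorePsp(p: PspStats, maxResponseTimeMs: number, maxFeePercent: number): number {
  return (
    p.successRate * 0.4 +
    norm(p.avgResponseTimeMs, maxResponseTimeMs) * 0.25 +
    norm(p.feePercent, maxFeePercent) * 0.2 +
    p.geoScore * 0.1 +
    (p.supportsRequired ? 1 : 0) * 0.05
  );
}

// Returns a new array sorted from highest to lowest score.
function rankPsps(candidates: PspStats[]): PspStats[] {
  if (candidates.length === 0) return [];
  const maxRt = Math.max(...candidates.map((c) => c.avgResponseTimeMs));
  const maxFee = Math.max(...candidates.map((c) => c.feePercent));
  return [...candidates].sort(
    (a, b) => scorePsp(b, maxRt, maxFee) - scorePsp(a, maxRt, maxFee)
  );
}

const ranked = rankPsps([
  { name: "paypal", successRate: 0.88, avgResponseTimeMs: 300, feePercent: 3.4, geoScore: 0.7, supportsRequired: true },
  { name: "stripe", successRate: 0.95, avgResponseTimeMs: 150, feePercent: 2.9, geoScore: 0.9, supportsRequired: true },
  { name: "adyen", successRate: 0.92, avgResponseTimeMs: 200, feePercent: 2.5, geoScore: 0.8, supportsRequired: true },
]);
console.log(ranked.map((p) => p.name)); // ["stripe", "adyen", "paypal"] with these illustrative inputs
```

Normalizing against the worst candidate means the slowest and the most expensive provider each score zero on that factor; a fixed ceiling would be an equally reasonable choice.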
Failure Detection
Failure Types
Soft Failures (Retry-able)
- Network timeouts
- Rate limiting (429)
- Temporary server errors (5xx)
- PSP maintenance windows
- Temporary account issues
Hard Failures (No Retry)
- Invalid API credentials (401)
- Insufficient funds (402)
- Card declined (402)
- Fraud detection blocks
- Permanent account suspension
Detection Mechanisms
- HTTP status code analysis
- Response time monitoring
- Error pattern recognition
- Health check failures
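The status-code part of detection can be sketched as a small classifier. The `PspError` shape and the `classifyFailure` name are hypothetical, and treating a missing response as a soft failure and any unlisted status as a hard failure are assumptions, not rules stated in this document.

```typescript
type FailureKind = "soft" | "hard";

// Hypothetical error shape for illustration.
interface PspError {
  httpStatus?: number; // undefined when no response was received
  timedOut?: boolean;
}

function classifyFailure(err: PspError): FailureKind {
  // Network timeouts and missing responses are treated as retry-able (assumption).
  if (err.timedOut || err.httpStatus === undefined) return "soft";
  if (err.httpStatus === 429) return "soft"; // rate limiting
  if (err.httpStatus >= 500 && err.httpStatus <= 599) return "soft"; // temporary server errors
  // 401 and 402 are hard failures; any other status is treated as hard by assumption.
  return "hard";
}

console.log(classifyFailure({ httpStatus: 503 })); // "soft"
console.log(classifyFailure({ httpStatus: 402 })); // "hard"
```

Fraud blocks, maintenance windows, and account suspensions usually need the provider's error body as well as the status code, so a real classifier would inspect both.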
Intelligent Retry Logic
Exponential Backoff
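class RetryStrategy {
  // attemptNumber is 0-based: 0 gives about 1s, 1 about 2s, 2 about 4s.
  calculateDelay(attemptNumber: number): number {
    const baseDelay = 1000; // 1 second
    const maxDelay = 30000; // 30 seconds
    const jitter = Math.random() * 0.1; // random extra delay of up to 10%
    const delay = Math.min(
      baseDelay * Math.pow(2, attemptNumber),
      maxDelay
    );
    // Jitter is applied after the cap, so the result can exceed maxDelay by up to 10%.
    return delay * (1 + jitter);
  }
}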
Retry Limits
- Maximum 3 retries per PSP
- Total timeout of 30 seconds
- Immediate failover for hard failures
- Circuit breaker activation after 5 consecutive failures
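One way to read these limits is as a single decision made after each failed attempt. The sketch below is an assumption about how the rules combine; `AttemptState`, `nextAction`, and the order of the checks are hypothetical, and circuit breaker handling is left out.

```typescript
type Action = "retry" | "failover" | "give_up";

// Hypothetical snapshot of the transaction after a failed attempt.
interface AttemptState {
  failureKind: "soft" | "hard";
  retriesOnCurrentPsp: number; // retries already made against the current PSP
  elapsedMs: number;           // time since the transaction started
  hasNextPsp: boolean;         // whether another PSP remains in the chain
}

const MAX_RETRIES_PER_PSP = 3;
const TOTAL_TIMEOUT_MS = 30000;

function nextAction(s: AttemptState): Action {
  // The total time budget overrides everything else.
  if (s.elapsedMs >= TOTAL_TIMEOUT_MS) return "give_up";
  // Hard failures skip retries and move straight to the next PSP.
  if (s.failureKind === "hard") return s.hasNextPsp ? "failover" : "give_up";
  // Soft failures are retried on the same PSP until the retry limit is reached.
  if (s.retriesOnCurrentPsp < MAX_RETRIES_PER_PSP) return "retry";
  return s.hasNextPsp ? "failover" : "give_up";
}

console.log(nextAction({ failureKind: "soft", retriesOnCurrentPsp: 1, elapsedMs: 4000, hasNextPsp: true })); // "retry"
console.log(nextAction({ failureKind: "hard", retriesOnCurrentPsp: 0, elapsedMs: 500, hasNextPsp: true })); // "failover"
```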
Circuit Breaker Pattern
Circuit States
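- Closed: Normal operation; requests pass through to the PSP.
- Open: The PSP is marked as failed; requests bypass it and go to the next provider.
- Half-Open: The PSP is being tested for recovery; only a limited number of requests are sent to it.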
Circuit Breaker Configuration
circuit_breaker:
  failure_threshold: 5   # Failures to open circuit
  timeout: 60s           # Time before half-open attempt
  success_threshold: 3   # Successes to close circuit
  window_size: 100       # Sliding window for failure counting
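The three states and the thresholds above can be sketched as a small state machine. This is a simplified illustration rather than the production breaker: it counts consecutive failures instead of using the sliding window, and the class, method, and field names are hypothetical. The clock is injected so the timeout can be exercised without waiting.

```typescript
type CircuitState = "closed" | "open" | "half_open";

interface BreakerConfig {
  failureThreshold: number; // consecutive failures that open the circuit
  timeoutMs: number;        // time spent open before a half-open trial
  successThreshold: number; // consecutive half-open successes that close the circuit
}

class CircuitBreaker {
  private state: CircuitState = "closed";
  private failures = 0;
  private successes = 0;
  private openedAt = 0;
  private cfg: BreakerConfig;
  private now: () => number;

  constructor(cfg: BreakerConfig, now: () => number = () => Date.now()) {
    this.cfg = cfg;
    this.now = now;
  }

  // Returns true if a request may be sent to the PSP right now.
  allowRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cfg.timeoutMs) {
      this.state = "half_open"; // timeout elapsed: let trial requests through
      this.successes = 0;
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    if (this.state === "half_open") {
      this.successes += 1;
      if (this.successes >= this.cfg.successThreshold) {
        this.state = "closed";
        this.failures = 0;
      }
    } else {
      this.failures = 0; // a success breaks the run of consecutive failures
    }
  }

  recordFailure(): void {
    if (this.state === "open") return; // already open; nothing to count
    if (this.state === "half_open") {
      this.trip(); // any failure during the trial reopens the circuit
      return;
    }
    this.failures += 1;
    if (this.failures >= this.cfg.failureThreshold) this.trip();
  }

  getState(): CircuitState {
    return this.state;
  }

  private trip(): void {
    this.state = "open";
    this.openedAt = this.now();
    this.failures = 0;
  }
}
```

Reopening on the first half-open failure is a common conservative choice; a breaker that honors `window_size` would track a failure ratio over the last 100 calls instead.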
Real-Time Health Monitoring
Health Metrics
- Success rate (5-minute rolling average)
- Average response time
- Error distribution by type
- Throughput per second
- Queue depth and processing lag
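As an illustration of the first metric, a rolling success rate can be kept by storing timestamped outcomes and evicting those older than the window. The `RollingSuccessRate` class is a hypothetical sketch that assumes outcomes are recorded in time order; a production system would more likely use bucketed counters to bound memory.

```typescript
class RollingSuccessRate {
  private events: { at: number; ok: boolean }[] = [];
  private windowMs: number;

  // Defaults to the 5-minute window named in the metrics list.
  constructor(windowMs: number = 5 * 60 * 1000) {
    this.windowMs = windowMs;
  }

  // Records one transaction outcome at timestamp `at` (milliseconds).
  record(ok: boolean, at: number): void {
    this.events.push({ at, ok });
    this.evict(at);
  }

  // Success rate over the window ending at `now`, or null if the window is empty.
  rate(now: number): number | null {
    this.evict(now);
    if (this.events.length === 0) return null;
    const okCount = this.events.filter((e) => e.ok).length;
    return okCount / this.events.length;
  }

  // Drops events that fell out of the window; relies on time-ordered inserts.
  private evict(now: number): void {
    const cutoff = now - this.windowMs;
    while (this.events.length > 0 && this.events[0].at <= cutoff) {
      this.events.shift();
    }
  }
}

const stripeRate = new RollingSuccessRate();
stripeRate.record(true, 0);
stripeRate.record(false, 1000);
stripeRate.record(true, 2000);
console.log(stripeRate.rate(2000)); // 2 of 3 succeeded
```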
Monitoring Dashboard
PSP Health Dashboard
┌──────────────────────────────────────────┐
│ PSP        Status    Success   AvgTime   │
├──────────────────────────────────────────┤
│ Stripe     ✓ UP      95.2%     142ms     │
│ Adyen      ✓ UP      92.8%     198ms     │
│ PayPal     ⚠ SLOW    88.1%     485ms     │
│ Craftgate  ✗ DOWN    45.3%     timeout   │
└──────────────────────────────────────────┘
Automated Recovery
Recovery Triggers
- Circuit breaker half-open state
- Gradual traffic increase to recovered PSP
- Automatic primary PSP restoration
- Load balancing adjustment
Recovery Process
- Detection: Health checks detect PSP recovery
- Validation: Small percentage of traffic routed for testing
- Gradual Ramp: Incrementally increase traffic percentage
- Full Recovery: PSP restored to full capacity
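The validation and gradual-ramp steps can be sketched as a step function over traffic percentages. The step values in `RAMP_STEPS` and the rule of dropping back to zero on an unhealthy check are assumptions; this document does not specify the actual increments.

```typescript
// Hypothetical ramp schedule, in percent of the PSP's normal traffic share.
const RAMP_STEPS = [1, 5, 25, 50, 100];

// Given the current share and the latest health verdict, returns the next share.
function nextTrafficPercent(current: number, healthy: boolean): number {
  if (!healthy) return 0; // assumption: any failed check sends the PSP back to no traffic
  const next = RAMP_STEPS.find((step) => step > current);
  return next === undefined ? 100 : next; // already at or above the last step
}

let share = 0;
share = nextTrafficPercent(share, true); // validation: 1% of traffic
share = nextTrafficPercent(share, true); // gradual ramp: 5%
console.log(share); // 5
```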
Geographic Failover
Regional PSP Mapping
regional_preferences:
  north_america:
    primary: [stripe, authorize_net]
    secondary: [braintree, square]
  europe:
    primary: [adyen, stripe]
    secondary: [checkout_com, klarna]
  asia_pacific:
    primary: [adyen, stripe]
    secondary: [regional_providers]
  mena:
    primary: [craftgate, adyen]
    secondary: [stripe, paypal]
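A lookup over this mapping might build the failover chain for a region by listing primary providers first, then secondary ones, and dropping any that are currently unhealthy. The `candidatesFor` function and the `isHealthy` callback are hypothetical names used only for this sketch.

```typescript
type Region = "north_america" | "europe" | "asia_pacific" | "mena";

// Mirrors the regional_preferences mapping above.
const regionalPreferences: Record<Region, { primary: string[]; secondary: string[] }> = {
  north_america: { primary: ["stripe", "authorize_net"], secondary: ["braintree", "square"] },
  europe: { primary: ["adyen", "stripe"], secondary: ["checkout_com", "klarna"] },
  asia_pacific: { primary: ["adyen", "stripe"], secondary: ["regional_providers"] },
  mena: { primary: ["craftgate", "adyen"], secondary: ["stripe", "paypal"] },
};

// Returns the ordered failover chain for a region, skipping unhealthy PSPs.
function candidatesFor(region: Region, isHealthy: (psp: string) => boolean): string[] {
  const { primary, secondary } = regionalPreferences[region];
  return [...primary, ...secondary].filter(isHealthy);
}

// With Craftgate down, MENA traffic starts at Adyen.
console.log(candidatesFor("mena", (psp) => psp !== "craftgate")); // ["adyen", "stripe", "paypal"]
```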
Cross-Region Failover
- Automatic geographic expansion during outages
- Latency-aware routing decisions
- Compliance-aware regional restrictions
- Currency conversion handling
Configuration Management
Dynamic Configuration
- Real-time PSP priority updates
- Feature flag-controlled routing
- A/B testing for routing strategies
- Emergency routing overrides
Configuration Example
{
  "routing_strategy": "intelligent",
  "psp_priorities": {
    "default": ["stripe", "adyen", "paypal"],
    "high_risk": ["adyen", "stripe"],
    "low_cost": ["stripe", "craftgate", "paypal"]
  },
  "failover_settings": {
    "max_retries_per_psp": 3,
    "total_timeout_seconds": 30,
    "circuit_breaker_threshold": 5
  }
}
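A router consuming this configuration could select the priority list by transaction profile and fall back to `default` for unknown profiles. The `RoutingConfig` interface and the `prioritiesFor` helper are assumptions for illustration; how a transaction is assigned a profile such as `high_risk` is not covered here.

```typescript
// Hypothetical typing of the configuration document above.
interface RoutingConfig {
  routing_strategy: string;
  psp_priorities: { [profile: string]: string[] | undefined };
  failover_settings: {
    max_retries_per_psp: number;
    total_timeout_seconds: number;
    circuit_breaker_threshold: number;
  };
}

const routingConfig: RoutingConfig = {
  routing_strategy: "intelligent",
  psp_priorities: {
    default: ["stripe", "adyen", "paypal"],
    high_risk: ["adyen", "stripe"],
    low_cost: ["stripe", "craftgate", "paypal"],
  },
  failover_settings: {
    max_retries_per_psp: 3,
    total_timeout_seconds: 30,
    circuit_breaker_threshold: 5,
  },
};

// Returns the PSP order for a profile, falling back to the default list.
function prioritiesFor(cfg: RoutingConfig, profile: string): string[] {
  return cfg.psp_priorities[profile] ?? cfg.psp_priorities["default"] ?? [];
}

console.log(prioritiesFor(routingConfig, "high_risk")); // ["adyen", "stripe"]
console.log(prioritiesFor(routingConfig, "unknown"));   // ["stripe", "adyen", "paypal"]
```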
Performance Optimization
Caching Strategies
- PSP capability caching
- Health status caching
- Routing decision caching
- Configuration caching
Async Processing
- Non-blocking PSP selection
- Parallel health checks
- Background metric collection
- Asynchronous retry processing
Next Steps
- Learn about Campaign Engine
- Understand Entitlement System
- Explore Payment Methods