S4E On-Prem is designed for horizontal scaling. Most services are stateless and can be scaled by adding replicas. This guide covers scaling strategies, HPA configuration, and node pool management.


Scaling Principles

Stateless Services

The following services are stateless and scale horizontally without coordination:

Service                 Recommended Replicas   Scale Based On
Core API                2-10                   API request rate, latency
Frontend                2-4                    Concurrent users
Scan trigger            2-4                    Webhook/event volume
Vulnerability scanner   3-20+                  Scan queue depth, scan volume
Web crawler             3-15+                  Crawl queue depth, target count
Dispatcher              1                      Dispatch throughput
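
As a rough illustration of how the "Scale Based On" signals translate into a replica count, the sketch below applies the proportional rule used by the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current replicas × current metric / target metric)) and clamps the result to a recommended range. The function name, the per-replica queue-depth target of 10, and the upper bound of 20 for the open-ended "3-20+" range are assumptions for illustration, not S4E defaults. The real HPA also applies a tolerance and a stabilization window, which are omitted here.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int,
                     max_replicas: int) -> int:
    """Proportional scaling rule, clamped to [min_replicas, max_replicas].

    current_metric and target_metric are per-replica averages,
    e.g. queued scans per scanner pod (hypothetical metric).
    """
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")

    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Vulnerability scanner: 4 replicas, 30 queued scans per replica on average,
# assumed target of 10 per replica, clamped to an assumed 3..20 range.
print(desired_replicas(4, 30, 10, min_replicas=3, max_replicas=20))  # 12
```

Clamping keeps a sudden queue spike from scaling past what the cluster and the database can absorb, and keeps a quiet period from dropping below the minimum needed for availability.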

Singleton Services

These services must run as a single replica:

Service              Reason
Scheduler            Prevents duplicate schedule firings
Scan pre-processor   Prevents concurrent catalog writes

Stateful Services

Scale these with caution; they require data replication:

Service      Scaling Method
PostgreSQL   Read replicas (Patroni, PGO)
RabbitMQ     Clustering (3-node quorum)
Redis        Sentinel or Cluster mode
MongoDB      Replica set

Scaling Recommendations by Deployment Size

Small (< 500 assets)

core: { replicaCount: 2 }
scan: { replicaCount: 50 }
crawler: { replicaCount: 20 }
dispatcher: { replicaCount: 1 }
trigger: { replicaCount: 2 }

Medium (500–2,000 assets)

core: { replicaCount: 3 }
scan: { replicaCount: 300 }
crawler: { replicaCount: 100 }
dispatcher: { replicaCount: 1 }
trigger: { replicaCount: 5 }

Large (2,000+ assets)

core: { replicaCount: 5 }
scan: { replicaCount: 500 }
crawler: { replicaCount: 200 }
dispatcher: { replicaCount: 1 }
trigger: { replicaCount: 10 }
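
The three tiers above can be captured in a small lookup, which may be handy when generating values files for several environments. This is only a sketch: the function and constant names are hypothetical, the replica counts are copied from the snippets above, and treating exactly 2,000 assets as Medium is an assumption, since the stated ranges overlap at that value.

```python
# Replica counts copied from the Small / Medium / Large snippets in this guide.
SIZING_TIERS = {
    "small":  {"core": 2, "scan": 50,  "crawler": 20,  "dispatcher": 1, "trigger": 2},
    "medium": {"core": 3, "scan": 300, "crawler": 100, "dispatcher": 1, "trigger": 5},
    "large":  {"core": 5, "scan": 500, "crawler": 200, "dispatcher": 1, "trigger": 10},
}


def tier_for(asset_count: int) -> str:
    """Map an asset count to a sizing tier name."""
    if asset_count < 0:
        raise ValueError("asset_count must not be negative")
    if asset_count < 500:
        return "small"
    if asset_count <= 2000:  # assumption: exactly 2,000 counts as Medium
        return "medium"
    return "large"


def replicas_for(asset_count: int) -> dict:
    """Return a copy of the recommended replica counts for this asset count."""
    return dict(SIZING_TIERS[tier_for(asset_count)])


print(tier_for(1200))      # medium
print(replicas_for(1200))  # replica counts for the Medium tier
```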

Database Scaling

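As you scale workers, make sure PostgreSQL can handle the increased connection count: each additional replica opens its own database connections. Monitor connection pool utilization and raise max_connections before the limit is reached. Note that a change to max_connections takes effect only after a PostgreSQL restart.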
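
To make the connection budget concrete, the sketch below estimates a max_connections value from replica counts. It assumes every replica holds a fixed-size connection pool. The pool size of 2, the 20% headroom, and the 10 connections reserved for administrative and replication sessions are illustrative assumptions rather than S4E defaults, and services that do not connect to PostgreSQL should be left out of the input.

```python
def estimate_max_connections(replicas: dict,
                             pool_size_per_replica: int,
                             headroom_percent: int = 20,
                             reserved: int = 10) -> int:
    """Estimate a PostgreSQL max_connections value.

    replicas maps a service name to its replica count. Integer arithmetic
    is used throughout so that rounding up is exact.
    """
    if pool_size_per_replica < 1:
        raise ValueError("pool_size_per_replica must be at least 1")
    if headroom_percent < 0 or reserved < 0:
        raise ValueError("headroom_percent and reserved must not be negative")

    worker_connections = sum(replicas.values()) * pool_size_per_replica
    # Ceiling of worker_connections * (1 + headroom) without floating point.
    with_headroom = (worker_connections * (100 + headroom_percent) + 99) // 100
    return with_headroom + reserved


# Replica counts from the Small tier; the pool size of 2 is an assumed value.
small = {"core": 2, "scan": 50, "crawler": 20, "dispatcher": 1, "trigger": 2}
print(estimate_max_connections(small, pool_size_per_replica=2))  # 190
```

The estimate grows linearly with replica count, so it is worth recomputing whenever you move between sizing tiers or change pool sizes.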

Next Steps