Horizontal Scaling

S4E On-Prem is designed for horizontal scaling. Most services are stateless and can be scaled by adding replicas. This guide covers scaling strategies, HPA configuration, and node pool management.

Scaling Principles

Stateless Services

The following services are stateless and scale horizontally without coordination:

Service	Recommended Replicas	Scale Based On
Core API	2-10	API request rate, latency
Frontend	2-4	Concurrent users
Scan trigger	2-4	Webhook/event volume
Vulnerability scanner	3-20+	Scan queue depth, scan volume
Web crawler	3-15+	Crawl queue depth, target count
Dispatcher	1	Dispatch throughput
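
As a rough illustration of how the Core API range in the table above could map onto a Kubernetes HorizontalPodAutoscaler, here is a minimal sketch. The Deployment name `s4e-core` and the 70% CPU target are assumptions made for this example, not values shipped with S4E On-Prem. CPU utilization stands in here for request rate and latency.

```yaml
# Sketch only: "s4e-core" and the CPU threshold are hypothetical.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: s4e-core
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: s4e-core              # hypothetical Deployment name
  minReplicas: 2                # lower bound of the Core API range
  maxReplicas: 10               # upper bound of the Core API range
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # assumed threshold; tune per workload
```

Utilization-based targets only work when the pods declare CPU requests. Scaling the scanner or crawler on queue depth would need a custom or external metrics source, which this sketch does not cover.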

Singleton Services

These services must run as a single replica:

Service	Reason
Scheduler	Prevents duplicate schedule firings
Scan pre-processor	Prevents concurrent catalog writes
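
One way to keep a singleton at exactly one pod, even during a rollout, is to pair `replicas: 1` with the `Recreate` strategy. The default `RollingUpdate` strategy can briefly run the old and new pods side by side. The manifest below is a sketch under assumed names: `s4e-scheduler`, its labels, and the image reference are placeholders, not the actual chart output.

```yaml
# Sketch only: name, labels, and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: s4e-scheduler
spec:
  replicas: 1                   # singleton: never raise this
  strategy:
    type: Recreate              # terminate the old pod before starting the new one
  selector:
    matchLabels:
      app: s4e-scheduler
  template:
    metadata:
      labels:
        app: s4e-scheduler
    spec:
      containers:
        - name: scheduler
          image: example.invalid/s4e/scheduler:placeholder
```

The trade-off is a short gap with no running pod during each update. For the same reason, these Deployments should not be targeted by an autoscaler.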

Stateful Services

Scale these with caution; they require data replication:

Service	Scaling Method
PostgreSQL	Read replicas (Patroni, PGO)
RabbitMQ	Clustering (3-node quorum)
Redis	Sentinel or Cluster mode
MongoDB	Replica set

Scaling Recommendations by Deployment Size

Small (< 500 assets)

core: { replicaCount: 2 }
scan: { replicaCount: 50 }
crawler: { replicaCount: 20 }
dispatcher: { replicaCount: 1 }
trigger: { replicaCount: 2 }

Medium (500–2,000 assets)

core: { replicaCount: 3 }
scan: { replicaCount: 300 }
crawler: { replicaCount: 100 }
dispatcher: { replicaCount: 1 }
trigger: { replicaCount: 5 }

Large (> 2,000 assets)

core: { replicaCount: 5 }
scan: { replicaCount: 500 }
crawler: { replicaCount: 200 }
dispatcher: { replicaCount: 1 }
trigger: { replicaCount: 10 }

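Database Scaling

Every worker replica you add opens its own PostgreSQL connections, so the total connection count grows with the replica count. Monitor connection pool utilization as you scale, and raise max_connections (or lower per-service pool sizes) before the server runs out of connection slots. Changing max_connections requires a PostgreSQL restart.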
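
A back-of-the-envelope upper bound can make this concrete. Assume, purely for illustration, that every pod in the Large profile keeps its own pool of p = 5 connections. The pool size is a hypothetical figure, and not every service necessarily connects to PostgreSQL directly. With r_i the replica count of service i:

```latex
C_{\max} = p \sum_i r_i
         = 5 \times (5 + 500 + 200 + 1 + 10)
         = 5 \times 716
         = 3580
```

Under these assumptions the bound is far above PostgreSQL's stock max_connections default of 100, which is why connection limits need attention before worker counts are raised.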

Next Steps

Queue optimization -- tune RabbitMQ for scaled workloads.
Resource management -- set CPU/memory limits appropriately.
Monitoring -- track scaling metrics.