S4E On-Prem uses RabbitMQ as the primary message broker for asynchronous inter-service communication. All scan jobs, crawl pipeline stages, and event notifications flow through RabbitMQ queues.


Overview

RabbitMQ serves as the backbone for S4E's event-driven architecture:

  • Scan dispatch -- scan requests are published to queues and consumed by worker services.
  • Crawl pipeline -- multi-stage crawl operations pass through a chain of queues.
  • Event notifications -- triggers and actions are coordinated through message passing.
  • Work distribution -- messages are distributed across multiple worker replicas for parallel processing.

Deployment Options

Option 1: In-Cluster RabbitMQ (Helm Subchart)

The S4E Helm chart includes a RabbitMQ subchart:

# s4e-values.yaml
rabbitmq:
  enabled: true
  auth:
    username: s4e_mq
    password: "<strong-password>"
  persistence:
    enabled: true
    size: 20Gi
    storageClass: ssd
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 2000m
      memory: 4Gi

Option 2: External RabbitMQ

Connect to an existing RabbitMQ cluster:

rabbitmq:
  enabled: false

core:
  env:
    RABBITMQ_HOST: "rabbitmq-cluster.messaging.internal"
    RABBITMQ_PORT: "5672"
    RABBITMQ_USER: "s4e_mq"
    RABBITMQ_VHOST: "s4e"
  secrets:
    RABBITMQ_PASS: "<rabbitmq-password>"
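
How S4E services consume these variables internally is not documented here. As a rough illustration only, the sketch below shows how such settings typically map onto an AMQP URI. The `amqp_url` helper and the example password are hypothetical; the percent-encoding of the password and vhost follows the usual AMQP URI convention.

```python
import os
from urllib.parse import quote


def amqp_url(env=None):
    """Build an AMQP URI from RABBITMQ_* variables (hypothetical helper)."""
    env = os.environ if env is None else env
    host = env["RABBITMQ_HOST"]
    port = int(env.get("RABBITMQ_PORT", "5672"))
    # Credentials and vhost are percent-encoded so characters such as
    # "@" or "/" cannot be mistaken for URI delimiters.
    user = quote(env["RABBITMQ_USER"], safe="")
    password = quote(env["RABBITMQ_PASS"], safe="")
    vhost = quote(env.get("RABBITMQ_VHOST", "/"), safe="")
    return f"amqp://{user}:{password}@{host}:{port}/{vhost}"


# Example values taken from the configuration above; the password is made up.
example = {
    "RABBITMQ_HOST": "rabbitmq-cluster.messaging.internal",
    "RABBITMQ_PORT": "5672",
    "RABBITMQ_USER": "s4e_mq",
    "RABBITMQ_VHOST": "s4e",
    "RABBITMQ_PASS": "p@ss/word",
}
print(amqp_url(example))
```

Building the URI this way can help when testing connectivity to the external cluster from a debugging pod before pointing S4E at it.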

Option 3: Clustered RabbitMQ

For high availability, deploy a multi-node RabbitMQ cluster so that queues can be replicated across nodes:

rabbitmq:
  enabled: true
  replicaCount: 3
  clustering:
    enabled: true
  auth:
    username: s4e_mq
    password: "<strong-password>"

Note: Quorum queues

RabbitMQ 3.8+ supports quorum queues, which provide better data safety than classic mirrored queues (deprecated, and removed in RabbitMQ 4.0). S4E automatically uses quorum queues when available.

Queue Architecture

Exchange Topology

S4E uses a topic exchange model:

Exchange         Type    Purpose
s4e.scan         Topic   Scan job routing
s4e.crawl        Topic   Crawler pipeline stages
s4e.events       Topic   System events and notifications
s4e.actions      Topic   Action and playbook execution
s4e.dead_letter  Fanout  Failed message collection

Queue Definitions

Scan Queues

Queue          Routing Key      Consumer
scan.dispatch  scan.dispatch.*  s4e-dispatcher
scan.execute   scan.execute.*   s4e-scan
scan.results   scan.results.*   s4e-core
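
On a topic exchange, `*` in a binding pattern matches exactly one dot-separated word and `#` matches zero or more words. The sketch below reimplements that matching rule in plain Python to show which routing keys a pattern such as `scan.dispatch.*` would accept. The concrete routing keys (`scan.dispatch.web` and so on) are hypothetical examples, not keys documented by S4E.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """AMQP topic matching: '*' is exactly one word, '#' is zero or more words."""
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i: int, j: int) -> bool:
        if i == len(p):
            # Pattern exhausted: match only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j == len(k):
            return False
        return (p[i] == "*" or p[i] == k[j]) and match(i + 1, j + 1)

    return match(0, 0)


# Hypothetical routing keys checked against the scan.dispatch binding.
for key in ("scan.dispatch.web", "scan.dispatch", "scan.dispatch.web.deep"):
    print(key, topic_matches("scan.dispatch.*", key))
```

Note that a pattern ending in `.*` does not match the bare prefix (`scan.dispatch`) or keys with more than one trailing word.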

Crawler Pipeline Queues

Queue              Routing Key        Consumer     Stage
crawl.ffuf         crawl.ffuf         s4e-crawler  Directory fuzzing
crawl.katana       crawl.katana       s4e-crawler  Deep crawling
crawl.api_doc      crawl.api_doc      s4e-crawler  API doc parsing
crawl.url_unifier  crawl.url_unifier  s4e-crawler  URL dedup
crawl.pii          crawl.pii          s4e-crawler  PII detection
crawl.enrichment   crawl.enrichment   s4e-crawler  Result enrichment
crawl.finisher     crawl.finisher     s4e-crawler  Pipeline finalization

Virtual Hosts

For multi-tenant or environment isolation, configure separate vhosts:

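# Create one vhost per environment
rabbitmqctl add_vhost s4e_production
rabbitmqctl add_vhost s4e_staging
# Grant s4e_mq configure, write, and read permissions (in that order) on the production vhost
rabbitmqctl set_permissions -p s4e_production s4e_mq ".*" ".*" ".*"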

Configuration

Consumer Settings

Variable                     Description                                                 Default
RABBITMQ_PREFETCH_COUNT      Maximum unacknowledged messages delivered to each consumer  10
RABBITMQ_HEARTBEAT           Heartbeat interval (seconds)                                60
RABBITMQ_CONNECTION_TIMEOUT  Connection timeout (seconds)                                30
RABBITMQ_RETRY_DELAY         Delay between connection retry attempts (seconds)           5
RABBITMQ_MAX_RETRIES         Maximum connection retry attempts                           10
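
To make the defaults concrete, here is a minimal sketch of how a service might read these variables and what the retry settings imply. The `ConsumerSettings` class is hypothetical and not part of S4E. The reconnect estimate assumes a fixed delay between attempts with no backoff, which the table does not confirm.

```python
import os
from dataclasses import dataclass

# Defaults copied from the table above.
DEFAULTS = {
    "RABBITMQ_PREFETCH_COUNT": 10,
    "RABBITMQ_HEARTBEAT": 60,
    "RABBITMQ_CONNECTION_TIMEOUT": 30,
    "RABBITMQ_RETRY_DELAY": 5,
    "RABBITMQ_MAX_RETRIES": 10,
}


@dataclass(frozen=True)
class ConsumerSettings:
    """Hypothetical container for the consumer settings."""
    prefetch_count: int
    heartbeat: int
    connection_timeout: int
    retry_delay: int
    max_retries: int

    @classmethod
    def from_env(cls, env=None):
        env = os.environ if env is None else env
        # Fall back to the documented default when a variable is unset.
        v = {name: int(env.get(name, default)) for name, default in DEFAULTS.items()}
        return cls(
            prefetch_count=v["RABBITMQ_PREFETCH_COUNT"],
            heartbeat=v["RABBITMQ_HEARTBEAT"],
            connection_timeout=v["RABBITMQ_CONNECTION_TIMEOUT"],
            retry_delay=v["RABBITMQ_RETRY_DELAY"],
            max_retries=v["RABBITMQ_MAX_RETRIES"],
        )

    def max_reconnect_wait(self) -> int:
        """Total seconds spent waiting between retries (fixed-delay assumption)."""
        return self.retry_delay * self.max_retries


settings = ConsumerSettings.from_env({})
print(settings.max_reconnect_wait())
```

Under that assumption, the defaults give a service roughly 50 seconds of waiting between attempts before it gives up, which is worth comparing against how long a broker restart takes in your cluster.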

Message Durability

All S4E queues are configured as durable by default:

  • Messages are persisted to disk.
  • Queues survive broker restarts.
  • Consumer acknowledgments ensure at-least-once delivery.
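
At-least-once delivery means a message can be delivered more than once, for example when a worker crashes after processing but before acknowledging. The sketch below shows one common way to tolerate that: deduplicating on a message identifier. The `IdempotentHandler` class and the message IDs are hypothetical. S4E's own workers may handle duplicates differently.

```python
class IdempotentHandler:
    """Wrap a handler so that redelivered messages are processed only once."""

    def __init__(self, handler):
        self._handler = handler
        # In-memory for illustration; a real deployment would need a store
        # shared by all worker replicas.
        self._seen = set()

    def handle(self, message_id, body):
        """Return True if the body was processed, False if it was a duplicate."""
        if message_id in self._seen:
            return False
        self._handler(body)
        # Record the ID only after success, so a failure leads to a retry
        # rather than a silently dropped message.
        self._seen.add(message_id)
        return True


processed = []
worker = IdempotentHandler(processed.append)
worker.handle("msg-1", "scan example.com")
worker.handle("msg-1", "scan example.com")  # redelivery of the same message
print(processed)
```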

Dead Letter Handling

Messages that fail processing after the configured retry count are routed to the dead letter exchange:

Original Queue --> (reject after N retries) --> s4e.dead_letter --> dead_letter_queue

Monitor the dead_letter_queue for messages that require manual investigation.
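
The flow above can be simulated without a broker. The following sketch uses in-memory queues to show the retry-then-dead-letter decision. `MAX_RETRIES = 3` and the record fields are assumptions for illustration, since the source only refers to a "configured retry count".

```python
from collections import deque

MAX_RETRIES = 3  # assumed value; the actual retry count is configuration-dependent


def process_with_retries(messages, handler, max_retries=MAX_RETRIES):
    """Retry failing messages up to max_retries times, then dead-letter them."""
    queue = deque((body, 0) for body in messages)  # (body, failed attempts)
    done, dead_letter_queue = [], []
    while queue:
        body, attempts = queue.popleft()
        try:
            handler(body)
            done.append(body)
        except Exception as exc:
            attempts += 1
            if attempts > max_retries:
                # Initial attempt plus max_retries retries have all failed.
                dead_letter_queue.append(
                    {"body": body, "reason": str(exc), "attempts": attempts}
                )
            else:
                queue.append((body, attempts))  # requeue for another try
    return done, dead_letter_queue


def flaky(body):
    if body == "bad":
        raise ValueError("cannot parse target")


done, dlq = process_with_retries(["ok", "bad"], flaky)
print(done, dlq)
```

Keeping the failure reason alongside the dead-lettered body, as the sketch does, makes manual investigation easier.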

Warning: Dead letter accumulation

Track the depth of dead_letter_queue over time. A steadily growing queue indicates persistent processing failures that need attention.

Performance Tuning

Prefetch Count

The prefetch count caps how many unacknowledged messages the broker delivers to a consumer at once:

Workload                     Recommended Prefetch
Fast tasks (< 1 second)      20-50
Medium tasks (1-30 seconds)  5-10
Slow tasks (> 30 seconds)    1-3

S4E scan workers should use a lower prefetch count because scan operations are long-running. Crawler pipeline stages can use a higher prefetch count for throughput.
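
As a quick way to apply the table, the sketch below maps an average task duration to the recommended prefetch range. The `recommended_prefetch` function is a hypothetical helper, and treating exactly 1 and exactly 30 seconds as "medium" is an assumption about the table's boundaries.

```python
def recommended_prefetch(avg_task_seconds: float) -> tuple:
    """Return the (low, high) prefetch range from the table above."""
    if avg_task_seconds < 1:
        return (20, 50)   # fast tasks
    if avg_task_seconds <= 30:
        return (5, 10)    # medium tasks
    return (1, 3)         # slow tasks, such as long-running scans


print(recommended_prefetch(0.2))
print(recommended_prefetch(120))
```

The resulting value would be set through RABBITMQ_PREFETCH_COUNT for the corresponding worker.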

Memory and Disk Alarms

Configure the thresholds at which RabbitMQ raises memory and disk alarms:

# rabbitmq.conf
vm_memory_high_watermark.relative = 0.6
vm_memory_high_watermark_paging_ratio = 0.5
disk_free_limit.relative = 2.0

When memory use reaches the high watermark, or free disk space falls below disk_free_limit (here, twice the total RAM), RabbitMQ blocks publishing connections until the alarm clears, which causes backpressure on S4E services.
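
To see what the relative settings mean in absolute terms, the arithmetic can be sketched as follows. It assumes the broker treats the 4Gi container memory limit from the in-cluster example as its total memory, which depends on the RabbitMQ version and how it detects container limits. The `alarm_thresholds` helper is hypothetical.

```python
GIB = 1024 ** 3


def alarm_thresholds(total_memory_bytes, mem_watermark=0.6, disk_relative=2.0):
    """Return (memory alarm threshold, minimum free disk) in bytes."""
    memory_alarm = total_memory_bytes * mem_watermark
    # disk_free_limit.relative is expressed as a multiple of total memory.
    min_free_disk = total_memory_bytes * disk_relative
    return memory_alarm, min_free_disk


# Assumption: the broker sees the 4Gi limit as its total memory.
mem, disk = alarm_thresholds(4 * GIB)
print(f"memory alarm at {mem / GIB:.1f} GiB used")
print(f"disk alarm below {disk / GIB:.1f} GiB free")
```

Under these assumptions the memory alarm fires at about 2.4 GiB, and the disk alarm fires once less than 8 GiB is free, which is a large share of a 20Gi volume.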

Connection Limits

# rabbitmq.conf
channel_max = 2047

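channel_max is the maximum number of channels negotiated per connection, not a broker-wide total. Ensure it covers the channels that a single S4E service replica opens on one connection.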

Monitoring

Management UI

The RabbitMQ Management UI listens on port 15672. Forward the port to your workstation:

kubectl -n s4e port-forward svc/rabbitmq 15672:15672

Access at http://localhost:15672 with your configured credentials.

Key Metrics

Metric                         Description                     Alert Threshold
queue_messages_ready           Messages waiting for consumers  > 1000 for > 5 minutes
queue_messages_unacknowledged  Messages being processed        > 500 for > 10 minutes
consumers                      Active consumer count           Drops to 0
message_rates.publish          Message publish rate            Sudden drop to 0
message_rates.deliver          Message delivery rate           Falls below publish rate consistently
mem_used                       Memory consumption              > 80% of watermark
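
Several thresholds above are duration-based ("> 1000 for > 5 minutes"), so a single high reading should not fire an alert. The sketch below shows one way to evaluate such a condition over a series of samples. The `sustained_above` function and the sample data are hypothetical. In practice this logic usually lives in the alerting system rather than in application code.

```python
def sustained_above(samples, threshold, min_duration_s):
    """True if the metric has stayed above threshold for longer than
    min_duration_s, up to and including the latest sample.

    samples: list of (timestamp_seconds, value), oldest first.
    """
    breach_start = None
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts  # start of the current breach
        else:
            breach_start = None    # any dip resets the window
    if breach_start is None:
        return False
    return samples[-1][0] - breach_start > min_duration_s


# One sample per minute for seven minutes, all above 1000 ready messages.
backlog = [(t * 60, 1500) for t in range(7)]
print(sustained_above(backlog, threshold=1000, min_duration_s=300))
```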

Prometheus Integration

Enable the RabbitMQ Prometheus plugin:

rabbitmq-plugins enable rabbitmq_prometheus

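Metrics are exposed at http://rabbitmq:15692/metrics. If you run the Prometheus Operator, scrape them with a ServiceMonitor. The port field must match the name of the metrics port on the RabbitMQ Service: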

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: rabbitmq-monitor
  namespace: s4e
spec:
  selector:
    matchLabels:
      app: rabbitmq
  endpoints:
    - port: prometheus
      path: /metrics
      interval: 30s

Troubleshooting

Issue                          Cause                     Solution
Queue depth growing            Consumers not keeping up  Scale worker replicas or increase prefetch
Connection refused             RabbitMQ pod not ready    Check pod status and readiness probe
Memory alarm triggered         High message volume       Increase memory limits or add consumers
Messages in dead letter queue  Processing failures       Inspect message content and worker logs
Split-brain in cluster         Network partition         Follow RabbitMQ partition handling procedure

Next Steps