
Architecture & Data Flow

Overview

The SOC follows a layered data flow: collect at endpoints → correlate in Wazuh Manager → store in Wazuh Indexer → auto-dispatch to TheHive/CrowdSec/Teams → human triage and investigation in a Case → optional forensics via Velociraptor.

Current state (2026-04-15) has 2 key deviations from the target roadmap:

  1. Orchestration runs via simple custom scripts on Wazuh Manager, not via Shuffle (SOAR). Shuffle is deployed but idle — playbooks haven't been written yet.
  2. Tier 3 components are not deployed: Suricata (NIDS) is blocked by a missing SPAN port, Grafana is not deployed, and Sigma community rules are not imported.

Data flow (step by step)

Step 1. Collection

  • Wazuh agents (35 active) on Academy's CTs/hosts collect login events, File Integrity Monitoring (FIM) changes, process activity, USB events, and vulnerability scan results.
  • FortiGate sends syslog to Wazuh Manager (port 514/udp).

Step 2. Reception and correlation

  • Wazuh Manager (CT702) receives events on port 1514 (agent protocol, TCP+UDP).
  • Applies decoders + rules:
    • Built-in Wazuh ruleset (thousands of rules)
    • Local /var/ossec/etc/rules/local_rules.xml (including RFC 5737 suppressions — see Daily Routine Tier 1)
  • Generates alerts with level 0-15 (0 = suppressed, 15 = critical).
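
The RFC 5737 suppressions mentioned above live in local_rules.xml as Wazuh rules; the Python below is only an illustrative sketch of the matching logic, not the rule file itself. It assumes the suppressions key on the alert's srcip and force the level to 0; effective_level is a hypothetical helper name.

```python
import ipaddress

# RFC 5737 reserves these three blocks for documentation; they should never
# appear as genuine traffic sources on a live network.
RFC5737_NETS = [
    ipaddress.ip_network(n)
    for n in ("192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24")
]


def is_rfc5737(srcip: str) -> bool:
    """Return True if srcip falls inside an RFC 5737 documentation range."""
    try:
        addr = ipaddress.ip_address(srcip)
    except ValueError:
        return False  # not a valid IP literal, so nothing to suppress
    return any(addr in net for net in RFC5737_NETS)


def effective_level(rule_level: int, srcip: str) -> int:
    """Hypothetical helper: level 0 (suppressed) for documentation-range sources."""
    return 0 if is_rfc5737(srcip) else rule_level
```

With this sketch, an alert at level 12 from 203.0.113.5 would be reduced to level 0, while the same alert from any other address keeps its original level.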

Step 3. Storage

  • All events are indexed in Wazuh Indexer (3-node OpenSearch cluster):
    • CT701 primary on siem-px1
    • CT704 replica 1 on siem-px3
    • CT705 replica 2 on siem-px5
  • Each document is stored on 2 nodes (replication factor 2), so losing any 1 node does not lose data.
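
The single-node-loss claim can be checked by brute force. The sketch below is not OpenSearch code; it simply enumerates every way 2 copies of a document could be placed on the 3 indexer nodes and every possible set of failed nodes. The function names are illustrative.

```python
from itertools import combinations

NODES = ["CT701", "CT704", "CT705"]


def placements(copies: int):
    """All ways to place `copies` copies of one document on distinct nodes."""
    return [set(c) for c in combinations(NODES, copies)]


def survives(copies: int, lost: int) -> bool:
    """True if every placement keeps at least one copy after losing `lost` nodes."""
    for placed in placements(copies):
        for down in combinations(NODES, lost):
            if placed <= set(down):  # every copy sat on a failed node
                return False
    return True
```

With 2 copies per document, survives(2, 1) holds for every placement, whereas survives(2, 2) does not: a simultaneous loss of two nodes can take out both copies of some documents.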

Step 4. Automatic dispatch (level ≥ 10)

⚠️ Divergence from plan: instead of a single Shuffle (SOAR) conductor, there are currently 3 independent custom scripts on Wazuh Manager (temporary — migration to Shuffle tracked as TASK-114):

| Script                   | Destination                          | Purpose                              |
|--------------------------|--------------------------------------|--------------------------------------|
| custom-thehive.py        | TheHive (CT710) → POST /api/v1/alert | Create Alert in TheHive (NOT Case!)  |
| custom-crowdsec-block.py | CrowdSec (CT707) → POST /v1/alerts   | Ban IP (with whitelist filter + TTL) |
| custom-teams.sh          | Microsoft Teams webhook              | Adaptive Card in SIEM-Alerts channel |

These three jobs run in parallel from Wazuh Manager.
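
The level gate and parallel fan-out can be sketched as follows. This is a minimal illustration, not the contents of the deployed scripts: the real integrations are separate files, whereas here they are modeled as callables in a thread pool. The input keys (rule.level, rule.description, agent.name, data.srcip, full_log) follow the usual Wazuh alert JSON layout; the choice of TheHive fields and the names build_thehive_alert and dispatch are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

DISPATCH_LEVEL = 10  # threshold from Step 4


def build_thehive_alert(alert: dict) -> dict:
    """Map a Wazuh alert to a TheHive-style alert body (field choice is assumed)."""
    rule = alert.get("rule", {})
    agent = alert.get("agent", {})
    srcip = alert.get("data", {}).get("srcip")
    body = {
        "type": "wazuh",
        "source": "wazuh-manager",
        "sourceRef": str(alert.get("id", "")),
        "title": f"[L{rule.get('level')}] {rule.get('description', 'Wazuh alert')}",
        "description": alert.get("full_log", ""),
        "tags": [f"rule:{rule.get('id')}", f"agent:{agent.get('name')}"],
        "observables": [],
    }
    if srcip:
        body["observables"].append({"dataType": "ip", "data": srcip})
    return body


def dispatch(alert: dict, handlers: dict) -> dict:
    """Run every handler in parallel, but only for alerts at or above the threshold."""
    if alert.get("rule", {}).get("level", 0) < DISPATCH_LEVEL:
        return {}  # below threshold: indexed, but not dispatched
    with ThreadPoolExecutor(max_workers=max(1, len(handlers))) as pool:
        futures = {name: pool.submit(fn, alert) for name, fn in handlers.items()}
        return {name: fut.result() for name, fut in futures.items()}
```

Calling dispatch(alert, {"thehive": build_thehive_alert, "teams": send_card}) with a level 12 alert returns one result per handler; a level 9 alert returns an empty dict. Here send_card stands in for whatever callable posts the Teams card.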

Step 5. Triage (Tier 1 analyst, human)

Analyst opens TheHive Alerts queue. For each Alert:

  1. 📄 Preview and Import (icon on the alert row) → modal with description, tags, observables.
  2. Based on the text (rule, agent, srcip, full_log) the analyst decides:
    • Suspicious → "Yes, Import" → alert becomes a Case
    • FP (noise) → Cancel, "Mark as read"
  3. Cortex analyzers (VirusTotal, AbuseIPDB, MISP) run only inside a Case (not on Alerts) — the "Run analyzers" button is available on an observable in Case → Observables tab.

Alert vs Case — the fundamental distinction:

|                     | Alert                          | Case                                       |
|---------------------|--------------------------------|--------------------------------------------|
| How it's created    | Automatically (Wazuh→TheHive)  | Manually (from an Alert via "Yes, Import") |
| What it means       | Raw warning, not yet triaged   | Confirmed incident under investigation     |
| How many            | Many (~100-1000/day), 70% FP   | Few (~5-20/day), all significant           |
| Cortex analyzers    | ❌ not available               | ✅ "Run analyzers" button on each observable |
| Where in TheHive UI | Alerts tab                     | Cases tab                                  |

Without the "Alert → triage → Case" intermediate step, TheHive would be flooded with noise within days.
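
The Alert/Case split can be pictured as a toy data model. This is not TheHive's API or schema; the class and method names are hypothetical and only encode the rule from the table above: enrichment exists on a Case and nowhere else.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    title: str
    observables: list = field(default_factory=list)

    def run_analyzers(self, analyzers: list) -> dict:
        """Enrichment lives only here; Alert deliberately has no such method."""
        return {obs: list(analyzers) for obs in self.observables}


@dataclass
class Alert:
    title: str
    observables: list = field(default_factory=list)
    read: bool = False

    def import_as_case(self) -> Case:
        """Analyst clicked "Yes, Import": promote the alert to a Case."""
        return Case(title=self.title, observables=list(self.observables))

    def mark_as_read(self) -> None:
        """Analyst judged the alert a false positive."""
        self.read = True
```

An Alert can only be promoted or marked as read; calling run_analyzers requires first going through import_as_case, which mirrors the human click in the UI.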

Step 6. Investigation (Tier 2+)

Analyst works inside a TheHive Case:

  • Adds tasks (what to check)
  • Adds observables (discovered IoCs)
  • For each observable — Cortex enrichment (VT/AbuseIPDB/MISP again)
  • Searches MISP — "has anyone else seen this IoC? when? in which event?"
  • If needed — Velociraptor hunt on the host: "show processes / files / registry"

Step 7. WAN attack response

This is a parallel flow, independent of TheHive:

  • CrowdSec stores decisions locally (SQLite)
  • blocklist-mirror on CT707 exports active decisions as an HTTP feed (http://10.250.0.16/security/blocklist)
  • FortiGate pulls the feed every minute as an external-resource
  • FortiGate Policy 137 DROPs traffic from IPs in the feed

So a ban takes effect within 1-2 minutes of the Wazuh alert — the attacking IP loses access even to Academy's public services.
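
The whitelist filter, TTL, and feed export can be sketched together. This is an illustration of the idea only, not CrowdSec or blocklist-mirror code: the decision structure, the whitelist contents, and the function names are assumptions, and the feed is assumed to be a plain list of IPs, one per line.

```python
import ipaddress
from datetime import datetime, timedelta, timezone


def should_ban(ip: str, whitelist: list) -> bool:
    """False if the IP falls inside any whitelisted network (assumed CIDR strings)."""
    addr = ipaddress.ip_address(ip)
    return not any(addr in ipaddress.ip_network(net) for net in whitelist)


def make_decision(ip: str, now: datetime, ttl: timedelta) -> dict:
    """A ban decision that expires `ttl` after it was created."""
    return {"ip": ip, "expires": now + ttl}


def render_feed(decisions: list, now: datetime) -> str:
    """Plain-text feed: one still-active banned IP per line, duplicates removed."""
    active = sorted({d["ip"] for d in decisions if d["expires"] > now})
    return "\n".join(active)
```

Because FortiGate pulls the feed every minute, a new decision is picked up at the next pull after it is written, and an expired decision drops out of the feed the same way, with no explicit unban step.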


Diagram (current real flow)

  External threat feeds                            ┌────────────────────────┐
  (CERT-UA, MISP feeds)──────hourly feed────────▶│  MISP (CT706)           │
                                                  │  570+ events           │
                                                  └───────┬─────────┬──────┘
                                                          │         │
                                           (analyzer query│         │ hourly sync
                                            from Cortex)  │         ▼
                                                          │  ┌───────────────┐
                                                          │  │ TheHive Alerts │
                                                          │  │ queue (CT710) │
                                                          │  └───────┬───────┘
                                                          │          │ Import as Case
                                                          │          │ (human click)
                                                          ▼          ▼
                                                    ┌──────────────────────┐
                                                    │  Cortex (CT711)       │
                                                    │  ↑ enrichment         │
                                                    │  VT+AbuseIPDB+MISP    │
                                                    └──────────┬───────────┘
                     ┌───────────────┐                         │
35 Wazuh agents ─────│                │  custom-thehive.py    │
FortiGate syslog ────│ Wazuh Manager ├───────────────────────▶│
                     │    (CT702)     │  custom-crowdsec...    │
                     │                │─────────▶ CrowdSec (CT707) ──feed──▶ FortiGate Policy 137 DROP
                     │                │  custom-teams.sh
                     │                │─────────▶ MS Teams SIEM-Alerts
                     └───────┬────────┘
                             │ all events indexed
                     ┌─────────────────────────────────┐
                     │  Wazuh Indexer (OpenSearch 3-node)│
                     │  Primary:  CT701 @ siem-px1      │
                     │  Replica1: CT704 @ siem-px3      │
                     │  Replica2: CT705 @ siem-px5      │
                     └──────────┬──────────────────────┘
                                │ queries
                     ┌──────────────────────┐
                     │ Wazuh Dashboard (CT703)│
                     └──────────────────────┘


                     ┌───────────────────────────┐
                     │ Velociraptor (CT713)       │
                     │ — server deployed          │
                     │ — clients NOT yet deployed │  ⟵ subtask 26
                     └───────────────────────────┘


🟡 NOT IN FLOW currently:
   Shuffle (idle — no playbooks written, subtask 22)
   Suricata (not deployed — blocked by SPAN port)
   Grafana (not deployed, Tier 3)
   Sigma rules (not imported, Tier 3)

Integration points (active)

| From            | To            | Method                                                  | Status                            |
|-----------------|---------------|---------------------------------------------------------|-----------------------------------|
| Wazuh Agent     | Wazuh Manager | Agent protocol (1514 TCP+UDP)                           | 🟢 35 agents active               |
| FortiGate       | Wazuh Manager | Syslog (514 UDP)                                        | 🟢                                |
| Wazuh Manager   | Wazuh Indexer | REST API (9200 TCP)                                     | 🟢 3-node cluster                 |
| Wazuh Manager   | TheHive       | custom-thehive.py → HTTPS POST /api/v1/alert            | 🟢 (level ≥ 10)                   |
| Wazuh Manager   | CrowdSec      | custom-crowdsec-block.py → HTTP POST /v1/alerts         | 🟢 (level ≥ 10, with whitelist)   |
| Wazuh Manager   | MS Teams      | custom-teams.sh → HTTPS Power Automate webhook          | 🟢 (level ≥ 10)                   |
| Wazuh Dashboard | Wazuh Indexer | REST API (9200)                                         | 🟢                                |
| TheHive         | Cortex        | REST API from application.conf (Bearer auth)            | 🟢                                |
| TheHive         | MISP          | REST API from application.conf (key auth, hourly sync)  | 🟢 570+ events imported           |
| Cortex          | VirusTotal    | REST API (HTTPS outbound)                               | 🟢 500 lookups/day free tier      |
| Cortex          | AbuseIPDB     | REST API (HTTPS outbound)                               | 🟢 1000 lookups/day free tier     |
| Cortex          | MISP (local)  | REST API                                                | 🟢                                |
| CrowdSec        | FortiGate     | External-resource HTTP feed (pulled every minute)       | 🟢 Policy 137 DROP                |

Planned integrations (NOT active)

| Integration                                   | Purpose                                        | Tracked in BACKLOG     |
|-----------------------------------------------|------------------------------------------------|------------------------|
| Wazuh Manager → Shuffle                       | SOAR orchestration replacing custom scripts    | TASK-092d subtask 22   |
| Shuffle → TheHive / Cortex / Teams / CrowdSec | Playbooks for automation flows                 | subtask 22             |
| MISP → Wazuh CDB lists                        | IoC matching in Wazuh rules (99906-99920)      | subtask 28             |
| Suricata → Wazuh Manager                      | Network-level detection                        | TASK-092e              |
| Sigma rules → Wazuh rules                     | 100+ community detection rules                 | TASK-092e              |
| Wazuh Indexer → Grafana                       | Executive dashboards for Manager/CISO          | TASK-092e              |
| Teams Adaptive Card → action buttons          | One-click "Open in TheHive" / "Wazuh Discover" | subtask 29             |
| Velociraptor agents → Velociraptor server     | Endpoint forensics actually working            | subtask 26 (Phase 1-3) |

Why this flow (historical context)

The original plan (Tier 2 deployment plan 2026-04-14) had Shuffle as the central orchestration layer. During actual deployment we decided:

  1. Ship the MVP fast — Wazuh→TheHive flow is critical, can't wait for playbook authoring
  2. Write simple custom Python/bash scripts (~99 lines total) for trivial flows
  3. Keep Shuffle deployed but idle — until more complex workflows (if → else → parallel branches) are needed

This made the MVP live in 1 day instead of weeks. The cost — code in 3 places instead of one Shuffle UI.

⚠️ TASK-114: Migration of custom scripts → Shuffle (mandatory)

Principle: if something CAN live in Shuffle instead of Wazuh Manager as a custom script — it MUST be in Shuffle. Wazuh Manager = detection + correlation. Shuffle SOAR = orchestration + response + notification.

All 3 custom scripts (custom-thehive.py, custom-crowdsec-block.py, custom-teams.sh) must be replaced by Shuffle playbooks. Migration is incremental (one script at a time with parallel testing). Tracked in BACKLOG as TASK-114.


Last updated: 2026-04-16.