Architecture & Data Flow

Overview

The SOC follows a layered data flow: collect at endpoints → correlate in Wazuh Manager → store in Wazuh Indexer → auto-dispatch to TheHive/CrowdSec/Teams → human triage → investigation in a Case → optional forensics via Velociraptor.

Current state (2026-04-15) has 2 key deviations from the target roadmap:

- Orchestration runs via simple custom scripts on Wazuh Manager, not via Shuffle (SOAR). Shuffle is deployed but idle because no playbooks have been written yet.
- Tier 3 components are not deployed: Suricata (NIDS) is blocked by a missing SPAN port, Grafana is not deployed, and the Sigma community rules are not imported.

Data flow (step by step)

Step 1. Collection

Wazuh agents (35 active) on Academy's CTs/hosts collect login events, File Integrity Monitoring (FIM) changes, process activity, USB events, and vulnerability scan results.
FortiGate sends syslog to Wazuh Manager (port 514/udp).

Step 2. Reception and correlation

Wazuh Manager (CT702) receives events on port 1514 (agent protocol, TCP+UDP).
It applies decoders and rules:
- Built-in Wazuh ruleset (thousands of rules)
- Local /var/ossec/etc/rules/local_rules.xml (including RFC 5737 suppressions; see Daily Routine Tier 1)
It then generates alerts with a level from 0 to 15 (0 = suppressed, 15 = critical).
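
The actual suppressions live in local_rules.xml and are not reproduced here. As a rough illustration of the logic only, the sketch below checks whether a source IP falls in one of the three RFC 5737 documentation ranges and, if so, treats the alert as level 0. The helper `effective_level` and the choice of `data.srcip` as the field to test are assumptions made for the example, not a description of the real rules.

```python
import ipaddress

# RFC 5737 reserves these three blocks for documentation and examples.
RFC5737_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),     # TEST-NET-1
    ipaddress.ip_network("198.51.100.0/24"),  # TEST-NET-2
    ipaddress.ip_network("203.0.113.0/24"),   # TEST-NET-3
]


def is_rfc5737(ip: str) -> bool:
    """Return True if ip is inside one of the RFC 5737 documentation ranges."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # not a valid IP address at all
    return any(addr in net for net in RFC5737_NETWORKS)


def effective_level(alert: dict) -> int:
    """Hypothetical helper: level 0 for documentation-range sources, else the rule level."""
    srcip = alert.get("data", {}).get("srcip", "")
    if is_rfc5737(srcip):
        return 0
    return int(alert.get("rule", {}).get("level", 0))
```

For example, an alert with rule level 12 and `srcip` 203.0.113.7 would come out as level 0 here, while the same alert from 8.8.8.8 keeps level 12.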

Step 3. Storage

All events are indexed in Wazuh Indexer (3-node OpenSearch cluster):
- CT701 primary on siem-px1
- CT704 replica 1 on siem-px3
- CT705 replica 2 on siem-px5
Data is kept with a replication factor of 2 (each document on 2 nodes, which in OpenSearch terms is one primary shard plus one replica shard), so losing 1 node doesn't lose data.
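
The fault-tolerance claim can be checked with simple arithmetic. The sketch below is a hypothetical helper, not part of the deployment. It assumes every copy of a document sits on a different node, which is how OpenSearch places a primary shard and its replicas.

```python
def tolerated_node_losses(node_count: int, copies_per_doc: int) -> int:
    """How many nodes can fail before some document may have no surviving copy.

    Assumes each copy of a document is allocated to a different node.
    """
    if node_count < 1 or copies_per_doc < 1:
        raise ValueError("node_count and copies_per_doc must be >= 1")
    # Copies cannot outnumber nodes; surplus replicas would stay unassigned.
    effective_copies = min(copies_per_doc, node_count)
    return effective_copies - 1
```

With 3 nodes and 2 copies per document, `tolerated_node_losses(3, 2)` gives 1, which matches the statement above. This covers data survival only and says nothing about cluster-manager quorum.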

Step 4. Automatic dispatch (level ≥ 10)

⚠️ Divergence from plan: instead of a single Shuffle (SOAR) conductor, there are currently 3 independent custom scripts on Wazuh Manager (temporary — migration to Shuffle tracked as TASK-114):

Script	Destination	Purpose
`custom-thehive.py`	TheHive (CT710) → POST `/api/v1/alert`	Create Alert in TheHive (NOT Case!)
`custom-crowdsec-block.py`	CrowdSec (CT707) → POST `/v1/alerts`	Ban IP (with whitelist filter + TTL)
`custom-teams.sh`	Microsoft Teams webhook	Adaptive Card in SIEM-Alerts channel

These three jobs run in parallel from Wazuh Manager.
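
To illustrate what "independent and in parallel" buys, here is a minimal sketch of a threshold check followed by a parallel fan-out. It is not how the three scripts are actually launched (that mechanism is not described here). The `dispatch` function and the handler names are hypothetical, and the only value taken from this document is the level ≥ 10 threshold.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

DISPATCH_LEVEL = 10  # threshold stated in step 4

Handler = Callable[[dict], None]


def dispatch(alert: dict, handlers: Dict[str, Handler]) -> Dict[str, str]:
    """Run every handler in parallel; return 'ok' or the error text per handler."""
    level = int(alert.get("rule", {}).get("level", 0))
    if level < DISPATCH_LEVEL:
        return {}  # below threshold: nothing is dispatched

    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(handlers))) as pool:
        futures = {name: pool.submit(fn, alert) for name, fn in handlers.items()}
        for name, future in futures.items():
            try:
                future.result(timeout=30)
                results[name] = "ok"
            except Exception as exc:
                # A failing destination is recorded but does not stop the others.
                results[name] = f"error: {exc}"
    return results
```

The point of the design is isolation: if CrowdSec is unreachable, the TheHive alert and the Teams card still go out.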

Step 5. Triage (Tier 1 analyst, human)

Analyst opens TheHive Alerts queue. For each Alert:

Click 📄 Preview and Import (icon on the alert row) to open a modal with the description, tags, and observables.
Based on the text (rule, agent, srcip, full_log) the analyst decides:
- Suspicious → "Yes, Import" → alert becomes a Case
- False positive (noise) → Cancel, then "Mark as read"
Cortex analyzers (VirusTotal, AbuseIPDB, MISP) run only inside a Case, not on Alerts. The "Run analyzers" button is available on each observable under Case → Observables tab.
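
The fields the analyst reads in the preview modal (rule, agent, srcip, full_log) originate in the Wazuh alert JSON. The sketch below shows one way a script such as `custom-thehive.py` might map them onto a TheHive alert payload. It is not the actual script. The payload keys follow the TheHive 5 alert API as I understand it and should be verified against your version. The severity mapping, tag format, and `type`/`source` values are assumptions.

```python
def build_thehive_alert(alert: dict) -> dict:
    """Map a Wazuh alert dict to a TheHive alert payload (illustrative only)."""
    rule = alert.get("rule", {})
    agent = alert.get("agent", {})
    srcip = alert.get("data", {}).get("srcip")
    level = int(rule.get("level", 0))

    # Human-readable text shown to the analyst in the preview modal.
    description = "\n".join([
        f"Rule: {rule.get('id', '?')} {rule.get('description', '')}".rstrip(),
        f"Agent: {agent.get('name', '?')}",
        f"Source IP: {srcip or 'n/a'}",
        "",
        "Full log:",
        alert.get("full_log", ""),
    ])

    payload = {
        "type": "wazuh",              # assumed value
        "source": "wazuh-manager",    # assumed value
        "sourceRef": str(alert.get("id", "")),
        "title": f"[L{level}] {rule.get('description', 'Wazuh alert')}",
        "description": description,
        "severity": 3 if level >= 13 else 2,  # hypothetical level-to-severity mapping
        "tags": [f"rule:{rule.get('id', '?')}", f"agent:{agent.get('name', '?')}"],
        "observables": [],
    }
    if srcip:
        # Attaching the source IP as an observable lets Cortex enrich it after import.
        payload["observables"].append({"dataType": "ip", "data": srcip})
    return payload
```

Attaching `srcip` as an observable at alert time means it is already present when the alert is imported as a Case, so the analyst can run analyzers on it without retyping the address.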

Alert vs Case — the fundamental distinction:

Aspect	Alert	Case
How it's created	Automatically (Wazuh→TheHive)	Manually (from an Alert via "Yes, Import")
What it means	Raw warning, not yet triaged	Confirmed incident under investigation
How many	Many (~100-1000/day), 70% FP	Few (~5-20/day), all significant
Cortex analyzers	❌ not available	✅ "Run analyzers" button on each observable
Where in TheHive UI	Alerts tab	Cases tab

Without the "Alert → triage → Case" intermediate step, TheHive would be flooded with noise within days.

Step 6. Investigation (Tier 2+)

Analyst works inside a TheHive Case:

Adds tasks (what to check)
Adds observables (discovered IoCs)
For each observable, runs Cortex enrichment (VirusTotal, AbuseIPDB, MISP)
Searches MISP: "has anyone else seen this IoC? when? in which event?"
If needed, runs a Velociraptor hunt on the host: "show processes / files / registry" (possible only once Velociraptor clients are deployed; see subtask 26)

Step 7. WAN attack response

This is a parallel flow, independent of TheHive:

CrowdSec stores decisions locally (SQLite)
blocklist-mirror on CT707 exports active decisions as an HTTP feed (http://10.250.0.16/security/blocklist)
FortiGate pulls the feed every minute as an external-resource
FortiGate Policy 137 DROPs traffic from IPs in the feed

As a result, a ban takes effect within 1-2 minutes of the Wazuh alert, and the attacking IP loses access even to Academy's public services.
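
The ban path combines three ideas mentioned in this document: a whitelist filter, a TTL on each ban, and a plain-text feed that FortiGate polls. The sketch below illustrates them in isolation. It is not the code of `custom-crowdsec-block.py` or of blocklist-mirror. The whitelist contents, the 4-hour TTL, and the decision dict shape are all hypothetical.

```python
import ipaddress
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional

# Hypothetical whitelist: never ban internal or documentation ranges.
WHITELIST = [ipaddress.ip_network(n) for n in ("10.0.0.0/8", "192.0.2.0/24")]
BAN_TTL = timedelta(hours=4)  # hypothetical TTL


def make_ban(ip: str, now: datetime) -> Optional[dict]:
    """Return a ban decision, or None if the IP is invalid or whitelisted."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return None
    if any(addr in net for net in WHITELIST):
        return None
    return {"ip": str(addr), "expires": now + BAN_TTL}


def render_feed(decisions: Iterable[dict], now: datetime) -> str:
    """Plain-text feed: one still-active IP per line, deduplicated and sorted as strings."""
    active = {d["ip"] for d in decisions if d["expires"] > now}
    return "".join(f"{ip}\n" for ip in sorted(active))
```

Expired decisions simply drop out of the rendered feed, so the firewall stops blocking an address on its next pull without any explicit "unban" call.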

Diagram (current real flow)

  External threat feeds                            ┌────────────────────────┐
  (CERT-UA, MISP feeds)──────hourly feed────────▶│  MISP (CT706)           │
                                                  │  570+ events           │
                                                  └───────┬─────────┬──────┘
                                                          │         │
                                           (analyzer query│         │ hourly sync
                                            from Cortex)  │         ▼
                                                          │  ┌───────────────┐
                                                          │  │ TheHive Alerts │
                                                          │  │ queue (CT710) │
                                                          │  └───────┬───────┘
                                                          │          │ Import as Case
                                                          │          │ (human click)
                                                          ▼          ▼
                                                    ┌──────────────────────┐
                                                    │  Cortex (CT711)       │
                                                    │  ↑ enrichment         │
                                                    │  VT+AbuseIPDB+MISP    │
                                                    └──────────┬───────────┘
                                                               │
                     ┌───────────────┐                         │
35 Wazuh agents ─────│                │  custom-thehive.py    │
FortiGate syslog ────│ Wazuh Manager ├───────────────────────▶│
                     │    (CT702)     │  custom-crowdsec...    │
                     │                │─────────▶ CrowdSec (CT707) ──feed──▶ FortiGate Policy 137 DROP
                     │                │  custom-teams.sh
                     │                │─────────▶ MS Teams SIEM-Alerts
                     └───────┬────────┘
                             │ all events indexed
                             ▼
                     ┌─────────────────────────────────┐
                     │  Wazuh Indexer (OpenSearch 3-node)│
                     │  Primary:  CT701 @ siem-px1      │
                     │  Replica1: CT704 @ siem-px3      │
                     │  Replica2: CT705 @ siem-px5      │
                     └──────────┬──────────────────────┘
                                │ queries
                                ▼
                     ┌──────────────────────┐
                     │ Wazuh Dashboard (CT703)│
                     └──────────────────────┘


                     ┌───────────────────────────┐
                     │ Velociraptor (CT713)       │
                     │ — server deployed          │
                     │ — clients NOT yet deployed │  ⟵ subtask 26
                     └───────────────────────────┘


🟡 NOT IN FLOW currently:
   Shuffle (idle — no playbooks written, subtask 22)
   Suricata (not deployed — blocked by SPAN port)
   Grafana (not deployed, Tier 3)
   Sigma rules (not imported, Tier 3)

Integration points (active)

From	To	Method	Status
Wazuh Agent	Wazuh Manager	Agent protocol (1514 TCP+UDP)	🟢 35 agents active
FortiGate	Wazuh Manager	Syslog (514 UDP)	🟢
Wazuh Manager	Wazuh Indexer	REST API (9200 TCP)	🟢 3-node cluster
Wazuh Manager	TheHive	`custom-thehive.py` → HTTPS POST `/api/v1/alert`	🟢 (level ≥ 10)
Wazuh Manager	CrowdSec	`custom-crowdsec-block.py` → HTTP POST `/v1/alerts`	🟢 (level ≥ 10, with whitelist)
Wazuh Manager	MS Teams	`custom-teams.sh` → HTTPS Power Automate webhook	🟢 (level ≥ 10)
Wazuh Dashboard	Wazuh Indexer	REST API (9200)	🟢
TheHive	Cortex	REST API from `application.conf` (Bearer auth)	🟢
TheHive	MISP	REST API from `application.conf` (key auth, hourly sync)	🟢 570+ events imported
Cortex	VirusTotal	REST API (HTTPS outbound)	🟢 500 lookups/day free tier
Cortex	AbuseIPDB	REST API (HTTPS outbound)	🟢 1000 lookups/day free tier
Cortex	MISP (local)	REST API	🟢
CrowdSec	FortiGate	External-resource HTTP feed (pulled every minute)	🟢 Policy 137 DROP

Planned integrations (NOT active)

Integration	Purpose	Tracked in BACKLOG
Wazuh Manager → Shuffle	SOAR orchestration replacing custom scripts	TASK-092d subtask 22
Shuffle → TheHive / Cortex / Teams / CrowdSec	Playbooks for automation flows	subtask 22
MISP → Wazuh CDB lists	IoC matching in Wazuh rules (99906-99920)	subtask 28
Suricata → Wazuh Manager	Network-level detection	TASK-092e
Sigma rules → Wazuh rules	100+ community detection rules	TASK-092e
Wazuh Indexer → Grafana	Executive dashboards for Manager/CISO	TASK-092e
Teams Adaptive Card → action buttons	One-click "Open in TheHive" / "Wazuh Discover"	subtask 29
Velociraptor agents → Velociraptor server	Endpoint forensics actually working	subtask 26 (Phase 1-3)
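
For the planned action buttons (subtask 29), Adaptive Cards provide an `Action.OpenUrl` action type. The sketch below builds such a card as a Python dict. The outer message envelope is the shape commonly used for Teams webhooks, but what the Power Automate flow actually expects is an assumption here. The function name and the URLs in the usage note are hypothetical.

```python
def build_teams_card(title: str, summary: str, thehive_url: str, discover_url: str) -> dict:
    """Build a Teams message carrying an Adaptive Card with two link buttons."""
    card = {
        "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
        "type": "AdaptiveCard",
        "version": "1.4",
        "body": [
            {"type": "TextBlock", "text": title, "weight": "Bolder", "size": "Medium", "wrap": True},
            {"type": "TextBlock", "text": summary, "wrap": True},
        ],
        "actions": [
            # Each button only opens a URL in the browser; no callback is involved.
            {"type": "Action.OpenUrl", "title": "Open in TheHive", "url": thehive_url},
            {"type": "Action.OpenUrl", "title": "Wazuh Discover", "url": discover_url},
        ],
    }
    return {
        "type": "message",
        "attachments": [
            {"contentType": "application/vnd.microsoft.card.adaptive", "content": card}
        ],
    }
```

A call such as `build_teams_card("[L12] sshd brute force", "Agent ct-web01", "https://thehive.example/alerts/123", "https://wazuh.example/app/discover")` returns a dict that can be serialized with `json.dumps` and posted to the webhook.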

Why this flow (historical context)

The original plan (Tier 2 deployment plan, 2026-04-14) had Shuffle as the central orchestration layer. During actual deployment we decided to:

- Ship the MVP fast: the Wazuh→TheHive flow is critical and can't wait for playbook authoring
- Write simple custom Python/bash scripts (~99 lines total) for trivial flows
- Keep Shuffle deployed but idle until more complex workflows (if → else → parallel branches) are needed

This got the MVP live in 1 day instead of weeks. The cost is that the code lives in 3 places instead of one Shuffle UI.

⚠️ TASK-114: Migration of custom scripts → Shuffle (mandatory)

Principle: anything that CAN live in Shuffle rather than in a custom script on Wazuh Manager MUST live in Shuffle. Wazuh Manager = detection + correlation. Shuffle SOAR = orchestration + response + notification.

All 3 custom scripts (custom-thehive.py, custom-crowdsec-block.py, custom-teams.sh) must be replaced by Shuffle playbooks. Migration is incremental (one script at a time with parallel testing). Tracked in BACKLOG as TASK-114.

Last updated: 2026-04-16.