GreySec MAL — Master Kanban
Product: GreySec Malware Analysis Lab
Type: Internal Build Project
Status: BUILDING
Updated: 2026-05-07
Parent debrief: ~/greysec/ops/debriefs/malware-lab-2026-05-07.md
Background
GreySec MAL is a self-hosted malware analysis sandbox for red team operators. It takes a binary payload, detonates it in an isolated Windows 11 VM instrumented with EDR (Fibratus + Whiskers + RedEdr), captures behavioral events via RabbitMQ, and produces a client-facing analysis report with a Detection Score (0-100) and MITRE ATT&CK kill chain map.
Architecture:
Payload Upload → LitterBox (:1337) → SMB Share Mount → Windows VM (:1337)
↓
Fibratus (kernel events)
Whiskers (REST API :8080)
RedEdr (EDR reporting)
↓
RabbitMQ (event queue)
↓
variant_event_consumer (Python)
↓
Supabase (structured data)
↓
Detection Score + MITRE ATT&CK Report
Current status: ARCHITECTURE VERIFIED. 4 bugs (3 critical, 1 degraded) block end-to-end operation. Fix order is strict.
Pipeline Definition
What the product IS: Drop a binary. Get a Detection Score + MITRE ATT&CK kill chain. Client data never leaves your infrastructure.
What the client receives:
- Detection Score (0-100) — how likely this payload is to be flagged by EDR
- MITRE ATT&CK kill chain map — which tactics and techniques the payload uses
- Behavioral analysis summary — what the payload actually did (file ops, network ops, process ops)
- Raw event log (optional) — full Fibratus event stream for manual review
Target buyer:
- Red team operators testing C2 payloads before deployment
- MSSPs running adversary simulation for clients
- Security teams with HIPAA/BAA obligations that prevent cloud malware analysis
- Law firms and financial institutions with strict client confidentiality requirements
SLA (target):
- Analysis turnaround: < 5 minutes for typical payloads (< 10MB)
- Report available: via web dashboard or API
- Uptime: 99% (target, TBD with Adam)
Current State
What Works
- v1 Python payload: ran for 16 seconds, generated real EDR events, Fibratus saw them, Whiskers returned them via /api/alerts/fibratus/since — core event path verified
- RabbitMQ → variant_event_consumer → Supabase: working
- Docker-compose stack: LitterBox, RabbitMQ, Fibratus bridge, consumer all start cleanly
- Pre-flight check script exists at ~/bin/greysec/pre-flight-vm-check.sh (not yet run in a session)
What Is Broken
| # | Bug | Severity | Fix Time | Cascade |
|---|---|---|---|---|
| 1 | VM share mount \\172.28.0.1\share unreachable from Windows VM — payloads may not reach analysis dir | CRITICAL | 30 min | Blocks all testing |
| 2 | RedEdr returns zero events despite Fibratus seeing real syscalls — event data doesn't reach final report | CRITICAL | 30-60 min | Blocks EDR validation |
| 3 | Whiskers has no Windows service wrapper — dies when parent process exits, requires manual PAExec restart | CRITICAL | 1 hour | Blocks reliability |
| 4 | manager.py lines 418-419 hardcode init_wait_time = 5 regardless of config — payloads killed at 5s | DEGRADED | 30 min | Blocks extended runs |
Fix order: 1 → 2 → 3 → 4. Issue 4 is blocked by Issue 1 (can't test 4 until share mount works).
BOARD
BACKLOG
- Build Detection Score algorithm (0-100 from Fibratus event frequency + severity + MITRE technique count)
- Build web dashboard for results (currently Supabase only — no client-facing UI)
- Build client upload portal (currently manual curl to localhost:1337)
- Build MITRE ATT&CK kill chain mapper (Fibratus events → ATT&CK tactic/technique IDs)
- Write greysec-malware-pipeline skill (standalone — not yet created)
- Add payload hardening guidance output (what to change in the binary to lower Detection Score)
- Set up TLS for LitterBox API (currently plain HTTP — fine for internal, not for client-facing portal)
- Build multi-user access control (when portal is client-facing, need auth)
- Benchmark performance: typical payload analysis time, max payload size, concurrent analysis capacity
IN PROGRESS
(empty — no work currently active)
VALIDATING
(empty)
DONE
- Architecture design (RabbitMQ + Fibratus + Whiskers + Supabase)
- Docker-compose stack (LitterBox + RabbitMQ + bridges)
- v1 Python payload proves end-to-end event path
- Pre-flight VM check script written (~/bin/greysec/pre-flight-vm-check.sh)
- Supabase schema for analysis results
BLOCKED
- ISSUE 1: VM share mount — Cannot test payloads until SMB share is reachable from inside VM
- ISSUE 2: RedEdr zero events — Cannot validate EDR reporting until share mount works
Technical Fix Tasks
Task 1: Fix VM Share Mount (CRITICAL — do first)
What: \\172.28.0.1\share (SMB) not reachable from inside Windows VM at 172.28.0.10
Root cause: Docker bridge network (172.28.0.0/24) may not be attached to VM network interface. SMB port 445 may be blocked by Windows Firewall.
Fix approach A: Verify Docker bridge attachment and open Windows Firewall for SMB.
Fix approach B (preferred): Replace SMB mount with HTTP upload endpoint inside VM — more reliable across Docker bridge, no firewall holes.
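Before choosing between the two approaches, it may help to confirm whether TCP port 445 on the share host is reachable at all. The sketch below is a generic TCP reachability probe, not part of LitterBox; the function name `port_open` is hypothetical, and it assumes a Python interpreter is available wherever the check is run (for example inside the Windows VM).

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

Calling `port_open("172.28.0.1", 445)` from inside the VM distinguishes a network-level problem (returns False: bridge attachment or firewall) from a share-level problem (returns True, but the mount still fails).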
Files to touch:
- ~/greysec/tools/LitterBox/docker-compose.yml (change mount mechanism)
- May need new endpoint in ~/greysec/tools/LitterBox/app/analyzers/payload_receiver.py
Who: qwen2.5-coder:14b
Time: ~30 minutes
Verification: From inside VM: curl -F "file=@test.exe" http://172.28.0.1:PORT/upload returns 200
Acceptance criteria:
- VM can reach LitterBox upload endpoint
- Payload file appears in VM analysis directory
- LitterBox begins processing within 10 seconds of upload
Task 2: Fix RedEdr Zero Events (CRITICAL — do second)
What: Fibratus sees real syscalls. Whiskers /api/alerts/fibratus/since returns events. But RedEdr report shows nothing.
Root cause: Not yet identified. Trace path: Fibratus writes to Windows Application Event Log → Whiskers reads via wevtutil → publishes over HTTP → consumer receives. The break is somewhere between Whiskers and the final report.
Fix approach:
- Check Fibratus filter rules — are they capturing the right event types?
- Check Whiskers polling interval — is it fast enough?
- Check variant_event_consumer.py — is it parsing Whiskers output correctly?
- Run a known-syscall payload and trace events at each hop
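The hop-by-hop trace in the last step can be sketched as a small helper that takes an event count per hop, in pipeline order, and reports where events stop. The function name and the counts in the usage note are hypothetical; how each count is collected (Whiskers endpoint, queue depth, Supabase row count) is left open here.

```python
from typing import Optional, Sequence, Tuple


def first_silent_hop(
    hop_counts: Sequence[Tuple[str, int]],
) -> Optional[Tuple[Optional[str], str]]:
    """Locate the first hop that saw zero events.

    `hop_counts` lists (hop_name, event_count) pairs in pipeline order.
    Returns (last_hop_with_events, first_hop_without_events), or None if
    every hop saw at least one event. The first element is None when the
    very first hop is already silent.
    """
    previous: Optional[str] = None
    for name, count in hop_counts:
        if count == 0:
            return (previous, name)
        previous = name
    return None
```

For example, `first_silent_hop([("Fibratus", 40), ("Whiskers", 40), ("consumer", 0), ("Supabase", 0)])` returns `("Whiskers", "consumer")`, which would point the investigation at the link between Whiskers and the consumer.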
Files to touch:
- ~/bin/greysec/fibratus_rabbitmq_bridge.py
- ~/bin/greysec/variant_event_consumer.py
- Fibratus config ~/greysec/tools/fibratus/config.yaml
Who: qwen2.5-coder:14b
Time: ~30-60 minutes (diagnosis + fix)
Verification: Run ransomware_sim_v1.py payload → confirm events in RedEdr report, not just Whiskers endpoint
Acceptance criteria:
- Payload makes real OpenProcess/CreateFile syscalls
- Fibratus events appear in Whiskers /api/alerts/fibratus/since output
- Events are parsed and stored in Supabase
- RedEdr-format report shows the events with correct timestamps
Task 3: Install Whiskers as Windows Service (CRITICAL — do third)
What: Whiskers dies when PAExec parent exits. No persistence across VM restart or process crash.
Fix: Install Whiskers as a Windows service using nssm (Non-Sucking Service Manager) or instsrv.
Files to touch:
- VM-side setup: install nssm, run nssm install Whiskers "C:\path\to\whiskers.exe" "--port 8080"
Who: qwen2.5-coder:14b
Time: ~1 hour
Verification: Reboot VM → wait 5 minutes → confirm Whiskers still reachable at http://172.28.0.10:8080/api/alerts/fibratus/since
Acceptance criteria:
- Whiskers survives VM reboot without manual intervention
- Whiskers survives its own parent process exiting
- Health check curl http://172.28.0.10:8080/health returns 200
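The reboot verification can be automated with a simple polling loop. This is a minimal sketch using only the standard library; `http_probe` and `wait_until_healthy` are hypothetical names, and the defaults (30 attempts, 10 seconds apart) are an assumption chosen to match the 5-minute wait above.

```python
import http.client
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError and HTTPError are OSError subclasses (refused, timeout, 4xx/5xx).
        return False


def wait_until_healthy(
    probe: Callable[[], bool], attempts: int = 30, delay: float = 10.0
) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

After rebooting the VM, `wait_until_healthy(lambda: http_probe("http://172.28.0.10:8080/health"))` returns True as soon as Whiskers answers, or False if it never comes back within the window.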
Task 4: Fix manager.py Timeout Handler (DEGRADED — do fourth)
What: ~/greysec/tools/LitterBox/app/analyzers/manager.py lines 418-419 hardcode init_wait_time = 5 in the "terminated after" error handler, overriding config.yaml.
Fix: Change init_wait_time = 5 to init_wait_time = config.get('wait_time', 15) or similar.
Files to touch:
- ~/greysec/tools/LitterBox/app/analyzers/manager.py (lines ~418-419)
Who: qwen2.5-coder:14b
Time: ~30 minutes
Verification: Set wait_time: 30 in config.yaml → run a 20-second payload → confirm it runs for 20+ seconds, not 5
Acceptance criteria:
- Config value respected, not hardcoded fallback
- C payloads (v2, v3) that need > 5 seconds run to completion
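The config lookup described in the fix could look like the sketch below. It is not the actual manager.py code: the helper name `resolve_wait_time` is hypothetical, and it assumes the parsed config.yaml is a dict with `wait_time` at the top level, which should be checked against the real config layout.

```python
DEFAULT_WAIT_TIME = 15  # fallback in seconds, taken from the proposed fix


def resolve_wait_time(config: dict, default: int = DEFAULT_WAIT_TIME) -> int:
    """Return the configured wait time in seconds, or `default` if missing or invalid."""
    value = config.get("wait_time", default)
    # Reject booleans, non-numeric values, and non-positive numbers.
    if isinstance(value, bool) or not isinstance(value, (int, float)) or value <= 0:
        return default
    return int(value)
```

With `wait_time: 30` in the config this yields 30; with the key absent or malformed it yields 15 rather than silently reverting to 5.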
Product Build Tasks
Task 5: Detection Score Algorithm
What: The primary client deliverable. A score from 0-100 that rates how likely this payload is to be detected by EDR.
Approach: Combine:
- Event count: how many syscalls per minute
- Event severity: which syscalls (OpenProcess = medium, VirtualAllocEx + WriteProcessMemory = high)
- MITRE technique count: how many distinct ATT&CK techniques used
- Network indicators: outbound connections = higher score
- Process injection indicators: highest score
Output: JSON field in Supabase + dashboard display
Formula (target): score = min(100, (event_count * 0.1) + (technique_count * 15) + (severity_multiplier * 20) + (network_indicator * 25))
Who: qwen2.5-coder:14b or glm-5.1:cloud for algorithm design
Time: ~2 hours
Verification: Run 3 known-clean files (e.g. calc.exe, notepad.exe) → score < 20. Run ransomware_sim payload → score > 60.
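The target formula translates directly into a small function. This is a sketch of the formula as written, not a tuned algorithm; the function name is hypothetical, and the ranges of `severity_multiplier` and `network_indicator` are not defined in this document, so the example values in the usage note are assumptions.

```python
def detection_score(
    event_count: float,
    technique_count: int,
    severity_multiplier: float,
    network_indicator: float,
) -> float:
    """Compute the target Detection Score, capped at 100 and rounded to 2 decimals."""
    raw = (
        event_count * 0.1
        + technique_count * 15
        + severity_multiplier * 20
        + network_indicator * 25
    )
    return round(min(100.0, raw), 2)
```

For instance, `detection_score(100, 2, 1, 1)` gives 10 + 30 + 20 + 25 = 85.0. Note that with these weights a file with 200 events and no techniques already scores 20.0, so the weights may need tuning to meet the "clean < 20" target.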
Task 6: Web Dashboard
What: Client-facing results dashboard. Currently Supabase only — no UI.
Stack: TBD (recommend: Flask or FastAPI + HTMX for simplicity, or integrate into existing GreySec dashboard)
Pages:
- Upload page: drag-and-drop binary, job ID returned
- Results page: Detection Score, MITRE kill chain visualization, behavioral summary
- History: past analyses for the client's org
Who: qwen2.5-coder:14b (or Adam if design decision needed)
Time: ~4 hours
Dependencies: Tasks 1, 2, and 5 must be complete first
Task 7: Client Upload Portal
What: Authenticated API endpoint for clients to submit binaries. Currently manual curl to localhost.
Features:
- API key auth per client org
- File type validation (.exe, .dll, .bin, .ps1, .py)
- Max file size: 50MB
- Sandbox: each org gets isolated analysis environment (future scope — V1 is shared infra)
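The validation rules above can be sketched as two small checks. This is illustrative only, not the payload_receiver.py implementation: all names are hypothetical, "50MB" is assumed to mean 50 MiB, and API keys are compared in constant time with `hmac.compare_digest` to avoid timing leaks.

```python
import hmac
from pathlib import PurePath
from typing import Tuple

ALLOWED_EXTENSIONS = {".exe", ".dll", ".bin", ".ps1", ".py"}
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024  # assumption: 50MB interpreted as 50 MiB


def validate_upload(filename: str, size_bytes: int) -> Tuple[bool, str]:
    """Check the file extension and size; return (accepted, reason)."""
    extension = PurePath(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        return (False, f"file type {extension or '(none)'} not allowed")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        return (False, "file exceeds 50MB limit")
    return (True, "ok")


def api_key_valid(presented: str, expected: str) -> bool:
    """Compare a presented API key against the org's stored key in constant time."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

An extension check only filters obvious mistakes; it says nothing about the actual file contents, which is acceptable here since every accepted file is detonated in the sandbox anyway.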
Files to touch:
- ~/greysec/tools/LitterBox/app/analyzers/payload_receiver.py (new endpoints)
- ~/greysec/tools/LitterBox/Config/config.yaml (API key config)
Who: qwen2.5-coder:14b
Time: ~2 hours
Dependencies: Task 1 (share mount fix) must be complete
Task 8: MITRE ATT&CK Kill Chain Mapper
What: Map Fibratus syscall events to MITRE ATT&CK tactic and technique IDs automatically.
Approach: Build a mapping table:
- NtOpenProcess → T1059.001 (PowerShell; replaces deprecated T1086), T1055 (Process Injection)
- NtCreateFile on sensitive paths → T1005 (Data from Local System)
- VirtualAllocEx + WriteProcessMemory → T1055 (Process Injection)
- CreateRemoteThread → T1055 (Process Injection)
- RegSetValue → T1112 (Modify Registry)
- URLDownloadToFile → T1105 (Ingress Tool Transfer)
Output: Kill chain visualization (text or SVG) showing sequence of ATT&CK techniques used
Files to touch: ~/bin/greysec/variant_event_consumer.py (add mapping logic)
Who: qwen2.5-coder:14b
Time: ~2 hours (building the mapping table is the work)
Dependencies: Task 2 (RedEdr events must flow)
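A minimal sketch of the mapping logic, using a subset of the table above. It is a simplification under stated assumptions: event names are taken from the table and may not match what Fibratus actually emits, VirtualAllocEx and WriteProcessMemory are mapped individually rather than as a pair, and the path-dependent NtCreateFile rule is left out because it needs file path data. `TECHNIQUE_MAP` and `kill_chain` are hypothetical names.

```python
from typing import Dict, Iterable, List, Tuple

Technique = Tuple[str, str]  # (ATT&CK technique ID, technique name)

# Subset of the mapping table; event names are assumed, not confirmed Fibratus output.
TECHNIQUE_MAP: Dict[str, List[Technique]] = {
    "NtOpenProcess": [("T1055", "Process Injection")],
    "VirtualAllocEx": [("T1055", "Process Injection")],
    "WriteProcessMemory": [("T1055", "Process Injection")],
    "CreateRemoteThread": [("T1055", "Process Injection")],
    "RegSetValue": [("T1112", "Modify Registry")],
    "URLDownloadToFile": [("T1105", "Ingress Tool Transfer")],
}


def kill_chain(event_names: Iterable[str]) -> List[Technique]:
    """Return distinct techniques in order of first appearance in the event stream."""
    seen = set()
    chain: List[Technique] = []
    for name in event_names:
        # Unknown event names map to no technique and are skipped.
        for technique in TECHNIQUE_MAP.get(name, []):
            if technique[0] not in seen:
                seen.add(technique[0])
                chain.append(technique)
    return chain
```

The length of the returned list doubles as the `technique_count` input for the Detection Score, and its order gives the sequence for a text kill chain rendering.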
Definition of Done
GreySec MAL is operational when:
- All 4 blocking bugs (Tasks 1-4) are fixed and verified
- A known-malicious payload (ransomware_sim_v1.py) produces a Detection Score > 60
- MITRE ATT&CK kill chain shows at least 3 techniques for that payload
- A known-clean payload (notepad.exe) produces a Detection Score < 20
- Analysis turnaround is < 5 minutes for a 1MB binary
- Client upload portal accepts a binary via API and returns a job ID
- Results are accessible via web dashboard within 5 minutes of upload
- Skill file greysec-malware-pipeline exists and documents the full operational procedure
- Time tracking is hooked into the pipeline (AI minutes logged to TIME-LOG)
- gbrain logging is hooked into the pipeline (findings logged post-analysis)
DEBT (Action Items from This Kanban)
| Action Item | Priority | Status | Notes |
|---|---|---|---|
| Fix VM share mount (Task 1) | CRITICAL | open | Do first — blocks all testing |
| Fix RedEdr zero events (Task 2) | CRITICAL | open | Do second — blocks reporting |
| Install Whiskers as Windows service (Task 3) | CRITICAL | open | Do third — blocks reliability |
| Fix manager.py timeout (Task 4) | DEGRADED | open | Do fourth |
| Build Detection Score algorithm (Task 5) | HIGH | open | Primary deliverable metric |
| Build web dashboard (Task 6) | HIGH | open | Client-facing UI |
| Build client upload portal (Task 7) | HIGH | open | API for clients |
| Build MITRE ATT&CK mapper (Task 8) | HIGH | open | Kill chain output |
| Write greysec-malware-pipeline skill | MEDIUM | open | Docs |
| Add TIME-LOG hook | MEDIUM | open | Cost tracking |
| Add gbrain logging hook | MEDIUM | open | Knowledge capture |