GreySec MAL — Technical Runbook
Product: GreySec Malware Analysis Lab
Version: 1.0
Updated: 2026-05-07
Parent: ~/greysec/tools/malware-analysis-pipeline/kanban.md
Architecture
                          Docker Host (Linux)
                              172.28.0.1
                                   │
┌───────────────────[Docker Bridge: 172.28.0.0/24]────────────────────┐
│                                                                     │
LitterBox API                                                   Windows 11 VM
:1337                                                            172.28.0.10
(upload portal)                                                       │
(orchestration)                                               Fibratus (kernel)
(result storage)                                                      │
                                                              Whiskers (:8080)
RabbitMQ                                                              │
:5672                                                                 │
│                                                             RedEdr reporting
variant_event_consumer                                                │
│                                                                     │
Supabase                                                              │
(results DB)                                                          │
│                                                               [SHARE MOUNT]
Web Dashboard                                                   C:\analysis\
Key address: LitterBox API = http://172.28.0.1:1337
Key address: Whiskers (inside VM) = http://172.28.0.10:8080
Component Inventory
| Component | Location | Port | Purpose | Credentials |
|---|---|---|---|---|
| LitterBox API | ~/greysec/tools/LitterBox/ | 1337 | Upload portal + orchestration | None (local) |
| RabbitMQ | Docker container | 5672 | Event queue | guest/guest (local) |
| variant_event_consumer | ~/bin/greysec/variant_event_consumer.py | — | Parse events → Supabase | Via env |
| fibratus_rabbitmq_bridge | ~/bin/greysec/fibratus_rabbitmq_bridge.py | — | Bridge Fibratus to RabbitMQ | Via env |
| Whiskers | Inside Windows VM | 8080 | EDR REST API | None |
| Fibratus | Inside Windows VM | — | Kernel event capture | — |
| RedEdr | Inside Windows VM | — | EDR reporting (RedEdr.exe) | — |
| Supabase | Cloud (or local) | 3000 | Results database | greysec-dev-key-2026 |
| pre-flight-vm-check.sh | ~/bin/greysec/pre-flight-vm-check.sh | — | VM health check script | — |
Prerequisites
Before running the pipeline:
- Docker daemon running on Linux host
- Windows 11 VM running at 172.28.0.10
- Kali container reachable from host
- Supabase accessible at localhost:3000 (or cloud)
- MacBook Ollama reachable at 100.127.137.64 (for AI augmentations)
Pre-Flight Checklist
Run before every session:
# 1. Check Docker containers
docker ps --format "table {{.Names}}\t{{.Status}}" | grep -E "litterbox|rabbitmq|fibratus"
# 2. Check VM is running
ping -c 2 172.28.0.10
# 3. Check Whiskers is up
curl -s http://172.28.0.10:8080/health
# Expected: {"status":"ok"} or similar
# 4. Check RabbitMQ is up
curl -s -u guest:guest http://localhost:15672/api/overview | jq '.queue_totals'
# Expected: an object with "messages", "messages_ready", and "messages_unacknowledged" counts
# 5. Check Supabase reachable
curl -s http://localhost:3000/health | jq '.status'
# Expected: "ready"
# 6. Check share mount from VM side (AFTER FIX — currently broken)
# From inside VM:
# curl -F "file=@test.exe" http://172.28.0.1:1337/upload
If any check fails, resolve before uploading payloads.
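The HTTP health checks above can also be scripted. The following is a minimal sketch, not part of the existing pre-flight-vm-check.sh: the names `PREFLIGHT_URLS`, `http_status`, and `run_preflight` are hypothetical, and it assumes an endpoint is healthy only when it answers with HTTP 200. It covers only the two unauthenticated health endpoints from the checklist.

```python
from typing import Callable, Dict, List, Tuple
from urllib.error import HTTPError
from urllib.request import urlopen

# Health endpoints taken from the checklist above; the dictionary name is illustrative.
PREFLIGHT_URLS: Dict[str, str] = {
    "whiskers": "http://172.28.0.10:8080/health",
    "supabase": "http://localhost:3000/health",
}


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code for url, or 0 if it cannot be reached."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def run_preflight(
    urls: Dict[str, str],
    fetch: Callable[[str], int] = http_status,
) -> Tuple[bool, List[str]]:
    """Check every URL and return (all_ok, names_of_failed_checks)."""
    failed = [name for name, url in urls.items() if fetch(url) != 200]
    return (not failed, failed)
```

Calling `run_preflight(PREFLIGHT_URLS)` returns `(True, [])` when both endpoints respond with 200; otherwise the second element names the checks to resolve first. The `fetch` parameter exists so the logic can be exercised without touching the network.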
Startup Sequence
Run in order. Wait for each to be healthy before moving to the next.
# 1. Start Docker stack
cd ~/greysec/tools/LitterBox
docker-compose up -d
# Wait 30 seconds
# 2. Verify containers are up
docker ps | grep -E "litterbox|rabbitmq"
# 3. Start variant_event_consumer
cd ~/greysec/tools/LitterBox
python3 ~/bin/greysec/variant_event_consumer.py &
# Or use supervisor/systemd if running as service
# 4. Verify VM is running
ping -c 1 172.28.0.10
# 5. Start Whiskers (manual until Task 3 is done; repeat after every VM reboot)
# From inside VM or via PAExec:
# PAExec \\172.28.0.10 -u administrator -p [password] "C:\path\to\whiskers.exe"
# 6. Verify Whiskers is responding
curl -s http://172.28.0.10:8080/health
# 7. Verify Fibratus is running inside VM
# On VM: sc query fibratus
# Should show RUNNING
# 8. Verify RabbitMQ connection from consumer
curl -s -u guest:guest http://localhost:15672/api/overview | jq '.message_stats'
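The instruction to wait for each component to be healthy can be expressed as polling rather than a fixed sleep. The sketch below is one possible helper under stated assumptions: `wait_until_healthy` is a hypothetical name, the default timeout and interval are illustrative, and the caller supplies the actual health check (for example, a function that probes the Whiskers /health endpoint).

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll check() until it returns True or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        # Not healthy yet and still within the deadline: wait and retry.
        sleep(interval_s)
```

A startup script could call this once per step and abort at the first component that returns False, which keeps the "run in order" rule explicit. The `sleep` and `clock` parameters are injectable only so the helper can be tested without real delays.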
Shutdown Sequence
# 1. Stop uploading new payloads (drain queue)
# Check RabbitMQ for pending messages
curl -s -u guest:guest http://localhost:15672/api/queues | jq '.[] | select(.messages > 0)'
# 2. Stop variant_event_consumer
pkill -f variant_event_consumer
# 3. Stop Whiskers (if Task 3 not done — manual)
# On VM: taskkill /IM whiskers.exe /F
# 4. Stop Docker containers
cd ~/greysec/tools/LitterBox
docker-compose down
# 5. Shutdown VM
# virsh shutdown greysec-win11
# or from inside VM: shutdown /s /t 0
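The drain check in step 1 can be reproduced in Python when jq is not available. This sketch assumes the input is the JSON array returned by the RabbitMQ management endpoint /api/queues, where each entry carries `name` and `messages` fields; `pending_queues` is a hypothetical helper name.

```python
import json
from typing import Dict, List


def pending_queues(queues_json: str) -> Dict[str, int]:
    """Map queue name -> message count for every queue that still holds messages."""
    queues: List[dict] = json.loads(queues_json)
    # Queues that have not reported a count yet are treated as empty.
    return {q["name"]: q["messages"] for q in queues if q.get("messages", 0) > 0}
```

An empty result means the queues are drained and it should be safe to continue with step 2.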
Payload Upload Procedure
Via CLI (current method)
# Upload a payload
curl -X POST http://172.28.0.1:1337/upload \
-F "file=@ransomware_sim_v1.py" \
-F "timeout=30" \
-F "metadata={\"name\":\"ransomware_sim_v1\",\"category\":\"test\",\"submitted_by\":\"operator\"}"
# Check job status
curl http://172.28.0.1:1337/jobs/[JOB_ID]
# Get results
curl http://172.28.0.1:1337/results/[JOB_ID]
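The hand-escaped JSON in the metadata field is easy to get wrong. As a hedged sketch, the upload command can be assembled in Python so that `json.dumps` handles the quoting. `build_upload_command` is a hypothetical helper; the form field names and the URL are copied from the curl example above, and nothing here is confirmed against the LitterBox API itself.

```python
import json
from typing import List


def build_upload_command(
    payload_path: str,
    name: str,
    category: str,
    submitted_by: str,
    timeout_s: int = 30,
    api_url: str = "http://172.28.0.1:1337/upload",
) -> List[str]:
    """Build the curl argument list for the upload shown above."""
    metadata = json.dumps(
        {"name": name, "category": category, "submitted_by": submitted_by}
    )
    # Returned as a list so it can be passed to subprocess.run without a shell,
    # which avoids a second layer of quoting.
    return [
        "curl", "-X", "POST", api_url,
        "-F", f"file=@{payload_path}",
        "-F", f"timeout={timeout_s}",
        "-F", f"metadata={metadata}",
    ]
```

The result can be run with `subprocess.run(cmd, check=True)`; because no shell is involved, the JSON reaches curl unmodified.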
Via Client Portal (future — Task 7)
# Authenticated upload (future)
curl -X POST https://[client-portal-host]/upload \
-H "Authorization: Bearer [API_KEY]" \
-F "file=@payload.exe" \
-F "timeout=60"
# Returns job_id for polling
Reading the Results
Detection Score (0-100)
The primary deliverable metric.
| Score | Interpretation | Action |
|---|---|---|
| 0-20 | Clean — no suspicious syscalls | Deployable in most environments |
| 21-40 | Low — minor suspicious activity | Review behavioral summary before deployment |
| 41-60 | Medium — multiple suspicious syscalls | Modify payload or test in isolated environment |
| 61-80 | High — significant EDR coverage | Likely to be blocked by most EDR products |
| 81-100 | Critical — extensive offensive tooling | Not recommended for production use |
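For scripted reporting, the table above can be encoded as a lookup. This is a minimal sketch: `SCORE_BANDS` and `interpret_score` are hypothetical names, and the band boundaries and action text are copied directly from the table.

```python
from typing import List, Tuple

# (upper bound, label, recommended action), copied from the table above.
SCORE_BANDS: List[Tuple[int, str, str]] = [
    (20, "Clean", "Deployable in most environments"),
    (40, "Low", "Review behavioral summary before deployment"),
    (60, "Medium", "Modify payload or test in isolated environment"),
    (80, "High", "Likely to be blocked by most EDR products"),
    (100, "Critical", "Not recommended for production use"),
]


def interpret_score(score: int) -> Tuple[str, str]:
    """Return (label, action) for a detection score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be within 0-100, got {score}")
    for upper, label, action in SCORE_BANDS:
        if score <= upper:
            return label, action
    raise AssertionError("unreachable: bands cover 0-100")
```

Keeping the bands in one list means a change to the thresholds only has to be made in one place.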
MITRE ATT&CK Kill Chain
The sequence of ATT&CK tactics and techniques the payload used.
Format:
[1] T1086 — PowerShell: one-liner downloader
[2] T1055 — Process Injection: VirtualAllocEx + WriteProcessMemory
[3] T1055 — Process Injection: CreateRemoteThread
[4] T1105 — Ingress Tool Transfer: URLDownloadToFile
What to look for:
- Technique count > 3: sophisticated payload
- T1055 (Process Injection): likely evasion attempt
- T1105 (Ingress Tool Transfer): network indicators
- T1486 (Data Encrypted for Impact): ransomware behavior
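The checklist above can be applied mechanically to a kill chain in the format shown. The sketch below is illustrative only: the parser assumes each line follows the `[n] Txxxx` layout with the same dash separator as the example, and it interprets "technique count" as the number of distinct technique IDs, which is an assumption. `parse_kill_chain` and `triage` are hypothetical names.

```python
import re
from typing import List, NamedTuple

# Matches lines such as "[2] T1055 <dash> Process Injection: VirtualAllocEx".
# \u2014 is the dash character used as separator in the report format above.
KILL_CHAIN_LINE = re.compile(
    r"^\[(?P<step>\d+)\]\s+(?P<technique>T\d{4}(?:\.\d{3})?)\s+\u2014\s+(?P<detail>.+)$"
)

# Techniques called out in the "What to look for" list above.
FLAGGED = {
    "T1055": "likely evasion attempt",
    "T1105": "network indicators",
    "T1486": "ransomware behavior",
}


class Step(NamedTuple):
    step: int
    technique: str
    detail: str


def parse_kill_chain(report: str) -> List[Step]:
    """Parse kill-chain lines, skipping anything that does not match the format."""
    steps = []
    for line in report.splitlines():
        match = KILL_CHAIN_LINE.match(line.strip())
        if match:
            steps.append(Step(int(match["step"]), match["technique"], match["detail"]))
    return steps


def triage(steps: List[Step]) -> List[str]:
    """Return the checklist notes that apply to the parsed steps."""
    # Sub-technique suffixes (e.g. ".001") are dropped before comparison.
    techniques = {s.technique.split(".")[0] for s in steps}
    notes = [f"{t}: {why}" for t, why in FLAGGED.items() if t in techniques]
    if len(techniques) > 3:
        notes.append("more than 3 techniques: sophisticated payload")
    return notes
```

For the four-line example above, `triage` would flag T1055 and T1105 but not the sophistication rule, since only three distinct techniques appear.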
Behavioral Summary
Text summary of what the payload did:
- File operations (created/modified/deleted)
- Network operations (outbound connections, DNS queries)
- Process operations (spawned children, injected into processes)
- Registry operations (modified keys)
Troubleshooting Guide
Problem: Payload never starts processing
Symptoms: Upload returns 200 OK but no job in queue.
Diagnosis:
- Check that the share mount is reachable from the VM (see Issue 1)
- Run curl -v http://172.28.0.1:1337/jobs and check whether the job appears
- Check the LitterBox logs: docker logs litterbox-api
Fix order: Verify share mount → verify upload endpoint → check LitterBox logs
Problem: Payload killed at exactly 5 seconds
Symptoms: All payloads die at 5 seconds, regardless of timeout setting.
Diagnosis: This is Issue 4. Check manager.py lines 418-419.
grep -n "init_wait_time" ~/greysec/tools/LitterBox/app/analyzers/manager.py
# Should show hardcoded value = 5
Fix: Replace the hardcoded value so that manager.py reads init_wait_time from config.yaml.
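One way to structure that fix is a small resolver that prefers the configured value and falls back to the current default. This is a sketch under assumptions, not the actual manager.py code: the `analysis` section name and the `resolve_init_wait` helper are hypothetical, and the config is assumed to be already loaded into a dictionary.

```python
from typing import Any, Mapping

DEFAULT_INIT_WAIT_S = 5  # the value currently hardcoded, per the grep above


def resolve_init_wait(
    config: Mapping[str, Any],
    default: float = DEFAULT_INIT_WAIT_S,
) -> float:
    """Return the wait time from config, falling back to default if unset or invalid."""
    # "analysis" is a hypothetical section name; adjust to the real config.yaml layout.
    section = config.get("analysis") or {}
    raw = section.get("init_wait_time", default)
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return float(default)
    # Reject zero or negative waits rather than killing the payload immediately.
    return value if value > 0 else float(default)
```

Falling back to the default on bad input keeps a malformed config.yaml from changing behavior silently in the other direction.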
Problem: Whiskers endpoint returns 502 or timeout
Symptoms: curl http://172.28.0.10:8080/api/alerts/fibratus/since fails.
Diagnosis: Whiskers process died (Issue 3 — no keepalive).
Fix (immediate): PAExec back into the VM and restart Whiskers.
Fix (permanent): Task 3, install Whiskers as a Windows service.
Problem: RedEdr report is empty despite real syscalls
Symptoms: Whiskers returns events but RedEdr shows nothing.
Diagnosis: This is Issue 2. Fibratus sees events but they don't reach the final report.
Fix: Trace the event path:
- Inside the VM, run fibratus dump and check whether Fibratus is capturing events.
- Run curl http://172.28.0.10:8080/api/alerts/fibratus/since and check whether Whiskers sees them.
- Check the variant_event_consumer logs to confirm it is receiving from RabbitMQ.
- Check the Supabase malware_analyses table to confirm events are stored.
Find the break point and fix at that layer.
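The trace is a first-failure search over an ordered pipeline. A minimal sketch, assuming each stage can be reduced to a yes/no probe supplied by the operator; `first_break` and the stage names in the usage note are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# A stage is a display name plus a probe that returns True when events are present.
Stage = Tuple[str, Callable[[], bool]]


def first_break(stages: List[Stage]) -> Optional[str]:
    """Run stage probes in pipeline order and return the first failing stage name."""
    for name, has_events in stages:
        if not has_events():
            # Later stages are not probed: they cannot have events if this one lacks them.
            return name
    return None
```

With stages ordered as Fibratus, Whiskers, consumer, Supabase, a return value of "consumer" would mean events reach Whiskers but are not being received from RabbitMQ. A return value of None means every layer sees events.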
Problem: RabbitMQ queue not draining
Symptoms: curl -u guest:guest http://localhost:15672/api/queues shows messages accumulating.
Diagnosis: variant_event_consumer is not running or is crashing on messages.
Fix:
# Check whether the consumer is running
pgrep -af variant_event_consumer
# Restart it in the foreground so errors and tracebacks print to the terminal
python3 ~/bin/greysec/variant_event_consumer.py
Problem: VM unreachable at 172.28.0.10
Symptoms: ping 172.28.0.10 fails.
Diagnosis: VM is down or Docker bridge network changed.
Fix:
# Check VM status
virsh list --all
# Restart VM
virsh start greysec-win11
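# Verify the Docker bridge subnet (expected: "172.28.0.0/24")
# If the stack runs on a compose-defined network rather than the default bridge, find its name with: docker network ls
docker network inspect bridge | jq '.[0].IPAM.Config[0].Subnet'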
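Once the subnet is known, it can be compared against the VM address to confirm whether the bridge network changed. A small sketch using the standard library; `vm_in_bridge_subnet` is a hypothetical helper, and the defaults are the addresses from the Architecture section.

```python
import ipaddress


def vm_in_bridge_subnet(
    vm_ip: str = "172.28.0.10",
    subnet: str = "172.28.0.0/24",
) -> bool:
    """Return True if vm_ip falls inside the subnet reported by Docker."""
    return ipaddress.ip_address(vm_ip) in ipaddress.ip_network(subnet)
```

If Docker reports a different subnet, for example 172.17.0.0/16, the function returns False for the default VM address, which points to a changed bridge network rather than a stopped VM.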
Escalation Path
If you encounter any of these, ping @Adam immediately:
- VM will not start or boots to BSOD
- Docker stack fails to start after host reboot
- Supabase is unreachable and not recoverable within 5 minutes
- MacBook Ollama needs to be re-authenticated (token expired)
- Any of the 4 critical bugs cannot be resolved within 2 hours of focused work
Before escalating:
- Document what you tried
- Note exact error messages
- Note which component is failing (ping the exact hop)
Format for escalation:
[@Adam] [COMPONENT] is broken: [ONE-LINE DESCRIPTION]
What I tried: [SHORT LIST]
Error: [EXACT ERROR]
Last working: [WHEN — if known]
Appendix: Test Payloads
| Name | Path | Purpose | Expected Behavior |
|---|---|---|---|
| ransomware_sim_v1.py | ~/greysec/engagements/litterbox-fibratus-deploy/payloads/ | Detection Score test | 60-80 score, multiple ATT&CK techniques |
| ransomware_sim_v2.c | ~/greysec/engagements/litterbox-fibratus-deploy/payloads/ | Extended run test | Run > 5 seconds, capture output |
| ransomware_sim_v3.c | ~/greysec/engagements/litterbox-fibratus-deploy/payloads/ | Extended run test | Same as v2 |
| calc.exe | Windows system binary | Clean baseline test | Score < 20 |
| notepad.exe | Windows system binary | Clean baseline test | Score < 20 |