# GreySec MAL: Technical Runbook

**Product:** GreySec Malware Analysis Lab
**Version:** 1.0
**Updated:** 2026-05-07
**Parent:** `~/greysec/tools/malware-analysis-pipeline/kanban.md`

---

## Architecture

```
                 Docker Host (Linux)
                     172.28.0.1
                          │
     ┌──────[Docker Bridge: 172.28.0.0/24]──────┐
     │                                          │
LitterBox API                             Windows 11 VM
   :1337                                   172.28.0.10
(upload portal)                                 │
(orchestration)                         Fibratus (kernel)
(result storage)                                │
     │                                  Whiskers (:8080)
  RabbitMQ                                      │
   :5672                                        │
     │                                  RedEdr reporting
variant_event_consumer                          │
     │                                          │
  Supabase                                      │
(results DB)                                    │
     │                                    [SHARE MOUNT]
Web Dashboard                             C:\analysis\
```

**Key address:** LitterBox API = `http://172.28.0.1:1337`
**Key address:** Whiskers (inside VM) = `http://172.28.0.10:8080`

---

## Component Inventory

| Component | Location | Port | Purpose | Credentials |
|-----------|----------|------|---------|-------------|
| LitterBox API | `~/greysec/tools/LitterBox/` | 1337 | Upload portal + orchestration | None (local) |
| RabbitMQ | Docker container | 5672 | Event queue | guest/guest (local) |
| variant_event_consumer | `~/bin/greysec/variant_event_consumer.py` | - | Parse events and write them to Supabase | Via env |
| fibratus_rabbitmq_bridge | `~/bin/greysec/fibratus_rabbitmq_bridge.py` | - | Bridge Fibratus to RabbitMQ | Via env |
| Whiskers | Inside Windows VM | 8080 | EDR REST API | None |
| Fibratus | Inside Windows VM | - | Kernel event capture | - |
| RedEdr | Inside Windows VM | - | EDR reporting (RedEdr.exe) | - |
| Supabase | Cloud (or local) | 3000 | Results database | greysec-dev-key-2026 |
| pre-flight-vm-check.sh | `~/bin/greysec/pre-flight-vm-check.sh` | - | VM health check script | - |

---

## Prerequisites

Before running the pipeline, confirm that:

1. The Docker daemon is running on the Linux host
2. The Windows 11 VM is running at 172.28.0.10
3. The Kali container is reachable from the host
4. Supabase is accessible at localhost:3000 (or in the cloud)
5. MacBook Ollama is reachable at 100.127.137.64 (for AI augmentations)

---

## Pre-Flight Checklist

Run before every session:

```bash
# 1. Check Docker containers
docker ps --format "table {{.Names}}\t{{.Status}}" | grep -E "litterbox|rabbitmq|fibratus"

# 2. Check VM is running
ping -c 2 172.28.0.10

# 3. Check Whiskers is up
curl -s http://172.28.0.10:8080/health
# Expected: {"status":"ok"} or similar

# 4. Check RabbitMQ is up (management API on port 15672)
curl -s -u guest:guest http://localhost:15672/api/overview | jq '.queue_totals.messages'
# Expected: an integer (total messages across all queues)

# 5. Check Supabase reachable
curl -s http://localhost:3000/health | jq '.status'
# Expected: "ready"

# 6. Check share mount from VM side (only after the fix; currently broken)
# From inside VM:
# curl -F "file=@test.exe" http://172.28.0.1:1337/upload
```

If any check fails, resolve it before uploading payloads.

---

## Startup Sequence

Run the steps in order. Wait for each component to be healthy before moving to the next.

```bash
# 1. Start Docker stack
cd ~/greysec/tools/LitterBox
docker-compose up -d
sleep 30  # wait 30 seconds for the containers to initialize

# 2. Verify containers are up
docker ps | grep -E "litterbox|rabbitmq"

# 3. Start variant_event_consumer
cd ~/greysec/tools/LitterBox
python3 ~/bin/greysec/variant_event_consumer.py &
# Or use supervisor/systemd if running as a service

# 4. Verify VM is running
ping -c 1 172.28.0.10

# 5. Start Whiskers (manual PAExec, until Task 3 is done)
# From inside VM or via PAExec:
# PAExec \\172.28.0.10 -u administrator -p [password] "C:\path\to\whiskers.exe"
# Until Task 3 is done, this is manual and must be redone after every VM reboot

# 6. Verify Whiskers is responding
curl -s http://172.28.0.10:8080/health

# 7. Verify Fibratus is running inside VM
# On VM: sc query fibratus
# Should show RUNNING

# 8. Verify RabbitMQ connection from consumer
curl -s -u guest:guest http://localhost:15672/api/overview | jq '.message_stats'
```

---

## Shutdown Sequence

```bash
# 1. Stop uploading new payloads (drain queue)
# Check RabbitMQ for pending messages
curl -s -u guest:guest http://localhost:15672/api/queues | jq '.[] | select(.messages > 0)'

# 2. Stop variant_event_consumer
pkill -f variant_event_consumer

# 3. Stop Whiskers (manual if Task 3 is not done)
# On VM: taskkill /IM whiskers.exe /F

# 4. Stop Docker containers
cd ~/greysec/tools/LitterBox
docker-compose down

# 5. Shutdown VM
# virsh shutdown greysec-win11
# or from inside VM: shutdown /s /t 0
```

---

## Payload Upload Procedure

### Via CLI (current method)

```bash
# Upload a payload
curl -X POST http://172.28.0.1:1337/upload \
  -F "file=@ransomware_sim_v1.py" \
  -F "timeout=30" \
  -F "metadata={\"name\":\"ransomware_sim_v1\",\"category\":\"test\",\"submitted_by\":\"operator\"}"

# Check job status
curl http://172.28.0.1:1337/jobs/[JOB_ID]

# Get results
curl http://172.28.0.1:1337/results/[JOB_ID]
```

### Via Client Portal (future: Task 7)

```bash
# Authenticated upload (future)
curl -X POST https://[client-portal-host]/upload \
  -H "Authorization: Bearer [API_KEY]" \
  -F "file=@payload.exe" \
  -F "timeout=60"
# Returns job_id for polling
```

---

## Reading the Results

### Detection Score (0-100)

The primary deliverable metric.

| Score | Interpretation | Action |
|-------|---------------|--------|
| 0-20 | Clean: no suspicious syscalls | Deployable in most environments |
| 21-40 | Low: minor suspicious activity | Review behavioral summary before deployment |
| 41-60 | Medium: multiple suspicious syscalls | Modify payload or test in isolated environment |
| 61-80 | High: significant EDR coverage | Likely to be blocked by most EDR products |
| 81-100 | Critical: extensive offensive tooling | Not recommended for production use |

### MITRE ATT&CK Kill Chain

The sequence of ATT&CK tactics and techniques the payload used.

**Format:**

```
[1] T1086 — PowerShell: one-liner downloader
[2] T1055 — Process Injection: VirtualAllocEx + WriteProcessMemory
[3] T1055 — Process Injection: CreateRemoteThread
[4] T1105 — Ingress Tool Transfer: URLDownloadToFile
```

**What to look for:**

- More than 3 techniques: likely a sophisticated payload
- T1055 (Process Injection): likely evasion attempt
- T1105 (Ingress Tool Transfer): network indicators
- T1486 (Data Encrypted for Impact): ransomware behavior

### Behavioral Summary

Text summary of what the payload did:

- File operations (created/modified/deleted)
- Network operations (outbound connections, DNS queries)
- Process operations (spawned children, injected into processes)
- Registry operations (modified keys)

---

## Troubleshooting Guide

### Problem: Payload never starts processing

**Symptoms:** Upload returns 200 OK but no job appears in the queue.

**Diagnosis:**

1. Check that the share mount is reachable from the VM (see Issue 1)
2. Run `curl -v http://172.28.0.1:1337/jobs` and check whether the job appears
3. Check the LitterBox logs: `docker logs litterbox-api`

**Fix order:** Verify share mount → verify upload endpoint → check LitterBox logs

---

### Problem: Payload killed at exactly 5 seconds

**Symptoms:** All payloads die at 5 seconds, regardless of the timeout setting.

**Diagnosis:** This is Issue 4. Check `manager.py` lines 418-419.

```bash
grep -n "init_wait_time" ~/greysec/tools/LitterBox/app/analyzers/manager.py
# Should show hardcoded value = 5
```

**Fix:** Change the code to read the value from config.yaml instead of the hardcoded 5.

---

### Problem: Whiskers endpoint returns 502 or timeout

**Symptoms:** `curl http://172.28.0.10:8080/api/alerts/fibratus/since` fails.

**Diagnosis:** The Whiskers process died (Issue 3: no keepalive).

**Fix (immediate):** PAExec back into the VM and restart Whiskers.

**Fix (permanent):** Task 3, install Whiskers as a Windows service.

---

### Problem: RedEdr report is empty despite real syscalls

**Symptoms:** Whiskers returns events but RedEdr shows nothing.

**Diagnosis:** This is Issue 2. Fibratus sees events but they don't reach the final report.

**Fix:** Trace the event path:

1. Inside the VM, run `fibratus dump`. Are events being captured by Fibratus?
2. Run `curl http://172.28.0.10:8080/api/alerts/fibratus/since`. Does Whiskers see them?
3. Check the `variant_event_consumer` logs. Is it receiving from RabbitMQ?
4. Check the Supabase `malware_analyses` table. Are events stored?

Find the first layer where events stop appearing and fix the problem there.

---

### Problem: RabbitMQ queue not draining

**Symptoms:** `curl -u guest:guest http://localhost:15672/api/queues` shows messages accumulating.

**Diagnosis:** `variant_event_consumer` is not running or is crashing on messages.

**Fix:**

```bash
# Restart consumer in the foreground
# (python3 -v traces module imports, which exposes import-time failures)
python3 -v ~/bin/greysec/variant_event_consumer.py

# Check consumer is running
ps aux | grep variant_event_consumer
```

---

### Problem: VM unreachable at 172.28.0.10

**Symptoms:** `ping 172.28.0.10` fails.

**Diagnosis:** The VM is down or the Docker bridge network changed.

**Fix:**

```bash
# Check VM status
virsh list --all

# Restart VM
virsh start greysec-win11

# Verify Docker bridge
docker network inspect bridge | jq '.[0].IPAM.Config[0].Subnet'
```

---

## Escalation Path

**If you encounter any of these, ping @Adam immediately:**

1. VM will not start or boots to BSOD
2. Docker stack fails to start after host reboot
3. Supabase is unreachable and not recoverable within 5 minutes
4. MacBook Ollama needs to be re-authenticated (token expired)
5. Any of the 4 critical bugs cannot be resolved within 2 hours of focused work

**Before escalating:**

- Document what you tried
- Note exact error messages
- Note which component is failing (ping the exact hop)

**Format for escalation:**

```
[@Adam] [COMPONENT] is broken: [ONE-LINE DESCRIPTION]
What I tried: [SHORT LIST]
Error: [EXACT ERROR]
Last working: [WHEN, if known]
```

---

## Appendix: Test Payloads

| Name | Path | Purpose | Expected Behavior |
|------|------|---------|-------------------|
| ransomware_sim_v1.py | `~/greysec/engagements/litterbox-fibratus-deploy/payloads/` | Detection Score test | 60-80 score, multiple ATT&CK techniques |
| ransomware_sim_v2.c | `~/greysec/engagements/litterbox-fibratus-deploy/payloads/` | Extended run test | Run > 5 seconds, capture output |
| ransomware_sim_v3.c | `~/greysec/engagements/litterbox-fibratus-deploy/payloads/` | Extended run test | Same as v2 |
| calc.exe | Windows system binary | Clean baseline test | Score < 20 |
| notepad.exe | Windows system binary | Clean baseline test | Score < 20 |
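
The expected scores in the test payload table can be checked against the Detection Score bands with a small helper. The following is a minimal sketch, not part of LitterBox: `SCORE_BANDS`, `EXPECTED_SCORES`, `interpret_score`, and `matches_expectation` are hypothetical names introduced for illustration. It assumes scores are integers and reads "Score < 20" as the range 0-19. The v2 and v3 payloads are left out because the table lists no expected score for them.

```python
# Bands copied from the "Detection Score (0-100)" table in this runbook.
SCORE_BANDS = [
    (0, 20, "Clean"),
    (21, 40, "Low"),
    (41, 60, "Medium"),
    (61, 80, "High"),
    (81, 100, "Critical"),
]

# Expected inclusive ranges from the test payload table.
# Assumption: "Score < 20" is interpreted as 0-19.
EXPECTED_SCORES = {
    "ransomware_sim_v1.py": (60, 80),
    "calc.exe": (0, 19),
    "notepad.exe": (0, 19),
}


def interpret_score(score: int) -> str:
    """Return the band label for an integer Detection Score in 0-100."""
    if not isinstance(score, int) or not 0 <= score <= 100:
        raise ValueError(f"score must be an integer in 0-100, got {score!r}")
    for low, high, label in SCORE_BANDS:
        if low <= score <= high:
            return label
    # Unreachable: the bands cover every integer from 0 to 100.
    raise ValueError(f"no band covers score {score}")


def matches_expectation(payload_name: str, score: int) -> bool:
    """Return True if the score falls in the payload's expected range."""
    low, high = EXPECTED_SCORES[payload_name]
    return low <= score <= high


if __name__ == "__main__":
    for name, score in [("calc.exe", 12), ("ransomware_sim_v1.py", 72)]:
        status = "as expected" if matches_expectation(name, score) else "UNEXPECTED"
        print(f"{name}: {score} ({interpret_score(score)}) {status}")
```

For example, `interpret_score(72)` returns `"High"`. The expected 60-80 range for `ransomware_sim_v1.py` straddles two bands, so a score of exactly 60 meets the expectation while still being labeled `"Medium"`.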