# GreySec MAL — Technical Runbook
**Product:** GreySec Malware Analysis Lab
**Version:** 1.0
**Updated:** 2026-05-07
**Parent:** `~/greysec/tools/malware-analysis-pipeline/kanban.md`

---

## Architecture

```
                          Docker Host (Linux)
                              172.28.0.1
                                   │
    ┌───────────────────[Docker Bridge: 172.28.0.0/24]────────────────────┐
    │                                                                     │
LitterBox API                                                      Windows 11 VM
    :1337                                                           172.28.0.10
(upload portal)                                                           │
(orchestration)                                                   Fibratus (kernel)
(result storage)                                                          │
                                                                  Whiskers (:8080)
  RabbitMQ                                                                │
    :5672                                                                 │
    │                                                             RedEdr reporting
variant_event_consumer                                                    │
    │                                                                     │
  Supabase                                                                │
(results DB)                                                              │
    │                                                               [SHARE MOUNT]
Web Dashboard                                                       C:\analysis\
```

**Key address:** LitterBox API = `http://172.28.0.1:1337`
**Key address:** Whiskers (inside VM) = `http://172.28.0.10:8080`

---

## Component Inventory

| Component | Location | Port | Purpose | Credentials |
|-----------|----------|------|---------|-------------|
| LitterBox API | `~/greysec/tools/LitterBox/` | 1337 | Upload portal + orchestration | None (local) |
| RabbitMQ | Docker container | 5672 (AMQP), 15672 (management API) | Event queue | guest/guest (local) |
| variant_event_consumer | `~/bin/greysec/variant_event_consumer.py` | — | Parse events → Supabase | Via env |
| fibratus_rabbitmq_bridge | `~/bin/greysec/fibratus_rabbitmq_bridge.py` | — | Bridge Fibratus to RabbitMQ | Via env |
| Whiskers | Inside Windows VM | 8080 | EDR REST API | None |
| Fibratus | Inside Windows VM | — | Kernel event capture | — |
| RedEdr | Inside Windows VM | — | EDR reporting (RedEdr.exe) | — |
| Supabase | Cloud (or local) | 3000 | Results database | greysec-dev-key-2026 |
| pre-flight-vm-check.sh | `~/bin/greysec/pre-flight-vm-check.sh` | — | VM health check script | — |

---

## Prerequisites

Before running the pipeline:

1. Docker daemon running on Linux host
2. Windows 11 VM running at 172.28.0.10
3. Kali container reachable from host
4. Supabase accessible at localhost:3000 (or cloud)
5. MacBook Ollama reachable at 100.127.137.64 (for AI augmentations)

---

## Pre-Flight Checklist

Run before every session:

```bash
# 1. Check Docker containers
docker ps --format "table {{.Names}}\t{{.Status}}" | grep -E "litterbox|rabbitmq|fibratus"

# 2. Check VM is running
ping -c 2 172.28.0.10

# 3. Check Whiskers is up
curl -s http://172.28.0.10:8080/health
# Expected: {"status":"ok"} or similar

# 4. Check RabbitMQ is up (management API on port 15672)
curl -s -u guest:guest http://localhost:15672/api/overview | jq '.queue_totals'
# Expected: an object of message counts, e.g. {"messages": N, "messages_ready": N, "messages_unacknowledged": N}

# 5. Check Supabase reachable
curl -s http://localhost:3000/health | jq '.status'
# Expected: "ready"

# 6. Check share mount from VM side (AFTER FIX — currently broken)
# From inside VM:
# curl -F "file=@test.exe" http://172.28.0.1:1337/upload
```

If any check fails, resolve it before uploading payloads.
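
The checks above can be wrapped in a small helper so that a session only proceeds when every check exits cleanly. This is a minimal sketch, not part of the pipeline: `run_check` and `FAILED_CHECKS` are hypothetical names, and a check counts as passed when its command exits with status 0.

```bash
# Hypothetical helper: run one named check and record whether it passed.
FAILED_CHECKS=0

run_check() {
  # $1 = human-readable label, remaining arguments = command to run
  local label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
    FAILED_CHECKS=$((FAILED_CHECKS + 1))
    return 1
  fi
}
```

For example, `run_check "VM reachable" ping -c 2 172.28.0.10` followed by `run_check "Whiskers up" curl -sf http://172.28.0.10:8080/health`; afterwards, `[ "$FAILED_CHECKS" -eq 0 ]` tells you whether it is safe to continue.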

---

## Startup Sequence

Run in order. Wait for each to be healthy before moving to the next.

```bash
# 1. Start Docker stack
cd ~/greysec/tools/LitterBox
docker-compose up -d
sleep 30  # wait for the containers to come up

# 2. Verify containers are up
docker ps | grep -E "litterbox|rabbitmq"

# 3. Start variant_event_consumer
cd ~/greysec/tools/LitterBox
python3 ~/bin/greysec/variant_event_consumer.py &
# Or use supervisor/systemd if running as service

# 4. Verify VM is running
ping -c 1 172.28.0.10

# 5. Start Whiskers (manual PAExec — until Task 3 is done)
# From inside VM or via PAExec:
# PAExec \\172.28.0.10 -u administrator -p [password] "C:\path\to\whiskers.exe"
# Until Task 3 is done, this is manual and needs to be redone after VM reboot

# 6. Verify Whiskers is responding
curl -s http://172.28.0.10:8080/health

# 7. Verify Fibratus is running inside VM
# On VM: sc query fibratus
# Should show RUNNING

# 8. Verify RabbitMQ connection from consumer
curl -s -u guest:guest http://localhost:15672/api/overview | jq '.message_stats'
```
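
Because each step should be healthy before the next one starts, a retry loop is usually more reliable than a fixed pause. The sketch below is one way to do that; `wait_for` is a hypothetical helper, and the attempt count and delay are arbitrary values to tune for your host.

```bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
wait_for() {
  # $1 = max attempts (at least one is made), $2 = seconds between attempts,
  # remaining arguments = command to run
  local attempts=$1 delay=$2 i=1
  shift 2
  while :; do
    "$@" >/dev/null 2>&1 && return 0       # healthy: stop waiting
    [ "$i" -ge "$attempts" ] && return 1   # out of attempts: give up
    i=$((i + 1))
    sleep "$delay"
  done
}
```

For example, `wait_for 15 2 curl -sf http://172.28.0.10:8080/health` polls Whiskers for up to roughly 30 seconds and returns non-zero if it never answers.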

---

## Shutdown Sequence

```bash
# 1. Stop uploading new payloads (drain queue)
# Check RabbitMQ for pending messages
curl -s -u guest:guest http://localhost:15672/api/queues | jq '.[] | select(.messages > 0)'

# 2. Stop variant_event_consumer
pkill -f variant_event_consumer

# 3. Stop Whiskers (if Task 3 not done — manual)
# On VM: taskkill /IM whiskers.exe /F

# 4. Stop Docker containers
cd ~/greysec/tools/LitterBox
docker-compose down

# 5. Shutdown VM
# virsh shutdown greysec-win11
# or from inside VM: shutdown /s /t 0
```

---

## Payload Upload Procedure

### Via CLI (current method)

```bash
# Upload a payload
curl -X POST http://172.28.0.1:1337/upload \
  -F "file=@ransomware_sim_v1.py" \
  -F "timeout=30" \
  -F "metadata={\"name\":\"ransomware_sim_v1\",\"category\":\"test\",\"submitted_by\":\"operator\"}"

# Check job status
curl http://172.28.0.1:1337/jobs/[JOB_ID]

# Get results
curl http://172.28.0.1:1337/results/[JOB_ID]
```
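
To avoid copying the job ID by hand, the upload response can be piped through a small extractor. This sketch assumes the upload endpoint answers with a JSON object containing a string field named `job_id`; the runbook does not confirm the response format of the current API, so check a real response first. `extract_job_id` is a hypothetical helper.

```bash
# Hypothetical helper: read a JSON response on stdin and print the value of
# its "job_id" field. Assumes the field exists and holds a string; prints
# nothing when the field is absent.
extract_job_id() {
  sed -n 's/.*"job_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

For example, `JOB_ID=$(curl -s -X POST http://172.28.0.1:1337/upload -F "file=@ransomware_sim_v1.py" | extract_job_id)`, then `curl "http://172.28.0.1:1337/jobs/$JOB_ID"`. Where `jq` is installed, `jq -r '.job_id'` does the same job more robustly.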

### Via Client Portal (future — Task 7)

```bash
# Authenticated upload (future)
curl -X POST https://[client-portal-host]/upload \
  -H "Authorization: Bearer [API_KEY]" \
  -F "file=@payload.exe" \
  -F "timeout=60"
# Returns job_id for polling
```

---

## Reading the Results

### Detection Score (0-100)

The primary deliverable metric.

| Score | Interpretation | Action |
|-------|---------------|--------|
| 0-20 | Clean — no suspicious syscalls | Deployable in most environments |
| 21-40 | Low — minor suspicious activity | Review behavioral summary before deployment |
| 41-60 | Medium — multiple suspicious syscalls | Modify payload or test in isolated environment |
| 61-80 | High — significant EDR coverage | Likely to be blocked by most EDR products |
| 81-100 | Critical — extensive offensive tooling | Not recommended for production use |
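
When triaging many results, the table can be applied mechanically. The following sketch maps a score to its band name; `score_band` is a hypothetical helper, and the thresholds are taken directly from the table above.

```bash
# Hypothetical helper: map a detection score (0-100) to its band name.
score_band() {
  local score=$1
  # Reject anything that is not a plain non-negative integer.
  case $score in
    '' | *[!0-9]*) echo "invalid"; return 1 ;;
  esac
  if [ "$score" -gt 100 ]; then
    echo "invalid"
    return 1
  elif [ "$score" -le 20 ]; then
    echo "Clean"
  elif [ "$score" -le 40 ]; then
    echo "Low"
  elif [ "$score" -le 60 ]; then
    echo "Medium"
  elif [ "$score" -le 80 ]; then
    echo "High"
  else
    echo "Critical"
  fi
}
```

For example, `score_band 72` prints `High`.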

### MITRE ATT&CK Kill Chain

The sequence of ATT&CK tactics and techniques the payload used.

**Format:**
```
[1] T1059.001 — PowerShell: one-liner downloader
[2] T1055 — Process Injection: VirtualAllocEx + WriteProcessMemory
[3] T1055 — Process Injection: CreateRemoteThread
[4] T1105 — Ingress Tool Transfer: URLDownloadToFile
```

**What to look for:**
- Technique count > 3: sophisticated payload
- T1055 (Process Injection): likely evasion attempt
- T1105 (Ingress Tool Transfer): network indicators
- T1486 (Data Encrypted for Impact): ransomware behavior
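
The points above can be checked quickly against a saved kill chain. This is a rough sketch that assumes the kill chain is available as plain text in the format shown earlier; `summarize_kill_chain` is a hypothetical helper and relies on `grep -o`, which GNU and BSD grep both provide.

```bash
# Hypothetical helper: read kill-chain lines on stdin, then print the number
# of distinct technique IDs and any of the IDs singled out in this section.
summarize_kill_chain() {
  local ids count id
  # Pull out technique IDs (with optional sub-technique suffix), de-duplicated.
  ids=$(grep -oE 'T[0-9]{4}(\.[0-9]{3})?' | sort -u)
  # Count non-empty lines; "|| true" keeps a zero count from being an error.
  count=$(printf '%s\n' "$ids" | grep -c . || true)
  echo "distinct_techniques=$count"
  for id in T1055 T1105 T1486; do
    if printf '%s\n' "$ids" | grep -q "^$id"; then
      echo "flag=$id"
    fi
  done
}
```

For example, `summarize_kill_chain < killchain.txt`, where `killchain.txt` is a hypothetical file holding the kill-chain section of a report. Compare `distinct_techniques` against the threshold of 3 and review every `flag=` line.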

### Behavioral Summary

Text summary of what the payload did:
- File operations (created/modified/deleted)
- Network operations (outbound connections, DNS queries)
- Process operations (spawned children, injected into processes)
- Registry operations (modified keys)

---

## Troubleshooting Guide

### Problem: Payload never starts processing

**Symptoms:** The upload returns 200 OK, but no job appears in the queue.

**Diagnosis:**
1. Check that the share mount is reachable from the VM (see Issue 1).
2. Run `curl -v http://172.28.0.1:1337/jobs`: does the job appear?
3. Check the LitterBox logs: `docker logs litterbox-api`.

**Fix order:** Verify share mount → verify upload endpoint → check LitterBox logs

---

### Problem: Payload killed at exactly 5 seconds

**Symptoms:** All payloads die at 5 seconds, regardless of the timeout setting.

**Diagnosis:** This is Issue 4. Check `manager.py` lines 418-419.

```bash
grep -n "init_wait_time" ~/greysec/tools/LitterBox/app/analyzers/manager.py
# Should show hardcoded value = 5
```

**Fix:** Change `manager.py` to use the value from `config.yaml` instead of the hardcoded 5.

---

### Problem: Whiskers endpoint returns 502 or timeout

**Symptoms:** `curl http://172.28.0.10:8080/api/alerts/fibratus/since` fails.

**Diagnosis:** Whiskers process died (Issue 3 — no keepalive).

**Fix (immediate):** PAExec back into VM and restart Whiskers.
**Fix (permanent):** Task 3 — install as Windows service.

---

### Problem: RedEdr report is empty despite real syscalls

**Symptoms:** Whiskers returns events but RedEdr shows nothing.

**Diagnosis:** This is Issue 2. Fibratus sees events but they don't reach the final report.

**Fix:** Trace the event path:
1. Inside VM: run `fibratus dump` — are events being captured by Fibratus?
2. `curl http://172.28.0.10:8080/api/alerts/fibratus/since` — does Whiskers see them?
3. Check `variant_event_consumer` logs — is it receiving from RabbitMQ?
4. Check Supabase `malware_analyses` table — are events stored?

Find the break point and fix at that layer.
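
The layer-by-layer trace can be scripted once each layer has a check that exits non-zero on failure. The sketch below only shows the ordering logic; `first_broken_stage` is a hypothetical helper, and the per-layer check functions are left for the operator to define.

```bash
# Hypothetical helper: run the named checks in pipeline order and print the
# first one that fails, or "none" when every check passes.
first_broken_stage() {
  # Arguments: names of shell functions or commands, in pipeline order.
  local stage
  for stage in "$@"; do
    if ! "$stage" >/dev/null 2>&1; then
      echo "$stage"
      return 1
    fi
  done
  echo "none"
}
```

For example, with `check_whiskers() { curl -sf http://172.28.0.10:8080/api/alerts/fibratus/since; }` and similar functions for the other layers, `first_broken_stage check_whiskers check_consumer check_supabase` prints the name of the first failing layer. The function names here are illustrative only.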

---

### Problem: RabbitMQ queue not draining

**Symptoms:** `curl -u guest:guest http://localhost:15672/api/queues` shows messages accumulating.

**Diagnosis:** `variant_event_consumer` is not running or is crashing on messages.

**Fix:**
```bash
# Restart consumer in the foreground with unbuffered output,
# so log lines and tracebacks appear immediately
python3 -u ~/bin/greysec/variant_event_consumer.py

# Check consumer is running (from another terminal)
pgrep -af variant_event_consumer
```

---

### Problem: VM unreachable at 172.28.0.10

**Symptoms:** `ping 172.28.0.10` fails.

**Diagnosis:** VM is down or Docker bridge network changed.

**Fix:**
```bash
# Check VM status
virsh list --all

# Restart VM
virsh start greysec-win11

# Verify Docker bridge
docker network inspect bridge | jq '.[0].IPAM.Config[0].Subnet'
# Expected: "172.28.0.0/24"
# If the lab uses a user-defined network, run `docker network ls`
# and substitute that network's name for `bridge`
```
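
To confirm that the VM address still falls inside the subnet Docker reports, the comparison can be done with integer arithmetic. This is a minimal IPv4-only sketch; `ip_to_int` and `ip_in_subnet` are hypothetical helpers, and they do no input validation.

```bash
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( ($a << 24) | ($b << 16) | ($c << 8) | $d ))
}

# Hypothetical helper: succeed when address $1 lies inside CIDR subnet $2.
ip_in_subnet() {
  # $1 = address (e.g. 172.28.0.10), $2 = subnet (e.g. 172.28.0.0/24)
  local ip net bits mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")   # network part, before the slash
  bits=${2#*/}                 # prefix length, after the slash
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}
```

For example, `ip_in_subnet 172.28.0.10 "$(docker network inspect bridge | jq -r '.[0].IPAM.Config[0].Subnet')"` fails if the bridge subnet has changed and no longer covers the VM.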

---

## Escalation Path

**If you encounter any of these, ping @Adam immediately:**

1. VM will not start or boots to BSOD
2. Docker stack fails to start after host reboot
3. Supabase is unreachable and not recoverable within 5 minutes
4. MacBook Ollama needs to be re-authenticated (token expired)
5. Any of the 4 critical bugs cannot be resolved within 2 hours of focused work

**Before escalating:**
- Document what you tried
- Note exact error messages
- Note which component is failing (ping the exact hop)

**Format for escalation:**
```
[@Adam] [COMPONENT] is broken: [ONE-LINE DESCRIPTION]
What I tried: [SHORT LIST]
Error: [EXACT ERROR]
Last working: [WHEN — if known]
```

---

## Appendix: Test Payloads

| Name | Path | Purpose | Expected Behavior |
|------|------|---------|-------------------|
| ransomware_sim_v1.py | `~/greysec/engagements/litterbox-fibratus-deploy/payloads/` | Detection Score test | 60-80 score, multiple ATT&CK techniques |
| ransomware_sim_v2.c | `~/greysec/engagements/litterbox-fibratus-deploy/payloads/` | Extended run test | Run > 5 seconds, capture output |
| ransomware_sim_v3.c | `~/greysec/engagements/litterbox-fibratus-deploy/payloads/` | Extended run test | Same as v2 |
| calc.exe | Windows system binary | Clean baseline test | Score < 20 |
| notepad.exe | Windows system binary | Clean baseline test | Score < 20 |