# GreySec RED — Master Kanban
**Product:** GreySec Exploit Development Pipeline
**Type:** Internal Build Project
**Status:** BUILDING
**Updated:** 2026-05-07
**Parent debrief:** `~/greysec/ops/debriefs/exploit-lab-2026-05-07.md`
---
## Background
GreySec RED is an AI-augmented reverse engineering and exploit development lab. It takes a binary target, runs it through a two-agent pipeline (RE Agent + Exploit Writer), and produces a complete vulnerability brief plus a working exploit.
**Architecture:**
```
Binary Target → RE Agent (Kali + qwen2.5-coder:abliterator)
                    ↓
          analysis.md + struct.json
                    ↓
          Exploit Writer (Kali + qwen2.5-coder:abliterator)
                    ↓
          exploit.py + shellcode.bin + test-results.md
```
**Current status:** PARTIALLY OPERATIONAL. Works on easy binaries (stack0, stack1, format0). Fails silently on harder ones (heap0). Validation layer missing.
---
## Pipeline Definition
**What the product IS:**
Drop a binary. Get a vulnerability brief, a working exploit, and shellcode. No manual RE required.
**What the client receives:**
- `analysis.md` — full vulnerability analysis: vulnerable function, offset calculation, attack vector, constraints
- `struct.json` — structured vulnerability data: offset, return address, bad chars, suggested shellcode type
- `exploit.py` — working pwntools exploit targeting the binary directly
- `shellcode.bin` — position-independent shellcode for the target architecture
- `test-results.md` — proof the exploit was run against the real binary and worked
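
As an illustration of what the structured data in `struct.json` could look like in practice, the sketch below checks a parsed file for the four fields listed above. The key names (`offset`, `return_address`, `bad_chars`, `shellcode_type`) and the helpers `validate_struct` and `load_and_validate` are assumptions made for this example; the schema the RE Agent actually emits is not specified here.

```python
import json

# Hypothetical key names; the real struct.json schema may differ.
REQUIRED_FIELDS = {
    "offset": int,          # bytes from buffer start to the overwritten target
    "return_address": str,  # hex string such as "0x08048464"
    "bad_chars": list,      # byte values the payload must avoid
    "shellcode_type": str,  # suggested shellcode type
}


def validate_struct(data: dict) -> list[str]:
    """Return a list of problems; an empty list means the data looks usable."""
    problems = []
    for key, expected in REQUIRED_FIELDS.items():
        if key not in data:
            problems.append(f"missing field: {key}")
        # bool is a subclass of int, so reject it explicitly
        elif not isinstance(data[key], expected) or isinstance(data[key], bool):
            problems.append(f"{key}: expected {expected.__name__}")
    offset = data.get("offset")
    if isinstance(offset, int) and not isinstance(offset, bool) and offset < 0:
        problems.append("offset: must be non-negative")
    addr = data.get("return_address")
    if isinstance(addr, str):
        try:
            int(addr, 16)
        except ValueError:
            problems.append("return_address: not a hex string")
    return problems


def load_and_validate(path: str) -> list[str]:
    """Parse struct.json from disk and validate it."""
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, json.JSONDecodeError) as exc:
        return [f"cannot read {path}: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value must be an object"]
    return validate_struct(data)
```

A check like this would let the Exploit Writer refuse to start on a file that exists but is incomplete, not only on a file that is missing.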
**Target buyer:**
- Security teams building internal red team toolchains
- Exploit developers who need fast turnaround on binary targets
- CTF players and competitive hacking teams needing rapid challenge solutions
- Security researchers analyzing third-party binaries for CVEs
---
## Current State
### What Works
- RE Agent on stack0, stack1, format0: analysis.md + struct.json correct; exploit.py written and correct on review, but not yet run against the real binary (see issues #2 and #8)
- Agent scripts functional: `re-agent.sh` and `exploit-writer.sh` exist and execute
- Directory structure clean: reports/, exploits/, agents/ properly organized
- Model access confirmed: Kali container can reach MacBook Ollama at 100.127.137.64 (when SSH works)
- Exploit approach correct: pwntools process/stdin for stack0, argv for stack1
### What Is Broken or Missing
| # | Issue | Severity | Fix Time | Status |
|---|-------|----------|----------|--------|
| 1 | `re-agent.sh` has no validation gate — if struct.json not produced, exploit-writer.sh runs blind | CRITICAL | 30 min | Script updated — needs testing |
| 2 | `exploit-writer.sh` has no test loop — exploit.py never run against real binary | CRITICAL | 30 min | Script updated — needs testing |
| 3 | No gbrain logging hooks — pipeline findings not captured to institutional knowledge | HIGH | 15 min | Script updated — needs testing |
| 4 | No TIME-LOG hook — ~2.5 hours of AI time completely untracked | HIGH | 15 min | Script updated — needs testing |
| 5 | heap0 RE Agent failed silently — no struct.json, exploit written from guesswork | HIGH | 20 min | Needs re-run with fixed re-agent.sh |
| 6 | No shellcode.bin produced for any binary — referenced in scripts, never built | MEDIUM | 30 min | Needs msfvenom or pwntools asm step |
| 7 | MacBook SSH blocked — password rejected twice — abliterator model unreachable | CRITICAL | TBD | Adam needs to fix SSH or Tailscale |
| 8 | No test-results.md for any binary — "verified" in kanban was false | HIGH | — | Fixes #1 and #2 address this |
| 9 | Skill file `greysec-exploit-lab` does not exist | MEDIUM | 1 hour | Needs writing |
| 10 | `exploit.py` for vuln_test uses nonstandard "launcher" approach | MEDIUM | 30 min | Needs rewrite as direct pwntools exploit |
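
For issue #4, a minimal sketch of what a TIME-LOG hook could look like is given here. The tab-separated line format, the log path, and the `time_logged` helper are all assumptions for illustration; the existing TIME-LOG format is not described in this kanban.

```python
import time
from contextlib import contextmanager
from datetime import datetime, timezone


@contextmanager
def time_logged(log_path: str, target: str, stage: str):
    """Append one line per pipeline stage: timestamp, target, stage, seconds, status."""
    started = time.monotonic()
    status = "ok"
    try:
        yield
    except BaseException:
        # Record the failure, then let the original exception propagate.
        status = "error"
        raise
    finally:
        elapsed = time.monotonic() - started
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        with open(log_path, "a", encoding="utf-8") as fh:
            fh.write(f"{stamp}\t{target}\t{stage}\t{elapsed:.1f}\t{status}\n")
```

Wrapping a stage as `with time_logged("TIME-LOG.tsv", "heap0", "re-agent"): ...` records the elapsed time even when the stage raises, so failed runs stop going untracked.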
---
## BOARD
### BACKLOG
- [ ] Write `greysec-exploit-lab` skill (standalone operational procedure)
- [ ] Build shellcode.bin generation step (msfvenom or pwntools asm) for each binary
- [ ] Rewrite vuln_test exploit.py as direct pwntools targeting, not C tempfile launcher
- [ ] Add gbrain knowledge logging for each completed target (pipeline should auto-log findings)
- [ ] Benchmark: how fast is the pipeline on intermediate binaries (heap2/3, format1-4, net0-4)?
- [ ] Windows DLL analysis path — can the pipeline handle PE/DLL analysis?
- [ ] ARM/IoT binary path — what changes needed for non-x86 targets?
- [ ] Multi-arch support: x86, x64, ARM, MIPS — shellcode generation per arch
### IN PROGRESS
- [ ] **Validate updated re-agent.sh and exploit-writer.sh on heap0**
- [ ] **Unblock MacBook SSH** (Adam's decision needed)
### VALIDATING
_(empty — waiting for heap0 re-run and MacBook SSH fix)_
### DONE
- [x] Agent pipeline architecture (RE Agent + Exploit Writer, two-stage)
- [x] Agent scripts written (re-agent.sh, exploit-writer.sh)
- [x] Directory structure created (reports/, exploits/, agents/)
- [x] Protostar binaries processed: stack0, stack1, format0 (3 of 5 targets; exploits not yet test-run, see issue #8)
- [x] Agent scripts updated with validation gates, gbrain hooks, TIME-LOG hooks
### BLOCKED
- [ ] **MacBook SSH** — abliterator model unreachable, all RE runs on cloud fallback
- [ ] heap0 re-run — blocked by MacBook SSH (RE Agent works better with abliterator model)
---
## Technical Fix Tasks
### Task 1: Validate re-agent.sh + exploit-writer.sh on heap0 (CRITICAL)
**What:** The updated scripts have validation gates and test loops. Run them against heap0 to confirm they work.
**Test procedure:**
```bash
cd ~/greysec/engagements/exploit-lab
./agents/re-agent.sh heap0 /opt/protostar/bin/heap0
# Expected: analysis.md + struct.json produced
# Expected: FAIL and exit 1 if struct.json missing
./agents/exploit-writer.sh heap0
# Expected: exploit.py written
# Expected: exploit.py run against real binary
# Expected: test-results.md with PASS or FAIL
```
**Acceptance criteria:**
- re-agent.sh exits 1 if struct.json not produced
- exploit-writer.sh refuses to run if struct.json missing
- exploit.py is run and result captured in test-results.md
- Both scripts log to gbrain and TIME-LOG
**Who:** qwen2.5-coder:14b (can be self-verified)
**Time:** 20-30 minutes
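
The gate behaviour in the acceptance criteria can be sketched as follows. This is an illustration in Python, not the contents of `re-agent.sh`; the `gate` and `main` helpers and the idea of one report directory per target are assumptions.

```python
import sys
from pathlib import Path


def gate(report_dir: str, required=("analysis.md", "struct.json")) -> list[str]:
    """Return the names of required artifacts that are missing or empty."""
    base = Path(report_dir)
    return [
        name
        for name in required
        if not (base / name).is_file() or (base / name).stat().st_size == 0
    ]


def main(argv: list[str]) -> int:
    """Exit code 0 if all artifacts exist, 1 if any are missing, 2 on bad usage."""
    if len(argv) != 2:
        print("usage: gate.py <report_dir>", file=sys.stderr)
        return 2
    missing = gate(argv[1])
    if missing:
        print(f"FAIL: missing or empty: {', '.join(missing)}", file=sys.stderr)
        return 1
    print("OK: all RE artifacts present")
    return 0
```

A script would end with `sys.exit(main(sys.argv))`, so a calling shell script can stop the pipeline on a non-zero exit code before the Exploit Writer runs.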
---
### Task 2: Fix MacBook SSH Access (CRITICAL — Adam's Decision Needed)
**What:** Password SSH to `adamsloggett@100.127.137.64` rejected twice. The abliterator model only lives on MacBook.
**Option A — Fix the password:**
The current Mac password is `V4sTGRZqm#dW5@aW` (rotated 2026-05-05). Try from the Linux host directly to confirm whether this is a Tailscale SSH issue or a password issue:
```bash
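# Run from the Linux host with -v to see which auth methods are offered and where login fails.
# 100.127.137.64 is a Tailscale address; to rule Tailscale out, use the Mac's LAN address if reachable.
ssh -v adamsloggett@100.127.137.64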
```
**Option B — Use Tailscale SSH instead:**
Tailscale SSH (`tailscale ssh adamsloggett@macbook-pro-2`) bypasses password auth: the connection is authenticated by the devices' Tailscale identities and authorized by the tailnet's SSH access rules. This requires:
```bash
# On the Mac: enable Tailscale SSH
# (on macOS the SSH server reportedly needs the open-source tailscaled build, not the sandboxed GUI app)
tailscale set --ssh
# On the Linux host: use Tailscale hostname
ssh adamsloggett@macbook-pro-2
# Or: tailscale ssh adamsloggett@macbook-pro-2
```
**Option C — Copy the abliterator model to Linux:**
If the MacBook stays unreachable, its Ollama copy cannot be used as the source, so pull `huihui_ai/qwen2.5-coder-abliterate:latest` directly onto the Linux host and serve it there. This requires enough disk space (~10GB).
**Who:** Adam needs to pick an option and act. Hermes will execute once the path is confirmed.
**Time:** Option A or B: ~15 minutes. Option C: ~30 minutes.
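
Whichever option is chosen, it helps to confirm that the model is actually being served before starting an RE run. The sketch below queries Ollama's `GET /api/tags` endpoint and looks for the model name. It assumes Ollama is listening on its default port 11434 and is reachable directly from the pipeline host, which this kanban does not confirm; the helper names are hypothetical.

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON returned by Ollama's GET /api/tags."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def has_model(tags_payload: dict, wanted: str) -> bool:
    """True if any listed model name starts with the wanted prefix."""
    return any(name.startswith(wanted) for name in model_names(tags_payload))


def fetch_tags(host: str, port: int = 11434, timeout: float = 5.0) -> dict:
    """Query a running Ollama server; raises OSError/URLError if it is unreachable."""
    url = f"http://{host}:{port}/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `has_model(fetch_tags("100.127.137.64"), "huihui_ai/qwen2.5-coder-abliterate")` would tell the pipeline whether to proceed or fall back, without needing an interactive SSH session.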
---
### Task 3: Re-Run heap0 with Fixed Scripts
**What:** heap0's RE Agent failed silently in the original run. With the validation gate in place, re-run it.
**Procedure:**
```bash
cd ~/greysec/engagements/exploit-lab
./agents/re-agent.sh heap0 /opt/protostar/bin/heap0
# Should produce struct.json or FAIL
./agents/exploit-writer.sh heap0
# Should produce tested exploit.py
```
**Acceptance criteria:**
- struct.json produced for heap0
- analysis.md accurate (verify offset and WINNER address)
- exploit.py written and tested — PASS in test-results.md
**Who:** qwen2.5-coder:14b
**Time:** 20-30 minutes
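
The "tested exploit.py" criterion implies a small harness that runs a command, decides PASS or FAIL, and writes `test-results.md`. A minimal sketch follows, assuming success can be detected by a marker string in the program output. The `run_and_record` helper, the report layout, and the marker are assumptions, not the behaviour of `exploit-writer.sh`.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def run_and_record(cmd, success_marker, results_path, timeout=30.0):
    """Run cmd once, judge PASS/FAIL from its output, and write a markdown report."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, errors="replace", timeout=timeout
        )
        output = proc.stdout + proc.stderr
        outcome = f"exit code {proc.returncode}"
    except subprocess.TimeoutExpired:
        output, outcome = "", f"timed out after {timeout:.0f}s"
    except OSError as exc:
        output, outcome = "", f"could not start: {exc}"

    passed = success_marker in output
    fence = "`" * 3  # built at runtime so the sketch can sit inside a markdown fence
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    lines = [
        "# Test Results",
        f"- **Command:** `{' '.join(cmd)}`",
        f"- **Run at:** {stamp}",
        f"- **Outcome:** {outcome}",
        f"- **Result:** {'PASS' if passed else 'FAIL'}",
        "",
        fence,
        output.rstrip(),
        fence,
    ]
    path = Path(results_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return passed
```

For heap0 the marker would presumably be the message printed when the WINNER function is reached (for example `"level passed"`, if that is what the binary prints). A timeout or a crash is recorded as FAIL instead of passing silently.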
---
### Task 4: Shellcode Generation Step
**What:** No shellcode.bin has ever been produced. The agent scripts reference it but the step doesn't exist.
**Approach:** Use `msfvenom` or pwntools' `asm()` to generate shellcode based on the binary architecture.
**For Protostar binaries (x86, static):**
```bash
# Example for stack0 (calls execve("/bin/sh")); -o writes the raw bytes to shellcode.bin
msfvenom -p linux/x86/exec CMD=/bin/sh -a x86 --platform linux -f raw -o shellcode.bin
```
**Or via pwntools in exploit.py:**
```python
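from pwn import *
# Assemble for 32-bit Linux explicitly instead of relying on the default context
shellcode = asm(shellcraft.i386.linux.sh(), arch="i386", os="linux")
write("shellcode.bin", shellcode)  # pwntools helper: writes the raw bytes to disk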
```
**Files to touch:** `exploit-writer.sh` — add shellcode generation as a post-exploit step.
**Who:** qwen2.5-coder:14b
**Time:** 30 minutes to add the step and test on stack0
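
Since the approach is to generate shellcode "based on the binary architecture", the step needs to know that architecture first. One way, sketched here with only the standard library, is to read the `e_machine` field from the ELF header. The helper names and the mapping to pwntools-style architecture names are assumptions for illustration; pwntools' own `ELF` class could serve the same purpose.

```python
import struct

# e_machine values from the ELF specification, mapped to pwntools-style arch names.
E_MACHINE = {3: "i386", 62: "amd64", 40: "arm", 183: "aarch64", 8: "mips"}


def elf_arch(header: bytes) -> str:
    """Return the architecture name for an ELF header (first 20 bytes of the file)."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5): 1 = little-endian, 2 = big-endian
    endian = {1: "<", 2: ">"}.get(header[5])
    if endian is None:
        raise ValueError("unknown ELF byte order")
    # e_machine is a 16-bit field at offset 18, after e_ident (16 bytes) and e_type (2 bytes)
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    if machine not in E_MACHINE:
        raise ValueError(f"unsupported e_machine: {machine}")
    return E_MACHINE[machine]


def file_arch(path: str) -> str:
    """Read just enough of the file to identify its architecture."""
    with open(path, "rb") as fh:
        return elf_arch(fh.read(20))
```

The returned name could then select the matching generator (for example `shellcraft.i386` for the Protostar binaries), and an unsupported value fails loudly instead of producing shellcode for the wrong architecture.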
---
### Task 5: Rewrite vuln_test exploit.py (MEDIUM)
**What:** The current vuln_test exploit.py uses a nonstandard "launcher" approach: it writes a C helper program to a temp file from Python and compiles it. Rewrite it as a direct pwntools `process()` exploit that targets the actual `/tmp/vuln_test` binary.
**Why:** The agent contract specifies pwntools process/exploit targeting the binary directly. The launcher approach is a workaround that suggests the model didn't fully understand how to exploit the binary via pwntools standard interface.
**Who:** qwen2.5-coder:14b
**Time:** 30 minutes
---
## Capability Expansion Tasks
### Tier 1: Beginner (WHAT EXISTS)
**Binaries:** stack0, stack1, format0 (Protostar)
**Skills needed:** Basic buffer overflow, format string exploitation
**Time per binary:** ~20-30 minutes with pipeline
**Status:** OPERATIONAL — 3 of 3 complete
### Tier 2: Intermediate (IN BACKLOG)
**Binaries:** heap2, heap3, format1-4, net0-4 (Protostar)
**Skills needed:** Heap grooming, UAF, fastbin dup, format string chaining
**Time per binary:** ~40-60 minutes with pipeline
**What needs building:** None — same pipeline, harder targets
### Tier 3: Advanced (IN BACKLOG)
**Binaries:** Fusion (the Exploit Exercises follow-on to Protostar: Web, HTTP, SQL, etc.)
**Skills needed:** ROP chains, ASLR/DEP bypass, heap techniques
**What needs building:** ROP gadget finder integration, libc database lookup
**Time per binary:** ~60-90 minutes with pipeline
### Tier 4: Elite (FUTURE)
**Targets:** Real-world binaries, DLL analysis, kernel modules
**Skills needed:** Full RE, CVE research, kernel exploitation
**What needs building:** Windows VM path, DLL analysis pipeline, kernel debug setup
---
## Product Tiers (Internal Planning)
| Tier | Target | Output | Complexity | Time |
|------|--------|--------|------------|------|
| Beginner | stack/heap/format (Protostar) | analysis + exploit | Easy | 20-30 min |
| Intermediate | Protostar advanced, VWA | analysis + exploit + ROP | Medium | 40-60 min |
| Advanced | real-world binaries | analysis + struct.json + suggested exploit path | Hard | 60-120 min |
| Elite | 0-day research | analysis only (no exploit — model limitations) | Expert | TBD |
---
## Definition of Done
GreySec RED is operational when:
1. re-agent.sh validates struct.json and exits non-zero if missing
2. exploit-writer.sh tests exploit.py against real binary and reports PASS/FAIL
3. gbrain logging is wired and firing after each completed target
4. TIME-LOG is updated after each pipeline run
5. heap0 re-run produces correct struct.json (validated against known values: offset 80, WINNER 0x08048464)
6. shellcode.bin is generated for at least stack0
7. All 5 target binaries (Protostar stack0, stack1, format0, heap0, plus vuln_test) have PASS in test-results.md
8. MacBook SSH is unblocked and abliterator model is reachable
9. Skill file `greysec-exploit-lab` exists and documents operational procedure
10. At least one intermediate binary (heap2 or format1) has been processed end-to-end with a PASS in test-results.md
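
Item 7 lends itself to an automated check. The sketch below scans one `test-results.md` per target and reports PASS, FAIL, or MISSING. The `reports/<target>/test-results.md` layout and the rule "contains PASS and no FAIL" are assumptions for illustration; adjust them to whatever the scripts actually write.

```python
from pathlib import Path

TARGETS = ("stack0", "stack1", "format0", "heap0", "vuln_test")


def target_status(reports_dir: str, targets=TARGETS) -> dict[str, str]:
    """Map each target to PASS, FAIL, or MISSING based on its test-results.md."""
    status = {}
    for name in targets:
        results = Path(reports_dir) / name / "test-results.md"
        if not results.is_file():
            status[name] = "MISSING"
            continue
        text = results.read_text(encoding="utf-8", errors="replace")
        # Conservative rule: any FAIL in the file overrides a PASS.
        status[name] = "PASS" if "PASS" in text and "FAIL" not in text else "FAIL"
    return status


def all_passed(status: dict[str, str]) -> bool:
    """True only when every target has a PASS result."""
    return bool(status) and all(value == "PASS" for value in status.values())
```

Treating a missing file as its own state keeps "never tested" distinct from "tested and failed", which is the gap issue #8 describes.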
---
## DEBT (Action Items from This Kanban)
| Action Item | Priority | Status | Notes |
|------------|----------|--------|-------|
| Validate updated scripts on heap0 | CRITICAL | open | Confirm validation gates work |
| Unblock MacBook SSH | CRITICAL | blocked | Adam's decision needed |
| Re-run heap0 with fixed scripts | HIGH | open | After Task 1 |
| Build shellcode.bin generation step | HIGH | open | msfvenom or pwntools asm |
| Rewrite vuln_test exploit.py | MEDIUM | open | Direct pwntools approach |
| Test intermediate binaries (heap2, format1) | MEDIUM | open | Pipeline validation |
| Write greysec-exploit-lab skill | MEDIUM | open | Operational docs |
| Add ROP gadget finder for advanced tier | LOW | backlog | Future |
| Validate pipeline against Windows DLL | LOW | backlog | Future |