# GreySec RED: Master Kanban
**Product:** GreySec Exploit Development Pipeline
**Type:** Internal Build Project
**Status:** BUILDING
**Updated:** 2026-05-07
**Parent debrief:** `~/greysec/ops/debriefs/exploit-lab-2026-05-07.md`

---

## Background

GreySec RED is an AI-augmented reverse engineering and exploit development lab. It takes a binary target, runs it through a two-agent pipeline (RE Agent + Exploit Writer), and produces a complete vulnerability brief plus a working exploit.

**Architecture:**
```
Binary Target → RE Agent (Kali + qwen2.5-coder:abliterator)
                ↓
                analysis.md + struct.json
                ↓
                Exploit Writer (Kali + qwen2.5-coder:abliterator)
                ↓
                exploit.py + shellcode.bin + test-results.md
```

**Current status:** PARTIALLY OPERATIONAL. Works on easy binaries (stack0, stack1, format0). Fails silently on harder ones (heap0). A validation layer has been written into the scripts but is not yet tested.

---

## Pipeline Definition

**What the product IS:**
Drop a binary. Get a vulnerability brief, a working exploit, and shellcode. No manual RE required.

**What the client receives:**
- `analysis.md`: full vulnerability analysis (vulnerable function, offset calculation, attack vector, constraints)
- `struct.json`: structured vulnerability data (offset, return address, bad chars, suggested shellcode type)
- `exploit.py`: working pwntools exploit targeting the binary directly
- `shellcode.bin`: position-independent shellcode for the target architecture
- `test-results.md`: proof the exploit was run against the real binary and worked

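The `struct.json` fields listed above lend themselves to a mechanical check before the file is handed to the Exploit Writer. The sketch below is one possible shape, not the pipeline's actual schema: the key names (`offset`, `return_address`, `bad_chars`, `shellcode_type`) and the function name `validate_struct` are assumptions made for illustration.

```python
import json

# Hypothetical key names; the kanban only lists these fields informally.
REQUIRED_FIELDS = {
    "offset": int,
    "return_address": str,   # e.g. "0x08048464"
    "bad_chars": list,       # e.g. ["0x00", "0x0a"]
    "shellcode_type": str,
}


def validate_struct(text):
    """Return a list of problems found in a struct.json document (empty means OK)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    problems = []
    for key, expected in REQUIRED_FIELDS.items():
        if key not in data:
            problems.append(f"missing field: {key}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif not isinstance(data[key], expected) or isinstance(data[key], bool):
            problems.append(f"wrong type for {key}: expected {expected.__name__}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a failed run report everything that is wrong with the file in a single pass.
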
**Target buyer:**
- Security teams building internal red team toolchains
- Exploit developers who need fast turnaround on binary targets
- CTF players and competitive hacking teams needing rapid challenge solutions
- Security researchers analyzing third-party binaries for CVEs

---

## Current State

### What Works
- RE Agent on stack0, stack1, format0: analysis.md + struct.json correct; exploit.py written (approach correct, but not yet run against the real binary; see issues #2 and #8)
- Agent scripts functional: `re-agent.sh` and `exploit-writer.sh` exist and execute
- Directory structure clean: reports/, exploits/, agents/ properly organized
- Model access confirmed: Kali container can reach MacBook Ollama at 100.127.137.64 (when SSH works)
- Exploit approach correct: pwntools process/stdin for stack0, argv for stack1

### What Is Broken or Missing

| # | Issue | Severity | Fix Time | Status |
|---|-------|----------|----------|--------|
| 1 | `re-agent.sh` has no validation gate: if struct.json is not produced, exploit-writer.sh runs blind | CRITICAL | 30 min | Script updated, needs testing |
| 2 | `exploit-writer.sh` has no test loop: exploit.py is never run against the real binary | CRITICAL | 30 min | Script updated, needs testing |
| 3 | No gbrain logging hooks: pipeline findings not captured to institutional knowledge | HIGH | 15 min | Script updated, needs testing |
| 4 | No TIME-LOG hook: ~2.5 hours of AI time completely untracked | HIGH | 15 min | Script updated, needs testing |
| 5 | heap0 RE Agent failed silently: no struct.json, exploit written from guesswork | HIGH | 20 min | Needs re-run with fixed re-agent.sh |
| 6 | No shellcode.bin produced for any binary: referenced in scripts, never built | MEDIUM | 30 min | Needs msfvenom or pwntools asm step |
| 7 | MacBook SSH blocked (password rejected twice); abliterator model unreachable | CRITICAL | TBD | Adam needs to fix SSH or Tailscale |
| 8 | No test-results.md for any binary: "verified" in kanban was false | HIGH | n/a | Fixes #1 and #2 address this |
| 9 | Skill file `greysec-exploit-lab` does not exist | MEDIUM | 1 hour | Needs writing |
| 10 | `exploit.py` for vuln_test uses nonstandard "launcher" approach | MEDIUM | 30 min | Needs rewrite as direct pwntools exploit |

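Issues 3 and 4 both come down to appending one record per pipeline run. A minimal sketch of such a hook follows, assuming a JSON-lines log file. The function name `append_log`, the record fields, and the file format are all hypothetical; how gbrain actually ingests findings is not described in this kanban.

```python
import json
import time
from pathlib import Path


def append_log(log_path, target, stage, started, finished, status):
    """Append one JSON line describing a pipeline run and return the record."""
    record = {
        "target": target,
        "stage": stage,      # e.g. "re-agent" or "exploit-writer"
        "status": status,    # e.g. "PASS" or "FAIL"
        "seconds": round(finished - started, 3),
        "logged_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(finished)),
    }
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier runs intact, so the log doubles as a time sheet.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record
```

A caller would take `time.time()` before and after each agent run and pass both values in, so untracked AI time shows up as a gap in the log rather than going unnoticed.
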
---

## BOARD

### BACKLOG

- [ ] Write `greysec-exploit-lab` skill (standalone operational procedure)
- [ ] Build shellcode.bin generation step (msfvenom or pwntools asm) for each binary
- [ ] Rewrite vuln_test exploit.py as direct pwntools targeting, not C tempfile launcher
- [ ] Add gbrain knowledge logging for each completed target (pipeline should auto-log findings)
- [ ] Benchmark: how fast is the pipeline on intermediate binaries (heap2/3, format1-4, net0-4)?
- [ ] Windows DLL analysis path: can the pipeline handle PE/DLL analysis?
- [ ] ARM/IoT binary path: what changes needed for non-x86 targets?
- [ ] Multi-arch support (x86, x64, ARM, MIPS): shellcode generation per arch

### IN PROGRESS

- [ ] **Validate updated re-agent.sh and exploit-writer.sh on heap0**
- [ ] **Unblock MacBook SSH** (Adam's decision needed)

### VALIDATING

_(empty: waiting for heap0 re-run and MacBook SSH fix)_

### DONE

- [x] Agent pipeline architecture (RE Agent + Exploit Writer, two-stage)
- [x] Agent scripts written (re-agent.sh, exploit-writer.sh)
- [x] Directory structure created (reports/, exploits/, agents/)
- [x] Protostar binaries processed: stack0, stack1, format0 (3 of 5 complete; exploits written but not yet run against the binaries, see issue #8)
- [x] Agent scripts updated with validation gates, gbrain hooks, TIME-LOG hooks

### BLOCKED

- [ ] **MacBook SSH**: abliterator model unreachable, all RE runs on cloud fallback
- [ ] heap0 re-run: blocked by MacBook SSH (RE Agent works better with abliterator model)

---

## Technical Fix Tasks

### Task 1: Validate re-agent.sh + exploit-writer.sh on heap0 (HIGH PRIORITY)

**What:** The updated scripts have validation gates and test loops. Run them against heap0 to confirm they work.

**Test procedure:**
```bash
cd ~/greysec/engagements/exploit-lab
./agents/re-agent.sh heap0 /opt/protostar/bin/heap0
# Expected: analysis.md + struct.json produced
# Expected: FAIL and exit 1 if struct.json missing

./agents/exploit-writer.sh heap0
# Expected: exploit.py written
# Expected: exploit.py run against real binary
# Expected: test-results.md with PASS or FAIL
```

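The gate behavior expected from `re-agent.sh` in the procedure above could look roughly like this. It is a sketch under assumptions: the helper name `struct_gate` and the idea that each target's reports sit in a single directory are illustrative, and the real script is a shell script that may implement the check differently.

```python
import json
from pathlib import Path


def struct_gate(report_dir):
    """Return 0 if struct.json exists and holds a non-empty JSON object, else 1."""
    path = Path(report_dir) / "struct.json"
    if not path.is_file():
        print(f"FAIL: {path} was not produced")
        return 1
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        print(f"FAIL: {path} is not valid JSON")
        return 1
    if not isinstance(data, dict) or not data:
        print(f"FAIL: {path} is empty or not a JSON object")
        return 1
    return 0
```

Returning the exit status instead of calling `sys.exit` directly keeps the check testable; a thin wrapper can pass the value to `sys.exit` so that the Exploit Writer stage never starts after a silent RE failure.
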
**Acceptance criteria:**
- re-agent.sh exits 1 if struct.json not produced
- exploit-writer.sh refuses to run if struct.json missing
- exploit.py is run and result captured in test-results.md
- Both scripts log to gbrain and TIME-LOG

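The third criterion (run the exploit and capture the result) is the test loop that was missing. One way to sketch it, with a loud assumption: here a zero exit code counts as PASS, whereas the real pipeline may need to look for a success string in the target's output. The name `run_and_record` and the report layout are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def run_and_record(exploit_path, results_path, timeout=30):
    """Run an exploit script and write a PASS/FAIL verdict plus its output."""
    try:
        proc = subprocess.run(
            [sys.executable, str(exploit_path)],
            capture_output=True, text=True, timeout=timeout,
        )
        # Assumption: exit status 0 means the exploit reported success.
        verdict = "PASS" if proc.returncode == 0 else "FAIL"
        output = proc.stdout + proc.stderr
    except subprocess.TimeoutExpired:
        verdict, output = "FAIL", f"timed out after {timeout}s"
    report = f"# Test results\n\n**Result:** {verdict}\n\nOutput:\n\n{output}\n"
    Path(results_path).write_text(report, encoding="utf-8")
    return verdict
```

Treating a timeout as FAIL matters for exploits that hang waiting on an interactive shell: the run still leaves a `test-results.md` behind instead of stalling the pipeline.
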
**Who:** qwen2.5-coder:14b (can be self-verified)
**Time:** 20-30 minutes

---

### Task 2: Fix MacBook SSH Access (CRITICAL, Adam's Decision Needed)

**What:** Password SSH to `adamsloggett@100.127.137.64` was rejected twice. The abliterator model lives only on the MacBook.

**Option A (fix the password):**
The current Mac password is `[REDACTED]` (rotated 2026-05-05). Try from the Linux host directly to confirm whether this is a Tailscale SSH issue or a password issue:
```bash
ssh adamsloggett@100.127.137.64
# Run from the Linux host. 100.127.137.64 is a Tailscale address (100.64.0.0/10),
# so to rule Tailscale out, connect to the Mac's LAN address instead if possible.
```

**Option B (use Tailscale SSH instead):**
Tailscale SSH (`tailscale ssh adamsloggett@100.127.137.64`) bypasses password auth: connections are authenticated by Tailscale node identity and the tailnet's SSH access policy. This requires:
```bash
# On the Mac: enable Tailscale SSH
# (on macOS this is understood to need the open-source tailscaled build, not the App Store app)
tailscale set --ssh

# On the Linux host: use Tailscale hostname
ssh adamsloggett@macbook-pro-2
# Or, if the host's tailscaled runs in userspace-networking mode:
#   ssh -o "ProxyCommand tailscale nc %h %p" adamsloggett@100.127.137.64
```

**Option C (copy the abliterator model to Linux):**
If MacBook SSH stays blocked, copy `huihui_ai/qwen2.5-coder-abliterate:latest` from the MacBook's Ollama store by another route and host it on Linux. This requires enough disk space (~10GB).

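Before choosing among the options, it may help to separate "the network path is down" from "authentication is failing", which is the question Option A raises. The sketch below probes TCP reachability only and never attempts a login. Port 11434 is Ollama's default listen port; the function names and the wording of the diagnoses are illustrative assumptions.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(host, ssh_port=22, ollama_port=11434):
    """Classify reachability of the MacBook from this host."""
    ssh_ok = port_open(host, ssh_port)
    ollama_ok = port_open(host, ollama_port)
    if ssh_ok:
        return "sshd reachable: the rejection is an auth problem, not a network one"
    if ollama_ok:
        return "host reachable but sshd is not: check Remote Login on the Mac"
    return "host unreachable on both ports: check Tailscale connectivity"
```

For example, `diagnose("100.127.137.64")` run from the Linux host would indicate whether fixing the password (Option A) can help at all, or whether connectivity has to be restored first.
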
**Who:** Adam needs to pick an option and act. Hermes will execute once the path is confirmed.
**Time:** Option A or B: ~15 minutes. Option C: ~30 minutes.

---

### Task 3: Re-Run heap0 with Fixed Scripts

**What:** heap0's RE Agent failed silently in the original run. With the validation gate in place, re-run it.

**Procedure:**
```bash
cd ~/greysec/engagements/exploit-lab
./agents/re-agent.sh heap0 /opt/protostar/bin/heap0
# Should produce struct.json or FAIL

./agents/exploit-writer.sh heap0
# Should produce tested exploit.py
```

**Acceptance criteria:**
- struct.json produced for heap0
- analysis.md accurate (verify offset and WINNER address)
- exploit.py written and tested: PASS in test-results.md

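The "verify offset and WINNER address" criterion can be made mechanical by comparing the RE Agent's output with hand-verified values. This is a sketch only: the `struct.json` key names are assumptions, and the expected values are passed in by the caller rather than hard-coded. The second helper shows how a 32-bit address is laid out in little-endian byte order on x86.

```python
import struct


def check_known_values(found, expected_offset, expected_addr):
    """Compare RE Agent output against hand-verified values; return mismatches."""
    mismatches = []
    if found.get("offset") != expected_offset:
        mismatches.append(f"offset: got {found.get('offset')}, want {expected_offset}")
    # Assumes the address is stored as a hex string such as "0x08048464".
    got_addr = int(str(found.get("return_address", "0")), 16)
    if got_addr != expected_addr:
        mismatches.append(f"address: got {got_addr:#010x}, want {expected_addr:#010x}")
    return mismatches


def pack_addr32(addr):
    """Encode a 32-bit address in little-endian byte order, as x86 expects."""
    return struct.pack("<I", addr)
```

An empty list from `check_known_values` means the analysis agrees with the reference values; anything else is a reason to reject the run before an exploit is written from it.
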
**Who:** qwen2.5-coder:14b
**Time:** 20-30 minutes

---

### Task 4: Shellcode Generation Step

**What:** No shellcode.bin has ever been produced. The agent scripts reference it but the step doesn't exist.

**Approach:** Use `msfvenom` or pwntools' `asm()` to generate shellcode based on the binary architecture.

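Choosing shellcode "based on the binary architecture" first requires knowing that architecture. For ELF targets it can be read straight from the file header, as sketched below. The `e_machine` codes come from the ELF specification; the function name `elf_arch` and the short architecture labels are illustrative choices.

```python
import struct

# e_machine values defined by the ELF specification.
ELF_MACHINES = {3: "x86", 8: "mips", 40: "arm", 62: "x86-64", 183: "aarch64"}


def elf_arch(header):
    """Return (arch, bits) parsed from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])        # EI_CLASS
    endian = {1: "<", 2: ">"}.get(header[5])    # EI_DATA
    if bits is None or endian is None:
        raise ValueError("unrecognized ELF class or byte order")
    # e_machine is a 16-bit field at offset 18, after e_ident (16) and e_type (2).
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown({machine})"), bits
```

A caller would pass in `open(path, "rb").read(20)` and use the result to pick the matching payload or `shellcraft` module.
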
**For Protostar binaries (x86, static):**
```bash
# Example for stack0 (calls execve("/bin/sh")); -o writes the raw bytes to shellcode.bin
msfvenom -p linux/x86/exec CMD=/bin/sh -f raw -a x86 --platform linux -o shellcode.bin
```

**Or via pwntools in exploit.py:**
```python
from pwn import asm, context, shellcraft

context.update(arch="i386", os="linux")  # Protostar targets are 32-bit x86 Linux
shellcode = asm(shellcraft.i386.linux.sh())
```

**Files to touch:** `exploit-writer.sh`: add shellcode generation as a post-exploit step.

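Once a shellcode blob exists, it is worth checking it against the bad characters recorded in `struct.json` before it is written to `shellcode.bin`. A minimal sketch, assuming bad characters are stored either as integers or as hex strings such as `"0x0a"`; both helper names are hypothetical.

```python
def parse_bad_chars(values):
    """Convert entries such as "0x0a" or 10 into a set of byte values."""
    return {v if isinstance(v, int) else int(v, 16) for v in values}


def find_bad_chars(shellcode, bad_chars):
    """Return (offset, byte) pairs, in order, where shellcode holds a forbidden byte."""
    forbidden = set(bad_chars)
    return [(i, b) for i, b in enumerate(shellcode) if b in forbidden]
```

An empty result means the blob is safe for the recorded constraints; otherwise the offsets show which bytes need re-encoding.
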
**Who:** qwen2.5-coder:14b
**Time:** 30 minutes to add the step and test on stack0

---

### Task 5: Rewrite vuln_test exploit.py (MEDIUM)

**What:** The current vuln_test exploit.py uses a nonstandard "launcher" approach (it compiles a C helper program from a Python tempfile). Rewrite it as a direct pwntools process exploit targeting the actual `/tmp/vuln_test` binary.

**Why:** The agent contract specifies a pwntools process exploit targeting the binary directly. The launcher approach is a workaround, which suggests the model did not fully understand how to exploit the binary through the standard pwntools interface.

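"Targeting the binary directly" means handing the payload to the process itself, either on standard input or as an argument, with no intermediate helper program. The sketch below uses the standard library's `subprocess` as a stand-in for pwntools' process interface, purely to illustrate the two delivery paths; the function names are hypothetical and this is not the agent contract's required form.

```python
import subprocess


def deliver_stdin(argv, payload, timeout=10):
    """Send payload on the target's standard input (the stack0 style)."""
    return subprocess.run(argv, input=payload, capture_output=True, timeout=timeout)


def deliver_argv(argv, payload, timeout=10):
    """Pass payload as the final command-line argument (the stack1 style)."""
    return subprocess.run(argv + [payload], capture_output=True, timeout=timeout)
```

Either call returns a completed-process object whose `stdout` can be inspected for the target's success message. Note that an argument cannot contain NUL bytes, which is one reason the delivery path has to match how the target reads its input.
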
**Who:** qwen2.5-coder:14b
**Time:** 30 minutes

---

## Capability Expansion Tasks

### Tier 1: Beginner (WHAT EXISTS)
**Binaries:** stack0, stack1, format0 (Protostar)
**Skills needed:** Basic buffer overflow, format string exploitation
**Time per binary:** ~20-30 minutes with pipeline
**Status:** OPERATIONAL, 3 of 3 complete

### Tier 2: Intermediate (IN BACKLOG)
**Binaries:** heap2, heap3, format1-4, net0-4 (Protostar)
**Skills needed:** Heap grooming, UAF, fastbin dup, format string chaining
**Time per binary:** ~40-60 minutes with pipeline
**What needs building:** None (same pipeline, harder targets)

### Tier 3: Advanced (IN BACKLOG)
**Binaries:** Fusion (Web, HTTP, SQL, etc.; advanced Protostar)
**Skills needed:** ROP chains, ASLR/DEP bypass, heap techniques
**What needs building:** ROP gadget finder integration, libc database lookup
**Time per binary:** ~60-90 minutes with pipeline

### Tier 4: Elite (FUTURE)
**Targets:** Real-world binaries, DLL analysis, kernel modules
**Skills needed:** Full RE, CVE research, kernel exploitation
**What needs building:** Windows VM path, DLL analysis pipeline, kernel debug setup

---

## Product Tiers (Internal Planning)

| Tier | Target | Output | Complexity | Time |
|------|--------|--------|------------|------|
| Beginner | stack/heap/format (Protostar) | analysis + exploit | Easy | 20-30 min |
| Intermediate | Protostar advanced, VWA | analysis + exploit + ROP | Medium | 40-60 min |
| Advanced | real-world binaries | analysis + struct.json + suggested exploit path | Hard | 60-120 min |
| Elite | 0-day research | analysis only (no exploit; model limitations) | Expert | TBD |

---

## Definition of Done

GreySec RED is operational when:
1. re-agent.sh validates struct.json and exits non-zero if missing
2. exploit-writer.sh tests exploit.py against real binary and reports PASS/FAIL
3. gbrain logging is wired and firing after each completed target
4. TIME-LOG is updated after each pipeline run
5. heap0 re-run produces correct struct.json (validated against known values: offset 80, WINNER 0x08048464; re-measure the offset first, since the stock Protostar heap0 places `fp` 72 bytes after `name`)
6. shellcode.bin is generated for at least stack0
7. All 5 target binaries (Protostar stack0, stack1, format0, heap0, plus vuln_test) have PASS in test-results.md
8. MacBook SSH is unblocked and abliterator model is reachable
9. Skill file `greysec-exploit-lab` exists and documents operational procedure
10. At least one intermediate binary (heap2 or format1) has been processed end-to-end and PASS

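Criterion 7 in the list above can be checked by a script rather than by reading each report. The sketch assumes a layout of one directory per target containing `test-results.md`, and treats the word PASS anywhere in the file as success; both are assumptions, as is the name `pass_status`.

```python
import re
from pathlib import Path

TARGETS = ["stack0", "stack1", "format0", "heap0", "vuln_test"]


def pass_status(exploits_dir, targets=TARGETS):
    """Map each target to True if its test-results.md records a PASS verdict."""
    status = {}
    for name in targets:
        results = Path(exploits_dir) / name / "test-results.md"
        # A missing report counts as not passed, which is the failure mode issue #8 describes.
        text = results.read_text(encoding="utf-8") if results.is_file() else ""
        status[name] = re.search(r"\bPASS\b", text) is not None
    return status
```

`all(pass_status("exploits").values())` then gives a single yes/no answer for the criterion, and a target with no report at all shows up as a failure instead of being assumed verified.
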
---

## DEBT (Action Items from This Kanban)

| Action Item | Priority | Status | Notes |
|------------|----------|--------|-------|
| Validate updated scripts on heap0 | CRITICAL | open | Confirm validation gates work |
| Unblock MacBook SSH | CRITICAL | blocked | Adam's decision needed |
| Re-run heap0 with fixed scripts | HIGH | open | After Task 1 |
| Build shellcode.bin generation step | HIGH | open | msfvenom or pwntools asm |
| Rewrite vuln_test exploit.py | MEDIUM | open | Direct pwntools approach |
| Test intermediate binaries (heap2, format1) | MEDIUM | open | Pipeline validation |
| Write greysec-exploit-lab skill | MEDIUM | open | Operational docs |
| Add ROP gadget finder for advanced tier | LOW | backlog | Future |
| Validate pipeline against Windows DLL | LOW | backlog | Future |