Every "AI security tool" demo I've seen does the same dance: paste a vulnerable function in, watch GPT-4 explain that strcpy is unsafe, applaud. Real codebases don't work like that. They have hundreds of files, weird build systems, dead code paths, and vulnerabilities that only matter in context. PatchPilot is my attempt at building something that actually survives contact with a real repo.
The architecture is six specialized agents, not one general one. Each is small, opinionated, and easy to debug — and the LLM only shows up where pattern matching genuinely can't go.
Why six agents instead of one big LLM call
The dirty secret of single-prompt LLM tools is that they hallucinate fixes. They invent functions that don't exist, miss imports, and patch the symptom instead of the cause. Splitting the workflow forces each step to commit to a structured output that the next step can verify.
    [01 DECODE]   apk → smali / web → ast       # parse, normalize
    [02 SCAN]     regex + taint flow            # 8 vuln categories
    [03 CLASSIFY] CWE id + severity + context   # is this real or noise?
    [04 RAG]      vector lookup → CWE / OWASP   # ground the LLM
    [05 PATCH]    LLM + retrieved context       # propose unified diff
    [06 VALIDATE] re-scan patched output        # did we actually fix it?
If the patch agent hallucinates a fix, the validate agent catches it on re-scan. If the classify agent flags a false positive, the RAG agent retrieves no relevant CWE and the pipeline aborts with a "low confidence" verdict instead of generating noise.
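Here's the shape of the orchestration loop, as a minimal sketch with hypothetical agent classes standing in for the real ones:

    # Minimal sketch of the pipeline; agent classes are hypothetical stand-ins.
    def run_pipeline(target):
        decoded = DecodeAgent().run(target)               # 01: apk → smali / web → ast
        findings = ScanAgent().run(decoded)               # 02: regex + taint flow
        for finding in ClassifyAgent().run(findings):     # 03: CWE id + severity + context
            context = RagAgent().retrieve(finding)        # 04: ground the LLM
            if not context:
                finding.verdict = "low confidence"        # nothing relevant retrieved: abort, don't guess
                continue
            patch = PatchAgent().propose(finding, context)  # 05: unified diff
            if ValidateAgent().rescan(patch):             # 06: re-scan the patched output
                finding.verdict = "patched"
            else:
                finding.verdict = "rejected"              # hallucinated or regressive patch dies here
        return findings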
Decode: APKs are the hard part
Web apps are easy — Python, JavaScript, PHP all parse cleanly. Android APKs are not. They're zipped, the bytecode is dex, the resources are binary XML, and the obfuscators have a vested interest in making it all look like garbage.
The decoder uses APKTool to crack the APK back to smali, then walks the smali looking for the dangerous Android API surface — WebView.loadUrl with user input, Cipher.getInstance("DES"), exported activities with no permission gate, hardcoded API keys in strings.xml, the usual horror show.
    import subprocess
    from glob import glob
    from uuid import uuid4

    def decode_apk(path):
        out = f"/tmp/decoded/{uuid4()}"
        # apktool d: dex → smali, binary XML → plain XML; -f clobbers stale output
        subprocess.run(["apktool", "d", path, "-o", out, "-f"], check=True)
        return {
            "smali": glob(f"{out}/smali*/**/*.smali", recursive=True),
            "manifest": parse_manifest(f"{out}/AndroidManifest.xml"),     # exported components, permission gates
            "strings": extract_strings(f"{out}/res/values/strings.xml"),  # hardcoded-secret candidates
            "libs": glob(f"{out}/lib/**/*.so", recursive=True),
        }
RAG: what the LLM sees before it patches
Naive prompting was the first version — "here's a vulnerable function, fix it." The model would happily invent fixes. Adding RAG was the unlock. Before the patch agent runs, the RAG agent retrieves the three most similar CWE descriptions, the canonical fix pattern from OWASP, and any prior patches in the repo's git history that touched similar code.
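The retrieval itself doesn't need a vector database at this scale. A sketch of the lookup, assuming the CWE corpus was embedded offline and Ollama's embeddings endpoint is available (the model name and similarity floor are placeholders):

    import requests
    import numpy as np

    def embed(text, model="nomic-embed-text"):
        # Ollama's embeddings endpoint returns {"embedding": [...]}
        r = requests.post("http://localhost:11434/api/embeddings",
                          json={"model": model, "prompt": text})
        r.raise_for_status()
        return np.array(r.json()["embedding"])

    def top_cwe_matches(snippet, corpus, k=3, min_sim=0.5):
        # corpus: list of (cwe_id, description, embedding) triples, built offline
        q = embed(snippet)
        scored = []
        for cwe_id, desc, vec in corpus:
            sim = float(q @ vec / (np.linalg.norm(q) * np.linalg.norm(vec)))
            scored.append((sim, cwe_id, desc))
        scored.sort(reverse=True)
        # Nothing above the floor means "low confidence" and the pipeline aborts
        return [(cwe, desc) for sim, cwe, desc in scored[:k] if sim >= min_sim]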
I run Mistral 7B locally via Ollama. It's not as smart as GPT-4, but I never have to send my client's source code to OpenAI, the cost is zero, and the latency is predictable. Grounding it with RAG closes most of the capability gap.
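The patch prompt then inlines whatever was retrieved. A sketch against Ollama's generate endpoint (the prompt wording is illustrative):

    import requests

    def propose_patch(snippet, retrieved):
        guidance = "\n\n".join(f"[{cwe}] {desc}" for cwe, desc in retrieved)
        prompt = (
            "Fix the vulnerability below using ONLY the referenced guidance. "
            "Respond with a unified diff and nothing else.\n\n"
            f"Guidance:\n{guidance}\n\nVulnerable code:\n{snippet}\n"
        )
        r = requests.post("http://localhost:11434/api/generate",
                          json={"model": "mistral:7b", "prompt": prompt, "stream": False})
        r.raise_for_status()
        return r.json()["response"]  # candidate diff; step 06 decides if it survives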
Validate: the agent that audits the agent
Every patch goes through the original scanner again before being marked successful. If the same regex still triggers on the patched output, the patch is rejected. If new vulnerabilities appear, the patch is rejected. If the patch fails to apply (line offsets shifted, syntax error), it's rejected.
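As a sketch, with scan() standing in for the step-02 scanner and findings modeled as sets (both assumptions of this sketch, not the real interfaces):

    def validate(patched_source, original_findings, applied_cleanly):
        # Three rejection conditions, checked by the same deterministic
        # scanner that produced the original findings.
        if not applied_cleanly:
            return False, "patch failed to apply"
        new_findings = scan(patched_source)
        if original_findings & new_findings:
            return False, "original finding still triggers"
        if new_findings - original_findings:
            return False, "patch introduced new findings"
        return True, "clean"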
This sounds obvious in retrospect, but the first version didn't have it. About 1 in 5 LLM-generated patches in my early tests introduced new vulnerabilities while fixing the original. Validate-by-re-scan eliminated that class of failure entirely.
The lesson kept being the same one — never trust the LLM to grade its own homework. Every assertion the model makes has to be independently verifiable by something dumber and more deterministic.
The 8 vulnerability categories
| Category | Detection signal |
| --- | --- |
| SQL injection | string concat to query, taint flow |
| Command injection | shell=True, exec/system with input |
| XSS (reflected/stored) | unescaped output, innerHTML sinks |
| Path traversal | user input → file open without canonicalization |
| Hardcoded secrets | entropy + format heuristics on strings |
| Weak crypto | MD5/SHA1/DES/ECB mode |
| SSRF | user-controlled URL in HTTP client |
| Insecure deserialization | pickle.loads, ObjectInputStream, etc. |
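On the source-code side, the patterns are mundane. A few illustrative ones for Python targets (simplified; per the pipeline above, the real scan pairs regexes with the taint-flow check so constant strings don't fire):

    import re

    # Illustrative category patterns, not the shipped rules.
    PATTERNS = {
        "sql_injection":     re.compile(r'\.execute\(\s*(f["\']|["\'][^"\']*["\']\s*[%+])'),
        "command_injection": re.compile(r"subprocess\.\w+\(.*shell\s*=\s*True"),
        "weak_crypto":       re.compile(r"hashlib\.(md5|sha1)\s*\(|\bMODE_ECB\b"),
        "deserialization":   re.compile(r"pickle\.loads?\s*\("),
    }

    def scan_source(text):
        return {name for name, pat in PATTERNS.items() if pat.search(text)}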
What's next
The big missing piece is dataflow analysis across files. Right now each agent works on one file at a time. A real SQL injection might originate in a request handler, flow through three service classes, and only become exploitable in a database utility. I want to bolt on a lightweight tree-sitter-based call graph and let taint flow chase it across the repo.
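To make the idea concrete, here's roughly the shape of the call graph pass, sketched with Python's built-in ast module as a stand-in for the planned tree-sitter version:

    import ast
    from collections import defaultdict

    def build_call_graph(source_files):
        # Maps function name -> names it calls, across files. Name-based
        # resolution only; the tree-sitter version would be language-agnostic
        # and resolve methods and imports properly.
        graph = defaultdict(set)
        for path in source_files:
            with open(path) as f:
                tree = ast.parse(f.read(), filename=path)
            for node in ast.walk(tree):
                if isinstance(node, ast.FunctionDef):
                    for call in ast.walk(node):
                        if isinstance(call, ast.Call) and isinstance(call.func, ast.Name):
                            graph[node.name].add(call.func.id)
        return graph  # taint propagation then chases these edges across the repo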
Also: hooking it into RealTimeDefender for closed-loop detect-and-patch in homelab environments. The attack hits, the IDS tells PatchPilot what kind, PatchPilot scans the candidate code paths and ships a fix to staging.