Methodology

PoC verification pipeline

When a CVE arrives, a disclosure alone is just a claim. Our verification pipeline reproduces the exploit on a clean WordPress install, captures it on video and trace, and has a separate model decide whether the run actually demonstrated impact.

What it is

Verified exploits, not claimed ones

CVE-2026-4365
LearnPress ≤ 4.3.2.8
Verified
02:14
Reproduction
  1. Extract wp_rest nonce from window.lpData on a public quiz page
  2. POST /lp-ajax-handle with lp-load-ajax=delete_question_answer
  3. Confirm target quiz row was deleted via wp db query
Judge verdict: EXPLOITABLE

A CVE disclosure is a text description - "this parameter is unsanitized," "this endpoint lacks a nonce check." Knowing that something is exploitable in the abstract says nothing about whether it has actually been reproduced against a real WordPress install. We treat a CVE as unverified until the pipeline has stood up a clean WordPress install, installed the plugin at the vulnerable version, walked through the exploit with a real browser, and recorded the impact.

The pipeline runs two models in a cascade. A fast, cheap lightweight LLM tries first; if it can't land the exploit cleanly, a stronger frontier LLM picks up in the same environment and gets a second attempt. A model of the same frontier class then reviews the tool-call log and issues an independent verdict, so a successful run isn't marked "failed" by a weaker judge, which is prone to that error when it has to break ties on its own.
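The cascade described above can be sketched in a few lines. This is a minimal illustration, not the actual implementation: the `Attempt` record, the `Executor` and `Judge` signatures, and the `landed` flag are all hypothetical names, and the sketch assumes that escalation is triggered by the executor's own failure to show impact while the judge only ever reads the recorded log.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    tier: str            # "lightweight" or "frontier"
    tool_log: List[str]  # tool calls recorded during the run
    landed: bool         # executor's own view of whether impact was shown


# Hypothetical signatures: an executor runs against an environment id,
# a judge reads only the recorded tool-call log.
Executor = Callable[[str], Attempt]
Judge = Callable[[List[str]], str]


def run_cascade(env_id: str, lightweight: Executor, frontier: Executor,
                judge: Judge) -> Tuple[Attempt, str]:
    """Cheap tier first, frontier tier second on the same environment, then judge."""
    attempt = lightweight(env_id)
    if not attempt.landed:
        # Escalate: the stronger model gets a second attempt on the same env.
        attempt = frontier(env_id)
    # The judge never sees the executor's claim, only the log.
    verdict = judge(attempt.tool_log)
    return attempt, verdict
```

Passing the judge only `tool_log`, and not the `landed` flag, is one way to keep the verdict independent of what the executor believes it achieved.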

The output of a successful verification is an artifact, not a message: a YouTube video of the exploit, a Playwright trace you can step through request by request, a standalone exploit script you can re-run, and the exact vulnerable code snippet with the fix diff, all produced from the same reproduction session.
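One way to picture that artifact is as a single record whose parts all hang off one session. The sketch below is an assumption about shape only: the class name, every field name, and the `session_id` link are hypothetical, not the pipeline's real schema.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class VerificationArtifact:
    cve_id: str
    session_id: str           # hypothetical id tying all parts to one run
    video_url: str
    trace_path: str
    exploit_script_path: str
    vulnerable_snippet: str
    fix_diff: str

    def missing_parts(self) -> List[str]:
        """Names of fields that are empty, i.e. parts the run did not produce."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

A record like this would count as complete only when `missing_parts()` returns an empty list, which matches the idea that a verification ships every piece or is not called verified.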

How it works

Research, reproduce, judge

The pipeline has four phases. Phase 1 plans on the main server. Phases 2–3 run inside an ephemeral VM that is torn down after every task. Phase 4 reviews the tool-call log after the fact, on the main server again, with no ability to mutate the run.
Input
CVE record + patch diff + plugin source
Lightweight research LLM
Reads the diff, writes a structured research plan
Frontier-LLM fallback
Used when the lightweight tier can't produce a usable plan
Ephemeral VM
Fresh snapshot per task, destroyed on completion
Docker WordPress + MariaDB
Plugin installed at exact vulnerable version
Playwright browser
Records video + trace of every action
Lightweight LLM first
Fast + cheap; lands most straightforward PoCs
escalation
Frontier LLM second
Invoked if the lightweight tier can't verify impact
Tool budget: 40 turns
http_request · wp_cli · bash_exec · browser_*
Frontier LLM judge
Same tier as the escalation executor, independent verdict
Rules
Must exploit via vulnerable endpoint; wp_cli only for verification
Lightweight fallback
Used only when the frontier-LLM API key is absent
YouTube video
Unlisted, embedded on the vulnerability page
Playwright trace
Step-through of every request + DOM snapshot
Standalone exploit
Runnable script + vulnerable code + fix diff
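The budget and judge rules in the diagram lend themselves to a mechanical check over the tool-call log. The following is a rough sketch under stated assumptions: one turn is taken to equal one tool call, the `phase` tag on each call is invented for illustration, `browser_goto` and similar names stand in for whatever the `browser_*` tools are really called, and `FRONTIER_LLM_API_KEY` is a made-up variable name.

```python
from dataclasses import dataclass
from typing import List, Mapping

MAX_TURNS = 40  # tool budget stated in the diagram
EXACT_TOOLS = {"http_request", "wp_cli", "bash_exec"}


@dataclass
class ToolCall:
    tool: str   # e.g. "http_request", "wp_cli", "browser_goto"
    phase: str  # "exploit" or "verify"; hypothetical tagging, not from the source


def is_known_tool(tool: str) -> bool:
    """True for the three named tools and anything in the browser_* family."""
    return tool in EXACT_TOOLS or tool.startswith("browser_")


def rule_violations(log: List[ToolCall]) -> List[str]:
    """Return human-readable rule breaches; an empty list means the run is clean."""
    problems = []
    if len(log) > MAX_TURNS:
        problems.append(f"budget exceeded: {len(log)} > {MAX_TURNS} turns")
    for turn, call in enumerate(log, start=1):
        if not is_known_tool(call.tool):
            problems.append(f"turn {turn}: unknown tool {call.tool!r}")
        elif call.tool == "wp_cli" and call.phase != "verify":
            # wp_cli may confirm impact, but must not be the way impact is caused.
            problems.append(f"turn {turn}: wp_cli used outside verification")
    return problems


def judge_tier(env: Mapping[str, str]) -> str:
    """Frontier judge by default; lightweight only when no frontier key is set."""
    # FRONTIER_LLM_API_KEY is a made-up variable name for illustration.
    return "frontier" if env.get("FRONTIER_LLM_API_KEY") else "lightweight"
```

The wp_cli rule matters because a run that deletes a row through WP-CLI proves nothing about the vulnerable endpoint; only the verification step is allowed to touch the database directly.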
Limits

What the pipeline can't verify

complex
Stateful chains
Multi-user, multi-session, cron-gated exploits
skipped
WAF-dependent paths
Exploits that only work on a live target with specific rules
flaky
Time-sensitive
Race conditions, TOCTOU, timing side-channels

An unverified CVE isn't a harmless CVE. The vulnerability page shows both - verified exploits get a video and a green badge, unverified ones still surface every signal the static analyzer can extract.

Stateful exploit chains. A vulnerability that requires three users interacting across two pages and a cron job to land can sometimes be reproduced by a model with a 40-turn budget, but often can't. These fall back to static analysis: we still score the plugin and still flag the CVE; we just don't ship a video.

WAF / hardening dependent. Some exploits are interesting specifically because they evade a popular WAF rule. Reproducing them against a bare Docker-WordPress with no WAF either succeeds trivially (not a useful signal) or fails mysteriously (the WAF isn't there to evade). We don't simulate arbitrary production hardening.

Race conditions. TOCTOU, check-then-use, and timing side-channel exploits reproduce non-deterministically. A single agent run either happens to hit the window or doesn't, and we don't currently re-run enough times to establish statistical confidence. The honest label in those cases is "research_complete, poc_pending".
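To give a sense of why one run is not enough, here is a small worked calculation. It assumes each run hits the race window independently with the same probability p, so the chance of at least one hit in n runs is 1 - (1 - p)^n. The probabilities used are illustrative inputs, not measurements from the pipeline, and `runs_needed` is a hypothetical helper.

```python
import math


def runs_needed(hit_probability: float, confidence: float) -> int:
    """Smallest n with 1 - (1 - p)^n >= confidence, assuming independent runs."""
    if not (0.0 < hit_probability < 1.0):
        raise ValueError("hit_probability must be strictly between 0 and 1")
    if not (0.0 < confidence < 1.0):
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - p)^n <= 1 - confidence for n, then round up.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - hit_probability))


if __name__ == "__main__":
    # A window hit once in ten tries needs 29 runs for 95% confidence.
    print(runs_needed(0.10, 0.95))
```

Under these assumptions a race that lands one time in ten needs 29 independent runs before a clean failure streak becomes meaningful evidence, which is well beyond a single 40-turn agent session.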

Go deeper

Subsystem deep-dive

If you want the file-level detail on how the cascade is actually implemented - substrate, agent loop, judge prompts, budget discipline, evidence capture - the dedicated architecture page covers it in full.

Mika Sipilä
For nerds only

Hi - Mika here. I built WP-Safety solo, so the methodology below is genuinely how it works, not a marketing sketch. The deep-dives are where I go long on the non-obvious details. Strictly optional - the plugin and CVE pages carry the full story without any of this.

Mika Sipilä · Founder, WP-Safety.org

See verified PoCs in action.

Browse the CVE database and filter for verified entries - each one has a video, a Playwright trace, and a standalone reproduction script you can run yourself.