Security score methodology
Every plugin and theme in the database carries a 0–100 score. Here's exactly what goes into it, how the weights are assigned, and what the number does and doesn't tell you.
A single number, multiple signals
- Patched high CVE (2024): −6
- Raw SQL in 2 files: −4
- Slow patch velocity: −3
Security is a distribution, not a boolean - but users have to make a yes/no install decision. The score collapses the distribution into one reviewable number so the decision can happen quickly, while the underlying signals stay visible on the plugin's page for anyone who wants to audit how the number was produced.
Higher is safer. A fresh plugin with zero CVEs, strong escaping patterns in its code, and an active maintainer scores in the 90s. A plugin with multiple unpatched critical CVEs, weak sanitization at dozens of sinks, and an absent maintainer scores in the teens.
The score is recomputed deterministically from inputs that are themselves publicly auditable: the CVE feed, the plugin's own source code from WordPress.org's SVN, our AST analyzer, and our CommonCrawl-derived deployment observations. Given the same inputs, the pipeline produces the same score - no manual overrides, no human fingers on the scale.
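To make "same inputs, same score" concrete, here is a minimal sketch of what a deterministic recompute can look like. It is not the production pipeline: the signal dictionaries, the `input_fingerprint` helper, and the clamp at 0 are assumptions made for illustration. The deductions are the three example values shown earlier on this page (6, 4, and 3 points).

```python
import hashlib
import json


def input_fingerprint(signals):
    """Stable SHA-256 over the signals, independent of the order they arrive in.

    Hypothetical helper: it only illustrates that identical inputs can be
    identified and must therefore map to an identical score.
    """
    ordered = sorted(signals, key=lambda s: (s["name"], s["deduction"]))
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def score(signals):
    """Start at 100 and subtract every deduction; clamping at 0 is an assumption."""
    total = sum(s["deduction"] for s in signals)
    return max(0, 100 - total)


example_signals = [
    {"name": "Patched high CVE (2024)", "deduction": 6},
    {"name": "Raw SQL in 2 files", "deduction": 4},
    {"name": "Slow patch velocity", "deduction": 3},
]

example_score = score(example_signals)  # 100 - 6 - 4 - 3
```

Because `score` is a pure function of its inputs and the fingerprint ignores ordering, re-running it on the same evidence cannot drift, which is the property the paragraph above describes.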
From raw signals to a single number
What each deduction costs
The aggregator starts at 100 and, for each signal that applies, subtracts a deduction taken from the range below, sized by the evidence. Ranges (rather than fixed points) let the aggregator distinguish a single moderate issue from a pattern of them.
| Signal | Deduction range | When it applies |
|---|---|---|
| Unpatched critical CVE | −15 to −20 | Any CVE with CVSS ≥ 9.0 that is not yet fixed upstream. |
| Patched critical CVE | −4 to −8 | Decays with time since patch; recent fixes count more. |
| Unpatched high / medium | −8 to −12 | CVSS 4.0–8.9 without a fix available. |
| Critical taint flow | −10 to −15 | AST-derived sink reachable from an unauthenticated source. |
| Raw SQL queries | −5 to −10 | String-concatenated queries bypassing $wpdb->prepare(). |
| Missing nonce / cap checks | −5 to −10 | AJAX or REST handlers with no wp_verify_nonce() / current_user_can() check. |
| Unescaped output | −3 to −8 | echo $var without an esc_* wrapper at >N locations. |
| Abandoned maintenance | −5 to −10 | No release in 18+ months, no response to disclosures. |
| Developer trust drag | −3 to −8 | Same author has slow patch velocity on other plugins. |
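
The table can be read as a lookup from signal to a (smallest, largest) deduction. The sketch below shows one way to pick a point inside each range. The ranges are copied from the table; everything else is an assumption for illustration: the `Finding` type, the 0.0-1.0 `intensity` value, the linear interpolation, the 180-day half-life for patched CVEs, and the rounding and clamping of the final score. The real aggregator may size deductions differently.

```python
from dataclasses import dataclass

# (smallest, largest) deduction in points, copied from the table above.
DEDUCTION_RANGES = {
    "unpatched_critical_cve": (15, 20),
    "patched_critical_cve": (4, 8),
    "unpatched_high_medium_cve": (8, 12),
    "critical_taint_flow": (10, 15),
    "raw_sql": (5, 10),
    "missing_nonce_cap_check": (5, 10),
    "unescaped_output": (3, 8),
    "abandoned_maintenance": (5, 10),
    "developer_trust_drag": (3, 8),
}


@dataclass(frozen=True)
class Finding:
    signal: str
    intensity: float  # hypothetical: 0.0 = single mild case, 1.0 = strong pattern


def deduction(finding):
    """Linearly interpolate inside the signal's range (assumed, not confirmed)."""
    low, high = DEDUCTION_RANGES[finding.signal]
    t = min(1.0, max(0.0, finding.intensity))
    return low + (high - low) * t


def patched_cve_intensity(days_since_patch, half_life_days=180.0):
    """Recent fixes count more: intensity halves every half_life_days (assumed)."""
    return 0.5 ** (max(0.0, days_since_patch) / half_life_days)


def aggregate(findings):
    """Start at 100, subtract each deduction, keep the result in 0-100."""
    total = sum(deduction(f) for f in findings)
    return max(0, round(100 - total))
```

For example, `aggregate([Finding("patched_critical_cve", patched_cve_intensity(180))])` subtracts 6 points under these assumptions, while the same CVE patched today would subtract the full 8.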
What the score can't see
The score is a prior, not a verdict. Always cross-check the plugin page's attack surface and bundled-library list before installing on critical infrastructure.
Zero-days by definition. Vulnerabilities that haven't been disclosed don't move the score. A plugin can score 95 today and 40 tomorrow when a critical CVE lands - the score reflects what's known, not what's hidden.
Your specific configuration. A plugin with an AJAX handler that's dangerous only when combined with a rare WordPress setting might score fine in aggregate and still be catastrophic on your site. The plugin page's attack surface map shows the raw entry points; check them against your own setup rather than relying on the aggregate number.
Supply-chain risk. Bundled libraries are surfaced on the plugin page but not directly deducted in the score. A plugin shipping an outdated copy of a common library can be a real risk even when the plugin's own code is clean. I'm working on folding this into the score.
How the signals actually get produced
The scoring engine is the end of the pipeline; the interesting work happens in the stages that produce the signals feeding it. The deep-dive below unpacks the AST taint analyzer that provides the deterministic code-security signals.

Hi - Mika here. I built WP-Safety solo, so the methodology below is genuinely how it works, not a marketing sketch. The deep-dives are where I go long on the non-obvious details. Strictly optional - the plugin and CVE pages carry the full story without any of this.
Taint analysis
AST-level inter-procedural data-flow tracking across 7 superglobal sources, 31 sinks, and 47 sanitizers. The two-phase algorithm, the WordPress-specific special cases (prepare, array_map, nonce/capability guards), and the edge cases the analyzer intentionally leaves unchased.
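For readers new to the idea, here is a deliberately tiny sketch of source-to-sink taint tracking. It is not the analyzer described above: it walks a flat, hypothetical list of statements instead of a PHP AST, it is intra-procedural, it treats every sanitizer as valid for every sink, and its source, sink, and sanitizer sets are small illustrative subsets rather than the full 7 / 31 / 47.

```python
# Hypothetical toy statement forms (not the analyzer's real representation):
#   ("assign", target, source_var)
#   ("call_assign", target, func, arg_var)
#   ("sink", func, arg_var)
SOURCES = {"$_GET", "$_POST", "$_REQUEST", "$_COOKIE"}  # illustrative subset
SANITIZERS = {"esc_html", "absint", "sanitize_text_field"}  # illustrative subset
SINKS = {"echo", "$wpdb->query"}  # illustrative subset


def find_tainted_sinks(statements):
    """Return (index, sink, variable) for each sink that receives tainted data."""
    tainted = set(SOURCES)
    hits = []
    for index, stmt in enumerate(statements):
        kind = stmt[0]
        if kind == "assign":
            _, target, source = stmt
            # Taint follows plain assignment; a clean value overwrites old taint.
            if source in tainted:
                tainted.add(target)
            else:
                tainted.discard(target)
        elif kind == "call_assign":
            _, target, func, arg = stmt
            # A sanitizer call yields a clean value; any other call passes taint on.
            if arg in tainted and func not in SANITIZERS:
                tainted.add(target)
            else:
                tainted.discard(target)
        elif kind == "sink":
            _, func, arg = stmt
            if func in SINKS and arg in tainted:
                hits.append((index, func, arg))
    return hits


program = [
    ("assign", "$name", "$_GET"),
    ("sink", "echo", "$name"),  # tainted: flagged
    ("call_assign", "$safe", "esc_html", "$name"),
    ("sink", "echo", "$safe"),  # sanitized: not flagged
]
flagged = find_tainted_sinks(program)
```

Here only the first `echo` is reported, because the second one receives the value after it has passed through a sanitizer.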
PoC agent cascade
The autonomous-agent pipeline that produces verified CVE reproductions. Lightweight-LLM → frontier-LLM cascade, ephemeral WordPress substrate, independent frontier judge, Playwright evidence bundle, and the budgets that keep everything honest.
See the score in action.
Browse any plugin's page to see its score, every deduction that went into it, and the raw evidence behind each one.