Methodology

Security score methodology

Every plugin and theme in the database carries a 0–100 score. Here's exactly what goes into it, how the weights are assigned, and what the number does and doesn't tell you.

What it is

A single number, multiple signals

87 / 100

Example deductions:
  • Patched high CVE (2024): −6
  • Raw SQL in 2 files: −4
  • Slow patch velocity: −3

Illustrative - real plugin pages show every deduction with a link back to the source evidence.

Security is a distribution, not a boolean - but users have to make a yes/no install decision. The score collapses the distribution into one reviewable number so the decision can happen quickly, while the underlying signals stay visible on the plugin's page for anyone who wants to audit how the number was produced.

Higher is safer. A fresh plugin with zero CVEs, strong escaping patterns in its code, and an active maintainer scores in the 90s. A plugin with multiple unpatched critical CVEs, weak sanitization at dozens of sinks, and an absent maintainer scores in the teens.

The score is recomputed deterministically from inputs that are themselves publicly auditable: the CVE feed, the plugin's own source code from WordPress.org's SVN, our AST analyzer, and our CommonCrawl-derived deployment observations. Given the same inputs, the pipeline produces the same score - no manual overrides, no human fingers on the scale.
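One way to make that auditability concrete is to publish a canonical hash of the input snapshot next to each score, so anyone can re-run the pipeline on the same inputs and verify the result. This is an illustrative sketch, not the site's actual scheme - the field names and the use of SHA-256 over canonical JSON are assumptions.

```python
import hashlib
import json

def snapshot_digest(inputs: dict) -> str:
    """Canonical hash of the input bundle (CVE records, AST findings,
    deployment counts). Same inputs always produce the same digest, so a
    published digest pins down exactly what a score was computed from.
    Hypothetical scheme - field names below are illustrative."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

inputs = {
    "cves": [{"id": "CVE-2024-0001", "cvss": 9.1, "patched": False}],
    "active_installs": 40000,
}
# Re-serializing the same data yields the same digest - determinism is checkable.
assert snapshot_digest(inputs) == snapshot_digest(dict(inputs))
```

Sorting keys and fixing separators matters: two semantically identical JSON dumps with different key order would otherwise hash differently.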

How it's computed

From raw signals to a single number

Four signal families flow into a deterministic aggregator, which an LLM annotates with human-readable deductions. The score itself is always reproducible from the signals.
Inputs - four signal families

  • CVE feed (Wordfence Intelligence, NVD, PatchStack Core)
      • Severity mix: counts per Critical / High / Medium / Low (up to −20)
      • Patch status: patched vs unpatched, days-to-patch (up to −15)
  • Code analysis
      • AST taint flows: sources → sinks, with sanitizer coverage
      • Code signals: dangerous functions, raw SQL, output escaping
      • Attack surface: AJAX / REST / hooks, nonce + cap checks
  • Developer track record
      • Patch velocity: median time-to-patch across this dev's plugins (up to −10)
      • Maintenance: recency of last release, abandonment heuristics
      • Historical CVE rate: CVEs per year across their portfolio
  • Deployment exposure
      • Install count: WP.org active_installs
      • Version distribution: % of live installs on vulnerable vs fixed versions
      • Hosting mix: from our CommonCrawl corpus

Aggregator - deterministic

  • Normalize: each signal → 0–1 scalar; missing data fails closed
  • Weight: severity × recency decay × exposure
  • Sum & clamp: subtract weighted deductions from 100, floor at 0

Explainer - small LLM

  • Reads the numeric deductions + source evidence
  • Generates prose: a human-readable "why" for each deduction
  • Never alters the number: score and deductions are computed before the LLM is called

Outputs

  • Score (0–100): displayed on the plugin / theme page
  • Deductions list: every point lost, with a link back to the evidence
  • Reproducible: same inputs → same score, always
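The normalize → weight → sum-and-clamp stages can be sketched as a pure function. This is a minimal illustration of the shape of the aggregator, not the production code - the `Signal` record, its field names, and the multiplicative weighting are assumptions drawn from the stage descriptions above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Signal:
    """Hypothetical per-signal record; field names are illustrative."""
    name: str
    raw: Optional[float]   # None means the signal could not be measured
    max_deduction: float   # cap for this signal, e.g. 20.0 for severity mix
    severity: float        # 0-1
    recency: float         # 0-1 decay factor (recent issues count more)
    exposure: float        # 0-1, e.g. scaled from install count

def normalize(sig: Signal) -> float:
    """Map a raw signal to [0, 1]; missing data fails closed (worst case)."""
    if sig.raw is None:
        return 1.0  # fail closed: an unmeasurable signal is treated as fully bad
    return min(max(sig.raw, 0.0), 1.0)

def deduction(sig: Signal) -> float:
    """Weight = severity x recency decay x exposure, scaled to the signal's cap."""
    weight = sig.severity * sig.recency * sig.exposure
    return normalize(sig) * weight * sig.max_deduction

def score(signals: list) -> float:
    """Start at 100, subtract weighted deductions, floor at 0."""
    return max(0.0, 100.0 - sum(deduction(s) for s in signals))

# An unpatched critical CVE at full weight costs its full cap of 20 points:
cve = Signal("unpatched_critical_cve", 1.0, 20.0, 1.0, 1.0, 1.0)
print(score([cve]))  # 80.0
```

Because `score` is a pure function of its inputs, the reproducibility claim above falls out for free: no randomness, no clock reads, no manual overrides.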
The actual weights

What each deduction costs

The aggregator starts at 100 and subtracts these ranges based on the evidence. Ranges (rather than fixed points) let the aggregator distinguish a single moderate issue from a pattern of them.

Signal                      | Deduction range | When it applies
Unpatched critical CVE      | −15 to −20      | Any CVE with CVSS ≥ 9.0 that is not yet fixed upstream.
Patched critical CVE        | −4 to −8        | Decays with time since patch; recent fixes count more.
Unpatched high / medium CVE | −8 to −12       | CVSS 4.0–8.9 without a fix available.
Critical taint flow         | −10 to −15      | AST-derived sink reachable from an unauthenticated source.
Raw SQL queries             | −5 to −10       | String-concatenated queries bypassing $wpdb->prepare().
Missing nonce / cap checks  | −5 to −10       | AJAX or REST handlers with no wp_verify_nonce / current_user_can.
Unescaped output            | −3 to −8        | echo $var without an esc_* wrapper at >N locations.
Abandoned maintenance       | −5 to −10       | No release in 18+ months, no response to disclosures.
Developer trust drag        | −3 to −8        | Same author has slow patch velocity on other plugins.
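Two mechanics in the table can be made concrete: picking a point inside a range based on how pervasive the evidence is, and decaying a patched CVE's cost over time. The linear interpolation and the exponential half-life below are assumed curves for illustration - the source only says ranges exist and that patched criticals decay.

```python
def range_deduction(lo: float, hi: float, extent: float) -> float:
    """Interpolate within a deduction range: extent 0.0 is a single mild
    instance (costs lo), extent 1.0 is a pervasive pattern (costs hi)."""
    extent = min(max(extent, 0.0), 1.0)
    return lo + (hi - lo) * extent

def patched_cve_deduction(days_since_patch: float,
                          lo: float = 4.0, hi: float = 8.0,
                          half_life_days: float = 180.0) -> float:
    """Patched criticals decay with time: a fix from last week costs close
    to hi, a fix from years ago approaches lo. The 180-day half-life is an
    assumption, not the published constant."""
    decay = 0.5 ** (days_since_patch / half_life_days)
    return lo + (hi - lo) * decay

# e.g. 12 unescaped-output locations against an assumed threshold of 20
# lands 60% of the way into the −3 to −8 range:
print(range_deduction(3.0, 8.0, 12 / 20))   # 6.0
print(patched_cve_deduction(0))             # 8.0 - patched today, full cost
print(patched_cve_deduction(180))           # 6.0 - one half-life later
```

Ranges plus interpolation are what let the aggregator distinguish "one sloppy echo" from "unescaped output everywhere" without inventing new signal categories.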
Limitations

What the score can't see

  • Zero-days (no signal): undisclosed CVEs can't be counted.
  • Your configuration (varies): plugin × your WP setup interactions.
  • Supply chain (work in progress): bundled libs surfaced, not yet scored.

The score is a prior, not a verdict. Always cross-check the plugin page's attack surface and bundled-library list before installing on critical infrastructure.

Zero-days by definition. Vulnerabilities that haven't been disclosed don't move the score. A plugin can score 95 today and 40 tomorrow when a critical CVE lands - the score reflects what's known, not what's hidden.

Your specific configuration. A plugin with an AJAX handler that's dangerous only when combined with a rare WordPress setting might score fine in aggregate and still be catastrophic on your site. The plugin page's attack surface map shows the raw entry points; use the score as a prior, not a verdict.

Supply-chain risk. Bundled libraries are surfaced on the plugin page but not directly deducted in the score. A plugin shipping an outdated copy of a common library can be a real risk even when the plugin's own code is clean. I'm working on folding this into the score.

Go deeper

How the signals actually get produced

The scoring engine is the end of the pipeline; the interesting work happens in the stages that produce the signals feeding it. The deep-dive below unpacks the AST taint analyzer that provides the deterministic code-security signals.
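As a taste of what that deep-dive covers, here is a deliberately toy version of source-to-sink matching: flag superglobal input reaching `echo` without an `esc_*` wrapper. The real analyzer walks an AST and tracks assignments across statements; this regex sketch only illustrates the source → sink → sanitizer idea and would miss anything indirect.

```python
import re

# Toy taint check over PHP text. Assumptions: sources are superglobals,
# the only sink is echo, and any esc_* call counts as a sanitizer.
SOURCE    = re.compile(r'\$_(GET|POST|REQUEST)\[')
SINK      = re.compile(r'echo\s+(.+?);')
SANITIZER = re.compile(r'\besc_\w+\s*\(')

def unsanitized_flows(php: str) -> list:
    """Return lines where a tainted source reaches echo unescaped."""
    flows = []
    for line in php.splitlines():
        sink = SINK.search(line)
        if not sink:
            continue
        expr = sink.group(1)  # the expression being echoed
        if SOURCE.search(expr) and not SANITIZER.search(expr):
            flows.append(line.strip())
    return flows

php = '''
echo esc_html( $_GET["name"] );   // sanitized: no finding
echo $_GET["name"];               // tainted source reaches echo unescaped
'''
print(unsanitized_flows(php))  # flags only the unescaped line
```

The gap between this sketch and a real analyzer - assignments, function boundaries, conditional sanitization - is exactly what the deep-dive is about.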

For nerds only

Hi - Mika here. I built WP-Safety solo, so the methodology on this page is genuinely how it works, not a marketing sketch. The deep-dives are where I go long on the non-obvious details. Strictly optional - the plugin and CVE pages carry the full story without any of this.

Mika Sipilä · Founder, WP-Safety.org

See the score in action.

Browse any plugin's page to see its score, every deduction that went into it, and the raw evidence behind each one.