Benchmark context

Datasets and CVSS coverage

A quick primer on the OWASP Benchmark (Java) and the Java CVE Benchmark (165 CVEs), plus a compact donut chart showing how Manual Check detections are distributed across CVSS severities.

OWASP Benchmark v1.2
Synthetic Java test cases

Total

2,740

Vulnerable

1,415

Non-vuln

1,325

The OWASP Benchmark is designed to stress-test the trade-off between false positives (FP) and false negatives (FN). Its F1 score is best read alongside the real-world CVE_R metric used throughout this site.
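To make the FP/FN trade-off concrete, here is a minimal sketch of how an F1 score could be computed against the benchmark totals above. Only the 1,415 / 1,325 split comes from this page; the function names and the example tool counts are hypothetical and are not results reported on this site.

```python
# Totals taken from the OWASP Benchmark v1.2 card above.
VULNERABLE = 1415       # test cases that contain a real vulnerability
NON_VULNERABLE = 1325   # test cases that are safe (any alarm is a false positive)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_for_tool(tp: int, fp: int) -> float:
    """F1 for a hypothetical tool, given its true and false positive counts.

    False negatives are the vulnerable cases the tool did not flag.
    """
    if not (0 <= tp <= VULNERABLE and 0 <= fp <= NON_VULNERABLE):
        raise ValueError("counts exceed the benchmark totals")
    fn = VULNERABLE - tp
    return f1_score(tp, fp, fn)


# Hypothetical extreme: a tool that flags every one of the 2,740 test cases.
flag_everything = f1_for_tool(tp=VULNERABLE, fp=NON_VULNERABLE)
```

In this sketch, a tool that flags every test case reaches perfect recall but a precision of only 1,415 / 2,740, giving an F1 of about 0.68. This is one reason a synthetic F1 score says little on its own and is worth reading next to a real-world metric.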

Java CVE Benchmark
165 real-world CVEs / 37 unique weaknesses

CVEs

165

CWE classes

8

Notes

2

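  • CWE-664 and CWE-707 are comparatively easy to detect, yet many CVEs in these classes are still missed.
  • Breakdown of the 48 detected CVEs (by CWE): CWE-284: 3, CWE-664: 32, CWE-691: 2, CWE-693: 6, CWE-703: 2, CWE-707: 14, CWE-710: 1. These per-class counts sum to 60 rather than 48, which suggests that some CVEs are counted under more than one CWE class.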
CVSS detections (Manual)
Based on the Java CVE Benchmark Manual Check column.

high

13 of 40 CVEs detected

32.5%

medium

32 of 120 CVEs detected

26.7%

low

3 of 5 CVEs detected

60%
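The percentages above follow directly from the detected/total counts. The short sketch below recomputes them and derives the overall detection rate across all severities. The variable and function names are illustrative; the counts are the ones shown on this page.

```python
# (detected, total) per CVSS severity, from the Manual Check figures above.
detections = {
    "high": (13, 40),
    "medium": (32, 120),
    "low": (3, 5),
}


def detection_rate(detected: int, total: int) -> float:
    """Detection rate as a percentage, rounded to one decimal place."""
    return round(100 * detected / total, 1)


# Per-severity rates.
rates = {sev: detection_rate(d, t) for sev, (d, t) in detections.items()}

# Overall rate across the whole Java CVE Benchmark.
total_detected = sum(d for d, _ in detections.values())
total_cves = sum(t for _, t in detections.values())
overall = detection_rate(total_detected, total_cves)

for sev, rate in rates.items():
    print(f"{sev}: {rate}%")
print(f"overall: {total_detected} of {total_cves} CVEs detected ({overall}%)")
```

The totals come to 48 detected out of 165 CVEs, or roughly 29.1% overall. The low-severity rate rests on only five CVEs, so it is far less stable than the high and medium figures.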