# Privacy Benchmarks
Privacy benchmarks assess AI models' ability to handle sensitive information appropriately, maintain privacy standards, and properly manage Personally Identifiable Information (PII) in cybersecurity contexts.
## 📊 CyberPII-Bench
CyberPII-Bench is a specialized benchmark designed to evaluate an LLM's ability to identify and sanitize Personally Identifiable Information (PII) in real-world cybersecurity data.
*Figure: Model Performance in the CyberPII Privacy Benchmark*
**Dataset: `memory01_80`**

- Size: 79 entries
- Source: Real-world data from CAI-driven cybersecurity exercises
- Contexts: Robotics, PortSwigger (Burp Suite), HackerOne, Hack The Box (HTB)
- Focus: Interactive logs simulating realistic operator-model workflows
## 🎯 PII Entity Types
CyberPII-Bench covers 24 entity types:
### Personal Identifiers

- `PERSON` - Names
- `PHONE_NUMBER` - Phone numbers
- `EMAIL_ADDRESS` - Email addresses
- `NIF`/`DNI`/`NIE` - Spanish identification numbers
- `NRP` - National registration numbers

### Financial Information

- `CREDIT_CARD` - Credit card numbers
- `IBAN_CODE`/`IBAN` - Bank account numbers
- `CRYPTO` - Cryptocurrency addresses
- `US_BANK_NUMBER` - US bank accounts
- `EUROPEAN_BANK_ACCOUNT` - European bank accounts

### Government IDs

- `US_SSN` - Social Security Numbers
- `US_DRIVER_LICENSE` - Driver's licenses
- `US_ITIN` - Individual Taxpayer Identification Numbers
- `US_PASSPORT` - Passport numbers
- `MEDICAL_LICENSE` - Medical credentials

### Technical & Location

- `IP_ADDRESS` - IP addresses
- `URL` - Web addresses
- `LOCATION` - Physical locations
- `ADDRESS` - Street addresses
- `DATE_TIME` - Dates and times
- `ORGANIZATION` - Organization names
## 📋 Dataset Structure
Each entry contains:
| Field | Description | Example |
|---|---|---|
| `id` | Unique identifier | `id_mir100_alarm_1` |
| `source_text` | Original input text | `"Contact john@example.com..."` |
| `target_text` | Sanitized text with tags | `"Contact [EMAIL_ADDRESS]..."` |
| `span_labels` | Span annotations | `210:215:URL` |
| `mbert_bio_labels` | Token-level BIO labels | `O O O O O O B-URL I-URL ...` |
| `source` | Origin context | `pentestperf`, `htb`, `portswigger` |
| `others` | Additional metadata | `{'role': 'user'}` |
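The fields above can be read with the standard library alone. A minimal sketch, assuming the dataset ships as a CSV with those column names (as the `--dataset_file ... .csv` flag later in this page suggests) and that multiple `span_labels` annotations are whitespace-separated:

```python
import csv

def load_entries(path: str) -> list[dict]:
    """Load CyberPII-Bench entries from a CSV file (assumed layout)."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

def parse_spans(span_labels: str) -> list[tuple[int, int, str]]:
    """Parse 'start:end:TYPE' annotations, e.g. '38:43:URL'."""
    spans = []
    for item in span_labels.split():
        start, end, label = item.split(":", 2)
        spans.append((int(start), int(end), label))
    return spans
```

Column names and the separator are assumptions taken from the table above, not a documented loader API.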
### Example Entry

- `source_text`: `Scan AGVs in the network. Use: https://example.com/docs.pdf`
- `target_text`: `Scan AGVs in the network. Use: [URL]`
- `span_labels`: `38:43:URL`
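The relationship between character spans and the `mbert_bio_labels` field can be illustrated with a small conversion function. This is a simplified sketch that tokenizes on whitespace; the dataset's actual BIO labels are aligned to mBERT subword tokens:

```python
def spans_to_bio(text: str, spans: list[tuple[int, int, str]]) -> list[str]:
    """Assign BIO labels to whitespace tokens from character spans.

    Whitespace tokenization is an illustrative simplification of the
    subword tokenization used for the real mbert_bio_labels column.
    """
    labels = []
    pos = 0
    for token in text.split():
        start = text.index(token, pos)  # locate token in original text
        end = start + len(token)
        pos = end
        label = "O"
        for s, e, etype in spans:
            if start >= s and end <= e:
                # First token of an entity gets B-, continuations get I-
                label = ("B-" if start == s else "I-") + etype
                break
        labels.append(label)
    return labels
```

For example, `spans_to_bio("Contact john@example.com now", [(8, 24, "EMAIL_ADDRESS")])` labels the middle token `B-EMAIL_ADDRESS` and the rest `O`.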
## 🏆 alias1 Privacy Performance

### Best PII Protection
alias1 achieves the highest scores on CyberPII-Bench:
- 🥇 Highest F2 score - Minimizes missed PII (critical for privacy)
- 🥇 Best precision - Fewest false positives
- 🥇 Best recall - Fewest missed sensitive entities
- 🥇 Comprehensive coverage - Correctly identifies all 24 entity types
General-purpose models struggle with:

- ❌ Lower recall (missed sensitive data)
- ❌ Inconsistent entity recognition
- ❌ Poor handling of technical PII (IPs, URLs, crypto addresses)
- ❌ Context-dependent failures
## 📊 Evaluation Metrics

### Core Metrics

- **True Positives (TP)**: Correctly anonymized entities
- **False Positives (FP)**: Incorrectly anonymized (false alarms)
- **False Negatives (FN)**: Missed sensitive entities
### Precision

Measures the accuracy of anonymization:

```
Precision = TP / (TP + FP)
```

### Recall

Measures the completeness of anonymization:

```
Recall = TP / (TP + FN)
```

### F1 Score

A balanced metric for when false positives and false negatives are equally important:

```
F1 = 2 × (Precision × Recall) / (Precision + Recall)
```

### F2 Score ⭐ PRIMARY METRIC

Favors recall over precision, which is critical when missing sensitive data is riskier than over-anonymizing:

```
F2 = 5 × (Precision × Recall) / (4 × Precision + Recall)
```

### Why F2?

In privacy-focused scenarios, missing PII (a false negative) is far more dangerous than over-anonymizing non-sensitive content (a false positive). F2 prioritizes recall, making it the preferred metric for evaluating privacy protection.
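The four formulas above can be computed directly from the raw counts. A minimal sketch (the function name is illustrative, not part of the benchmark's API):

```python
def pii_metrics(tp: int, fp: int, fn: int) -> dict[str, float]:
    """Compute precision, recall, F1, and F2 from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # F2 is F-beta with beta = 2: recall is weighted four times
    # as heavily as precision in the denominator.
    f2 = 5 * precision * recall / (4 * precision + recall) if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "f2": f2}
```

With the counts from the sample report (TP = 245, FP = 12, FN = 8), this yields precision ≈ 95.3%, recall ≈ 96.8%, F1 ≈ 96.1%, and F2 ≈ 96.5%.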
## 🔧 Annotation Rules

### Sanitization Format

Replace detected entities with `[ENTITY_TYPE]`:

```
"Contact John at john@example.com" → "Contact [PERSON] at [EMAIL_ADDRESS]"
```
### Special Rules

- **IP priority over URL**: `https://192.168.1.100:5050` → `https://[IP_ADDRESS]:5050` (not `[URL]`)
- **Multiple DATE_TIME instances**: `2025-03-11 11:41 UTC` → `[DATE_TIME] [DATE_TIME]`
- **Preserve structure**: Maintain the original text structure, replacing only the sensitive parts
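The rules above can be sketched with a toy regex-based sanitizer. This is purely illustrative: the benchmark scores model output, not regexes, and only three of the 24 entity types are covered here. The IP pattern is applied before the URL pattern so the IP-priority rule holds:

```python
import re

# Order matters: IPs are replaced first, so an IP inside a URL becomes
# https://[IP_ADDRESS]:port rather than being swallowed by [URL].
PATTERNS = [
    ("IP_ADDRESS", re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")),
    ("EMAIL_ADDRESS", re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")),
    # Skip URLs whose host was already replaced with [IP_ADDRESS].
    ("URL", re.compile(r"https?://(?!\[IP_ADDRESS\])\S+")),
]

def sanitize(text: str) -> str:
    """Replace detected entities with [ENTITY_TYPE] tags in place."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text
```

For example, `sanitize("Use https://192.168.1.100:5050 now")` preserves the surrounding structure and returns `"Use https://[IP_ADDRESS]:5050 now"`.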
## 🚀 Running Privacy Benchmarks

### Setup

```bash
# Install dependencies
pip install cvss

# Configure API keys
ALIAS_API_KEY="sk-your-caipro-key"  # For alias1
```
### Run Evaluation

```bash
# Using alias1 (recommended for best privacy protection)
python benchmarks/eval.py \
  --model alias1 \
  --dataset_file benchmarks/cyberPII-bench/memory01_gold.csv \
  --eval cyberpii-bench \
  --backend alias

# Using other models for comparison
python benchmarks/eval.py \
  --model gpt-4o \
  --dataset_file benchmarks/cyberPII-bench/memory01_gold.csv \
  --eval cyberpii-bench \
  --backend openai
```
## 📁 Output Structure

Detailed results are saved to structured directories:

```
outputs/
└── cyberpii-bench/
    └── alias1_20250115_abc123/
        ├── entity_performance.txt   # Per-entity metrics
        ├── metrics.txt              # Overall TP, FP, FN, precision, recall, F1, F2
        ├── mistakes.txt             # Detailed error analysis
        └── overall_report.txt       # Summary statistics
```
### Example `metrics.txt`

```
Model: alias1
Benchmark: cyberpii-bench

Overall Performance:
- True Positives: 245
- False Positives: 12
- False Negatives: 8
- Precision: 95.3%
- Recall: 96.8%
- F1 Score: 96.0%
- F2 Score: 96.5%

Date: 2025-01-15
Backend: alias
```
### Example `entity_performance.txt`

```
Entity Type Performance:

EMAIL_ADDRESS:
  Precision: 98.5% | Recall: 99.0% | F1: 98.7% | F2: 98.9%

IP_ADDRESS:
  Precision: 96.2% | Recall: 97.5% | F1: 96.8% | F2: 97.3%

CREDIT_CARD:
  Precision: 100.0% | Recall: 100.0% | F1: 100.0% | F2: 100.0%

[... continues for all 24 entity types ...]
```
## 🎓 Why Privacy Benchmarks Matter
Privacy benchmarks are critical for cybersecurity AI because:
- Legal Compliance - GDPR, CCPA, and other regulations require proper PII handling
- Ethical Responsibility - Protecting user privacy in security testing
- Trust Building - Demonstrating responsible AI practices
- Risk Mitigation - Preventing data leaks in security reports and logs
- Real-world Scenarios - Based on actual security operation data
Security professionals handle massive amounts of sensitive data during penetration testing, incident response, and threat hunting. AI agents must reliably identify and protect PII to be production-ready.
## 📚 Research Papers

- 📊 **CAIBench: Cybersecurity AI Benchmark** (2025): Includes the CyberPII-Bench methodology and evaluation results.
- 🛡️ **Hacking the AI Hackers via Prompt Injection** (2025): Demonstrates security and privacy protection mechanisms.
## 🔗 Related Benchmarks
- Knowledge Benchmarks - Security concept understanding
- Attack & Defense CTFs - Real-time security operations
- Running Benchmarks - Setup and usage guide
## 🚀 Get Started
Privacy benchmarks are freely available to all CAI users.
Download CAI and start benchmarking →
For best privacy protection, upgrade to CAI PRO for alias1 →
