Cyber LLM Benchmark Hub Logo
Cyber LLM Benchmark Hub
  • Home
  • Benchmarks
  • Contact
Support
LLM Safety, Jailbreaking, Prompt Injection, Safety Guardrails

CySecBench

Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models

Quick Stats

Top Score

N/A

Models Evaluated

0

Dataset Size

12,662 samples

Last Updated

January 2, 2025

Availability

Dataset ✓, Code ✓
Metrics Tracked
jailbreak success rate, resistance score
Dataset Information

12,662 close-ended prompts across 10 attack-type categories designed to evaluate jailbreaking techniques in the cybersecurity domain

Number of Tasks

3

Jailbreaking Evaluation, Safety Assessment, Prompt Obfuscation Testing
Model Results
Detailed scores for each model evaluated on this benchmark

Verified metadata only

No verified numeric leaderboard or results table from a public primary source has been extracted into the catalog yet. Metadata and source links were refreshed during the 2026-05-12 audit.

Cyber LLM Benchmark Hub

Benchmarking frontier models across cybersecurity tasks.

© 2026 Cyber LLM Benchmark Hub