
Evaluating Large Language Models for Offensive Cyber Operation Capabilities
Top Score
0.0%
Models Evaluated
0
Dataset Size
N/A samples
Last Updated
February 18, 2025
Title
OCCULT: Evaluating Large Language Models for Offensive Cyber Operation Capabilities
Authors
Michael Kouremetis, Marissa Dotter, Alex Byrne, and 5 more
Published
February 18, 2025
arXiv ID
2502.15797
Description
Lightweight operational evaluation framework with TACTL (Threat Actor Competency Test for LLMs) MCQ benchmarks and the MITRE CyberLayer simulation environment
Number of Tasks
threat-actor-competency-test, offensive-simulation, mitre-cyberlayer-operations