
Benchmarking Generative Agents for Penetration Testing with 33 vulnerable systems
Top Score
0.0%
Models Evaluated
0
Dataset Size
33 samples
Last Updated
October 28, 2024
Title
AutoPenBench: Benchmarking Generative Agents for Penetration Testing
Authors
Luca Gioacchini, Marco Mellia, Idilio Drago, and 3 more
Published
October 28, 2024
arXiv ID
2410.03225
33 vulnerable systems of increasing difficulty, including in-vitro and real-world scenarios, with generic and specific milestone evaluation
Number of Tasks
vulnerability-exploitation
privilege-escalation
network-penetration
web-exploitation