Research
CAI is built on a strong foundation of peer-reviewed research establishing the field of Cybersecurity AI as a distinct research domain. Our work spans theoretical frameworks, practical implementations, educational initiatives, and rigorous empirical evaluations.
Research Impact & Achievements
Competitions and Challenges
CAI has demonstrated strong performance in real-world security competitions.
Key Research Findings
- Pioneered LLM-powered AI security with PentestGPT, establishing the foundation for the Cybersecurity AI research domain
- 3,600× performance improvement over human penetration testers in standardized CTF benchmark evaluations
- Vulnerabilities with CVSS scores of 4.3-7.5 identified in production systems through automated security assessment
- Democratization of AI-empowered vulnerability research: CAI enables both non-security domain experts and experienced researchers to conduct more efficient vulnerability discovery, expanding the security research community while empowering small and medium enterprises to conduct autonomous security assessments
- Systematic evaluation of large language models across both proprietary and open-weight architectures, revealing substantial gaps between vendor-reported capabilities and empirical cybersecurity performance metrics
- Established a taxonomy of autonomy levels in cybersecurity and clarified the distinction between automation and autonomy in the field
- Collaborative research initiatives with international academic institutions focused on developing cybersecurity education curricula and training methodologies
- Comprehensive defense framework against prompt injection in AI security agents: developed and empirically validated a multi-layered defense system
- Explored the cybersecurity of humanoid robots with CAI, identifying new attack vectors showing how humanoids (a) operate simultaneously as covert surveillance nodes and (b) can be purposed as active cyber operations platforms
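To put the CVSS range quoted in the findings above in context, the sketch below maps a base score to its qualitative severity rating. It assumes the CVSS v3.x rating scale; the source does not say which CVSS version the 4.3-7.5 scores use, and the function name `cvss_v3_rating` is hypothetical rather than part of CAI.

```python
def cvss_v3_rating(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating.

    Assumption: scores follow the CVSS v3.x scale (0.0-10.0, one decimal).
    """
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score <= 3.9:
        return "Low"
    if score <= 6.9:
        return "Medium"
    if score <= 8.9:
        return "High"
    return "Critical"


# The endpoints of the range quoted above.
low_end = cvss_v3_rating(4.3)
high_end = cvss_v3_rating(7.5)
print(low_end, high_end)
```

Under that assumption, the reported findings span the Medium and High bands.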
Research Publications
The Cybersecurity AI research line has produced 8+ papers and technical reports with active research collaborations:
Core Framework & Foundations
| CAI: An Open, Bug Bounty-Ready Cybersecurity AI | The Dangerous Gap Between Automation and Autonomy | CAI Fluency: Educational Framework | Hacking the AI Hackers via Prompt Injection |
|---|---|---|---|
1. CAI: An Open, Bug Bounty-Ready Cybersecurity AI (April 2025)
Authors: V. Mayoral-Vilches et al.
arXiv: 2504.06017
Core framework paper establishing CAI as a lightweight, open-source platform for building AI-powered security tools. Demonstrates a 3,600× performance improvement over manual testing and presents a systematic evaluation across multiple LLMs.
2. Cybersecurity AI: The Dangerous Gap Between Automation and Autonomy (June 2025)
Authors: V. Mayoral-Vilches
arXiv: 2506.23592
Establishes a 6-level taxonomy distinguishing automation from autonomy in Cybersecurity AI systems. The taxonomy is critical for understanding the current capabilities and limitations of AI security tools.
3. CAI Fluency: A Framework for Cybersecurity AI Fluency (August 2025)
Authors: V. Mayoral-Vilches, J. Wachter, C. Chavez, C. Schachner, L.J. Navarrete-Lozano, M. Sanz-Gómez
arXiv: 2508.13588
Comprehensive educational platform for democratizing cybersecurity AI knowledge. Provides structured learning paths for practitioners and researchers.
4. Cybersecurity AI: Hacking the AI Hackers via Prompt Injection (August 2025)
Authors: V. Mayoral-Vilches, P.M. Rynning
arXiv: 2508.21669
Demonstrates prompt injection attacks against AI security tools and presents a four-layer guardrail defense system validated through empirical testing.
Application Domains
| Humanoid Robots as Attack Vectors | The Cybersecurity of a Humanoid Robot | Evaluating Agentic Cybersecurity in Attack/Defense CTFs | CAIBench: Meta-Benchmark for Cybersecurity AI |
|---|---|---|---|
5. Cybersecurity AI: Humanoid Robots as Attack Vectors (September 2025)
Authors: V. Mayoral-Vilches
arXiv: 2509.14139
Systematic security assessment of humanoid robots showing they operate simultaneously as covert surveillance nodes and can be purposed as active cyber operations platforms.
6. Cybersecurity AI: Evaluating Agentic Cybersecurity in Attack/Defense CTFs (October 2025)
Authors: F. Balassone, V. Mayoral-Vilches, S. Rass, M. Pinzger, G. Perrone, S.P. Romano, P. Schartner
arXiv: 2510.17521
Real-world evaluation of AI agents in Attack & Defense CTFs. Shows 54.3% defensive patching success and 28.3% offensive initial access, validating CAI's practical effectiveness.
7. CAIBench: A Meta-Benchmark for Evaluating Cybersecurity AI Agents (October 2025)
Authors: V. Mayoral-Vilches, F. Balassone, L.J. Navarrete-Lozano, M. Sanz-Gómez, M. Crespo-Álvarez, S. Rass, M. Pinzger
arXiv: 2510.24317
Comprehensive meta-benchmark framework for evaluating cybersecurity AI across Jeopardy CTFs, Attack & Defense CTFs, Cyber Ranges, Knowledge tasks, and Privacy benchmarks.
Research Collaborations
CAI benefits from ongoing research collaborations with academic institutions worldwide. Our collaborative research model focuses on:
Current Collaboration Areas
- Benchmark Development: Creating standardized evaluation frameworks for cybersecurity AI
- Educational Initiatives: Developing curricula and training materials for AI security education
- Framework Extensions: Building specialized agents and tools for specific security domains
- Empirical Studies: Conducting large-scale evaluations of AI model capabilities
- Defense Mechanisms: Researching guardrails and safety mechanisms for AI security tools
Academic Partnerships
We provide special support for:
- PhD Research Projects: Long-term collaborations on fundamental research questions
- Academic Benchmarking Studies: Access to CAIBench infrastructure and datasets
- Security Education Initiatives: Course materials, lab environments, and training support
- Open-source Contributions: Integration of research prototypes into production CAI
Call for Research Collaborations
We actively seek research partnerships with academic institutions, research labs, and individual researchers interested in advancing the field of Cybersecurity AI.
Research Opportunities
Interested in Collaborating?
We welcome research collaborations in the following areas:
Core Research Questions:
- Autonomous vs semi-autonomous security testing
- Multi-agent coordination for complex security scenarios
- Evaluation frameworks and benchmarks for AI security capabilities
- Safety and alignment for offensive security AI
- Human-AI collaboration in security operations

Applied Research:
- Domain-specific security agents (cloud, IoT, OT/ICS, robotics)
- Novel tool integration and extension mechanisms
- Real-world case studies and deployments
- Educational frameworks and training methodologies
- Privacy-preserving AI for security testing

Empirical Studies:
- Large-scale comparative evaluations
- Longitudinal studies of AI security tool effectiveness
- User studies and human factors research
- Performance analysis across diverse security domains
Benefits of Collaboration
For Researchers:
- Access to CAI PRO infrastructure and the alias1 model
- Early access to benchmarks and datasets
- Co-authorship opportunities on joint publications
- Direct influence on the CAI development roadmap
- Speaking opportunities at CAI community meetings
For Institutions:
- Educational licenses for teaching and courses
- Custom deployments and infrastructure support
- Integration of student projects into the CAI ecosystem
- Visibility in the growing CAI research community
Get in Touch
Interested in research collaboration? We'd love to hear from you!
Contact: research@aliasrobotics.com
Please include:
- Your research interests and proposed collaboration areas
- Institutional affiliation (if applicable)
- Relevant publications or projects
- Specific resources or support needed
We typically respond within 48 hours and can schedule an initial discussion call to explore collaboration opportunities.
Citation
If you use CAI in your research, please cite our work (ordered by publication date):
@article{mayoral2025cai,
title={CAI: An Open, Bug Bounty-Ready Cybersecurity AI},
author={Mayoral-Vilches, V{\'\i}ctor and Navarrete-Lozano, Luis Javier and Sanz-G{\'o}mez, Mar{\'\i}a and Espejo, Lidia Salas and Crespo-{\'A}lvarez, Marti{\~n}o and Oca-Gonzalez, Francisco and Balassone, Francesco and Glera-Pic{\'o}n, Alfonso and Ayucar-Carbajo, Unai and Ruiz-Alcalde, Jon Ander and Rass, Stefan and Pinzger, Martin and Gil-Uriarte, Endika},
journal={arXiv preprint arXiv:2504.06017},
year={2025}
}
@article{mayoral2025automation,
title={Cybersecurity AI: The Dangerous Gap Between Automation and Autonomy},
author={Mayoral-Vilches, V{\'\i}ctor},
journal={arXiv preprint arXiv:2506.23592},
year={2025}
}
@article{mayoral2025fluency,
title={CAI Fluency: A Framework for Cybersecurity AI Fluency},
author={Mayoral-Vilches, V{\'\i}ctor and Wachter, Jasmin and Chavez, Crist{\'o}bal RJ and Schachner, Cathrin and Navarrete-Lozano, Luis Javier and Sanz-G{\'o}mez, Mar{\'\i}a},
journal={arXiv preprint arXiv:2508.13588},
year={2025}
}
@article{mayoral2025hacking,
title={Cybersecurity AI: Hacking the AI Hackers via Prompt Injection},
author={Mayoral-Vilches, V{\'\i}ctor and Rynning, Per Mannermaa},
journal={arXiv preprint arXiv:2508.21669},
year={2025}
}
@article{mayoral2025humanoid,
title={Cybersecurity AI: Humanoid Robots as Attack Vectors},
author={Mayoral-Vilches, V{\'\i}ctor},
journal={arXiv preprint arXiv:2509.14139},
year={2025}
}
@article{balassone2025evaluation,
title={Cybersecurity AI: Evaluating Agentic Cybersecurity in Attack/Defense CTFs},
author={Balassone, Francesco and Mayoral-Vilches, V{\'\i}ctor and Rass, Stefan and Pinzger, Martin and Perrone, Gaetano and Romano, Simon Pietro and Schartner, Peter},
journal={arXiv preprint arXiv:2510.17521},
year={2025}
}
@article{mayoral2025caibench,
title={CAIBench: A Meta-Benchmark for Evaluating Cybersecurity AI Agents},
author={Mayoral-Vilches, V{\'\i}ctor and Balassone, Francesco and Navarrete-Lozano, Luis Javier and Sanz-G{\'o}mez, Mar{\'\i}a and Crespo-{\'A}lvarez, Marti{\~n}o and Rass, Stefan and Pinzger, Martin},
journal={arXiv preprint arXiv:2510.24317},
year={2025}
}
Additional Resources
- Complete Research Library - All 24+ peer-reviewed publications
- CAIBench Benchmarks - Comprehensive evaluation framework
- Competition Results - CTF and hackathon achievements
- CAI Fluency - Educational materials and tutorials
- GitHub Repository - Source code and examples
Join the Cybersecurity AI research community. Let's advance the state of the art together!