Research

CAI is built on a strong foundation of peer-reviewed research establishing the field of Cybersecurity AI as a distinct research domain. Our work spans theoretical frameworks, practical implementations, educational initiatives, and rigorous empirical evaluations.

📊 Research Impact & Achievements

🏆 Competitions and Challenges

CAI has demonstrated exceptional performance in real-world security competitions:


📈 Key Research Findings

Pioneered LLM-powered AI Security with PentestGPT, establishing the foundation for the Cybersecurity AI research domain
Up to 3,600× faster than human penetration testers on specific tasks in standardized CTF benchmark evaluations
CVSS 4.3-7.5 severity vulnerabilities identified in production systems through automated security assessment
Democratization of AI-empowered vulnerability research: CAI enables both non-security domain experts and experienced researchers to conduct more efficient vulnerability discovery, expanding the security research community while empowering small and medium enterprises to conduct autonomous security assessments
Systematic evaluation of large language models across both proprietary and open-weight architectures, revealing substantial gaps between vendor-reported capabilities and empirical cybersecurity performance metrics
Established a 6-level taxonomy of autonomy in cybersecurity and clarified the distinction between automation and autonomy in the field
Collaborative research initiatives with international academic institutions focused on developing cybersecurity education curricula and training methodologies
Comprehensive defense framework against prompt injection in AI security agents: developed and empirically validated a multi-layered defense system
Explored the cybersecurity of humanoid robots with CAI, identifying new attack vectors and showing how humanoids (a) operate simultaneously as covert surveillance nodes and (b) can be purposed as active cyber operations platforms
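
To put the CVSS 4.3-7.5 range listed above in context, here is a minimal sketch that maps a base score to its qualitative severity rating. It assumes the CVSS v3.x rating scale (the findings above do not state which CVSS version was used), and the helper name `cvss_severity` is illustrative, not part of CAI.

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score (0.0-10.0) to its qualitative rating.

    Assumption: the v3.x scale applies; the source does not name a version.
    """
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS scores range from 0.0 to 10.0, got {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# The two endpoints of the range reported in the findings.
for score in (4.3, 7.5):
    print(score, cvss_severity(score))
```

Under that assumption, the reported findings span the Medium and High severity bands.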

📚 Research Publications

The Cybersecurity AI research line has produced 8+ papers and technical reports, alongside active research collaborations:

Core Framework & Foundations

CAI: An Open, Bug Bounty-Ready Cybersecurity AI | The Dangerous Gap Between Automation and Autonomy | CAI Fluency: Educational Framework | Hacking the AI Hackers via Prompt Injection

1. CAI: An Open, Bug Bounty-Ready Cybersecurity AI (April 2025)

Authors: V. Mayoral-Vilches et al. arXiv: 2504.06017

Core framework paper establishing CAI as a lightweight, open-source platform for building AI-powered security tools. Reports task completion up to 3,600× faster than manual testing and presents a systematic evaluation across multiple LLMs.
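
For a sense of scale, the sketch below treats the 3,600× figure as a ratio of completion times, which is an assumption about how it is measured. The timings are illustrative placeholders rather than data from the paper, and `speedup` is a hypothetical helper.

```python
def speedup(human_seconds: float, agent_seconds: float) -> float:
    """Return how many times faster the agent is than the human baseline."""
    if human_seconds <= 0 or agent_seconds <= 0:
        raise ValueError("durations must be positive")
    return human_seconds / agent_seconds


# Illustrative only: a task that takes a human one hour and an agent one second.
print(speedup(human_seconds=3600, agent_seconds=1))  # 3600.0
```

In other words, a factor of 3,600 corresponds to compressing an hour of manual work into a single second on that task.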

2. Cybersecurity AI: The Dangerous Gap Between Automation and Autonomy (June 2025)

Authors: V. Mayoral-Vilches arXiv: 2506.23592

Establishes a 6-level taxonomy distinguishing automation from autonomy in Cybersecurity AI systems. Critical for understanding the current capabilities and limitations of AI security tools.

3. CAI Fluency: A Framework for Cybersecurity AI Fluency (August 2025)

Authors: V. Mayoral-Vilches, J. Wachter, C. Chavez, C. Schachner, L.J. Navarrete-Lozano, M. Sanz-Gómez arXiv: 2508.13588

Comprehensive educational platform for democratizing cybersecurity AI knowledge. Provides structured learning paths for practitioners and researchers.

4. Cybersecurity AI: Hacking the AI Hackers via Prompt Injection (August 2025)

Authors: V. Mayoral-Vilches, P.M. Rynning arXiv: 2508.21669

Demonstrates prompt injection attacks against AI security tools and presents a four-layer guardrail defense system validated through empirical testing.

Application Domains

Humanoid Robots as Attack Vectors | The Cybersecurity of a Humanoid Robot | Evaluating Agentic Cybersecurity in Attack/Defense CTFs | CAIBench: Meta-Benchmark for Cybersecurity AI

5. Cybersecurity AI: Humanoid Robots as Attack Vectors (September 2025)

Authors: V. Mayoral-Vilches arXiv: 2509.14139

Systematic security assessment of humanoid robots showing they operate simultaneously as covert surveillance nodes and can be purposed as active cyber operations platforms.

6. Cybersecurity AI: Evaluating Agentic Cybersecurity in Attack/Defense CTFs (October 2025)

Authors: F. Balassone, V. Mayoral-Vilches, S. Rass, M. Pinzger, G. Perrone, S.P. Romano, P. Schartner arXiv: 2510.17521

Real-world evaluation of AI agents in Attack & Defense CTFs. Shows 54.3% defensive patching success and 28.3% offensive initial access, validating CAI's practical effectiveness.

7. CAIBench: A Meta-Benchmark for Evaluating Cybersecurity AI Agents (October 2025)

Authors: V. Mayoral-Vilches, F. Balassone, L.J. Navarrete-Lozano, M. Sanz-Gómez, M. Crespo-Álvarez, S. Rass, M. Pinzger arXiv: 2510.24317

Comprehensive meta-benchmark framework for evaluating cybersecurity AI across Jeopardy CTFs, Attack & Defense CTFs, Cyber Ranges, Knowledge tasks, and Privacy benchmarks.

🎓 Research Collaborations

CAI benefits from ongoing research collaborations with academic institutions worldwide. Our collaborative research model focuses on:

Current Collaboration Areas

🔬 Benchmark Development: Creating standardized evaluation frameworks for cybersecurity AI
🎓 Educational Initiatives: Developing curricula and training materials for AI security education
🏗️ Framework Extensions: Building specialized agents and tools for specific security domains
📊 Empirical Studies: Conducting large-scale evaluations of AI model capabilities
🛡️ Defense Mechanisms: Researching guardrails and safety mechanisms for AI security tools

Academic Partnerships

We provide special support for:
- ✅ PhD Research Projects: Long-term collaborations on fundamental research questions
- ✅ Academic Benchmarking Studies: Access to CAIBench infrastructure and datasets
- ✅ Security Education Initiatives: Course materials, lab environments, and training support
- ✅ Open-source Contributions: Integration of research prototypes into production CAI

🤝 Call for Research Collaborations

We actively seek research partnerships with academic institutions, research labs, and individual researchers interested in advancing the field of Cybersecurity AI.

Research Opportunities

Interested in Collaborating?

We welcome research collaborations in the following areas:

🔍 Core Research Questions:
- Autonomous vs semi-autonomous security testing
- Multi-agent coordination for complex security scenarios
- Evaluation frameworks and benchmarks for AI security capabilities
- Safety and alignment for offensive security AI
- Human-AI collaboration in security operations

🛠️ Applied Research:
- Domain-specific security agents (cloud, IoT, OT/ICS, robotics)
- Novel tool integration and extension mechanisms
- Real-world case studies and deployments
- Educational frameworks and training methodologies
- Privacy-preserving AI for security testing

📊 Empirical Studies:
- Large-scale comparative evaluations
- Longitudinal studies of AI security tool effectiveness
- User studies and human factors research
- Performance analysis across diverse security domains

Benefits of Collaboration

For Researchers:
- 🔓 Access to CAI PRO infrastructure and alias1 model
- 📊 Early access to benchmarks and datasets
- 🤝 Co-authorship opportunities on joint publications
- 💡 Direct influence on CAI development roadmap
- 🎤 Speaking opportunities at CAI community meetings

For Institutions:
- 🎓 Educational licenses for teaching and courses
- 🏗️ Custom deployments and infrastructure support
- 📚 Integration of student projects into CAI ecosystem
- 🌍 Visibility in the growing CAI research community

📧 Get in Touch

Interested in research collaboration? We'd love to hear from you!

Contact: research@aliasrobotics.com

Please include:
- Your research interests and proposed collaboration areas
- Institutional affiliation (if applicable)
- Relevant publications or projects
- Specific resources or support needed

We typically respond within 48 hours and can schedule an initial discussion call to explore collaboration opportunities.

📖 Citation

If you use CAI in your research, please cite our work (ordered by publication date):

@article{mayoral2025cai,
  title={CAI: An Open, Bug Bounty-Ready Cybersecurity AI},
  author={Mayoral-Vilches, V{\'\i}ctor and Navarrete-Lozano, Luis Javier and Sanz-G{\'o}mez, Mar{\'\i}a and Espejo, Lidia Salas and Crespo-{\'A}lvarez, Marti{\~n}o and Oca-Gonzalez, Francisco and Balassone, Francesco and Glera-Pic{\'o}n, Alfonso and Ayucar-Carbajo, Unai and Ruiz-Alcalde, Jon Ander and Rass, Stefan and Pinzger, Martin and Gil-Uriarte, Endika},
  journal={arXiv preprint arXiv:2504.06017},
  year={2025}
}

@article{mayoral2025automation,
  title={Cybersecurity AI: The Dangerous Gap Between Automation and Autonomy},
  author={Mayoral-Vilches, V{\'\i}ctor},
  journal={arXiv preprint arXiv:2506.23592},
  year={2025}
}

@article{mayoral2025fluency,
  title={CAI Fluency: A Framework for Cybersecurity AI Fluency},
  author={Mayoral-Vilches, V{\'\i}ctor and Wachter, Jasmin and Chavez, Crist{\'o}bal RJ and Schachner, Cathrin and Navarrete-Lozano, Luis Javier and Sanz-G{\'o}mez, Mar{\'\i}a},
  journal={arXiv preprint arXiv:2508.13588},
  year={2025}
}

@article{mayoral2025hacking,
  title={Cybersecurity AI: Hacking the AI Hackers via Prompt Injection},
  author={Mayoral-Vilches, V{\'\i}ctor and Rynning, Per Mannermaa},
  journal={arXiv preprint arXiv:2508.21669},
  year={2025}
}

@article{mayoral2025humanoid,
  title={Cybersecurity AI: Humanoid Robots as Attack Vectors},
  author={Mayoral-Vilches, V{\'\i}ctor},
  journal={arXiv preprint arXiv:2509.14139},
  year={2025}
}

@article{balassone2025evaluation,
  title={Cybersecurity AI: Evaluating Agentic Cybersecurity in Attack/Defense CTFs},
  author={Balassone, Francesco and Mayoral-Vilches, V{\'\i}ctor and Rass, Stefan and Pinzger, Martin and Perrone, Gaetano and Romano, Simon Pietro and Schartner, Peter},
  journal={arXiv preprint arXiv:2510.17521},
  year={2025}
}

@article{mayoral2025caibench,
  title={CAIBench: A Meta-Benchmark for Evaluating Cybersecurity AI Agents},
  author={Mayoral-Vilches, V{\'\i}ctor and Balassone, Francesco and Navarrete-Lozano, Luis Javier and Sanz-G{\'o}mez, Mar{\'\i}a and Crespo-{\'A}lvarez, Marti{\~n}o and Rass, Stefan and Pinzger, Martin},
  journal={arXiv preprint arXiv:2510.24317},
  year={2025}
}

🔗 Additional Resources

📚 Complete Research Library - All 24+ peer-reviewed publications
📊 CAIBench Benchmarks - Comprehensive evaluation framework
🏆 Competition Results - CTF and hackathon achievements
🎓 CAI Fluency - Educational materials and tutorials
💻 GitHub Repository - Source code and examples

Join the Cybersecurity AI research community - Let's advance the state of the art together! 🚀