Introducing Ostorlab Security Testing Benchmarks: Real Vulnerabilities, Real Impact

The first open-source benchmark suite featuring 93 realistic vulnerable mobile apps that mirror actual CVE and bug bounty findings - not theoretical textbook examples.

Mon 22 September 2025

The Problem We're Solving

Security teams waste countless hours validating automated testing tools against unrealistic, academic vulnerabilities that never appear in production. You need to know if your security scanner can catch the PIN bypass that could cost millions, not the millionth SQL injection in a login form that no real app would ever ship.

After analyzing thousands of bug bounty reports and CVEs, we realized the industry desperately needed benchmarks that reflect what security teams actually worry about - logical bugs, authentication bypasses, and complex vulnerability chains that traditional benchmarks ignore.

Our Solution: Reality-Based Security Testing

We've open-sourced 93 vulnerable mobile applications (72 Android, 21 iOS) that reproduce the most commonly reported real-world vulnerabilities. Each app implements functionality you'd find in production - from banking apps to event trackers - with vulnerabilities that mirror genuine security incidents.

✨ What Makes This Different

Real-World Applications: Not toy examples - actual banking apps, money transfer systems, and event trackers with realistic functionality
Bug Bounty-Inspired: Every vulnerability is based on actual CVE reports and bug bounty findings worth real money
Automation Challenges: Includes traditionally "impossible to automate" logical bugs like PIN bypasses and OAuth account takeovers
Comprehensive Coverage: 70+ unique vulnerability classes across mobile platforms

🎯 Who Should Use This?

Security Tool Developers: Validate your SAST/DAST tools against real vulnerabilities, not academic exercises
Security Teams: Benchmark and compare different scanning solutions using realistic test cases
Researchers: Study how real vulnerabilities manifest in mobile applications
Pentesters: Train on applications that mirror actual client environments

📱 Sample Applications Included

Each application represents a real-world use case with embedded vulnerabilities:

  • Banking App - Authentication bypasses, insecure data storage
  • Money Transfer - Transaction tampering, session management flaws
  • Event Tracker - Privacy leaks, intent redirection vulnerabilities

Unlike traditional benchmarks that focus on basic OWASP Top 10 issues, these apps include complex, real-world vulnerability patterns:

Authentication & Authorization

  • PIN/Passcode bypasses
  • 2FA bypass mechanisms
  • OAuth account takeover (flows that omit PKCE)
  • Biometric authentication bypasses
  • Session persistence after password change
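
To make the PKCE item concrete, here is a minimal sketch of the check that is missing in a vulnerable OAuth flow. It follows the S256 method from RFC 7636. The function names make_pkce_pair and server_accepts are illustrative assumptions and are not taken from the benchmark apps.

```python
import base64
import hashlib
import secrets


def _b64url(data: bytes) -> str:
    """URL-safe base64 without padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> tuple[str, str]:
    """Client side: return (code_verifier, code_challenge) for the S256 method."""
    # 32 random bytes encode to a 43-character verifier (allowed range: 43-128).
    verifier = _b64url(secrets.token_bytes(32))
    challenge = _b64url(hashlib.sha256(verifier.encode("ascii")).digest())
    return verifier, challenge


def server_accepts(stored_challenge: str, presented_verifier: str) -> bool:
    """Token endpoint side: the verifier must hash to the challenge sent earlier."""
    expected = _b64url(hashlib.sha256(presented_verifier.encode("ascii")).digest())
    return secrets.compare_digest(expected, stored_challenge)
```

Without this binding, anyone who intercepts the authorization code (for example through a hijacked deeplink) can redeem it. With it, the code is useless without the verifier, which never leaves the legitimate app.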

Data Exposure

  • Firebase database takeover
  • Sensitive data in cleartext storage
  • Google Advertising ID misuse
  • Location data exposure
  • Hardcoded secrets in production code
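
As a rough illustration of how a static check might flag the last item, the sketch below scans source text for two well-known secret shapes. The pattern set and the name find_hardcoded_secrets are assumptions made for this example. Real scanners use much larger rule sets, and this is not how any particular tool works.

```python
import re

# Illustrative patterns only (assumption): a Google-style API key and a PEM private key header.
SECRET_PATTERNS = {
    "google_api_key": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}


def find_hardcoded_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings
```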

Complex Logic Flaws

  • Intent redirection vulnerabilities
  • Task hijacking scenarios
  • Broadcast injection attacks
  • WebView JavaScript bridge exploitation
  • Path traversal in ZIP processing
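
The ZIP path traversal item (often called "Zip Slip") comes down to one missing check: an entry name such as ../../evil.sh is joined to the extraction directory without confirming that the result stays inside it. The sketch below shows that check in Python for illustration. The function name unsafe_entries is hypothetical, and a mobile app would need the equivalent logic in Kotlin, Java, or Swift.

```python
import os
import zipfile


def unsafe_entries(archive: zipfile.ZipFile, dest_dir: str) -> list[str]:
    """Return entry names that would land outside dest_dir if joined naively."""
    root = os.path.realpath(dest_dir)
    escaped = []
    for name in archive.namelist():
        # Resolve ".." components the same way the filesystem would.
        target = os.path.realpath(os.path.join(root, name))
        if os.path.commonpath([root, target]) != root:
            escaped.append(name)
    return escaped
```

An extractor that refuses any archive for which this list is non-empty closes the hole. Python's own ZipFile.extractall already sanitizes such names, which is why the check is written out explicitly here.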

Platform-Specific Issues

Android-Specific:

  • Tapjacking vulnerabilities
  • Unprotected critical activities/services
  • Provider SQL injection
  • Grant URI permission escalation
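
Provider SQL injection typically arises when an exported content provider splices a caller-supplied selection string into its query. The sketch below simulates the pattern with Python's sqlite3 module rather than Android code. The table, the data, and the function names are invented for this example.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a small in-memory database standing in for a provider's backing store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, owner TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO notes (owner, body) VALUES (?, ?)",
        [("alice", "public memo"), ("bob", "private note")],
    )
    return conn


def query_vulnerable(conn: sqlite3.Connection, owner: str) -> list[str]:
    # Caller-controlled text becomes part of the SQL statement itself.
    sql = "SELECT body FROM notes WHERE owner = '" + owner + "' ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


def query_parameterized(conn: sqlite3.Connection, owner: str) -> list[str]:
    # The value is bound as data, so quotes inside it cannot change the query.
    sql = "SELECT body FROM notes WHERE owner = ? ORDER BY id"
    return [row[0] for row in conn.execute(sql, (owner,))]
```

Passing alice' OR '1'='1 to the first function returns every row, including bob's. The second function treats the same input as a literal owner name and returns nothing.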

iOS-Specific:

  • Deeplink CSRF attacks
  • WebKit internal file access
  • URL link spoofing
  • Promotion code brute force
  • Unencrypted session information exposure

🚦 Getting Started

# Clone the repository
git clone https://github.com/Ostorlab/benchmarks.git

# Navigate to Android or iOS samples
cd benchmarks/mobile/android  # or benchmarks/mobile/ios

# Each app includes:
# - Source code
# - Build instructions
# - Vulnerability documentation
# - Exploitation guides

📊 Why This Matters

Traditional vulnerable apps like DVWA (web) or GoatDroid (Android) serve their purpose for learning, but they fail to represent modern mobile security challenges. Our benchmarks bridge the gap between academic exercises and real-world security testing.

Consider this: a typical security scanner might catch 100% of SQL injections in test apps but miss the critical logic flaws that constitute 60% of actual bug bounty payouts. These benchmarks let you measure what truly matters.
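
One simple way to measure this is a per-class detection rate: for each vulnerability class, the fraction of known vulnerable apps that a scanner actually flagged. The sketch below is an assumption about how you might compute it yourself. It is not the project's scoring system, and the app and class names in the usage note are made up.

```python
from collections import defaultdict


def detection_rates(expected, reported):
    """Map each vulnerability class to the share of expected findings that were reported.

    Both arguments are iterables of (app_name, vulnerability_class) pairs.
    """
    reported_set = set(reported)
    totals = defaultdict(int)
    hits = defaultdict(int)
    for app, vuln_class in expected:
        totals[vuln_class] += 1
        if (app, vuln_class) in reported_set:
            hits[vuln_class] += 1
    return {vuln_class: hits[vuln_class] / totals[vuln_class] for vuln_class in totals}
```

Take a scanner that reports both SQL injections in two hypothetical apps but misses a PIN bypass in one of them. It would score 1.0 for sql_injection and 0.0 for pin_bypass, a gap that a single aggregate score would hide.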

🤝 Join the Movement

We're building a community around realistic security testing. Here's how you can contribute:

For Developers:

  • Add new vulnerable apps following our contribution guide
  • Port existing vulnerabilities to new platforms
  • Improve documentation and exploitation guides

For Security Teams:

  • Share vulnerability patterns you've seen in the wild
  • Provide feedback on benchmark relevance
  • Help prioritize new vulnerability additions

For Tool Vendors:

  • Test your tools and share detection rates
  • Contribute detection logic improvements
  • Sponsor development of specific vulnerability categories

📈 Current Status & Roadmap

Available Now:

  • ✅ 72 Android vulnerable applications
  • ✅ 21 iOS vulnerable applications
  • ✅ 70+ unique vulnerability classes
  • ✅ Comprehensive documentation

Coming Soon:

  • 🔄 Flutter/React Native hybrid app vulnerabilities
  • 🔄 CI/CD integration examples
  • 🔄 Automated benchmark scoring system
  • 🔄 Web dashboard for results comparison

💬 Get Involved

GitHub: github.com/Ostorlab/benchmarks
Contributing: Read our guide
Discussion: Open an issue for questions or suggestions
Twitter: Follow @OstorlabSec for updates