Mapping Dependency Confusion: A Novel Detection Approach using Source Map Files

This article examines dependency confusion vulnerabilities, introduces a novel detection and exploitation technique based on source map files, and provides actionable steps to mitigate the associated risks.

Introduction

The software supply chain encompasses all elements involved in a code's lifecycle, from development to deployment, including code, binaries, and dependencies sourced from repositories or package managers.

While using third-party code can save time and effort, it also introduces potential security risks, as flaws in these dependencies can compromise the security of the entire supply chain.

Some companies rely on internal dependencies instead of third-party code, hosting them on private registries or on public registries under private scopes. This approach is not immune to security issues: it can lead to dependency confusion, a novel vulnerability pattern in the software supply chain.

Dependency confusion can occur in many different package managers. In this article, we'll be focusing on npm, a widely used package manager for JavaScript.

Dependency Confusion Overview

Dependency confusion occurs when a malicious actor tricks a package manager or registry (npm, PyPI, RubyGems, JFrog Artifactory...) into downloading a malicious package or dependency instead of the legitimate one. This can happen through several misconfigurations and wrong assumptions, notably:

Typos

One of the simplest ways dependency confusion can occur is through a typo. An attacker publishes a malicious package whose name closely resembles that of a popular package, differing only by a typo or small variation, and anticipates that developers will inadvertently install it instead of the intended one through typographical error or oversight.

Package priority

Developers might assume that the package manager will always prioritize internal packages over public ones, leading them to rely on package names that exist internally but not on public registries. This can work well until a malicious actor figures out that package name and decides to claim it on the public registry.

Here is an example package.json file from a JavaScript project:

{
  "name": "my-project",
  "version": "1.0.0",
  "description": "Project 1.0.0",
  "main": "index.js",
  "scripts": {
    "start": "node index.js",
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "Corp",
  "license": "MIT",
  "dependencies": {
    "express": "^4.17.1",
    "lodash": "^4.17.21",
    "axios": "^0.25.0",
    "moment": "^2.29.1",
    "internal-corp-package": "^1.5.0"
  },
  "devDependencies": {
    "nodemon": "^2.0.15",
    "eslint": "^8.6.0",
    "eslint-config-airbnb-base": "^15.0.0",
    "eslint-plugin-import": "^2.25.1"
  }
}

One dependency stands out: internal-corp-package. Its name suggests that it comes from an internal registry and/or is meant for internal use. We can check whether it exists on the public registry.
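
The lookup can also be scripted. The sketch below is a minimal example under stated assumptions: it targets Node.js 18+ (for the global fetch), treats an HTTP 404 from https://registry.npmjs.org/ as "name not publicly visible", and uses hypothetical function names (registryUrl, isClaimedOnPublicRegistry). The fetchImpl parameter is an illustrative addition so the logic can be exercised without network access.

```javascript
// Default public registry used by npm when nothing else is configured.
const PUBLIC_REGISTRY = "https://registry.npmjs.org/";

// Build the metadata URL for a package name. Scoped names such as
// "@org/pkg" keep the leading "@" but need the "/" percent-encoded.
function registryUrl(name, registry = PUBLIC_REGISTRY) {
  const encoded = name.startsWith("@")
    ? "@" + encodeURIComponent(name.slice(1))
    : encodeURIComponent(name);
  return new URL(encoded, registry).toString();
}

// Resolve to true if the registry returns the package metadata,
// false on a 404, and reject on any other status.
// fetchImpl is injectable (hypothetical parameter) for offline testing.
async function isClaimedOnPublicRegistry(name, fetchImpl = fetch) {
  const res = await fetchImpl(registryUrl(name));
  if (res.status === 404) return false;
  if (res.ok) return true;
  throw new Error(`Unexpected status ${res.status} for ${name}`);
}
```

A false result only indicates that the name is not publicly visible at the time of the check; it does not prove that the name can be registered.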

Figure 1: Package lookup on npm public registry

The lookup finds nothing, which means the package only exists on an internal registry. Suppose a malicious actor discovers this internal package name and publishes a malicious package with the same name on the public npm registry. We'd have the following workflow:

Figure 2: Dependency confusion workflow

Here is a breakdown of the figure above:

  • Registry is not explicitly configured

This means the user didn't explicitly instruct npm to use the internal registry, either by running npm config set registry http://[internal registry] or by specifying it in an .npmrc file. In that case, npm defaults to the public npm registry, https://registry.npmjs.org/, and installs the malicious package published by the attacker.

  • Registry is properly configured

Here we have the following two scenarios:

  • The specified package version is strict

If the specified package version is strict, like 1.0.0, and that version exists on the private registry, npm will resolve to it and install the legitimate package. An example package.json with a strict version:

{
  ...
  "dependencies": {
    ...
    "internal-corp-package": "1.0.0"
  },
  ...
}

  • The specified package version is loose (range)

Besides a strict version, npm lets you express version ranges in several ways, including these three:

  • ^1.0.0: npm will install version 1.0.0 or any later version below 2.0.0, like 1.9.5 (the minor and patch levels can increase, but not the major version).
  • ~1.0.0: npm will install version 1.0.0 or any later version below 1.1.0, like 1.0.9 (only the patch level can increase).
  • *: there is no restriction on the installed version; in this case npm will default to the latest version.

In all the cases above, an attacker can still perform a dependency confusion attack, as long as they publish a version that satisfies the range. For instance, if the specified version is ^1.0.0, an attacker can create and publish a malicious package with version 1.0.1 and it will pass the check.
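
To make the range rules above concrete, here is a simplified sketch of range matching. It is not npm's resolver: it only handles plain major.minor.patch versions with the ^, ~, * and exact forms, and all function names are illustrative. Real projects would typically rely on the semver package instead.

```javascript
// Parse a plain "major.minor.patch" version (no pre-release tags).
function parse(version) {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!m) throw new Error(`Unsupported version: ${version}`);
  return m.slice(1).map(Number);
}

// Compare two parsed versions: negative, zero or positive.
function compare(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Exclusive upper bound implied by "^" or "~" for a base version.
function upperBound(op, [major, minor, patch]) {
  if (op === "~") return [major, minor + 1, 0];
  // "^" allows changes that keep the left-most non-zero component.
  if (major > 0) return [major + 1, 0, 0];
  if (minor > 0) return [0, minor + 1, 0];
  return [0, 0, patch + 1];
}

// True if `version` falls inside `range` ("1.0.0", "^1.0.0", "~1.0.0" or "*").
function satisfies(version, range) {
  if (range === "*") return true;
  const v = parse(version);
  const op = "^~".includes(range[0]) ? range[0] : "";
  const base = parse(range.slice(op.length));
  if (op === "") return compare(v, base) === 0;
  return compare(v, base) >= 0 && compare(v, upperBound(op, base)) < 0;
}

// Highest candidate version that satisfies the range, or null.
function maxSatisfying(versions, range) {
  const ok = versions.filter((v) => satisfies(v, range));
  ok.sort((a, b) => compare(parse(a), parse(b)));
  return ok.length ? ok[ok.length - 1] : null;
}
```

Assuming the attacker's versions end up among the candidates the resolver sees, maxSatisfying(["1.5.0", "1.9.9", "2.0.0"], "^1.0.0") returns "1.9.9": a higher attacker-published version inside the range wins over the legitimate 1.5.0.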

Dependency Confusion Impact

Dependency confusion can be escalated into remote code execution fairly easily. The attacker has many ways to execute arbitrary code through the malicious package. One of the easiest ways to achieve that is through the scripts property of package.json, where the attacker can specify commands to execute during any stage of the installation, for example:

{
  "name": "internal-corp-package",
  "version": "1.0.1",
  "description": "Definitely not a malicious package!",
  "main": "index.js",
  "scripts": {
    "postinstall": "ping [malicious host]"
  },
  "author": "",
  "license": "ISC"
}
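
Because these hooks live in plain JSON, they are easy to flag before installation. The following is a minimal sketch, with a hypothetical helper name (findInstallScripts), that lists the install-time lifecycle scripts declared by a parsed package.json. It only inspects the scripts field and would not catch code hidden in the package's modules.

```javascript
// Lifecycle scripts that npm runs automatically when a dependency is installed.
const INSTALL_HOOKS = ["preinstall", "install", "postinstall"];

// Return the install-time hooks declared in a parsed package.json object.
function findInstallScripts(pkg) {
  const scripts = pkg.scripts || {};
  return INSTALL_HOOKS
    .filter((hook) => typeof scripts[hook] === "string")
    .map((hook) => ({ hook, command: scripts[hook] }));
}

// Example input mirroring the malicious manifest above.
const suspicious = {
  name: "internal-corp-package",
  version: "1.0.1",
  scripts: { postinstall: "ping [malicious host]" },
};

console.log(findInstallScripts(suspicious));
```

Running npm install with the --ignore-scripts flag skips these hooks altogether, at the cost of breaking packages that legitimately need a build step.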

Unclaimed organization name

npm supports organization scopes, where the package name takes the form @org/package. Some organizations register their organization name on the private registry but not on the public npm registry. If a malicious actor claims that organization name on the public npm registry, they can use it to perform a dependency confusion attack.
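
Which registry a scoped name is fetched from depends on the .npmrc configuration: a @scope:registry entry routes that scope to a specific registry, and anything unmapped falls back to the global registry key or the public default. The sketch below models that lookup in a simplified way; the parser ignores everything except key=value lines, and the function names and the npm.corp.example host are hypothetical.

```javascript
const DEFAULT_REGISTRY = "https://registry.npmjs.org/";

// Parse the "key=value" lines of an .npmrc file, skipping comments and blanks.
function parseNpmrc(text) {
  const config = {};
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (!line || line.startsWith("#") || line.startsWith(";")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    config[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return config;
}

// Registry a package name is fetched from: a "@scope:registry" entry wins
// for scoped names, then the global "registry" key, then the public default.
function registryFor(packageName, npmrcText) {
  const config = parseNpmrc(npmrcText);
  if (packageName.startsWith("@")) {
    const scope = packageName.split("/")[0];
    const scoped = config[`${scope}:registry`];
    if (scoped) return scoped;
  }
  return config.registry || DEFAULT_REGISTRY;
}

// Hypothetical configuration: only the @corp scope is mapped internally.
const npmrc = "@corp:registry=https://npm.corp.example/\n";
console.log(registryFor("@corp/utils", npmrc));
console.log(registryFor("@other/utils", npmrc));
```

In this example, @corp/utils resolves to the internal host while @other/utils falls through to the public registry, which is where an unclaimed organization name becomes exploitable.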

Exploiting Dependency Confusion Through Source Map Files

Source map files, often referred to simply as source maps, are files that provide a mapping between the source code of a program and the transformed code that is executed by the browser or another JavaScript engine. They are commonly used in web development to help in debugging and error reporting, especially when working with minified or transpiled code.

While source maps are useful for development and debugging purposes, they can have security implications, particularly when they are inadvertently exposed in production builds. If a source map file is left publicly exposed in a production build, it can allow an attacker to reconstruct the original un-minified source code and list its dependencies, which typically live in the node_modules folder. Access to that list of dependencies can be the key to a dependency confusion attack.

A typical source map file generated by webpack looks like:

{
  "version": 3,
  "file": "bundle.js",
  "sources": [
    "webpack:///./src/file1.ts",
    "webpack:///./src/file2.ts",
    "webpack:///./node_modules/internal-corp-package/utils.js"
  ],
  "sourcesContent": [
    "console.log('This is file1');",
    "console.log('This is file2');",
    "console.log('This is internal package utils file');"
  ],
  "mappings": "AAAA,OAAO,EAAE;AACP,OAAO,CAAC,GAAG,EAAE",
  "sourceRoot": ""
}

This source map allows us to recover the original un-minified code of src/file1.ts, src/file2.ts and node_modules/internal-corp-package/utils.js.
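
Listing the dependencies from such a file takes only a few lines. The sketch below is an illustrative take on the idea, not necessarily the scanner used for this article: it reads the sources array of a parsed source map and returns the package names found under node_modules/, including scoped ones. The function name extractPackages is hypothetical.

```javascript
// Return the sorted, de-duplicated package names referenced under
// node_modules/ in a parsed source map's "sources" array.
function extractPackages(sourceMap) {
  const marker = "node_modules/";
  const names = new Set();
  for (const source of sourceMap.sources || []) {
    // Use the last node_modules/ segment so nested dependencies
    // resolve to the innermost package.
    const idx = source.lastIndexOf(marker);
    if (idx === -1) continue;
    const parts = source.slice(idx + marker.length).split("/");
    // Scoped packages span two path segments: "@org/package".
    const name = parts[0].startsWith("@")
      ? parts.slice(0, 2).join("/")
      : parts[0];
    if (name) names.add(name);
  }
  return [...names].sort();
}

// The "sources" entries from the example source map above.
const exampleMap = {
  version: 3,
  sources: [
    "webpack:///./src/file1.ts",
    "webpack:///./src/file2.ts",
    "webpack:///./node_modules/internal-corp-package/utils.js",
  ],
};

console.log(extractPackages(exampleMap));
```

Each extracted name can then be checked against the public registry; names that are not claimed there are candidates for dependency confusion.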

In some cases, we found dependencies embedded directly in the source code. Access to those dependencies' names can likewise be the key to a dependency confusion attack.

What is even more worrying about this technique is that the attacker now has access to the original code of the dependencies above. They can inject a stealthy backdoor into that code without breaking it, which lets them run malicious code in the target environment while keeping it operational. This makes the attack even harder to detect: the applications depending on the vulnerable package keep running without any sign of compromise, unlike the scripts field mentioned above, which can be easily spotted.

Besides the risk of dependency confusion, when an attacker recovers the original source code of internal JavaScript applications, this code might contain hard-coded credentials that are otherwise hard to detect in minified code, or comments that are otherwise stripped in transpiled code.

Scanning Bug Bounty Programs

During our analysis, we ran a scan over 191 assets from different bug bounty programs; roughly 5% of them were found vulnerable to dependency confusion using this technique.

Figure 3: Dependency confusion detected

Protecting Against Dependency Confusion Attacks

While there isn't a silver bullet for dependency confusion, there are steps you can take to mitigate the risk to a certain extent, notably:

  • Use an organization scope

npm registries let you register an organization name and use it as a scope. You can use the organization in your internal registry, but you must register it on the public npm registry as well. That way, an attacker cannot perform dependency confusion without access to your org.

  • Register internal package names on the public npm registry

One way to proactively stay ahead of attackers is by claiming the package names you're using internally on the public npm registry. However, this approach is not bulletproof as you might miss some packages.

  • Validate package integrity

Each npm package version has a sha512 hash. You can use that hash to verify the integrity of the package before installing it.

  • Version Pinning

Instead of using loose versions like ^1.0.0, you can use the exact version 1.0.0. This way, npm will not resolve to a higher version. The caveat is that the pin does not apply to transitive dependencies.

  • Package Locking

Utilize package-lock.json or yarn.lock files to lock down the specific versions of dependencies used in your project. This prevents unexpected changes to dependency versions during installations, locking both direct and transitive dependencies.
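
Some of these checks can be automated against the lockfile itself. The sketch below assumes a package-lock.json in the format written by npm 7 or later (lockfileVersion 2 or 3, with a packages map) and flags entries that were not resolved from the expected registry host or that lack a sha512 integrity hash. The function name auditLockfile and the npm.corp.example host are hypothetical.

```javascript
// Flag lockfile entries that were not resolved from the expected registry
// host or that lack a sha512 integrity hash.
function auditLockfile(lock, allowedHost) {
  const findings = [];
  for (const [path, entry] of Object.entries(lock.packages || {})) {
    // Skip the root project, symlinked workspaces and entries with no
    // download URL (for example local workspace folders).
    if (path === "" || entry.link) continue;
    if (typeof entry.resolved !== "string") continue;

    let host = null;
    try {
      host = new URL(entry.resolved).host;
    } catch {
      // Not a parseable URL: leave host as null so it gets flagged.
    }
    if (host !== allowedHost) {
      findings.push({ path, issue: `resolved from ${entry.resolved}` });
    }
    if (!/^sha512-/.test(entry.integrity || "")) {
      findings.push({ path, issue: "missing sha512 integrity" });
    }
  }
  return findings;
}

// Hypothetical lockfile excerpt: one package came from the public registry.
const lock = {
  lockfileVersion: 3,
  packages: {
    "": { name: "my-project", version: "1.0.0" },
    "node_modules/internal-corp-package": {
      version: "1.9.9",
      resolved: "https://registry.npmjs.org/internal-corp-package/-/internal-corp-package-1.9.9.tgz",
      integrity: "sha512-AAAA",
    },
    "node_modules/lodash": {
      version: "4.17.21",
      resolved: "https://npm.corp.example/lodash/-/lodash-4.17.21.tgz",
    },
  },
};

console.log(auditLockfile(lock, "npm.corp.example"));
```

A check like this fits naturally in CI next to npm ci, which installs exactly what the lockfile records rather than re-resolving version ranges.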

Conclusion

In conclusion, dependency confusion has evolved into a pressing concern that can compromise the security of entire software supply chains. Putting measures in place to safeguard against it has become of paramount importance.