Public, open source package managers are a cornerstone of the development ecosystem. Nearly all developers either use or have used systems like npm, PyPI or Crates to access open source libraries.
Most package managers today don’t formally validate the connection between a package and its source repo (though notably, npm now supports publishing packages with sigstore provenance). Malicious actors exploit this by uploading packages to public repositories and tricking developers into installing them through techniques like copying and manipulating metadata from legitimate packages. The burden today is on developers to spot and avoid malicious attacks that are deliberately hard to detect.
So as a developer, what can you do to make sure you’re installing the right package, and not malicious code? In this post, we’ll explore three common attacks on package managers, tips for avoiding them, and what package managers are doing to improve the safety of their public repositories.
1. Masquerading / starjacking. “Masquerading” is just what it sounds like: a malicious actor copies metadata (the README file, repository link, website link, and other data) from a popular package and uses it to disguise their malicious package, so that developers think they’re installing the right one. “Starjacking” is a form of masquerading in which malicious actors link their package to a popular package’s source code repository on GitHub that has a high number of stars. This makes the malicious package look popular and widely used, and therefore “safe.”
Example:
In this example, uncovered by JFrog, a malicious package masquerades as the Marked JavaScript package. All of the package’s metadata is the same in the two package listings; only the names, shown in the upper-left corner of each listing, are slightly different.
2. Typosquatting. Typosquatting is an attack in which a malicious actor publishes a package under an intentional misspelling of a popular open source package’s name, such as “nxt” instead of “next.” This attack preys on developers who make a typo when typing out the name of the package to install (e.g., when entering “pip install [package name]”). If they accidentally mistype the package name, they could unintentionally download malicious code. Typosquatting is a highly effective technique: past research demonstrated that a typosquatting attack could have caused more than 17,000 computers to execute arbitrary code.
Example:
In this example from ReversingLabs, a malicious actor used the package name node-hide-console-windows—one letter off from a legitimate npm package, node-hide-console-window. The actor also used the masquerading attack to copy all of the metadata from the legitimate package, even publishing the same number of versions.
Critically, the researchers who discovered this noted that the maintainer account responsible for the malicious package was newly created, and not connected to any other npm projects.
ReversingLabs researchers identified the malicious package that was typosquatting the reputable node-hide-console-window package based on its suspicious behavior. The malicious package fetched a rootkit package and launched it, connecting to the attacker’s command and control network. The attacker could then run any commands on the developer’s computer at any time, and with little oversight.
3. Dependency confusion. Attackers may exploit a project’s dependencies by publishing malicious packages to a public repository under the same names as the internal, private packages that an organization uses.
For example, if a malicious actor knows the name of a package hosted on the internal servers of Company X, they can create a package with the same name and host it on a public repository, with a very high version number. Some package managers that organizations use internally don’t properly differentiate between private packages that were developed internally, and packages that were cached from the public registry.
These internal registries may then “fall back” to the public registry—even for internal packages—and install the higher version from the public registry. So this attack essentially tricks the package manager into giving a developer at Company X the malicious external package to install instead of the legitimate internal one, because it assumes the malicious package is the most recent version.
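To make the fallback behavior concrete, here is a minimal sketch of a resolver that merges candidates from a private and a public index and simply picks the highest version. It is an illustration only, not the logic of any particular package manager; the index contents, the `acme-billing` name, and both function names are hypothetical, and the version parser assumes plain dotted integers.

```python
def parse_version(text):
    """Turn a dotted version such as '1.4.2' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def naive_resolve(name, private_index, public_index):
    """Pick the highest version from either index, ignoring where it came from."""
    candidates = []
    for source, index in (("private", private_index), ("public", public_index)):
        for version in index.get(name, []):
            candidates.append((parse_version(version), source, version))
    if not candidates:
        return None
    _, source, version = max(candidates)
    return source, version


def scoped_resolve(name, private_index, public_index, internal_names):
    """Never consult the public index for names reserved as internal."""
    if name in internal_names:
        versions = private_index.get(name, [])
        if not versions:
            return None
        return "private", max(versions, key=parse_version)
    return naive_resolve(name, private_index, public_index)


# Hypothetical data: 'acme-billing' is an internal package name that an
# attacker has also published publicly with a very high version number.
private_index = {"acme-billing": ["1.2.0", "1.3.1"]}
public_index = {"acme-billing": ["99.0.0"], "requests": ["2.31.0"]}

print(naive_resolve("acme-billing", private_index, public_index))
# ('public', '99.0.0')
print(scoped_resolve("acme-billing", private_index, public_index, {"acme-billing"}))
# ('private', '1.3.1')
```

The second function shows one possible mitigation under these assumptions: keep an explicit list of internal names (or a reserved namespace) and resolve those only from the private index, so a higher public version number can never win.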
Example:
Security researcher Alex Birsan, who coined the phrase “dependency confusion,” wrote about using this method to hack Netflix, Apple, and other companies: “Squatting valid internal package names was a nearly sure-fire method to get into the networks of some of the biggest tech companies out there, gaining remote code execution, and possibly allowing attackers to add backdoors during builds.”
Alex noted that he was able to find private package names for these companies on GitHub and in posts on internet forums, but also through internal files embedded in public JavaScript files:
“By far the best place to find private package names turned out to be… inside javascript files. Apparently, it is quite common for internal package.json files, which contain the names of a javascript project’s dependencies, to become embedded into public script files during their build process, exposing internal package names. Similarly, leaked internal paths or require() calls within these files may also contain dependency names. Apple, Yelp, and Tesla are just a few examples of companies who had internal names exposed in this way.”
To hack Netflix, Alex published a fake “malicious” source package on PyPI with the same name as an internal Netflix package, and a very high version number. As JFrog researchers noted, source packages can execute code without user intervention, immediately upon installation.
As seen in the examples above, it’s common for malicious actors to use multiple attacks together—like creating a package name that’s slightly different from the real and reputable package, and copying all of the metadata and repo source link from that package.
Always seek to verify a package’s origin. Before you install an open source package, make sure you can connect that package back to its source repo, and ideally take time to inspect the source code itself. While this might take a little more work, it’s a clear way to verify that a package is what it says it is.
Ideally, an open source package has a verifiable link back to its source code because it was built and signed with sigstore. If so, you can verify the signed artifact using Cosign.
Lacking that information, make sure to click through to the source repo link for a package listed on a public repository.
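When you click through to the source repo, one thing worth checking is whether the repo actually vouches for the package that links to it. The sketch below is one rough way to express that check, assuming you have already fetched the registry metadata, the manifest from the linked repo (for example, its package.json), and the repo’s release tags. The function name, the dictionary fields, and the sample data are all hypothetical, and a mismatch is a signal to investigate rather than proof of malice: monorepos and renamed packages will also trip it.

```python
def backlink_problems(registry_metadata, repo_manifest, repo_tags):
    """Return reasons the linked repo does not vouch for the registry package.

    An empty list means the two sources are consistent with each other.
    """
    problems = []

    # A starjacked package points at a popular repo whose own manifest
    # declares a different package name.
    if registry_metadata["name"] != repo_manifest.get("name"):
        problems.append("repo manifest declares a different package name")

    # The published version should correspond to a release tag in the repo.
    version = registry_metadata["version"]
    if version not in repo_tags and f"v{version}" not in repo_tags:
        problems.append("published version has no matching tag in the repo")

    return problems


# Hypothetical data for illustration only.
manifest = {"name": "example-lib"}
tags = {"v4.0.0", "v4.1.0"}

legit = {"name": "example-lib", "version": "4.1.0"}
suspect = {"name": "example-lib-pro", "version": "9.9.9"}

print(backlink_problems(legit, manifest, tags))    # []
print(backlink_problems(suspect, manifest, tags))  # two problems reported
```

A check like this only compares claims from two places; signed provenance remains the stronger guarantee when it is available.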
Always double-check package names before you install them. When using a package manager to install an open source package, make sure you verify the legitimate name of the package you’re installing. Misspelling a package or installing a package with a slightly different name can mean installing a malicious package.
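A simple way to automate part of this double-check is to compare the name you are about to install against a list of packages you already trust and flag near misses. The sketch below uses Levenshtein edit distance for that comparison. The function names and the trusted list are hypothetical, and this is not a description of how any registry or scanner detects typosquatting; short names in particular will produce false positives.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed with a rolling row of the DP table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def possible_typosquats(name, known_packages, max_distance=1):
    """Return known package names that are close to, but not equal to, name."""
    if name in known_packages:
        return []
    return sorted(
        known for known in known_packages
        if edit_distance(name, known) <= max_distance
    )


# Hypothetical list of packages a team already trusts.
trusted = {"next", "react", "express", "node-hide-console-window"}

print(possible_typosquats("nxt", trusted))
# ['next']
print(possible_typosquats("node-hide-console-windows", trusted))
# ['node-hide-console-window']
print(possible_typosquats("react", trusted))
# []
```

A non-empty result means the requested name is one edit away from a package you trust, which is a good moment to stop and confirm the spelling before installing.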
Use sigstore to sign your own software artifacts. Sigstore supports keyless signing of containers, blobs, Git commits, and other artifacts using an OpenID Connect identity. This makes artifact signing much simpler than managing keys, and can help ensure that malicious actors can’t tamper with or alter your artifact after it’s been built.
You can use GitHub Actions to automatically sign container images by default using the “Docker Publish” GitHub Actions workflow.
Stacklok’s open source supply chain security platform, Minder, also includes a built-in rule type to verify that artifacts have been signed before code is merged.
Use Trusty in your IDE to vet open source packages before you install them. Trusty, a free-to-use service from Stacklok, can help you identify unsafe and potentially malicious packages before you install them. Trusty’s package scoring system takes into account factors like repo and author activity as well as provenance and the likelihood of typosquatting.
As mentioned above, authors of malicious packages don’t tend to have a history of activity on GitHub. Also, malicious actors often copy source repo metadata from reputable packages, meaning that their packages lack a credible link back to that source repo.
On Trusty, packages with low author activity scores and no verifiable links back to the listed source code repo will have low scores. You can use this information to help you verify whether you’re installing the right package. (Trusty’s IDE extension will help you spot unsafe and potentially malicious packages as you’re importing them into your code.)
At Stacklok, we look forward to continuing to partner with developer communities and package managers to support the adoption of sigstore and increase the safety of open source software.
Looking for a way to vet the safety of open source packages before you install them? Try Trusty, our free-to-use service that scores open source packages based on activity, proof of origin, and likelihood of malicious activity.