Stacklok Insight is a free-to-use web app that provides data and scoring on the supply chain risk for open source packages.
To understand whether an open source package is safe to use, we’ve historically relied on CVEs (or the lack of CVEs) as the leading indicator: if there are no known vulnerabilities in the code, then it’s OK to use. While it makes sense not to use open source software that contains known vulnerabilities, there are other factors that can be just as risky as known vulnerabilities.
This post explores other risk factors outside of CVEs that you should consider when determining whether open source software is truly safe to use. Below, we’ve outlined five of those risk factors and how you can evaluate them.
By malicious packages, we mean packages that attack the developer’s machine. It seems obvious that you shouldn’t use malicious open source software, but traditional software scanners can miss a malicious package in your software when that package has no known CVEs.
OSV.dev, an open source project created and sponsored by Google, recently added support for malicious package reporting in its data feed. It provides a standardized format for vulnerability and malicious package data that can be used by both vulnerability database producers and open source consumers, with the goal of making it easier for developers and security teams to automate and triage vulnerability reporting.
Stacklok’s free-to-use web app, Trusty, uses data from OSV.dev to flag malicious open source packages. This data is also integrated into our Minder platform, allowing you to apply policies that flag and block malicious packages in pull requests, so that they can’t be merged into production code.
An example of a malicious package in Trusty
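If you want to run a similar check yourself, here is a minimal sketch that asks the public OSV.dev query API about a package and picks out records that look like malicious-package reports. It is an illustration, not how Trusty or Minder implement the check. The function names are ours, and the filter assumes that malicious-package records use the `MAL-` ID prefix.

```python
import json
import urllib.request
from typing import List, Optional

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def malicious_ids(osv_response: dict) -> List[str]:
    """Return the IDs in an OSV query response that look like malicious-package reports.

    Assumption: malicious-package records carry IDs starting with "MAL-".
    """
    return [
        vuln.get("id", "")
        for vuln in osv_response.get("vulns", [])
        if vuln.get("id", "").startswith("MAL-")
    ]


def query_osv(name: str, ecosystem: str, version: Optional[str] = None) -> dict:
    """POST a package query to OSV.dev and return the decoded JSON response."""
    payload = {"package": {"name": name, "ecosystem": ecosystem}}
    if version is not None:
        payload["version"] = version
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling `malicious_ids(query_osv("some-package", "npm"))` returns an empty list when OSV has no matching malicious-package records. The package name here is a placeholder, `query_osv` needs network access, and this sketch does not follow pagination for packages with many records.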
When you board a plane, you’re required to provide identification, like a passport, to prove that you are who you’re claiming to be. But proof of identification isn’t required for open source packages that are uploaded to a package registry, like npm or PyPI. That means that package authors can list any source code repository in the package’s metadata, and it’s up to you to decide whether you believe it.
Fortunately, free and open source tools exist to help you verify a package’s provenance. Sigstore is an open source project that helps developers cryptographically sign packages and lets consumers verify a package’s proof of origin. If a package has been signed using Sigstore, that signature is recorded in a public, tamper-evident transparency log for verification. Registries like npm are taking action to help developers automatically sign their packages when uploading them to a registry, to improve security.
Trusty provides Sigstore provenance information for signed packages. Additionally, for packages that haven’t been cryptographically signed, Trusty uses a method called “historical provenance” to map Git tags and releases from the source code repository to the public version releases on the package manager repository. This information is nearly impossible to fake: someone would have to go back in time and make fraudulent releases at the same time as valid tags. As a result, this method lets us establish with a high degree of confidence that the package comes from its claimed source repository.
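To make the idea of matching tags to releases concrete, here is a rough sketch. It is not Trusty’s actual algorithm. It assumes you have already collected two mappings, Git tag name to tag timestamp and registry version to publish timestamp, and the two-day matching window is an arbitrary value chosen for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict


def _normalize(version: str) -> str:
    """Strip a leading 'v' so the tag 'v1.2.0' lines up with the release '1.2.0'."""
    return version[1:] if version.startswith("v") else version


def provenance_match_ratio(
    tags: Dict[str, datetime],
    releases: Dict[str, datetime],
    max_gap: timedelta = timedelta(days=2),  # illustrative window, not a Trusty setting
) -> float:
    """Return the fraction of registry releases that have a same-version Git tag
    created within max_gap of the publish time."""
    if not releases:
        return 0.0
    tag_times = {_normalize(name): when for name, when in tags.items()}
    matched = 0
    for version, published in releases.items():
        tagged = tag_times.get(_normalize(version))
        if tagged is not None and abs(published - tagged) <= max_gap:
            matched += 1
    return matched / len(releases)
```

A ratio close to 1.0 means most published versions line up with tags in the claimed repository. A low ratio does not prove anything on its own, but it is a reason to look more closely before trusting the repository link in the package metadata.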
It’s a good idea to always inspect the source code before you import an open source library or framework into your project. Because using a low-quality dependency can ultimately affect your software’s end-user experience and SLOs, you need to make sure that the software is doing what you expect, and that the code is reliable and secure.
It’s important to note that reviewing the source code only makes sense after you’ve established the package’s origin (see item #2 above).
When you take a dependency on third-party code, you’re trusting that the author has good intentions. But we know that’s not always the case. The XZ Utils backdoor, for example, was introduced by a contributor who used social engineering tactics to take control of the project from its maintainer and insert malicious code.
How can you identify packages with “unsafe” authors? At Stacklok, our research team has been working on creating mappings of authors and projects. Our goal is to help you flag packages that have authors or contributors who have been linked to malicious packages (either currently or in the past), as well as flag changes in author / contributor behavior. Read more about our work on a “proof of diligence” algorithm here.
One reason vulnerabilities don’t always signal risk is that a project with active and committed maintainers is likely to issue a patch for a known CVE quickly to keep your project safe. But an open source library that hasn’t been updated in three years? You’ll be on the hook yourself to fix any issues that come up.
Taking a dependency on an unmaintained package, or on a package with a single maintainer, is risky. But if the maintainer hasn’t reported the package as deprecated, it can take time to figure out whether the package is being actively maintained. To make this easier, Trusty analyzes repository and author activity for each package, looking at both volume-based factors, like the number of recent commits and contributor count, and recency-based factors, like activity level and open issues. Each package is given a score from 0 to 10 (with 10 being the highest).
An example of a deprecated package in Trusty, with a low score due to low repository activity.
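To show how volume and recency signals can be folded into a single 0 to 10 number, here is a deliberately simple sketch. It is not Trusty’s scoring formula. The weights, the 50-commit and 5-contributor caps, and the one-year decay are all made-up values for illustration.

```python
from datetime import datetime


def activity_score(
    recent_commits: int,
    contributors: int,
    last_commit: datetime,
    now: datetime,
) -> float:
    """Combine volume and recency signals into a 0-10 score (illustrative weights only)."""
    # Volume signals, each capped at 1.0 (caps are hypothetical).
    volume = min(recent_commits / 50, 1.0)
    breadth = min(contributors / 5, 1.0)
    # Recency signal: 1.0 for a commit today, decaying linearly to 0.0 after a year.
    days_idle = max((now - last_commit).days, 0)
    recency = max(1.0 - days_idle / 365, 0.0)
    # Weights sum to 10 so the result lands in the 0-10 range.
    return round(4 * volume + 2 * breadth + 4 * recency, 1)
```

With these made-up weights, a single-maintainer project with no commits in three years scores close to zero, while a project with many recent commits and several contributors scores close to 10.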
It’s important to have a process in place to review your dependencies and confirm that they’re safe to use. A scanning tool that checks only for known vulnerabilities, and none of the other risk factors listed above, could be putting you, your organization, and your end users at risk.
Stacklok’s products, Trusty and Minder, can help you choose safer dependencies and make sure malicious packages stay out of your production code. Test them out for free, and let us know what you think—join our Discord community.