How to detect and block malicious typosquatting attacks on open source packages

North Korean hackers recently employed a technique called "typosquatting" to trick developers who misspelled a popular package name into downloading a Trojan program onto their machines. Stacklok has built protection against typosquatting attacks like these into our products, to help you block them.

Author: Pankaj Telang / 6 mins read / Mar 13, 2024 / Cybersecurity

Earlier this year, Japanese cybersecurity officials determined that a North Korean hacking team (Lazarus Group) had uploaded tainted packages to the Python PyPI registry. The hackers used a strategy called “typosquatting,” giving their packages a similar name to the reputable package “pycrypto,” an encryption toolkit for Python—for example, one package was named “pycryptoenv.” 

This strategy tricked developers who misspelled “pycrypto” into downloading a malicious package that infected their machine with a Trojan program, “Comebacker,” which could be used to inject malware and ransomware and to steal credentials.

This isn’t the first time malicious attackers have used this strategy with PyPI packages. Back in 2019, two libraries from the same developer were uploaded to PyPI with similar names to popular libraries; when installed, they would steal SSH and GPG keys from the projects of infected developers. One of those typosquatting packages had been available for nearly a year before it was detected. And in 2023, an attacker uploaded thousands of malicious packages to the PyPI registry with randomly generated names that were similar to reputable packages. 

Because typosquatting attacks like these are becoming more common, we’ve taken action in Trusty to proactively analyze packages that are uploaded to public registries (including PyPI, npm, Maven, Go, and crates) for the likelihood of typosquatting. Here’s how we do it.

Detecting typosquatting in Trusty  

Trusty is a free-to-use web app from Stacklok that analyzes data about thousands of open source packages and ranks them based on their supply chain risk. Trusty looks at factors like repo and author activity; the presence of security best practices, like artifact signing; and the presence of malicious activity, like typosquatting and starjacking.

An example of Trusty’s scoring dimensions for the open source Python package pandas

To identify likely typosquatting attacks, we rely on popularity data and a string-similarity measure called the “Levenshtein distance.” Here’s how it works.

Table 1 below shows the names of malicious typosquatting packages that have been discovered in the past, and how those names compare to the popular package the developer intended to install.  

You can see that the differences between the two package names are very slight (one or two changes), making it easy for a developer to mistype and accidentally install malicious code.

| Malicious package | Popular package | Change from the popular name |
| --- | --- | --- |
| mumpy | numpy | Replace “n” with “m” |
| virtualnv | virtualenv | Delete “e” |
| crypt | crypto | Delete “o” |
| pysprak | pyspark | Swap “a” and “r” |
| setup-tools | setuptools | Insert “-” |
| urlib3 | urllib3 | Delete “l” |
| openvc | opencv | Swap “c” and “v” |

Table 1: Past examples of typosquatting packages

Step 1: Identify packages that have similar names to a given package

To find similarly named packages, Trusty needs to calculate the distance between the name of a given package and the names of all known packages. 

In information theory, the Levenshtein distance is a widely used measure of distance between two strings. For a pair of strings (x, y), the Levenshtein distance is defined as the minimum number of single-character deletions, insertions, or substitutions required to transform x into y.

For example, the Levenshtein distance between “test” and “best” is 1, as “test” can be transformed into “best” with one substitution: replacing “t” with “b”. Table 2 below shows the Levenshtein distance for the packages from Table 1. Note that swapping two adjacent characters counts as two substitutions, which is why “pysprak” and “openvc” have a distance of 2.

| Malicious package | Popular package | Levenshtein distance |
| --- | --- | --- |
| mumpy | numpy | 1 |
| virtualnv | virtualenv | 1 |
| crypt | crypto | 1 |
| pysprak | pyspark | 2 |
| setup-tools | setuptools | 1 |
| urlib3 | urllib3 | 1 |
| openvc | opencv | 2 |

Table 2: The Levenshtein distance for past examples of typosquatting packages
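To make the definition concrete, here is a minimal Python sketch of the textbook dynamic-programming algorithm for Levenshtein distance, applied to the name pairs from Table 2. The function name `levenshtein` and the `PAIRS` list are ours for illustration; this is not Trusty's actual implementation.

```python
def levenshtein(x: str, y: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn x into y."""
    # previous[j] is the distance between the characters of x processed
    # so far (before the current one) and the first j characters of y.
    previous = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        current = [i]  # turning i characters of x into "" takes i deletions
        for j, cy in enumerate(y, start=1):
            current.append(min(
                previous[j] + 1,               # delete cx
                current[j - 1] + 1,            # insert cy
                previous[j - 1] + (cx != cy),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


# Name pairs taken from Table 2: (malicious, popular).
PAIRS = [
    ("mumpy", "numpy"),
    ("virtualnv", "virtualenv"),
    ("crypt", "crypto"),
    ("pysprak", "pyspark"),
    ("setup-tools", "setuptools"),
    ("urlib3", "urllib3"),
    ("openvc", "opencv"),
]

for malicious, popular in PAIRS:
    print(f"{malicious:12} {popular:12} {levenshtein(malicious, popular)}")
```

Running this reproduces the distances in Table 2. The two-row layout keeps memory proportional to the length of one name rather than the product of both lengths.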

For a given package, Trusty uses Levenshtein distance to identify the packages with similar names.
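As a rough sketch of this lookup step, the snippet below scans a list of known names and keeps those within a small Levenshtein distance of the given name. The helper names, the sample registry `KNOWN`, and the cutoff of 2 are all assumptions for illustration (the cutoff is chosen only because Table 2 shows distances of 1 and 2); the article does not state which threshold or search strategy Trusty uses.

```python
def within_distance(x: str, y: str, limit: int) -> bool:
    """Return True if the Levenshtein distance between x and y is <= limit."""
    # The distance is at least the difference in lengths, so names with
    # very different lengths can be rejected immediately.
    if abs(len(x) - len(y)) > limit:
        return False
    previous = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        current = [i]
        for j, cy in enumerate(y, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (cx != cy)))
        # The smallest value in a row never decreases from one row to the
        # next, so once a whole row exceeds the limit we can stop early.
        if min(current) > limit:
            return False
        previous = current
    return previous[-1] <= limit


def similar_names(name: str, known_names: list[str], limit: int = 2) -> list[str]:
    """Names from known_names (other than name itself) within `limit` edits."""
    return sorted(other for other in known_names
                  if other != name and within_distance(name, other, limit))


# Hypothetical slice of a registry index, for illustration only.
KNOWN = ["requests", "requests5", "request", "numpy", "mumpy", "pandas", "flask"]

print(similar_names("requests", KNOWN))  # ['request', 'requests5']
```

A linear scan like this is fine for a demonstration; at registry scale, an index structure designed for approximate string matching would likely be needed.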

Step 2: Assign a typosquatting score for the given package

As mentioned earlier, attackers typically name typosquatting packages similarly to existing popular packages. In Trusty, we use repo and author activity as a proxy for package popularity, and assign scores for both (read more about our scoring here). Repo activity scores are based on factors including the number of stars, forks, open issues, and watchers, while author activity scores are based on the number of public repos that author has, as well as number of followers. The repo and author activity scores are combined into a single activity score. Malicious packages tend to have a particularly low activity score. 

Along with Levenshtein distance, Trusty uses the activity scores to assign a typosquatting score to a given package, say X. The typosquatting score runs from 0 to 10, and a lower score means a higher likelihood of typosquatting. If the activity score of package X is lower than the minimum activity score among similarly named packages, it is highly likely to be a typosquatting package. Therefore, Trusty assigns it a low typosquatting score, close to 0.

Conversely, if the activity score of package X exceeds the maximum activity score among similarly named packages, it is less likely to be a typosquatting package. In this case, Trusty assigns it a high typosquatting score, close to 10.

If the activity score of package X falls between the minimum and maximum activity scores among similarly named packages, Trusty assigns it a score between 5 and 8, based on where package X's score lies within that range.
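The three cases above can be sketched in a few lines of Python. This is only an illustration of the rules as described: the function name, the exact endpoint values of 0 and 10, the linear interpolation across the 5 to 8 band, and the handling of a package with no lookalikes are all assumptions, not Trusty's published formula.

```python
def typosquatting_score(activity: float, neighbor_activities: list[float]) -> float:
    """Score from 0 (likely typosquatting) to 10 (unlikely).

    activity            -- activity score of the package being rated
    neighbor_activities -- activity scores of similarly named packages
    """
    if not neighbor_activities:
        # Assumption: with no lookalikes there is nothing to squat on.
        return 10.0
    low, high = min(neighbor_activities), max(neighbor_activities)
    if activity < low:
        return 0.0   # below every lookalike: "close to 0" in the article
    if activity > high:
        return 10.0  # above every lookalike: "close to 10" in the article
    if high == low:
        # Assumption: a single-valued range maps to the middle of the band.
        return 6.5
    # Assumption: linear position within [low, high] mapped onto [5, 8].
    return 5.0 + 3.0 * (activity - low) / (high - low)


print(typosquatting_score(1.0, [3.0, 8.0]))
print(typosquatting_score(5.5, [3.0, 8.0]))
print(typosquatting_score(9.0, [3.0, 8.0]))
```

The three calls exercise the below-minimum, in-range, and above-maximum cases in turn.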

Step 3: Factor the typosquatting score into the overall Trusty Score for the package

Trusty aggregates the typosquatting score along with the other scores it computes to assign an overall score to a package. The aggregate score of a typosquatting package tends to be very low.

Examples of typosquatting scores in Trusty

Let’s take a look at some actual examples of package typosquatting scores in Trusty. 

For the reputable Python package requests, Figure 1 below shows the typosquatting score calculated by Trusty. Since requests is a popular package with high repo and author activity, it is unlikely to be a typosquatting package. As expected, the typosquatting score of requests is high (9).  

Figure 1: The typosquatting score for the reputable Python package "requests"

Figure 2 below shows the typosquatting score of a Python package named “requests5.” This package has a name that is very similar to the reputable requests package, but it has no repo and author activity scores since its repo and author information is not available. So it is very likely a typosquatting package. As a result, its typosquatting score is lower (5).

Figure 2: The typosquatting score for the likely malicious package "requests5"

Trusty also displays all of the potential typosquatting packages for a given package, as shown below. This can help our security researchers (and external researchers) identify potential real-world instances of typosquatting, and make developers aware when they’re installing the popular package.

Warnings of possible typosquatting attacks on the reputable "requests" package

How Stacklok can help you block typosquatting attacks

To avoid falling prey to typosquatting attacks, developers need to exercise caution when installing open source packages, and make sure they’re installing the right one. 

In Trusty, as shown above, typosquatting packages have low overall Trusty scores, so you can use Trusty to evaluate open source packages before you install them. To effectively catch typosquatting packages before they’re integrated into your source tree, you can use Stacklok’s open source platform, Minder.

Minder helps you apply and automatically enforce security policies across your repos, including a policy to flag pull requests that contain dependencies with low Trusty Scores. With this policy in place, Minder can alert you or even block a pull request that introduces low-scoring dependencies (you can configure your scoring threshold). This is a low-friction way to catch risky dependencies, such as typosquatting packages, before a PR is merged.
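The idea behind a score threshold can be illustrated with a small, generic Python check. This is not Minder's configuration format or API; the function name, the package names, and the scores below are invented placeholders that only show the shape of the decision.

```python
def low_scoring_dependencies(scores: dict[str, float], threshold: float) -> list[str]:
    """Return the dependencies whose score falls below the threshold."""
    return sorted(name for name, score in scores.items() if score < threshold)


# Hypothetical scores for dependencies introduced by a pull request.
new_dependencies = {"popular-pkg": 8.7, "popular-pgk": 1.9}

flagged = low_scoring_dependencies(new_dependencies, threshold=5.0)
if flagged:
    print("Review before merging:", ", ".join(flagged))
```

Raising the threshold flags more dependencies for review, while lowering it lets more through unchecked.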

Pankaj Telang

Principal Engineer, Data Science & ML

Pankaj has over 20 years of experience in the areas of AI, ML, computer vision, cybersecurity, and software development. Prior to Stacklok, Pankaj worked as a Principal Staff Scientist for SAS, focused on cybersecurity and computer vision, where he developed ML algorithms for detecting suspicious user and device activities from network communications.

© 2024 Stacklok