Typosquatting in Open Source

Public open source repositories are vulnerable to a wide range of exploits, but by far the most popular attack is some form of typosquatting, in which attackers exploit a popular package by publishing a similarly named, malware-infected version. In general, the process involves:

  1. Finding a popular open source package.
  2. Duplicating the package and adding some form of malicious code.
  3. Renaming the malware-infected package and uploading it to a public repository in the hope that developers will mistype and download it rather than the valid package.

Open source typosquatting differs from domain typosquatting, in which attackers register domains that resemble real sites (e.g., web addresses like goggle.com) in order to redirect web browsers to fake websites, register known brands or trademarks with the goal of creating a malicious website (known as cybersquatting or domain squatting), or otherwise lure internet users to fake or malicious sites.

In contrast, open source typosquatting attacks take many different forms, including:

  • Typosquatting – misspelled variations of a package name, typically through letter omission (requests -> reqests), repetition (jquery -> jquerry) or transposition (electron -> electorn).
  • Brandsquatting – takes the name of a package that is popular in one ecosystem and uploads a malicious package under the exact same name to another ecosystem, taking advantage of brand name recognition without the brand owner’s knowledge. For example, uploading a malicious version of Python’s scipy to a Rust repository in the hope that a Python programmer new to the Rust ecosystem will download it.
  • Combosquatting – combines popular package names with words like “security” or “api” or “tools.” For example, axios-api or django-tools.
  • Dependency Confusion – happens when a malicious package named identically to a corporation’s internally developed package is uploaded to a public repository (e.g., amazon-payments, apple- or microsoft-gre). If there are no domain name checks in place and the malicious package is a newer version than the internal one, the package manager may retrieve it instead.
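
The misspelling and combosquatting patterns above are mechanical enough to enumerate. The following is a minimal Python sketch, not a complete model of how attackers choose names: the function names and the COMBO_WORDS list are illustrative assumptions, with the combo words taken from the examples in the list above.

```python
# Illustrative sketch: enumerate lookalike names for a package.
# All function names here are hypothetical, chosen for this example.

COMBO_WORDS = ("security", "api", "tools")  # words from the combosquatting examples


def omissions(name):
    """Drop one character at a time, e.g. requests -> reqests."""
    return {name[:i] + name[i + 1:] for i in range(len(name))}


def repetitions(name):
    """Double one character at a time, e.g. jquery -> jquerry."""
    return {name[:i] + name[i] + name[i:] for i in range(len(name))}


def transpositions(name):
    """Swap each pair of adjacent characters, e.g. electron -> electorn."""
    return {
        name[:i] + name[i + 1] + name[i] + name[i + 2:]
        for i in range(len(name) - 1)
    }


def combos(name, words=COMBO_WORDS):
    """Append a plausible-sounding suffix, e.g. axios -> axios-api."""
    return {f"{name}-{word}" for word in words}


def lookalikes(name):
    """Return every generated variant except the original name itself."""
    variants = omissions(name) | repetitions(name) | transpositions(name) | combos(name)
    variants.discard(name)  # swapping two identical letters reproduces the name
    return variants
```

A defender could run lookalikes() over the names of the packages they depend on and watch for any of the results appearing in a lockfile or install log.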

These kinds of typosquatting attacks are incredibly effective, and have become one of the most popular ways to compromise software vendors, often with the goal of extracting data.
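
To make the dependency confusion case concrete, here is a toy Python model of a resolver that simply takes the highest version it can find across every configured index. It is a sketch under stated assumptions, not a description of how any particular package manager behaves: the index names, the package name acme-billing and both resolver functions are hypothetical.

```python
# Toy model of dependency confusion. Index names, package names and
# functions are hypothetical and exist only for this illustration.

EXAMPLE_INDEXES = {
    "internal": {"acme-billing": ["1.2.0", "1.3.0"]},
    "public": {"acme-billing": ["99.0.0"], "requests": ["2.31.0"]},
}


def parse_version(version):
    """Turn '1.3.0' into (1, 3, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def naive_resolve(name, indexes):
    """Pick the highest version of `name` across all indexes, ignoring origin."""
    candidates = [
        (parse_version(version), index_name)
        for index_name, packages in indexes.items()
        for version in packages.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"no index provides {name}")
    version, index_name = max(candidates)
    return ".".join(str(part) for part in version), index_name


def pinned_resolve(name, indexes, internal_names, internal_index="internal"):
    """Resolve names known to be internal from the internal index only."""
    if name in internal_names:
        indexes = {internal_index: indexes[internal_index]}
    return naive_resolve(name, indexes)


if __name__ == "__main__":
    # The public upload wins purely because its version number is higher.
    print(naive_resolve("acme-billing", EXAMPLE_INDEXES))
    # Restricting internal names to the internal index avoids the mix-up.
    print(pinned_resolve("acme-billing", EXAMPLE_INDEXES, {"acme-billing"}))
```

In this model the attacker never needs a typo from the developer: publishing a higher version number under the same name is enough, which is why tying internal names to their source matters.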

The Threat of Typosquatting

Typosquatting is a key facilitator of cybercrime. The problem lies in the importance that public open source repositories, the primary typosquatting vector, place on package names. This unfairly puts the onus on a busy developer’s typing and proofreading skills, whether they’re creating CI/CD scripts or interactively downloading packages from the command line. Once a typosquatted package enters the organization, it can compromise cybersecurity in a number of ways:

  • Install malware in a development environment in order to exfiltrate personal data and/or sensitive information, including passwords, API keys, login credentials, etc.
  • Create security holes in the software vendor’s dev, test and build environments that might be exploited by ransomware or similar threats.
  • Create backdoors in software that can expose customers to subsequent cyberattacks.

How to Prevent Typosquatting

The most common typosquatting sites are open source repositories. While public repositories do enforce unique package names within their ecosystem, they do nothing to prevent exploits across ecosystems (i.e., brandsquatting). Nor do they implement controls to catch potentially typosquatted packages when they’re initially uploaded. However, most ecosystems are quick to remove typosquatted packages as soon as they are discovered and reported by users. Unfortunately, that can still mean typosquatted packages are downloaded hundreds of thousands of times before they’re removed. In other words, user beware.

Some methods organizations can use to help mitigate the threat of typosquatting include:

  • Use prebuilt runtime environments. Centrally built development environments can be reviewed by a QA and security team before they are shared with developers, reducing the likelihood that they contain a typosquatted package.
  • Check for vulnerabilities and scan all dependencies on import with Static Application Security Testing (SAST) and Dynamic Application Security Testing (DAST) tools to help ensure they don’t contain malicious code.
  • Create a script that can check for and flag potential typosquatted packages in any config file.
  • Slow down! And double-check for spelling errors before you hit return on any install command. You can only scam yourself.
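
As a starting point for the script suggested above, the following Python sketch flags names in a requirements-style file that are close to, but not the same as, a well-known package. It is only a sketch: the POPULAR list is a tiny illustrative sample, the 0.8 similarity cutoff is an arbitrary assumption that would need tuning, and the parsing handles only simple "name", "name==version" and comment lines.

```python
import difflib
import re

# Illustrative sample only; a real check would use a much larger list.
POPULAR = ["django", "flask", "numpy", "pandas", "requests", "scipy"]


def package_names(requirements_text):
    """Extract package names from simple requirements-style lines."""
    names = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0).lower())
    return names


def flag_suspects(requirements_text, popular=POPULAR, cutoff=0.8):
    """Map each suspicious name to the popular package it resembles."""
    suspects = {}
    for name in package_names(requirements_text):
        if name in popular:
            continue  # exact match with a known package
        close = difflib.get_close_matches(name, popular, n=1, cutoff=cutoff)
        if close:
            suspects[name] = close[0]
    return suspects


EXAMPLE_REQUIREMENTS = """\
requests==2.31.0
reqests>=2.0   # one missing letter
numpy
boto3
-r other.txt
"""

if __name__ == "__main__":
    for name, lookalike in flag_suspects(EXAMPLE_REQUIREMENTS).items():
        print(f"{name!r} looks like {lookalike!r}; check before installing")
```

A similarity check like this will produce false positives for legitimately similar names, so its output is best treated as a prompt for manual review rather than an automatic block.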

Unfortunately, typosquatting is here to stay simply because it works so well for cybercriminals. Consider using a service like the ActiveState Platform that provides:

  • A vetted catalog of open source packages.
  • An import routine that checks for malicious code and effectively quarantines suspect dependencies.
  • A secure build service that automatically creates prebuilt runtime environments that you can examine and test out before sharing.

Sign up for a free account and try the ActiveState Platform for yourself.
