A Software Composition Analysis (SCA) tool provides organizations with three key ways to understand the open source software used within their application:

  • Generates a manifest of all open source components
  • Lists all open source license(s) per component
  • Identifies all known security vulnerabilities associated with the components

Because open source components make up more than 80% of a modern application’s codebase (i.e., the runtime environment on which the other 20% of proprietary code runs), an SCA tool is essential to help development and security teams identify:

  • Which open source components are viable for inclusion in an application’s codebase versus those that are not (for example, EOL components or those that are poorly maintained may not be eligible for inclusion).
  • Which open source licenses are approved for use, and which may be at odds with the overall license of your codebase.
  • Which open source components are affected by security vulnerabilities, malware or other compromised code.

In other words, SCA tools play a key role in helping organizations to:

  • Ensure all dependencies are approved for use in order to mitigate support and maintenance risk.
  • Ensure license compliance in order to mitigate the risk of IP lawsuits.
  • Ensure security compliance by identifying the number and severity level of vulnerabilities present in the codebase, thereby helping prioritize security efforts and reduce Mean Time To Remediation (MTTR).
  • Ensure software supply chain security by identifying components that feature typosquatting or have been compromised with trojans, backdoors, and other malicious code. 

A growing number of SCA tools can also generate a Software Bill Of Materials (SBOM) in order to help development teams track code changes and help customers more quickly identify vulnerabilities in common components across all the software they deploy.

All of these use cases have made scanning the codebase with an SCA tool an essential process throughout the software development lifecycle, but especially within Continuous Integration/Continuous Deployment (CI/CD) pipelines, where scanning is critical for ensuring the delivery of compliant, non-vulnerable software.

Unfortunately, many SCA tools simply don’t work as advertised. 

Why SCA Tools Fail

The results generated by SCA tools can be incorrect because of limitations in the source material they scan, which consists primarily of package manager definition files/manifests. For example, most SCA tools will employ the following methodology:

  1. Scan the File System: the most common process SCA tools implement to identify the open source components you’re using is simply to scan the local file system in order to identify all of the definition files/manifests (e.g., requirements.txt, cpanfile, Gemfile, etc.) that are present.
  2. Parse Definition Files: the definitions/manifests are then parsed to extract the list of dependencies they contain. SCA tools assume that each list is both complete and accurate. 
  3. Construct a Dependency Graph: based on the scan results, the tool can now create a dependency tree that shows the relationships between each dependency.
  4. Identify Vulnerabilities: by referring to a database of known vulnerabilities, the SCA tool can identify which dependencies contain which Common Vulnerabilities and Exposures (CVEs), as well as their severity level.
  5. Generate an SBOM (optional): SCA tools that generate a dependency graph will also be able to generate an SBOM.
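
The manifest-driven steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular SCA product works: the function names, the MANIFEST_NAMES set, and the in-memory KNOWN_VULNERABILITIES table (with its made-up advisory ID) are all hypothetical, only exactly pinned requirements.txt lines are parsed, and step 3 (the dependency graph) is omitted.

```python
import os
import re

# Hypothetical: the only manifest type this sketch recognizes.
MANIFEST_NAMES = {"requirements.txt"}

# Hypothetical stand-in for a vulnerability database:
# (package name, version) -> list of advisory IDs (IDs are made up).
KNOWN_VULNERABILITIES = {
    ("examplelib", "1.0.0"): ["EXAMPLE-2024-0001"],
}

# Matches exactly pinned requirements such as "examplelib==1.0.0".
PINNED = re.compile(r"^([A-Za-z0-9._-]+)==([A-Za-z0-9._+!-]+)$")


def find_manifests(root):
    """Step 1: walk the file system and collect manifest paths."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name in MANIFEST_NAMES:
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def parse_requirements(path):
    """Step 2: extract (name, version) pairs from exactly pinned lines."""
    deps = []
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.split("#", 1)[0].strip()  # drop comments
            match = PINNED.match(line)
            if match:
                deps.append((match.group(1).lower(), match.group(2)))
    return deps


def scan(root):
    """Steps 1, 2 and 4: report advisories for each declared dependency."""
    findings = {}
    for manifest in find_manifests(root):
        for dep in parse_requirements(manifest):
            advisories = KNOWN_VULNERABILITIES.get(dep, [])
            if advisories:
                findings[dep] = advisories
    return findings
```

Note that scan() can only ever report on what the manifest declares: a line such as "other>=2.0" is skipped by this parser, and anything installed without being listed is never seen at all.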

Unfortunately, relying on package manager definitions/manifests can result in a number of issues, including:

  • False Negatives – dependencies not explicitly defined in the definition file will not be included in the SCA tool’s output. This can result in vulnerabilities being present in the software, but not identified by the tool.
  • “Out of Scope” Dependencies – some package managers accommodate sections for defining scopes by deployment environment (i.e., a list of dependencies required in dev vs test vs production, etc.). Because runtime dependencies vary by environment, SCA tools may not be able to discern which dependencies are actually required at runtime.
  • Transitive Dependencies – generating an accurate and complete list of transitive dependencies generally requires knowledge of how the top-level dependency was built, including any linked C libraries or other OS-specific binaries. This is as true for SCA tools as it is for generating SBOMs.
  • False Positives – organizations that take a “one size fits all” approach to their runtimes will often ship unused dependencies (e.g., test harnesses). This will result in SCA tools generating false positives: identifying vulnerabilities in dependencies that are not exploitable since they’re never executed.
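
One rough way to see the false-negative gap for a Python runtime is to compare what a manifest declares with what is actually installed. The sketch below assumes a Python environment and uses the standard library's importlib.metadata; the helper names (normalize, installed_packages, undeclared) are hypothetical. It says nothing about linked C libraries or other OS-specific binaries, which this kind of check cannot see.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a package name (lowercase, runs of -_. become a single -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_packages():
    """Names of all distributions visible to the running interpreter."""
    names = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            names.add(normalize(name))
    return names


def undeclared(declared, installed):
    """Packages that are installed but missing from the declared list."""
    declared_set = {normalize(n) for n in declared}
    installed_set = {normalize(n) for n in installed}
    return sorted(installed_set - declared_set)
```

For example, undeclared(["Requests"], ["requests", "urllib3"]) returns ["urllib3"]: a transitive dependency that a manifest-only scan of that declared list would never report on.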

The good news is that most modern SCA tools won’t rely solely on what they discover in definition files/manifests, but will also examine signatures, hashes, and other evidence of a dependency’s existence. The best tools are also capable of scanning both source code and application binaries for a more comprehensive approach. While these additional methods can help overcome many of the issues listed above, you’ll want to ask your SCA tool vendor to explain how they explicitly address each potential shortcoming. 
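
As an illustration of the hash-based evidence mentioned above, the following sketch fingerprints every file under a directory and looks each digest up in an index of known component hashes. The index (known_hashes) and the function names are assumptions made for this example; real tools maintain far larger databases and use fuzzier matching than an exact SHA-256 comparison.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identify_by_hash(root, known_hashes):
    """Map relative file paths under root to components with a known hash.

    known_hashes is a hypothetical index: hex digest -> component label.
    """
    matches = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            component = known_hashes.get(sha256_of(path))
            if component is not None:
                matches[str(path.relative_to(root))] = component
    return matches
```

Because this approach inspects the files themselves, it can surface a vendored or copied component even when no manifest mentions it, although an exact-hash lookup misses any file that has been modified.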

Vulnerability Remediation

Arguably, the most important function of an SCA tool is raising awareness of vulnerabilities present in the codebase, as well as providing remediation guidance. Information provided can include:

  • Exploitability – a measure of how exploitable the vulnerability may be, usually expressed as a severity rating
  • Maturity – how long the vulnerability has been known to exist
  • Effort to Fix – an estimate of the difficulty to patch the vulnerability
  • Reachability – an estimate of whether the vulnerable method is actually being called by the software
  • Remediation – typically an indication of whether a patch or newer version of the component exists. Some SCA tools may automatically create pull requests with updated/patched versions of affected components ready for the organization to take. 
  • Datedness – the age of a component, which can be an early warning sign of risk.

All of this information can help organizations gauge risk and prioritize fixes. But the key benefit of an SCA tool is only realized when Mean Time To Remediation (MTTR) is decreased. For example, by:

  1. Decreasing time to awareness of a vulnerability 
  2. Decreasing the time it takes to investigate whether the vulnerability affects the software
  3. Decreasing the time it takes to patch or upgrade the dependency
  4. Decreasing the time it takes to rebuild the runtime environment

While all tools can provide #1, only some are capable of #2 and #3, and none help with #4, which can be plagued by dependency hell. In practice, however, organizations continue to rely on their own vulnerability investigations, if only to confirm the SCA tool’s findings. Even when these four steps have been completed, organizations still have to retest and redeploy their application, which can mean waiting days or weeks for a pre-planned deployment window to open up. As a result, MTTR is typically still measured in weeks, if not months. 
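
To make the arithmetic concrete, MTTR can be treated as the average, across incidents, of the sum of each phase's duration. The sketch below is a simplified model under that assumption; the phase names and function names are hypothetical, and any durations fed into it are purely illustrative rather than measured data.

```python
from datetime import timedelta


def time_to_remediate(phases):
    """Total remediation time for one incident: the sum of its phases."""
    return sum(phases.values(), timedelta())


def mean_time_to_remediate(incidents):
    """MTTR: the average total remediation time across incidents."""
    total = sum((time_to_remediate(p) for p in incidents), timedelta())
    return total / len(incidents)


# Illustrative numbers only, not measurements.
incident = {
    "awareness": timedelta(days=1),
    "investigation": timedelta(days=2),
    "patch_or_upgrade": timedelta(days=1),
    "rebuild_runtime": timedelta(days=3),
    "retest_and_deploy": timedelta(days=14),
}
```

With these illustrative figures the incident takes 21 days in total, of which only one day is awareness. Under this model, faster detection alone barely moves MTTR when the later phases dominate.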

License Compliance

SCA tools will generally surface the open source license associated with each dependency by referring to what’s declared in the license header, and some can also identify embedded licenses down to the snippet level.

Issues can arise here, as well. For example:

  • Dependencies can feature multiple top-level licenses or none at all, making it difficult to understand the risk of including them in a codebase.
  • Embedded licenses can be at odds with the overall license, potentially turning a commercial-friendly license into a non-commercial-friendly one.
  • The author of a dependency may have included third-party code without copying over the license, posing a risk to compliance.

While some SCA tools are capable of flagging these issues, they’ll often err on the side of caution and generate reports that contain a number of false positives that will require further investigation.
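
A minimal sketch of this kind of cautious license review might look like the following. It assumes the licenses for each component have already been extracted as identifier strings; the review_licenses function, the component names, and the approved set are hypothetical. Note how a dual-licensed component is flagged even though one of its licenses may be perfectly acceptable, which is exactly the sort of finding that needs human follow-up.

```python
def review_licenses(components, approved):
    """Flag components with no license, multiple licenses, or unapproved ones.

    components: dict of component name -> list of license identifiers.
    approved:   set of license identifiers the organization allows.
    """
    findings = {}
    for name, licenses in sorted(components.items()):
        issues = []
        if not licenses:
            issues.append("no license declared")
        if len(licenses) > 1:
            issues.append("multiple licenses declared")
        unapproved = sorted(set(licenses) - approved)
        if unapproved:
            issues.append("not approved: " + ", ".join(unapproved))
        if issues:
            findings[name] = issues
    return findings
```

For instance, with approved = {"MIT", "Apache-2.0"}, a component declaring ["MIT", "GPL-3.0-only"] is reported with both "multiple licenses declared" and "not approved: GPL-3.0-only", while a component declaring only ["MIT"] produces no finding.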

SCA Tool Comparison

The following chart compares the tools offered by a number of prominent SCA vendors whose capabilities are rated by key feature as either Basic (B), Good (G) or Advanced (A):

Feature                       Snyk¹  Mend²  Synopsys³  Sonatype⁴  Veracode⁵  Checkmarx⁶
Vulnerability Identification  B      G      G          A          G          G
Remediation Capabilities      A      A      A          G          G          G
License Identification        B      G      A          A          B          B
SBOM Generation               B      G      A          G          B          B

While any of these vendors will likely provide a good enough solution for most software vendors, their limitations may prevent them from being effective, depending on your requirements. For example (in reference to the superscript number associated with each vendor): 

  1. Developer Focused – this is really a double-edged sword, since Snyk’s ability to make developers care about security is a win, but the tradeoff is that compliance and security users may find the tool’s capabilities lacking. 
  2. Remediation-Centric – another double-edged sword, since Mend’s ability to shorten remediation efforts is key, but other features suffer because of the focus.
  3. False Positives – Synopsys’ Black Duck SCA is the best-in-class solution, but when it comes to vulnerability detection it lags behind Sonatype.
  4. High TCO – requires the purchase and deployment of multiple Sonatype products, including Nexus Artifact Repository, Lifecycle for SCA, and an Advanced Legal Pack.
  5. Added On – Veracode is a SAST vendor that has added on SCA capabilities, which makes it a good option for their existing customer base, but is unlikely to be the key buying reason for new customers. 
  6. New Entrant – like Veracode, Checkmarx is a SAST vendor that is currently in the process of building out SCA capabilities in their platform. 

Conclusions

With both vulnerabilities and software supply chain attacks growing at unprecedented levels, discovering and flagging security issues within the open source software that makes up more than 80% of a modern codebase is a vital step in the software development process. To that end, SCA is an essential tool in the toolbox of those fighting the good fight.

Despite the fact that SCA tools are based on the flawed premise of starting with package manager definitions/manifests, many of them have added further checks and balances to ensure they’re operating against a complete and accurate set of dependencies. Unfortunately, it only takes one missed dependency to cause an organization to ship a compromised product.  

ActiveState, while not an SCA vendor, takes a different approach to delivering many of the benefits of an SCA solution. By building every runtime environment from vendored source code, including libraries written in C/C++, Fortran, etc., ActiveState is the only vendor that can ensure a complete and accurate set of dependencies. ActiveState is also the only vendor that can automatically rebuild the runtime environment while avoiding dependency hell. This is reflected in our SBOM generation capabilities, which also stand out against the competition.

ActiveState surfaces licensing and vulnerability information, but it may be better employed by security-conscious organizations looking for a way to check that their existing SCA tool is generating a complete and accurate list of dependencies. 

Next Steps:

Read our Scalable Dependency Vendoring: Best Practices to understand how we create a complete and accurate set of dependencies for runtime environments.