Repositories bracket both ends of the software supply chain for most software organizations: they store imported software assets at the start of the chain, and they make built software artifacts available to internal developers or external customers at the end. As such, repositories are both the first point of contact with “the enemy” and the last point at which to ensure the security and integrity of the software being provided to users.
Modern software organizations employ two different types of repositories during their software development process:
- Code Repositories – store source code and other software development assets, such as documentation, tests, and scripts. They typically also integrate some kind of version control system in order to be able to track changes made to each asset. Examples include GitHub and Subversion.
- Binary Repositories – store software artifacts, which are typically compiled application or package binaries, in a versioned retrieval system. They typically also integrate some form of monitoring/scanning for package updates. Examples include Sonatype Nexus and JFrog Artifactory.
In practice:
- Code repositories are the best way to curate and manage source code, and make it available to the build process.
- Binary repositories are the best way to curate and manage built software artifacts, and make them available to the build process and/or users.
Issues arise, however, when organizations use their binary repository at the front end of their supply chain, importing pre-built open source artifacts which are then made available to their build process. This practice represents one of the greatest risks when it comes to securing the software supply chain.
Repository Best Practices for Supply Chain Security
Organizations that import pre-built, open source packages typically quarantine them in a binary repository and use traditional AppSec tools to scan them for vulnerabilities, malware, typosquatting, etc. This method is effective when the imported code is not obfuscated.
Unfortunately, organizations also import pre-built binary packages whose code is obfuscated and therefore far harder to scan. Investigating the scan results can slow down the development process, and an incomplete investigation can inadvertently let compromised components into software builds.
To mitigate this risk, security-conscious organizations should consider either sourcing their prebuilt binaries from a trusted vendor, or else adopting a DIY approach:
- Import the source code for all required open source packages and their dependencies.
- Build all open source packages and their dependencies, as well as any linked libraries (C, Fortran, etc.), from source code for each OS your organization requires.
Commonly called dependency vendoring, this process is a dependency management best practice that recommends importing only open source dependency source code into your repository – never prebuilt packages. In this way, you can avoid the security issues associated with prebuilt binaries, as well as ensure the security and integrity of your codebase.
Built artifacts would then be staged in a binary repository where they can be monitored for emerging vulnerabilities and made available to subsequent build processes or for further software development work.
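One small piece of the import step can be sketched in code: checking that a downloaded source archive still matches a digest that was pinned when the source was reviewed, before it is handed to the build. This is a minimal sketch rather than a complete vendoring workflow; the function names and the idea of a pinned digest stored alongside the vendored source are illustrative assumptions, not part of any particular tool.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_source_archive(path: Path, pinned_sha256: str) -> bool:
    """Return True only if the archive matches the pinned digest.

    `pinned_sha256` is assumed to have been recorded when the source
    was first reviewed (hypothetical workflow).
    """
    return sha256_of_file(path) == pinned_sha256.lower()
```

A build script could call `verify_source_archive` for every imported archive and refuse to build when any check returns `False`.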
Repositories Compared
To better understand your choices when it comes to code and binary repositories, we’ll compare some of the most popular commercial offerings.
Source Code Repositories Compared
Each code repository solution has its own strengths and weaknesses, while providing much the same set of basic features and functionality. This is because the majority of code repository solutions are based on the same underlying technology: Git, which represents ~90% of all code repository deployments.
Feature | GitHub | GitLab | Bitbucket |
Git-based | Y | Y | Y |
Code review aids | Y | Y | Y |
Documentation wiki | Y | Y | Y |
Private repositories | Free | Free | Free |
CI/CD | Y | Y | Y |
Team support | Y | Y | Y |
Semantic search | N | N | Y |
Vulnerability Mgmt | Y | Y | N |
Issue Tracker | Y | Y | N (relies on Jira) |
While alternatives to Git exist, such as Subversion, CVS, and Azure DevOps Server, they represent only a small slice of the market. In fact, GitHub alone has more than 28M users and 57M repositories.
Binary Repositories Compared
Much like source code repositories, any comparison of binary repositories often comes down to a difference of degrees rather than distinct features. The repositories reviewed here are all under active development, support multiple package formats and binary file types, and provide numerous support options.
Feature | Sonatype Nexus | JFrog Artifactory | AWS CodeArtifact |
Multi Language Support | Strong | Very Strong | Weak |
Binary Types | Strong | Very Strong | Weak |
Security Scanning | Yes | Yes | No |
Access Control | Many options | Many options | Limited options |
Hybrid/On Prem | Y | Y | No |
API & CLI | Y | Y | Y |
Searchability | Very Strong | Very Strong | Weak |
Cleanup | Strong | Strong | Weak |
Artifactory and Nexus are far and away the leaders in the binary repository space, but choosing between them may come down to pricing or support for your specific language/artifact.
Conclusions – Securely Managing Dependencies & Binaries
While the use of code repositories has a long history associated with managing the proprietary code developers create, binary repositories are a relatively recent addition for most organizations, providing a far better way to manage binary artifacts used in and generated by the software development process.
Vendoring open source dependencies, however, is a supply chain security best practice that is rarely adopted except by the largest organizations. The problem is that dependency vendoring does not scale cost-effectively across an organization with multiple development teams and/or technology stacks, despite advantages that include:
- Allows you to generate a complete dependency graph, including build-time dependencies and OS-native dependencies. This pays off downstream by ensuring that no vulnerabilities slip under the radar and that the SBOM you generate is complete.
- Provides assurances as to the security and integrity of all your dependencies, as long as your CI/CD system supports a declarative pipeline and generates a reproducible build.
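To illustrate what a complete dependency graph means in practice, the sketch below walks a map of direct dependencies and returns every transitive dependency of a root package, including build-time and OS-native entries. The package names and the `DIRECT_DEPS` map are hypothetical; a real graph would come from your build metadata.

```python
from collections import deque

# Hypothetical direct-dependency map, including build-time ("cython")
# and OS-native ("openblas", "libgfortran") entries.
DIRECT_DEPS = {
    "myapp": ["numpy", "requests"],
    "numpy": ["openblas", "cython"],
    "openblas": ["libgfortran"],
    "requests": ["urllib3"],
}


def transitive_dependencies(root, direct_deps):
    """Return a sorted list of everything `root` depends on, directly or not."""
    seen = set()
    queue = deque(direct_deps.get(root, []))
    while queue:
        name = queue.popleft()
        if name in seen:
            continue  # already visited; also guards against cycles
        seen.add(name)
        queue.extend(direct_deps.get(name, []))
    seen.discard(root)  # a cycle back to the root is not a dependency of itself
    return sorted(seen)
```

Every name returned by `transitive_dependencies("myapp", DIRECT_DEPS)` is a candidate SBOM entry; a dependency absent from this list cannot be monitored for vulnerabilities.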
To make dependency vendoring cost-effective, ActiveState automates many of the resource- and time-intensive processes for you by acting as:
- Code Repo – ingests and vets dependency source code from multiple public repositories, while patching zero-day vulnerabilities as they arise.
- CI/CD – automatically builds all dependencies from source code (including linked C and Fortran libraries) in a declarative, reproducible manner using a SLSA build level 3 hardened build service.
- Binary Repo – makes the securely built artifacts and/or packaged runtimes available to your development teams or CI/CD pipeline, or else can be used to populate your binary repository with secure artifacts.
In other words, ActiveState functions in much the same way you might use your own code and binary repositories when it comes to open source dependencies, but automates away much of the overhead so you can focus on developing software rather than managing it.
Next Steps:
Read how you can free up time and resources, while ensuring supply chain security by using ActiveState for your dependency vendoring needs.