If you work with Python, you’ll eventually run into projects with complex dependency trees like the above example. Since Python mostly hides these dependencies from you and automatically pulls them in as needed, it can be surprising to find out just how deep your dependencies go. As a result, you’re unlikely to have much insight into the compatibility and/or conflicts amongst dependencies. Until it’s too late…
Approaches to Managing Python Dependencies
Luckily, you have a number of options you can explore to help you address these shortcomings, including:
- Vendored dependencies – i.e., self-managed dependencies
  - Pros: Guaranteed compatibility, availability and security; the best option if you’ve created custom patches for specific packages.
  - Cons: Results in a larger repo since you have to host all the dependencies yourself, which also means you may have to create your own wheels. Also requires lots of extra work to perform updates, check for vulnerabilities, manage conflicts with system installs, etc.
  - Verdict: This approach is very labor intensive and can distract you from coding your app, slowing time to market. Avoid it if you can.
- requirements.txt / Pipfile – the standard way to manage dependencies in Python
  - Pros: A shared source (PyPI) eliminates the need to manage each dependency yourself.
  - Cons: Dependencies are constantly being updated, so discipline is required to make sure you’re pinning your package versions and creating a lockfile to prevent newer versions from breaking your app.
  - Verdict: A LOT less painful than vendored dependencies, but it still requires discipline.
- Pre-built distributions – e.g., ActiveState’s Python distribution, ActivePython
  - Pros: Requires no discipline since the vendor has already done the work to make sure your dependencies are compatible, secure, up-to-date, etc.
  - Cons: Rarely updated more than once per quarter; may contain more or fewer packages than you need for your use case.
  - Verdict: If it fits your use case and you can afford the costs, this is the most painless way to deal with dependencies.
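If you go the requirements.txt route, the "discipline" mostly comes down to noticing entries that are not pinned to an exact version. Here is a minimal sketch of such a check in plain Python. The function name `find_unpinned` and the sample file contents are hypothetical, and the pattern only covers the common `name==version` form, not every requirement syntax pip accepts.

```python
import re

# An exact pin looks like "requests==2.31.0", optionally with extras and an
# environment marker. Wildcard pins such as "==1.26.*" are rejected on purpose.
_PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;*]+\s*(;.*)?$"
)


def find_unpinned(requirements_text):
    """Return the requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r base.txt".
        if not line or line.startswith("-"):
            continue
        if not _PINNED.match(line):
            unpinned.append(line)
    return unpinned


# Hypothetical requirements.txt contents, for illustration only.
example = """\
requests==2.31.0
pandas>=1.0
flask
numpy==1.26.*
"""

print(find_unpinned(example))
```

Running a check like this in CI is one way to catch a loose version range before it turns into a surprise upgrade.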
Tools for Managing Python Dependency Conflicts
Regardless of which of the above methods you choose to help manage dependency compatibility, you’ll also need a way to manage dependency conflicts. The best practice is to make sure you’re creating a unique virtual environment for each project. Virtual environments ensure the Python packages you’re installing won’t conflict with any of your other projects, or (more commonly) with your system-level resources. Python provides a number of tools for creating virtual environments, including:
- Virtualenv – the classic way
  - Pros: Tried and trusted.
  - Cons: You need to pip install everything you require; the more environments you create, the harder they are to manage.
  - Verdict: Still works well, but requires more manual work than pipenv.
- Pipenv – the new Python community standard, a single app that combines both virtualenv and pip
  - Pros: Creates a virtual environment and populates it with everything you need from a requirements.txt file. Supports Pipfile.lock, which contains a fully resolved dependency tree for your project, ensuring deterministic builds.
  - Cons: It can run slowly, and currently supports only a single workflow.
  - Verdict: Offers the best way to combat Python dependency issues and ensure environment reproducibility.
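To make the "one environment per project" idea concrete, here is a small sketch that uses `venv`, the standard library module that provides the same kind of isolation as the tools above. It stands in for virtualenv and Pipenv here only because it needs no extra installation. The helper names `create_project_env` and `env_python` and the `.venv` directory name are assumptions chosen for illustration.

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir):
    """Create an isolated environment in <project_dir>/.venv and return its path."""
    env_dir = Path(project_dir) / ".venv"
    # with_pip=False keeps this sketch fast; use with_pip=True to bootstrap pip too.
    venv.EnvBuilder(with_pip=False, clear=True).create(env_dir)
    return env_dir


def env_python(env_dir):
    """Return the expected path of the interpreter inside the environment."""
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    exe = "python.exe" if sys.platform == "win32" else "python"
    return Path(env_dir) / bin_dir / exe


# A temporary directory stands in for a real project folder.
with tempfile.TemporaryDirectory() as project:
    env = create_project_env(project)
    print((env / "pyvenv.cfg").exists())
    print(env_python(env))
```

Each project gets its own interpreter path and its own site-packages, so installing a package for one project leaves the others (and your system Python) untouched.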
The easiest way to create a virtual environment and manage your dependencies is to install Python 3.9 from ActiveState, which will automatically install Python 3.9 into a virtual environment for you so you can start working on your project right away.
Advanced Dependency Management
Unfortunately, most real-world Python projects are more complicated than those supported by pip tools. For example, pip tools currently provides no way to:
- Distinguish packages from the dependencies they pull in
- Create a set of dependencies and versions that are known to be compatible
- Update a virtual environment efficiently with a single command
- Support multiple configurations of dependencies that aren’t just subsets of the original. For example, one configuration for each environment: development, test/CI/CD, production, etc.
If you require any of these features, consider the ActiveState Platform, which offers advanced dependency management for Python projects.
Distinguish Packages from Dependencies
Packages refer to those Python libraries that are required to run your application. Dependencies are libraries that the packages require to be able to run. There are other dependencies, as well, including transitive dependencies (which are dependencies of dependencies) and Operating System or OS-level dependencies (which are dependencies required to run the application on the specified OS).
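You can get a rough feel for the difference between direct and transitive dependencies using only the standard library. The sketch below reads the requirements that installed distributions declare via `importlib.metadata` and walks them recursively. The function names are hypothetical, environment markers other than extras are ignored, and OS-level dependencies are out of reach for this approach, so treat it as an approximation.

```python
import re
from importlib import metadata


def direct_requirements(package):
    """Names of the distributions an installed package declares as requirements."""
    try:
        declared = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return []  # not installed, so there is nothing to inspect
    names = []
    for requirement in declared:
        # Skip optional extras such as 'pytest; extra == "test"'.
        if re.search(r"extra\s*==", requirement):
            continue
        # Keep only the project name, e.g. "urllib3" from "urllib3<3,>=1.21".
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement)
        if match:
            # Normalize the name so "Foo_Bar" and "foo-bar" compare equal.
            names.append(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


def split_dependencies(package, lookup=direct_requirements):
    """Return (direct, transitive) sets of dependency names for a package."""
    direct = set(lookup(package))
    transitive = set()
    visited = set()
    stack = list(direct)
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        for dep in lookup(current):
            if dep not in direct and dep != package:
                transitive.add(dep)
            stack.append(dep)
    return direct, transitive


# Hypothetical dependency data, so the output does not depend on what is installed.
fake_index = {
    "app": ["pandas"],
    "pandas": ["numpy", "python-dateutil"],
    "python-dateutil": ["six"],
}
direct, transitive = split_dependencies("app", lookup=lambda name: fake_index.get(name, []))
print(sorted(direct), sorted(transitive))
```

To inspect a real environment, call `split_dependencies("some-installed-package")` without the `lookup` argument.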
For each Python project the ActiveState Platform provides:
- A complete Bill of Materials (BoM) view showing:
  - The Python version
  - Packages
  - Direct dependencies, as well as transitive dependencies
  - OS-level dependencies
  - Shared dependencies (e.g., OpenSSL)
As a result, you can easily distinguish between the top-level packages you explicitly included in your project and the dependencies that were pulled in automatically.
Dependency Resolution
Python dependency resolution refers to a package manager’s ability (pip’s, for example) not only to pull in all the dependencies required by the packages in your project, but also to ensure they are all compatible with each other. Unfortunately, pip did not accomplish this before version 20.3, which made a backtracking dependency resolver the default, and even with that resolver, conflicts that have no valid solution still have to be untangled by hand.
By way of comparison, the ActiveState Platform will not only resolve all dependencies to ensure they work together but will also flag unresolvable conflicts and even suggest a manual workaround to the problem.
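To see what "compatible with each other" means in practice, consider two packages that both constrain the same dependency. The sketch below is a deliberately simplified check, not how pip or the ActiveState Platform resolve dependencies: it only understands plain dotted-integer versions and the basic comparison operators, while real version handling follows the much richer PEP 440 rules. The package names and version numbers are hypothetical.

```python
import operator
import re

_OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}


def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3). Only plain dotted integers are handled."""
    return tuple(int(part) for part in text.strip().split("."))


def _padded(a, b):
    """Zero-pad both tuples to the same length so that 1.0 equals 1.0.0."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def satisfies(version, specifier):
    """Check a version against a comma-separated specifier such as '>=1.0,<2.0'."""
    for clause in specifier.split(","):
        match = re.fullmatch(r"\s*(==|!=|>=|<=|>|<)\s*([0-9.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported specifier clause: {clause!r}")
        op, bound = match.groups()
        left, right = _padded(parse_version(version), parse_version(bound))
        if not _OPS[op](left, right):
            return False
    return True


def compatible_versions(available, specifiers):
    """Return the versions from `available` that satisfy every specifier."""
    return [v for v in available if all(satisfies(v, s) for s in specifiers)]


# Hypothetical scenario: two packages in one project constrain the same dependency.
available = ["1.0.5", "1.1.5", "1.2.0", "1.3.0"]
constraints = {"package-a": ">=1.2", "package-b": "<1.2"}
print(compatible_versions(available, constraints.values()))  # nothing fits: a conflict

# Upgrading package-b to a release with a looser bound removes the conflict.
constraints["package-b"] = "<2.0"
print(compatible_versions(available, constraints.values()))
```

An empty result is the situation a resolver has to report as a conflict; a non-empty one gives it candidates to choose from.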
Ready to see for yourself? You can try the ActiveState Platform by signing up for a free account using your email or GitHub credentials. Or sign up for a free demo and let us show you how you can better manage dependencies.
Watch this video to learn how to use the ActiveState Platform to create a Python 3.9 environment, and then use the Platform’s CLI (State Tool) to install and manage it.
Following the instructions and selecting a more recent version of pandas will resolve the conflict. Note that when upgrading a package, the ActiveState Platform will show you all the cascading configuration changes so you can know the ramifications of any change to your environment BEFORE you commit to the update.