Not only does having the right API reduce learning time, it also relieves the development community of the need to have certain debates: design discussions that would have spawned long and contentious mailing list threads simply do not come up. That may not be quite the same thing as pure technical or aesthetic beauty, but in a project with many participants and a constant turnover rate, it’s a beauty you can use.
Beautiful Code, Chapter Two, p. 28
This post presumes some basic experience with crypto and is aimed at enterprise or government Python developers who need to use crypto in their work. It contains some highly opinionated statements, as it reflects an ongoing discussion started at PyCon 2010 in Atlanta. We want your comments.
Cryptography has never been Python’s strong suit. This is somewhat strange, especially given Python’s quickly growing share of the enterprise development market. A choice of high-quality cryptography toolkits is a distinctive mark of a mature enterprise platform, yet Python still has very little to offer. Despite the number of open source and proprietary projects out there, their quality is not sufficient to satisfy a demanding enterprise developer. As an illustration, none of the existing modules can be used to process classified documents in the US or even to file a tax report in Russia: it’s simply illegal.
What matters is not the number of toolkits but their feature coverage and quality. It’s better to have just a few (maybe two) mutually complementary systems than 50 very low-level partial implementations. The most notable candidates for such a “super-toolkit” role were reviewed in the “Python Crypto: State of the Art” posts. The question is whether any of those can fulfill that role. Let’s talk about what an ideal Python Cryptography Toolkit could be.
Architecture: it’s the law
It seems very likely that an enterprise user would be interested in an extensible, layered, configurable architecture with the ability to replace some parts with alternative implementations. There is a good reason for this. Cryptography is a highly regulated topic: to have the law on its side, an organization has to do things exactly as the state regulatory bodies say. For example, in China, a license is required to use any kind of cryptography. In Russia, an electronic document is considered legal only if it was processed according to the GOST R 34.10 standard. In the US, only certain algorithms, such as ECDSA, can be used by the government. In Israel, any use of cryptography is tightly controlled. The vast majority of countries restrict the export of cryptography products; those that have no such restrictions (like Brazil) are either considering or implementing them.
This has one interesting consequence: the structure of local regulations could force one to use different products in different situations for purely legal reasons. Just think what implications the Export Control rules could have for a US-based corporation sending encrypted documents to its overseas branches. Now imagine those branches interacting with local tax agencies in a secure way. Sound like a nightmare?
Another incentive for such a layered architecture could be composability. The chances that a complete toolkit could emerge without massive investment in its development are slim. The solution is an architecture that makes it possible to combine the best features from different low-level toolkits. In particular, this means such an aggregating API is destined to be high-level.
Higher level APIs
One particular way of solving the extensibility problem is a design where low-level algorithms are encapsulated in “providers”, sometimes called “engines”, typically implemented as loadable modules. These providers are accessed indirectly through a layer of abstract higher-level protocol APIs. Instead of calling functions like RSA_private_encrypt() and then a whole bunch of others, one could just call the “sign” method on a Signature object configured to use the RSA scheme. Later on, the application could be reconfigured to use Elliptic Curve cryptography by merely changing the configuration file (instead of rewriting the source code). This property is called “cryptography agnosticism”. Unfortunately, none of the existing Python toolkits featuring a higher-level API (except the pure-Python cryptopy) lets developers create extensions in Python, only in C.
Another role of such an API would be to steer developers toward doing things the right way. Crypto is difficult in the sense that it is very easy to do things wrong: choose the wrong padding scheme or the wrong block mode and you’re in trouble. A good framework should limit direct use of all those brittle primitives so nobody gets hurt.
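The provider idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the API of any existing toolkit: the names SignatureProvider, HmacProvider, register_provider, Signature and the "signature.scheme" configuration key are all hypothetical. Because the standard library has no public-key signatures, HMAC stands in for a real scheme such as RSA or ECDSA; a real provider would wrap a low-level toolkit instead.

```python
import hashlib
import hmac
from abc import ABC, abstractmethod


class SignatureProvider(ABC):
    """Hypothetical provider interface: each provider wraps one low-level scheme."""

    @abstractmethod
    def sign(self, key: bytes, message: bytes) -> bytes: ...

    @abstractmethod
    def verify(self, key: bytes, message: bytes, tag: bytes) -> bool: ...


class HmacProvider(SignatureProvider):
    """Stand-in provider built on the stdlib; a MAC, not a true digital signature."""

    def __init__(self, digest_name: str):
        self._digest_name = digest_name

    def sign(self, key: bytes, message: bytes) -> bytes:
        return hmac.new(key, message, self._digest_name).digest()

    def verify(self, key: bytes, message: bytes, tag: bytes) -> bool:
        # Constant-time comparison avoids leaking how many bytes matched.
        return hmac.compare_digest(self.sign(key, message), tag)


# Hypothetical registry: providers are looked up by the name used in the config.
_PROVIDERS: dict[str, SignatureProvider] = {}


def register_provider(name: str, provider: SignatureProvider) -> None:
    _PROVIDERS[name] = provider


class Signature:
    """High-level object: application code never names an algorithm directly."""

    def __init__(self, config: dict, key: bytes):
        self._provider = _PROVIDERS[config["signature.scheme"]]
        self._key = key

    def sign(self, message: bytes) -> bytes:
        return self._provider.sign(self._key, message)

    def verify(self, message: bytes, tag: bytes) -> bool:
        return self._provider.verify(self._key, message, tag)


register_provider("hmac-sha256", HmacProvider("sha256"))
register_provider("hmac-sha512", HmacProvider("sha512"))

# In a real application this dict would be loaded from a configuration file.
config = {"signature.scheme": "hmac-sha256"}
signer = Signature(config, key=b"demo key, not for real use")
tag = signer.sign(b"quarterly report")
print(signer.verify(b"quarterly report", tag))
```

Changing the configured scheme to "hmac-sha512" switches the algorithm without touching the code that calls sign and verify, which is the kind of reconfiguration the text has in mind.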
Easy-peasy
There are several projects providing “easy wrappers” around low-level algorithm APIs. How useful are these? The problem is that higher-level cryptography APIs are not about “ease”; they are about shifting the focus from algorithms to protocols. Of course, they typically provide auxiliary features such as automatic block padding, but that is much more about interoperability than ease of use. For example, there are different padding schemes for DES; all are correct, all are incompatible. There is nothing in the algorithm itself that dictates which scheme to use, or even a way to indicate which scheme a particular stream was padded with. So the only logical place to encapsulate such protocol features is a higher-level API.
To illustrate my point: there are traffic laws and there are business logistics rules. These two represent different levels of abstraction. The upper level depends on the lower, yet they are separate. The same goes for cryptography: a multiparty document signature scheme, for instance, sits at a higher level of abstraction than hash functions, private keys, etc., even though it depends on them.
So the real problem is that I can’t name a single Python framework that operates at this upper abstraction level: a framework providing substantially more than just object-oriented wrappers around low-level crypto algorithms. I’m fairly convinced it would make more sense to start from scratch and model such an API on either MS CNG or JCA, with all the necessary Python adaptations.
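The padding point made above can be made concrete with a small pure-Python sketch. It pads the same message to the 8-byte DES block size using three common conventions: PKCS#7-style padding (PKCS#5 for 8-byte blocks), ANSI X.923, and plain zero padding. The function names are illustrative, not taken from any toolkit, and no encryption is performed; only the padding step is shown.

```python
BLOCK_SIZE = 8  # DES block size in bytes


def pad_pkcs7(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append n bytes, each with value n (always adds at least one byte)."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def pad_ansi_x923(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append n - 1 zero bytes followed by one byte with value n."""
    n = block_size - len(data) % block_size
    return data + b"\x00" * (n - 1) + bytes([n])


def pad_zero(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append zero bytes up to the block boundary (ambiguous if data ends in zeros)."""
    n = -len(data) % block_size
    return data + b"\x00" * n


def unpad_pkcs7(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Strip PKCS#7-style padding, rejecting anything that does not match it."""
    if not padded or len(padded) % block_size:
        raise ValueError("length is not a positive multiple of the block size")
    n = padded[-1]
    if not 1 <= n <= block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("not valid PKCS#7 padding")
    return padded[:-n]


message = b"HELLO"
for pad in (pad_pkcs7, pad_ansi_x923, pad_zero):
    print(pad.__name__, pad(message))

# The receiver must know which convention the sender used.
print(unpad_pkcs7(pad_pkcs7(message)))
try:
    unpad_pkcs7(pad_ansi_x923(message))
except ValueError as exc:
    print("mismatch:", exc)
```

All three outputs are valid 8-byte blocks, yet a receiver expecting one convention cannot reliably strip another. Nothing in the padded bytes says which scheme was used, so the choice has to be fixed at the protocol level.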
The voice in the desert
All too often, developers using cryptography can be heard saying “Please, please, give me an API with sane defaults that I can use without reading the Applied Cryptography book.”
Are those developers lazy or stupid? Of course not. It’s a reasonable cry for help from the abstraction gap they find themselves stuck in. What it actually means is “I don’t want to implement all those higher-level protocols using the low-level algorithm APIs. That’s none of my business.” Is there someone to hear that cry?