Wednesday 10 August 2011

E's no good - Distinguishing between distinguished names


For most people, changes to the policies and standards that describe how the grid should work are met with a resounding 'so what'. For anyone involved in the day-to-day management of grid systems, it is an opportunity to join a collective sigh of relief.

It is another example of where the 'political' aspects of international research collide with the technical solutions and the needs of researchers who don't give a damn how it works, as long as it lets them do their jobs.

X.509 certificates are complicated because what they represent is complicated - a link in a chain of trust between particular individuals or institutions.

Identities within certificates are tied to Distinguished Names or DNs. A DN is a list of attributes - such as country, institution and personal name - that uniquely identifies a single person, or computer, or service.

The way a DN is stored within a certificate is well-defined but completely incomprehensible to anything that is not a computer program. For many practical purposes, the DN needs to be presented so it can be understood by a person.

A glance at the OpenSSL X509_NAME_print_ex documentation shows how brain-twistingly complicated it can be translating a DN into something that a human being can read.

There is a more detailed explanation on the NGS Wiki. This is the quick tour.

Each individual attribute within a DN has a 'type' and a 'value'.

The type identifies what is being represented - a name, or an email address. It isn't really a name but a unique sequence of numbers called an Object Identifier. Something like: 1,2,840,113549,1,9,1.

People, inexplicably, find sequences like 1,2,840,113549,1,9,1 hard to remember so for our benefit, 1,2,840,113549,1,9,1 is also known as "Email", "emailAddress" and - occasionally - "E".
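As a rough illustration of that many-names-for-one-number arrangement, here is a minimal Python sketch. The table layout and the `oid_for` helper are made up for this post and are not taken from any particular library. The Object Identifier is written in the more usual dotted notation.

```python
# Illustrative alias table: one Object Identifier, several human-friendly labels.
# The structure and the helper below are hypothetical, not any library's API.
EMAIL_OID = "1.2.840.113549.1.9.1"  # same sequence of numbers, dotted notation

OID_ALIASES = {
    EMAIL_OID: ("emailAddress", "Email", "E"),
}


def oid_for(label):
    """Return the Object Identifier a label stands for, or None if unknown."""
    for oid, names in OID_ALIASES.items():
        if label in names:
            return oid
    return None
```

With this table, `oid_for("E")` and `oid_for("emailAddress")` return the same identifier, while a label the table does not know about, such as `"CN"` here, returns `None`.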

The value depends on the type. For 1,2,840,113549,1,9,1 - it is a string of letters represented in what is known as UTF-8. UTF-8 was developed to represent any letter from any language - but most Grid certification authorities deliberately restrict themselves to the 26 letters of the English alphabet, the numbers 0 to 9 and a few symbols. If they didn't, things would rapidly become even more complicated.

In human-friendly form, the DNs that Jens is working to abolish look very much like

 /C=UK/O=eScience/OU=Manchester/L=MC/CN=voms.ngs.ac.uk/Email=support@grid-support.ac.uk
or maybe
 /C=UK/O=eScience/OU=Manchester/L=MC/CN=voms.ngs.ac.uk/emailAddress=support@grid-support.ac.uk
or even, very rarely
 /C=UK/O=eScience/OU=Manchester/L=MC/CN=voms.ngs.ac.uk/E=support@grid-support.ac.uk
Which variant you get depends on which version of which software is processing the certificate.

The problems appear when DNs are compared as strings of letters rather than in what could be called their 'raw' form.

Most software is smart enough to canonicalise these awkward examples by choosing One True Name for 1,2,840,113549,1,9,1 and substituting this before comparing. Not all software packages agree on which name is the One True Name.
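A minimal Python sketch of that substitute-then-compare idea follows. It assumes the slash-separated form shown above, picks "emailAddress" as the One True Name purely for illustration, and uses made-up function names. It is not how any particular grid package does it, and the naive split would break on a value that itself contains a "/".

```python
# Labels that all stand for the same attribute type.
EMAIL_ALIASES = {"Email", "emailAddress", "E"}
CANONICAL_EMAIL = "emailAddress"  # arbitrary choice of One True Name


def parse_dn(dn):
    """Split '/TYPE=value/TYPE=value' into a list of (type, value) pairs."""
    pairs = []
    for part in dn.strip().split("/"):
        if not part:
            continue  # skip the empty piece before the leading '/'
        attr_type, _, value = part.partition("=")
        pairs.append((attr_type, value))
    return pairs


def canonicalise(dn):
    """Replace every email alias with the canonical label."""
    return [
        (CANONICAL_EMAIL if attr_type in EMAIL_ALIASES else attr_type, value)
        for attr_type, value in parse_dn(dn)
    ]


def same_dn(a, b):
    """Compare two human-friendly DNs after canonicalising both."""
    return canonicalise(a) == canonicalise(b)


dn_email = "/C=UK/O=eScience/OU=Manchester/L=MC/CN=voms.ngs.ac.uk/Email=support@grid-support.ac.uk"
dn_e = "/C=UK/O=eScience/OU=Manchester/L=MC/CN=voms.ngs.ac.uk/E=support@grid-support.ac.uk"
```

Comparing `dn_email == dn_e` as plain strings says the two are different, while `same_dn(dn_email, dn_e)` treats them as the same identity. Two programs that each canonicalise internally still disagree as soon as they exchange the canonicalised strings and one of them expects a different One True Name.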

It is now common practice to represent certificate chains in .LSC files - which are simply lists of human-friendly DNs. These are simple to distribute and do not need to be updated every time the certificate is renewed.

That would be good enough - if it weren't for that troublesome email address.
