On-line and Batch Digital Identity Assertions

[There may be better names, but I can’t think of any right now.]

If we assume for a second that a digital identity comprises, at the least:

An identifier for the subject (e.g. my driver’s license number)
A set of assertions (or claims) about the subject (e.g. my name, address and photo)

then there are two ways of implementing this:

one can come up with a "package" for them, such as a file or a card, that contains both the identifier and the assertions, so that all the information is sent around together. (Traditional driver’s licenses do that: serial number, address and so forth are all part of the same physical card.)
one can send only the identifier, and provide an on-line query capability to determine the assertions. (In such a scenario, I would memorize my driver’s license serial number and, when a cop asked for it, they would query the DMV server on-line for my address and photo, using the serial number as the query key.) [For non-Californians: the DMV is the government entity that issues driver’s licenses around here.]
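
Here is a minimal sketch of the two approaches in Python. All names in it (BatchIdentity, AssertionServer, the license number, the addresses) are made up for illustration; it is not how LID or any real DMV works, just the shape of the difference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchIdentity:
    """Hypothetical 'package': identifier and assertions travel together."""
    identifier: str
    assertions: dict


class AssertionServer:
    """Hypothetical stand-in for the issuer's on-line query service."""

    def __init__(self):
        self._records = {}

    def register(self, identifier, assertions):
        self._records[identifier] = dict(assertions)

    def update(self, identifier, **changes):
        self._records[identifier].update(changes)

    def query(self, identifier):
        # Return a copy so callers cannot modify the issuer's record.
        return dict(self._records[identifier])


dmv = AssertionServer()
dmv.register("D1234567", {"name": "Jane Doe", "address": "1 Old Road"})

# Batch: the package is a snapshot taken when the card is issued.
card = BatchIdentity("D1234567", dmv.query("D1234567"))

# The subject moves after the card was issued.
dmv.update("D1234567", address="2 New Street")

batch_address = card.assertions["address"]             # still the old address
online_address = dmv.query(card.identifier)["address"]  # the current address
```

In the batch case the verifier reads whatever was packaged at issue time; in the on-line case the verifier only needs the identifier and asks the issuer for the rest.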

I have the distinct impression that many people talking about digital identity today implicitly assume it is the former and not the latter: everything gets packaged and sent around for processing when the processor is ready ("batch"), rather than providing information on demand ("on-line"). Note that I do not intend this as a criticism, just as an observation; making it explicit might help us all think through the issues more clearly.

Of course, LID is very much in the "on-line" camp.

Which is better? Of course that depends. For example, in an environment where on-line connectivity is not available, just having an identifier does nobody any good (analogy: the cop in the desert without mobile data services can’t check who I am). However, where on-line connectivity is available, the on-line alternative has many advantages, such as:

The information obtained on-line is much more likely to be up to date (analogy: if I have moved since my driver’s license was issued, chances are much higher that the on-line information is current than that I am carrying an up-to-date "batch" driver’s license. How many times have you been asked: "is this address on your license still correct?").
Every on-line access can be logged, or can be subject to real-time approval. Imagine your driver’s license is stolen by somebody who looks similar to you. They can wreak a lot of havoc in your life before you can stop them. Even after you have stopped them, you can’t identify what havoc they wreaked, because you don’t know what they used your driver’s license for. The on-line version can stop the misuse of the digital identity the minute you realize it has been stolen. And after the fact, you know exactly what it was used for by examining the log.
All the information in the "batch" package can potentially be falsified as soon as the mechanism for packaging it is cracked (the faked passport of so many movies). What isn’t there for anybody to falsify can’t be cracked, and in the on-line version, nothing is there… In my view, things like driver’s licenses and signed digital messages being sent around are inherently easier to falsify than the same information held on a server, which can employ all the same protection techniques and also rely on physical access control and the like.
One can deploy the digital identity (well, the identifier part, but that’s all that needs to be deployed in the on-line case) before one has to decide on the entirety of information that will be available to users of this digital identity. Or, one can make more information available to new classes of users that weren’t even foreseen at the time the digital identity was first issued. For example, if somebody decided that they needed to know not only my current address but also the three addresses before that, the on-line alternative can add those three addresses after the first use of the digital identity without impacting anybody. The batch alternative is far more complex in these cases, as it would require issuing a new driver’s license, for example. That new license would then expose my past three addresses to everybody, including the 99% of people who do not need to know them and should not. (This is why we support the return of different information to different people in LID.)
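
The advantages above (logging, stopping misuse, adding assertions later, returning different information to different people) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class OnlineIdentity, the requester classes ("bar", "lender") and the policy format are all hypothetical and are not the LID protocol.

```python
from datetime import datetime, timezone


class OnlineIdentity:
    """Hypothetical on-line identity record held by the issuer."""

    def __init__(self, identifier, assertions):
        self.identifier = identifier
        self._assertions = dict(assertions)
        self._policy = {}      # requester class -> names it may see
        self._revoked = False
        self.access_log = []   # (time, requester, class, granted)

    def allow(self, requester_class, *names):
        self._policy.setdefault(requester_class, set()).update(names)

    def add_assertion(self, name, value):
        # New information can be added long after the identifier is deployed.
        self._assertions[name] = value

    def revoke(self):
        self._revoked = True

    def query(self, requester, requester_class):
        granted = not self._revoked
        # Every access is logged, whether or not it succeeds.
        self.access_log.append(
            (datetime.now(timezone.utc), requester, requester_class, granted)
        )
        if not granted:
            raise PermissionError("identity has been revoked")
        visible = self._policy.get(requester_class, set())
        return {k: v for k, v in self._assertions.items() if k in visible}


me = OnlineIdentity("D1234567", {"name": "Jane Doe", "address": "2 New Street"})
me.allow("bar", "name")
bar_view = me.query("Some Bar", "bar")  # only the name is returned

# Later: a new class of user needs more than was originally foreseen.
me.add_assertion("previous_addresses", ["1 Old Road"])
me.allow("lender", "name", "address", "previous_addresses")
lender_view = me.query("Example Bank", "lender")

# The identifier is stolen: revoke it, then inspect the log.
me.revoke()
try:
    me.query("Impostor", "bar")
    blocked = False
except PermissionError:
    blocked = True
```

The bar never sees the previous addresses, the lender does, and the attempt after revocation is refused but still appears in access_log.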

Of course there is nothing to prevent systems from implementing both at the same time. We do that, to an extent, in LID already with our VCard export. And I think that is what is happening in the real world:

In the past, the cops may have been satisfied when you showed your driver’s license, but today they also run an on-line query on you, to check whether you are, say, a mass murderer. Same with passports at the borders. Personally, I don’t know which one is trusted more: the "batch" information you present (e.g. your passport) or the "on-line" information (e.g. the INS database). But there is a trend towards on-line here, just as there is everywhere else.

The fact that you are in possession of the passport adds an element of security over just memorizing your passport serial number. But note that putting information such as address and name on it, when on-line connectivity is available, adds nothing. (Side note: I have a very hard time understanding those people who want to put all sorts of biometrics on identity cards. They assume a high-tech environment for reading the card, but somehow don’t assume that basic internet connectivity is available in the same place.)

Within LID, we tend to think on-line first, and serialize into something that can be batch-processed if needed. I like this philosophy …
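
To make "on-line first, serialize if needed" a bit more concrete, here is a rough Python sketch that takes a live record and serializes it into a tamper-evident batch package. It is not LID’s VCard export. The functions serialize and verify, the key and the record contents are hypothetical, and I use an HMAC from the standard library as a simple stand-in for the public-key signature a real issuer would more likely use.

```python
import hashlib
import hmac
import json

# Assumption: a secret known only to the issuer and to verifiers.
ISSUER_KEY = b"demo-key-not-for-real-use"


def serialize(identifier, assertions, issued_at):
    """Snapshot the live record into a package with an integrity tag."""
    payload = json.dumps(
        {"id": identifier, "assertions": assertions, "issued_at": issued_at},
        sort_keys=True,
    )
    tag = hmac.new(ISSUER_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify(package):
    """Check that the payload has not changed since it was packaged."""
    expected = hmac.new(
        ISSUER_KEY, package["payload"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, package["tag"])


live_record = {"name": "Jane Doe", "address": "2 New Street"}
package = serialize("D1234567", live_record, "2005-01-01")
package_ok = verify(package)

# Changing the packaged address without the key breaks the tag.
tampered = dict(
    package, payload=package["payload"].replace("2 New Street", "9 Fake Lane")
)
tampered_ok = verify(tampered)
```

Note that the package is only as trustworthy as the key: once the packaging mechanism is cracked, a forger can produce packages that verify, which is exactly the weakness of the batch approach discussed above.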