The Challenges of Open Data — example: Digital Identity


Last week, I posted about why forcing identity data into name-value pairs is an architectural dead end. Of the many comments I received, those from Phil Hunt and Mark Wilcox in particular turned out to warrant a much more detailed response than I initially expected. I realized that they raise a much broader topic, one that could be called "The Challenges of Open Data", applied here to the example domain of digital identity. I hope to get around to writing an article on the general case some time soon, but for now I’ll focus on digital identity data. Because that is already complex enough, I have broken my thoughts down into multiple posts. They will be published one at a time over a few days and, for convenience, linked from here.

I’m rephrasing the points that were made as questions (and hope I don’t miss anything really important), and I’m also adding a few related questions.

  • “How is identity data different from other kinds of data, such as transactional data? Where does one start and the other end? What overlaps are there?” (go to separate post)
  • “What is a good way of thinking about the (conceptual) structure of identity information? Is it RDF? Is it name-value pairs? Is it SQL? Is it … [long list of potential candidates].” (go to separate post)
  • “Given that LDAP seems to work for identity data in many use cases, where does the need for more complex structures for identity data arise, and what are those more complex structures?” (go to separate post)
  • “Applications written against LDAP directories from one vendor often do not work against LDAP directories of other vendors. If we want to build a ubiquitous identity layer on the internet, how are we going to solve this problem?” This question is really about how to deal with multiple ontologies of identity data. (go to separate post)
  • “What is the best way of representing identity information for the purpose of storage and retrieval?” (go to separate post)
  • “What is the best way of representing identity information during exchange on the open internet?” (go to separate post)
  • “Can a name-value-pair-based representation of identity information be ‘fixed’ with a few additional conventions that add a lot of power? E.g., by extending the allowed values (in the name-value pairs) to be ‘pointers’ to other name-value pairs?” (go to separate post)
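
To make the last question a little more concrete, here is a minimal sketch of what "values as pointers" could look like. Everything in it is an assumption made for illustration: the `Ref` type, the `resolve` helper, the record identifiers, and the sample data are hypothetical and do not come from LDAP or any particular identity system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical value type: points at another record instead of holding a literal."""
    target: str


# Hypothetical store: record id -> name-value pairs.
# The "employer" value is a pointer to a second set of name-value pairs.
store = {
    "person:alice": {"name": "Alice", "employer": Ref("org:acme")},
    "org:acme": {"name": "Acme Corp", "city": "Springfield"},
}


def resolve(store, record_id, path):
    """Follow a sequence of attribute names, dereferencing Ref values along the way."""
    value = store[record_id]
    for name in path:
        value = value[name]
        if isinstance(value, Ref):
            # The value is a pointer: continue from the record it points at.
            value = store[value.target]
    return value


employer_city = resolve(store, "person:alice", ["employer", "city"])
print(employer_city)
```

With this one convention, a flat record can reach data held in another record, which is the extra power the question hints at. Whether that is enough to "fix" name-value pairs is exactly what the linked post has to answer; the sketch takes no position on it.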