The wonderful, but so far uncharted world of personal data

The term “Personal Data” is everywhere, but almost nobody knows anything about it. How can that be?

Let’s start with what we do know:

We know about “government data”. Well, at least the government does. It is where information technology originated: for the census, for taxation and so forth. The government collects data about people (and lots of other things), manages it, analyzes it, and sends tax bills :-) Is this data personal? No. It’s about people, but there’s nothing personal about it. Governments certainly know how to use government data.

We also know about “business data”. It’s the data that businesses collect, manage, analyze etc. to help the business do its business thing: generate sales, maintain profits, grow market share, etc. Lots of people have lots of experience with business data. Is this data personal? No. It might be about people, such as customers and employees, but there’s nothing personal about it either. Businesses certainly are quite competent with business data.

So “personal data” is the data that people collect and manage and analyze and so forth, for their own personal reasons: to make their own lives better in some way. And, I posit, collectively we know very little about it. Here are some examples that we do know something about:

  • Personal e-mail. We do have some idea how people use their personal e-mail, what they keep and what they delete, how they search for old e-mail and so forth.
  • Hard drives on “personal computers” owned by people for personal (rather than work) purposes. But even there we know few details: what files people store, how they organize them and why, how we could help them get more value out of what they have, etc.

But that’s about where it ends. We don’t know how people manage their tax returns. Their photos. Their receipts. Their travel histories. Their appliance maintenance records. Their kids’ allowances. Or pretty much anything else.

I claim that all the other data that’s often described as “personal data” (like my health records) isn’t actually “personal data”: it is “business data”, because it is collected and used by a business to further its own objectives. The only reason we’d ever think of it as “personal data” is that we have so little data that’s truly personal, as defined above, and we’d like to have so much more. It’s a euphemism of the “well, it isn’t a democracy, but at least he’s our goon” kind: the business says it’s in business to serve us, so we call its data about us “personal data”.

About truly “personal data”, we know very little. Just how little becomes clear when you realize that the following questions are absolute no-brainers for business and government data, but completely unclear for personal data:

  • Where does personal data get stored? (so it doesn’t become business or government data, just like businesses wouldn’t want to store all their data with government or vice versa)
  • What overall structure does personal data have? Is it files and directories like on a PC? Or relational tables like in business applications? “Unstructured” data like e-mail mailboxes? Some combination?
  • Where’s the personal “master data” and how is it managed?
  • How does personal data get governed?
  • How does personal data get maintained? integrated? analyzed? purged?
  • And finally, how does personal data ideally get used? By the person whose data it is, for their own purposes, whatever they are?

Once we have solved those, and a few more, and have good software to support them, the world will be entirely different. Then we’ll have the right to talk about “personal data”. Before that, it’s just a euphemism.