The upside of sharing personal data


We had a rather interesting lunch yesterday at the MyData Silicon Valley Hub, on the occasion of a visit by Julian Ranger, board member of MyData Global and founder of personal data startup digi.me. Julian offered a striking proposition that I had not heard before. It is very much worth writing about, starting with some context:

Most discussions of personal data issues tend to focus on the perils: personal data gets leaked and privacy is compromised, unsavory actors use it to harm people, governments use it to control and repress, and so forth. So it has become common wisdom that we should do what we can to discourage and reduce the collection, storage and use of personal data by third parties. That is of course all true.

But during lunch yesterday, the discussion soon moved to the opposite: the opportunities created by using more personal data in more places. Those opportunities center on a simple insight: with more data, more things are possible than with less. That, of course, is why everybody in their castles in the cloud is so hell-bent on getting as much of our personal data as they can. There is a lot of value to be derived from personal data.

For example, the Netflix recommendation engine can only recommend movies I might like if it has access to a bunch of data about me. Currently it knows only how I have interacted with other movies on Netflix, so the algorithm can only ever be about as good as it is now. If it had more data about me from outside of Netflix (say, which books I read, where I have traveled, or my feelings about romantic relationships) it could be more accurate. This would help Netflix’s business, but it would also help me: I would spend more of my leisure time watching a movie I’m going to like, and less scrolling through endless listings I don’t. (Actually, this is a hypothetical, as I essentially stopped watching movies years ago. Somehow they all have the same plot, and they bore me. But back to the example.) Here’s a really compelling one: if I had post-traumatic stress disorder from trauma in an armed conflict, and Netflix knew it, it could avoid recommending movies that are likely to trigger an episode. That would be a huge benefit for PTSD sufferers! You can come up with lots of similar examples.

But of course, if Netflix suddenly started to collect (or somehow obtain) our medical and psychological conditions, we would be aghast, and justly so. Intimate details are none of the business of a movie-showing company, and it should not be permitted to access them, we would say.

(To be clear, this kind of analysis applies not just to Netflix, but to pretty much any company that can improve their business, and/or the value of their offerings to their customer, using data. And that, today, is basically everybody. It might even include governments. But let’s stick with the Netflix example.)

Now look at the problem from Netflix’s point of view. At this point, they have more or less done everything they can do to make the movie recommendation algorithm as good as possible for their business, and as pleasurable for me, given the data they have about me. To improve further, they have to get more data. And to do that, I think they have essentially three choices:

  • follow the playbook of Facebook and Google: come up with a bunch of adjacent products that collect data about me from all around the web, not just on the Netflix site. Facebook has done that with the Like button and Login with Facebook; Google with its ad networks, Analytics and Android. Netflix could then correlate that data with the core data it already has, and make its algorithm better. This strategy still works, but this kind of involuntary personal data collection under the guise of something else is exactly what consumers hate, and the resistance is arming itself with ad blockers and privacy legislation. It no longer looks like a very promising strategy.
  • purchase or exchange personal data with other companies that have complementary data sets about me. This approach largely has the same downsides as the previous one, and in addition, much of the interesting data is not actually for sale because other companies consider it their proprietary advantage.
  • and here comes Julian with a third option: they could ask the user nicely for it. Make a proposition to the customer that they can accept or reject at their sole discretion. If the customer accepts, all is well, the algorithm gets better and everybody is happy. If the customer does not, everybody is in the same position as before, so nothing has been lost. But because the customer is in control of this decision, the company can no longer get away with dirty personal data management practices and has to clean up its act. Otherwise word spreads, customers will not provide their data, and the product will lose competitiveness against any recommendation engine that does treat its customers well enough that they feel safe providing their data.

In Julian’s (paraphrased) words, and Julian, please correct me if you read this and I misrepresent you: the big companies are going to be forced to get the approval of the user to use personal data, otherwise their data-based product innovation grinds to a halt, starting about now.

This is interesting, isn’t it?

So how would this work practically? Users aren’t going to type a ton of personal data into each of the websites they use. Even if they did, that data would quickly be out of date and lose its advantage. Enter another piece of the puzzle: users increasingly have the right to obtain their personal data from companies that store or use it.

I said earlier that much of the data that would be useful to a company like Netflix is not available for sale. Why would YouTube tell Netflix what kinds of movies I watch there? Nobody likes to share customer data with a competitor.

But now the GDPR, with the CCPA right behind it, makes that data available in electronic form. Not to Netflix, but to the user, and the user has the right to do with that data whatever they please. So they could choose to make their YouTube watching data available to Netflix, while refusing to provide their Netflix watching data to YouTube.

Why would they do that? Because the user might trust Netflix to respect their wishes, but not YouTube. (Or the other way around, this is just an example.) They might provide both to their local library, to be told what related books they recommend. And, if they trust their librarian, they might send over their electronic medical record, too, to get the right books for some of their ailments.
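The flow described above, obtaining one's own data via a portability request and then granting each recipient access at one's sole discretion, can be sketched in a few lines of code. This is purely illustrative: the class and method names (`PersonalDataStore`, `grant`, `revoke`, `read`) are hypothetical and do not correspond to any real API or product.

```python
# Hypothetical sketch of a user-held data store that mediates which
# companies may read which categories of exported personal data.

class PersonalDataStore:
    def __init__(self):
        self._data = {}      # category -> data payload
        self._grants = {}    # recipient -> set of granted categories

    def import_export(self, category, payload):
        """Load data obtained via a portability request (e.g. under the GDPR)."""
        self._data[category] = payload

    def grant(self, recipient, category):
        """The user decides, at their sole discretion, to share a category."""
        self._grants.setdefault(recipient, set()).add(category)

    def revoke(self, recipient, category):
        """The user can withdraw access again at any time."""
        self._grants.get(recipient, set()).discard(category)

    def read(self, recipient, category):
        """A recipient sees data only if the user has granted access."""
        if category in self._grants.get(recipient, set()):
            return self._data.get(category)
        raise PermissionError(f"{recipient} has no grant for {category}")


store = PersonalDataStore()
store.import_export("youtube_watch_history", ["documentary", "cooking"])
store.import_export("netflix_watch_history", ["thriller"])

# Share YouTube history with Netflix, but not the other way around.
store.grant("netflix", "youtube_watch_history")

print(store.read("netflix", "youtube_watch_history"))
# prints ['documentary', 'cooking']
```

The point of the sketch is the asymmetry: the grants live with the user, not with either company, so the same exported data can flow to a trusted recipient while being withheld from an untrusted one.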

Voila, I have presented you with 1) more use of more personal information in more places, for more innovation and better results for consumers and companies, while 2) forcing companies to respect the privacy-related wishes of their customers. In the same, inseparable package.

There is a lot more thinking to be done, of course, and much product development, before this vision becomes practical. Zero-knowledge proofs and the like might also have a big role to play. And the biggest obstacle of all would be changing the attitudes and behaviors of today’s big surveillance-capitalist companies.

But here is a plausible scenario for how it is in their own interest to clean up their act. Very intriguing …