Monday, April 17, 2017

Big Data: Toward A Richer Social Science — Social Physics

    To build living labs that produce this sort of dense, continuous measurement, new legal and software tools had to be developed to protect the rights and privacy of the people in these labs, to ensure that they are fully informed about what is happening to their data, and to guarantee that they keep the right to opt out at any time. These 'big data' solutions, originally developed for human subjects research, have played an important role as examples in government policy debates over personal privacy, and have helped to shape both the US Consumer Privacy Bill of Rights and the EU Data Protection acts.

  • The approach taken is to give participants direct legal control over sharing personal data not only with researchers but also among themselves and with commercial and civic entities. The participants 'own' the data being collected, in the sense that they have legal rights of ownership and can control where it goes. This has an interesting consequence for social and medical science: since dense data is being continuously collected into the users' personal data stores, conducting a new experiment simply requires obtaining informed consent from the participants. The time and cost of recruiting new subjects and making new measurements are zero, since the participants and their data already exist, and as a consequence both the cost and the time required to conduct a new experiment are cut dramatically.


  • Who Owns The Data In A Data-Driven Society?

    How do you get the data out of those silos? The first step is to figure out who owns that data. Does the telephone company own it, just because it happened to be collected while you were walking around with your phone? Maybe they have some right to use it. But the view emerging from discussions among all the participants, including the telephone companies, is that you're the only one who has final disposal of it. They would have the ability to keep copies to offer services that you've requested, but you, the individual, have to have the final say.

    Some situations are, of course, more complex. What if the data is a transaction with a merchant? Well, they have a right to the data too. But by assigning rights of ownership to people (which is not exactly the same as legal ownership), you make it possible to break data out of the silos. You've turned it into a personal asset that can then be shared for value in return. You can make it a liquid asset that can be used to build government systems, social systems, or for-profit systems. That's the world we're moving towards.

    Is there opposition to this? Surprisingly little. The incumbents on the Internet are probably the major opposition because (and I don't mean to pick on them) Facebook and Google grew up in a completely unregulated environment. It is natural for them to think that they have control over the data, but now they're slowly, slowly coming around to the idea that they're going to have to compromise on that.

    However, the people who have the most valuable data are the banks, the telephone companies, and the medical companies, and those are very highly regulated industries. As a consequence they can't really leverage that data the way they'd like to unless they get buy-in from both the consumer and the regulators. The deal that they've been willing to cut is that they will give consumers control over their data in return for being able to make them offers about using their data.

    That gets these companies out of the regulator's pocket. It gives them a white hat, because they explicitly asked you if you wanted to opt in, and it lets them make money, which is what they desperately want. And it appears that if you treat people's data in this sort of responsible manner, people will willingly share their data. It is a win-win-win solution to the privacy problem, and it's the companies that grew up in an unregulated environment, or the companies that are in gray markets that are likely to dry up, that are most strongly opposed.

    What we are beginning to see is services that leverage personal data in this sort of respectful manner: services such as really personal recommendations, identity certification without passwords, and personal public services for transportation, health, and so forth. All these areas are undergoing tectonic changes, and the more that we can use specific data about specific people, the better we can make the system work.

    These dramatic improvements in societies' systems go back to what I was saying earlier. Today societies' systems are built on big averages and indices, e.g., this class of people does this and this market's moving that way. But really, it's all made up of millions and millions of small interactions, and with Big Data we can get down and design things that really work for us on a personal level, rather than just being treated as another type A4 consumer.
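
    The participant-controlled sharing described above (personal data stores, informed consent, and the right to opt out at any time) can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the software used in the living labs: the class `PersonalDataStore`, its method names, and the requester name `study_a` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalDataStore:
    """Hypothetical per-person store: the owner decides who may read what."""

    owner: str
    # category name -> list of collected values
    records: dict = field(default_factory=dict)
    # requester name -> set of categories the owner has consented to share
    grants: dict = field(default_factory=dict)

    def collect(self, category, value):
        """Append a new measurement to the owner's own store."""
        self.records.setdefault(category, []).append(value)

    def opt_in(self, requester, category):
        """Record the owner's consent for one requester and one category."""
        self.grants.setdefault(requester, set()).add(category)

    def opt_out(self, requester, category=None):
        """Withdraw consent: one category, or everything if none is given."""
        if category is None:
            self.grants.pop(requester, None)
        else:
            self.grants.get(requester, set()).discard(category)

    def read(self, requester, category):
        """Return a copy of the data only if consent is currently in place."""
        if category not in self.grants.get(requester, set()):
            raise PermissionError(f"{requester} has no consent for {category}")
        return list(self.records.get(category, []))


# Usage: data already exists in the store, so a new study only needs consent.
store = PersonalDataStore(owner="alice")
store.collect("location", "cafe")
store.opt_in("study_a", "location")
shared = store.read("study_a", "location")

# The owner keeps the final say and can withdraw at any time.
store.opt_out("study_a")
```

    In this sketch a new experiment costs nothing to "recruit" for: the data is already in the store, and the only step left is the `opt_in` call that stands in for informed consent. After `opt_out`, any further `read` by that requester raises `PermissionError`.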

