Today data is owned by corporations, governments, and research institutions. But it is generated by users, citizens and participants. Perhaps they should own it.
A million workers working for nothing
You better give ’em what they really own
We got to put you down
When we come into town
John Lennon
© Sony/ATV Music Publishing LLC
The business models of tech giants like Google, Facebook and Amazon are driven by data. Their huge profits are only possible because people relinquish the rights to the data they generate. In a recent article, we argued that an alternative to corporate or governmental ownership of data is to have participants retain ownership. Under this model, data become an asset that participants allow researchers or corporations to license – either in the interests of the public good or for compensation. Participants would build a personal data warehouse. As time progresses, the value of the data asset would grow with its extent. Researchers or corporations might then offer compensation for a given type of data, and participants would consent on a case-by-case basis. The researchers would be purchasing the right to analyze the data, not the data itself, so the participant remains free to participate in other studies and to earn additional compensation from other researchers for the same data.
While this proposal requires a shift in the way in which researchers and corporations understand their relationship to data, it has a number of advantages:
(a) participants would make decisions about the use of their data on a case-by-case basis – a form of dynamic consent (Kaye et al., 2015; Williams et al., 2015). Researchers would state in their request how the data would be used, and participants would give their consent with this use in mind. Ethics boards, advocacy groups and governments are also likely to play roles in deciding which projects are appropriate, but we would argue that in most cases participants should retain the right to control their data.
(b) participants would be incentivised to curate their data to ensure it is as complete as possible as this would make it more likely to be requested. Missing data is a significant problem, so any dynamic that engages participants is desirable.
(c) currently, people’s understanding of the relative value of data and the privacy implications of allowing others to access it is rudimentary. If participants retain ownership of their data and participate in a data marketplace, they will come to understand which kinds of data are most valuable both to them collectively and to researchers. The promise then is that a more nuanced understanding of privacy will emerge.
(d) the weak link with current open repositories is the time between the publication of the paper and the posting of the data. Well-meaning researchers struggle to format, document and post their data. Publication standards aimed at sharing will certainly affect this tendency; however, this requires surveillance and enforcement. By contrast, in the approach advocated herein, the data would be submitted directly to the repository by the participant. Using a key published with the paper, a replicator can immediately access the data set (with the permission of the participants) without additional processes, thus removing a key impediment to sharing (cf. born open, Rouder, 2016).
(e) participant ownership of data may lead to increased engagement in and understanding of the scientific process – common objectives in citizen science projects (Bonney, Shirk, Phillips, Wiggins, Ballard, Miller-Rushing & Parrish, 2014).
(f) in many cases, the data that is most valuable to researchers belongs to members of special populations who are commonly financially disadvantaged. Ensuring they are able to retain ownership of their data could provide a supplemental income to people with financial needs. If this mechanism is to work to a substantial degree, the data should be seen as capital for rent, not as labour, as is the case with internet work providers such as Amazon Mechanical Turk or Prolific Academic.
(g) more generally, if everyone is able to build a data asset, then the income derived from this asset might be an alternative to instituting a Universal Basic Income (UBI). A UBI is a form of welfare and as a consequence will be opposed by many people. It is also extremely costly to implement at scale. By contrast, recognizing people’s data rights provides a mechanism by which people can acquire capital to generate income that does not rely upon the wealth of their families or their skills and abilities. It is not a form of welfare but rather a market-driven redistribution of wealth.
(h) the democratization of data rights would also make it easier for researchers to access large data sets. Right now the tech giants act as gatekeepers to the biggest sets and tend to guard their hoards jealously. If people retained the rights to their data, they would be free to lease it to a wide range of researchers.
A key requirement of the vision that I have laid out above is the ability to process data without having direct access to it, since direct access would allow people to copy it. In future posts, I will discuss the Private language that we are developing that provides this capability.
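To make the requirement a little more concrete, here is a toy Python sketch of the general idea. It is not the Private language and says nothing about how that language works. Every name in it (`DataVault`, `ALLOWED_STATISTICS`, the request ids) is hypothetical and invented purely for illustration. The participant's records stay inside the vault, the participant consents to each request individually, and the researcher receives only a summary statistic chosen from a fixed list.

```python
from statistics import mean

# Hypothetical whitelist: the only summaries a researcher may ask for.
ALLOWED_STATISTICS = {"mean": mean, "count": len, "max": max}


class DataVault:
    """Illustrative participant-owned store; raw records are never returned."""

    def __init__(self, owner):
        self.owner = owner
        self._records = {}       # kind of data -> list of values
        self._consented = set()  # request ids the owner has approved

    def add(self, kind, value):
        self._records.setdefault(kind, []).append(value)

    def consent(self, request_id):
        # Consent is given per request, i.e. case by case.
        self._consented.add(request_id)

    def run(self, request_id, kind, statistic):
        if request_id not in self._consented:
            raise PermissionError(f"{self.owner} has not consented to {request_id}")
        if statistic not in ALLOWED_STATISTICS:
            raise ValueError(f"{statistic} is not an allowed statistic")
        values = self._records.get(kind, [])
        if not values:
            raise LookupError(f"no {kind} data recorded")
        # Only the summary leaves the vault, not the underlying values.
        return ALLOWED_STATISTICS[statistic](values)


vault = DataVault("participant-1")
for steps in (4000, 6000, 8000):
    vault.add("daily_steps", steps)

vault.consent("study-A")
result = vault.run("study-A", "daily_steps", "mean")
print(result)
```

A request from a study the participant has not approved raises `PermissionError`. A sketch like this only illustrates the shape of the interaction: a summary over very few records can still reveal the records themselves, so a real system would need much stronger guarantees than a whitelist.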
We sit at a juncture. Our understanding of privacy and data rights is developing rapidly. We have the ability to establish the patterns which will guide data usage into the future. Now is the time to act. Power to the people!