Public Lab Research note

Thoughts on a model for community data enclosure

by warren | December 06, 2020 19:09 | #25177

I wrote a long-winded reply to @liz's nice tweet positing an RFP process that communities could run to solicit and evaluate potential helpers or harmers.

Just pasting my response below because it was so long that it's probably easier to read in this format! Thanks!

Hi all, returning here after giving it a think!

This gets at a broader idea of restructuring power and perspective that disrupts the subject/researcher frame. Imagining an RFP issued by a community not only makes researchers the ones who must jump through hoops and do their homework, but also changes the structure of consent by asserting community agency.

Often, health researchers (or technologists, or even activists) rely on their own structures of ethical consent, like institutional review boards (IRBs), to enter into partnerships. But although IRBs are important and well-intentioned, they focus on a single moment of informed consent, and then operate within those bounds.

The problem with this is that the benefits and harms of a project may evolve over time (and so may our understanding of them). In addition, the parties involved may not speak for the whole range of stakeholders who may be affected by the work.

I think researchers can feel stuck because they lack the ethical tools or frameworks to approach this problem, or even think about it.

Community data, or individual data, currently enters a “consented-to space” which is controlled by researchers bound by ethical standards and practices. These can be quite strict, limiting the use of data to only a particular study and requiring re-consent for additional direct work with the data (I say direct because of course people can cite the study in other studies, or do meta-studies).

I think new models are required to address this, and to recognize the importance of ongoing relationships and ongoing trust-building for the lifetime of the data. In the WhereWeBreathe project I participated in at @PublicLab, we prototyped a means of locking researchers out of health datasets using an architecture of data management which granted access only to those electing to share their data.

Researchers would be allowed no access to raw data; they could only submit requests to visualize the data, and see the outcome of the visualization. Individuals could accept such requests and grant access, adding their data to the pool.

This brought up significant challenges to data analysis and study design, including questions of priming: whether contributors (those called “subjects” in a typical study design) are able to see the effect of their inclusion on the generated visualizations might impact the outcomes in many or most study designs.

Contributors could also remove their data from the pool at any time, which would not affect already-generated visualizations, only future ones. And even more challenging was the control of privacy and anonymity such a system might or might not allow for participants.

But importantly, it structured the data collection as primarily a process serving contributors themselves, rather than researchers. Contributors had full access to one another’s data, and could use the system to compile a “dossier” of their own data for evidential use, for example in court, or to present to journalists. Researchers would be forced to build trust with contributors in order to gain and retain access to the data.
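To make the request-and-grant flow described above a bit more concrete, here is a minimal Python sketch. It is not the WhereWeBreathe code, just an illustration under my own assumptions: the names `DataPool`, `VisualizationRequest`, `grant`, `withdraw`, and `dossier` are hypothetical, a simple mean stands in for a "visualization", and privacy and anonymity are not addressed at all.

```python
from dataclasses import dataclass, field


@dataclass
class VisualizationRequest:
    """A researcher's request to visualize pooled data (hypothetical shape)."""
    researcher: str
    description: str
    granted_by: set = field(default_factory=set)  # contributors who accepted


class DataPool:
    """Holds contributor data; exposes no method for reading others' raw data
    to researchers, only aggregate results of granted requests."""

    def __init__(self):
        self._records = {}  # contributor id -> list of raw readings

    def contribute(self, contributor, readings):
        self._records[contributor] = list(readings)

    def withdraw(self, contributor):
        # Remove a contributor's data; only future visualizations change.
        self._records.pop(contributor, None)

    def grant(self, request, contributor):
        # A contributor accepts a request, adding their data to its pool.
        if contributor not in self._records:
            raise KeyError(f"{contributor} has no data in the pool")
        request.granted_by.add(contributor)

    def visualize(self, request):
        # Compute an aggregate (a mean, standing in for a visualization)
        # over contributors who granted the request and are still in the pool.
        values = [
            value
            for contributor in sorted(request.granted_by)
            if contributor in self._records
            for value in self._records[contributor]
        ]
        return sum(values) / len(values) if values else None

    def dossier(self, contributor):
        # Contributors can always export a copy of their own data.
        return list(self._records.get(contributor, []))


pool = DataPool()
pool.contribute("ana", [10.0, 20.0])
pool.contribute("ben", [30.0])

request = VisualizationRequest("researcher_1", "mean reading")
pool.grant(request, "ana")
pool.grant(request, "ben")

first = pool.visualize(request)   # computed while both have granted access
pool.withdraw("ben")
second = pool.visualize(request)  # computed after ben has left the pool
```

The first result is a plain value handed back to the researcher, so it stays as it was after a withdrawal; only the second, later visualization reflects the smaller pool.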

This system design surprised the researchers we were working with. At first they didn’t understand that we were proposing a system which would structurally exclude them from the data. But to their great credit, they thought on these issues and were supportive of the architecture, even going as far as to suggest ways that priming could be mitigated through alternative study designs.

I was reminded of the ways in which AIDS researchers in the 90s were able (and willing) to adapt clinical trial design in response to the hard work of AIDS activists, including patients. However, this didn’t come without a fight – and although you can learn about this story in Alan Irwin’s 1995 book Citizen Science: A Study of People, Expertise, and Sustainable Development, and in Caren Cooper’s 2014 blog post, the framing of “How could the lay public improve their research?” seems an institutionally centric way to understand what happened.

Of course it’s good that dialogue and collaboration emerged, but the problem statement wasn’t only that AIDS activists wanted to accelerate and improve institutional science, but that they were fighting for the recognition that science and health policy “as usual” was woefully insufficient, and that the mistakes of formal science and health experts were leading to immoral and unnecessary deaths. The goal wasn’t a noble attempt to “make science better” but a (perhaps even more noble) fight over the meaning of expertise and the negotiation of informed consent when many institutional experts wouldn’t even acknowledge the problems in their approach.

Fundamentally, activists and patients weren’t considered part of the research process except as subjects – or troublemakers. And such attitudes persist – I remember how shaken I was to hear someone at a recent scientific meeting argue – angrily – that such activists bore responsibility for many subsequent deaths for having disrupted clinical trials.

Just as in the AIDS crisis, environmental crises cannot be ethically – or successfully – fought without the centering of people and communities who directly face harms. And structural change will take imagination and care. The system we prototyped at @PublicLab was an attempt to remake such collaboration at a structural level, and had plenty of possible flaws, but critically, it imagined a different configuration, a different set of assumptions about how ownership, consent, and the lifecycle of data could work.

(Regarding the title, I'm tempted to call this structural pattern "Community data enclosure" but that is probably too inflammatory, as it pokes at the fissures between the ethics of open source/open access and the ethics of appropriation.)


Why do we need to discuss this problem in the first place and figure out a solution? This is an injustice in itself. Thanks for this post, @warren. If people lived by the Golden Rule, we wouldn't have to try and figure out a solution. But now we are relegated to compensating for a system (higher education) that supports profits and power over people (that's what it is!). I know this is the sad truth, and I love your idea and what y'all did to restructure the power of data back to the community affected, but I can't help feeling just a bit sad that this is the kind of world we live in.

Thanks Jeff! 😃 It might be interesting as well to trace back to indigenous data sovereignty values, concepts, and practices co-developed in the 80s and written in the 90s in the PPGIS community. Community data sovereignty values, concepts, and practices have also been present in decentralized internet movements, such as the DiscoTech ("discovering technology") movement in Detroit, and RISE NYC since the early to mid twenty-teens (and the work that preceded that program), in which there were discussions about using the network for decentralized data collection and storage towards community data sovereignty.

Linking to this great 2010 Clean Air Act Academic Research Policy Manifesto:

The Coalition controls the information collected through community studies conducted on the Coalition’s behalf. The Board must approve the study before it is conducted, as well as resulting presentations and publications before they are submitted outside of the Coalition. Academics wishing to use data collected through Coalition studies must submit a proposal to the Coalition’s Board of Directors who will vote to approve or deny access to the data.
