datasets in Kelty, Bodo & Allen 2018

of antagonism is shifting. While for-profit publishers are
seemingly conceding to Guerrilla Open Access, they are
opening new territories: platforms centralizing data, metrics
and workflows, subsuming academic autonomy into new
processes of value extraction.
The 2010s brought us hope and then the realization of how little
digital networks could help revolutionary movements. The
redistribution toward the wealthy, assisted by digitization, has
eroded institutions of solidarity. The embrace of privilege that
this has catalyzed, marked by misogyny, racism and xenophobia,
is nowhere more evident than in the climate denialism of the
Trump administration. Guerrilla archiving of US government
climate change datasets, as recounted by Laurie Allen,
indicates that more technological innovation simply won't do
away with the 'post-truth' and that our institutions might be in
need of revision, replacement and repair.
As the contributions to this pamphlet indicate, the terms
of struggle have shifted: not only do we have to continue
defending our shadow libraries, but we need to take back the
autonomy of knowledge production and rebuild institutional
grounds of solidarity.

Memory of the World
http://memoryoftheworld.org

Recursive Publics and Open Access

Christopher Kelty

Ten years ago, I published a book called Two Bits: The Cultural Significance of Free
Software (Kelty 2008).¹ Duke University Press and

tudents, and faculty; on institutions, departments, and programs. They produce data
on the performance, on the success and the failure of the whole domain of research
and education. This is the data that is being privatized, enclosed, packaged, and sold
back to us.

Drip, drip, drop, it’s only nostalgia. My heart is light, as I don’t have to worry about
gutting the library. Soon it won’t matter at all.

Taylorism reached academia. In the name of efficiency, austerity, and transparency,
our daily activities are measured, profiled, packaged, and sold to the highest bidder.
But in this process of quantification, knowledge about ourselves is lost to us, unless we
pay. We still have some patchy datasets on what we do and on who we are; we still have
this blurred reflection in the data-mirrors that we still control. But this path of
self-enlightenment is quickly narrowing as fewer and fewer data sources about us are freely
available to us.

Own Nothing

Who is downloading books and articles? Everyone. Radical open access? We won,
if you like.

Balazs Bodo

I strongly believe that information on the self is the foundation
of self-determination. We need to have data on how we operate,
on what we do in order to know who we are. This is what is being
privatized away from the academic community, this is being
taken away from us.
Radical open access. Not of content, but of the data about
ourse

tly. However, the value of these shadow libraries relies
on the existence of the widely agreed upon trusted versions.
If in doubt about whether a copy is trustworthy, scholars
can turn to more mainstream copies, if necessary. This was
not the situation we faced building Data Refuge. Instead, we
were often dealing with the sole public, authoritative copy
of a federal dataset and had to assume that, if it were taken
down, there would be no way to check the authenticity of
other copies. The data was not easily pulled out of systems,
as the data and the software that contained it were often
inextricably linked. We were dealing with unique, tremendously
valuable, but often difficult-to-untangle datasets rather than
neatly packaged publications. The workflow we established
was designed to privilege authenticity and trustworthiness
over either the speed of the copying or the easy usability of
the resulting data.² This extra care around authenticity was
necessary because of the politicized nature of environmental
data that made many people so worried about its removal
after the election. It was important that our project
supported the strongest possible scientific arguments that
could be made with the data we were ‘saving’. That meant
that our copies of the data needed to be citable in scientific
scholarly papers, and that those citations needed to be
able to withstand hostile political f

ur systems of establishing and
signaling trustworthiness, quality, reliability and stability of information are in dire
need of creative intervention as well. It is not just publishing but all of our systems
for discovering, sharing, acquiring, describing and storing that scholarship that
need support, maintenance, repair, and perhaps in some cases, replacement. And
this work will rely on scholars, as well as expert information practitioners from a
range of fields (Caswell 2016).

¹ At the time of this writing, we are working
on un-packing and repackaging the data
within Data Refuge for eventual inclusion
in various Research Library Repositories.

² Ideally, of course, all federally produced
datasets would be published in neatly
packaged and more easily preservable
containers, along with enough technical
checks to ensure their validity (hashes,
checksums, etc.) and each agency would
create a periodical published inventory of
datasets. But the situation we encountered
with Data Refuge did not start us in
anything like that situation, despite the
hugely successful and important work of
the employees who created and maintained
data.gov. For a fuller view of this workflow,
see my talk at CSVConf 2017 (Allen 2017).
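As an illustration only, the kind of "technical checks" mentioned in the note above (hashes, checksums and a published inventory) could be sketched along the following lines. This is not the Data Refuge workflow. The function names `sha256_of`, `build_inventory` and `verify_copy` are hypothetical, and the choice of SHA-256 is an assumption made for the example.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_inventory(dataset_dir: Path) -> Dict[str, str]:
    """Map each file's path (relative to dataset_dir) to its checksum.

    Hypothetical stand-in for an agency's published inventory.
    """
    return {
        p.relative_to(dataset_dir).as_posix(): sha256_of(p)
        for p in sorted(dataset_dir.rglob("*"))
        if p.is_file()
    }


def verify_copy(copy_dir: Path, inventory: Dict[str, str]) -> List[str]:
    """Return the relative paths that are missing from copy_dir
    or whose checksum differs from the inventory."""
    problems = []
    for rel_path, expected in inventory.items():
        candidate = copy_dir / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            problems.append(rel_path)
    return problems
```

With an inventory of this sort published alongside a dataset, anyone holding a copy could call `verify_copy` and get back an empty list when every file matches. A check like this only shows that a copy agrees with the inventory. Whether the inventory itself can be trusted is the institutional question, and code does not settle it.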

Closing note: The workflow established and used at Data Rescue events was
designed to tackle this set of difficult issues, but needed refinement, and was retired
in mid-2017. T
