Bodo
Libraries in the Post-Scarcity Era
2015


Libraries in the Post-Scarcity Era
Balazs Bodo

Abstract
In the digital era where, thanks to the ubiquity of electronic copies, the book is no longer a scarce
resource, libraries find themselves in an extremely competitive environment. Several different actors are
now in a position to provide low-cost access to knowledge. Among these competitors are shadow libraries,
piratical text collections which have now amassed electronic copies of millions of copyrighted works
and provide access to them usually free of charge to anyone around the globe. While such shadow
libraries are far from being universal, they are able to offer certain services better, to more people and
under more favorable terms than most public or research libraries. This contribution offers insights into
the development and the inner workings of one of the biggest scientific shadow libraries on the internet in
order to understand what kind of library people create for themselves if they have the means and if they
don’t have to abide by the legal, bureaucratic and economic constraints that libraries usually face. I argue
that one of the many possible futures of the library is hidden in the shadows, and those who think of the
future of libraries can learn a lot from book pirates of the 21st century about how users and readers expect
texts in electronic form to be stored, organized and circulated.
“The library is society’s last non-commercial meeting place which the majority of the population uses.”
(Committee on the Public Libraries in the Knowledge Society, 2010)
“With books ready to be shared, meticulously cataloged, everyone is a librarian. When everyone is
librarian, library is everywhere.” – Marcell Mars, www.memoryoftheworld.org
I have spent the last few months in various libraries visiting - a library. I spent countless hours in the
modest or grandiose buildings of the Harvard Libraries, the Boston and Cambridge Public Library
systems, various branches of the Openbare Bibliotheek in Amsterdam, the libraries of the University of
Amsterdam, with a computer in front of me, on which another library was running, a library which is
perfectly virtual, which has no monumental buildings, no multi-million euro budget, no miles of stacks,
no hundreds of staff, but which has, despite lacking everything that apparently makes a library, millions of
literary works and millions of scientific books, all digitized, all available at the click of the mouse for
everyone on the earth without any charge, library or university membership. As I was sitting in these

Bodó B. (2015): Libraries in the post-scarcity era.
in: Porsdam (ed): Copyrighting Creativity: Creative values, Cultural Heritage Institutions and Systems of Intellectual Property, Ashgate

physical spaces where the past seemed to define the present, I was wondering where I should look to find
the library of the future: down to my screen or up around me.
The library on my screen was Aleph, one of the biggest of the countless piratical text collections on the
internet. It has more than a million scientific works and another million literary works to offer, all free to
download, without any charge or fee, for anyone on the net. I’ve spent months among its virtual stacks,
combing through the catalogue, talking to the librarians who maintain the collection, and watching the
library patrons as they used the collection. I kept going back to Aleph both as a user and as a researcher.
As a user, Aleph offered me books that the local libraries around me didn’t, in formats that were more
convenient than print. As a researcher, I was interested in the origins of Aleph, its modus operandi, its
future, and I was curious where the journey on which it has taken book readers, authors, publishers
and libraries would end.
In this short essay I will introduce some of the findings of a two-year research project conducted on
Aleph. In the project I looked at several things. I reconstructed the pirate library’s genesis in order to
understand the forces that called it to life and shaped its development. I looked at its catalogue to
understand what it has to offer and how that piratical supply of books is related to the legal supply of
books through libraries and online distributors. I also acquired data on its usage, and so was able to
reconstruct some aspects of piratical demand. After a short introduction, in the first part of this essay I
will outline some of the main findings, and in the second part I will situate the findings in the wider context
of the future of libraries.

Book pirates and shadow librarians
Book piracy has a fascinating history, tightly woven into the history of the printing press (Judge, 1934),
into the history of censorship (Wittmann, 2004), into the history of copyright (Bently, Davis, & Ginsburg,
2010; Bodó, 2011a) and into the history of European civilization (Johns, 2010). Book piracy, whether in the 21st or
in the mid-17th century, is an activity that has deep cultural significance, because ultimately it is a story
about how knowledge is circulated beyond and often against the structures of political and economic
power (Bodó, 2011b), and thus it is a story about the changes this unofficial circulation of knowledge
brings.
There are many different types of book pirates. Some just aim for easy money, others pursue highly
ideological goals, but they are invariably powerful harbingers of change. The emergence of black markets,
whether they be of culture, of drugs or of arms, is always a symptom, a warning sign of a friction between
supply and demand. Increased activity in the grey and black zones of legality marks the emergence of a
demand which legal suppliers are unwilling or unable to serve (Bodó, 2011a). That friction, more often
than not, leads to change. Earlier waves of book piracy foretold fundamental economic, political, societal
or technological shifts (Bodó, 2011b): changes in how the book publishing trade was organized (Judge,
1934; Pollard, 1916, 1920); the emergence of the new, bourgeois reading class (Patterson, 1968; Solly,
1885); the decline of pre-publication censorship (Rose, 1993); the advent of the Reformation and of the
Enlightenment (Darnton, 1982, 2003), or the rapid modernization of more than one nation (Khan &
Sokoloff, 2001; Khan, 2004; Yu, 2000).
The latest wave of piracy has coincided with the digital revolution which, in itself, profoundly upset the
economics of cultural production and distribution (Landes & Posner, 2003). However, technology is not
the primary cause of the emergence of cultural black markets like Aleph. The proliferation of computers
and the internet has just revealed a more fundamental issue, which has everything to do with the uneven
distribution of access to knowledge around the globe.
Sometimes book pirates do more than just forecast and react to changes that are independent of them.
Under certain conditions, they themselves can be powerful agents of change (Bodó, 2011b). Their agency
rests on their ability to challenge the status quo and resist cooptation or subjugation. In that respect, digital
pirates seem to be quite resilient (Giblin, 2011; Patry, 2009). They have the technological upper hand and
so far they have been able to outsmart any copyright enforcement effort (Bodó, forthcoming). As long as
file sharing technologies cannot be completely eradicated, and as long as there is a substantial
difference between what is legally available and what is in demand, cultural black markets will be here to
compete with and outcompete the established and recognized cultural intermediaries. Under this constant
existential threat, business models and institutions are forced to adapt, evolve or die.
After the music and audiovisual industries, now the book industry has to address the issue of piracy.
Piratical book distribution services are now in direct competition with the bookstore on the corner, the
used book stall on the sidewalk, they compete with the Amazons of the world and, like it or not, they
compete with libraries. There is, however, a significant difference between the book and the music
industries. The reluctance of music rights holders to listen to the demands of their customers caused little
damage beyond the markets of recorded music. Music rights holders controlled their own fates and those
who wanted to experiment with alternative forms of distribution had the chance to do so. But while the
rapid proliferation of book black markets may signal that the book industry suffers from problems similar
to those the music industry suffered a decade ago, the actions of book publishers and the policies they pursue have
an impact beyond the market of books and directly affect the domain of libraries.

The fate of libraries is tied to the fate of book markets in more than one way. One connection is structural:
libraries emerged to remedy the scarcity of books. This is true both for the pre-print era and for the
Gutenberg galaxy. In the era of widespread literacy and highly developed book markets, libraries offer
access to books under terms publishers and booksellers cannot or will not. Libraries, to a large extent,
are defined to complement the structure of the book trade. The other connection is legal. The core
activities of the library (namely lending and copying) are governed by the same copyright laws that govern
authors and publishers. Libraries are one of the users in the copyright system, and their existence depends
on the limitations of and exceptions to the exclusive rights of the rights holders. The space that has been
carved out of copyright to enable the existence of libraries has been intensely contested in the era of
postmodern copyright (Samuelson, 2002) and digital technologies. This heavy legal and structural
interdependence with the market means that libraries have only limited control over their own fate in the
digital domain.
Book pirates compete with some of the core services of libraries. And as is usually the case with
innovation that has no economic or legal constraints, pirate libraries offer, at least for the moment,
significantly better services than most libraries. Pirate libraries offer far more electronic books,
with far fewer restrictions and constraints, to far more people, far cheaper than anyone else in the library
domain. Libraries are thus directly affected by pirate libraries, and because of their structural
interdependence with book markets, they also have to adjust to how the commercial intermediaries react
to book piracy. Under such conditions libraries cannot simply count on their survival through their legacy.
Book piracy must be taken seriously, not just as a threat, but also as an opportunity to learn how shadow
libraries operate and interact with their users. Pirate libraries are the products of readers (and sometimes
authors), academics and laypeople, all sharing a deep passion for the book, operating in a zone where
there is little to no obstacle to the development of the “ideal” library. As such, pirate libraries can teach
important lessons on what is expected of a library, how book consumption habits evolve, and how
knowledge flows around the globe.

Pirate libraries in the digital age
The collection of texts in digital formats was one of the first activities that computers enabled: the text file
is the native medium of the computer, it is small, thus it is easy to store and copy. It is also very easy to
create, and as so many projects have since proved, there are more than enough volunteers who are willing
to type whole books into the machine. No wonder that electronic libraries and digital text repositories
were among the first “mainstream” applications of computers. Combing through large stacks of matrix-printer
printouts of sci-fi classics downloaded from gopher servers is a shared experience of anyone who
had access to computers and the internet before it was known as the World Wide Web.
Computers thus added fresh momentum to the efforts of realizing the age-old dream of the universal
library (Battles, 2004). Digital technologies offered a breakthrough in many of the issues that previously
posed serious obstacles to text collection: storage, search, preservation, access have all become cheaper
and easier than ever before. On the other hand, a number of key issues remained unresolved: digitization
was a slow and cumbersome process, while the screen proved to be too inconvenient, and the printer too
costly an interface between the text file and the reader. In any case, ultimately it wasn’t these issues that
put a brake on the proliferation of digital libraries. Rather, it was the realization that there are legal limits
to the digitization, storage and distribution of copyrighted works on digital networks. That realization
soon rendered many text collections in the emerging digital library scene inaccessible.
Legal considerations did not destroy this chaotic, emergent digital librarianship and the collections the ad hoc,
accidental and professional librarians put together. The text collections were far too valuable simply to
be deleted from the servers. Instead, what happened to most of these collections was that they
retreated from the public view, back into the access-controlled shadows of darknets. Yesterday’s gophers
and anonymous ftp servers turned into closed, membership-only ftp servers, local shared libraries residing
on the intranets of various academic and business institutions, and private archives stored on local hard drives.
The early digital libraries turned into book piracy sites and into the kernels of today’s shadow libraries.
Libraries and other major actors that decided to start large-scale digitization programs soon
found out that if they wanted to avoid costly lawsuits, they had to limit their activities to works in the
public domain. The public domain is riddled with mind-bogglingly complex and unresolved legal
issues, but at least it is still significantly less complicated to deal with than copyrighted and orphan works.
Legally more innovative (or, as some would say, adventurous) companies, such as Google and Microsoft,
which thought they had sufficient resources to sort out the legal issues, soon had to abandon their programs
or put them on hold until the legal issues were sorted out.
There was, however, a large group of disenfranchised readers, library patrons, authors and users who
decided to ignore the legal problems and set out to build the best library that could possibly be built using
digital technologies. Despite the increased awareness of rights holders of the issue of digital book
piracy, more and more communities around text collections started to defy the legal constraints and to
operate and use more or less public piratical shadow libraries.

Aleph[1]
Aleph[2] is a meta-library, and currently one of the biggest online piratical text collections on the internet.
The project started on a Russian bulletin board devoted to piracy around 2008 as an effort to integrate
various free-floating text collections that circulated online, on optical media, on various public and private
ftp servers and on hard-drives. Its aim was to consolidate these separate text collections, many of which
were created in various Russian academic institutions, into a single, unified catalog, standardize the
technical aspects, add and correct missing or incorrect metadata, and offer the resulting catalogue,
computer code and the collection of files as an open infrastructure.

From Russia with love
It is by no means an accident that Aleph was born in Russia. In post-Soviet Russia a unique constellation
of several different factors created the necessary conditions for the digital librarianship movement that
ultimately led to the development of Aleph. A rich literary legacy, the Soviet heritage, the pace with
which various copying technologies penetrated the market, the shortcomings of the legal environment and
the informal norms that stood in for the non-existent digital copyrights all contributed to the emergence of
the biggest piratical library in the history of mankind.
Russia cherishes a rich literary tradition, which suffered and endured extreme economic hardships and
political censorship during the Soviet period (Ermolaev, 1997; Friedberg, Watanabe, & Nakamoto, 1984;
Stelmakh, 2001). The political transformation in the early 1990s liberated authors, publishers, librarians
and readers from much of the political oppression, but it did not solve the economic issues that stood in
the way of a healthy literary market. Disposable income was low, state subsidies were limited, the dire
economic situation created uncertainty in the book market. The previous decades, however, had taught
authors and readers how to overcome political and economic obstacles to access to books. During the
Soviet times authors, editors and readers operated clandestine samizdat distribution networks, while
informal book black markets, operating in semi-private spheres, made uncensored but hard to come by
books accessible (Stelmakh, 2001). This survivalist attitude and the skills that came with it proved handy
in the post-Soviet turmoil, and were directly transferable to the then emerging digital technologies.

[1] I have conducted extensive research on the origins of Aleph, on its catalogue and its users. At the time of
writing this contribution, the detailed findings are being prepared for publication. The following section is a brief summary of
those findings and is based upon two forthcoming book chapters on Aleph in a report, edited by Joe Karaganis, on
the role of shadow libraries in the higher education systems of multiple countries.
[2] Aleph is a pseudonym chosen to protect the identity of the shadow library in question.

Russia is not the only country with a significant informal media economy of books, but in most other
places it was the photocopy machine that emerged to serve such book grey/black markets. In pre-1990
Russia and in other Eastern European countries the access to this technology was limited, and when
photocopiers finally became available, computers were close behind them in terms of accessibility. The
result of the parallel introduction of the photocopier and the computer was that the photocopy technology
did not have time to lock in the informal market of texts. In many countries where the photocopy machine
preceded the computer by decades, copy shops still capture the bulk of the informal production and
distribution of textbooks and other learning material. In the Soviet bloc, PCs instantly offered a less costly
and more adaptive technology to copy and distribute texts.
Russian academic and research institutions were the first to have access to computers. They also had to
somehow deal with the frustrating lack of access to up-to-date and affordable western works to be used in
education and research (Abramitzky & Sin, 2014). This may explain why the first batch of shadow
libraries started in a number of academic/research institutions such as the Department of Mechanics and
Mathematics (MexMat) at Moscow State University. The first digital librarians in Russia were
mathematicians, computer scientists and physicists, working in those institutions.
As PCs and internet access slowly penetrated Russian society, an extremely lively digital librarianship
movement emerged, mostly fuelled by enthusiastic readers, book fans and often authors, who spared no
effort to make their favorite books available on FIDOnet, a popular BBS system in Russia. One of the
central figures in these tumultuous years, when typed-in books appeared online by the thousands, was
Maxim Moshkov, a computer scientist, alumnus of the MexMat, and an avid collector of literary works.
His digital library, lib.ru, was at first mostly a private collection of literary texts, but soon evolved into the
number one text repository, where everyone deposited the latest digital copy of a newly digitized
book (Мошков, 1999). Eventually the library grew so big that it had to be broken up. Today it only hosts
the Russian literary classics. User-generated texts, fan fiction and amateur production were spun off into the
aptly named samizdat.lib.ru collection; lowbrow popular fiction, astrology and cheap romance found their
way into separate collections, and so did the collection of academic/scientific books, which started an
independent life under the name of Kolkhoz. Kolkhoz, which borrowed its name from the commons-based
agricultural cooperative of the early Soviet era, was both a collection of scientific texts, and a
community of amateur librarians, who curated, managed and expanded the collection.
Moshkov and his library introduced several important norms into the bottom-up, decentralized, often
anarchic digital library movement that swept through the Russian internet in the late 1990s and early 2000s.
First, lib.ru provided the technological blueprint for any future digital library. But more importantly,
Moshkov’s way of handling the texts, his way of responding to the claims, requests, questions, complaints
of authors and publishers paved the way for the development of copynorms (Schultz, 2007) that continue
to define the Russian digital library scene to this day. Moshkov was instrumental in the creation of an
enabling environment for digital librarianship while respecting the claims of authors, during times
when the formal copyright framework and the enforcement environment were both unable and unwilling to
protect works of authorship (Elst, 2005; Sezneva, 2012).

Guerilla Open Access
In the late 2000s, around the time when Aleph started to merge the Kolkhoz collection with other, free-floating
text collections, two other notable events took place. It was in 2008 that Aaron Swartz penned
his Guerilla Open Access Manifesto (Swartz, 2008), in which he called for the liberation and sharing of
scientific knowledge. Swartz forcefully argued that scientific knowledge, the production of which is
mostly funded by the public and by the voluntary labor of academics, cannot be locked up behind
corporate paywalls set up by publishers. He framed the unauthorized copying and transfer of scientific
works from closed access text repositories to public archives as a moral act, and by doing so, he created
an ideological framework which was more radical and promised to be more effective than either the
creative commons (Lessig, 2004) or the open access (Suber, 2013) movements that tried to address the
access to knowledge issues in a more copyright friendly manner. During interviews, the administrators of
Aleph used the very same arguments to justify the raison d'être of their piratical library. While it seems
that Aleph is the practical realization of Swartz’s ideas, it is hard to tell which served as an inspiration for
the other.
It was also around the same time that another piratical library, gigapedia/library.nu, started its
operation, focusing mostly on making English-language scientific works freely available (Liang, 2012).
Until its legal troubles and subsequent shutdown in 2012, gigapedia/library.nu was the biggest English-language
piratical scientific library on the internet, amassing several hundred thousand books, including
high-quality proofs ready to print and low-resolution scans possibly prepared by a student or a lecturer.
During 2012 the mostly Russian-language and natural sciences-focused Aleph absorbed the English-language,
social sciences-rich gigapedia/library.nu, and with the subsequent shutdown of
gigapedia/library.nu Aleph became the center of the scientific shadow library ecosystem and community.

Aleph by numbers

By adding pre-existing text collections to its catalogue Aleph was able to grow at an astonishing rate.
Aleph added, on average, 17,500 books to its collection each month since 2009, and as a result, by April
2014 it had more than 1.15 million documents. Nearly two thirds of the collection is in English, one fifth
of the documents are in Russian, while German works amount to the third largest group with 8.5% of the
collection. The rest of the major European languages, like French or Spanish, have fewer than 15,000 works
each in the collection.
More than 50 thousand publishers have works in the library, but most of the collection is published by
mainstream western academic publishers. Springer published more than 12% of the works in the
collection, followed by the Cambridge University Press, Wiley, Routledge and Oxford University Press,
each having more than 9,000 works in the collection.
Most of the collection is relatively recent, more than 70% of the collection having been published in 1990 or
after. Despite the recency of the collection, the electronic availability of the titles in the collection is
limited. While around 80% of the books that had an ISBN number registered in the catalogue[3] were
available in print either as a new copy or a second-hand one, only about one third of the titles were
available in e-book formats. The mean price of the titles still in print was 62 USD according to the data
gathered from Amazon.com.
The number of works accessed through Aleph is as impressive as its catalogue. In the three months
between March and June 2012, on average 24,000 documents were downloaded every day from one of
its half-a-dozen mirrors.[4] This means that the number of documents downloaded daily from Aleph is
probably in the 50,000 to 100,000 range. The library users come from more than 150 different countries. The
biggest users in terms of volume were the Russian Federation, Indonesia, USA, India, Iran, Egypt, China,
Germany and the UK. Meanwhile, many of the highest per-capita users are Central and Eastern European
countries.

[3] Market availability data is only available for the 40% of books in the Aleph catalogue that had an ISBN number
on file. The titles without a valid ISBN number tend to be older, Russian-language titles, in general with low
expected print and e-book availability.
[4] Download data is based on the logs provided by one of the shadow library services which offers the books in
Aleph’s catalogue as well as other works, also free and without any restraints or limitations.

What Aleph is and what it is not
Aleph is an example of the library in the post-scarcity age. It is founded on the idea that books should no
longer be a scarce resource. Aleph set out to remove both sources of scarcity: the natural source of
scarcity in physical copies is overcome through distributed digitization; the artificial source of scarcity
created by copyright protection is overcome through infringement. The liberation from both constraints is
necessary to create a truly scarcity-free environment and to release the potential of the library in the post-scarcity age.
Aleph is also an ongoing demonstration of the fact that under the condition of non-scarcity, the library can
be a decentralized, distributed, commons-based institution created and maintained through peer
production (Benkler, 2006). The message of Aleph is clear: users, left to their own devices, can produce a
library by themselves for themselves. In fact, users are the library. And when everyone has the means to
digitize, collect, catalogue and share his/her own library, then the library suddenly is everywhere. Small
individual and institutional collections are aggregated into Aleph, which, in turn is constantly fragmented
into smaller, local, individual collections as users download works from the collection. The library is
breathing (Battles, 2004) books in and out, but for the first time, this circulation of books is not a zero-sum
game, but a cumulative one: with every cycle the collection grows.
On the other hand, Aleph may have lots of books on offer, but it is clear that it is neither universal in its
scope, nor does it fulfill all the critical functions of a library. Most importantly Aleph is disembedded
from the local contexts and communities that usually define the focus of the library. While it relies on the
availability of local digital collections for its growth, it has no means to play an active role in its own
development. The guardians of Aleph can prevent books from entering the collection, but they cannot
pay, ask or force anyone to provide a title if it is missing. Aleph is reliant on the weak copy-protection
technologies of official e-text repositories and the goodwill of individual document submitters when it
comes to the expansion of the collection. This means that the Aleph collection is both fragmented and
biased, and it lacks the necessary safeguards to ensure that it stays either current or relevant.
Aleph, with all its strengths and weaknesses, carries important lessons for the discussions on the future
of libraries. In the next section I will try to situate these lessons in the wider context of the library in the post-scarcity
age.

The future of the library
There is hardly a week without a blog post, a conference, a workshop or an academic paper discussing the
future of libraries. While existing libraries are buzzing with activity, librarians are well aware that they
need to re-define themselves and their institutions, as the book collections around which libraries were
organized slowly go the way the catalogue has gone: into the digital realm. It would be impossible to give
a faithful summary of all the discussions on the future of libraries in such a short contribution. There are,
however, a few threads, to which the story of Aleph may contribute.

Competition
It is very rare to find the two words “libraries” and “competition” in the same sentence. No wonder: libraries
enjoyed a near perfect monopoly in their field of activity. Though there may have been many different
local initiatives that provided free access to books, as a specialized institution to do so, the library was
unmatched and unchallenged. This monopoly position has been lost in a remarkably short period of time
due to the internet and the rapid innovations in the legal e-book distribution markets. Textbooks can be
rented, e-books can be lent, a number of new startups and major sellers offer flat rate access to huge
collections. Expertise that helps navigate the domains of knowledge is abundant, and there are multiple
authoritative sources of information and meta-information online. The search box of the library catalog is
only one, and not even the most usable, of all the different search boxes one can type a query in.[5]
Meanwhile there are plenty of physical spaces which offer good coffee, an AC plug, comfortable chairs
and low levels of noise to meet, read and study, from local cafes via hacker- and makerspaces to co-working
offices. Many library competitors have access to resources (human, financial, technological and
legal) way beyond the possibilities of even the richest libraries. In addition, publishers control the
copyrights in digital copies which, absent well-fortified statutory limitations and exceptions, prevent
libraries from keeping up with the changes in user habits and with the competing commercial services.
Libraries definitely feel the pressure. “Libraries’ offers of materials […] compete with many other offers
that aim to attract the attention of the public. […] It is no longer enough just to make a good collection
available to the public.” (Committee on the Public Libraries in the Knowledge Society, 2010) As a
response, libraries have developed different strategies to cope with this challenge. The common thread in
the various strategy documents is that they try to redefine the library as a node in the vast network of
institutions that provide knowledge, enable learning, facilitate cooperation and initiate dialogues. Some of
the strategic plans redefine the library space as an “independent medium to be developed” (Committee on
the Public Libraries in the Knowledge Society, 2010), and advise libraries to transform themselves into
culture and community centers which establish partnerships with citizens, communities and with other
public and private institutions. Some librarians propose even more radical ways of keeping the library
relevant by, for example, advocating more opening hours without staff and hosting more user-governed
activities.

5 ArXiv, SSRN, RePEc, PubMed Central, Google Scholar, Google Books, Amazon, Mendeley, Citavi,
ResearchGate, Goodreads, LibraryThing, Wikipedia, Yahoo Answers, Khan Academy, specialized twitter and other
social media accounts are just a few of the available discovery services.

In the research library sphere, the Commission on the Future of the Library, a task force set up by the
University of California, Berkeley, defined the values the university research library will add in the digital
age as “1) Human expertise; 2) Enabling infrastructure; and 3) Preservation and dissemination of
knowledge for future generations.” (Commission on the Future of the Library, 2013). This approach is
among the more conservative ones, still relying on the hope that libraries can offer something
unique that no one else is able to provide. Others, working at the Association of Research Libraries, are
more like their public library counterparts, defining the future role of the research libraries as a “convener
of ‘conversations’ for knowledge construction, an inspiring host; a boundless symposium; an incubator;
a 3rd space both physically and virtually; a scaffold for independence of mind; and a sanctuary for
freedom of expression, a global entrepreneurial engine” (Pendleton-Jullian, Lougee, Wilkin, & Hilton,
2014), in other words, as another important, but in no way unique node in the wider network of
institutions that creates and distributes knowledge.
Despite the differences in priorities, all these recommendations carry the same basic message. The unique
position of libraries in the center of a book-based knowledge economy, at the top of the paper-bound
knowledge hierarchy, is about to be lost. Libraries are losing their monopoly on giving low-cost,
low-restriction access to books, which are scarce by nature, and they are losing their privileged and powerful
position as the guardians of and guides to the knowledge stored in the stacks. If they want to survive, they
need to find their role and position in a network of institutions, where everyone else is engaged in
activities that overlap with the historic functions of the library. Just like the books themselves, the power
that came from the privileged access to books is in part dispersed among the countless nodes in the
knowledge and learning networks, and in part is being captured by those who control the digital rights to
digitize and distribute books in the digital era.
One of the main reasons why libraries are trying to redefine themselves as providers of ancillary services
is that the lack of digital lending rights prevents them from competing on their own traditional home
turf - in giving free access to knowledge. The traditional legal limitations and exceptions to copyright that
enabled libraries to fulfill their role in the analogue world do not apply in the digital realm. In the
European Union, the Infosoc Directive (“Directive 2001/29/EC on the harmonisation of certain aspects of
copyright and related rights in the information society,” 2001) allows for libraries to create digital copies
for preservation, indexing and similar purposes and allows for the display of digital copies on their
premises for research and personal study (Triaille et al., 2013). While in theory these rights provide for
the core library services in the digital domain, their practical usefulness is rather limited, as off-premises
e-lending of copyrighted works is in most cases6 only possible through individual license agreements with
publishers.
Under such circumstances libraries complain that they cannot fulfill their public interest mission in the
digital era. What libraries are allowed to do under the current limitations and exceptions is
seen as inadequate for what is expected of them. But to do more requires the appropriate e-lending
licenses from rights holders. In many cases, however, libraries simply cannot license digital works for
e-lending. In those cases when licensing is possible, they see transaction costs as prohibitively high; they
feel that their bargaining position vis-à-vis rightholders is unbalanced; they do not see that the license
terms are adapted to libraries’ policies, and they fear that the licenses provide publishers excessive and
undue influence over libraries (Report on the responses to the Public Consultation on the Review of the
EU Copyright Rules, 2013).
What is more, libraries face substantial legal uncertainties even where there are more-or-less well defined
digital library exceptions. In the EU, questions such as whether the analogue lending rights of libraries
extend to e-books, whether an exhaustion of the distribution right is necessary to enjoy the lending
exception, and whether licensing an e-book would exhaust the distribution right are under consideration
by the Court of Justice of the European Union in a Dutch case (Rosati, 2014b). And while in another case
(Case C-117/13 Technische Universität Darmstadt v Eugen Ulmer KG) the CJEU reaffirmed the rights of
European libraries to digitize books in their collection if that is necessary to give access to them in digital
formats on their premises, it also created new uncertainties by stating that libraries may not digitize their
entire collections (Rosati, 2014a).
US libraries face a similar situation, both in terms of the narrowly defined exceptions in which libraries
can operate, and the huge uncertainty regarding the limits of fair use in the digital library context. US
rights holders challenged both Google’s (Authors Guild v Google) and the libraries’ (Authors Guild v
HathiTrust) rights to digitize copyrighted works. While there seems to be a consensus among courts that the
mass digitization conducted by these institutions was fair use (Diaz, 2013; Rosati, 2014c; Samuelson,
2014), the accessibility of the scanned works is still heavily limited, subject to licenses from publishers,
the existence of print copies at the library and the institutional membership held by prospective readers.
While in the highly competitive US e-book market many commercial intermediaries offer e-lending
licenses to e-book catalogues of various sizes, these arrangements also carry the danger of a commercial
lock-in of the access to digital works, and render libraries dependent upon the services of commercial
providers who may or may not be the best defenders of public interest (OECD, 2012).

6 The notable exception is orphan works, which are presumed to be still copyrighted but lack an identifiable
rights owner. In the EU, the Directive 2012/28/EU on certain permitted uses of orphan works in theory eases access
to such works, but in practice its impact is limited by the many constraints among its provisions. Lacking
any orphan works legislation, and with the Google Book Settlement still in limbo, the US is even farther from making
orphan works generally accessible to the public.

Shadow libraries like Aleph are called into existence by the vacuum that was left behind by the collapse
of libraries in the digital sphere and by the inability of the commercial arrangements to provide adequate
substitute services. Shadow libraries are pooling distributed resources and expertise over the internet, and
use the lack of legal or technological barriers to innovation in the informal sphere to fill in the void left
behind by libraries.

What can Aleph teach us about the future of libraries?
The story of Aleph offers two closely interrelated considerations for the debate on the future of libraries:
a legal and an organizational one. Aleph operates beyond the limits of legality, as almost all of its
activities are copyright infringing, including the unauthorized digitization of books, the unauthorized
mass downloads from e-text repositories, the unauthorized acts of uploading books to the archive, the
unauthorized distribution of books, and, in most countries, the unauthorized act of users’ downloading
books from the archive. In the debates around copyright infringement, illegality is usually interpreted as a
necessary condition to access works for free. While this is undoubtedly true, the fact that Aleph provides
no-cost access to books seems to be less important than the fact that it provides access to them in the
first place.
Aleph is a clear indicator of the volume of the demand for current books in digital formats in developed
and in developing countries. The legal digital availability, or rather, unavailability of its catalogue also
demonstrates the limits of the current commercial and library based arrangements that aim to provide low
cost access to books over the internet. As mentioned earlier, Aleph’s catalogue is mostly of recent books,
meaning that 80% of the titles with a valid ISBN number are still in print and available as a new or used
print copy through commercial retailers. What is also clear is that around 66% of these books are yet to be
made available in electronic format. While publishers in theory have a strong incentive to make their most
recent titles available as e-books, they lag behind in doing so.
This might explain why one third of all the e-book downloads in Aleph are from highly developed
Western countries, and two thirds of these downloads are of books without a Kindle version. Having access
to print copies either through libraries or through commercial retailers is simply not enough anymore.
Developing countries are a slightly different case. There, compared to developed countries, twice as many
of the downloads (17% compared to 8% in developed countries) are of titles that aren’t available in print
at all. Not having access to books in print seems to be a more pressing problem for developing countries
than not having access to electronic copies. Aleph thus fulfills at least two distinct types of demand: in
developed countries it provides access to missing electronic versions, in developing countries it provides
access to missing print copies.
The ability to fulfill an otherwise unfulfilled demand is not the only function of illegality. Copyright
infringement in the case of Aleph has a much more important role: it enables the peer production of the
library. Aleph is an open source library. This means that every resource it uses and every resource it
creates is freely accessible to anyone for use without any further restrictions. This includes the server
code, the database, the catalogue and the collection. The open source nature of Aleph rests on the
ideological claim that the scientific knowledge produced by humanity, mostly through public funds,
should be open for anyone to access without any restrictions. Everything else in and around Aleph stems
from this claim, replicating the open access logic in all the other aspects of Aleph’s operation. Aleph
uses the peer produced Open Library to fetch book metadata, it uses the bittorrent and ed2k P2P networks
to store and make books accessible, it uses Linux and MySQL to run its code, and it allows its users to
upload books and edit book metadata. As a consequence of its open source nature, anyone can contribute
to the project, and everyone can enjoy its benefits.
It is hard to quantify the impact of this piratical open access library on education, science and research in
various local contexts where Aleph is the prime source of otherwise inaccessible books. But it is
relatively easy to measure the consequences of openness at the level of Aleph, the library. The
collection of Aleph was created mostly by those individuals and communities who decided to digitize
books by themselves for their own use. While any single individual is only capable of digitizing a few
books at most, the small contributions quickly add up. To digitize the 1.15 million documents in
the Aleph collection would require an investment of several hundred million Euros, and a substantial
subsequent investment in storage, collection management and access provision (Poole, 2010). Compared
to these figures the costs associated with running Aleph are infinitesimal, as it survives on the volunteer
labor of a few individuals, and annual donations in the total value of a few thousand dollars. The hundreds
of thousands who use Aleph on a more or less regular basis have an immense amount of resources, and by
disregarding the copyright laws Aleph is able to tap into those resources and use them for the
development of the library. The value of these resources and of the peer produced library is the difference
between the actual costs associated with Aleph, and the investment that would be required to create
something remotely similar.

The decentralized, collaborative mass digitization and making available of current, thus most relevant
scientific works is only possible at the moment through massive copyright infringement. It is debatable
whether the copyrighted corpus of scientific works should be completely open, and whether the blatant
disregard of copyrights through which Aleph achieved this openness is the right path towards a more
openly accessible body of scientific knowledge. It is also yet to be measured what effects shadow libraries
may have on the commercial intermediaries and on the health of scientific publishing and science in
general. But Aleph, in any case, is a case study in the potential benefits of open sourcing the library.

Conclusion
If we can take Aleph as an expression of what users around the globe want from a library, then the answer
is that there is a strong need for a universally accessible collection of current, relevant (scientific) books
in restrictions-free electronic formats. Can we expect any single library to provide anything even remotely
similar to that in the foreseeable future? Does such a service have a place in the future of libraries? It is as
hard to imagine the future library with such a service as without.
While the legal and financial obstacles to the creation of a scientific library with as universal reach as
Aleph may be difficult to overcome, other aspects of it may be more easily replicable. The way Aleph
operates demonstrates the amount of material and immaterial resources users are willing to contribute to
build a library that responds to their needs and expectations. If libraries plan to only ‘host’ user-governed
activities, it means that the library is still imagined to be a separate entity from its users. Aleph teaches us
that this separation can be overcome and users can constitute a library. But for that they need
opportunities to participate in the production of the library: they need the right to digitize books and copy
digital books to and from the library, they need the opportunity to participate in the cataloging and
collection building process, they need the opportunity to curate and program the collection. In other
words users need the chance to be librarians in the library if they wish to do so, and so libraries need to be
able to provide access not just to the collection but to their core functions as well. The walls that separate
librarians from library patrons, private and public collections, insiders and outsiders can all prevent the
peer production of the library, and through that, prevent the future that is the closest to what library users
think of as ideal.

References
Abramitzky, R., & Sin, I. (2014). Book Translations as Idea Flows: The Effects of the Collapse of
Communism on the Diffusion of Knowledge (No. w20023). Retrieved from
http://papers.ssrn.com/abstract=2421123
Battles, M. (2004). Library: An unquiet history. WW Norton & Company.
Benkler, Y. (2006). The wealth of networks : how social production transforms markets and freedom.
New Haven: Yale University Press.
Bently, L., Davis, J., & Ginsburg, J. C. (Eds.). (2010). Copyright and Piracy An Interdisciplinary
Critique. Cambridge University Press.
Bodó, B. (2011a). A szerzői jog kalózai. Budapest: Typotex.
Bodó, B. (2011b). Coda: A Short History of Book Piracy. In J. Karaganis (Ed.), Media Piracy in
Emerging Economies. New York: Social Science Research Council.
Bodó, B. (forthcoming). Piracy vs privacy–the analysis of Piratebrowser. IJOC.
Commission on the Future of the Library. (2013). Report of the Commission on the Future of the UC
Berkeley Library. Berkeley: UC Berkeley.
Committee on the Public Libraries in the Knowledge Society. (2010). The Public Libraries in the
Knowledge Society. Copenhagen: Kulturstyrelsen.
Darnton, R. (1982). The literary underground of the Old Regime. Cambridge, Mass: Harvard University
Press.
Darnton, R. (2003). The Science of Piracy: A Crucial Ingredient in Eighteenth-Century Publishing.
Studies on Voltaire and the Eighteenth Century, 12, 3–29.
Diaz, A. S. (2013). Fair Use & Mass Digitization: The Future of Copy-Dependent Technologies after
Authors Guild v. Hathitrust. Berkeley Technology Law Journal, 23.
Directive 2001/29/EC on the harmonisation of certain aspects of copyright and related rights in the
information society. (2001). Official Journal L, 167, 10–19.
Elst, M. (2005). Copyright, freedom of speech, and cultural policy in the Russian Federation.
Leiden/Boston: Martinus Nijhoff.
Ermolaev, H. (1997). Censorship in Soviet Literature: 1917-1991. Rowman & Littlefield.
Friedberg, M., Watanabe, M., & Nakamoto, N. (1984). The Soviet Book Market: Supply and Demand.
Acta Slavica Iaponica, 2, 177–192.
Giblin, R. (2011). Code Wars: 10 Years of P2P Software Litigation. Cheltenham, UK ; Northampton,
MA: Edward Elgar Publishing.
Johns, A. (2010). Piracy: The Intellectual Property Wars from Gutenberg to Gates. University Of
Chicago Press.
Judge, C. B. (1934). Elizabethan book-pirates. Cambridge: Harvard University Press.
Khan, B. Z. (2004). Does Copyright Piracy Pay? The Effects Of U.S. International Copyright Laws On
The Market For Books, 1790-1920. Cambridge, MA: National Bureau Of Economic Research.
Khan, B. Z., & Sokoloff, K. L. (2001). The early development of intellectual property institutions in the
United States. Journal of Economic Perspectives, 15(3), 233–246.
Landes, W. M., & Posner, R. A. (2003). The economic structure of intellectual property law. Cambridge,
Mass.: Harvard University Press.
Lessig, L. (2004). Free culture : how big media uses technology and the law to lock down culture and
control creativity. New York: Penguin Press.
Liang, L. (2012). Shadow Libraries. e-flux. Retrieved from http://www.e-flux.com/journal/shadowlibraries/
Patry, W. F. (2009). Moral panics and the copyright wars. New York: Oxford University Press.
Patterson, L. R. (1968). Copyright in historical perspective (p. vii, 264 p.). Nashville,: Vanderbilt
University Press.
Pendleton-Jullian, A., Lougee, W. P., Wilkin, J., & Hilton, J. (2014). Strategic Thinking and Design—
Research Library in 2033—Vision and System of Action—Part One. Columbus, OH: Association of
Research Libraries. Retrieved from
http://www.arl.org/about/arl-strategic-thinking-and-design/arl-membership-refines-strategic-thinking-and-design-at-spring-2014-meeting
Pollard, A. W. (1916). The Regulation Of The Book Trade In The Sixteenth Century. Library, s3-VII(25),
18–43.
Pollard, A. W. (1920). Shakespeare’s fight with the pirates and the problems of the transmission of his
text. Cambridge [Eng.]: The University Press.
Poole, N. (2010). The Cost of Digitising Europe’s Cultural Heritage - A Report for the Comité des Sages
of the European Commission. Retrieved from
http://nickpoole.org.uk/wp-content/uploads/2011/12/digiti_report.pdf
Report on the responses to the Public Consultation on the Review of the EU Copyright Rules. (2013).
European Commission, Directorate General for Internal Market and Services.
Rosati, E. (2014a). Copyright exceptions and user rights in Case C-117/13 Ulmer: a couple of
observations. IPKat. Retrieved October 08, 2014, from http://ipkitten.blogspot.co.uk/2014/09/copyrightexceptions-and-user-rights-in.html
Rosati, E. (2014b). Dutch court refers questions to CJEU on e-lending and digital exhaustion, and another
Dutch reference on digital resale may be just about to follow. IPKat. Retrieved October 08, 2014, from
http://ipkitten.blogspot.co.uk/2014/09/dutch-court-refers-questions-to-cjeu-on.html
Rosati, E. (2014c). Google Books’ Library Project is fair use. Journal of Intellectual Property Law &
Practice, 9(2), 104–106.
Rose, M. (1993). Authors and owners : the invention of copyright. Cambridge, Mass: Harvard University
Press.
Samuelson, P. (2002). Copyright and freedom of expression in historical perspective. J. Intell. Prop. L.,
10, 319.
Samuelson, P. (2014). Mass Digitization as Fair Use. Communications of the ACM, 57(3), 20–22.
Schultz, M. F. (2007). Copynorms: Copyright Law and Social Norms. Intellectual Property And
Information Wealth v01, 1, 201.
Sezneva, O. (2012). The pirates of Nevskii Prospekt: Intellectual property, piracy and institutional
diffusion in Russia. Poetics, 40(2), 150–166.
Solly, E. (1885). Henry Hills, the Pirate Printer. Antiquary, xi, 151–154.
Stelmakh, V. D. (2001). Reading in the Context of Censorship in the Soviet Union. Libraries & Culture,
36(1), 143–151.
Suber, P. (2013). Open Access (Vol. 1). Cambridge, MA: The MIT Press.
doi:10.1109/ACCESS.2012.2226094
Swartz, A. (2008). Guerilla Open Access Manifesto. Aaron Swartz. Retrieved from
https://archive.org/stream/GuerillaOpenAccessManifesto/Goamjuly2008_djvu.txt
Triaille, J.-P., Dusollier, S., Depreeuw, S., Hubin, J.-B., Coppens, F., & Francquen, A. de. (2013). Study
on the application of Directive 2001/29/EC on copyright and related rights in the information society (the
“Infosoc Directive”). European Union.
Wittmann, R. (2004). Highwaymen or Heroes of Enlightenment? Viennese and South German Pirates and
the German Market. Paper presented at the History of Books and Intellectual History conference.
Princeton University.
Yu, P. K. (2000). From Pirates to Partners: Protecting Intellectual Property in China in the Twenty-First
Century. American University Law, 50. Retrieved from
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=245548
Мошков, М. (1999). Что вы все о копирайте. Лучше бы книжку почитали (Библиотеке копирайт не
враг). Компьютерры, (300).


Kelty, Bodo & Allen
Guerrilla Open Access
2018


Guerrilla Open Access
Edited by Memory of the World

Christopher Kelty
Balazs Bodo
Laurie Allen

Published by Post Office Press, Rope Press and Memory of the World. Coventry, 2018.
© Memory of the World, papers by respective Authors.
Freely available at: http://radicaloa.co.uk/conferences/ROA2
This is an open access pamphlet, licensed under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.
Read more about the license at: https://creativecommons.org/licenses/by-sa/4.0/
Figures and other media included with this pamphlet may be under different copyright restrictions.
Design by: Mihai Toma, Nick White and Sean Worley
Printed by: Rope Press, Birmingham

This pamphlet is published in a series of 7 as part of the Radical Open Access II – The Ethics of Care
conference, which took place June 26-27 at Coventry University. More information about this conference
and about the contributors to this pamphlet can be found at: http://radicaloa.co.uk/conferences/ROA2
This pamphlet was made possible due to generous funding from the arts and humanities research studio, The
Post Office, a project of Coventry University’s Centre for Postdigital Cultures and due to the combined
efforts of authors, editors, designers and printers.

Table of Contents

Guerrilla Open Access:
Terms Of Struggle
Memory of the World
Page 4

Recursive Publics and Open Access
Christopher Kelty
Page 6

Own Nothing
Balazs Bodo
Page 16

What if We Aren't the Only
Guerrillas Out There?
Laurie Allen
Page 26

Guerrilla Open Access: Terms Of Struggle

In the 1990s, the Internet offered a horizon from which to imagine what society
could become, promising autonomy and self-organization next to redistribution of
wealth and collectivized means of production. While the former was in line with the
dominant ideology of freedom, the latter ran contrary to the expanding enclosures
in capitalist globalization. This antagonism has led to epochal copyfights, where free
software and piracy kept the promise of radical commoning alive.
Free software, as Christopher Kelty writes in this pamphlet, provided a model ‘of a
shared, collective, process of making software, hardware and infrastructures that
cannot be appropriated by others’. Well into the 2000s, it served as an inspiration
for global free culture and open access movements who were speculating that
distributed infrastructures of knowledge production could be built, as the Internet
was, on top of free software.
For a moment, the hybrid world of ad-financed Internet giants—sharing code,
advocating open standards and interoperability—and users empowered by these
services, convinced almost everyone that a new reading/writing culture was
possible. Not long after the crash of 2008, these disruptors, now wary monopolists,
began to ingest smaller disruptors and close off their platforms. There was still
free software somewhere underneath, but without the ‘original sense of shared,
collective, process’. So, as Kelty suggests, it was hard to imagine that for-profit
academic publishers wouldn't try the same with open access.
Heeding Aaron Swartz’s call to civil disobedience, Guerrilla Open Access has
emerged out of the outrage over digitally-enabled enclosure of knowledge that
has allowed these for-profit academic publishers to appropriate extreme profits
that stand in stark contrast to the cuts, precarity, student debt and asymmetries
of access in education. Shadow libraries stood in for the access denied to public
libraries, drastically reducing global asymmetries in the process.

This radicalization of access has changed how publications
travel across time and space. Digital archiving, cataloging and
sharing is transforming what we once considered as private
libraries. Amateur librarianship is becoming public shadow
librarianship. Hybrid use, as poetically unpacked in Balazs
Bodo's reflection on his own personal library, is now entangling
print and digital in novel ways. And, as he warns, the terrain
of antagonism is shifting. While for-profit publishers are
seemingly conceding to Guerrilla Open Access, they are
opening new territories: platforms centralizing data, metrics
and workflows, subsuming academic autonomy into new
processes of value extraction.
The 2010s brought us hope and then the realization of how little
digital networks could help revolutionary movements. The
redistribution toward the wealthy, assisted by digitization, has
eroded institutions of solidarity. The embrace of privilege that
this has catalyzed, marked by misogyny, racism and xenophobia,
is nowhere more evident than in the climate denialism of the
Trump administration. Guerrilla archiving of US government
climate change datasets, as recounted by Laurie Allen,
indicates that more technological innovation simply won't do
away with the 'post-truth' and that our institutions might be in
need of revision, replacement and repair.
As the contributions to this pamphlet indicate, the terms
of struggle have shifted: not only do we have to continue
defending our shadow libraries, but we need to take back the
autonomy of knowledge production and rebuild institutional
grounds of solidarity.

Memory of the World
http://memoryoftheworld.org

Recursive Publics and Open Access

Christopher Kelty

Ten years ago, I published a book called Two Bits: The Cultural Significance of Free
Software (Kelty 2008).1 Duke University Press and my editor Ken Wissoker were
enthusiastically accommodating of my demands to make the book freely and openly
available. They also played along with my desire to release the 'source code' of the
book (i.e. HTML files of the chapters), and to compare the data on readers of the
open version to print customers. It was a moment of exploration for both scholarly
presses and for me. At the time, few authors were doing this other than Yochai Benkler
(2007) and Cory Doctorow2, both activists and advocates for free software and open
access (OA), much as I have been. We all shared, I think, a certain fanaticism of the
convert that came from recognizing free software as an historically new, and radically
different mode of organizing economic and political activity. Two Bits gave me a way
to talk not only about free software, but about OA and the politics of the university
(Kelty et al. 2008; Kelty 2014). Ten years later, I admit to a certain pessimism at the
way things have turned out. The promise of free software has foundered, though not
disappeared, and the question of what it means to achieve the goals of OA has been
swamped by concerns about costs, arcane details of repositories and versioning, and
ritual offerings to the metrics God.
When I wrote Two Bits, it was obvious to me that the collectives who built free
software were essential to the very structure and operation of a standardized
Internet. Today, free software and 'open source' refer to dramatically different
constellations of practice and people. Free software gathers around itself those
committed to the original sense of a shared, collective, process of making software,
hardware and infrastructures that cannot be appropriated by others. In political
terms, I have always identified free software with a very specific, updated, version
of classical Millian liberalism. It sustains a belief in the capacity for collective action
and rational thought as aids to establishing a flourishing human livelihood. Yet it
also preserves an outdated blind faith in the automatic functioning of meritorious
speech, that the best ideas will inevitably rise to the top. It is an updated classical
liberalism that saw in software and networks a new place to resist the tyranny of the
conventional and the taken for granted.

By contrast, open source has come to mean something quite different: an ecosystem
controlled by an oligopoly of firms which maintains a shared pool of components and
frameworks that lower the costs of education, training, and software creation in the
service of establishing winner-take-all platforms. These are built on open source, but
they do not carry the principles of freedom or openness all the way through to the
platforms themselves.³ What open source has become is now almost the opposite of
free software—it is authoritarian, plutocratic, and nepotistic, everything liberalism
wanted to resist. For example, precarious labor and platforms such as Uber or
TaskRabbit are built upon and rely on the fruits of the labor of 'open source', but the
platforms that result do not follow the same principles—they are not open or free
in any meaningful sense—to say nothing of the Uber drivers or task rabbits who live
by the platforms.
Does OA face the same problem? In part, my desire to 'free the source' of my book
grew out of the unfinished business of digitizing the scholarly record. It is an irony
that much of the work that went into designing the Internet at its outset in the
1980s, such as gopher, WAIS, and the HTML of CERN, was conducted in the name
of the digital transformation of the library. But by 2007, these aims were swamped
by attempts to transform the Internet into a giant factory of data extraction. Even
in 2006-7 it was clear that this unfinished business of digitizing the scholarly record
was going to become a problem—both because it was being overshadowed by other
concerns, and because of the danger it would eventually be subjected to the very
platformization underway in other realms.
Because if the platform capitalism of today has ended up being parasitic on the
free software that enabled it, then why would this not also be true of scholarship
more generally? Are we not witnessing a transition to a world where scholarship
is directed—in its very content and organization—towards the profitability of the
platforms that ostensibly serve it?⁴ Is it not possible that the platforms created to
'serve science'—Elsevier's increasing acquisition of tools to control the entire
lifecycle of research, or ResearchGate's ambition to become the single source for all
academics to network and share research—that these platforms might actually end up
warping the very content of scholarly production in the service of their profitability?
To put this even more clearly: OA has come to exist and scholarship is more available
and more widely distributed than ever before. But scholars now have less control,
and have taken less responsibility for the means of production of scientific research,
its circulation, and perhaps even the content of that science.

The Method of Modulation
When I wrote Two Bits I organized the argument around the idea of modulation:
free software is simply one assemblage of technologies, practices, and people
aimed at resolving certain problems regarding the relationship between knowledge
(or software tools related to knowledge) and power (Hacking 2004; Rabinow
2003). Free software as such was and still is changing as each of its elements
evolves or is recombined. Because OA derives some of its practices directly from
free software, it is possible to observe how these different elements have been
worked over in the recent past, as well as how new and surprising elements are
combined with OA to transform it. Looking back on the elements I identified as
central to free software, one can ask: how is OA different, and what new elements
are modulating it into something possibly unrecognizable?

Sharing source code
Shareable source code was a concrete and necessary achievement for free
software to be possible. Similarly, the necessary ability to circulate digital texts
is a significant achievement—but such texts are shareable in a much different way.
For source code, computable streams of text are everything—anything else is a
'blob' like an image, a video or any binary file. But scholarly texts are blobs: Word or
Portable Document Format (PDF) files. What's more, while software programmers
may love 'source code', academics generally hate it—anything less than the final,
typeset version is considered unfinished (see e.g. the endless disputes over
'author's final versions' plaguing OA).⁵ Finality is important. Modifiability of a text,
especially in the humanities and social sciences, is acceptable only when it is an
experiment of some kind.
In a sense, the source code of science is not a code at all, but a more abstract set
of relations between concepts, theories, tools, methods, and the disciplines and
networks of people who operate with them, critique them, extend them and try to
maintain control over them even as they are shared within these communities.

Defining openness
For free software to make sense as a solution, those involved first had to
characterize the problem it solved—and they did so by identifying a pathology in
the worlds of corporate capitalism and engineering in the 1980s: that computer
corporations were closed organizations who re-invented basic tools and
infrastructures in a race to dominate a market. An 'open system,' by contrast, would
avoid the waste of 'reinventing the wheel' and of pathological
competition, allowing instead modular, reusable parts that
could be modified and recombined to build better things in an
upward spiral of innovation. The 1980s ideas of modularity,
modifiability, abstraction barriers, interchangeable units
have been essential to the creation of digital infrastructures.
To propose an 'open science' thus modulates this definition—
and the idea works in some sciences better than others.
Aside from the obviously different commercial contexts,
philosophers and literary theorists just don't think about
openness this way—theories and arguments may be used
as building blocks, but they are not modular in quite the
same way. Only the free circulation of the work, whether
for recombination or for reference and critique, remains a
sine qua non of the theory of openness proposed there. It
is opposed to a system where it is explicit that only certain
people have access to the texts (whether that be through
limitations of secrecy, or limitations on intellectual property,
or an implicit elitism).

Writing and using copyright licenses
Of all the components of free software that I analyzed, this
is the one practice that remains the least transformed—OA
texts use the same CC licenses pioneered in 2001, which
were a direct descendant of free software licenses.
A novel modulation of these licenses is the OA policies (the
embrace of OA in Brazil for instance, or the spread of OA
Policies starting with Harvard and the University of California,
and extending to the EU Mandate from 2008 forward). Today
the ability to control the circulation of a text with IP rights is
far less economically central to the strategies of publishers
than it was in 2007, even if they persist in attempting to do
so. At the same time, funders, states, and universities have all
adopted patchwork policies intended to both sustain green
OA, and push publishers to innovate their own business
models in gold and hybrid OA. While green OA is a significant
success on paper, the actual use of it to circulate work pales
in comparison to the commercial control of circulation on the
one hand, and the increasing success of shadow libraries on
the other. Repositories have sprung up in every shape and
form, but they remain largely ad hoc, poorly coordinated, and
underfunded solutions to the problem of OA.

Coordinating collaborations
The collective activity of free software is ultimately the
most significant of its achievements—marrying a form of
intensive small-scale interaction amongst programmers,
with sophisticated software for managing complex objects
(version control and GitHub-like sites). There has been
constant innovation in these tools for controlling, measuring,
testing, and maintaining software.
By contrast, the collective activity of scholarship is still
largely a pre-modern affair. It is coordinated largely by the
idea of 'writing an article together' and not by working
to maintain some larger map of what a research topic,
community, or discipline has explored—what has worked and
what has not.
This focus on the coordination of collaboration seemed to
me to be one of the key advantages of free software, but it
has turned out to be almost totally absent from the practice
or discussion of OA. Collaboration and the recombination of
elements of scholarly practice obviously happen, but they do
not depend on OA in any systematic way: there is only the
counterfactual that without it, many different kinds of people
are excluded from collaboration in, or even simple participation
in, scholarship, something that most active scholars are
willfully ignorant of.

Fomenting a movement
I demoted the idea of a social movement to merely one
component of the success of free software, rather than let
it be—as most social scientists would have it—the principal
container for free software. Movements are not the whole story.

Is there an OA movement? Yes and no. Librarians remain
the most activist and organized. The handful of academics
who care about it have shifted to caring about it in a primarily
bureaucratic sense, forsaking the cross-organizational
aspects of a movement in favor of activism within universities
(to which I plead guilty). But this transformation forsakes
the need for addressing the collective, collaborative
responsibility for scholarship in favor of letting individual
academics, departments, and disciplines be the focus for
such debates.
By contrast, the publishing industry works with a
phantasmatic idea of both an OA 'movement' and of the actual
practices of scholarship—they too defer, in speech if not in
practice, to the academics themselves, but at the same time
must create tools, innovate processes, establish procedures,
acquire tools and companies and so on in an effort to capture
these phantasms and to prevent academics from collectively
doing so on their own.
And what new components? The five above were central to
free software, but OA has other components that are arguably
more important to its organization and transformation.

Money, i.e. library budgets
Central to almost all of the politics and debates about OA
is the political economy of publication. From the 'bundles'
debates of the 1990s to the gold/green debates of the 2010s,
the sole source of money for publication long ago shifted into
the library budget. The relationship that library budgets
have to other parts of the political economy of research
(funding for research itself, debates about tenured/
non-tenured, adjunct and other temporary salary structures) has
shifted as a result of the demand for OA, leading libraries
to re-conceptualize themselves as potential publishers, and
publishers to re-conceptualize themselves as serving the 'life
cycle' or 'pipeline' of research, not just its dissemination.

Metrics
More than anything, OA is promoted as a way to continue
to feed the metrics God. OA means more citations, more
easily computable data, and more visible uses and re-uses of
publications (as well as 'open data' itself, when conceived of
as product and not measure). The innovations in the world
of metrics—from the quiet expansion of the platforms of the
publishers, to the invention of 'alt metrics', to the enthusiasm
of 'open science' for metrics-driven scientific methods—form
a core feature of what 'OA' is today, in a way that was not true
of free software before it, where metrics concerning users,
downloads, commits, or lines of code were always
after-the-fact measures of quality, and not constitutive ones.
Other components of this sort might be proposed, but the
main point is to resist clutching OA as if it were the beating
heart of a social transformation in science, as if it were a
thing that must exist, rather than a configuration of elements
at a moment in time. OA was a solution—but it is too easy to
lose sight of the problem.

Open Access without Recursive Publics
When we no longer have any commons, but only platforms,
will we still have knowledge as we know it? This is a question
at the heart of research in the philosophy and sociology
of knowledge—not just a concern for activism or social
movements. If knowledge is socially produced and maintained,
then the nature of the social bond surely matters to the
nature of that knowledge. This is not so different from asking
whether we will still have labor or work, as we have long known
it, in an age of precarity. What is the knowledge equivalent of
precarity (i.e. not just the existence of precarious knowledge
workers, but a kind of precarious knowledge as such)?

Do we not already see the evidence of this in the 'post-truth'
of fake news, or the deliberate refusal by those in
power to countenance evidence, truth, or established
systems of argument and debate? The relationship between
knowledge and power is shifting dramatically, because the costs—and the stakes—
of producing high quality, authoritative knowledge have also shifted. It is not so
powerful any longer; science does not speak truth to power because truth is no
longer so obviously important to power.
Although this is a pessimistic portrait, it may also be a sign of something yet to
come. Free software, as a community, has been and still sometimes is critiqued as
being an exclusionary space of white male sociality (Nafus 2012; Massanari 2016;
Ford and Wajcman 2017; Reagle 2013). I think this critique is true, but it is less a
problem of identity than it is a pathology of a certain form of liberalism: a form that
demands that merit consists only in the content of the things we say (whether in
a political argument, a scientific paper, or a piece of code), and not in the ways we
say them, or who is encouraged to say them and who is encouraged to remain silent
(Dunbar-Hester 2014).
One might, as a result, choose to throw out liberalism altogether as a broken
philosophy of governance and liberation. But it might also be an opportunity to
focus much more specifically on a particular problem of liberalism, one that the
discourse of OA also relies on to a large extent. Perhaps it is not the case that
merit derives solely from the content of utterances freely and openly circulated,
but also from the ways in which they are uttered, and the dignity of the people
who utter them. An OA (or a free software) that embraced that principle would
demand that we pay attention to different problems: how are our platforms,
infrastructures, tools organized and built to support not just the circulation of
putatively true statements, but the ability to say them in situated and particular
ways, with respect for the dignity of who is saying them, and with the freedom to
explore the limits of that kind of liberalism, should we be so lucky as to achieve it.

References

Benkler, Yochai. 2007. The Wealth of Networks: How Social Production Transforms Markets
and Freedom. Yale University Press.
Dunbar-Hester, Christina. 2014. Low Power to the People: Pirates, Protest, and Politics in
FM Radio Activism. MIT Press.
Ford, Heather, and Judy Wajcman. 2017. “‘Anyone Can Edit’, Not Everyone Does:
Wikipedia’s Infrastructure and the Gender Gap”. Social Studies of Science 47 (4):
511–527. doi:10.1177/0306312717692172.
Hacking, Ian. 2004. Historical Ontology. Harvard University Press.
Kelty, Christopher M. 2014. “Beyond Copyright and Technology: What Open Access Can
Tell Us About Precarity, Authority, Innovation, and Automation in the University
Today”. Cultural Anthropology 29 (2): 203–215. doi:10.14506/ca29.2.02.
———. 2008. Two Bits: The Cultural Significance of Free Software. Durham, N.C.: Duke
University Press.
Kelty, Christopher M., et al. 2008. “Anthropology In/of Circulation: a Discussion”. Cultural
Anthropology 23 (3).
Massanari, Adrienne. 2016. “#gamergate and the Fappening: How Reddit’s Algorithm,
Governance, and Culture Support Toxic Technocultures”. New Media & Society 19 (3):
329–346. doi:10.1177/1461444815608807.
Nafus, Dawn. 2012. “‘Patches don’t have gender’: What is not open in open source
software”. New Media & Society 14 (4): 669–683. doi:10.1177/1461444811422887.
Rabinow, Paul. 2003. Anthropos Today: Reflections on Modern Equipment. Princeton
University Press.
Reagle, Joseph. 2013. “‘Free As in Sexist?’ Free Culture and the Gender Gap”. First
Monday 18 (1). doi:10.5210/fm.v18i1.4291.

¹ https://twobits.net/download/index.html
² https://craphound.com/
³ For example, Platform Cooperativism: https://platform.coop/directory
⁴ See for example the figure from ’Rent Seeking by Elsevier,’ by Alejandro Posada
and George Chen (http://knowledgegap.org/index.php/sub-projects/rent-seekingand-financialization-of-the-academicpublishing-industr preliminary-findings/)
⁵ See Sherpa/Romeo: http://www.sherpa.ac.uk/romeo/index.php

Own Nothing

Balazs Bodo

Flow My Tears
My tears cut deep grooves into the dust on my face. Drip, drip,
drop, they hit the floor and disappear among the torn pages
scattered on the floor.
This year it dawned on us that we cannot postpone it any longer:
our personal library has to go. Our family moved countries
more than half a decade ago, we switched cultures, languages,
and chose another future. But the past, in the form of a few
thousand books in our personal library, was still neatly stacked
in our old apartment, patiently waiting, books that we bought
and enjoyed — and forgot; books that we bought and never
opened; books that we inherited from long-dead parents and
half-forgotten friends. Some of them were important. Others
were relevant at one point but no longer, yet they still reminded
us who we once were.
When we moved, we took no more than two suitcases of personal
belongings. The books were left behind. The library was like
a sick child or an ailing parent, it hung over our heads like an
unspoken threat, a curse. It was clear that sooner or later
something had to be done about it, but none of the options
available offered any consolation. It made no sense to move
three thousand books to the other side of this continent. We
decided to emigrate, and not to take our past with us, abandon
the contexts we were fleeing from. We made a choice to leave
behind the history, the discourses, the problems and the pain
that accumulated in the books of our library. I knew exactly
what it was I didn’t want to teach to my children once we moved.
So we did not move the books. We pretended that we would
never have to think about what this decision really meant. Up
until today. This year we needed to empty the study with the
shelves. So I’m standing in our library now, the dust covering
my face, my hands, my clothes. In the middle of the floor there
are three big crates and one small box. The small box swallows
what we’ll ultimately take with us, the books I want to show to
my son when he gets older, in case he still wants to read. One of
the big crates will be taken away by the antiquarian. The other
will be given to the school library next door. The third is the
wastebasket, where everything else will ultimately go.

Drip, drip, drip, my tears flow as I throw the books into this
last crate, drip, drip, drop. Sometimes I look at my partner,
working next to me, and I can see on her face that she is going
through the same emotions. I sometimes catch the sight of
her trembling hand, hesitating for a split second where a book
should ultimately go, whether we could, whether we should
save that particular one, because… But we either save them all
or we are as ruthless as all those millions of people throughout
history, who had an hour to pack their two suitcases before they
needed to leave. Do we truly need this book? Is this a book we’ll
want to read? Is this book an inseparable part of our identity?
Did we miss this book at all in the last five years? Is this a text
I want to preserve for the future, for potential grandchildren
who may not speak my mother tongue at all? What is the function
of the book? What is the function of this particular book in my
life? Why am I hesitating to throw it out? Why should I hesitate
at all? Drop, drop, drop, a decision has been made. Drop, drop,
drop, books are falling to the bottom of the crates.
We are killers, gutting our library. We are like the half-drowned
sailor, who got entangled in the ropes, and went down with the
ship, and who now frantically tries to cut himself free from the
detritus that prevents him from reaching the freedom of the surface,
the sunlight and the air.

Own Nothing, Have Everything
Do you remember Napster’s slogan after it went legit, trying to transform itself into
a legal music service around 2005? ‘Own nothing, have everything’ – that was the
headline that was supposed to sell legal streaming music. How stupid, I thought. How
could you possibly think that lack of ownership would be a good selling point? What
does it even mean to ‘have everything’ without ownership? And why on earth would
not everyone want to own the most important constituents of their own self, their
own identity? The things I read, the things I sing, make me who I am. Why wouldn’t I
want to own these things?
How revolutionary this idea had been, I reflected, as I watched the local homeless folks
filling up their sacks with the remains of my library. How happy I would be if I could
have all this stuff I had just thrown away without actually having to own any of it. The
proliferation of digital texts led me to believe that we won’t be needing dead wood
libraries at all, at least no more than we need vinyl to listen to, or collect music. There
might be geeks, collectors, specialists, who for one reason or another still prefer the
physical form to the digital, but for the rest of us convenience, price, searchability, and
all the other digital goodies give enough reason not to collect stuff that collects dust.
I was wrong to think that. I now realize that the future is not fully digital; it is more
a physical-digital hybrid, in which the printed book is not simply an endangered
species protected by a few devoted eccentrics who refuse to embrace the obvious
advantages of a fully digital book future. What I see now is the emergence of a strange
and shapeshifting hybrid of diverse physical and electronic objects and practices,
where the relative strengths and weaknesses of these different formats nicely
complement each other.
This dawned on me after we had moved into an apartment without a bookshelf. I grew
up in a flat that housed my parents’ extensive book collection. I knew the books by their
cover and from time to time something made me want to take one from the shelf, open
it and read it. This is how I discovered many of my favorite books and writers. With
the e-reader, and some of the best shadow libraries at hand, I felt the same at first. I
felt liberated. I could experiment without cost or risk, I could start—or stop—a book,
I didn’t have to consider the cost of buying and storing a book that was ultimately
not meant for me. I could enjoy the books without having to carry the burden and
responsibility of ownership.
Did you notice how deleting an epub file gives you a different feeling than throwing
out a book? You don’t have to feel guilty, you don’t have to feel anything at all.
So I was reading, reading, reading like never before. But at that time my son was too
young to read, so I didn’t have to think about him, or anyone else besides myself. But
as he was growing, it slowly dawned on me: without these physical books how will I be
able to give him the same chance of serendipity, and of discovery, enchantment, and
immersion that I got in my father’s library? And even later, what will I give him as his
heritage? Son, look into this folder of PDFs: this is my legacy, your heritage, explore,
enjoy, take pride in it?
Collections of anything, whether they are art, books, objects, people, are inseparable
from the person who assembled that collection, and when that person is gone, the
collection dies, as does the most important inroad to it: the will that created this
particular order of things has passed away. But the heavy and unavoidable physicality
of a book collection forces all those left behind to make an effort to approach, to
force their way into, and try to navigate that garden of forking paths that is someone
else’s library. Even if you ultimately get rid of everything, you have to introduce
yourself to every book, and let every book introduce itself to you, so you know what
you’re throwing out. Even if you’ll ultimately kill, you will need to look into the eyes of
all your victims.
With a digital collection that’s, of course, not the case.
The e-book is ephemeral. It has little past and even less chance to preserve the
fingerprints of its owners over time. It is impersonal, efficient, fast, abundant, like
fast food or plastic, it flows through the hand like sand. It lacks the embodiment, the
materiality which would give it a life in a temporal dimension. If you want to network the
dead and the unborn, as is the ambition of every book, then you need to print and bind,
and create heavy objects that are expensive, inefficient and a burden. This burden
subsisting in the object is the bridge that creates the intergenerational dimension,
that forces you to think of the value of a book.
Own nothing, have nothing. Own everything, and your children will hate you when
you die.
I have to say, I’m struggling to find a new balance here. I started to buy books again,
usually books that I’d already read from a stolen copy on-screen. I know what I want
to buy, I know what is worth preserving. I know what I want to show to my son, what
I want to pass on, what I would like to take care of over time. Before, book buying for
me was an investment into a stranger. Now that thrill is gone forever. I measure up
the merchandise well beforehand, I build an intimate relationship, we make love again
and again, before moving in together.
It is certainly a new kind of relationship with the books I have bought since I got my e-reader.
I still have to come to terms with the fact that the books I bought this way are rarely
opened, as I already know them, and their role is not to be read, but to be together.
What do I buy, and what do I get? Temporal, existential security? The chance of
serendipity, if not for me, then for the people around me? The reassuring materiality
of the intimacy I built with these texts through another medium?
All of these and maybe more. But in any case, I sense that this library, the physical
embodiment of a physical-electronic hybrid collection with its unopened books and
overflowing e-reader memory cards, is very different from the library I had, and the
library I’m getting rid of at this very moment. The library that I inherited, the library
that grew organically from the detritus of the everyday, the library that accumulated
books similar to how the books accumulated dust, as is the natural way of things, this
library was full of unknowns, it was a library of potentiality, of opportunities, of trips
waiting to happen. This new, hybrid library is a collection of things that I’m familiar with.
I intimately know every piece, they hold little surprise, they offer few discoveries — at
least for me. The exploration, the discovery, the serendipity, the pre-screening take
place on the e-reader, among the ephemeral, disposable PDFs and epubs.

Have everything, and own a few.

We Won
This new hybrid model is based on the cheap availability of digital books. In my case, the
free availability of pirated copies available through shadow libraries. These libraries
don’t have everything on offer, but they have an order of magnitude more books
than I’ll ever have the time and chance to read, so they offer enough, enough for me
to fill up hard drives with books I want to read, or at least skim, to try, to taste. As if I
moved into an infinite bookstore or library, where I can be as promiscuous, explorative,
nomadic as I always wanted to be. I can flirt with books, I can have a quickie, or I can
leave them behind without shedding a single tear.
I don’t know how this hybrid library, and this analogue-digital hybrid practice of reading
and collecting would work without the shadow libraries which make everything freely
accessible. I rely on their supply to test texts, and feed and grow my print library.
E-books are cheaper than their print versions, but they still cost money, carry a
risk, a cost of experimentation. Book-streaming, the flat-rate, all-you-can-eat
format of accessing books, is at the moment available for audiobooks, but rarely
for e-books. I wonder why.
Did you notice that there are no major book piracy lawsuits?
Of course there is the lawsuit against Sci-Hub and Library Genesis in New York, and
there is another one in Canada against aaaaarg, causing major nuisance to those who
have been named in these cases. But this is almost negligible compared to the high
profile wars the music and audiovisual industries waged against Napster, Grokster,
Kazaa, Megaupload and their likes. It is as if book publishers have completely given up on
trying to fight piracy in the courts, and have launched a few lawsuits only to maintain
the appearance that they still care about their digital copyrights. I wonder why.
I know the academic publishing industry slightly better than the mainstream popular
fiction market, and I have the feeling that in the former, copyright-based business
models are slowly being replaced by something else. We see no major anti-piracy
efforts from publishers, not because piracy is non-existent — on the contrary, it is
global, and it is big — but because the publishers most probably realized that in the
long run the copyright-based exclusivity model is unsustainable. The copyright wars
of the last two decades taught them that law cannot put an end to piracy. As the
Sci-Hub case demonstrates, you can win all you want in a New York court, but this
has little real-world effect as long as the conditions that attract the users to the
shadow libraries remain.
Exclusivity-based publishing business models are under assault from other sides as
well. Mandated open access in the US and in the EU means that there is a quickly
growing body of new research for which publishers can no longer charge for
access. LibGen and Sci-Hub make it harder to charge for the back catalogue.
Their sheer existence teaches millions what uncurtailed open access really is, and
makes it easier for university libraries to negotiate with publishers, as they don’t have
to worry about their patrons being left without any access at all.
The good news is that radical open access may well be happening. It is a less and less
radical idea to have things freely accessible. One has to be less and less radical to
achieve the openness that has been long overdue. Maybe it is not yet obvious today
and the victory is not yet universal, maybe it’ll take some extra years, maybe it won’t
ever be evenly distributed, but it is obvious that this genie, these millions of books on
everything from malaria treatments to critical theory, cannot be erased, and open
access will not be undone, and the future will be free of access barriers.

We Are Not Winning at All
But did we really win? If publishers are happy to let go of access control and copyright,
it means that they’ve found something that is even more profitable than selling
back to us academics the content that we have produced. And this more profitable
something is of course data. Did you notice where all the investment in academic
publishing went in the last decade? Did you notice SSRN, Mendeley, Academia.edu,
ScienceDirect, research platforms, citation software, manuscript repositories, library
systems being bought up by the academic publishing industry? All these platforms
and technologies operate on and support open access content, while they generate
data on the creation, distribution, and use of knowledge; on individuals, researchers,
students, and faculty; on institutions, departments, and programs. They produce data
on the performance, on the success and the failure of the whole domain of research
and education. This is the data that is being privatized, enclosed, packaged, and sold
back to us.

Drip, drip, drop, it’s only nostalgia. My heart is light, as I don’t have to worry about
gutting the library. Soon it won’t matter at all.

Taylorism has reached academia. In the name of efficiency, austerity, and transparency,
our daily activities are measured, profiled, packaged, and sold to the highest bidder.
But in this process of quantification, knowledge about ourselves is lost to us, unless we
pay. We still have some patchy datasets on what we do, on who we are; we still have
this blurred reflection in the data-mirrors that we still control. But this path to
self-enlightenment is quickly narrowing as fewer and fewer data sources about us are freely
available to us.

Who is downloading books and articles? Everyone. Radical open access? We won,
if you like.

I strongly believe that information on the self is the foundation
of self-determination. We need to have data on how we operate,
on what we do in order to know who we are. This is what is being
privatized away from the academic community, this is being
taken away from us.
Radical open access. Not of content, but of the data about
ourselves. This is the next challenge. We will digitize every page,
by hand if we must, that process cannot be stopped anymore.
No outside power can stop it and take that from us. Drip, drip,
drop, this is what I console myself with, as another handful of
books land among the waste.
But the data we lose now will not be so easy to reclaim.

What if We Aren't the Only Guerrillas Out There?
Laurie Allen

My goal in this paper is to tell the story
of a grass-roots project called Data
Refuge (http://www.datarefuge.org)
that I helped to co-found shortly after,
and in response to, the Trump election
in the USA. Trump’s reputation as
anti-science, and the promise that his
administration would elevate people into
positions of power with a track record
of distorting, hiding, or obscuring the
scientific evidence of climate change
caused widespread concern that
valuable federal data was now in danger.
The Data Refuge project grew from the
work of Professor Bethany Wiggin and
the graduate students within the Penn
Program in Environmental Humanities
(PPEH), notably Patricia Kim, and was
formed in collaboration with the Penn
Libraries, where I work. In this paper, I
will discuss the Data Refuge project, and
call attention to a few of the challenges
inherent in the effort, especially as
they overlap with the goals of this
collective. I am not a scholar. Instead,
I am a librarian, and my perspective as
a practicing information professional
informs the way I approach this paper,
which weaves together the practical
and technical work of ‘saving data’ with
the theoretical, systemic, and ethical
issues that frame and inform what we
have done.

I work as the head of a relatively small and new department within the libraries
of the University of Pennsylvania, in the city of Philadelphia, Pennsylvania, in the
US. I was hired to lead the Digital Scholarship department in the spring of 2016,
and most of the seven (soon to be eight) people within Digital Scholarship have joined
the library since then in newly created positions. Our group includes a mapping
and spatial data librarian and three people focused explicitly on supporting the
creation of new Digital Humanities scholarship. There are also two people in the
department who provide services connected with digital scholarly open access
publishing, including the maintenance of the Penn Libraries’ repository of open
access scholarship, and one Data Curation and Management Librarian. This
Data Librarian, Margaret Janz, started working with us in September 2016, and
features heavily in the story I’m about to tell about our work helping to build Data
Refuge. While Margaret and I were the main people in our department involved in
the project, it is useful to understand the work we did as connected more broadly
to the intersection of activities represented in our department at Penn, from
multimodal, digital, humanities creation to open access publishing across disciplines.
At the start of Data Refuge, Professor Wiggin and her students had already been
exploring the ways that data about the environment can empower communities
through their art, activism, and research, especially along the lower Schuylkill
River in Philadelphia. They were especially attuned to the ways that missing data,
or data that is not collected or communicated, can be a source of disempowerment.
After the Trump election, PPEH graduate students raised the concern that the
political commitments of the new administration would result in the disappearance
of environmental and climate data that is vital to work in cities and communities
around the world. When they raised this concern with the library, together we co-founded Data Refuge. It is worth noting that, while the Penn Libraries is a
large and relatively well-resourced research library in the United States, it did not
have any automatic way to ingest and steward the data that Professor Wiggin and
her students were concerned about. Our system of acquiring, storing, describing
and sharing publications did not account for, and could not easily handle, the
evident need to take in large quantities of public data from the open web and make
them available and citable by future scholars. Indeed, no large research library
was positioned to respond to this problem in a systematic way, though there was
general agreement that the community would like to help.
The collaborative, grass-roots movement that formed Data Refuge included many
librarians, archivists, and information professionals, but it was clear from the
beginning that my own profession did not have in place a system for stewarding
these vital information resources, or for treating them as ‘publications’ of the
federal government. This fact was widely understood by various members of our
profession, notably by government documents librarians, who had been calling
attention to this lack of infrastructure for years. As Government Information
Librarian Shari Laster described in a blog post in November of 2016, government
documents librarians have often felt like they are ‘under siege’ not from political
forces, but from the inattention to government documents afforded by our systems
and infrastructure. Describing the challenges facing the profession in light of the
2016 election, she commented: “Government documents collections in print are
being discarded, while few institutions are putting strategies in place for collecting
government information in digital formats. These strategies are not expanding in
tandem with the explosive proliferation of these sources, and certainly not in pace
with the changing demands for access from public users, researchers, students,
and more.” (Laster 2016) Beyond government documents librarians, our project
joined efforts that were ongoing in a huge range of communities, including: open
data and open science activists; archival experts working on methods of preserving
born-digital content; cultural historians; federal data producers and the archivists
and data scientists they work with; and, of course, scientists.

Born from the collaboration between Environmental Humanists and Librarians,
Data Refuge was always an effort both at storytelling and at storing data. During
the first six months of 2017, volunteers across the US (and elsewhere) organized
more than 50 Data Rescue events, with participants numbering in the thousands.
At each event, a group of volunteers used tools created by our collaborators at
the Environmental Data and Governance Initiative (EDGI) (https://envirodatagov.org/)
to support the End of Term Harvest (http://eotarchive.cdlib.org/) project
by identifying seeds from federal websites for web archiving in the Internet
Archive. Simultaneously, more technically advanced volunteers wrote scripts to
pull data out of complex data systems, and packaged that data for longer term
storage in a repository we maintained at datarefuge.org. Still other volunteers
held teach-ins, built profiles of data storytellers, and otherwise engaged in
safeguarding environmental and climate data through community action (see
http://www.ppehlab.org/datarefugepaths). The repository at datarefuge.org that
houses the more difficult data sources has been stewarded by myself and Margaret
Janz through our work at Penn Libraries, but it exists outside the library’s main
technical infrastructure.¹
This distributed approach to the work of downloading and saving the data
encouraged people to see how they were invested in environmental and scientific
data, and to consider how our government records should be considered the
property of all of us. Attending Data Rescue events was a way for people who value
the scientific record to fight back, in a concrete way, against
an anti-fact establishment. By downloading data and moving
it into the Internet Archive and the Data Refuge repository,
volunteers were actively claiming the importance of accurate
records in maintaining or creating a just society.
Of course, access to data need not rely on its inclusion in
a particular repository. As is demonstrated so well in other
contexts, technological methods of sharing files can make
the digital repositories of libraries and archives seem like a
redundant holdover from the past. However, as I will argue
further in this paper, the data that was at risk in Data Refuge
differed in important ways from the contents of what Bodó
refers to as ‘shadow libraries’ (Bodó 2015). For opening
access to copies of journal articles, shadow libraries work
perfectly. However, the value of these shadow libraries relies
on the existence of widely agreed-upon trusted versions.
When in doubt about whether a copy is trustworthy, scholars
can turn to more mainstream copies. This was
not the situation we faced building Data Refuge. Instead, we
were often dealing with the sole public, authoritative copy
of a federal dataset and had to assume that, if it were taken
down, there would be no way to check the authenticity of
other copies. The data was not easily pulled out of systems
as the data and the software that contained them were often
inextricably linked. We were dealing with unique, tremendously
valuable, but often difficult-to-untangle datasets rather than
neatly packaged publications. The workflow we established
was designed to privilege authenticity and trustworthiness
over either the speed of the copying or the easy usability of
the resulting data.² This extra care around authenticity was
necessary because of the politicized nature of environmental
data that made many people so worried about its removal
after the election. It was important that our project
supported the strongest possible scientific arguments that
could be made with the data we were ‘saving’. That meant
that our copies of the data needed to be citable in scientific
scholarly papers, and that those citations needed to be
able to withstand hostile political forces who claim that the
science of human-caused climate change is ‘uncertain’. It
was easy to imagine in the Autumn of 2016, and even easier
to imagine now, that hostile actors might wish to muddy the
science of climate change by releasing fake data designed
to cast doubt on it. For that
reason, I believe that the unique facts we were seeking
to safeguard in the Data Refuge bear less similarity to the
contents of shadow libraries than they do to news reports
in our current distributed and destabilized mass media
environment. Referring to the ease of publishing ideas on the
open web, Zeynep Tufekci wrote in a recent column, “And
sure, it is a golden age of free speech—if you can believe your
lying eyes. Is that footage you’re watching real? Was it really
filmed where and when it says it was? Is it being shared by alt-right
trolls or a swarm of Russian bots? Was it maybe even
generated with the help of artificial intelligence? (Yes, there
are systems that can create increasingly convincing fake
videos.)” (Tufekci 2018). This was the state we were trying to
avoid when it comes to scientific data, fearing that we might
have the only copy of a given dataset without solid proof that
our copy matched the original.
If US federal websites cease functioning as reliable stewards
of trustworthy scientific data, reproducing their data
without a new model of quality control risks producing the
very censorship that our efforts are supposed to avoid,
and further undermining faith in science. Said another way,
if volunteers duplicated federal data all over the Internet
without a trusted system for ensuring the authenticity of
that data, then as soon as the originals were removed, a sea of
fake copies could easily render the original invisible, and they
would be just as effectively censored. “The most effective
forms of censorship today involve meddling with trust and
attention, not muzzling speech itself.” (Tufekci 2018).
These concerns about the risks of open access to data should
not be understood as capitulation to the current market-driven approach to scholarly publishing, nor as a call for
continuation of the status quo. Instead, I hope to encourage
continuation of the creative approaches to scholarship
represented in this collective. I also hope the issues raised in
Data Refuge will serve as a call to take greater responsibility for the systems into
which scholarship flows and the structures of power and assumptions of trust (by
whom, of whom) that scholarship relies on.
While plenty of participants in the Data Refuge community posited scalable
technological approaches to help people trust data, none emerged that were
strong enough to justify risking the further undermining of faith in science that a
malicious attack might cause. Instead of focusing on technical solutions that rely on the existing
systems staying roughly as they are, I would like to focus on developing networks
that explore different models of trust in institutions, and that honor the values
of marginalized and indigenous people. For example, in a recent paper, Stacie
Williams and Jarrett Drake describe the detailed decisions they made to establish
and become deserving of trust in supporting the creation of an Archive of Police
Violence in Cleveland (Williams and Drake 2017). The work of Michelle Caswell and
her collaborators on exploring post-custodial archives, and on engaging in radical
empathy in the archives provide great models of the kind of work that I believe is
necessary to establish new models of trust that might help inform new modes of
sharing and relying on community information (Caswell and Cifor 2016).
Beyond seeking new ways to build trust, it has become clear that new methods
are needed to help filter and contextualize publications. Our current reliance
on a few for-profit companies to filter and rank what we see of the information
landscape has proved to be tremendously harmful for the dissemination of facts,
and has been especially dangerous to marginalized communities (Noble 2018).
While the world of scholarly humanities publishing is doing somewhat better than
open data or mass media, there is still a risk that without new forms of filtering and
establishing quality and trustworthiness, good ideas and important scholarship will
be lost in the rankings of search engines and the algorithms of social media. We
need new, large scale systems to help people filter and rank the information on the
open web. In our current situation, according to media theorist danah boyd, “[t]he
onus is on the public to interpret what they see. To self-investigate. Since we live
in a neoliberal society that prioritizes individual agency, we double down on media
literacy as the ‘solution’ to misinformation. It’s up to each of us as individuals to
decide for ourselves whether or not what we’re getting is true.” (boyd 2018)
In closing, I’ll return to the notion of Guerrilla warfare that brought this panel
together. While some of our collaborators and some in the press did use the term
‘Guerrilla archiving’ to describe the data rescue efforts (Currie and Paris 2017),
I generally did not. The work we did was indeed designed to take advantage of
tactics that allow a small number of actors to resist giant state power. However,
if anything, the most direct target of these guerrilla actions in my mind was not
the Trump administration. Instead, the action was designed to prompt responses
by the institutions where many of us work and by communities of scholars and
activists who make up these institutions. It was designed to get as many people as
possible working to address the complex issues raised by the two interconnected
challenges that the Data Refuge project threw into relief. The first challenge,
of course, is the need for new scientific, artistic, scholarly and narrative ways of
contending with the reality of global, human-made climate change. And the second
challenge, as I’ve argued in this paper, is that our systems of establishing and
signaling trustworthiness, quality, reliability and stability of information are in dire
need of creative intervention as well. It is not just publishing but all of our systems
for discovering, sharing, acquiring, describing and storing that scholarship that
need support, maintenance, repair, and perhaps in some cases, replacement. And
this work will rely on scholars, as well as expert information practitioners from a
range of fields (Caswell 2016).

¹ At the time of this writing, we are working
on un-packing and repackaging the data
within Data Refuge for eventual inclusion
in various Research Library Repositories.

² Ideally, of course, all federally produced
datasets would be published in neatly
packaged and more easily preservable
containers, along with enough technical
checks to ensure their validity (hashes,
checksums, etc.) and each agency would
create a periodically published inventory of
datasets. But the situation we encountered
with Data Refuge did not start us in
anything like that position, despite the
hugely successful and important work of
the employees who created and maintained
data.gov. For a fuller view of this workflow,
see my talk at CSVConf 2017 (Allen 2017).
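To make concrete the kind of technical check mentioned in the note above (hashes, checksums), here is a minimal sketch in Python. It is an illustration only and does not describe the tooling actually used in the Data Refuge workflow; the function names (sha256_of, build_manifest, verify_copy) and the choice of SHA-256 are assumptions made for the example. It records a checksum for every file in a dataset folder and later tests whether a copy still matches that record.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its checksum.

    Hypothetical helper: a real packaging format would also record
    provenance, such as who made the copy, when, and from where.
    """
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_copy(manifest: Dict[str, str], copy_root: Path) -> List[str]:
    """List every file in the manifest that is missing or altered in a copy."""
    problems = []
    for relative_name, expected in manifest.items():
        candidate = copy_root / relative_name
        if not candidate.is_file():
            problems.append(f"missing: {relative_name}")
        elif sha256_of(candidate) != expected:
            problems.append(f"altered: {relative_name}")
    return problems
```

A check like this can only show that a copy matches a manifest. It cannot show that the manifest itself was made from the authoritative original, which is why the question of who is trusted to produce and hold such a record matters as much as the checksum.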

Closing note: The workflow established and used at Data Rescue events was
designed to tackle this set of difficult issues, but needed refinement, and was retired
in mid-2017. The Data Refuge project continues, led by Professor Wiggin and her
colleagues and students at PPEH, who are “building a storybank to document
how data lives in the world – and how it connects people, places, and non-human
species.” (“DataRefuge” n.d.) In addition, the set of issues raised by Data Refuge
continues to inform my work and the work of many of our collaborators.

References
Allen, Laurie. 2017. “Contexts and Institutions.” Paper presented at csv,conf,v3, Portland,
Oregon, May 3rd 2017. Accessed May 20, 2018. https://youtu.be/V2gwi0CRYto.
Bodo, Balazs. 2015. “Libraries in the Post-Scarcity Era.” In Copyrighting Creativity:
Creative Values, Cultural Heritage Institutions and Systems of Intellectual Property,
edited by Porsdam. Routledge.
boyd, danah. 2018. “You Think You Want Media Literacy… Do You?” Data & Society: Points.
March 9, 2018. https://points.datasociety.net/you-think-you-want-media-literacy-do-you-7cad6af18ec2.
Caswell, Michelle. 2016. “‘The Archive’ Is Not an Archives: On Acknowledging the
Intellectual Contributions of Archival Studies.” Reconstruction: Studies in
Contemporary Culture 16:1 (2016) (special issue “Archives on Fire”),
http://reconstruction.eserver.org/Issues/161/Caswell.shtml.
Caswell, Michelle, and Marika Cifor. 2016. “From Human Rights to Feminist Ethics: Radical
Empathy in the Archives.” Archivaria 82 (0): 23–43.
Currie, Morgan, and Britt Paris. 2017. “How the ‘Guerrilla Archivists’ Saved History – and
Are Doing It Again under Trump.” The Conversation (blog). February 21, 2017.
https://theconversation.com/how-the-guerrilla-archivists-saved-history-and-are-doing-it-again-under-trump-72346.
“DataRefuge.” n.d. PPEH Lab. Accessed May 21, 2018.
http://www.ppehlab.org/datarefuge/.
“DataRescue Paths.” n.d. PPEH Lab. Accessed May 20, 2018.
http://www.ppehlab.org/datarefugepaths/.
“End of Term Web Archive: U.S. Government Websites.” n.d. Accessed May 20, 2018.
http://eotarchive.cdlib.org/.
“Environmental Data and Governance Initiative.” n.d. EDGI. Accessed May 19, 2018.
https://envirodatagov.org/.
Laster, Shari. 2016. “After the Election: Libraries, Librarians, and the Government - Free
Government Information (FGI).” Free Government Information (FGI). November 23,
2016. https://freegovinfo.info/node/11451.
Noble, Safiya Umoja. 2018. Algorithms of Oppression: How Search Engines Reinforce
Racism. New York: NYU Press.
Tufekci, Zeynep. 2018. “It’s the (Democracy-Poisoning) Golden Age of Free Speech.”
WIRED. Accessed May 20, 2018.
https://www.wired.com/story/free-speech-issue-tech-turmoil-new-censorship/.
“Welcome - Data Refuge.” n.d. Accessed May 20, 2018. https://www.datarefuge.org/.
Williams, Stacie M, and Jarrett Drake. 2017. “Power to the People: Documenting Police
Violence in Cleveland.” Journal of Critical Library and Information Studies 1 (2).
https://doi.org/10.24242/jclis.v1i2.33.

Guerrilla Open Access


 
