USDC
Complaint: Elsevier v. SciHub and LibGen
2015


Case 1:15-cv-04282-RWS Document 1 Filed 06/03/15 Page 1 of 16

UNITED STATES DISTRICT COURT
SOUTHERN DISTRICT OF NEW YORK

Index No. 15-cv-4282 (RWS)
COMPLAINT

ELSEVIER INC., ELSEVIER B.V., ELSEVIER LTD.
Plaintiffs,

v.

SCI-HUB d/b/a WWW.SCI-HUB.ORG, THE LIBRARY GENESIS PROJECT d/b/a LIBGEN.ORG, ALEXANDRA ELBAKYAN, JOHN DOES 1-99,
Defendants.

Plaintiffs Elsevier Inc., Elsevier B.V., and Elsevier Ltd. (collectively “Elsevier”), by their attorneys DeVore & DeMarco LLP, for their complaint against www.sci-hub.org, www.libgen.org, Alexandra Elbakyan, and John Does 1-99 (collectively the “Defendants”), allege as follows:

NATURE OF THE ACTION

1. This is a civil action seeking damages and injunctive relief for: (1) copyright infringement under the copyright laws of the United States (17 U.S.C. § 101 et seq.); and (2) violations of the Computer Fraud and Abuse Act, 18 U.S.C. § 1030, based upon Defendants’ unlawful access to, use, reproduction, and distribution of Elsevier’s copyrighted works. Defendants’ actions in this regard have caused and continue to cause irreparable injury to Elsevier and its publishing partners (including scholarly societies) for which it publishes certain journals.

PARTIES

2. Plaintiff Elsevier Inc. is a corporation organized under the laws of Delaware, with its principal place of business at 360 Park Avenue South, New York, New York 10010.

3. Plaintiff Elsevier B.V. is a corporation organized under the laws of the Netherlands, with its principal place of business at Radarweg 29, Amsterdam, 1043 NX, Netherlands.

4. Plaintiff Elsevier Ltd. is a corporation organized under the laws of the United Kingdom, with its principal place of business at 125 London Wall, EC2Y 5AS United Kingdom.

5. Upon information and belief, Defendant Sci-Hub is an individual or organization engaged in the operation of the website accessible at the URL “www.sci-hub.org,” and related subdomains, including but not limited to the subdomains “www.sciencedirect.com.sci-hub.org,” “www.elsevier.com.sci-hub.org,” “store.elsevier.com.sci-hub.org,” and various subdomains incorporating the company and product names of other major global publishers (collectively with www.sci-hub.org the “Sci-Hub Website”). The sci-hub.org domain name is registered by “Fundacion Private Whois,” located in Panama City, Panama, to an unknown registrant. As of the date of this filing, the Sci-Hub Website is assigned the IP address 31.184.194.81. This IP address is part of a range of IP addresses assigned to Petersburg Internet Network Ltd., a web-hosting company located in Saint Petersburg, Russia.

6. Upon information and belief, Defendant Library Genesis Project is an organization which operates an online repository of copyrighted materials accessible through the website located at the URL “libgen.org” as well as a number of other “mirror” websites (collectively the “Libgen Domains”). The libgen.org domain is registered by “Whois Privacy Corp.,” located at Ocean Centre, Montagu Foreshore, East Bay Street, Nassau, New Providence, Bahamas, to an unknown registrant. As of the date of this filing, libgen.org is assigned the IP address 93.174.95.71. This IP address is part of a range of IP addresses assigned to Ecatel Ltd., a web-hosting company located in Amsterdam, the Netherlands.

7. The Libgen Domains include “elibgen.org,” “libgen.info,” “lib.estrorecollege.org,” and “bookfi.org.”

8. Upon information and belief, Defendant Alexandra Elbakyan is the principal owner and/or operator of Sci-Hub. Upon information and belief, Elbakyan is a resident of Almaty, Kazakhstan.

9. Elsevier is unaware of the true names and capacities of the individuals named as Does 1-99 in this Complaint (together with Alexandra Elbakyan, the “Individual Defendants”), and their residence and citizenship are also unknown. Elsevier will amend its Complaint to allege the names, capacities, residence and citizenship of the Doe Defendants when their identities are learned.

10. Upon information and belief, the Individual Defendants are the owners and operators of numerous websites, including Sci-Hub and the websites located at the various Libgen Domains, and a number of e-mail addresses and accounts at issue in this case.

11. The Individual Defendants have participated, exercised control over, and benefited from the infringing conduct described herein, which has resulted in substantial harm to the Plaintiffs.

JURISDICTION AND VENUE

12. This is a civil action arising from the Defendants’ violations of the copyright laws of the United States (17 U.S.C. § 101 et seq.) and the Computer Fraud and Abuse Act (“CFAA”), 18 U.S.C. § 1030. Therefore, the Court has subject matter jurisdiction over this action pursuant to 28 U.S.C. § 1331.

13. Upon information and belief, the Individual Defendants own and operate computers and Internet websites and engage in conduct that injures Plaintiff in this district, while also utilizing instrumentalities located in the Southern District of New York to carry out the acts complained of herein.

14. Defendants have affirmatively directed actions at the Southern District of New York by utilizing computer servers located in the District without authorization and by unlawfully obtaining access credentials belonging to individuals and entities located in the District, in order to unlawfully access, copy, and distribute Elsevier’s copyrighted materials which are stored on Elsevier’s ScienceDirect platform.

15. Defendants have committed the acts complained of herein through unauthorized access to Plaintiffs’ copyrighted materials which are stored and maintained on computer servers located in the Southern District of New York.

16. Defendants have undertaken the acts complained of herein with knowledge that such acts would cause harm to Plaintiffs and their customers in both the Southern District of New York and elsewhere. Defendants have caused the Plaintiff injury while deriving revenue from interstate or international commerce by committing the acts complained of herein. Therefore, this Court has personal jurisdiction over Defendants.

17. Venue in this District is proper under 28 U.S.C. § 1391(b) because a substantial part of the events giving rise to Plaintiffs’ claims occurred in this District and because the property that is the subject of Plaintiffs’ claims is situated in this District.

FACTUAL ALLEGATIONS

Elsevier’s Copyrights in Publications on ScienceDirect

18. Elsevier is a world leading provider of professional information solutions in the Science, Medical, and Health sectors. Elsevier publishes, markets, sells, and licenses academic textbooks, journals, and examinations in the fields of science, medicine, and health. The majority of Elsevier’s institutional customers are universities, governmental entities, educational institutions, and hospitals that purchase physical and electronic copies of Elsevier’s products and access to Elsevier’s digital libraries. Elsevier distributes its scientific journal articles and book chapters electronically via its proprietary subscription database “ScienceDirect” (www.sciencedirect.com). In most cases, Elsevier holds the copyright and/or exclusive distribution rights to the works available through ScienceDirect. In addition, Elsevier holds trademark rights in “Elsevier,” “ScienceDirect,” and several other related trade names.

19. The ScienceDirect database is home to almost one-quarter of the world’s peer-reviewed, full-text scientific, technical and medical content. The ScienceDirect service features sophisticated search and retrieval tools for students and professionals which facilitate access to over 10 million copyrighted publications. More than 15 million researchers, health care professionals, teachers, students, and information professionals around the globe rely on ScienceDirect as a trusted source of nearly 2,500 journals and more than 26,000 book titles.

20. Authorized users are provided access to the ScienceDirect platform by way of non-exclusive, non-transferable subscriptions between Elsevier and its institutional customers. According to the terms and conditions of these subscriptions, authorized users of ScienceDirect must be users affiliated with the subscriber (e.g., full-time and part-time students, faculty, staff and researchers of subscriber universities and individuals using computer terminals within the library facilities at the subscriber for personal research, education or other non-corporate use).

21. A substantial portion of American research universities maintain active subscriptions to ScienceDirect. These subscriptions, under license, allow the universities to provide their faculty and students access to the copyrighted works within the ScienceDirect database.

22. Elsevier stores and maintains the copyrighted material available in ScienceDirect on servers owned and operated by a third party whose servers are located in the Southern District of New York and elsewhere. In order to optimize performance, these third-party servers collectively operate as a distributed network which serves cached copies of Elsevier’s copyrighted materials by way of particular servers that are geographically close to the user. For example, a user that accesses ScienceDirect from a University located in the Southern District of New York will likely be served that content from a server physically located in the District.

Authentication of Authorized University ScienceDirect Users

23. Elsevier maintains the integrity and security of the copyrighted works accessible on ScienceDirect by allowing only authenticated users access to the platform. Elsevier authenticates educational users who access ScienceDirect through their affiliated university’s subscription by verifying that they are able to access ScienceDirect from a computer system or network previously identified as belonging to a subscribing university.

24. Elsevier does not track individual educational users’ access to ScienceDirect. Instead, Elsevier verifies only that the user has authenticated access to a subscribing university.

25. Once an educational user authenticates his computer with ScienceDirect on a university network, that computer is permitted access to ScienceDirect for a limited amount of time without re-authenticating. For example, a student could access ScienceDirect from their laptop while sitting in a university library, then continue to access ScienceDirect using that laptop from their dorm room later that day. After a specified period of time has passed, however, a user will have to re-authenticate his or her computer’s access to ScienceDirect by connecting to the platform through a university network.

26. As a matter of practice, educational users access university networks, and thereby authenticate their computers with ScienceDirect, primarily through one of two methods. First, the user may be physically connected to a university network, for example by taking their computer to the university’s library. Second, the user may connect remotely to the university’s network using a proxy connection. Universities offer proxy connections to their students and faculty so that those users may access university computing resources – including access to research databases such as ScienceDirect – from remote locations which are unaffiliated with the university. This practice facilitates the use of ScienceDirect by students and faculty while they are at home, travelling, or otherwise off-campus.

Defendants’ Unauthorized Access to University Proxy Networks to Facilitate Copyright Infringement

27. Upon information and belief, Defendants are reproducing and distributing unauthorized copies of Elsevier’s copyrighted materials, unlawfully obtained from ScienceDirect, through Sci-Hub and through various websites affiliated with the Library Genesis Project. Specifically, Defendants utilize their websites located at sci-hub.org and at the Libgen Domains to operate an international network of piracy and copyright infringement by circumventing legal and authorized means of access to the ScienceDirect database. Defendants’ piracy is supported by the persistent intrusion and unauthorized access to the computer networks of Elsevier and its institutional subscribers, including universities located in the Southern District of New York.

28. Upon information and belief, Defendants have unlawfully obtained and continue to unlawfully obtain student or faculty access credentials which permit proxy connections to universities which subscribe to ScienceDirect, and use these credentials to gain unauthorized access to ScienceDirect.

29. Upon information and belief, Defendants have used and continue to use such access credentials to authenticate access to ScienceDirect and, subsequently, to obtain copyrighted scientific journal articles therefrom without valid authorization.

30. The Sci-Hub website requires user interaction in order to facilitate its illegal copyright infringement scheme. Specifically, before a Sci-Hub user can obtain access to copyrighted scholarly journals, articles, and books that are maintained by ScienceDirect, he must first perform a search on the Sci-Hub page. A Sci-Hub user may search for content using either (a) a general keyword-based search, or (b) a journal, article or book identifier (such as a Digital Object Identifier, PubMed Identifier, or the source URL).

31. When a user performs a keyword search on Sci-Hub, the website returns a proxied version of search results from the Google Scholar search database.[1] When a user selects one of the search results, if the requested content is not available from the Library Genesis Project, Sci-Hub unlawfully retrieves the content from ScienceDirect using the access previously obtained. Sci-Hub then provides a copy of that article to the requesting user, typically in PDF format. If, however, the requested content can be found in the Library Genesis Project repository, upon information and belief, Sci-Hub obtains the content from the Library Genesis Project repository and provides that content to the user.

[1] Google Scholar provides its users the capability to search for scholarly literature, but does not provide the full text of copyrighted scientific journal articles accessible through paid subscription services such as ScienceDirect. Instead, Google Scholar provides bibliographic information concerning such articles along with a link to the platform through which the article may be purchased or accessed by a subscriber.

32. When a user searches on Sci-Hub for an article available on ScienceDirect using a journal or article identifier, the user is redirected to a proxied version of the ScienceDirect page where the user can download the requested article at no cost. Upon information and belief, Sci-Hub facilitates this infringing conduct by using unlawfully-obtained access credentials to university proxy servers to establish remote access to ScienceDirect through those proxy servers. If, however, the requested content can be found in the Library Genesis Project repository, upon information and belief, Sci-Hub obtains the content from it and provides it to the user.

33. Upon information and belief, Sci-Hub engages in no other activity other than the illegal reproduction and distribution of digital copies of Elsevier’s copyrighted works and the copyrighted works of other publishers, and the encouragement, inducement, and material contribution to the infringement of the copyrights of those works by third parties – i.e., the users of the Sci-Hub website.

34. Upon information and belief, in addition to the blatant and rampant infringement of Elsevier’s copyrights as described above, the Defendants have also used the Sci-Hub website to earn revenue from the piracy of copyrighted materials from ScienceDirect. Sci-Hub has at various times accepted funds through a variety of payment processors, including PayPal, Yandex, WebMoney, QiQi, and Bitcoin.

Sci-Hub’s Use of the Library Genesis Project as a Repository for Unlawfully-Obtained Scientific Journal Articles and Books

35. Upon information and belief, when Sci-Hub pirates and downloads an article from ScienceDirect in response to a user request, in addition to providing a copy of that article to that user, Sci-Hub also provides a duplicate copy to the Library Genesis Project, which stores the article in a database accessible through the Internet. Upon information and belief, the Library Genesis Project is designed to be a permanent repository of this and other illegally obtained content.

36. Upon information and belief, in the event that a Sci-Hub user requests an article which has already been provided to the Library Genesis Project, Sci-Hub may provide that user access to a copy provided by the Library Genesis Project rather than re-download an additional copy of the article from ScienceDirect. As a result, Defendants Sci-Hub and Library Genesis Project act in concert to engage in a scheme designed to facilitate the unauthorized access to and wholesale distribution of Elsevier’s copyrighted works legitimately available on the ScienceDirect platform.

The Library Genesis Project’s Unlawful Distribution of Plaintiff’s Copyrighted Works

37. Access to the Library Genesis Project’s repository is facilitated by the website “libgen.org,” which provides its users the ability to search, download content from, and upload content to, the repository. The main page of libgen.org allows its users to perform searches in various categories, including “LibGen (Sci-Tech),” and “Scientific articles.” In addition to searching by keyword, users may also search for specific content by various other fields, including title, author, periodical, publisher, or ISBN or DOI number.

38. The libgen.org website indicates that the Library Genesis Project repository contains approximately 1 million “Sci-Tech” documents and 40 million scientific articles. Upon information and belief, the large majority of these works is subject to copyright protection and is being distributed through the Library Genesis Project without the permission of the applicable rights-holder. Upon information and belief, the Library Genesis Project serves primarily, if not exclusively, as a scheme to violate the intellectual property rights of the owners of millions of copyrighted works.

39. Upon information and belief, Elsevier owns the copyrights in a substantial number of copyrighted materials made available for distribution through the Library Genesis Project. Elsevier has not authorized the Library Genesis Project or any of the Defendants to copy, display, or distribute through any of the complained of websites any of the content stored on ScienceDirect to which it holds the copyright. Among the works infringed by the Library Genesis Project are the “Guyton and Hall Textbook of Medical Physiology,” and the article “The Varus Ankle and Instability” (published in Elsevier’s journal “Foot and Ankle Clinics of North America”), each of which is protected by Elsevier’s federally-registered copyrights.

40. In addition to the Library Genesis Project website accessible at libgen.org, users may access the Library Genesis Project repository through a number of “mirror” sites accessible through other URLs. These mirror sites are similar, if not identical, in functionality to libgen.org. Specifically, the mirror sites allow their users to search and download materials from the Library Genesis Project repository.

FIRST CLAIM FOR RELIEF
(Direct Infringement of Copyright)

41. Elsevier incorporates by reference the allegations contained in paragraphs 1-40 above.

42. Elsevier’s copyright rights and exclusive distribution rights to the works available on ScienceDirect (the “Works”) are valid and enforceable.

43. Defendants have infringed on Elsevier’s copyright rights to these Works by knowingly and intentionally reproducing and distributing these Works without authorization.

44. The acts of infringement described herein have been willful, intentional, and purposeful, in disregard of and indifferent to Plaintiffs’ rights.

45. Without authorization from Elsevier, or right under law, Defendants are directly liable for infringing Elsevier’s copyrighted Works pursuant to 17 U.S.C. §§ 106(1) and/or (3).

46. As a direct result of Defendants’ actions, Elsevier has suffered and continues to suffer irreparable harm for which Elsevier has no adequate remedy at law, and which will continue unless Defendants’ actions are enjoined.

47. Elsevier seeks injunctive relief and costs and damages in an amount to be proven at trial.

SECOND CLAIM FOR RELIEF
(Secondary Infringement of Copyright)

48. Elsevier incorporates by reference the allegations contained in paragraphs 1-40 above.

49. Elsevier’s copyright rights and exclusive distribution rights to the works available on ScienceDirect (the “Works”) are valid and enforceable.

50. Defendants have infringed on Elsevier’s copyright rights to these Works by knowingly and intentionally reproducing and distributing these Works without license or other authorization.

51. Upon information and belief, Defendants intentionally induced, encouraged, and materially contributed to the reproduction and distribution of these Works by third party users of websites operated by Defendants.

52. The acts of infringement described herein have been willful, intentional, and purposeful, in disregard of and indifferent to Elsevier’s rights.

53. Without authorization from Elsevier, or right under law, Defendants are directly liable for third parties’ infringement of Elsevier’s copyrighted Works pursuant to 17 U.S.C. §§ 106(1) and/or (3).

54. Upon information and belief, Defendants profited from third parties’ direct infringement of Elsevier’s Works.

55. Defendants had the right and the ability to supervise and control their websites and the third party infringing activities described herein.

56. As a direct result of Defendants’ actions, Elsevier has suffered and continues to suffer irreparable harm for which Elsevier has no adequate remedy at law, and which will continue unless Defendants’ actions are enjoined.

57. Elsevier seeks injunctive relief and costs and damages in an amount to be proven at trial.

THIRD CLAIM FOR RELIEF
(Violation of the Computer Fraud & Abuse Act)

58. Elsevier incorporates by reference the allegations contained in paragraphs 1-40 above.

59. Elsevier’s computers and servers, the third-party computers and servers which store and maintain Elsevier’s copyrighted works for ScienceDirect, and Elsevier’s customers’ computers and servers which facilitate access to Elsevier’s copyrighted works on ScienceDirect, are all “protected computers” under the Computer Fraud and Abuse Act (“CFAA”).

60. Defendants (a) knowingly and intentionally accessed such protected computers without authorization and thereby obtained information from the protected computers in a transaction involving an interstate or foreign communication (18 U.S.C. § 1030(a)(2)(C)); and (b) knowingly and with an intent to defraud accessed such protected computers without authorization and obtained information from such computers, which Defendants used to further the fraud and obtain something of value (18 U.S.C. § 1030(a)(4)).

61. Defendants’ conduct has caused, and continues to cause, significant and irreparable damages and loss to Elsevier.

62. Defendants’ conduct has caused a loss to Elsevier during a one-year period aggregating at least $5,000.

63. As a direct result of Defendants’ actions, Elsevier has suffered and continues to suffer irreparable harm for which Elsevier has no adequate remedy at law, and which will continue unless Defendants’ actions are enjoined.

64. Elsevier seeks injunctive relief, as well as costs and damages in an amount to be proven at trial.

PRAYER FOR RELIEF

WHEREFORE, Elsevier respectfully requests that the Court:
A. Enter preliminary and permanent injunctions, enjoining and prohibiting Defendants,
their officers, directors, principals, agents, servants, employees, successors and
assigns, and all persons and entities in active concert or participation with them, from
engaging in any of the activity complained of herein or from causing any of the injury
complained of herein and from assisting, aiding, or abetting any other person or
business entity in engaging in or performing any of the activity complained of herein
or from causing any of the injury complained of herein;
B. Enter an order that, upon Elsevier’s request, those in privity with Defendants and
those with notice of the injunction, including any Internet search engines, Web
Hosting and Internet Service Providers, domain-name registrars, and domain name
registries or their administrators that are provided with notice of the injunction, cease
facilitating access to any or all domain names and websites through which Defendants
engage in any of the activity complained of herein;
C. Enter an order that, upon Elsevier’s request, those organizations which have
registered Defendants’ domain names on behalf of Defendants shall disclose
immediately to Plaintiffs all information in their possession concerning the identity of
the operator or registrant of such domain names and of any bank accounts or financial
accounts owned or used by such operator or registrant;
D. Enter an order that, upon Elsevier’s request, the TLD Registries for the Defendants’
websites, or their administrators, shall place the domain names on
registryHold/serverHold as well as serverUpdate, serverDelete, and serverTransfer
prohibited statuses, for the remainder of the registration period for any such website;
E. Enter an order canceling or deleting, or, at Elsevier’s election, transferring the domain
name registrations used by Defendants to engage in the activity complained of herein
to Elsevier’s control so that they may no longer be used for illegal purposes;
F. Enter an order awarding Elsevier its actual damages incurred as a result of
Defendants’ infringement of Elsevier’s copyright rights in the Works and all profits
Defendant realized as a result of its acts of infringement, in amounts to be determined
at trial; or in the alternative, awarding Elsevier, pursuant to 17 U.S.C. § 504, statutory
damages for the acts of infringement committed by Defendants, enhanced to reflect
the willful nature of the Defendants’ infringement;
G. Enter an order disgorging Defendants’ profits;

Fuller & Dockray
In the Paradise of Too Many Books An Interview with Sean Dockray
2011


# In the Paradise of Too Many Books: An Interview with Sean Dockray

By Matthew Fuller, 4 May 2011

If the appetite to read comes with reading, then open text archive Aaaaarg.org
is a great place to stimulate and sate your hunger. Here, Matthew Fuller talks
to long-term observer Sean Dockray about the behaviour of text and
bibliophiles in a text-circulation network.

Sean Dockray is an artist and a member of the organising group for the LA
branch of The Public School, a geographically distributed and online platform
for the self-organisation of learning.1 Since its initiation by Telic Arts, an
organisation which Sean directs, The Public School has also been taken up as a
model in a number of cities in the USA and Europe.2

We met to discuss the growing phenomenon of text-sharing. Aaaaarg.org has
developed over the last few years as a crucial site for the sharing and
discussion of texts drawn from cultural theory, politics, philosophy, art and
related areas. Part of this discussion is about the circulation of texts,
scanned and uploaded to other sites that it provides links to. Since
participants in The Public School often draw from the uploads to form readers
or anthologies for specific classes or events series, this project provides a
useful perspective from which to talk about the nature of text in the present
era.

**Sean Dockray:** People usually talk about three key actors in
discussions about publishing, which all play fairly understandable roles:
readers; publishers; and authors.

**Matthew Fuller:** Perhaps it could be said that Aaaaarg.org suggests some
other actors that are necessary for a real culture of text; firstly that books
also have some specific kind of activity to themselves, even if in many cases
it is only a latent quality, of storage, of lying in wait and, secondly, that
within the site, there is also this other kind of work done, that of the
public reception and digestion, the response to the texts, their milieu, which
involves other texts, but also systems and organisations, and platforms, such
as Aaaaarg.

![](/sites/www.metamute.org/files/u73/Roland_Barthes_web.jpg)

Image: A young Roland Barthes, with space on his bookshelf

**SD:** Where even the three actors aren't stable! The people that are using
the site are fulfilling some role that usually the publisher has been doing or
ought to be doing, like marketing or circulation.

**MF:** Well it needn't be seen as promotion necessarily. There's also this
kind of secondary work with critics, reviewers and so on - which we can say is
also taken on by universities, for instance, and reading groups, magazines,
reviews - that gives an additional life to the text or brings it particular
kinds of attention, certain kind of readerliness.

**SD:** Situates it within certain discourses, makes it intelligible in a way,
in a different way.

**MF:** Yes, exactly, there's this other category of life to the book, which
is that of the kind of milieu or the organisational structure in which it
circulates and the different kind of networks of reference that it implies and
generates. Then there's also the book itself, which has some kind of agency,
or at least resilience and salience, when you think about how certain books
have different life cycles of appearance and disappearance.

**SD:** Well, in a contemporary sense, you have something like _Nights of
Labour_, by Rancière - which is probably going to be republished or
reprinted imminently - but has been sort of invisible, out of print, until, by
surprise, it becomes much more visible within the art world or something.

**MF:** And it's also been interesting to see how the art world plays a role
in the reverberations of text which isn't the same as that in cultural theory
or philosophy. Certainly _Nights of Labour_, something that is very close to
the role that cultural studies plays in the UK, but which (cultural studies)
has no real equivalent in France, so then, geographically and linguistically,
and therefore also in a certain sense conceptually, the life of a book
exhibits these weird delays and lags and accelerations, so that's a good
example. I'm interested in what role Aaaaarg plays in that kind of
proliferation, the kind of things that books do, where they go and how they
become manifest. So I think one of the things Aaaaarg does is to make books
active in different ways, to bring out a different kind of potential in
publishing.

**SD:** Yes, the debate has tended so far to get stuck in those three actors
because people tend to end up picking a pair and placing them in opposition to
one another, especially around intellectual property. The discussion is very
simplistic and ends up in that way, where it's the authors against readers, or
authors against their publishers, with the publishers often introducing
scarcity, where the authors don't want it to be - that's a common argument.
There's this situation where the record industry is suing its own audience.
That's typically the field now.

**MF:** So within that kind of discourse of these three figures, have there
been cases where you think it's valid that there needs to be some form of
scarcity in order for a publishing project to exist?

**SD:** It's obviously not for me to say that there does or doesn't need to be
scarcity but the scarcity that I think we're talking about functions in a
really specific way: it's usually within academic publishing, the book or
journal is being distributed to a few libraries and maybe 500 copies of it are
being printed, and then the price is something anywhere from $60 to $500, and
there's just sort of an assumption that the audience is very well defined and
stable and able to cope with that.

**MF:** Yeah, which recognises that the audiences may be stable as an
institutional form, but not that over time the individual parts of say that
library user population change in their relationship to the institution. If
you're a student for a few years and then you no longer have access, you lose
contact with that intellectual community...

**SD:** Then people just kind of have to cling to that intellectual community.
So when scarcity functions like that, I can't think of any reason why that
_needs_ to happen. Obviously it needs to happen in the sense that there's a
relatively stable balance that wants to perpetuate itself, but what you're
asking is something else.

**MF:** Well there are contexts where the publisher isn't within that academic
system of very high costs, sustained by volunteer labour by academics, the
classic peer review system, but if you think of more of a trade publisher like
a left or a movement or underground publisher, whose books are being
circulated on Aaaaarg...

**SD:** They're in a much more precarious position obviously than a university
press whose economics are quite different, and with the volunteer labour or
the authors are being subsidised by salary - you have to look at the entire
system rather than just the publication. But in a situation where the
publisher is much more precarious and relying on sales and a swing in one
direction or another makes them unable to pay the rent on a storage facility,
one can definitely see why some sort of predictability is helpful and
necessary.

**MF:** So that leads me to wonder whether there are models of publishing that
are emerging that work with online distribution, or with the kind of thing
that Aaaaarg does specifically. Are there particular kinds of publishing
initiatives that really work well in this kind of context where free digital
circulation is understood as an a priori, or is it always in this kind of
parasitic or cyclical relationship?

**SD:** I have no idea how well they work actually; I don't know how well,
say, Australian publisher re.press, works for example. 3 I like a lot of what
they publish, it's given visibility when re.press distributes it and that's a
lot of what a publisher's role seems to be (and what Aaaaarg does as well).
But are you asking how well it works in terms of economics?

**MF:** Well, just whether there's new forms of publishing emerging that work
well in this context that cut out some of the problems?

**SD:** Well, there's also the blog. Certain academic discourses, philosophy
being one, that are carried out on blogs really work to a certain extent, in
that there is an immediacy to ideas, their reception and response. But there's
other problems, such as the way in which, over time, the posts quickly get
forgotten. In this sense, a publication, a book, is kind of nice. It
crystallises and stays around.

**MF:** That's what I'm thinking, that the book is a particular kind of thing
which has its own quality as a form of media. I also wonder whether there
might be intermediate texts, unfinished texts, draft texts that might
circulate via Aaaaarg for instance or other systems. That, at least to me,
would be kind of unsatisfactory but might have some other kind of life and
readership to it. You know, as you say, the blog is a collection of relatively
occasional texts, or texts that are a work in progress, but something like
Aaaaarg perhaps depends upon texts that are finished, that are absolutely the
crystallisation of a particular thought.

![](/sites/www.metamute.org/files/u73/tree_of_knowledge_web.jpg)

Image: The Tree of Knowledge as imagined by Hans Sebald Beham in his 1543
engraving _Adam and Eve_

**SD:** Aaaaarg is definitely not a futuristic model. I mean, it occurs at a
specific time, which is while we're living in a situation where books exist
effectively as a limited edition. They can travel the world and reach certain
places, and yet the readership is greatly outpacing the spread and
availability of the books themselves. So there's a disjunction there, and
that's obviously why Aaaaarg is so popular. Because often there are maybe no
copies of a certain book within 400 miles of a person that's looking for it,
but then they can find it on that website, so while we're in that situation it
works.

**MF:** So it's partly based on a kind of asymmetry, that's spatial, that's
about the territories of publishers and distributors, and also a kind of
asymmetry of economics?

**SD:** Yeah, yeah. But others too. I remember when I was affiliated with a
university and I had JSTOR access and all these things and then I left my job
and then at some point not too long after that my proxy access expired and I
no longer had access to those articles which now would cost $30 a pop just to
even preview. That's obviously another asymmetry, even though, geographically
speaking, I'm in an identical position, just that my subject position has
shifted from affiliated to unaffiliated.

**MF:** There's also this interesting way in which Aaaaarg has gained
different constituencies globally, you can see the kind of shift in the texts
being put up. It seems to me anyway there are more texts coming from
non-western authors. This kind of asymmetry generates a flux. We're getting new
alliances between texts and you can see new bibliographies emerge.

**SD:** Yeah, the original community was very American and European and
gradually people were signing up at other places in order to have access to a
lot of these texts that didn't reach their libraries or their book stores or
whatever. But then there is a danger of US and European thought becoming
central. A globalisation where a certain mode of thought ends up just erasing
what's going on already in the cities where people are signing up, that's a
horrible possible future.

**MF:** But that's already something that's _not_ happening in some ways?

**SD:** Exactly, that's what seems to be happening now. It goes on to
translations that are being put up and then texts that are coming from outside
of the set of US and western authors and so, in a way, it flows back in the
other direction. This hasn't always been so visible, maybe it will begin to
happen some more. But think of the way people can list different texts
together as ‘issues’ - a way that you can make arbitrary groupings - and
they're very subjective, you can make an issue named anything and just lump a
bunch of texts in there. But because, with each text, you can see what other
issues people have also put it in, it creates a trace of its use. You can see
that sometimes the issues are named after the reading groups, people are using
the issues format as a collecting tool, they might gather all Portuguese
translations, or The Public School uses them for classes. At other times it's
just one person organising their dissertation research but you see the wildly
different ways that one individual text can be used.

**MF:** So the issue creates a new form of paratext to the text, acting as a
kind of meta-index, they're a new form of publication themselves. To publish a
bibliography that actively links to the text itself is pretty cool. That also
makes me think within the structures of Aaaaarg it seems that certain parts of
the library are almost at breaking point - for instance the alphabetical
structure.

**SD:** Which is funny because it hasn't always been that alphabetical
structure either, it used to just be everything on one page, and then at some
point it was just taking too long for the page to load up A-Z. And today A is
as long as the entire index used to be, so yeah these questions of density and
scale are there but they've always been dealt with in a very ad hoc kind of
way, dealing with problems as they come. I'm sure that will happen. There
hasn't always been a search and, in a way, the issues, along with
alphabetising, became ways of creating more manageable lists, but even now the
list of issues is gigantic. These are problems of scale.

**MF:** So I guess there's also this kind of question that emerges in the
debate on reading habits and reading practices, this question of the breadth
of reading that people are engaging in. Do you see anything emerging in
Aaaaarg that suggests a new consistency of handling reading material? Is there
a specific quality, say, of the issues? For instance, some of them seem quite
focused, and others are very broad. They may provide insights into how new
forms of relationships to intellectual material may be emerging that we don't
quite yet know how to handle or recognise. This may be related to the lament
for the classic disciplinary road of deep reading of specific materials with a
relatively focused footprint whereas, it is argued, the net is encouraging a
much wider kind of sampling of materials with not necessarily so much depth.

**SD:** It's partially driven by people simply being in the system, in the
same way that the library structures our relationship to text, the net does it
in another way. One comment I've heard is that there's too much stuff on
Aaaaarg, which wasn't always the case. It used to be that I read every single
thing that was posted because it was slow enough and the things were short
enough that my response was, ‘Oh something new, great!’ and I would read it.
But now, obviously that is totally impossible, there's too much; but in a way
that's just the state of things. It does seem like certain tactics of making
sense of things, of keeping things away and letting things in and queuing
things for reading later become just a necessary part of even navigating. It's
just the terrain at the moment, but this is only one instance. Even when I was
at the university and going to libraries, I ended up with huge stacks of books
and I'd just buy books that I was never going to read just to have them
available in my library, so I don't think feeling overwhelmed by books is
particularly new, just maybe the scale of it is. In terms of how people
actually conduct themselves and deal with that reality, it's difficult to say.
I think the issues are one of the few places where you would see any sort of
visible answers on Aaaaarg, otherwise it's totally anecdotal. At The Public
School we have organised classes in relationship to some of the issues, and
then we use the classes to also figure out what texts we are going to be
reading in the future, to make new issues and new classes. So it becomes an
organising group, reading and working its way through subject matter and
material, then revisiting that library and seeing what needs to be there.

**MF:** I want to follow that kind of strand of habits of accumulation,
sorting, deferring and so on. I wonder, what is a kind of characteristic or
unusual reading behavior? For instance are there people who download the
entire list? Or do you see people being relatively selective? How does the
mania of the net, with this constant churning of data, map over to forms of
bibliomania?

**SD:** Well, in Aaaaarg it's again very specific. Anecdotally again, I have
heard from people how much they download and sometimes they're very selective,
they just see something that's interesting and download it, other times they
download everything and occasionally I hear about this mania of mirroring the
whole site. What I mean about being specific to Aaaaarg is that a lot of the
mania isn't driven by just the need to have everything; it's driven by the
acknowledgement that the source is going to disappear at some point. That
sense of impending disappearance is always there, so I think that drives a lot
of people to download everything because, you know, it's happened a couple
times where it's just gone down or moved or something like that.

**MF:** It's true, it feels like something that is there even for a few weeks
or a few months. By a sheer fluke it could last another year, who knows.

**SD:** It's a different kind of mania, and usually we get lost in this
thinking that people need to possess everything but there is this weird
preservation instinct that people have, which is slightly different. The
dominant sensibility of Aaaaarg at the beginning was the highly partial and
subjective nature to the contents and that is something I would want to
preserve, which is why I never thought it to be particularly exciting to have
lots of high quality metadata - it doesn't have the publication date, it
doesn't have all the great metadata that say Amazon might provide. The system
is pretty dismal in that way, but I don't mind that so much. I read something
on the Internet which said it was like being in the porn section of a video
store with all black text on white labels, it was an absolutely beautiful way
of describing it. Originally Aaaaarg was about trading just those particular
moments in a text that really struck you as important, that you wanted other
people to read so it would be very short, definitely partial, it wasn't a
completist project, although some people maybe treat it in that way now. They
treat it as a thing that wants to devour everything. That's definitely not the
way that I have seen it.

**MF:** And it's so idiosyncratic I mean, you know it's certainly possible
that it could be read in a canonical mode, you can see that there's that
tendency there, of the core of Adorno or Agamben, to take the a's for
instance. But of the more contemporary stuff it's very varied, that's what's
nice about it as well. Alongside all the stuff that has a very long-term
existence, like historical books that may be over a hundred years old, what
turns up there is often unexpected, but certainly not random or
uninterpretable.

![](/sites/www.metamute.org/files/u1/malraux_web3_0.jpg)

Image: French art historian André Malraux lays out his _Musée Imaginaire_,
1947

**SD:** It's interesting to think a little bit about what people choose to
upload, because it's not easy to upload something. It takes a good deal of
time to scan a book. I mean obviously some things are uploaded which are, have
always been, digital. (I wrote something about this recently about the scan
and the export - the scan being something that comes out of a labour in
relationship to an object, to the book, and the export is something where the
whole life of the text has sort of been digital from production to circulation
and reception). I happen to think of Aaaaarg in the realm of the scan and the
bootleg. When someone actually scans something they're potentially spending
hours because they're doing the work on the book, they're doing something with
software, they're uploading.

**MF:** Aaaaarg hasn't introduced file quality thresholds either.

**SD:** No, definitely not. Where would that go?

**MF:** You could say with PDFs they have to be searchable texts?

**SD:** I'm sure a lot of people would prefer that. Even I would prefer it a
lot of the time. But again there is the idiosyncratic nature of what appears,
and there is also the idiosyncratic nature of the technical quality and
sometimes it's clear that the person that uploads something just has no real
experience of scanning anything. It's kind of an inevitable outcome. There are
movie sharing sites that are really good about quality control both in the
metadata and what gets up; but I think that if you follow that to the end,
then basically you arrive at the exported version being the Platonic text, the
impossible, perfect, clear, searchable, small - totally eliminating any trace
of what is interesting, the hand of reading and scanning, and this is what you
see with a lot of the texts on Aaaaarg. You see the hand of the person who's
read that book in the past, you see the hand of the person who scanned it.
Literally, their hand is in the scan. This attention to the labour of both
reading and redistributing, it's important to still have that.

**MF:** You could also find that in different ways for instance with a pdf, a
pdf that was bought directly as an ebook that's digitally watermarked will
have traces of the purchaser coded in there. So then there's also this work of
stripping out that data which will become a new kind of labour. So it doesn't
have this kind of humanistic refrain, the actual hand, the touch of the
labour. This is perhaps more interesting, the work of the code that strips it
out, so it's also kind of recognising that code as part of the milieu.

**SD:** Yeah, that is a good point, although I don't know that it's more
interesting labour.

**MF:** On a related note, The Public School as a model is interesting in that
it's kind of a convention, it has a set of rules, an infrastructure, a
website, it has a very modular being. Participants operate with a simple
organisational grammar which allows them to say ‘I want to learn this’ or ‘I
want to teach this’ and to draw in others on that basis. There's lots of
proposals for classes, some of them don't get taken up, but it's a process and
a set of resources which allow this aggregation of interest to occur. I just
wonder how you saw that kind of ethos of modularity in a way, as a set of
minimum rules or set of minimum capacities that allow a particular set of
things to occur?

**SD:** This may not respond directly to what you were just talking about, but
there's various points of entry to the school and also having something that
people feel they can take on as their own and I think the minimal structure
invites quite a lot of projection as to what that means and what's possible
with it. If it's not doing what you want it to do or you think, ‘I'm not sure
what it is’, there's the sense that you can somehow redirect it.

**MF:** It's also interesting that projection itself can become a technical
feature so in a way the work of the imagination is done also through this kind
of tuning of the software structure. The governance that was handled by the
technical infrastructure actually elicits this kind of projection, elicits the
imagination in an interesting way.

**SD:** Yeah, yeah, I totally agree and, not to put too much emphasis on the
software, although I think that there's good reason to look at both the
software and the conceptual diagram of the school itself, but really in a way
it would grind to a halt if it weren't for the very traditional labour of
people - like an organising committee. In LA there's usually around eight of
us (now Jordan Biren, Solomon Bothwell, Vladada Gallegos, Liz Glynn, Naoko
Miyano, Caleb Waldorf, and me) who are deeply involved in making that
translation of these wishes - thrown onto the website that somehow attract the
other people - into actual classes.

**MF:** What does the committee do?

**SD:** Even that's hard to describe and that's what makes it hard to set up.
It's always very particular to even a single idea, to a single class proposal.
In general it'd be things like scheduling, finding an instructor if an
instructor is what's required for that class. Sometimes it's more about
finding someone who will facilitate, other times it's rounding up materials.
But it could be helping an open proposal take some specific form. Sometimes
it's scanning things and putting them on Aaaaarg. Sometimes, there will be a
proposal - I proposed a class in the very, very beginning on messianic time, I
wanted to take a class on it - and it didn't happen until more than a year and
a half later.

**MF:** Well that's messianic time for you.

**SD:** That and the internet. But other times it will be only a week later.
You know we did one on the Egyptian revolution and its historical context,
something which demanded a very quick turnaround. Sometimes the committee is
going to classes and there will be a new conflict that arises within a class,
that they then redirect into the website for a future proposal, which becomes
another class: a point of friction where it's not just like next, and next,
and next, but rather it's a knot that people can't quite untie, something that
you want to spend more time with, but you may want to move on to other things
immediately, so instead you postpone that to the next class. A lot of The
Public School works like that: it's finding momentum then following it. A lot
of our classes are quite short, but we try and string them together. The
committee are the ones that orchestrate that. In terms of governance, it is
run collectively, although with the committee, every few months people drop
off and new people come on. There are some people who've been on for years.
Other people who stay on just for that point of time that feels right for
them. Usually, people come on to the committee because they come to a lot of
classes, they start to take an interest in the project and before they know it
they're administering it.

**Matthew Fuller's <[m.fuller@gold.ac.uk](mailto:m.fuller@gold.ac.uk)> most
recent book, _Elephant and Castle_, is forthcoming from Autonomedia.**

**He is collated at**

**Footnotes**

1

2 [http://telic.info/ ](http://telic.info/)

3


Bodo
Libraries in the Post-Scarcity Era
2015


Libraries in the Post-Scarcity Era
Balazs Bodo

Abstract
In the digital era where, thanks to the ubiquity of electronic copies, the book is no longer a scarce
resource, libraries find themselves in an extremely competitive environment. Several different actors are
now in a position to provide low cost access to knowledge. Among these competitors are shadow libraries -
piratical text collections which have now amassed electronic copies of millions of copyrighted works
and provide access to them usually free of charge to anyone around the globe. While such shadow
libraries are far from being universal, they are able to offer certain services better, to more people and
under more favorable terms than most public or research libraries. This contribution offers insights into
the development and the inner workings of one of the biggest scientific shadow libraries on the internet in
order to understand what kind of library people create for themselves if they have the means and if they
don’t have to abide by the legal, bureaucratic and economic constraints that libraries usually face. I argue
that one of the many possible futures of the library is hidden in the shadows, and those who think of the
future of libraries can learn a lot from book pirates of the 21st century about how users and readers expect
texts in electronic form to be stored, organized and circulated.
“The library is society’s last non-commercial meeting place which the majority of the population uses.”
(Committee on the Public Libraries in the Knowledge Society, 2010)
“With books ready to be shared, meticulously cataloged, everyone is a librarian. When everyone is
librarian, library is everywhere.” – Marcell Mars, www.memoryoftheworld.org
I have spent the last few months in various libraries visiting - a library. I spent countless hours in the
modest or grandiose buildings of the Harvard Libraries, the Boston and Cambridge Public Library
systems, various branches of the Openbare Bibliotheek in Amsterdam, the libraries of the University of
Amsterdam, with a computer in front of me, on which another library was running, a library which is
perfectly virtual, which has no monumental buildings, no multi-million euro budget, no miles of stacks,
no hundreds of staff, but which has, despite lacking all that apparently makes a library, millions of
literary works and millions of scientific books, all digitized, all available at the click of the mouse for
everyone on the earth without any charge, library or university membership. As I was sitting in these

Bodó B. (2015): Libraries in the post-scarcity era.
in: Porsdam (ed): Copyrighting Creativity: Creative values, Cultural Heritage Institutions and Systems of Intellectual Property, Ashgate

physical spaces where the past seemed to define the present, I was wondering where I should look to find
the library of the future: down to my screen or up around me.
The library on my screen was Aleph, one of the biggest of the countless piratical text collections on the
internet. It has more than a million scientific works and another million literary works to offer, all free to
download, without any charge or fee, for anyone on the net. I’ve spent months among its virtual stacks,
combing through the catalogue, talking to the librarians who maintain the collection, and watching the
library patrons as they used the collection. I kept going back to Aleph both as a user and as a researcher.
As a user, Aleph offered me books that the local libraries around me didn’t, in formats that were more
convenient than print. As a researcher, I was interested in the origins of Aleph, its modus operandi, its
future, and I was curious where the journey on which it has taken book readers, authors, publishers
and libraries would end.
In this short essay I will introduce some of the findings of a two-year research project conducted on
Aleph. In the project I looked at several things. I reconstructed the pirate library’s genesis in order to
understand the forces that called it to life and shaped its development. I looked at its catalogue to
understand what it has to offer and how that piratical supply of books is related to the legal supply of
books through libraries and online distributors. I also acquired data on its usage, so I was able to
reconstruct some aspects of piratical demand. After a short introduction, in the first part of this essay I
will outline some of the main findings, and in the second part I will situate the findings in the wider context
of the future of libraries.

Book pirates and shadow librarians
Book piracy has a fascinating history, tightly woven into the history of the printing press (Judge, 1934),
into the history of censorship (Wittmann, 2004), into the history of copyright (Bently, Davis, & Ginsburg,
2010; Bodó, 2011a) and into the history of European civilization (Johns, 2010). Book piracy, in the 21st or
in the mid-17th century, is an activity that has deep cultural significance, because ultimately it is a story
about how knowledge is circulated beyond and often against the structures of political and economic
power (Bodó, 2011b), and thus it is a story about the changes this unofficial circulation of knowledge
brings.
There are many different types of book pirates. Some just aim for easy money, others pursue highly
ideological goals, but they are invariably powerful harbingers of change. The emergence of black markets
whether they be of culture, of drugs or of arms is always a symptom, a warning sign of a friction between
supply and demand. Increased activity in the grey and black zones of legality marks the emergence of a
demand which legal suppliers are unwilling or unable to serve (Bodó, 2011a). That friction, more often
than not, leads to change. Earlier waves of book piracy foretold fundamental economic, political, societal
or technological shifts (Bodó, 2011b): changes in how the book publishing trade was organized (Judge,
1934; Pollard, 1916, 1920); the emergence of the new, bourgeois reading class (Patterson, 1968; Solly,
1885); the decline of pre-publication censorship (Rose, 1993); the advent of the Reformation and of the
Enlightenment (Darnton, 1982, 2003), or the rapid modernization of more than one nation (Khan &
Sokoloff, 2001; Khan, 2004; Yu, 2000).
The latest wave of piracy has coincided with the digital revolution which, in itself, profoundly upset the
economics of cultural production and distribution (Landes & Posner, 2003). However, technology is not
the primary cause of the emergence of cultural black markets like Aleph. The proliferation of computers
and the internet has just revealed a more fundamental issue, which has to do with the uneven
distribution of access to knowledge around the globe.
Sometimes book pirates do more than just forecast and react to changes that are independent of them.
Under certain conditions, they themselves can be powerful agents of change (Bodó, 2011b). Their agency
rests on their ability to challenge the status quo and resist cooptation or subjugation. In that respect, digital
pirates seem to be quite resilient (Giblin, 2011; Patry, 2009). They have the technological upper hand and
so far they have been able to outsmart any copyright enforcement effort (Bodó, forthcoming). As long as
it is not completely possible to eradicate file sharing technologies, and as long as there is a substantial
difference between what is legally available and what is in demand, cultural black markets will be here to
compete with and outcompete the established and recognized cultural intermediaries. Under this constant
existential threat, business models and institutions are forced to adapt, evolve or die.
After the music and audiovisual industries, now the book industry has to address the issue of piracy.
Piratical book distribution services are now in direct competition with the bookstore on the corner, the
used book stall on the sidewalk, they compete with the Amazons of the world and, like it or not, they
compete with libraries. There is, however, a significant difference between the book and the music
industries. The reluctance of music rights holders to listen to the demands of their customers caused little
damage beyond the markets of recorded music. Music rights holders controlled their own fates and those
who wanted to experiment with alternative forms of distribution had the chance to do so. But while the
rapid proliferation of book black markets may signal that the book industry suffers from similar problems
as the music industry suffered a decade ago, the actions of book publishers, the policies they pursue have
impact beyond the market of books and directly affect the domain of libraries.
The fate of libraries is tied to the fate of book markets in more than one way. One connection is structural:
libraries emerged to remedy the scarcity in books. This is true both for the pre-print era and for the
Gutenberg galaxy. In the era of widespread literacy and highly developed book markets, libraries offer
access to books under terms publishers and booksellers cannot or would not. Libraries, to a large extent,
are defined to complement the structure of the book trade. The other connection is legal. The core
activities of the library (namely lending, copying) are governed by the same copyright laws that govern
authors and publishers. Libraries are one of the users in the copyright system, and their existence depends
on the limitations of and exceptions to the exclusive rights of the rights holders. The space that has been
carved out of copyright to enable the existence of libraries has been intensely contested in the era of
postmodern copyright (Samuelson, 2002) and digital technologies. This heavy legal and structural
interdependence with the market means that libraries have only limited control over their own fate in the
digital domain.
Book pirates compete with some of the core services of libraries. And as is usually the case with
innovation that has no economic or legal constraints, pirate libraries offer, at least for the moment,
significantly better services than most of the libraries. Pirate libraries offer far more electronic books,
with far fewer restrictions and constraints, to far more people, far cheaper than anyone else in the library
domain. Libraries are thus directly affected by pirate libraries, and because of their structural
interdependence with book markets, they also have to adjust to how the commercial intermediaries react
to book piracy. Under such conditions libraries cannot simply count on their survival through their legacy.
Book piracy must be taken seriously, not just as a threat, but also as an opportunity to learn how shadow
libraries operate and interact with their users. Pirate libraries are the products of readers (and sometimes
authors), academics and laypeople, all sharing a deep passion for the book, operating in a zone where
there is little to no obstacle to the development of the “ideal” library. As such, pirate libraries can teach
important lessons on what is expected of a library, how book consumption habits evolve, and how
knowledge flows around the globe.

Pirate libraries in the digital age
The collection of texts in digital formats was one of the first activities that computers enabled: the text file
is the native medium of the computer, it is small, thus it is easy to store and copy. It is also very easy to
create, and as so many projects have since proved, there are more than enough volunteers who are willing
to type whole books into the machine. No wonder that electronic libraries and digital text repositories
were among the first “mainstream” applications of computers. Combing through large stacks of matrix-printer
printouts of sci-fi classics downloaded from gopher servers is a shared experience of anyone who
had access to computers and the internet before it was known as the World Wide Web.
Computers thus added fresh momentum to the efforts of realizing the age-old dream of the universal
library (Battles, 2004). Digital technologies offered a breakthrough in many of the issues that previously
posed serious obstacles to text collection: storage, search, preservation, access have all become cheaper
and easier than ever before. On the other hand, a number of key issues remained unresolved: digitization
was a slow and cumbersome process, while the screen proved to be too inconvenient, and the printer too
costly an interface between the text file and the reader. In any case, ultimately it wasn’t these issues that
put a brake on the proliferation of digital libraries. Rather, it was the realization that there are legal limits
to the digitization, storage and distribution of copyrighted works on digital networks. That realization
soon rendered many text collections in the emerging digital library scene inaccessible.
Legal considerations did not destroy this chaotic, emergent digital librarianship and the collections the ad hoc,
accidental and professional librarians put together. The text collections were far too valuable to
simply delete them from the servers. Instead, what happened to most of these collections was that they
retreated from the public view, back into the access-controlled shadows of darknets. Yesterday’s gophers
and anonymous ftp servers turned into closed, membership only ftp servers, local shared libraries residing
on the intranets of various academic, business institutions and private archives stored on local hard drives.
The early digital libraries turned into book piracy sites and into the kernels of today’s shadow libraries.
Libraries and other major actors who decided to start large scale digitization programs soon
found out that if they wanted to avoid costly lawsuits, they had to limit their activities to works in the
public domain. While the public domain is riddled with mind-bogglingly complex and unresolved legal
issues, it is at least still significantly less complicated to deal with than copyrighted and orphan works.
Legally more innovative (or, as some would say, adventurous) companies, such as Google and Microsoft,
who thought they had sufficient resources to sort out the legal issues, soon had to abandon their programs
or put them on hold until the legal issues were sorted out.
There was, however, a large group of disenfranchised readers, library patrons, authors and users who
decided to ignore the legal problems and set out to build the best library that could possibly be built using
the digital technologies. Despite the increased awareness of rights holders to the issue of digital book
piracy, more and more communities around text collections started to defy the legal constraints and to
operate and use more or less public piratical shadow libraries.

Aleph¹
Aleph² is a meta-library, and currently one of the biggest online piratical text collections on the internet.
The project started on a Russian bulletin board devoted to piracy around 2008 as an effort to integrate
various free-floating text collections that circulated online, on optical media, on various public and private
ftp servers and on hard-drives. Its aim was to consolidate these separate text collections, many of which
were created in various Russian academic institutions, into a single, unified catalog, standardize the
technical aspects, add and correct missing or incorrect metadata, and offer the resulting catalogue,
computer code and the collection of files as an open infrastructure.

From Russia with love
It is by no means an accident that Aleph was born in Russia. In post-Soviet Russia the unique constellation
of several different factors created the necessary conditions for the digital librarianship movement that
ultimately led to the development of Aleph. A rich literary legacy, the Soviet heritage, the pace with
which various copying technologies penetrated the market, the shortcomings of the legal environment and
the informal norms that stood in for the non-existent digital copyrights all contributed to the emergence of
the biggest piratical library in the history of mankind.
Russia cherishes a rich literary tradition, which suffered and endured extreme economic hardships and
political censorship during the Soviet period (Ermolaev, 1997; Friedberg, Watanabe, & Nakamoto, 1984;
Stelmakh, 2001). The political transformation in the early 1990’s liberated authors, publishers, librarians
and readers from much of the political oppression, but it did not solve the economic issues that stood in
the way of a healthy literary market. Disposable income was low, state subsidies were limited, the dire
economic situation created uncertainty in the book market. The previous decades, however, had taught
authors and readers how to overcome political and economic obstacles to access to books. During the
Soviet times authors, editors and readers operated clandestine samizdat distribution networks, while
informal book black markets, operating in semi-private spheres, made uncensored but hard to come by
books accessible (Stelmakh, 2001). This survivalist attitude and the skills that came with it proved handy
in the post-Soviet turmoil, and were directly transferable to the then emerging digital technologies.

1 I have conducted extensive research on the origins of Aleph, on its catalogue and its users. The detailed findings, at
the time of writing this contribution, are being prepared for publication. The following section is a brief summary of
those findings and is based upon two forthcoming book chapters on Aleph in a report, edited by Joe Karaganis, on
the role of shadow libraries in the higher education systems of multiple countries.
2 Aleph is a pseudonym chosen to protect the identity of the shadow library in question.

Russia is not the only country with a significant informal media economy of books, but in most other
places it was the photocopy machine that emerged to serve such book grey/black markets. In pre-1990
Russia and in other Eastern European countries access to this technology was limited, and when
photocopiers finally became available, computers were close behind them in terms of accessibility. The
result of the parallel introduction of the photocopier and the computer was that the photocopy technology
did not have time to lock in the informal market of texts. In many countries where the photocopy machine
preceded the computer by decades, copy shops still capture the bulk of the informal production and
distribution of textbooks and other learning material. In the Soviet bloc, PCs instantly offered a less costly
and more adaptive technology to copy and distribute texts.
Russian academic and research institutions were the first to have access to computers. They also had to
somehow deal with the frustrating lack of access to up-to-date and affordable western works to be used in
education and research (Abramitzky & Sin, 2014). This may explain why the first batch of shadow
libraries started in a number of academic/research institutions such as the Department of Mechanics and
Mathematics (MexMat) at Moscow State University. The first digital librarians in Russia were
mathematicians, computer scientists and physicists, working in those institutions.
As PCs and internet access slowly penetrated Russian society, an extremely lively digital librarianship
movement emerged, mostly fuelled by enthusiastic readers, book fans and often authors, who spared no
effort to make their favorite books available on FIDOnet, a popular BBS system in Russia. One of the
central figures in these tumultuous years, when typed-in books appeared online by the thousands, was
Maxim Moshkov, a computer scientist, alumnus of the MexMat, and an avid collector of literary works.
His digital library, lib.ru, was at first mostly a private collection of literary texts, but soon evolved into the
number one text repository, which everyone used to deposit the latest digital copy of a newly digitized
book (Мошков, 1999). Eventually the library grew so big that it had to be broken up. Today it only hosts
the Russian literary classics. User-generated texts, fan fiction and amateur production were spun off into the
aptly named samizdat.lib.ru collection; low-brow popular fiction, astrology and cheap romance found their
way into separate collections, and so did the collection of academic/scientific books, which started an
independent life under the name of Kolkhoz. Kolkhoz, which borrowed its name from the commons-based
agricultural cooperative of the early Soviet era, was both a collection of scientific texts, and a
community of amateur librarians, who curated, managed and expanded the collection.
Moshkov and his library introduced several important norms into the bottom-up, decentralized, often
anarchic digital library movement that swept through the Russian internet in the late 1990s and early 2000s.
First, lib.ru provided the technological blueprint for any future digital library. But more importantly,
Moshkov’s way of handling the texts, his way of responding to the claims, requests, questions, complaints
of authors and publishers paved the way to the development of copynorms (Schultz, 2007) that continue
to define the Russian digital library scene to this day. Moshkov was instrumental in the creation of an
enabling environment for digital librarianship while respecting the claims of authors, during times
when the formal copyright framework and the enforcement environment were both unable and unwilling to
protect works of authorship (Elst, 2005; Sezneva, 2012).

Guerilla Open Access
In the late 2000s, around the time Aleph started to merge the Kolkhoz collection with other free-floating
text collections, two other notable events took place. It was in 2008 that Aaron Swartz penned
his Guerilla Open Access Manifesto (Swartz, 2008), in which he called for the liberation and sharing of
scientific knowledge. Swartz forcefully argued that scientific knowledge, the production of which is
mostly funded by the public and by the voluntary labor of academics, cannot be locked up behind
corporate paywalls set up by publishers. He framed the unauthorized copying and transfer of scientific
works from closed access text repositories to public archives as a moral act, and by doing so, he created
an ideological framework which was more radical and promised to be more effective than either the
Creative Commons (Lessig, 2004) or the open access (Suber, 2013) movements that tried to address the
access to knowledge issues in a more copyright friendly manner. During interviews, the administrators of
Aleph used the very same arguments to justify the raison d'être of their piratical library. While it seems
that Aleph is the practical realization of Swartz’s ideas, it is hard to tell which served as an inspiration for
the other.
It was also around the same time that another piratical library, gigapedia/library.nu, started its
operation, focusing mostly on making English-language scientific works freely available (Liang, 2012).
Until its legal troubles and subsequent shutdown in 2012, gigapedia/library.nu was the biggest English
language piratical scientific library on the internet amassing several hundred thousand books, including
high-quality proofs ready to print and low resolution scans possibly prepared by a student or a lecturer.
During 2012 the mostly Russian-language and natural sciences focused Aleph absorbed the English
language, social sciences rich gigapedia/library.nu, and with the subsequent shutdown of
gigapedia/library.nu Aleph became the center of the scientific shadow library ecosystem and community.

Aleph by numbers
By adding pre-existing text collections to its catalogue Aleph was able to grow at an astonishing rate.
Aleph added, on average, 17,500 books to its collection each month since 2009, and as a result, by April
2014 it had more than 1.15 million documents. Nearly two thirds of the collection is in English, one fifth
of the documents are in Russian, while German works amount to the third largest group with 8.5% of the
collection. The rest of the major European languages, like French or Spanish, have fewer than 15,000 works
each in the collection.
More than 50 thousand publishers have works in the library, but most of the collection is published by
mainstream western academic publishers. Springer published more than 12% of the works in the
collection, followed by the Cambridge University Press, Wiley, Routledge and Oxford University Press,
each having more than 9,000 works in the collection.
Most of the collection is relatively recent, more than 70% of the collection being published in 1990 or
after. Despite the recentness of the collection, the electronic availability of the titles in the collection is
limited. While around 80% of the books that had an ISBN number registered in the catalogue³ were
available in print either as a new copy or a second hand one, only about one third of the titles were
available in e-book formats. The mean price of the titles still in print was 62 USD according to the data
gathered from Amazon.com.
The number of works accessed through Aleph is as impressive as its catalogue. In the three months
between March and June 2012, on average 24,000 documents were downloaded every day from one of
its half-a-dozen mirrors.⁴ This means that the number of documents downloaded daily from Aleph is
probably in the 50,000 to 100,000 range. The library users come from more than 150 different countries. The
biggest users in terms of volume were the Russian Federation, Indonesia, USA, India, Iran, Egypt, China,
Germany and the UK. Meanwhile, many of the highest per-capita users are Central and Eastern European
countries.

What Aleph is and what it is not
Aleph is an example of the library in the post-scarcity age. It is founded on the idea that books should no
longer be a scarce resource. Aleph set out to remove both sources of scarcity: the natural source of
scarcity in physical copies is overcome through distributed digitization; the artificial source of scarcity
created by copyright protection is overcome through infringement. The liberation from both constraints is
necessary to create a truly scarcity-free environment and to release the potential of the library in the
post-scarcity age.
Aleph is also an ongoing demonstration of the fact that under the condition of non-scarcity, the library can
be a decentralized, distributed, commons-based institution created and maintained through peer
production (Benkler, 2006). The message of Aleph is clear: users, left to their own devices, can produce a
library by themselves for themselves. In fact, users are the library. And when everyone has the means to
digitize, collect, catalogue and share his/her own library, then the library suddenly is everywhere. Small
individual and institutional collections are aggregated into Aleph, which, in turn, is constantly fragmented
into smaller, local, individual collections as users download works from the collection. The library is
breathing (Battles, 2004) books in and out, but for the first time, this circulation of books is not a zero-sum
game, but a cumulative one: with every cycle the collection grows.
On the other hand, Aleph may have lots of books on offer, but it is clear that it is neither universal in its
scope, nor does it fulfill all the critical functions of a library. Most importantly Aleph is disembedded
from the local contexts and communities that usually define the focus of the library. While it relies on the
availability of local digital collections for its growth, it has no means to play an active role in its own
development. The guardians of Aleph can prevent books from entering the collection, but they cannot
pay, ask or force anyone to provide a title if it is missing. Aleph is reliant on the weak copy-protection
technologies of official e-text repositories and the goodwill of individual document submitters when it
comes to the expansion of the collection. This means that the Aleph collection is both fragmented and
biased, and it lacks the necessary safeguards to ensure that it stays either current or relevant.
Aleph, with all its strengths and weaknesses, carries an important lesson for the discussions on the future
of libraries. In the next section I’ll try to situate these lessons in the wider context of the library in the
post-scarcity age.

3 Market availability data is only available for the 40% of books in the Aleph catalogue that had an ISBN number
on file. The titles without a valid ISBN number tend to be older, Russian language titles, in general with low
expected print and e-book availability.
4 Download data is based on the logs provided by one of the shadow library services which offers the books in
Aleph’s catalogue, as well as other works, free and without any restraints or limitations.

The future of the library
There is hardly a week without a blog post, a conference, a workshop or an academic paper discussing the
future of libraries. While existing libraries are buzzing with activity, librarians are well aware that they
need to re-define themselves and their institutions, as the book collections around which libraries were
organized slowly go the way the catalogue has gone: into the digital realm. It would be impossible to give
a faithful summary of all the discussions on the future of libraries in such a short contribution. There are,
however, a few threads, to which the story of Aleph may contribute.

Competition
It is very rare to find the two words, libraries and competition, in the same sentence. No wonder: libraries
enjoyed a near perfect monopoly in their field of activity. Though there may have been many different
local initiatives that provided free access to books, as a specialized institution to do so, the library was
unmatched and unchallenged. This monopoly position has been lost in a remarkably short period of time
due to the internet and the rapid innovations in the legal e-book distribution markets. Textbooks can be
rented, e-books can be lent, a number of new startups and major sellers offer flat rate access to huge
collections. Expertise that helps navigate the domains of knowledge is abundant, there are multiple
authoritative sources of information and meta-information online. The search box of the library catalog is
only one, and not even the most usable, of all the different search boxes one can type a query into.⁵
Meanwhile there are plenty of physical spaces which offer good coffee, an AC plug, comfortable chairs
and low levels of noise to meet, read and study, from local cafes via hacker- and maker spaces to
co-working offices. Many library competitors have access to resources (human, financial, technological and
legal) way beyond the possibilities of even the richest libraries. In addition, publishers control the
copyrights in digital copies which, absent well-fortified statutory limitations and exceptions, prevent
libraries from keeping up with the changes in user habits and with the competing commercial services.
Libraries definitely feel the pressure. “Libraries’ offers of materials […] compete with many other offers
that aim to attract the attention of the public. […] It is no longer enough just to make a good collection
available to the public.” (Committee on the Public Libraries in the Knowledge Society, 2010) As a
response, libraries have developed different strategies to cope with this challenge. The common thread in
the various strategy documents is that they try to redefine the library as a node in the vast network of
institutions that provide knowledge, enable learning, facilitate cooperation and initiate dialogues. Some of
the strategic plans redefine the library space as an “independent medium to be developed” (Committee on
the Public Libraries in the Knowledge Society, 2010), and advise libraries to transform themselves into
culture and community centers which establish partnerships with citizens, communities and with other
public and private institutions. Some librarians propose even more radical ways of keeping the library
relevant by, for example, advocating more opening hours without staff and hosting more user-governed
activities.

5 ArXiv, SSRN, RePEc, PubMed Central, Google Scholar, Google Books, Amazon, Mendeley, Citavi,
ResearchGate, Goodreads, LibraryThing, Wikipedia, Yahoo Answers, Khan Academy, specialized Twitter and other
social media accounts are just a few of the available discovery services.

In the research library sphere, the Commission on the Future of the Library, a task force set up by the
University of California Berkeley, defined the values the university research library will add in the digital
age as “1) Human expertise; 2) Enabling infrastructure; and 3) Preservation and dissemination of
knowledge for future generations.” (Commission on the Future of the Library, 2013). This approach is
among the more conservative ones, still relying on the hope that libraries can offer something
unique that no one else is able to provide. Others, working at the Association of Research Libraries, are
more like their public library counterparts, defining the future role of the research libraries as a “convener
of ‘conversations’ for knowledge construction, an inspiring host; a boundless symposium; an incubator;
a 3rd space both physically and virtually; a scaffold for independence of mind; and a sanctuary for
freedom of expression, a global entrepreneurial engine” (Pendleton-Jullian, Lougee, Wilkin, & Hilton,
2014), in other words, as another important, but in no way unique node in the wider network of
institutions that creates and distributes knowledge.
Despite the differences in priorities, all these recommendations carry the same basic message. The unique
position of libraries in the center of a book-based knowledge economy, at the top of the paper-bound
knowledge hierarchy, is about to be lost. Libraries are losing their monopoly on giving low-cost, low-restriction
access to books which are scarce by nature, and they are losing their privileged and powerful
position as the guardians of and guides to the knowledge stored in the stacks. If they want to survive, they
need to find their role and position in a network of institutions, where everyone else is engaged in
activities that overlap with the historic functions of the library. Just like the books themselves, the power
that came from the privileged access to books is in part dispersed among the countless nodes in the
knowledge and learning networks, and in part is being captured by those who control the digital rights to
digitize and distribute books in the digital era.
One of the main reasons why libraries are trying to redefine themselves as providers of ancillary services
is that the lack of digital lending rights prevents them from competing on their own traditional home
turf: giving free access to knowledge. The traditional legal limitations and exceptions to copyright that
enabled libraries to fulfill their role in the analogue world do not apply in the digital realm. In the
European Union, the Infosoc Directive (“Directive 2001/29/EC on the harmonisation of certain aspects of
copyright and related rights in the information society,” 2001) allows for libraries to create digital copies
for preservation, indexing and similar purposes and allows for the display of digital copies on their
premises for research and personal study (Triaille et al., 2013). While in theory these rights provide for
the core library services in the digital domain, their practical usefulness is rather limited, as off-premises
e-lending of copyrighted works is in most cases⁶ only possible through individual license agreements with
publishers.
Under such circumstances libraries complain that they cannot fulfill their public interest mission in the
digital era. What libraries are allowed to do on their own under current limitations and exceptions is
seen as inadequate for what is expected of them. But to do more requires the appropriate e-lending
licenses from rights holders. In many cases, however, libraries simply cannot license digitally for
e-lending. In those cases when licensing is possible, they see transaction costs as prohibitively high; they
feel that their bargaining position vis-à-vis rightholders is unbalanced; they do not see that the license
terms are adapted to libraries’ policies, and they fear that the licenses provide publishers excessive and
undue influence over libraries (Report on the responses to the Public Consultation on the Review of the
EU Copyright Rules, 2013).
What is more, libraries face substantial legal uncertainties even where there are more-or-less well defined
digital library exceptions. In the EU, questions such as whether the analogue lending rights of libraries
extend to e-books, whether an exhaustion of the distribution right is necessary to enjoy the lending
exception, and whether licensing an e-book would exhaust the distribution right are under consideration
by the Court of Justice of the European Union in a Dutch case (Rosati, 2014b). And while in another case
(Case C-117/13 Technische Universität Darmstadt v Eugen Ulmer KG) the CJEU reaffirmed the rights of
European libraries to digitize books in their collection if that is necessary to give access to them in digital
formats on their premises, it also created new uncertainties by stating that libraries may not digitize their
entire collections (Rosati, 2014a).
US libraries face a similar situation, both in terms of the narrowly defined exceptions in which libraries
can operate, and the huge uncertainty regarding the limits of fair use in the digital library context. US
rights holders challenged both Google’s (Authors Guild v Google) and the libraries’ (Authors Guild v
HathiTrust) rights to digitize copyrighted works. While there seems to be a consensus of courts that the
mass digitization conducted by these institutions was fair use (Diaz, 2013; Rosati, 2014c; Samuelson,
2014), the accessibility of the scanned works is still heavily limited, subject to licenses from publishers,
the existence of print copies at the library and the institutional membership held by prospective readers.
While in the highly competitive US e-book market many commercial intermediaries offer e-lending
licenses to e-book catalogues of various sizes, these arrangements also carry the danger of a commercial
lock-in of the access to digital works, and render libraries dependent upon the services of commercial
providers who may or may not be the best defenders of public interest (OECD, 2012).
Shadow libraries like Aleph are called into existence by the vacuum that was left behind by the collapse
of libraries in the digital sphere and by the inability of the commercial arrangements to provide adequate
substitute services. Shadow libraries are pooling distributed resources and expertise over the internet, and
use the lack of legal or technological barriers to innovation in the informal sphere to fill in the void left
behind by libraries.

6 The notable exception is orphan works, which are presumed to be still copyrighted but have no identifiable
rights owner. In the EU, the Directive 2012/28/EU on certain permitted uses of orphan works in theory eases access
to such works, but in practice its impact is limited by the many constraints among its provisions. Lacking
any orphan works legislation, and with the Google Book Settlement still in limbo, the US is even farther from making
orphan works generally accessible to the public.

What can Aleph teach us about the future of libraries?
The story of Aleph offers two closely interrelated considerations for the debate on the future of libraries:
a legal and an organizational one. Aleph operates beyond the limits of legality, as almost all of its
activities are copyright infringing, including the unauthorized digitization of books, the unauthorized
mass downloads from e-text repositories, the unauthorized acts of uploading books to the archive, the
unauthorized distribution of books, and, in most countries, the unauthorized act of users’ downloading
books from the archive. In the debates around copyright infringement, illegality is usually interpreted as a
necessary condition to access works for free. While this is undoubtedly true, the fact that Aleph provides
no-cost access to books seems to be less important than the fact that it provides access to them in the
first place.
Aleph is a clear indicator of the volume of the demand for current books in digital formats in developed
and in developing countries. The legal digital availability, or rather, unavailability of its catalogue also
demonstrates the limits of the current commercial and library based arrangements that aim to provide low
cost access to books over the internet. As mentioned earlier, Aleph’s catalogue consists mostly of recent books,
meaning that 80% of the titles with a valid ISBN are still in print and available as a new or used
print copy through commercial retailers. It is also clear that around 66% of these books are yet to be
made available in electronic format. While publishers in theory have a strong incentive to make their most
recent titles available as e-books, they lag behind in doing so.
This might explain why one third of all the e-book downloads in Aleph are from highly developed
Western countries, and two thirds of these downloads are of books without a Kindle version. Having access
to print copies either through libraries or through commercial retailers is simply not enough anymore.
Developing countries are a slightly different case. There, compared to developed countries, twice as many
of the downloads (17% compared to 8% in developed countries) are of titles that aren’t available in print
at all. Not having access to books in print seems to be a more pressing problem for developing countries
than not having access to electronic copies. Aleph thus fulfills at least two distinct types of demand: in
developed countries it provides access to missing electronic versions, in developing countries it provides
access to missing print copies.
The ability to fulfill an otherwise unfulfilled demand is not the only function of illegality. Copyright
infringement in the case of Aleph has a much more important role: it enables the peer production of the
library. Aleph is an open source library. This means that every resource it uses and every resource it
creates is freely accessible to anyone for use without any further restrictions. This includes the server
code, the database, the catalogue and the collection. The open source nature of Aleph rests on the
ideological claim that the scientific knowledge produced by humanity, mostly through public funds,
should be open for anyone to access without any restrictions. Everything else in and around Aleph stems
from this claim and replicates the open access logic in all the other aspects of Aleph’s operation. Aleph
uses the peer produced Open Library to fetch book metadata, it uses the BitTorrent and ed2k P2P networks
to store and make books accessible, it uses Linux and MySQL to run its code, and it allows its users to
upload books and edit book metadata. As a consequence of its open source nature, anyone can contribute
to the project, and everyone can enjoy its benefits.
It is hard to quantify the impact of this piratical open access library on education, science and research in
various local contexts where Aleph is the prime source of otherwise inaccessible books. But it is
relatively easy to measure the consequences of openness at the level of Aleph, the library. The
collection of Aleph was created mostly by those individuals and communities who decided to digitize
books by themselves for their own use. While any single individual is only capable of digitizing a few
books at most, the small contributions quickly add up. To digitize the 1.15 million documents in
the Aleph collection would require an investment of several hundred million Euros, and a substantial
subsequent investment in storage, collection management and access provision (Poole, 2010). Compared
to these figures the costs associated with running Aleph are infinitesimal, as it survives on the volunteer
labor of a few individuals, and annual donations totaling a few thousand dollars. The hundreds
of thousands who use Aleph on a more or less regular basis have an immense amount of resources, and by
disregarding the copyright laws Aleph is able to tap into those resources and use them for the
development of the library. The value of these resources and of the peer produced library is the difference
between the actual costs associated with Aleph, and the investment that would be required to create
something remotely similar.

The decentralized, collaborative mass digitization and making available of current, thus most relevant
scientific works is only possible at the moment through massive copyright infringement. It is debatable
whether the copyrighted corpus of scientific works should be completely open, and whether the blatant
disregard of copyrights through which Aleph achieved this openness is the right path towards a more
openly accessible body of scientific knowledge. It is also yet to be measured what effects shadow libraries
may have on the commercial intermediaries and on the health of scientific publishing and science in
general. But Aleph, in any case, is a case study in the potential benefits of open sourcing the library.

Conclusion
If we can take Aleph as an expression of what users around the globe want from a library, then the answer
is that there is a strong need for a universally accessible collection of current, relevant (scientific) books
in restrictions-free electronic formats. Can we expect any single library to provide anything even remotely
similar to that in the foreseeable future? Does such a service have a place in the future of libraries? It is as
hard to imagine the future library with such a service as without.
While the legal and financial obstacles to the creation of a scientific library with as universal reach as
Aleph may be difficult to overcome, other aspects of it may be more easily replicable. The way Aleph
operates demonstrates the amount of material and immaterial resources users are willing to contribute to
build a library that responds to their needs and expectations. If libraries plan to only ‘host’ user-governed
activities, it means that the library is still imagined to be a separate entity from its users. Aleph teaches us
that this separation can be overcome and users can constitute a library. But for that they need
opportunities to participate in the production of the library: they need the right to digitize books and copy
digital books to and from the library, they need the opportunity to participate in the cataloging and
collection building process, they need the opportunity to curate and program the collection. In other
words users need the chance to be librarians in the library if they wish to do so, and so libraries need to be
able to provide access not just to the collection but to their core functions as well. The walls that separate
librarians from library patrons, private and public collections, insiders and outsiders can all prevent the
peer production of the library, and through that, prevent the future that is the closest to what library users
think of as ideal.

References
Abramitzky, R., & Sin, I. (2014). Book Translations as Idea Flows: The Effects of the Collapse of
Communism on the Diffusion of Knowledge (No. w20023). Retrieved from http://papers.ssrn.com/abstract=2421123
Battles, M. (2004). Library: An unquiet history. WW Norton & Company.
Benkler, Y. (2006). The wealth of networks : how social production transforms markets and freedom.
New Haven: Yale University Press.
Bently, L., Davis, J., & Ginsburg, J. C. (Eds.). (2010). Copyright and Piracy: An Interdisciplinary
Critique. Cambridge University Press.
Bodó, B. (2011a). A szerzői jog kalózai. Budapest: Typotex.
Bodó, B. (2011b). Coda: A Short History of Book Piracy. In J. Karaganis (Ed.), Media Piracy in
Emerging Economies. New York: Social Science Research Council.
Bodó, B. (forthcoming). Piracy vs privacy–the analysis of Piratebrowser. IJOC.
Commission on the Future of the Library. (2013). Report of the Commission on the Future of the UC
Berkeley Library. Berkeley: UC Berkeley.
Committee on the Public Libraries in the Knowledge Society. (2010). The Public Libraries in the
Knowledge Society. Copenhagen: Kulturstyrelsen.
Darnton, R. (1982). The literary underground of the Old Regime. Cambridge, Mass: Harvard University
Press.
Darnton, R. (2003). The Science of Piracy: A Crucial Ingredient in Eighteenth-Century Publishing.
Studies on Voltaire and the Eighteenth Century, 12, 3–29.
Diaz, A. S. (2013). Fair Use & Mass Digitization: The Future of Copy-Dependent Technologies after
Authors Guild v. Hathitrust. Berkeley Technology Law Journal, 23.
Directive 2001/29/EC on the harmonisation of certain aspects of copyright and related rights in the
information society. (2001). Official Journal L, 167, 10–19.
Elst, M. (2005). Copyright, freedom of speech, and cultural policy in the Russian Federation.
Leiden/Boston: Martinus Nijhoff.
Ermolaev, H. (1997). Censorship in Soviet Literature: 1917-1991. Rowman & Littlefield.
Friedberg, M., Watanabe, M., & Nakamoto, N. (1984). The Soviet Book Market: Supply and Demand.
Acta Slavica Iaponica, 2, 177–192.
Giblin, R. (2011). Code Wars: 10 Years of P2P Software Litigation. Cheltenham, UK ; Northampton,
MA: Edward Elgar Publishing.

Johns, A. (2010). Piracy: The Intellectual Property Wars from Gutenberg to Gates. University Of
Chicago Press.
Judge, C. B. (1934). Elizabethan book-pirates. Cambridge: Harvard University Press.
Khan, B. Z. (2004). Does Copyright Piracy Pay? The Effects Of U.S. International Copyright Laws On
The Market For Books, 1790-1920. Cambridge, MA: National Bureau Of Economic Research.
Khan, B. Z., & Sokoloff, K. L. (2001). The early development of intellectual property institutions in the
United States. Journal of Economic Perspectives, 15(3), 233–246.
Landes, W. M., & Posner, R. A. (2003). The economic structure of intellectual property law. Cambridge,
Mass.: Harvard University Press.
Lessig, L. (2004). Free culture : how big media uses technology and the law to lock down culture and
control creativity. New York: Penguin Press.
Liang, L. (2012). Shadow Libraries. e-flux. Retrieved from http://www.e-flux.com/journal/shadowlibraries/
Patry, W. F. (2009). Moral panics and the copyright wars. New York: Oxford University Press.
Patterson, L. R. (1968). Copyright in historical perspective. Nashville: Vanderbilt
University Press.
Pendleton-Jullian, A., Lougee, W. P., Wilkin, J., & Hilton, J. (2014). Strategic Thinking and Design—
Research Library in 2033—Vision and System of Action—Part One. Colombus, OH: Association of
Research Libraries. Retrieved from http://www.arl.org/about/arl-strategic-thinking-and-design/arl-membership-refines-strategic-thinking-and-design-at-spring-2014-meeting
Pollard, A. W. (1916). The Regulation Of The Book Trade In The Sixteenth Century. Library, s3-VII(25),
18–43.
Pollard, A. W. (1920). Shakespeare’s fight with the pirates and the problems of the transmission of his
text. Cambridge [Eng.]: The University Press.
Poole, N. (2010). The Cost of Digitising Europe’s Cultural Heritage - A Report for the Comité des Sages
of the European Commission. Retrieved from http://nickpoole.org.uk/wp-content/uploads/2011/12/digiti_report.pdf
Report on the responses to the Public Consultation on the Review of the EU Copyright Rules. (2013).
European Commission, Directorate General for Internal Market and Services.
Rosati, E. (2014a). Copyright exceptions and user rights in Case C-117/13 Ulmer: a couple of
observations. IPKat. Retrieved October 08, 2014, from http://ipkitten.blogspot.co.uk/2014/09/copyrightexceptions-and-user-rights-in.html

Rosati, E. (2014b). Dutch court refers questions to CJEU on e-lending and digital exhaustion, and another
Dutch reference on digital resale may be just about to follow. IPKat. Retrieved October 08, 2014, from
http://ipkitten.blogspot.co.uk/2014/09/dutch-court-refers-questions-to-cjeu-on.html
Rosati, E. (2014c). Google Books’ Library Project is fair use. Journal of Intellectual Property Law &
Practice, 9(2), 104–106.
Rose, M. (1993). Authors and owners : the invention of copyright. Cambridge, Mass: Harvard University
Press.
Samuelson, P. (2002). Copyright and freedom of expression in historical perspective. J. Intell. Prop. L.,
10, 319.
Samuelson, P. (2014). Mass Digitization as Fair Use. Communications of the ACM, 57(3), 20–22.
Schultz, M. F. (2007). Copynorms: Copyright Law and Social Norms. Intellectual Property And
Information Wealth v01, 1, 201.
Sezneva, O. (2012). The pirates of Nevskii Prospekt: Intellectual property, piracy and institutional
diffusion in Russia. Poetics, 40(2), 150–166.
Solly, E. (1885). Henry Hills, the Pirate Printer. Antiquary, xi, 151–154.
Stelmakh, V. D. (2001). Reading in the Context of Censorship in the Soviet Union. Libraries & Culture,
36(1), 143–151.
Suber, P. (2013). Open Access (Vol. 1). Cambridge, MA: The MIT Press. doi:10.1109/ACCESS.2012.2226094
Swartz, A. (2008). Guerilla Open Access Manifesto. Aaron Swartz. Retrieved from
https://archive.org/stream/GuerillaOpenAccessManifesto/Goamjuly2008_djvu.txt
Triaille, J.-P., Dusollier, S., Depreeuw, S., Hubin, J.-B., Coppens, F., & Francquen, A. de. (2013). Study
on the application of Directive 2001/29/EC on copyright and related rights in the information society (the
“Infosoc Directive”). European Union.
Wittmann, R. (2004). Highwaymen or Heroes of Enlightenment? Viennese and South German Pirates and
the German Market. Paper presented at the History of Books and Intellectual History conference.
Princeton University.
Yu, P. K. (2000). From Pirates to Partners: Protecting Intellectual Property in China in the Twenty-First
Century. American University Law, 50. Retrieved from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=245548
Мошков, М. (1999). Что вы все о копирайте. Лучше бы книжку почитали (Библиотеке копирайт не
враг). Компьютерры, (300).
