Murtaugh
A bag but is language nothing of words
2016


## A bag but is language nothing of words

### From Mondotheque


(language is nothing but a bag of words)

[Michael Murtaugh](/wiki/index.php?title=Michael_Murtaugh "Michael Murtaugh")

In text indexing and other machine reading applications the term "bag of
words" is frequently used to underscore how processing algorithms often
represent text using a data structure (word histograms or weighted vectors)
where the original order of the words in sentence form is stripped away. While
"bag of words" might well serve as a cautionary reminder to programmers of the
essential violence perpetrated on a text and a call to critically question the
efficacy of methods based on subsequent transformations, the expression's use
seems in practice more like a badge of pride or a schoolyard taunt that would
go: Hey language: you're nothin' but a big BAG-OF-WORDS.

## Bag of words

In information retrieval and other so-called _machine-reading_ applications
(such as text indexing for web search engines) the term "bag of words" is used
to underscore how in the course of processing a text the original order of the
words in sentence form is stripped away. The resulting representation is then
a collection of each unique word used in the text, typically weighted by the
number of times the word occurs.
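
The transformation described above can be sketched in a few lines of Python. This is a minimal illustration rather than any particular search engine's method: the tokenization rule (lowercased runs of letters and apostrophes) and the sample sentence are assumptions made for the example.

```python
from collections import Counter
import re


def bag_of_words(text):
    """Return a Counter mapping each lowercased word to its number of occurrences."""
    # Assumed tokenization: runs of letters and apostrophes, everything else dropped.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Hypothetical sample text; sentence order and punctuation do not survive.
bag = bag_of_words("The cat sat on the mat. The mat sat still.")
print(bag)
```

What remains is only which words occur and how often; nothing in `bag` records that "cat" came before "sat".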

Bags of words, also known as word histograms or weighted term vectors, are a
standard part of the data engineer's toolkit. But why such a drastic
transformation? The utility of "bag of words" is in how it makes text amenable
to code, first in that it's very straightforward to implement the translation
from a text document to a bag of words representation. More significantly,
this transformation then opens up a wide collection of tools and techniques
for further transformation and analysis purposes. For instance, a number of
libraries available in the booming field of "data sciences" work with "high
dimension" vectors; bag of words is a way to transform a written document into
a mathematical vector where each "dimension" corresponds to the (relative)
quantity of each unique word. While physically unimaginable and abstract
(imagine each of Shakespeare's works as points in a 14 million dimensional
space), from a formal mathematical perspective, it's quite a comfortable idea,
and many complementary techniques (such as principal component analysis) exist
to reduce the resulting complexity.
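
To make the idea of a document as a point in a high-dimensional space concrete, here is a small sketch in which each unique word across a collection becomes one dimension. The two sample "documents" and the choice of relative frequency as the weight are assumptions for illustration; real toolkits use many different weighting schemes.

```python
from collections import Counter


def to_vector(text, vocabulary):
    """Map a text to relative word frequencies, one dimension per vocabulary word."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero for an empty text
    return [counts[word] / total for word in vocabulary]


# Hypothetical two-document collection.
documents = ["to be or not to be", "to sleep perchance to dream"]

# The vocabulary fixes the dimensions: every unique word across all documents.
vocabulary = sorted(set(" ".join(documents).split()))
vectors = [to_vector(doc, vocabulary) for doc in documents]

print(vocabulary)
for vector in vectors:
    print(vector)
```

With seven unique words, each document becomes a point in a seven-dimensional space; a corpus with millions of distinct words works the same way, only with millions of dimensions.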

What's striking about a bag of words representation, given its centrality in so
many text retrieval applications, is its irreversibility. Given a bag of words
representation of a text, the task of producing the original
text would require in essence the "brain" of a writer to recompose sentences,
working with the patience of a devoted cryptogram puzzler to draw from the
precise stock of available words. While "bag of words" might well serve as a
cautionary reminder to programmers of the essential violence perpetrated on a
text and a call to critically question the efficacy of methods based on
subsequent transformations, the expression's use seems in practice more like a
badge of pride or a schoolyard taunt that would go: Hey language: you're
nothing but a big BAG-OF-WORDS. Following this spirit of the term, "bag of
words" celebrates a perfunctory step of "breaking" a text into a purer form
amenable to computation, of stripping language of its silly redundant
repetitions and foolishly contrived stylistic phrasings to reveal a purer
inner essence.
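
The irreversibility can be checked directly with this essay's own title, which is a reordering of the sentence "language is nothing but a bag of words". The sketch below assumes the simplest possible tokenization (lowercasing and splitting on whitespace).

```python
from collections import Counter
from math import factorial


def bag_of_words(text):
    """Count each lowercased, whitespace-separated word; word order is discarded."""
    return Counter(text.lower().split())


sentence = "language is nothing but a bag of words"
title = "A bag but is language nothing of words"

# Two different word orders, one and the same bag.
same_bag = bag_of_words(sentence) == bag_of_words(title)

# Eight distinct words can be arranged in 8! ways; the bag cannot say which was written.
orderings = factorial(len(bag_of_words(sentence)))

print(same_bag, orderings)
```

Both strings produce an identical bag, so nothing in the representation can distinguish the sentence from its scrambled version, or from any of the other possible arrangements.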

## Book of words

Lieber's Standard Telegraphic Code, first published in 1896 and republished in
various updated editions through the early 1900s, is an example of one of
several competing systems of telegraph code books. The idea was for both
senders and receivers of telegraph messages to use the books to translate
their messages into a sequence of code words, which could then be sent for less
money, as telegraph messages were charged by the word. In the front of the book, a
list of examples gives a sampling of how messages like: "Have bought for your
account 400 bales of cotton, March delivery, at 8.34" can be conveyed by a
telegram with the message "Ciotola, Delaboravi". In each case the reduction of
number of transmitted words is highlighted to underscore the efficacy of the
method. Like a dictionary or thesaurus, the book is primarily organized around
key words, such as _act_ , _advice_ , _affairs_ , _bags_ , _bail_ , and
_bales_ , under which exhaustive lists of useful phrases involving the
corresponding word are provided in the main pages of the volume. [1]
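
The economy of the code book can be sketched as a lookup table shared by sender and receiver. The message and its code words are taken from the example above; treating the pair of code words as a single entry is an assumption, since how the book divides the phrase between "Ciotola" and "Delaboravi" is not given here.

```python
# Hypothetical one-entry codebook built from the example in the text.
# Assumption: the two code words are stored together as one entry.
CODEBOOK = {
    "Have bought for your account 400 bales of cotton, March delivery, at 8.34":
        "Ciotola, Delaboravi",
}
# The receiver holds the same book and reads it in the other direction.
DECODEBOOK = {code: phrase for phrase, code in CODEBOOK.items()}


def word_count(message):
    """Telegrams were billed per word, so count whitespace-separated words."""
    return len(message.split())


plain = "Have bought for your account 400 bales of cotton, March delivery, at 8.34"
coded = CODEBOOK[plain]
saved = word_count(plain) - word_count(coded)

print(word_count(plain), word_count(coded), saved)
```

Unlike the bag of words, this encoding is reversible, but only for someone holding the same book: `DECODEBOOK[coded]` returns the full phrase.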

[![Liebers
P1016847.JPG](/wiki/images/4/41/Liebers_P1016847.JPG)](/wiki/index.php?title=File:Liebers_P1016847.JPG)

[![Liebers
P1016859.JPG](/wiki/images/3/35/Liebers_P1016859.JPG)](/wiki/index.php?title=File:Liebers_P1016859.JPG)

[![Liebers
P1016861.JPG](/wiki/images/3/34/Liebers_P1016861.JPG)](/wiki/index.php?title=File:Liebers_P1016861.JPG)

[![Liebers
P1016869.JPG](/wiki/images/f/fd/Liebers_P1016869.JPG)](/wiki/index.php?title=File:Liebers_P1016869.JPG)

> [...] my focus in this chapter is on the inscription technology that grew
parasitically alongside the monopolistic pricing strategies of telegraph
companies: telegraph code books. Constructed under the bywords “economy,”
“secrecy,” and “simplicity,” telegraph code books matched phrases and words
with code letters or numbers. The idea was to use a single code word instead
of an entire phrase, thus saving money by serving as an information
compression technology. Generally economy won out over secrecy, but in
specialized cases, secrecy was also important.[2]

In Katherine Hayles' chapter devoted to telegraph code books she observes how:

> The interaction between code and language shows a steady movement away from
a human-centric view of code toward a machine-centric view, thus anticipating
the development of full-fledged machine codes with the digital computer. [3]

[![Liebers
P1016851.JPG](/wiki/images/1/13/Liebers_P1016851.JPG)](/wiki/index.php?title=File:Liebers_P1016851.JPG)
Aspects of this transitional moment are apparent in a notice
prominently inserted in Lieber's code book:

> After July, 1904, all combinations of letters that do not exceed ten will
pass as one cipher word, provided that it is pronounceable, or that it is
taken from the following languages: English, French, German, Dutch, Spanish,
Portuguese or Latin -- International Telegraphic Conference, July 1903 [4]

Conforming to international conventions regulating telegraph communication at
that time, the stipulation that code words be actual words drawn from a
variety of European languages (many of Lieber's code words are indeed
arbitrary Dutch, German, and Spanish words) underscores this particular moment
of transition as reference to the human body in the form of "pronounceable"
speech from representative languages begins to yield to the inherent potential
for arbitrariness in digital representation.
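
Only part of the 1903 stipulation lends itself to mechanical checking, which is itself telling. The sketch below tests the ten-letter limit against code words quoted in this essay; "pronounceable" and membership in one of the listed languages are deliberately not modeled, and the function name is a hypothetical one chosen for the example.

```python
MAX_LETTERS = 10  # limit stated in the 1903 conference notice quoted above


def passes_length_rule(code_word):
    """Check only the ten-letter limit.

    Pronounceability and language membership, the other conditions in the
    notice, are not modeled here.
    """
    return code_word.isalpha() and len(code_word) <= MAX_LETTERS


# Code words quoted in this essay from Lieber's book.
for word in ["Ciotola", "Delaboravi", "Choliambos", "Chirriado"]:
    print(word, len(word), passes_length_rule(word))
```

The countable half of the rule is trivial to automate, while the half that refers to the speaking body resists it.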

What telegraph code books do is remind us of the relation of language in
general to economy. Whether these are economies of memory, attention, costs
paid to a telecommunications company, or computer processing time
or storage space, encoding language or knowledge in any form of writing is a
form of shorthand and always involves an interplay with what one expects to
perform or "get out" of the resulting encoding.

> Along with the invention of telegraphic codes comes a paradox that John
Guillory has noted: code can be used both to clarify and occlude. Among the
sedimented structures in the technological unconscious is the dream of a
universal language. Uniting the world in networks of communication that
flashed faster than ever before, telegraphy was particularly suited to the
idea that intercultural communication could become almost effortless. In this
utopian vision, the effects of continuous reciprocal causality expand to
global proportions capable of radically transforming the conditions of human
life. That these dreams were never realized seems, in retrospect, inevitable.
[5]

[![Liebers
P1016884.JPG](/wiki/images/9/9c/Liebers_P1016884.JPG)](/wiki/index.php?title=File:Liebers_P1016884.JPG)

[![Liebers
P1016852.JPG](/wiki/images/7/74/Liebers_P1016852.JPG)](/wiki/index.php?title=File:Liebers_P1016852.JPG)

[![Liebers
P1016880.JPG](/wiki/images/1/11/Liebers_P1016880.JPG)](/wiki/index.php?title=File:Liebers_P1016880.JPG)

Far from providing a universal system of encoding messages in the English
language, Lieber's code is quite clearly designed for the particular needs and
conditions of its use. In addition to the phrases ordered by keywords, the
book includes a number of tables of terms for specialized use. One table lists
a set of words used to describe all possible permutations of numeric grades of
coffee (Choliam = 3,4, Choliambos = 3,4,5, Choliba = 4,5, etc.); another table
lists pairs of code words to express the respective daily rise or fall of the
price of coffee at the port of Le Havre in increments of a quarter of a Franc
per 50 kilos ("Chirriado = prices have advanced 1 1/4 francs"). From an
archaeological perspective, Lieber's code book reveals a cross section of
the needs and desires of early 20th century business communication between the
United States and its trading partners.

The advertisements lining Lieber's code book further situate its use and
that of commercial telegraphy. Among the many advertisements for banking and
law services, office equipment, and alcohol are several ads for gunpowder and
explosives, drilling equipment, and metallurgic services, all with specific
applications to mining. Extending telegraphy's formative role for ship-to-
shore and ship-to-ship communication for reasons of safety, commercial
telegraphy extended this network of communication to include those parties
coordinating the "raw materials" being mined, grown, or otherwise extracted
from overseas sources and shipped back for sale.

## "Raw data now!"

From [The Smart City - City of Knowledge](/wiki/index.php?title=The_Smart_City_-_City_of_Knowledge "The Smart City - City of Knowledge"):

As new modernist forms and use of materials propagated the abundance of
decorative elements, Otlet believed in the possibility of language as a model
of '[raw data](/wiki/index.php?title=Bag_of_words "Bag of words")', reducing
it to essential information and unambiguous facts, while removing all
inefficient assets of ambiguity or subjectivity.


> Tim Berners-Lee: [...] Make a beautiful website, but first give us the
unadulterated data, we want the data. We want unadulterated data. OK, we have
to ask for raw data now. And I'm going to ask you to practice that, OK? Can
you say "raw"?

>

> Audience: Raw.

>

> Tim Berners-Lee: Can you say "data"?

>

> Audience: Data.

>

> TBL: Can you say "now"?

>

> Audience: Now!

>

> TBL: Alright, "raw data now"!

>

> [...]

>

> So, we're at the stage now where we have to do this -- the people who think
it's a great idea. And all the people -- and I think there's a lot of people
at TED who do things because -- even though there's not an immediate return on
the investment because it will only really pay off when everybody else has
done it -- they'll do it because they're the sort of person who just does
things which would be good if everybody else did them. OK, so it's called
linked data. I want you to make it. I want you to demand it. [6]

## Un/Structured

As graduate students at Stanford, Sergey Brin and Lawrence (Larry) Page had an
early interest in producing "structured data" from the "unstructured" web. [7]

> The World Wide Web provides a vast source of information of almost all
types, ranging from DNA databases to resumes to lists of favorite restaurants.
However, this information is often scattered among many web servers and hosts,
using many different formats. If these chunks of information could be
extracted from the World Wide Web and integrated into a structured form, they
would form an unprecedented source of information. It would include the
largest international directory of people, the largest and most diverse
databases of products, the greatest bibliography of academic works, and many
other useful resources. [...]

>

> **2.1 The Problem**
> Here we define our problem more formally:
> Let D be a large database of unstructured information such as the World
Wide Web [...] [8]

In a paper titled _Dynamic Data Mining_, Brin and Page situate their research
as a search for _rules_ (statistical correlations) between words used in web
pages. The "baskets" they mention stem from the origins of "market basket"
techniques developed to find correlations between the items recorded in the
purchase receipts of supermarket customers. In their case, they deal with web
pages rather than shopping baskets, and words instead of purchases. In
transitioning to the much larger scale of the web, they describe the
usefulness of their research in terms of its computational economy, that is,
the ability to tackle the scale of the web and still complete the task in a
reasonably short amount of time using contemporary computing power.

> A traditional algorithm could not compute the large itemsets in the lifetime
of the universe. [...] Yet many data sets are difficult to mine because they
have many frequently occurring items, complex relationships between the items,
and a large number of items per basket. In this paper we experiment with word
usage in documents on the World Wide Web (see Section 4.2 for details about
this data set). This data set is fundamentally different from a supermarket
data set. Each document has roughly 150 distinct words on average, as compared
to roughly 10 items for cash register transactions. We restrict ourselves to a
subset of about 24 million documents from the web. This set of documents
contains over 14 million distinct words, with tens of thousands of them
occurring above a reasonable support threshold. Very many sets of these words
are highly correlated and occur often. [9]
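
The "market basket" idea applied to web pages can be sketched as counting how many pages contain a given pair of words and keeping the pairs above a support threshold. This is the naive, exhaustive form of the technique, not the sampling method of Brin and Page's paper; the toy pages and the threshold value are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy "baskets": each web page reduced to its set of distinct words.
pages = [
    {"raw", "data", "now"},
    {"raw", "data", "mining"},
    {"data", "mining", "web"},
    {"raw", "material", "mining"},
]

MIN_SUPPORT = 2  # assumed threshold: a pair must occur in at least this many pages

pair_counts = Counter()
for words in pages:
    # Sorting gives each unordered pair a single canonical form.
    for pair in combinations(sorted(words), 2):
        pair_counts[pair] += 1

frequent_pairs = {pair: n for pair, n in pair_counts.items() if n >= MIN_SUPPORT}
print(frequent_pairs)
```

A page with 150 distinct words already yields 11,175 pairs, and the count grows far faster for larger word sets, which is why exhaustive counting of this kind breaks down at the scale the quotation describes.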

## Un/Ordered

In programming, I've encountered a recurring "problem" that's quite
symptomatic. It goes something like this: you (the programmer) have managed to
cobble together a lovely "content management system" (either from scratch, or using
any number of helpful frameworks) where your user can enter some "items" into
a database, for instance to store bookmarks. After this, the ordered items are
automatically presented in list form (say on a web page). The author says: It's
great, except... could this bookmark come before that one? The problem stems
from the fact that the database ordering (a core functionality provided by any
database) somehow applies a sorting logic that's almost but not quite right. A
typical example is the sorting of names, where details (where to place a name
that starts with a Norwegian "Ø", for instance) are language-specific, and
when a mixture of languages occurs, no single ordering is necessarily
"correct". The (often) exasperated programmer might hastily add an additional
database field so that each item can also have an "order" (perhaps in the form
of a date or some other kind of (alpha)numerical "sorting" value) to be used
to correctly order the resulting list. Now the author has a means, awkward and
indirect but workable, to control the order of the presented data on the start
page. But one might well ask, why not just edit the resulting listing as a
document? Not possible! Contemporary content management systems are based on a
data flow from a "pure" source of a database, through controlling code and
templates to produce a document as a result. The document isn't the data, it's
the end result of an irreversible process. This problem, in this and many
variants, is widespread and reveals an essential backwardness in a
particular "computer scientist" mindset about what constitutes "data",
and in particular its relationship to order: a mindset that turns what might be a
straightforward question of editing a document into an over-engineered
database.
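
The situation can be reproduced in miniature. In the sketch below, Python's default string sort (which compares Unicode code points) stands in for the database's "almost but not quite right" ordering, and an extra `order` field stands in for the programmer's workaround. The names and the order values are hypothetical.

```python
names = ["Petersen", "Østby", "Olsen"]

# Default string sorting compares Unicode code points, so "Ø" (U+00D8)
# lands after every unaccented capital letter.
by_code_point = sorted(names)

# The workaround from the text: a hand-assigned "order" field per item.
# The values below are hypothetical editorial choices.
bookmarks = [
    {"name": "Petersen", "order": 30},
    {"name": "Østby", "order": 20},
    {"name": "Olsen", "order": 10},
]
by_hand = [item["name"] for item in sorted(bookmarks, key=lambda item: item["order"])]

print(by_code_point)
print(by_hand)
```

Neither result is "the" correct one: a Norwegian alphabet places "Ø" near the end, while a reader used to another convention may expect it next to "O". The `order` field simply records an editorial decision that the data model otherwise has no place for.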

While recently working with Nikolaos Vogiatzis, whose research explores playful and
radically subjective alternatives to the list, I saw how struck he was that
the earliest specifications of HTML (still valid today) have separate
elements (OL and UL) for "ordered" and "unordered" lists.

> The representation of the list is not defined here, but a bulleted list for
unordered lists, and a sequence of numbered paragraphs for an ordered list
would be quite appropriate. Other possibilities for interactive display
include embedded scrollable browse panels. [10]

Vogiatzis' surprise lay in the idea of a list ever being considered
"unordered" (or in opposition to the language used in the specification, for
order to ever be considered "insignificant"). Indeed in its suggested
representation, still followed by modern web browsers, the only difference
between the two visually is that UL items are preceded by a bullet symbol,
while OL items are numbered.

The idea of ordering runs deep in programming practice where essentially
different data structures are employed depending on whether order is to be
maintained. The indexes of a "hash" table, for instance (also known as an
associative array), are ordered in an unpredictable way governed by a
representation's particular implementation. This data structure, extremely
prevalent in contemporary programming practice, sacrifices order to offer other
kinds of efficiency (fast text-based retrieval for instance).
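
The difference between order-preserving and hash-based structures can be seen in a few lines of Python, used here only as one example of a language where the distinction is built in. The word lists are hypothetical.

```python
ordered = ["raw", "data", "now"]
reordered = ["now", "raw", "data"]

# Lists, like OL, treat order as part of their identity.
lists_equal = ordered == reordered            # False

# Sets are hash-based and carry no order of their own, so these compare equal.
sets_equal = set(ordered) == set(reordered)   # True

# What the hash buys in exchange: membership tests without scanning every item.
has_data = "data" in set(ordered)             # True

print(lists_equal, sets_equal, has_data)
```

How much order a hash-based structure exposes is indeed a matter of implementation: in CPython the iteration order of a set of strings can change from one run to the next, while the built-in `dict`, also a hash table, has preserved insertion order since Python 3.7.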

## Data mining

In announcing Google's impending data center in Mons, Belgian prime minister
Di Rupo invoked the link between the history of the mining industry in the
region and the present and future interest in "data mining" as practiced by IT
companies such as Google.

Whether speaking of bales of cotton, barrels of oil, or bags of words, what
links these subjects is the way in which the notion of "raw material" obscures
the labor and power structures employed to secure them. "Raw" is always
relative: "purity" depends on processes of "refinement" that typically carry
social/ecological impact.

Stripping language of order is an act of "disembodiment", detaching it from
the acts of writing and reading. The shift from (human) reading to machine
reading involves a shift of responsibility from the individual human body to
the obscured responsibilities and seemingly inevitable forces of the
"machine", be it the machine of a market or the machine of an algorithm.

From [X = Y](/wiki/index.php?title=X_%3D_Y "X = Y"):

Still, it is reassuring to know that the products hold traces of the work,
that even with the progressive removal of human signs in automated processes,
the workers' presence never disappears completely. This presence is proof of
the materiality of information production, and becomes a sign of the economies
and paradigms of efficiency and profitability that are involved.


The computer scientists' view of textual content as "unstructured", be it in a
webpage or the OCR scanned pages of a book, reflects a neglect of the
processes and labor of writing, editing, design, layout, typesetting, and
eventually publishing, collecting and cataloging [11].

"Unstructured", to the computer scientist, means non-conformant to particular
forms of machine reading. "Structuring" then is a social process by which
particular (additional) conventions are agreed upon and employed. Computer
scientists often view text through the eyes of their particular reading
algorithm, and in the process (voluntarily) blind themselves to the work
practices which have produced and maintain these "resources".

Berners-Lee, in exhorting his audience of web publishers not only to publish
online but to release "unadulterated" data, betrays a lack of imagination in
considering how language is itself structured and a blindness to the need for
more than additional technical standards to connect to existing publishing
practices.

Last Revision: 2*08*2016

1. Benjamin Franklin Lieber, Lieber's Standard Telegraphic Code, New York, 1896.
2. Katherine Hayles, "Technogenesis in Action: Telegraph Code Books and the Place of the Human", in How We Think: Digital Media and Contemporary Technogenesis, 2012.
3. Hayles.
4. Lieber's.
5. Hayles.
6. Tim Berners-Lee, The next web, TED Talk, February 2009.
7. "Research on the Web seems to be fashionable these days and I guess I'm no exception." From Brin's [Stanford webpage](http://infolab.stanford.edu/~sergey/).
8. Sergey Brin, Extracting Patterns and Relations from the World Wide Web, Proceedings of the WebDB Workshop at EDBT 1998.
9. Sergey Brin and Lawrence Page, Dynamic Data Mining: Exploring Large Rule Spaces by Sampling, 1998, p. 2.
10. Tim Berners-Lee and Daniel Connolly, Hypertext Markup Language (HTML): "Internet Draft", June 1993.
11.

Retrieved from
[https://www.mondotheque.be/wiki/index.php?title=A_bag_but_is_language_nothing_of_words&oldid=8480](https://www.mondotheque.be/wiki/index.php?title=A_bag_but_is_language_nothing_of_words&oldid=8480)

WHW
There Is Something Political in the City Air
2016


What, How & for Whom / WHW

“There is something political in the city air”*

The curatorial collective What,
How & for Whom / WHW, based
in Zagreb and Berlin, examine
the interconnections between
contemporary art and political and
social strata, including the role of art
institutions in contemporary society.
In the present essay, their discussion
of recent projects they curated
highlights the struggle for access to
knowledge and the free distribution
of information, which in Croatia also
means confronting the pressures
of censorship and revisionism
in the writing of history and the
construction of the future.

Contemporary art’s attempts to come to terms with its evasions in delivering on the promise of its own intrinsic capacity to propose alternatives, and
to do better in the constant game of staying ahead of institutional closures
and marketization, are related to a broader malady in leftist politics. The
crisis of organizational models and modes of political action feels especially acute nowadays, after the latest waves of massive political mobilization
and upheaval embodied in such movements as the Arab Spring and Occupy and the widespread social protests in Southern Europe against austerity
measures – and the failure of these movements to bring about structural
changes. As we witnessed in the dramatic events that unfolded through the
spring and summer of 2015, even in Greece, where Syriza was brought to
power, the people’s will behind newly elected governments proved insufficient to change the course of austerity politics in Europe. Simultaneously,
a series of conditional gains and effective defeats gave rise to the alarming
ascent of radical right-wing populism, against which the left has failed to
provide any real vision or driving force.
Both the practice of political articulation and the political practices of
art have been affected by the hollowing and disabling of democracy related
to the ascendant hegemony of the neoliberal rationale that shapes every
domain of our lives in accordance with a specific image of economics,1
as well as the problematic “embrace of localism and autonomy by much
of the left as the pure strategy”2 and the left’s inability to destabilize the
dominant world-view and reclaim the future.3 Consequently, art practices
increasingly venture into novel modes of operation that seek to “expand
our collective imagination beyond what capitalism allows”.4 They not only
point to the problems but address them head on. By negotiating art’s autonomy and impact on the social, and by conceptualizing the whole edifice
of art as a social symptom, such practices attempt to do more than simply
squeeze novel ideas into exhausted artistic formats and endow them with
political content that produces “marks of distinction”,5 which capital then
exploits for the enhancement of its own reproduction.
The two projects visited in this text both work toward building truly
accessible public spaces. Public Library, launched by Marcell Mars and
Tomislav Medak in 2012, is an ongoing media and social project based on
ideas from the open-source software movement, while Autonomy Cube, by
artist Trevor Paglen and the hacker and computer security researcher Jacob Appelbaum, centres on anonymized internet usage in the post–Edward

* David Harvey, Rebel Cities: From the Right to the City to the Urban Revolution, Verso, London and New York, 2012, p. 117.
1 See Wendy Brown, Undoing the Demos: Neoliberalism’s Stealth Revolution, Zone Books, New York, 2015.
2 Harvey, Rebel Cities, p. 83.
3 See Nick Srnicek and Alex Williams, Inventing the Future: Postcapitalism and a World Without Work, Verso, London and New York, 2015.
4 Ibid., p. 495.
5 See Harvey, Rebel Cities, especially pp. 103–109.


Snowden world of unprecedented institutionalized surveillance. Both projects operate in tacit alliance with art institutions that more often than not
are suffering from a kind of “mission drift” under pressure to align their
practices and structures with the profit sector, a situation that in recent
decades has gradually become the new norm.6 By working within and with
art institutions, both Public Library and Autonomy Cube induce the institutions to return to their initial mission of creating new common spaces
of socialization and political action. The projects develop counter-publics
and work with infrastructures, in the sense proposed by Keller Easterling:
not just physical networks but shared standards and ideas that constitute
points of contact and access between people and thus rule, govern, and
control the spaces in which we live.7
By building a repository of digitized books, and enabling others to do this
as well, Public Library promotes the idea of the library as a truly public institution that offers universal access to knowledge, which “together with
free public education, a free public healthcare, the scientific method, the
Universal Declaration of Human Rights, Wikipedia, and free software,
among others – we, the people, are most proud of”, as the authors of the
project have said.8 Public Library develops devices for the free sharing of
books, but it also functions as a platform for advocating social solidarity
in free access to knowledge. By ignoring and avoiding the restrictive legal
regime for intellectual property, which was brought about by decades of
neoliberalism, as well as the privatization or closure of public institutions,
spatial controls, policing, and surveillance – all of which disable or restrict
possibilities for building new social relations and a new commons – Public
Library can be seen as part of the broader movement to resist neoliberal
austerity politics and the commodification of knowledge and education
and to appropriate public spaces and public goods for common purposes.
While Public Library is fully engaged with the movement to oppose the
copyright regime – which developed as a kind of rent for expropriating the
commons and reintroducing an artificial scarcity of cognitive goods that
could be reproduced virtually for free – the project is not under the spell of
digital fetishism, which until fairly recently celebrated a new digital commons as a non-frictional space of smooth collaboration where a new political and economic autonomy would be forged that would spill over and
undermine the real economy and permeate all spheres of life.9 As Matteo
Pasquinelli argues in his critique of “digitalism” and its celebration of the

6 See Brown, Undoing the Demos.
7 Keller Easterling, Extrastatecraft: The Power of Infrastructure Space, Verso, London and New York, 2014.
8 Marcell Mars, Manar Zarroug, and Tomislav Medak, “Public Library”, in Public Library, ed. Marcell Mars, Tomislav Medak, and What, How & for Whom / WHW, exh. publication, What, How & for Whom / WHW and Multimedia Institute, Zagreb, 2015, p. 78.
9 See Matteo Pasquinelli, Animal Spirits: A Bestiary of the Commons, NAi Publishers, Rotterdam, and Institute of Network Cultures, Amsterdam, 2008.


virtues of the information economy with no concern about the material
basis of production, the information economy is a parasite on the material
economy and therefore “an accurate understanding of the common must
be always interlinked with the real physical forces producing it and the material economy surrounding it.”10
Public Library emancipates books from the restrictive copyright regime
and participates in the exchange of information enabled by digital technology, but it also acknowledges the labour and energy that make this possible. There is labour that goes into the cataloguing of the books, and labour
that goes into scanning them before they can be brought into the digital
realm of free reproduction, just as there are the ingenuity and labour of
the engineers who developed a special scanner that makes it easier to scan
books; also, the scanner needs to be installed, maintained, and fed books
over hours of work. This is where the institutional space of art comes in
handy by supporting the material production central to the Public Library
endeavour. But the scanner itself does not need to be visible. In 2014, at
the Museo Nacional Centro de Arte Reina Sofia in Madrid, we curated the
exhibition Really Useful Knowledge, which dealt with conflicts triggered by
struggles over access to knowledge and the effects that knowledge, as the
basis of capital reproduction, has on the totality of workers’ lives. In the
exhibition, the production funds allocated to Public Library were used to
build the book scanner at Calafou, an anarchist cooperative outside Barcelona. The books chosen for scanning were relevant to the exhibition’s
themes – methods of reciprocal learning and teaching, forms of social and
political organization, the history of the Spanish Civil War, etc. – and after
being scanned, they were uploaded to the Public Library website. All that
was visible in the exhibition itself was a kind of index card or business card
with a URL link to the Public Library website and a short statement (fig. 1):
A public library is:
• free access to books for every member of society
• library catalog
• librarian
With books ready to be shared, meticulously cataloged, everyone is a
librarian. When everyone is librarian, the library is everywhere.11
Public Library’s alliance with art institutions serves to strengthen the
cultural capital both for the general demand to free books from copyright
restrictions on cultural goods and for the project itself – such cultural capital could be useful in a potential lawsuit. Simultaneously, the presence and
realization of the Public Library project within an exhibition enlists the host
institution as part of the movement and exerts influence on it by taking
the museum’s public mission seriously and extending it into a grey zone of

10 Ibid., p. 29.
11 Mars, Zarroug, and Medak, “Public Library”, p. 85.


questionable legality. The defence of the project becomes possible by making the traditional claim of the “autonomy” of art, which is not supposed
to assert any power beyond the museum walls. By taking art’s autonomy
at its word, and by testing the truth of the liberal-democratic claim that
the field of art is a field of unlimited freedom, Public Library engages in a
kind of “overidentification” game, or what Keller Easterling, writing about
the expanded activist repertoire in infrastructure space, calls “exaggerated
compliance”.12 Should the need arise, as in the case of a potential lawsuit
against the project, claims of autonomy and artistic freedom create a protective shroud of untouchability. And in this game of liberating books from
the parochial capitalist imagination that restricts their free circulation, the
institution becomes a complicit partner. The long-acknowledged insight
that institutions embrace and co-opt critique is, in this particular case, a
win-win situation, as Public Library uses the public status of the museum
as a springboard to establish the basic message of free access and the free
circulation of books and knowledge as common sense, while the museum
performs its mission of bringing knowledge to the public and supporting
creativity, in this case the reworking, rebuilding and reuse of technology
for the common good. The fact that the institution is not naive but complicit produces a synergy that enhances potentialities for influencing and
permeating the public sphere. The gesture of not exhibiting the scanner in
the museum has, among other things, a practical purpose, as more books
would be scanned voluntarily by the members of the anarchist commune
in Calafou than would be by the overworked museum staff, and employing
somebody to do this during the exhibition would be too expensive (and the
mantra of cuts, cuts, cuts would render negotiation futile). If there is a flirtatious nod to the strategic game of not exposing too much, it is directed less
toward the watchful eyes of the copyright police than toward the exhibition
regime of contemporary art group shows in which works compete for attention, the biggest scarcity of all. Public Library flatly rejects identification
with the object “our beloved bookscanner” (as the scanner is described on
the project website13), although it is an attractive object that could easily
be featured as a sculpture within the exhibition. But its efficacy and use
come first, as is also true of the enigmatic business card–like leaflet, which
attracts people to visit the Public Library website and use books, not only to
read them but also to add books to the library: doing this in the privacy of
one’s home on one’s own computer is certainly more effective than doing
it on a computer provided and displayed in the exhibition among the other
art objects, films, installations, texts, shops, cafés, corridors, exhibition
halls, elevators, signs, and crowds in a museum like Reina Sofia.
For the exhibition to include a scanner that was unlikely to be used or
a computer monitor that showed the website from which books might be
12 Easterling, Extrastatecraft, p. 492.
13 See https://www.memoryoftheworld.org/blog/2012/10/28/our-belovedbookscanner-2/ (accessed July 4, 2016).

What, How & for Whom / WHW

downloaded, but probably not read, would be the embodiment of what
philosopher Robert Pfaller calls “interpassivity”, the appearance of activity or a stand-in for it that in fact replaces any genuine engagement.14 For
Pfaller, interpassivity designates a flight from engagement, a misplaced libidinal investment that under the mask of enjoyment hides aversion to an
activity that one is supposed to enjoy, or more precisely: “Interpassivity is
the creation of a compromise between cultural interests and latent cultural
aversion.”15 Pfaller’s examples of participation in an enjoyable process that
is actually loathed include book collecting and the frantic photocopying of
articles in libraries (his book was originally published in 2002, when photocopying had not yet been completely replaced by downloading, bookmarking, etc.).16 But he also discusses contemporary art exhibitions as sites of
interpassivity, with their overabundance of objects and time-based works
that require time that nobody has, and with the figure of the curator on
whom enjoyment is displaced – the latter, he says, is a good example of
“delegated enjoyment”. By not providing the exhibition with a computer
from which books can be downloaded, the project ensures that books are
seen as vehicles of knowledge acquired by reading and not as immaterial
capital to be frantically exchanged; the undeniable pleasure of downloading and hoarding books is, after all, just one step removed from the playground of interpassivity that the exhibition site (also) is.
But Public Library is hardly making a moralistic statement about the
virtues of reading, nor does it believe that ignorance (such as could be
overcome by reading the library’s books) is the only obstacle that stands
in the way of ultimate emancipation. Rather, the project engages with, and
contributes to, the social practice that David Harvey calls “commoning”:
“an unstable and malleable social relation between a particular self-defined social group and those aspects of its actually existing or yet-to-be-created social and/or physical environment deemed crucial to its life and
livelihood”.17 Public Library works on the basis of commoning and tries to
enlist others to join it, which adds a distinctly political dimension to the
sabotage of intellectual property revenues and capital accumulation.
14 Robert Pfaller, On the Pleasure Principle in Culture: Illusions Without Owners, Verso, London and New York, 2014.
15 Ibid., p. 76.
16 Pfaller’s book, which first appeared in German, was published in English only in 2014. His ideas have gained greater relevance over time, not only as the shortcomings of the immensely popular social media activism became apparent – where, as many critics have noted, participation in political organizing and the articulation of political tasks and agendas are often replaced by a click on an icon – but also because of Pfaller’s broader argument about the self-deception at play in interpassivity and its role in eliciting enjoyment from austerity measures and other calamities imposed on the welfare state by the neoliberal regime, which since early 2000 has exceeded even the most sober (and pessimistic) expectations.
17 Ibid., p. 73.

The political dimension of Public Library and the effort to form and
publicize the movement were expressed more explicitly in the Public Library
exhibition in 2015 at Gallery Nova in Zagreb, where we have been
directing the programme since 2003. If the Public Library project was not
such an eminently collective practice that pays no heed to the author function, the Gallery Nova show might be considered something like a solo exhibition. As it was realized, the project again used art as an infrastructure
and resource to promote the movement of freeing books from copyright
restrictions while collecting legitimization points from the art world as enhanced cultural capital that could serve as armour against future attacks
by the defenders of the holy scripture of copyright laws. But here the more
important tactic was to show the movement as an army of many and to
strengthen it through self-presentation. The exhibition presented Public
Library as a collection of collections, and the repertory form (used in archive science to describe a collection) was taken as the basic narrative procedure. It mobilized and activated several archives and open digital repositories, such as MayDay Rooms from London, The Ignorant Schoolmaster and
His Committees from Belgrade, Library Genesis and Aaaaaarg.org, Catalogue
of Free Books, (Digitized) Praxis, the digitized work of the Midnight Notes
Collective, and Textz.com, with special emphasis on activating the digital
repositories UbuWeb and Monoskop. Not only did the exhibition attempt to
enlist the gallery audience but, equally important, the project was testing
its own strength in building, articulating, announcing, and proposing, or
speculating on, a broader movement to oppose the copyright of cultural
goods within and adjacent to the art field.
Presenting such a movement in an art institution changes one of the
basic tenets of art, and for an art institution the project’s main allure probably lies in this kind of expansion of the art field. A shared politics is welcome, but nothing makes an art institution so happy as the sense of purpose that a project like Public Library can endow it with. (This, of course,
comes with its own irony, for while art institutions nowadays compete for
projects that show emphatically how obsolete the aesthetic regime of art is,
they continue to base their claims of social influence on knowledge gained
through some form of aesthetic appreciation, however they go about explaining and justifying it.) At the same time, Public Library’s nonchalance
about institutional maladies and anxieties provides a homeopathic medicine whose effect is sometimes so strong that discussion about placebos
becomes, at least temporarily, beside the point. One occasion when Public
Library’s roving of the political terrain became blatantly direct was the exhibition Written-off: On the Occasion of the 20th Anniversary of Operation
Storm, which we organized in the summer of 2015 at Gallery Nova (figs.
2–4).
The exhibition/action Written-off was based on data from Ante Lesaja’s
extensive research on “library purification”, which he published in his book
Knjigocid: Uništavanje knjige u Hrvatskoj 1990-ih (Libricide: The Destruction
of Books in Croatia in the 1990s).18 People were invited to bring in copies of
books that had been removed from Croatian public libraries in the 1990s.
The books were scanned and deposited in a digital archive; they then became available on a website established especially for the project. In Croatia during the 1990s, hundreds of thousands of books were removed from
schools and factories, from public, specialized, and private libraries, from
former Yugoslav People’s Army centres, socio-political organizations, and
elsewhere because of their ideologically inappropriate content, the alphabet they used (Serbian Cyrillic), or the ethnic or political background of the
authors. The books were mostly thrown into rubbish bins, discarded on
the street, destroyed, or recycled. What Lesaja’s research clearly shows is
that the destruction of the books – as well as the destruction of monuments
to the People’s Liberation War (World War II) – was not the result of individuals running amok, as official accounts preach, but a deliberate and systematic action that symbolically summarizes the dominant politics of the
1990s, in which war, rampant nationalism, and phrases about democracy
and sovereignty were used as a rhetorical cloak to cover the nakedness of
the capitalist counter-revolution and criminal processes of dispossession.

18 Ante Lesaja, Knjigocid: Uništavanje knjige u Hrvatskoj 1990-ih, Profil and Srpsko narodno vijeće, Zagreb, 2012.

Written-off: On the Occasion of the 20th Anniversary of Operation Storm
set up scanners in the gallery, initiated a call for collecting and scanning
books that had been expunged from public institutions in the 1990s, and
outlined the criteria for the collection, which corresponded to the basic
domains in which the destruction of the books, as a form of censorship,
was originally implemented: books written in the Cyrillic alphabet or in
Serbian regardless of the alphabet; books forming a corpus of knowledge
about communism, especially Yugoslav communism, Yugoslav socialism,
and the history of the workers’ struggle; and books presenting the anti-Fascist and revolutionary character of the People’s Liberation Struggle during
World War II.
The exhibition/action was called Written-off because the removal and
destruction of the books were often presented as a legitimate procedure
of library maintenance, thus masking the fact that these books were unwanted, ideologically unacceptable, dangerous, harmful, unnecessary, etc.
Written-off unequivocally placed “book destruction” in the social context
of the period, when the destruction of “unwanted” monuments and books
was happening alongside the destruction of homes and the killing of “unwanted” citizens, outside of and prior to war operations. For this reason,
the exhibition was dedicated to the twentieth anniversary of Operation
Storm, the final military/police operation in what is called, locally, the
Croatian Homeland War.19

19 Known internationally as the Croatian War of Independence, the war was fought between Croatian forces and the Serb-controlled Yugoslav People’s Army from 1991 to 1995.

The exhibition was intended as a concrete intervention against a political logic that resulted in mass exile and killing, the history of which is
glossed over and critical discussion silenced, and also against the official

celebrations of the anniversary, which glorified militarism and proclaimed
the ethical purity of the victory (resulting in the desired ethnic purity of the
nation).
As both symbolic intervention and real-life action, then, the exhibition
Written-off took place against a background of suppressed issues relating
to Operation Storm – ethno-nationalism as the flip side of neoliberalism,
justice and the present status of the victims and refugees, and the overall character of the war known officially as the Homeland War, in which
discussions about its prominent traits as a civil war are actively silenced
and increasingly prosecuted. In protest against the official celebrations
and military parades, the exhibition marked the anniversary of Operation
Storm with a collective action that evokes books as symbolic of a “knowledge society” in which knowledge becomes the location of conflictual engagement. It pointed toward the struggle over collective symbolic capital
and collective memory, in which culture as a form of the commons has a
direct bearing on the kind of place we live in. The Public Library project,
however, is engaged not so much with cultural memory and remembrance
as a form of recollection or testimony that might lend political legitimation
to artistic gestures; rather, it engages with history as a construction and
speculative proposition about the future, as Peter Osborne argues in his
polemical hypotheses on the notion of contemporary art that distinguishes
between “contemporary” and “present-day” art: “History is not just a relationship between the present and the past – it is equally about the future.
It is this speculative futural moment that definitively separates the concept
of history from memory.”20 For Public Library, the future that participates
in the construction of history does not yet exist, but it is defined as more
than just a project against the present as reflected in the exclusionary, parochially nationalistic, revisionist and increasingly fascist discursive practices of the Croatian political elites. Rather, the future comes into being as
an active and collective construction based on the emancipatory aspects of
historical experiences as future possibilities.
Although defined as an action, the project is not exultantly enthusiastic
about collectivity or the immediacy and affective affinities of its participants, but rather it transcends its local and transient character by taking
up the broader counter-hegemonic struggle for the mutual management
of joint resources. Its endeavour is not limited to the realm of the political
and ideological but is rooted in the repurposing of technological potentials
from the restrictive capitalist game and the reutilization of the existing infrastructure to build a qualitatively different one. While the culture industry adapts itself to the limited success of measures that are geared toward
preventing the free circulation of information by creating new strategies
for pushing information into a form of property and expropriating value

20 Peter Osborne, Anywhere or Not at All: Philosophy of Contemporary Art, Verso, London and New York, 2013, p. 194.

fig. 1
Marcell Mars, Art as Infrastructure: Public Library, installation
view, Really Useful Knowledge, curated by WHW, Museo
Nacional Centro de Arte Reina Sofia, Madrid, 2014.
Photo by Joaquín Cortés and Román Lores / MNCARS.

fig. 2
Public Library, exhibition view, Gallery Nova, Zagreb, 2015.
Photo by Ivan Kuharic.

fig. 3
Written-off: On the Occasion of the 20th Anniversary of Operation
Storm, exhibition detail, Gallery Nova, Zagreb, 2015.
Photo by Ivan Kuharic.

fig. 4
Written-off: On the Occasion of the 20th Anniversary of Operation
Storm, exhibition detail, Gallery Nova, Zagreb, 2015.
Photo by Ivan Kuharic.

fig. 5
Trevor Paglen and Jacob Appelbaum, Autonomy Cube,
installation view, Really Useful Knowledge, curated by WHW,
Museo Nacional Centro de Arte Reina Sofia, Madrid, 2014.
Photo by Joaquín Cortés and Román Lores / MNCARS.

through the control of metadata (information about information),21 Public Library shifts the focus away from aesthetic intention – from unique,
closed, and discrete works – to a database of works and the metabolism
of the database. It creates values through indexing and connectivity, imagined communities and imaginative dialecticization. The web of interpenetration and determination activated by Public Library creates a pedagogical endeavour that also includes a propagandist thrust, if the notion of
propaganda can be recast in its original meaning as “things that must be
disseminated”.
A similar didactic impetus and constructivist praxis is present in the work
Autonomy Cube, which was developed through the combined expertise of
artist and geographer Trevor Paglen and internet security researcher, activist and hacker Jacob Appelbaum. This work, too, we presented in the
Reina Sofia exhibition Really Useful Knowledge, along with Public Library
and other projects that offered a range of strategies and methodologies
through which the artists attempted to think through the disjunction between concrete experience and the abstraction of capital, enlisting pedagogy as a crucial element in organized collective struggles. Autonomy Cube
offers a free, open-access, encrypted internet hotspot that routes internet
traffic over TOR, a volunteer-run global network of servers, relays, and services, which provides anonymous and unsurveilled communication. The
importance of the privacy of the anonymized information that Autonomy
Cube enables and protects is that it prevents so-called traffic analysis – the
tracking, analysis, and theft of metadata for the purpose of anticipating
people’s behaviour and relationships. In the hands of the surveillance
state this data becomes not only a means of steering our tastes, modes of
consumption, and behaviours for the sake of making profit but also, and
more crucially, an effective method and weapon of political control that
can affect political organizing in often still-unforeseeable ways that offer
few reasons for optimism. Visually, Autonomy Cube references minimalist
sculpture (fig. 5) (specifically, Hans Haacke’s seminal piece Condensation
Cube, 1963–1965), but its main creative drive lies in the affirmative salvaging of technologies, infrastructures, and networks that form both the leading organizing principle and the pervasive condition of complex societies,
with the aim of supporting the potentially liberated accumulation of collective knowledge and action. Aesthetic and art-historical references serve
as camouflage or tools for a strategic infiltration that enables expansion of
the movement’s field of influence and the projection of a different (contingent) future. Engagement with historical forms of challenging institutions
becomes the starting point of a poetic praxis that materializes the object of
its striving in the here and now.
21 McKenzie Wark, “Metadata Punk”, in Public Library, pp. 113–117 (see n. 9).

Both Public Library and Autonomy Cube build their autonomy on the dedication and effort of the collective body, without which they would not
exist, rendering this interdependence not as some consensual idyll of cooperation but as conflicting fields that create further information and experiences. By doing so, they question the traditional edifice of art in a way
that supports Peter Osborne’s claim that art is defined not by its aesthetic
or medium-based status, but by its poetics: “Postconceptual art articulates a post-aesthetic poetics.”22 This means going beyond criticality and
bringing into the world something defined not by its opposition to the real,
but by its creation of the fiction of a shared present, which, for Osborne,
is what makes art truly contemporary. And if projects like these become a
kind of political trophy for art institutions, the side the institutions choose
nevertheless affects the common sense of our future.

22 Osborne, Anywhere or Not at All, p. 33.


USDC
Complaint: Elsevier v. SciHub and LibGen
2015


Case 1:15-cv-04282-RWS Document 1 Filed 06/03/15 Page 1 of 16

UNITED STATES DISTRICT COURT
SOUTHERN DISTRICT OF NEW YORK

Index No. 15-cv-4282 (RWS)
COMPLAINT

ELSEVIER INC., ELSEVIER B.V., ELSEVIER LTD.
Plaintiffs,

v.

SCI-HUB d/b/a WWW.SCI-HUB.ORG, THE LIBRARY GENESIS PROJECT d/b/a LIBGEN.ORG, ALEXANDRA ELBAKYAN, JOHN DOES 1-99,
Defendants.

Plaintiffs Elsevier Inc., Elsevier B.V., and Elsevier Ltd. (collectively “Elsevier”),
by their attorneys DeVore & DeMarco LLP, for their complaint against www.sci-hub.org,
www.libgen.org, Alexandra Elbakyan, and John Does 1-99 (collectively the “Defendants”),
allege as follows:

NATURE OF THE ACTION

1. This is a civil action seeking damages and injunctive relief for: (1) copyright infringement under the copyright laws of the United States (17 U.S.C. § 101 et seq.); and (2) violations of the Computer Fraud and Abuse Act, 18 U.S.C. § 1030, based upon Defendants’ unlawful access to, use, reproduction, and distribution of Elsevier’s copyrighted works. Defendants’ actions in this regard have caused and continue to cause irreparable injury to Elsevier and its publishing partners (including scholarly societies) for which it publishes certain journals.

PARTIES

2. Plaintiff Elsevier Inc. is a corporation organized under the laws of Delaware, with its principal place of business at 360 Park Avenue South, New York, New York 10010.

3. Plaintiff Elsevier B.V. is a corporation organized under the laws of the Netherlands, with its principal place of business at Radarweg 29, Amsterdam, 1043 NX, Netherlands.

4. Plaintiff Elsevier Ltd. is a corporation organized under the laws of the United Kingdom, with its principal place of business at 125 London Wall, EC2Y 5AS United Kingdom.

5. Upon information and belief, Defendant Sci-Hub is an individual or organization engaged in the operation of the website accessible at the URL “www.sci-hub.org,” and related subdomains, including but not limited to the subdomain “www.sciencedirect.com.sci-hub.org,”
“www.elsevier.com.sci-hub.org,” “store.elsevier.com.sci-hub.org,” and various subdomains
incorporating the company and product names of other major global publishers (collectively with www.sci-hub.org the “Sci-Hub Website”). The sci-hub.org domain name is registered by
“Fundacion Private Whois,” located in Panama City, Panama, to an unknown registrant. As of
the date of this filing, the Sci-Hub Website is assigned the IP address 31.184.194.81. This IP address is part of a range of IP addresses assigned to Petersburg Internet Network Ltd., a webhosting company located in Saint Petersburg, Russia.

6. Upon information and belief, Defendant Library Genesis Project is an organization which operates an online repository of copyrighted materials accessible through the website located at the URL “libgen.org” as well as a number of other “mirror” websites
(collectively the “Libgen Domains”). The libgen.org domain is registered by “Whois Privacy
Corp.,” located at Ocean Centre, Montagu Foreshore, East Bay Street, Nassau, New Providence,
Bahamas, to an unknown registrant. As of the date of this filing, libgen.org is assigned the IP address 93.174.95.71. This IP address is part of a range of IP addresses assigned to Ecatel Ltd., a web-hosting company located in Amsterdam, the Netherlands.

7. The Libgen Domains include “elibgen.org,” “libgen.info,” “lib.estrorecollege.org,” and “bookfi.org.”

8. Upon information and belief, Defendant Alexandra Elbakyan is the principal owner and/or operator of Sci-Hub. Upon information and belief, Elbakyan is a resident of Almaty, Kazakhstan.

9. Elsevier is unaware of the true names and capacities of the individuals named as Does 1-99 in this Complaint (together with Alexandra Elbakyan, the “Individual Defendants”),
and their residence and citizenship is also unknown. Elsevier will amend its Complaint to allege the names, capacities, residence and citizenship of the Doe Defendants when their identities are learned.

10. Upon information and belief, the Individual Defendants are the owners and operators of numerous websites, including Sci-Hub and the websites located at the various
Libgen Domains, and a number of e-mail addresses and accounts at issue in this case.

11. The Individual Defendants have participated, exercised control over, and benefited from the infringing conduct described herein, which has resulted in substantial harm to
the Plaintiffs.

JURISDICTION AND VENUE

12. This is a civil action arising from the Defendants’ violations of the copyright laws of the United States (17 U.S.C. § 101 et seq.) and the Computer Fraud and Abuse Act (“CFAA”),
18 U.S.C. § 1030. Therefore, the Court has subject matter jurisdiction over this action pursuant to 28 U.S.C. § 1331.

13. Upon information and belief, the Individual Defendants own and operate computers and Internet websites and engage in conduct that injures Plaintiff in this district, while
also utilizing instrumentalities located in the Southern District of New York to carry out the acts complained of herein.

14. Defendants have affirmatively directed actions at the Southern District of New York by utilizing computer servers located in the District without authorization and by
unlawfully obtaining access credentials belonging to individuals and entities located in the
District, in order to unlawfully access, copy, and distribute Elsevier's copyrighted materials
which are stored on Elsevier’s ScienceDirect platform.

15. Defendants have committed the acts complained of herein through unauthorized access to Plaintiffs’ copyrighted materials which are stored and maintained on computer servers located in the Southern District of New York.

16. Defendants have undertaken the acts complained of herein with knowledge that such acts would cause harm to Plaintiffs and their customers in both the Southern District of New York and elsewhere. Defendants have caused the Plaintiff injury while deriving revenue from interstate or international commerce by committing the acts complained of herein. Therefore, this Court has personal jurisdiction over Defendants.

17. Venue in this District is proper under 28 U.S.C. § 1391(b) because a substantial part of the events giving rise to Plaintiffs’ claims occurred in this District and because the property that is the subject of Plaintiffs’ claims is situated in this District.

FACTUAL ALLEGATIONS
Elsevier’s Copyrights in Publications on ScienceDirect
18. Elsevier is a world leading provider of professional information solutions in the Science, Medical, and Health sectors. Elsevier publishes, markets, sells, and licenses academic textbooks, journals, and examinations in the fields of science, medicine, and health. The majority of Elsevier’s institutional customers are universities, governmental entities, educational institutions, and hospitals that purchase physical and electronic copies of Elsevier’s products and access to Elsevier’s digital libraries. Elsevier distributes its scientific journal articles and book chapters electronically via its proprietary subscription database “ScienceDirect” (www.sciencedirect.com). In most cases, Elsevier holds the copyright and/or exclusive distribution rights to the works available through ScienceDirect. In addition, Elsevier holds trademark rights in “Elsevier,” “ScienceDirect,” and several other related trade names.

19. The ScienceDirect database is home to almost one-quarter of the world’s peer-reviewed, full-text scientific, technical and medical content. The ScienceDirect service features sophisticated search and retrieval tools for students and professionals which facilitates access to over 10 million copyrighted publications. More than 15 million researchers, health care professionals, teachers, students, and information professionals around the globe rely on ScienceDirect as a trusted source of nearly 2,500 journals and more than 26,000 book titles.

20. Authorized users are provided access to the ScienceDirect platform by way of non-exclusive, non-transferable subscriptions between Elsevier and its institutional customers. According to the terms and conditions of these subscriptions, authorized users of ScienceDirect must be users affiliated with the subscriber (e.g., full-time and part-time students, faculty, staff and researchers of subscriber universities and individuals using computer terminals within the library facilities at the subscriber for personal research, education or other non-corporate use).

21. A substantial portion of American research universities maintain active subscriptions to ScienceDirect. These subscriptions, under license, allow the universities to provide their faculty and students access to the copyrighted works within the ScienceDirect database.

22. Elsevier stores and maintains the copyrighted material available in ScienceDirect on servers owned and operated by a third party whose servers are located in the Southern District of New York and elsewhere. In order to optimize performance, these third-party servers collectively operate as a distributed network which serves cached copies of Elsevier’s copyrighted materials by way of particular servers that are geographically close to the user. For example, a user that accesses ScienceDirect from a University located in the Southern District of New York will likely be served that content from a server physically located in the District.

Authentication of Authorized University ScienceDirect Users
23. Elsevier maintains the integrity and security of the copyrighted works accessible on ScienceDirect by allowing only authenticated users access to the platform. Elsevier authenticates educational users who access ScienceDirect through their affiliated university’s subscription by verifying that they are able to access ScienceDirect from a computer system or network previously identified as belonging to a subscribing university.

24. Elsevier does not track individual educational users’ access to ScienceDirect. Instead, Elsevier verifies only that the user has authenticated access to a subscribing university.

25. Once an educational user authenticates his computer with ScienceDirect on a university network, that computer is permitted access to ScienceDirect for a limited amount of time without re-authenticating. For example, a student could access ScienceDirect from their laptop while sitting in a university library, then continue to access ScienceDirect using that laptop from their dorm room later that day. After a specified period of time has passed, however, a user will have to re-authenticate his or her computer’s access to ScienceDirect by connecting to the platform through a university network.
26. As a matter of practice, educational users access university networks, and thereby
authenticate their computers with ScienceDirect, primarily through one of two methods. First,
the user may be physically connected to a university network, for example by taking their
computer to the university’s library. Second, the user may connect remotely to the university’s
network using a proxy connection. Universities offer proxy connections to their students and
faculty so that those users may access university computing resources – including access to
research databases such as ScienceDirect – from remote locations which are unaffiliated with the
university. This practice facilitates the use of ScienceDirect by students and faculty while they
are at home, travelling, or otherwise off-campus.
Defendants’ Unauthorized Access to University Proxy Networks to Facilitate Copyright Infringement
27. Upon information and belief, Defendants are reproducing and distributing
unauthorized copies of Elsevier’s copyrighted materials, unlawfully obtained from
ScienceDirect, through Sci-Hub and through various websites affiliated with the Library Genesis
Project. Specifically, Defendants utilize their websites located at sci-hub.org and at the Libgen
Domains to operate an international network of piracy and copyright infringement by
circumventing legal and authorized means of access to the ScienceDirect database. Defendants’
piracy is supported by the persistent intrusion and unauthorized access to the computer networks
of Elsevier and its institutional subscribers, including universities located in the Southern District
of New York.
28. Upon information and belief, Defendants have unlawfully obtained and continue
to unlawfully obtain student or faculty access credentials which permit proxy connections to
universities which subscribe to ScienceDirect, and use these credentials to gain unauthorized
access to ScienceDirect.
29. Upon information and belief, Defendants have used and continue to use such
access credentials to authenticate access to ScienceDirect and, subsequently, to obtain
copyrighted scientific journal articles therefrom without valid authorization.
30. The Sci-Hub website requires user interaction in order to facilitate its illegal
copyright infringement scheme. Specifically, before a Sci-Hub user can obtain access to
copyrighted scholarly journals, articles, and books that are maintained by ScienceDirect, he must
first perform a search on the Sci-Hub page. A Sci-Hub user may search for content using either
(a) a general keyword-based search, or (b) a journal, article or book identifier (such as a Digital
Object Identifier, PubMed Identifier, or the source URL).
31. When a user performs a keyword search on Sci-Hub, the website returns a proxied
version of search results from the Google Scholar search database.[1] When a user selects one of
the search results, if the requested content is not available from the Library Genesis Project,
Sci-Hub unlawfully retrieves the content from ScienceDirect using the access previously obtained.
Sci-Hub then provides a copy of that article to the requesting user, typically in PDF format. If,
however, the requested content can be found in the Library Genesis Project repository, upon
information and belief, Sci-Hub obtains the content from the Library Genesis Project repository
and provides that content to the user.

[1] Google Scholar provides its users the capability to search for scholarly literature, but does not provide the
full text of copyrighted scientific journal articles accessible through paid subscription services such as
ScienceDirect. Instead, Google Scholar provides bibliographic information concerning such articles along with a
link to the platform through which the article may be purchased or accessed by a subscriber.

32. When a user searches on Sci-Hub for an article available on ScienceDirect using a
journal or article identifier, the user is redirected to a proxied version of the ScienceDirect page
where the user can download the requested article at no cost. Upon information and belief,
Sci-Hub facilitates this infringing conduct by using unlawfully-obtained access credentials to
university proxy servers to establish remote access to ScienceDirect through those proxy servers.
If, however, the requested content can be found in the Library Genesis Project repository, upon
information and belief, Sci-Hub obtains the content from it and provides it to the user.
33. Upon information and belief, Sci-Hub engages in no other activity other than the
illegal reproduction and distribution of digital copies of Elsevier’s copyrighted works and the
copyrighted works of other publishers, and the encouragement, inducement, and material
contribution to the infringement of the copyrights of those works by third parties – i.e., the users
of the Sci-Hub website.
34. Upon information and belief, in addition to the blatant and rampant infringement
of Elsevier’s copyrights as described above, the Defendants have also used the Sci-Hub website
to earn revenue from the piracy of copyrighted materials from ScienceDirect. Sci-Hub has at
various times accepted funds through a variety of payment processors, including PayPal,
Yandex, WebMoney, QiQi, and Bitcoin.
Sci-Hub’s Use of the Library Genesis Project as a Repository for Unlawfully-Obtained Scientific Journal Articles and Books
35. Upon information and belief, when Sci-Hub pirates and downloads an article from
ScienceDirect in response to a user request, in addition to providing a copy of that article to that
user, Sci-Hub also provides a duplicate copy to the Library Genesis Project, which stores the
article in a database accessible through the Internet. Upon information and belief, the Library
Genesis Project is designed to be a permanent repository of this and other illegally obtained
content.
36. Upon information and belief, in the event that a Sci-Hub user requests an article
which has already been provided to the Library Genesis Project, Sci-Hub may provide that user
access to a copy provided by the Library Genesis Project rather than re-download an additional
copy of the article from ScienceDirect. As a result, Defendants Sci-Hub and Library Genesis
Project act in concert to engage in a scheme designed to facilitate the unauthorized access to and
wholesale distribution of Elsevier’s copyrighted works legitimately available on the
ScienceDirect platform.
The Library Genesis Project’s Unlawful Distribution of Plaintiff’s Copyrighted Works
37. Access to the Library Genesis Project’s repository is facilitated by the website
“libgen.org,” which provides its users the ability to search, download content from, and upload
content to, the repository. The main page of libgen.org allows its users to perform searches in
various categories, including “LibGen (Sci-Tech),” and “Scientific articles.” In addition to
searching by keyword, users may also search for specific content by various other fields,
including title, author, periodical, publisher, or ISBN or DOI number.
38. The libgen.org website indicates that the Library Genesis Project repository
contains approximately 1 million “Sci-Tech” documents and 40 million scientific articles. Upon
information and belief, the large majority of these works is subject to copyright protection and is
being distributed through the Library Genesis Project without the permission of the applicable
rights-holder. Upon information and belief, the Library Genesis Project serves primarily, if not
exclusively, as a scheme to violate the intellectual property rights of the owners of millions of
copyrighted works.
39. Upon information and belief, Elsevier owns the copyrights in a substantial
number of copyrighted materials made available for distribution through the Library Genesis
Project. Elsevier has not authorized the Library Genesis Project or any of the Defendants to
copy, display, or distribute through any of the complained of websites any of the content stored
on ScienceDirect to which it holds the copyright. Among the works infringed by the Library
Genesis Project are the “Guyton and Hall Textbook of Medical Physiology,” and the article “The
Varus Ankle and Instability” (published in Elsevier’s journal “Foot and Ankle Clinics of North
America”), each of which is protected by Elsevier’s federally-registered copyrights.
40. In addition to the Library Genesis Project website accessible at libgen.org, users
may access the Library Genesis Project repository through a number of “mirror” sites accessible
through other URLs. These mirror sites are similar, if not identical, in functionality to
libgen.org. Specifically, the mirror sites allow their users to search and download materials from
the Library Genesis Project repository.
FIRST CLAIM FOR RELIEF
(Direct Infringement of Copyright)
41. Elsevier incorporates by reference the allegations contained in paragraphs 1-40
above.
42. Elsevier’s copyright rights and exclusive distribution rights to the works available
on ScienceDirect (the “Works”) are valid and enforceable.
43. Defendants have infringed on Elsevier’s copyright rights to these Works by
knowingly and intentionally reproducing and distributing these Works without authorization.
44. The acts of infringement described herein have been willful, intentional, and
purposeful, in disregard of and indifferent to Plaintiffs’ rights.
45. Without authorization from Elsevier, or right under law, Defendants are directly
liable for infringing Elsevier’s copyrighted Works pursuant to 17 U.S.C. §§ 106(1) and/or (3).
46. As a direct result of Defendants’ actions, Elsevier has suffered and continues to
suffer irreparable harm for which Elsevier has no adequate remedy at law, and which will
continue unless Defendants’ actions are enjoined.
47. Elsevier seeks injunctive relief and costs and damages in an amount to be proven
at trial.
SECOND CLAIM FOR RELIEF
(Secondary Infringement of Copyright)
48. Elsevier incorporates by reference the allegations contained in paragraphs 1-40
above.
49. Elsevier’s copyright rights and exclusive distribution rights to the works available
on ScienceDirect (the “Works”) are valid and enforceable.
50. Defendants have infringed on Elsevier’s copyright rights to these Works by
knowingly and intentionally reproducing and distributing these Works without license or other
authorization.
51. Upon information and belief, Defendants intentionally induced, encouraged, and
materially contributed to the reproduction and distribution of these Works by third party users of
websites operated by Defendants.
52. The acts of infringement described herein have been willful, intentional, and
purposeful, in disregard of and indifferent to Elsevier’s rights.
53. Without authorization from Elsevier, or right under law, Defendants are directly
liable for third parties’ infringement of Elsevier’s copyrighted Works pursuant to 17 U.S.C. §§
106(1) and/or (3).
54. Upon information and belief, Defendants profited from third parties’ direct
infringement of Elsevier’s Works.
55. Defendants had the right and the ability to supervise and control their websites
and the third party infringing activities described herein.
56. As a direct result of Defendants’ actions, Elsevier has suffered and continues to
suffer irreparable harm for which Elsevier has no adequate remedy at law, and which will
continue unless Defendants’ actions are enjoined.
57. Elsevier seeks injunctive relief and costs and damages in an amount to be proven
at trial.
THIRD CLAIM FOR RELIEF
(Violation of the Computer Fraud & Abuse Act)
58. Elsevier incorporates by reference the allegations contained in paragraphs 1-40
above.
59. Elsevier’s computers and servers, the third-party computers and servers which
store and maintain Elsevier’s copyrighted works for ScienceDirect, and Elsevier’s customers’
computers and servers which facilitate access to Elsevier’s copyrighted works on ScienceDirect,
are all “protected computers” under the Computer Fraud and Abuse Act (“CFAA”).
60. Defendants (a) knowingly and intentionally accessed such protected computers
without authorization and thereby obtained information from the protected computers in a
transaction involving an interstate or foreign communication (18 U.S.C. § 1030(a)(2)(C)); and
(b) knowingly and with an intent to defraud accessed such protected computers without
authorization and obtained information from such computers, which Defendants used to further
the fraud and obtain something of value (18 U.S.C. § 1030(a)(4)).
61. Defendants’ conduct has caused, and continues to cause, significant and
irreparable damages and loss to Elsevier.
62. Defendants’ conduct has caused a loss to Elsevier during a one-year period
aggregating at least $5,000.
63. As a direct result of Defendants’ actions, Elsevier has suffered and continues to
suffer irreparable harm for which Elsevier has no adequate remedy at law, and which will
continue unless Defendants’ actions are enjoined.
64. Elsevier seeks injunctive relief, as well as costs and damages in an amount to be
proven at trial.
PRAYER FOR RELIEF
WHEREFORE, Elsevier respectfully requests that the Court:
A. Enter preliminary and permanent injunctions, enjoining and prohibiting Defendants,
their officers, directors, principals, agents, servants, employees, successors and
assigns, and all persons and entities in active concert or participation with them, from
engaging in any of the activity complained of herein or from causing any of the injury
complained of herein and from assisting, aiding, or abetting any other person or
business entity in engaging in or performing any of the activity complained of herein
or from causing any of the injury complained of herein;
B. Enter an order that, upon Elsevier’s request, those in privity with Defendants and
those with notice of the injunction, including any Internet search engines, Web
Hosting and Internet Service Providers, domain-name registrars, and domain name
registries or their administrators that are provided with notice of the injunction, cease
facilitating access to any or all domain names and websites through which Defendants
engage in any of the activity complained of herein;
C. Enter an order that, upon Elsevier’s request, those organizations which have
registered Defendants’ domain names on behalf of Defendants shall disclose
immediately to Plaintiffs all information in their possession concerning the identity of
the operator or registrant of such domain names and of any bank accounts or financial
accounts owned or used by such operator or registrant;
D. Enter an order that, upon Elsevier’s request, the TLD Registries for the Defendants’
websites, or their administrators, shall place the domain names on
registryHold/serverHold as well as serverUpdate, ServerDelete, and serverTransfer
prohibited statuses, for the remainder of the registration period for any such website.
E. Enter an order canceling or deleting, or, at Elsevier’s election, transferring the domain
name registrations used by Defendants to engage in the activity complained of herein
to Elsevier’s control so that they may no longer be used for illegal purposes;
F. Enter an order awarding Elsevier its actual damages incurred as a result of
Defendants’ infringement of Elsevier’s copyright rights in the Works and all profits
Defendant realized as a result of its acts of infringement, in amounts to be determined
at trial; or in the alternative, awarding Elsevier, pursuant to 17 U.S.C. § 504, statutory
damages for the acts of infringement committed by Defendants, enhanced to reflect
the willful nature of the Defendants’ infringement;
G. Enter an order disgorging Defendants’ profits;
