digitization in Adema 2009


that
one can find virtually all Ebooks and texts one needs via p2p networks and
other file sharing communities (the true
[Darknet](http://en.wikipedia.org/wiki/Darknet_\(file_sharing\)) in a way) –
more and more people are offering (and asking for!) selections of texts and
books (including the ones by Adorno) on openly available websites and blogs,
or they are scanning them and offering them for (educational) use on their
domains. Although the Internet is mostly known for the pirating and
dissemination of pirated movies and music, copyright protected textual content
has (of course) always been spread too. But with the rise of ‘born digital’
text content, and with the help of massive digitization efforts like Google
Books (and accompanying Google Books [download
tools](http://www.codeplex.com/GoogleBookDownloader)) accompanied by the
appearance of better (and cheaper) scanning equipment, the movement of
‘openly’ spreading (pirated) texts (whether or not focusing on education and
‘fair use’) seems to be growing fast.

The direct harm (to both the producers and their publishers) of the free
online availability of (in copyright) texts is also maybe less clear than for
instance with music and films. Many feel texts and books will still be
preferred to be read in print, making the online and free availability of text
nothing more than a marketing tool for the sales of the printed


digitization in Barok 2014


question about what something is, what
kinds, parts and properties does it have, and so on, can be consulted in
existing documents or generate new documents based on collection of data [in]
the field and through experiment, before proceeding to reasoning [arguments
and deductions]. Formulation of a query is determined by protocols providing
access to documents, which means that there is a difference between collecting
data outside the archive (the undocumented, ie. in the field and through
experiment), consulting with a person--an archivist (expert, librarian,
documentalist), and consulting with a database storing documents. The
phenomena such as [deepening] of specialization and throughout digitization
[have given] privilege to the database as [a|the] [fundamental] means for
research. Obviously, this is a very recent [phenomenon]. Queries were once
formulated in natural language; now, given the fact that databases are queried
[using] SQL language, their interfaces are mere extensions of it and
researchers pose their questions by manipulating dropdowns, checkboxes and
input boxes mashed together on a flat screen being run by software that in
turn translates them into a long line of conditioned _SELECTs_ and _JOINs_
performed on tables of data.
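
As a minimal sketch of that translation, assuming a hypothetical two-table catalogue (`authors`, `documents`) and hypothetical form fields (`author_dropdown`, `year_from`), none of which come from the talk itself, an interface might turn dropdown and input-box values into a conditioned SELECT with a JOIN roughly like this:

```python
import sqlite3


def build_query(form):
    """Translate form values into one SQL statement plus its parameters.

    Table, column and form-field names are hypothetical, for illustration only.
    """
    sql = ("SELECT documents.title FROM documents "
           "JOIN authors ON authors.id = documents.author_id")
    conditions, params = [], []
    if form.get("author_dropdown"):          # value picked in a dropdown
        conditions.append("authors.name = ?")
        params.append(form["author_dropdown"])
    if form.get("year_from"):                # value typed into an input box
        conditions.append("documents.year >= ?")
        params.append(int(form["year_from"]))
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql + " ORDER BY documents.year", params


def demo_catalogue():
    """Build a tiny in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT,
                                year INTEGER, author_id INTEGER);
        INSERT INTO authors VALUES (1, 'Author A'), (2, 'Author B');
        INSERT INTO documents VALUES (1, 'First text', 1990, 1),
                                     (2, 'Second text', 2005, 1),
                                     (3, 'Third text', 2010, 2);
    """)
    return conn


conn = demo_catalogue()
sql, params = build_query({"author_dropdown": "Author A", "year_from": "2000"})
titles = [row[0] for row in conn.execute(sql, params)]
```

The researcher only ever sees the form; the question actually posed to the archive is the generated string in `sql`.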

Specialization, digitization and networking have changed the language of
questioning. Inquiry, once attached to the flesh and paper, has been
[entrusted] to the digital and networked. Researchers are querying the black
box.

C

Searching in a collection of [amassed/assembled] [tangible] documents (ie.
bookshelf) is different from searching in a systematically structured
repository (library) and even more so from searching in a digital repository
(digital library). Not that they are mutually exclusive. One can devise
structures and algorithms to search through a printed text, or read books in a
library one by one. They are rather [models] [embodying] various [processes]
associated with the query. These properties of the q


s can be also
stated explicitly, by indexing tables and then referring them from a
particular point in the text. The same goes for explicit associations made
between blocks of the text by means of indexed paragraphs, chapters or pages.

From this it follows that all utterances point to the following utterance by the
nature of sequential order, and indexing provides means for pointing elsewhere
in the document as well.

A lot can be said about references to other texts. Here, to spare time, I
would refer you to a talk I gave a few months ago, which is online
(http://monoskop.org/Talks/Communing_Texts).

This is still the realm of print. What happens with a document when it is
digitized?

Digitization breaks a document into units of which each is assigned a numbered
position in the sequence of the document. From this perspective digitization
can be viewed as a total indexation of the document. It is converted into
units rendered for machine operations. This sequentiality is made explicit, by
means of an underlying index.

Sequences and chains are orders of one dimension. Their one-dimensional
ordering allows addressability of each element and [random] access. [Jumps]
between [random] addresses are still sequential, processing elements one at a
time.
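
A minimal sketch of this total indexation, assuming for illustration that the units are whitespace-separated words (an actual digitization may just as well work on characters, bytes or page images) and using a hypothetical `digitize` helper:

```python
def digitize(text):
    """Break a text into units and pair each unit with its numbered position."""
    units = text.split()
    return list(enumerate(units))  # [(0, 'What'), (1, 'happens'), ...]


document = digitize("What happens with a document when it is digitized")

# Addressability: any unit can be reached directly by its position.
position, unit = document[4]

# Jumps between arbitrary addresses are still processed one element at a time.
visited = [document[address][1] for address in (8, 0, 4)]
```

Here the order of the text and the order of the index coincide, so a position is all that is needed to point anywhere in the document.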

## (K) The index

* [![](/images/thumb/2/27/Summa_confessorum.1310.jpg/103px-Summa_confessorum.1310.jpg)](/File:Summa_confessorum.1310.jpg)

Summa confessorum [1297-98], 1310.


digitization in Barok 2014


to underestimate the value and benefits of library work, nor the
importance of discipline-centered writing or of the recognition of the oeuvre of
the author. But consider an author working on an article who in the early phase
of his research needs to prepare a bibliography on the activity of Fluxus in central Europe or on the use of documentary film in education. Such research cuts
through national boundaries and/or branches of disciplines and he is left to travel
not only to locate artefacts, protagonists and experts in the field but also to find
literature, which in turn makes even the mere process of compiling a bibliography
a relatively demanding and costly activity.

In this sense, the digitization of publications and archival material, providing their
free online access and enabling fulltext search, in other words “open access”, catalyzes research across political-geographical and disciplinary configurations. Because while the index of the printed book contains only selected terms and for
the purposes of searching the index across several books the researcher has to have
them all at hand, the software-enabled search in digitized texts (with a good OCR)
works with the index of every single term in all of them.
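
The contrast with the selective back-of-book index can be sketched as follows. This is an illustrative inverted index over made-up sample sentences, assuming clean OCR output and a hypothetical `full_text_index` helper; it is not a description of any particular search engine:

```python
import re
from collections import defaultdict


def full_text_index(books):
    """Map every single term to the set of books in which it occurs."""
    index = defaultdict(set)
    for title, text in books.items():
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(title)
    return index


def search(index, term):
    """Return the titles containing the term, in alphabetical order."""
    return sorted(index.get(term.lower(), set()))


# Made-up sample texts standing in for OCRed books.
books = {
    "Book A": "Fluxus artists worked across central Europe.",
    "Book B": "Documentary film in education, and Fluxus film.",
}
index = full_text_index(books)
hits = search(index, "Fluxus")
```

Because every term of every book is indexed, a single lookup spans the whole collection without the researcher having any of the volumes at hand.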
This kind of research also obviously benefits from online translation tools, multilingual case bibliographies online, as well as second hand bookstores and small
specialized


rly those created in software,
is electronic. However the exceptions are significant. They include works made,
typeset, illustrated and copied manually, such as manuscripts written on paper
or other media, by hand or using a typewriter or other mechanical means, and
other pre-digital techniques such as lithography, offset, etc., or various forms of
writing such as clay tablets, rolls, codices, in other words the history of print and
publishing in its striking variety, all of which provide authors and publishers with
heterogeneous means of expression. Although this “segment” is today generally
perceived as artists’ books interesting primarily for collectors, the current process
of massive digitization has triggered the revival, comebacks, transformations and
novel approaches to publishing. And it is these publications whose nature is closer
to the label ‘book’ rather than the automated electro-chemical version of the offset
lithography of digital files on acid-free paper.
Despite that, it is remarkable to observe a view spreading among publishers that
books created in software are books with attributes we have known for ages. On
top of that there is a tendency to handle files such as PDFs, EPUBs, MOBIs and
others as if they are printed books, even subject to the rules of limited edition, a
consequence of which can be found in the rise of so-called electronic libraries that
“borrow


digitization in Bodo 2014


the most comprehensive scientific pirate
libraries on the net. These sites offer free access to hundreds of thousands of books and millions of
journal articles. In this contribution we try to understand the factors that led to the development of
these sites, and the sociocultural and legal conditions that enable them to operate under hostile legal
and political conditions. Through the reconstruction of the micro-histories of peer produced online text
collections that played a central role in the history of RuNet, we are able to link the formal and informal
support for these sites to the specific conditions developed under the Soviet and post Soviet times.

(pirate) libraries on the net
The digitization and collection of texts was one of the very first activities enabled by computers. Project
Gutenberg, the first in line of digital libraries was established as early as 1971. By the early nineties, a
number of online electronic text archives emerged, all hoping to finally realize the dream that was
chased by humans ever since the first library: the collection of everything (Battles, 2004), the Memex
(Bush, 1945), the Mundaneum (Rieusset-Lemarié, 1997), the Library of Babel (Borges, 1998). It did not
take long to realize that the dream was still beyond reach: the information storage and retrieval
technology might have been ready, but copyright law, for the foreseeable future was not. Copyri


tly in Russian, of course.
Fidonet was almost all typed in by hand. […] Maybe several thousand of the most important books,
novels that "everyone must read" and such stuff. People typed in poetry, smaller prose pieces. I have
myself read a sci-fi novel printed on a mainframe, which was obviously typed in. This novel was by
Strugatski brothers. It was not prohibited or dissident, but just impossible to buy in the stores. These
were culturally important, cult novels, so people typed them in. […] At this point it became clear that
there was a lot of value in having a plaintext file with some novels, and the most popular novels were first
digitized in this way.”
The next stage in the text digitization started around 1994. By that time growing numbers of people had
computers, scanning peripherals, OCR software. Russian internet and PC penetration while extremely
low overall in the 1990s (0.1% of the population having internet access in 1994, growing to 8.3% by
2003), began to make inroads in educational and scientific institutions and among Moscow and
St.Petersburg elites, who were often the critical players in these networks. As access to technologies
increased a much wider array of people began to digitize their favorite texts, and these collections began
to circulate, first via CD-ROMs, later via the internet.
One such collection belonged to Maxim Moshkov, who published his library under the name lib.ru in
1994. Moshkov was a graduate of the Moscow State University Department of Mechanics and
Mathematics, which played a large role in the digitization of scientific works. After graduation, he started
to work for the Scientific Research Institute of System Development, a computer science institute
associated with the Russian Academy of Sciences. He describes the early days of his collection as follows:
“ I began to collect electronic texts in 1990, on a desktop computer. When I got on the Internet in 1994, I
found lots of sites with texts. It was like a dream came true: there they were, all the desired books. But
these collections were in a dreadful state! Incompatible formats, different encodings, missing content. I
had to spend hours scouring the different sites and directories to find something.
As a result, I decided to convert all t


ternet: I sought out and pulled texts from the network, which were
lying there freely accessible. Slowly the library grew, and the audience increased with it. People started
to send books to me, because they were easier to read in my collection. And the time came when I
stopped surfing the internet for books: regular readers are now sending me the books. Day after day I get
about 100 emails, and 10-30 of them contain books. So many books were sent in, that I did not have time
to process them. Authors, translators and publishers also started to send texts. They all needed the
library.”(Мошков, 1999)

In the second half of the 1990’s, the Russian Internet—RuNet—was awash in book digitization projects.
With the advent of scanners, OCR technology, and the Internet, the work of digitization eased
considerably. Texts migrated from print to digital and sometimes back to print again. They circulated
through different collections, which, in turn, merged, fell apart, and re-formed. Digital libraries with the
mission to collect and consolidate these free-floating texts sprang up by the dozens.
Such digital librarianship was the antithesis of official Soviet book culture: it was free, bottom-up,
democratic, and uncensored. It also offered a partial remedy to problems created by the post-Soviet
collapse of the economy: the impoverishment of libraries, readers, and publishers. In this context, book
digitization and collecting also offered a sense of political, economic and cultural agency, with parallels
to the copying and distribution of texts in Soviet times. The capacity to scale up these practices coincided
with the moment when anti-totalitarian social sentiments were the strongest, and economic needs the
direst.
The unprecedented bloom of digital librarianship is the result of the superimposition of multiple waves
of distinct transformations: technological, political, economic and social. “Maksim Moshkov's Library”
was ground zero for this convergence and soon became a central point of exchange for the community
engaged in text digitization and collection:
[At the outset] there were just a couple of people who started scanning books in large quantities. Literally
hundreds of books. Others started proofreading, etc. There was a huge hole in the market for books.
Science fiction, adventure, crime fiction, all of this was hugely in demand by the public. So lib.ru was to a
large part the response, and was filled by those books that people most desired and most valued.
For years, lib.ru integrated as much as it could of the different digital libraries flourishing in the RuNet. By
doing so, it preserved the collections of the many short-lived libraries.
This process of collection slowed in the early 2000’s. By that time, lib.ru had


textbooks and monographs of all time, in
all fields of natural science.
There was never any commercial support. The kolhoz group never had a web site with a database, like
most projects today. They had an ftp server with files, and the access to ftp was given by PM in a forum.
This ftp server was privately supported by one of the members (who was an academic researcher, like
most kolhoz members). The files were distributed directly by burning files on writable DVDs and giving the

4

DJVU is a file format that revolutionized online book distribution the way mp3 revolutionized the online music
distribution. For books that contain graphs, images and mathematical formulae scanning is the only digitization
option. However, the large number of resulting image files is difficult to handle. The DJVU file format allows for the
images of scanned book pages to be stored in the smallest possible file size, which makes it the perfect medium for
the distribution of scanned e-books.

Draft Manuscript, 11/4/2014, DO NOT CITE!
DVDs away. Later, the ftp access was closed to the public, and only a temporary file-swapping ftp server
remained. Today the kolhoz DVD releases are mostly spread via torrents.” 5
Kolhoz amassed around fifty thousand documents, the mexmat collection of the Moscow State
University Department of Mechanics and Mathematics (Moshkov’s alma mater) was around the same
size, the


what we already have. So growth is defined by newly
scanned or issued books. Also, the quality of the collection is represented not by the number of books but
by the amount of knowledge it contains. [ALEPH] does not need to grow more and I am not the only one
among us who thinks so. […]
We have absolutely no idea who sends books in. It is practically impossible to know, because there are a
million books. We gather huge collections which eliminate any traces of the original uploaders.
My expectation is that new arrivals will dry up. Not completely, as I described above, some books will
always be scanned or rescanned (it nowadays happens quite surprisingly often) and the overall process of
digitization cannot and should not be stopped. It is also hard to say when the slowdown will occur: I
expected it about a year ago, but then library.nu got shut down and things changed dramatically in many
respects. Now we are "in charge" (we had been the largest anyways, just now everyone thinks we are in
5

Anonymous source #1

charge) and there has been a temporary rise in the book inflow. At the moment, relatively small or
previously unseen collections are being integrated into [ALEPH]. Perhaps in a year it will saturate.
However, intuition is not a good guide. There are dynamic processes responsible for eBook availability. If
publishers massively digiti


l libraries and their emerged in a period of double transformation: the post-Soviet copyright
system had to adopt global norms, while the global norms struggled to adapt to the emergence of digital
copying.
The first post-Soviet decade produced new copyright laws that conformed with some of the international
norms advocated by Western rightsholders, but little legal clarity or enforceability (Sezneva & Karaganis,
2011). Under such conditions, informally negotiated copynorms set in to fill the void of non-existent,
unreasonable, or unenforceable laws. The pirate libraries in the RuNet are as much regulated by such
norms as by the actual laws themselves.
During most of the 1990’s user-driven digitization and archiving was legal, or to be more exact, wasn’t
illegal. The first Russian copyright law, enacted in 1993, did not cover “internet rights” until a 2006
amendment (Budylin & Osipova, 2007; Elst, 2005, p. 425). As a result, many argued (including the
Moscow prosecutor’s office), that the distribution of copyrighted works via the internet was not
copyright infringement. Authors and publishers, who saw their works appear in digital form, and
circulated via CD-ROMs and the internet, had to rely on informal norms, still in development, to establish
control over their texts vis-à-vis enthusiastic collectors and for-profit entrepreneurs.
The HARRYFAN CD was one of the early examples of


ements. And it is to grow and prosper. […] I
simply want the books to find their readers because I am afraid to live in a world where no one reads
books. This is already the case in America, and it is speeding up with us. I don’t just want to derail this
process, I would like to turn it around.”

Moshkov played a crucial role in consolidating copynorms in the Russian digital publishing domain. His
reputation and place in the Russian literary domain is marked by a number of prizes12, and the library’s
continued existence. This place was secured by a number of closely intertwined factors:

* Framing and anchoring the digitization and distribution practice in the library tradition.
* The non-profit status of the enterprise.
* Respecting the wishes of the rights holders even if he was not legally obliged to do so.
* Maintaining active communication with the different stakeholders in the community, including authors and readers.
* Responding to a clear gap in affordable, legal access.
* Conservatism with regard to the book, anchored in the argument that digital texts are not substitutes for printed matter.

Many other digital libraries tried to follow Moshkov’s formula, but the times were changing. Internet and
computer access left the sub-cultural niches and became mainstream; commercialization became a
viable option and thus


digitization in Bodo 2015


Creativity: Creative values, Cultural Heritage Institutions and Systems of Intellectual Property, Ashgate

printer printouts of sci-fi classics downloaded from gopher servers is a shared experience of anyone who
had access to computers and the internet before it was known as the World Wide Web.
Computers thus added fresh momentum to the efforts of realizing the age-old dream of the universal
library (Battles, 2004). Digital technologies offered a breakthrough in many of the issues that previously
posed serious obstacles to text collection: storage, search, preservation, access have all become cheaper
and easier than ever before. On the other hand, a number of key issues remained unresolved: digitization
was a slow and cumbersome process, while the screen proved to be too inconvenient, and the printer too
costly an interface between the text file and the reader. In any case, ultimately it wasn’t these issues that
put a brake on the proliferation of digital libraries. Rather, it was the realization that there are legal limits
to the digitization, storage, distribution of copyrighted works on the digital networks. That realization
soon rendered many text collections in the emerging digital library scene inaccessible.
Legal considerations did not destroy this chaotic, emergent digital librarianship and the collections the ad hoc, accidental and professional librarians put together. The text collections were far too valuable to
simply delete them from the servers. Instead, what happened to most of these collections was that they
retreated from the public view, back into the access-controlled shadows of darknets. Yesterday’s gophers
and anonymous ftp servers turned into closed, membership only ftp servers, local shared libraries residing
on the intranets of various academic, business institutions and private archives stored on local hard drives.
The early digital libraries turned into book piracy sites and into the kernels of today’s shadow libraries.
Libraries and other major actors who decided to start large-scale digitization programs soon
found out that if they wanted to avoid costly lawsuits, they had to limit their activities to works in the
public domain. While the public domain is riddled with mind-bogglingly complex and unresolved legal
issues, it is at least still significantly less complicated to deal with than copyrighted and orphan works.
Legally more innovative, (or as some would say, adventurous) companies, such as Google and Microsoft,
who thought they had sufficient resources to sort out the legal issues soon had to abandon their programs
or put them on hold until the legal issues were sorted out.
There was, however, a large group of disenfranchised readers, library patrons, author


is only available for that 40% of books in the Aleph catalogue that had an ISBN number
on file. The titles without a valid ISBN number tend to be older, Russian language titles, in general with low
expected print and e-book availability.
4
Download data is based on the logs provided by one of the shadow library services which offers the books in
Aleph’s catalogue as well as other works also free and without any restraints or limitations.

Bodó B. (2015): Libraries in the post-scarcity era.
in: Porsdam (ed): Copyrighting Creativity: Creative values, Cultural Heritage Institutions and Systems of Intellectual Property, Ashgate

scarcity in physical copies is overcome through distributed digitization; the artificial source of scarcity
created by copyright protection is overcome through infringement. The liberation from both constraints is
necessary to create a truly scarcity-free environment and to release the potential of the library in the post-scarcity age.
Aleph is also an ongoing demonstration of the fact that under the condition of non-scarcity, the library can
be a decentralized, distributed, commons-based institution created and maintained through peer
production (Benkler, 2006). The message of Aleph is clear: users left to their own devices, can produce a
library by themselves for themselves. In fact, users are the library. And when everyone has the means to
digitize, collect, ca


ed the rights of
European libraries to digitize books in their collection if that is necessary to give access to them in digital
formats on their premises, it also created new uncertainties by stating that libraries may not digitize their
entire collections (Rosati, 2014a).
US libraries face a similar situation, both in terms of the narrowly defined exceptions in which libraries
can operate, and the huge uncertainty regarding the limits of fair use in the digital library context. US
rights holders challenged both Google’s (Authors Guild v Google) and the libraries’ (Authors Guild v
HathiTrust) rights to digitize copyrighted works. While there seems to be a consensus of courts that the
mass digitization conducted by these institutions was fair use (Diaz, 2013; Rosati, 2014c; Samuelson,
2014), the accessibility of the scanned works is still heavily limited, subject to licenses from publishers,
the existence of print copies at the library and the institutional membership held by prospective readers.
While in the highly competitive US e-book market many commercial intermediaries offer e-lending
6

The notable exception being orphan works which are presumed to be still copyrighted, but without an identifiable
rights owner. In the EU, the Directive 2012/28/EU on certain permitted uses of orphan works in theory eases access
to such works, but in practice its practical impact is limited by the man


hind by the collapse
of libraries in the digital sphere and by the inability of the commercial arrangements to provide adequate
substitute services. Shadow libraries are pooling distributed resources and expertise over the internet, and
use the lack of legal or technological barriers to innovation in the informal sphere to fill in the void left
behind by libraries.

What can Aleph teach us about the future of libraries?
The story of Aleph offers two, closely interrelated considerations for the debate on the future of libraries:
a legal and an organizational one. Aleph operates beyond the limits of legality, as almost all of its
activities are copyright infringing, including the unauthorized digitization of books, the unauthorized
mass downloads from e-text repositories, the unauthorized acts of uploading books to the archive, the
unauthorized distribution of books, and, in most countries, the unauthorized act of users’ downloading
books from the archive. In the debates around copyright infringement, illegality is usually interpreted as a
necessary condition to access works for free. While this is undoubtedly true, the fact that Aleph provides
no-cost access to books seems to be less important than the fact that it provides access to them in the
first place.
Aleph is a clear indicator of the volume of the demand for current books in digital formats in developed
and in developing countri


thousand dollars. The hundreds
of thousands who use Aleph on a more or less regular basis have an immense amount of resources, and by
disregarding the copyright laws Aleph is able to tap into those resources and use them for the
development of the library. The value of these resources and of the peer produced library is the difference
between the actual costs associated with Aleph, and the investment that would be required to create
something remotely similar.

The decentralized, collaborative mass digitization and making available of current, thus most relevant
scientific works is only possible at the moment through massive copyright infringement. It is debatable
whether the copyrighted corpus of scientific works should be completely open, and whether the blatant
disregard of copyrights through which Aleph achieved this openness is the right path towards a more
openly accessible body of scientific knowledge. It is also yet to be measured what effects shadow libraries
may have on the commercial intermediaries and on the health of scientific publishing and science in
general. But Aleph, in any case, is a case study in the potential benefits of open sourcing the library.

Conclusion
If we can take Al


Social Science Research Council.
Bodó, B. (forthcoming). Piracy vs privacy–the analysis of Piratebrowser. IJOC.
Commission on the Future of the Library. (2013). Report of the Commission on the Future of the UC
Berkeley Library. Berkeley: UC Berkeley.
Committee on the Public Libraries in the Knowledge Society. (2010). The Public Libraries in the
Knowledge Society. Copenhagen: Kulturstyrelsen.
Darnton, R. (1982). The literary underground of the Old Regime. Cambridge, Mass: Harvard University
Press.
Darnton, R. (2003). The Science of Piracy: A Crucial Ingredient in Eighteenth-Century Publishing.
Studies on Voltaire and the Eighteenth Century, 12, 3–29.
Diaz, A. S. (2013). Fair Use & Mass Digitization: The Future of Copy-Dependent Technologies after
Authors Guild v. Hathitrust. Berkeley Technology Law Journal, 23.
Directive 2001/29/EC on the harmonisation of certain aspects of copyright and related rights in the
information society. (2001). Official Journal L, 167, 10–19.
Elst, M. (2005). Copyright, freedom of speech, and cultural policy in the Russian Federation.
Leiden/Boston: Martinus Nijhoff.
Ermolaev, H. (1997). Censorship in Soviet Literature: 1917-1991. Rowman & Littlefield.
Friedberg, M., Watanabe, M., & Nakamoto, N. (1984). The Soviet Book Market: Supply and Demand.
Acta Slavica Iaponica, 2, 177–192.
Giblin, R. (2011). Code Wars: 10 Years of P2P Software Litigation. Cheltenha


Rosati, E. (2014b). Dutch court refers questions to CJEU on e-lending and digital exhaustion, and another
Dutch reference on digital resale may be just about to follow. IPKat. Retrieved October 08, 2014, from
http://ipkitten.blogspot.co.uk/2014/09/dutch-court-refers-questions-to-cjeu-on.html
Rosati, E. (2014c). Google Books’ Library Project is fair use. Journal of Intellectual Property Law &
Practice, 9(2), 104–106.
Rose, M. (1993). Authors and owners : the invention of copyright. Cambridge, Mass: Harvard University
Press.
Samuelson, P. (2002). Copyright and freedom of expression in historical perspective. J. Intell. Prop. L.,
10, 319.
Samuelson, P. (2014). Mass Digitization as Fair Use. Communications of the ACM, 57(3), 20–22.
Schultz, M. F. (2007). Copynorms: Copyright Law and Social Norms. Intellectual Property And
Information Wealth v01, 1, 201.
Sezneva, O. (2012). The pirates of Nevskii Prospekt: Intellectual property, piracy and institutional
diffusion in Russia. Poetics, 40(2), 150–166.
Solly, E. (1885). Henry Hills, the Pirate Printer. Antiquary, xi, 151–154.
Stelmakh, V. D. (2001). Reading in the Context of Censorship in the Soviet Union. Libraries & Culture,
36(1), 143–151.
Suber, P. (2013). Open Access (Vol. 1). Cambridge, MA: The MIT Press. doi:10.1109/ACCESS.2012.2226094
Swartz, A. (2008). Guerilla Open Access Manifesto. A


digitization in Constant 2015


tiheroes of our own adventures we open up our practice in a way that seems
infectious. We make a point of presenting a new experiment, of producing
something printed and also something edible on site each time; this mix of
ingredients seems to work best. ‘Print Parties’ are how we keep contact with
our fellow designers who are interested in our journey but sometimes have
difficulty following us into the exotic territory of BoF, Version Control and
GPL3.

You state in a few texts that OSP is interested in glitches as a productive force in
software, how do you explain this to a printer trying to get a file to convert to the
kind of thing they expect?
Not! Printing has become cheap through digitization and is streamlined to
the extreme. Often there is literally no space built in to even have a second
look at a differently formatted file, so to state that glitches are productive
is easier said than done. Still, those hiccups make processes tangible, especially at moments you don’t want them to interfere.
For a book we are designing at the moment, we might partially work by
hand on positive film (a step now also skipped in file-to-plate systems). It
makes us literally sit with pre-press professionals for a day and hopefully we
can learn better where to intervene and how to involve them in the process.
To take the productive force of glitches beyond predictable aesthetics, means


it re


digitization in Constant 2016


increasingly easy to
mass-produce books, fee-charging
private libraries, serving the
privileged classes of society,
began to spread. This phenomenon
brought into relief the question of
class in the emerging demand for
public access to books.


It is within and against this milieu that libraries such as
the Internet Archive, Wikileaks, Aaaaarg, UbuWeb,
Monoskop, Memory of the World, Nettime, TheNextLayer
and others gain their political agency. Their counter-techniques for negotiating the publicness of publishing
include self-archiving, open access, book liberation,
leaking, whistleblowing, open source search algorithms
and so on.
Digitization and posting texts online are interventions in
the procedures that make search possible. Operating
online collections of texts is as much about organising
texts within libraries, as is placing them within books of
the web.

Originally written 15-16 June 2015 in Prague, Brno
and Vienna for a talk given at the Technopolitics seminar in Vienna on 16 June 2015.
Revised 29 December 2015 in Bergen.
Last Revision: 1·08·2016

The Indexalist
MATTHEW FULLER

I first spoke to the patient in the last week of that August. That evening the sun was tender in
drawing its shadows across the lines of his face. The eyes gazed softly into a close middle
distance, as if composing a line upon a translucent page


irm
the new Google venture in the field of historical collections. The executive chairman of
Alphabet declared: “I can think of no better use of our time and our resources to make the
images and ideas from your civilization, from the very beginning of time, available to a billion
people worldwide.”
A detailed account and reflection of this visit, its background and agenda can be found in
Powered by Google: Widening Access and Tightening Corporate Control. (Schiller & Yeo
2014)
FRANCE REACTS AGAINST GOOGLE BOOKS

In relation to the Google Books dispute in Europe, Reuters reported in 2009 that France's
ex-president Nicolas Sarkozy “pledged hundreds of millions of euros toward a separate
digitization program, saying he would not permit France to be ‘stripped of our heritage to the
benefit of a big company, no matter how friendly, big or American it is.’”[15]

Although the reactionary and nationalistic agenda of Sarkozy should not be celebrated, it is
important to note that the first open attack on Google’s cultural agenda came from the French
government. Four years later, the Google Cultural Institute establishes its headquarters in
Paris.
2010
EUROPEAN COMMISSION LAUNCHES AN ANTITRUST INVESTIGATION AGAINST
GOOGLE.

The European Commission has decided to open an antitrust investigation into
allegations that Google Inc. has abused a dominant position in online search, in
violation o


digitization in Dockray, Forster & Public Office 2018


0in%20the%20Twentieth%20Century.pdf)
makes the fragility of historical repositories startlingly clear. “[A]cidified
paper that crumbles to dust, leather, parchment, film and magnetic light
attacked by light, heat humidity or dust” all assault archives. “Floods,
fires, hurricanes, storms, earthquakes” and, of course, “acts of war,
bombardment and fire, whether deliberate or accidental” wiped out significant
portions of many hundreds of major research libraries worldwide. When
expanding the scope to consider public, private, and community libraries, that
number becomes uncountable.

Published during the early days of the World Wide Web, the report acknowledges
the emerging role of digitization (“online databases, CD-ROM etc.”), but today
we might reflect on the last twenty years, which have also introduced new forms
of loss.

Digital archives and libraries are subject to a number of potential hazards:
technical accidents like disk failures, accidental deletions, misplaced data
and imperfect data migrations, as well as political-economic accidents like
defunding of the hosting institution, deaccessioning parts of the collection
and sudden restrictions of access rights. Immediately after library.nu was
shut down on the grounds of copyright infringement in 2012, [Lawrence Liang
wrote](https://kafila.online/2012/02/19/library-nu-r-i-p/) of feeling “first
and foremost a visceral e


digitization in Graziano, Mars & Medak 2019


rms is the university.
Third, the migration online of the academic syllabus falls into larger efforts by universities to ‘disrupt’ the educational system through digital technologies. The introduction
of virtual learning environments has led to lesson plans, slides, notes, and syllabi becoming items to be deposited with the institution. The doors of public higher education are being opened to commercial qualification providers by means of the rise in
metrics-based management, digital platforming of university services, and transformation of students into consumers empowered to make ‘real-time’ decisions on how to
spend their student debt.23 Such neoliberalization masquerading behind digitization
is nowhere more evident than in the hype that was generated around Massive Open
Online Courses (MOOCs), exactly at the height of the last economic crisis.
MOOCs developed gradually from the Massachusetts Institute of Technology’s (MIT) initial experiments with opening up its teaching materials to the public through the OpenCourseWare project in 2001. By 2011, MOOCs were saluted as a full-on democratization of access to ‘Ivy-League-caliber education [for] the world’s poor.’24 And yet, their
promise quickly deflated following extremely low completion rates (as low as 5%).25
Believing that in fifty years there will be no more than 10 institutions globally delivering
higher education,26 b


process of political subjectivation with that of collective education.
By creating effective pedagogical tools, movements have brought educators and students into the fold of their struggles. In the context of our new network environment,
political struggles have produced a new media object: #Syllabus, a crowdsourced list
of resources—historic and present—relevant to a cause. By doing so, these struggles
adapt, resist, and live in and against the networks dominated by techno-capital, with
all of the difficulties and contradictions that entails.
What have we learned from the academic syllabus migrating online?
In the contemporary university, critical pedagogy is clashing head-on with the digitization of higher education. Education that should empower and research that should
emancipate are increasingly left out in the cold due to the data-driven marketization
of academia, short-cutting the goals of teaching and research to satisfy the fluctuating demands of the labor market and financial speculation. Resistance against the capture of data, research workflows, and scholarship by means of digitization is a key
struggle for the future of mass intellectuality beyond exclusions of class, disability,
gender, and race.
What have we learned from #Syllabus as a media object?
As old formats transform into new media objects, the digital network environment defines the conditions in which these new media objects try to adjust, resist, and live. A
right intuition can intervene and change the landscape—not necessarily for the good,
particularly if the imperatives of capital accumulation and social control prevail. We
thus need to re-appropriate the process of production and distribution of #Syllabus
as a media object in its totality. We need to build tools to collectively control the workflows that


digitization in Kelty, Bodo & Allen 2018


t we once considered as private
libraries. Amateur librarianship is becoming public shadow
librarianship. Hybrid use, as poetically unpacked in Balazs
Bodo's reflection on his own personal library, is now entangling
print and digital in novel ways. And, as he warns, the terrain
of antagonism is shifting. While for-profit publishers are
seemingly conceding to Guerrilla Open Access, they are
opening new territories: platforms centralizing data, metrics
and workflows, subsuming academic autonomy into new
processes of value extraction.
The 2010s brought us hope and then the realization of how little
digital networks could help revolutionary movements. The
redistribution toward the wealthy, assisted by digitization, has
eroded institutions of solidarity. The embrace of privilege that
this has catalyzed, marked by misogyny, racism and xenophobia,
is nowhere more evident than in the climate denialism of the
Trump administration. Guerrilla archiving of US government
climate change datasets, as recounted by Laurie Allen,
indicates that more technological innovation simply won't do
away with the 'post-truth' and that our institutions might be in
need of revision, replacement and repair.
As the contributions to this pamphlet indicate, the terms
of struggle have shifted: not only do we have to continue
defending our shadow libraries, but we need to take back the
autonomy of knowledge production and rebuild inst


digitization in Ludovico 2013





Published 26 August 2013
Original in English
First published by Springerin 3/2013 (German version); Eurozine (English
version)

Contributed by Springerin © Alessandro Ludovico / Springerin / Eurozine



l content by converting it from another medium. This
process, of course, creates accidents. Krissy Wilson's blog/artwork _The Art
of Google Books_2 explores daily the non-digital elements (accidental or not)
emerging in scanned pages, which can be purely material - such as scribbled
notes, parts of the scanning person's hand, dried flowers - or typographical
or linguistic, or deleted or missing parts, all of them precisely annotated.
This small selection of illustrations of how physicality causes technology to
fail may be self-reflective, but it shows a particular aspect of a larger
development. In fact, industrial scanning is only one side of the coin. The
other is the private and personal digitization and sharing of books.

On the basis of brilliant open source tools like the DIY Bookscanner,3 there
are various technical and conceptual efforts to build specialist digital
libraries. _Monoskop_4 is exemplary: its creator Dusan Barok has transformed
his impressive personal collection of media (about contemporary art, culture
and politics, with a special focus on eastern Europe) into a common resource,
freely downloadable and regularly updated. It is a remarkably inspired
selection that can be shared regardless of possible copyright restrictions.
_Monoskop_ is an extreme and excellent example of a personal digital library
made public. But any small or big collection can be easily shared. C


digitization in Mars & Medak 2019


­commodified
access they provide in the world of print. For instance, libraries
frequently don’t have the right to purchase e-books for lending and
preservation. If they do, they are limited by how many times—twenty-six
in the case of one publisher—and under what conditions
they can lend them before not only the license but the “object”
itself is revoked. In the case of academic journals, it is even worse:
as they move to predominantly digital models of distribution,
libraries can provide access to and “preserve” them only for as
long as they pay extortionate prices for ongoing subscriptions. By
building tools for organizing and sharing electronic libraries, creating digitization workflows, and making books available online, the
Public Library/Memory of the World project is aimed at helping to
fill the space that remains denied to real-world public libraries. It is
obviously not alone in this effort. There are many other platforms,
some more public, some more secretive, working to help people
share books. And the practice of sharing is massive.
—https://www.memoryoftheworld.org

Capitalism and Schizophrenia
New media remediate old media. Media pay homage to their
(mediatic) predecessors, which themselves pay homage to their
own (mediatic) predecessors. Computer graphics remediate film,
which remediates photography, which remediates painting, and so
on (McLuhan


a schizoid
impasse sustained by a failed metaphor.
The revolutionary events of the Paris Commune of 1871, its mere
“existence” as Marx has called it,10 a brief moment of “communal
luxury” set in practice as Kristin Ross (2015) describes it, demanded
that, in spite of any circumstances and reservations, one takes a
side. And such is our present moment of truth.
Digital networks have expanded the potential for access and
created an opening for us to transform the production of knowledge and culture in the contemporary world. And yet they have
likewise facilitated the capacity of intellectual property industries
to optimize, to cut out the cost of printing and physical distribution.
Digitization is increasingly helping them to control access, expand
copyright, impose technological protection measures, consolidate
the means of distribution, and capture the academic valorization
process.
As the potential opening for universalizing access to culture and
knowledge created by digital networks is now closing, attempts at
private legal reform such as Creative Commons licenses have had
only a very limited effect. Attempts at institutional reform such as
Open Access publishing are struggling to go beyond a niche. Piracy
has mounted a truly disruptive opposition, but given the legal
repression it has met with, it can become an agent of change only if
it is embraced as a kind of mass civil dis


digitization in Mars, Medak & Sekulic 2016


trictive
space of identity, and obtain access to the entire
knowledge of the world. However, instead
of resulting in democratising and emancipatory processes, with the handing over of
the Internet and technological innovation to the
market in the 1990s it resulted in the gradual
disruption of previous social arrangements
in the allocation of goods and in the intensification of the commodification process.
That trajectory reached its full-blown development in the form of Internet platforms
that simultaneously enabled old owners of
goods to control more closely their accessibility and permitted new owners to seek out
new forms of commercial exploitation. Take
for example Google Books, where the process of digitization of the entire printed culture of the world resulted in no more than
an ad and retail space where only a few books
can be accessed for free. Or Amazon Kindle,
where the owner of the platform has such
dramatic control over books that at the behest
of copyright holders it can remotely delete
a purchased copy of a book, as quite indicatively happened in 2009 with Orwell's 1984.
The promised technological innovation that
would bring a new turn of the complexity in
the social allocation of goods resulted in a
simplification and reduction of everything
into private property.
The history of resistance to such extreme forms of enclosure of culture and
knowledge is only a bit younger than the

Taken literal


digitization in Mattern 2014


vilege — perhaps even fetishize — the book and
the bookstack: take MVRDV’s [Book
Mountain](http://www.mvrdv.nl/projects/spijkenisse/) (2012), for a town in the
Netherlands; or TAX arquitectura’s [Biblioteca Jose
Vasconcelos](http://www.designboom.com/architecture/biblioteca-vasconcelos-by-tax-arquitectura-alberto-kalach/)
(2006) in Mexico City.

Stacks occupy a different, though also fetishized, space in Helmut Jahn’s
[Mansueto Library](http://www.archdaily.com/143532/joe-and-rika-mansueto-library-murphy-jahn/)
(2011) at the University of Chicago, which mixes diverse
infrastructures to accommodate media of varying materialities: a grand reading
room, a conservation department, a digitization department, and [a
subterranean warehouse of books retrieved by
robot](https://www.youtube.com/watch?v=ESCxYchCaWI&feature=youtu.be). (It’s
worth noting that Boston and other libraries contained [book
railways](http://libraryhistorybuff.blogspot.com/2010/12/book-retrieval-systems.html)
and conveyer belt retrieval systems — proto-robots — a century
ago.) Snøhetta’s [James B. Hunt Jr.
Library](http://www.ncsu.edu/huntlibrary/watch/) (2013) at North Carolina
State University also incorporates a robotic storage and retrieval system, so
that the library can store more books on site, as well as meet its goal of
providing seating for 20 percent of the student population. 23 Here the
patro


us Turner, March 21, 2014.
17. Marcellus Turner in _Library 2020_ : 92.
18. Ken Worpole addresses library partnerships, and their implications for design in his _Contemporary Library Architecture: A Planning and Design Guide_ (New York: Routledge, 2013). The book offers a comprehensive look at the public roles that libraries serve, and how they inform library planning and design.
19. Kristin Fontichiaro in _Library 2020_ : 8.
20. See Bill Ptacek in _Library 2020_ : 119.
21. The quotations are from my earlier article for Places, “[Marginalia: Little Libraries in the Urban Margins](http://places.designobserver.com/feature/little-libraries-and-tactical-urbanism/33968/).” Within mass-digitization projects like Google Books, as Elisabeth Jones explains, “works that are still in copyright but out of print and works of indeterminate copyright status and/or ownership” will fall between the cracks (in _Library 2020_ : 17).
22. I dedicate a chapter in _The New Downtown Library_ to what makes a library “contextual” — and I address just how slippery that term can be.
23. This sentence was amended after publication to note the multiple motives of implementing the bookBot storage and retrieval system; its compact storage allowed the library to reintegrate some collections that were formerly stored off-site. The library has also developed a Virtual Browse catalog system, which aim


digitization in Mars & Medak 2017


rad, 2016). He has translated numerous books into the Croatian language,
including Multitude (Hardt & Negri, 2009) and A Hacker Manifesto (Wark,
2006c). He is an author and performer with the internationally acclaimed Zagreb-based performance collective BADco (BADco, 2016). Tomislav writes and talks
about politics of technological development, and politics and aesthetics.
Tomislav and Marcell have been working together for almost two decades.
Their recent collaborations include a number of activities around the Public Library
project, including HAIP festival (Ljubljana, 2012), exhibitions in
Württembergischer Kunstverein (Stuttgart, 2014) and Galerija Nova (Zagreb,
2015), as well as coordinated digitization projects Written-off (2015), Digital
Archive of Praxis and the Korčula Summer School (2016), and Catalogue of
Liberated Books (2013) (in Monoskop, 2016b).

Ana Kuzmanic is an artist based in Zagreb and Associate Professor at the
Faculty of Civil Engineering, Architecture and Geodesy at the University in Split
(Croatia), lecturing in drawing, design and architectural presentation. She is a
member of the Croatian Association of Visual Artists. Since 2007 she has held more
than a dozen individual exhibitions and took part in numerous collective
exhibitions in Croatia, the UK, Italy, Egypt, the Netherlands, the USA, Lithuania
and Slovenia. In 2011 she co-founded the international a


nd large denied
to public libraries. For instance, libraries frequently do not have the right to
purchase e-books for lending and preservation. If they do, they are limited with
regard to how many times and under what conditions they can lend digital objects
before the license and the object itself are revoked (Greenfield, 2012). The case of
academic journals is even worse. As journals become increasingly digital, libraries
can provide access and ‘preserve’ them only for as long as they pay extortionate
subscriptions. The Public Library project fills in the space that remains denied to
real-world public libraries by building tools for organizing and sharing electronic
libraries, creating digitization workflows and making books available online.
Obviously, we are not alone in this effort. There are many other platforms, public
and hidden, that help people to share books. And the practice of sharing is massive.
PJ & AK: The Public Library project (Memory of the World, 2016a) is a part of
a wider global movement based, amongst other influences, on the seminal work of
Aaron Swartz. This movement consists of various projects including but not
limited to Library Genesis, Aaaaarg.org, UbuWeb, and others. Please situate The
Public Library project in the wider context of this movement. What are its distinct
features? What are its main contributions to the movement at large?
MM & TM: The Public Li


digitization in Medak, Sekulic & Mertens 2014


ial choice of book scanning setup will have to take these trade-offs into consideration. If
your scanning community is confined to your hacklab, you won't be risking much if technological
sophistication and integration fails to function smoothly. But if you're aiming at a broad community
of users, with varying levels of technological skill and patience, you want to create as much time-saving automation as possible on the condition of keeping maximum stability. Furthermore, if the
time that individual members of your scanning community can contribute is limited, you might also
want to divide some of the tasks between users according to their different skill levels.
This manual breaks down the process of digitization into a general description of steps in the
workflow leading from the printed book to a digital e-book. In a concrete situation, each step can
be addressed in various ways, depending on the scanning equipment, software, hacking
skills and user skill level that are available to your book scanning project. Several of those steps can
be handled by a single piece of equipment or software, or you might need to use a number of them; your mileage will vary. Therefore, the manual will try to indicate the design choices you have in the
process of planning your workflow and should help you make decisions on what design is best for
your situation.
Introducing book scanner designs
The book scanning st


digitization in Sekulic 2018


disobedience. In the context of sci-hub and Library Genesis, both
projects from the periphery of knowledge production, “copyright infringement
opens on to larger questions about the legitimacy of the historic compromise –
if indeed there ever even was one – between the labor that produces culture
and knowledge and its commodification as codified in existing copyright
regulations.”(6) Here, disobedience and piracy have an equalizing effect on
the asymmetries of access to knowledge.

In 2008, programmer and hacktivist Aaron Swartz published the Guerilla Open
Access Manifesto, triggered by the enclosure, via digitization, of past scientific
knowledge production, often already part of the public domain. “The
world's entire scientific and cultural heritage, published over centuries in
books and journals, is increasingly being digitized and locked up by a handful
of private corporations […] We need to download scientific journals and upload
them to file sharing networks. We need to fight for Guerilla Open Access.”(7)
On January 6, 2011, the MIT police and the US Secret Service arrested Aaron
Swartz on charges of having downloaded a large number of scientific articles
from one of the most used paywalled databases. The federal prosecution
decided to show the increasingly nervous publishing industry the lengths they
are willing to go to protect them by indicting Swartz on 13 criminal coun


he existence of projects such as Library
Genesis, sci-hub, Public Library/Memory of the World, aaaarg.org, monoskop,
and ubuweb, commonly known as shadow libraries, shows how building
infrastructure for storing, indexing, and access, as well as supporting
digitization, can not only be put to use by the periphery, but also used as a
challenge to the normalization of enclosure offered by the core. The people
building alternative networks of distribution also build networks of support
and solidarity. Those on the peripheries need to 'steal' the knowledge behind
paywalls in order to fight the asymmetries paywalls enforce – peripheries
“steal” in order to advance. Depending on the vantage point, digitization of a
book can be stealing, or liberating it to return the knowledge (from the dusty
library closed stacks) back into circulation. “Old” knowledge can teach new
tricksters a handful of tricks.

In 2015 I realized none of the architecture students of the major European
architecture schools could have a chance encounter with Architecture and
Feminisms or Sexuality and Space, nor with many books on similar topics,
because they were typically located in the library’s closed stacks. Both books
were formative, and in 2005, as a student, I went to great lengths to gain
access to them. The library at the Faculty of Architecture in Belgrade was
starved of books due to permanent financial crisis, and


d uploaded to the usual digital repositories. It takes two to four hours to
make a neat and searchable PDF scan of a book. As a PDF, knowledge production
usually under the radar or long out of print becomes more accessible. One of
the first books I digitized was Robert Goodman's After the Planners, a
critique of urban planning and the limits of alternate initiatives in cities
written in the late 1960s. A few years after I scanned it, online photos from
a conference drew my attention – the important, white male professor was
showing the front page of After the Planners on his slide. I quickly realized the
image bore the light signature of the scanner I had used. While I do not know if
this act of digitization made a dent or was co-opted, seeing the image was a
small proof that digitization can bring books back into circulation and access
to them might make a difference – or that access to knowledge can be a weapon.



[Dubravka Sekulic](https://www.making-futures.com/contributor/sekulic/) writes
about the production of space. She is an amateur-librarian at Public
Library/Memory of the World, where she maintains feminist and space/race
collections. During Making Futures School, Dubravka will be figuring out the
future of education (on all things spatial) together with [Elise
Hunchuck](https://www.making-futures.com/contributor/hunchuck/), [Jonathan
Solomon](https://www.making-futures.com/contributor/solomon/) and [Valentina
Karga](https://www.making-futures.com/contributor/


downloaded
here.](https://www.making-futures.com/wp-content/uploads/2019/05/Dubravka_Sekulic-On_Knowledge_and_Stealing.pdf)

__

Notes:

(1) For more on the project Herman’s House. Accessed 6 April 2018.


(2) Public Library is a project which has, since 2012, been developing and
publicly supporting scenarios for massive disobedience against the current
regulation of production and circulation of knowledge and culture in the digital
realm. See: ‘Memory of the World’. Accessed 7 April 2018.


(3) Herman's library can be accessed at
[http://herman.memoryoftheworld.org/](http://herman.memoryoftheworld.org/). For more
on the context of digitization see: ‘Herman’s Library’. Memory of the World
(blog), 28 October 2014, and ‘Public Library. Rethinking the Infrastructures of
Knowledge Production’. Memory of the World (blog), 30 October 2014.

(4) For more on shadow libraries and library genesis see: Bodo, Balazs.
‘Libraries in the Post-Scarcity Era’. SSRN Scholarly Paper. Rochester, NY:
Social Science Research Network, 10 June 2015.


(5) ‘Sci-Hub Tears Down Academia’s “Illegal” Copyri


digitization in Thylstrup 2019


remedia Limited. Printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Data

Names: Thylstrup, Nanna Bonde, author.

Title: The politics of mass digitization / Nanna Bonde Thylstrup.

Description: Cambridge, MA : The MIT Press, [2018] | Includes bibliographical
references and index.

Identifiers: LCCN 2018010472 | ISBN 9780262039017 (hardcover : alk. paper)

eISBN 9780262350044

Subjects: LCSH: Library materials--Digitization. | Archival materials--
Digitization. | Copyright and digital preservation.

Classification: LCC Z701.3.D54 T49 2018 | DDC 025.8/4--dc23 LC record
available at


ss
the Atlantic and provided me with invaluable new perspectives, as well as
theoretical insights and challenges. Beyond the aforementioned, three people
in particular have been instrumental in terms of reading through drafts and in
providing constructive challenges, intellectual critique, moral support, and
fun times in equal proportions—thank you so much Kristin Veel, Henriette
Steiner, and Daniela Agostinho. Marianne Ping-Huang has further offered
invaluable support to this project and her theoretical and practical
engagement with digital archives and academic infrastructures continues to be
a source of inspiration. I am also immensely grateful to all the people
working on or with mass digitization who generously volunteered their time to
share with me their visions for, and perspectives on, mass digitization.

This book has further benefited greatly from dialogues taking place within the
framework of two larger research projects, which I have been fortunate enough
to be involved in: Uncertain Archives and The Past’s Future. I am very
grateful to all my colleagues in both these research projects: Kristin Veel,
Daniela Agostinho, Annie Ring, Katrine Dirkinck-Holmfeldt, Pepita Hesselberth,
Kristoffer Ørum, Ekaterina Kalinina, Anders Søgaard as well as Helle Porsdam,
Jeppe Eimose, Stina Teilmann, John Naughton, Jeffrey Schnapp, Matthew Battles,
and Fiona McMillan. I am further indebted to La Vaughn Belle, George Tyson,
Temi Odumosu, Mathias Danbolt, Mette Kia, Lene Asp, Marie Blønd, Mace Ojala,
Renee Ridgway, and many others for our conversations on the ethical issues of
the mass digitization of colonial material. I have also benefitted from the
support and insights offered by other colleagues at the Department of Arts and
Cultural Studies, University of Copenhagen.

A big part of writing a book is also about keeping sane, and for this you need
great colleagues that can pull you out of your own circuit and launch you into
other realms of inquiry through collaboration, conversation, or just good
times. Thank you Mikkel Flyverbom, Rasmus Helles, Stine Lomborg, Helene
Ratner, Anders Koed Madsen, Ulrik Ekman, Solveig Gade, Anna Leander, Mareile
Kaufmann, Holger Schulze, Jakob Kreutzfeld, Jens Hauser, Nan Gerdes, Kerry
Greaves, Mikkel Thelle, Mads Rosendahl Thomsen, Knut Ove Eliassen,


nonymous peer reviewers whose insightful and constructive
comments helped improve this book immensely. Research for this book was
supported by grants from the Danish Research Council and the Velux Foundation.

Last, but not least, I wish to thank my loving partner Thomas Gammeltoft-
Hansen for his invaluable and critical input, optimistic outlook, and perfect
morning cappuccinos; my son Georg and daughter Liv for their general
awesomeness; and my extended family—Susanne, Bodil, and Hans—for their support
and encouragement.

I dedicate this book to my parents, Karen Lise Bonde Thylstrup and Asger
Thylstrup, without whom neither this book nor I would have materialized.

# I Framing Mass Digitization

# 1 Understanding Mass Digitization

## Introduction

Mass digitization is first and foremost a professional concept. While it has
become a disciplinary buzzword used to describe large-scale digitization
projects of varying scope, it enjoys little circulation beyond the confines of
information science and such projects themselves. Yet, as this book argues, it
has also become a defining concept of our time. Indeed, it has even attained
the status of a cultural and moral imperative and obligation.1 Today, anyone
with an Internet connection can access hundreds of millions of digitized
cultural artifacts from the comfort of their desk—or many other locations—and
cultural institutions and private bodies add thousands of new cultural works
to the digital sphere every day. The practice of mass digitization is forming
new nexuses of knowledge, and new ways of engaging with that knowledge. What
at first glance appears to be a simple act of digitization (the transformation
of singular books from boundary objects to open sets of data) reveals, on
closer examination, a complex process teeming with diverse political, legal,
and cultural investments and controversies.

This volume asks why mass digitization has become such a “matter of concern,”2
and explores its implications for the politics of cultural memory. In
practical terms, mass digitization is digitization on an industrial scale. But
in cultural terms, mass digitization is much more than this. It is the promise
of heightened access to—and better preservation of—the past, and of more
original scholarship and better funding opportunities. It also promises
entirely new ways of reading, viewing, and structuring archives, new forms of
value and their extraction, and new infrastructures of control. This volume
argues that the shape-shifting quality of mass digitization, and its social
dynamics, alters the politics of cultural memory institutions. Two movements
simultaneously drive mass digitization programs: the relatively new phenomenon
of big data gold rushes, and the historically more familiar archival
accumulative imperative. Yet despite these prospects, mass digitization
projects are also uphill battles. They are costly and speculative processes,
with no guaranteed rate of return, and they are constantly faced by numerous
limitations and contestations on legal, social, and cultural levels.
Nevertheless, both public and private institutions adamantly emphasize the
need to digitize on a massive scale, motivating initiatives around the
globe—from China to Russia, Africa to Europe, South America to North America.
Some of these initiatives are bottom-up projects driven by highly motivated
individuals, while others are top-down and governed by complex bureaucratic
apparatuses. Some are backed by private money, others publicly funded. Some
exist as actual archives, while others figure only as projections in policy
papers. As the ideal of mass digitization filters into different global
empirical situations, the concept of mass digitization attains nuanced
political hues. While all projects formally seek to serve the public interest,
they are in fact infused with much more diverse, and often conflicting,
political and commercial motives and dynamics. The same mass digitization
project can even be imbued with different and/or contradictory investments,
and can change purpose and function over time, sometimes rapidly.

Mass digitization projects are, then, highly political. But they are not
political in the sense that they transfer the politics of analog cultural
memory institutions into the digital sphere 1:1, or even liberate cultural
memory artifacts from the cultural politics of analog cultural memory
institutions. Rather, mass digitization presents a new political cultural
memory paradigm, one in which we see strands of technical and ideological
continuities combine with new ideals and opportunities; a political cultural
memory paradigm that is arguably even more complex—or at least appears more
messy to us now—than that of analog institutions, whose politics we have had
time to get used to. In order to grasp the political stakes of mass
digitization, therefore, we need to approach mass digitization projects not as
a continuation of the existing politics of cultural memory, or as purely
technical endeavors, but rather as emerging sociopolitical and sociotechnical
phenomena that introduce new forms of cultural memory politics.

## Framing, Mapping, and Diagnosing Mass Digitization

Interrogating the phenomenon of mass digitization, this book asks the question
of how mass digitization affects the politics of cultural memory institutions.
As a matter of practice, something is clearly changing in the conversion of
bounded—and scarce—historical material into ubiquitous ephemeral data. In
addition to the technical aspects of digitization, mass digitization is also
changing the political territory of cultural memory objects. Global commercial
platforms are increasingly administering and operating their scanning
activities in favor of the digital content they reap from the national “data
tombs” of museums and libraries and the feedback loops these generate. This
integration of commercial platforms into the otherwise primarily public
institutional set-up of cultural memory has produced a reconfiguration of the
political landscape of cultural memory from the traditional symbolic politics
of scarcity, sovereignty, and cultural capital to the late-sovereign
infrapolitics of standardization and subversion.

The empirical outlook of the present book is predominantly Western. Yet, the
overarching dynamics that have been pursued are far from limited to any one
region or continent, nor limited solely to the field of cultural memory.
Digitization is a global phenomenon and its reliance on late-sovereign
politics and subpolitical governance forms is shared across the globe.

The central argument of this book is that mass digitization heralds a new kind
of politics in the regime of cultural memory. Mass digitization of cultural
memory is neither a neutral technical process nor a transposition of the
politics of analog cultural heritage to the digital realm on a 1:1 scale. The
limitations of using conventional cultural-political frameworks for
understanding mass digitization projects become clear when working through the
concepts and regimes of mass digitization. Mass digitization brings together
so many disparate interests and elements that any mono-theoretical lens would
fail to account for the numerous political issues arising within the framework
of mass digitization. Rather, mass digitization should be approached as an
_infrapolitical_ process that brings together a multiplicity of interests
hitherto foreign to the realm of cultural memory.

The first part of the book, “framing,” outlines the theoretical arguments in
the book—that the political dynamics of mass digitization organize themselves
around the development of the technical infrastructures of mass digitization
in late-sovereign frameworks. Fusing infrastructure theory and theories on the
political dynamics of late sovereignty allows us to understand mass
digitization projects as cultural phenomena that are highly dependent on
standardization and globalization processes, while also recognizing that their
resultant infrapolitics can operate as forms of both control and subversion.

The second part of the book, “mapping,” offers an analysis of three different
mass digitization phenomena and how they relate to the late-sovereign politics
that gave rise to them. The part thus examines the historical foundation,
technical infrastructures, and (il)licit status and ideological underpinnings
of three variations of mass digitization projects: primarily corporate,
primarily public, and primarily private. While these variations may come
across as reproductions of more conventional societal structures, the chapters
in part two nevertheless also present us with a paradox: while the different
mass digitization projects that appear in this book—from Google’s privatized
endeavor to Europeana’s supranational politics to the unofficial initiatives
of shadow libraries—have different historical and cultural-political
trajectories and conventional regimes of governance, they also undermine these
conventional categories as they morph and merge into new infrastructures and
produce a new form of infrapolitics. The case studies featured in this book
are not to be taken as exhaustive examples, but rather as distinct, yet
nevertheless entangled, examples of how analog cultural memory is taken online
on a digital scale. They have been chosen with the aim of showing the
diversity of mass digitization, but also how it, as a phenomenon, ultimately
places the user in the dilemma of digital capitalism with its ethos of access,
speed, and participation (in varying degrees). The choices also have their
limitations, however. In their Western bias, which is partly rooted in this
author’s lack of language skills (specifically in Russian and Chinese), for
instance, they fail to capture the breadth and particularities of the
infrapolitics of mass digitization in other parts of the world. Much more
research is needed in this area.

The final part of the book, “diagnosing,” zooms in on the pathologies of mass
digitization in relation to affective questions of desire and uncertainty.
This part argues that instead of approaching mass digitization projects as
rationalized and instrumental projects, we should rather acknowledge them as
ambivalent spatio-temporal projects of desire and uncertainty. Indeed, as the
third part concludes, it is exactly uncertainty and desire that organize the
new spatio-temporal infrastructures of cultural memory institutions, where
notions such as serendipity and the infrapolitics of platforms have taken
precedence over accuracy and sovereign institutional politics. The third part
thus calls into question arguments that imagine mass digitization as
instrumentalized projects that either undermine or produce values of
serendipity, as well as overarching narratives of how mass digitization
produces uncomplicated forms of individualized empowerment and freedom.
Instead, the chapter draws attention to the new cultural logics of platforms
that affect the cultural politics of mass digitization projects.

Crucially, then, this book seeks neither to condemn nor celebrate mass
digitization, but rather to unpack the phenomenon and anchor it in its
contemporary political reality. It offers a story of the ways in which mass
digitization produces new cultural memory institutions online that may be
entwined in the cultural politics of their analog origins, but also raise new
political questions about the collections.

## Setting the Stage: Assembling the Motley Crew of Mass Digitization

The dream and practice of mass digitizing cultural works has been around for
decades and, as this section attests, the projects vary significantly in
shape, size, and form. While rudimentary and nonexhaustive, this section
gathers a motley collection of mass digitization initiatives, from some of the
earliest digitization programs to later initiatives. The goal of this section
is thus not so much to meticulously map mass digitization programs, but rather
to provide examples of projects that might illuminate the purpose of this book
and its efforts to highlight the infrastructural politics of mass
digitization. As the section attests, mass digitization is anything but a
streamlined process. Rather, it is a painstakingly complex process mired in
legal, technical, personal, and political challenges and problems, and it is a
vision whose grand rhetoric often works to conceal its messy reality.

It is pertinent to note that mass digitization suffers from the combined
gendered and racialized reality of cultural institutions, tech corporations,
and infrastructural projects: save a few exceptions, there is precious little
diversity in the official map of mass digitization, even in those projects
that emerge bottom-up. This does not mean that women and minorities have not
formed a crucial part of mass digitization, selecting cultural objects,
prepping them (for instance ironing newspapers to ensure that they are flat),
scanning them, and constructing their digital infrastructures. However, more
often than not, their contributions fade into the background as tenders of the
infrastructures of mass digitization rather than as the (predominantly white,
male) “face” of mass digitization. As such, an important dimension of the
politics of these infrastructural projects is their reproduction of
established gendered and racialized infrastructures already present in both
cultural institutions and the tech industry.3 This book hints at these crucial
dimensions of mass digitization, but much more work is needed to change the
familiar cast of cultural memory institutions, both in the analog and digital
realms.

With these introductory remarks in place, let us now turn to the long and
winding road to mass digitization as we know it today. Locating the exact
origins of this road is a subjective task that often ends up trapping the
explorer in the mirror halls of technology. But it is worth noting that of
course there existed, before the Internet, numerous attempts at capturing and
remediating books in scalable forms, for the purposes both of preservation and
of extending the reach of library collections. One of the most revolutionary
of such technologies before the digital computer or the Internet was
microfilm, which was first held forth as a promising technology of
preservation and remediation in the middle of the 1800s.4 At the beginning of
the twentieth century, the Belgian author, entrepreneur, vision


7 The collaboration not only spurred international interest, but also
inspired a group of influential tech activists and artists closely associated
with the creative work of shadow libraries to create the critical archival
project Mondotheque.be, a platform for “discussing and exploring the way
knowledge is managed and distributed today in a way that allows us to invent
other futures and different narrations of the past,”8 and a resulting digital
publication project, _The Radiated Book,_ authored by an assembly of
activists, artists, and scholars such as Femke Snelting, Tomislav Medak,
Dusan Barok, Geraldine Juarez, Shin Joung Yeo, and Matthew Fuller.9

Another early precursor of mass digitization emerged with Project Gutenberg,
often referred to as the world’s oldest digital library. Project Gutenberg was
the brainchild of author Michael S. Hart, who in 1971, using technologies such
as ARPANET, Bulletin Board Systems (BBS), and Gopher protocols, experimented
with publishing and distributing books in digital form. As Hart reminisced in
his later text, “The History and Philosophy of Project Gutenberg,”10 Project
Gutenberg emerged out of a donation he received as an undergraduate in 1971,
which consisted of $100 million worth of computing time on the Xerox Sigma V
mainframe at the University of Illinois at Urbana-Champaign. Wanting to make
good use of the donation, Hart, in his ow


on Interchange”). While Project Gutenberg only converted about 50
works into digital text in the 1970s and the 1980s (the first was the
Declaration of Independence), it today hosts up to 56,000 texts in its
distinctly lo-fi manner.12 Interestingly, Michael S. Hart noted very early on
that the intention of the project was never to reproduce authoritative
editions of works for readers—“who cares whether a certain phrase in
Shakespeare has a ‘:’ or a ‘;’ between its clauses”—but rather to “release
etexts that are 99.9% accurate in the eyes of the general reader.”13 As the
present book attests, this early statement captures one of the central points
of contestation in mass digitization: the trade-off between accuracy and
accessibility, raising questions both of the limits of commercialized
accelerated digitization processes (see chapter 2 on Google Books) and of
class-based and postcolonial implications (see chapter 4 on shadow libraries).

If Project Gutenberg spearheaded the efforts of bringing cultural works into
the digital sphere through manual conversion of analog text into lo-fi digital
text, a French mass digitization project affiliated with the construction of
the Bibliothèque nationale de France (BnF) initiated in 1989 could be
considered one of the earliest examples of actually digitizing cultural works
on an industrial scale.14 The French were thus working on blueprints of mass
digitization programs before mass digitization became a widespread practice,
as part of the construction of a new national library, under the guidance of
Alain Giffard and initiated by François Mitterrand. In a letter sent in 1990 to
Prime Minister Michel Rocard, President Mitterrand outlined his vision of a
digital library, noting that “the novelty will be in the possibility of using
the most modern computer techniques for access to catalogs and documents of
the Bibliothèque nationale de France.”15 The project managed to digitize a
body of 70,000–80,000 titles, a sizeable number of works for its time. As
Alain Giffard noted in hindsight, “the main difficulty for a digitization
program is to choose the books, and to choose the people to choose the
books.”16 Explaining in a conversation with me how he went about this task,
Giffard emphasized that he chose “not librarians but critics, researchers,
etc.” This choice, he underlined, could be made only because the digitization
program was “the last project of the president and a special mission” and thus
not formally a civil service program.17 The work process was thus as follows:

> I asked them to prepare a list. I told them, “Don’t think about what exists.
I ask of you a list of books that would be logical in this concept of a
library of France.” I had the first list and we showed it to the national
library, which was always fighting internally. So I told them, “I want this
book to be digitized.” But they would never give it to us because of
territory. Their ship was not my ship. So I said to them, “If you don’t give
me the books I shall buy the books.” They said I could never buy them, but


e I earned a lot
of money at that time. So in the end I had a lot of books. And I said to them,
“If you want the books digitized you must give me the books.” But of the
80,000 books that were digitized, half were not in the collection. I used the
staff’s garages for the books, 80,000 books. It is an incredible story.18

Incredible indeed. And a wonderful anecdote that makes clear that mass
digitization, rather than being just a technical challenge, is also a
politically contingent process that raises fundamental questions of territory
(institutional as well as national), materiality, and culture. The integration
of the digital _très grande bibliothèque_ into the French national mass
digitization project Gallica, later in 1997, also foregrounds the
infrastructural trajectory of early national digitization programs into later
glocal initiatives.19

The question of pan-national digitization programs was precisely at the
forefront of another early prominent mass digitization project, namely the
Universal Digital Library (UDL), which was launched in 1995 by Carnegie Mellon
computer scientist Raj Reddy and developed by linguist Jaime Carbonell,
physicist Michael Shamos, and Carnegie Mellon Foundation dean of libraries
Gloriana St. Clair. In 1998, the project launched the Thousand Book Project.
Later, the UDL scaled its initial efforts up to the Million Book Project,
which they successfully completed in 2007.20 Organizationally, the UDL stood
out from many of the other digitization projects by including initial
participation from three non-Western entities in addition to the Carnegie
Mellon Foundation—the governments of India, China, and Egypt.21 Indeed, India
and China invested about $10 million in the initial phase, employing several
hundred people to find books, bring them in, and take them back. While the
project ambitiously aimed to provide access “to all human knowledge, anytime,
anywhere,” it ended its scanning activities in 2008. As such, the Universal
Digital Library points to another central infrastructural dimension of mass
digitization: its highly contingent spatio-temporal configurations that are
often posed in direct contradistinction to the universalizing discourse of
mass digitization. Across the board, mass digitization projects, while
confining themselves in practice to a limited target of how many books they
will digitize, employ a discourse of universality, perhaps alluding vaguely to
how long such an endeavor will take but in highly uncertain terms (see
chapters 3 and 5 in particular).

No exception to the universalizing discourse, another highly significant
mass digitization project, the Internet Archive, emerged around the same time
as the Universal Digital Library. The Internet Archive was founded by open
access activist and computer engineer Brewster Kahle in 1996, and although it
was primarily oriented toward preserving born-digital material, in particular
the Internet (_Wired_ calls Brewster Kahle “the Internet’s de facto
librarian”22), the Archive also began digitizing books in 2005, supported by
a grant from the Alfred Sloan Foundation. Later that year, the Internet
Archive created the infrastructural initiative, Open Content Alliance (OCA),
and was now embedded in an infrastructure that included over 30 major US
libraries, as well as major search engines (by Yahoo! and Microsoft),
technology companies (Adobe and Xerox), a commercial publisher (O’Reilly
Media, Inc.), and a not-for-profit membership organization of more than 150
institutions, including universities, research libraries, archives, museums,
and historical societies.23 The Internet Archive’s mass digitization
infrastructure was thus from the beginning a mesh of public and private
cooperation, where libraries made their collections available to the Alliance
for scanning, and corporate sponsors or the Internet Archive conversely funded
the digitization processes. As such, the infrastructures of the Internet
Archive and Google Books were rather similar in their set-ups.24 Nevertheless,
the initiative of the Internet Archive’s mass digitization project and its
attendant infrastructural alliance, OCA, should be read as both a technical
infrastructure responding to the question of _how_ to mass digitize in
technical terms, and as an infrapolitical reaction in response to the forces
of the commercial world that were beginning to gather around mass
digitization, such as Amazon25 and Google. The Internet Archive thus
positioned itself as a transparent open source alternative to the closed doors
of corporate and commercial initiatives. Yet, as Kalev Leetaru notes, the case
was more complex than that. Indeed, while the OCA was often foregrounded as
more transparent than Google, their technical infrastructural components and
practices were in fact often just as shrouded in secrecy.26 As such, the
Internet Archive and the OCA draw attention to the important infrapolitical
question in mass digitization, namely how, why, and when to manage
visibilities in mass digitization projects.

Although the media sometimes picked up stories on mass digitization projects
already outlined, it wasn’t until Google entered the scene that mass
digitization became a headline-grabbing enterprise. In 2004, Google founders
Larry Page and Sergey Brin traveled to Frankfurt to make a rare appearance at
the Frankfurt Book Fair. Google was at that time still considered a “scrappy”
Internet company in some quarters, as compared with tech giants such as
Microsoft.27 Yet Page and Brin went to Frankfurt to deliver a monumental
announcement: Google would launch a ten-year plan to make available
approximately 15 million digitized books, both in- and out-of-copyright
works.28 They baptized the program “Google Print,” a project that consisted of
a series of partnerships between Google and five English-language libraries:
the University of Michigan at Ann Arbor, Stanford, Harvard, Oxford (Bodleian
Library), and the New York Public Library. While Page and Brin’s
announcement was surprising to some, many had anticipated it; as already
noted, advances toward mass digitization proper had already been made, and
some of the partnership institutions had been negotiating with Google since
2002.29 As with many of the previous mass digitization projects, Google found
inspiration for their digitization project in the long-lived utopian ideal of
the universal library, and in particular the mythic library of Alexandria.30
As with other Google endeavors, it seemed that Page was intent on realizing a
utopian ideal that scholars (and others) had long dreamed of: a library
containing everything ever written. It would be realized, however, not with
traditional human-centered means drawn from the world of libraries, but rather
with an AI approach. Google Books would exceed human constraints, taking the
seemingly impossible vision of digitizing all the books in the world as a
starting point for constructing an omniscient Artificial Intelligence that
would know the entire human symbol system and all


e constraints were physical (how to digitize and organize
all this knowledge in physical form); legal (how to do it in a way that
suspends existing regulation); and political (how to transgress territorial
systems). The invocation of the notion of the universal library was not a
neutral action. Rather, the image of Google Books as a library worked as a
symbolic form in a cultural scheme that situated Google as a utopian, and even
ethical, idealist project. Google Books seemingly existed by virtue of
Goethe’s famous maxim that “To live in the ideal world is to treat the
impossible as if it were possible.”31 At the time, the industry magazine
_Bookseller_ wrote in response to Google’s digitization plans: “The prospect
is both thrilling and frightening for the book industry, raising a host of
technical and theoretical issues.”32 And indeed, while some reacted with
enthusiasm and relief to the prospect of an organization being willing to
suffer the cost of mass digitization, others expressed economic and ethical
concerns. The Authors Guild, a New York–based association, promptly filed a
copyright infringement suit against Google. And librarians were forced to
revisit core ethical principles such as privacy and public access.

The controversies of Google Books initially played out only in US territory.
However, another set of concerns of a more territorial and political nature
soon came to light. The French President at the time, Jacques Chirac, called
France to cultural-political arms, urging his culture minister, Renaud
Donnedieu de Vabres, and Jean-Noël Jeanneney, then-head of France’s
Bibliothèque nationale, to do the same with French texts as Google planned to
do with their partner libraries, but by means of a French search engine.33
Jeanneney initially framed this French cultural-political endeavor as a
European “contre-attaque” against Google Books, which, according to Jeanneney,
could pose “une domination écrasante de l'Amérique dans la définition de
l'idée que les prochaines générations se feront du monde” (“a crushing
American domination of the formation of future generations’ ideas about the
world”).34 Other French officials insisted that the French digitization project
should be seen not primarily as a cultural-political reaction _against_
Google, but rather as a cultural-political incentive within France and Europe
to make European information available online. “I really stress that it's not
anti-American,” an official at France’s Ministry of Culture and Communication,
speaking on the condition of anonymity, noted in an interview. “It is not a
reaction. The objective is to make more material relevant to European heritage
available. … Everybody is working on digitization projects.” Furthermore, the
official did not rule out potential cooperation between Google and the
European project.35 There was no doubt, however, that the move to mass
digitization “was a political drive by the French,” as Stephen Bury, head of
European and American collections at the British Library, emphasized.36

Despite its mixed messages, the French reaction nevertheless underscored the
controversial nature of mass digitization as a symbolic, as well as technical,
aspiration: mass digitization was a process that did not merely scan and
represent books neutrally but could also produce a new mode of world-making,
actively structuring archives as well as their users.37 Now questions began to
surface about where, or with whom, to place governance over this new archive:
who would be the custodian of the keys to this new library? And who would be
the librarians? A series of related questions could also be asked: who would
determine the archival limits, the relations between the secret and the non-
secret or the private and the public, and whether these might involve property
or access rights, publication or reproduction rights, classification, and
putting into order? France soon managed to rally other EU countries (Spain,
Poland, Hungary, Italy, and Germany) to back its recommendation to the
European Commission (EC) to construct a European alternative to Google’s
search engine and archive and to set this out in writing. Occasioned by the
French recommendation, the EC promptly adopted the idea of Europeana—the name
of the proposed alternative—as a “flagship project” for the budding EU
cultural policy.38 Soon after, in 2008, the EC launched Europeana, giving
access to some 4.5 million digital objects from more than 1,000 institutions.

Europeana’s Europeanizing discourse presents a territorializing approach to
mass digitization that stands in contrast to the more universalizing tone of
Mundaneum, Gutenberg, Google Books, and the Universal Digital Library. As
such, it ties in with our final examples, namely the sovereign mass
digitization projects that have in fact always been one of the primary drivers
in mass digitization efforts. To this day, the map of mass digitization is
populated with sovereign mass digitization efforts from Holland and Norway to
France and the United States. One of the most impressive projects is the
Norwegian mass digitization project at the National Library of Norway, which
since 2004 has worked systematically to develop a digital National Library
that encompasses text, audio, video, image, and websites. Impressively, the
National Library of Norway offers digital library services that provide online
access (to all with a Norwegian IP address) to full-text versions of all books
published in Norway up until the year 2001, access to digital newspaper
collections from the major national and regional newspapers in all libraries
in the country, and opportunities for everyone with Internet access to search
and listen to more than 40,000 radio programs recorded between 1933 and the
present day.39 Another ambitious national mass digitization project is the
Dutch National Library’s effort to digitize all printed publications since
1470 and to create a National Platform for Digital Publications, which is to
act both as a content delivery platform for its mass digitization output and
as a national aggregator for publications. To this end, the Dutch National
Library made deals with Google Books and Proquest to digitize 42 million pages
just as it entered into partnerships with cross-domain aggregators such as
Europeana.40 Finally, it is imperative to mention the Digital Public Library
of America (DPLA), a national digital library conceived of in 2010 and
launched in 2013, which aggregates digital collections of metadata from around
the United States, pulling in content from large institutions like the
National Archives and Records Administration and HathiTrust, as well as from
smaller archives. The DPLA is in great part the fruit of the intellectual work
of Har


hich consisted of influential names from the
digital, legal, and library worlds, such as Robert Darnton, Maura Marx, and
John Palfrey from Harvard University; Paul Courant of the University of
Michigan; Carla Hayden, then of Baltimore’s Enoch Pratt Free Library and
subsequently the Librarian of Congress; Brewster Kahle; Jerome McGann; Amy
Ryan of the Boston Public Library; and Doron Weber of the Sloan Foundation.
Key figures in the DPLA have often to great rhetorical effect positioned DPLA
vis-à-vis Google Books, partly as a question of public versus private
infrastructures.41 Yet, as the then-Chairman of DPLA John Palfrey conceded,
the question of what constitutes “public” in a mass digitization context
remains a critical issue: “The Digital Public Library of America has its
critics. One counterargument is that investments in digital infrastructures at
scale will undermine support for the traditional and the local. As the
chairman of the DPLA, I hear this critique in the question-and-answer period
of nearly every presentation I give. … The concern is that support for the
DPLA will undercut already eroding support for small, local public
libraries.”42 While Palfrey offers good arguments for why the DPLA could
easily work in unison with, rather than jeopardize, smaller public libraries,
and while the DPLA is building infrastructures to support this claim,43 the
discussion nevertheless highlights the difficulties with determining when
something is “public,” and even national.

While the highly publicized and institutionalized projects I have just
recounted have taken center stage in the early and later years of mass
digitization, they constitute neither the full cast nor the whole machinery
of mass digitization assemblages. Indeed, as chapter 4 in this book charts, at
the margins of mass digitization another set of actors has been at work
building new digital cultural memory assemblages, including projects such as
Monoskop and Lib.ru. These actors, referred to in this book as shadow library
projects (see chapter 4), at once challenge and confirm the broader
infrapolitical dimensions of mass digitization, including its logics of
digital capitalism, network power, and territorial reconfigurations of
cultural memory between universalizing and glocalizing discourses. Within this
new “ecosystem of access,” unauthorized archives such as Libgen, Gigapedia, and
Sci-Hub have successfully built “shadow libraries” with global reach,
containing massive aggregations of downloadable text material of both
scholarly and fictional character.44 As chapter 4 shows, these initiatives
further challenge our notions of public good, licit and illicit mass
digitization, and the territorial borders of mass digitization, just as they
add another layer of complexity to the question of the politics of mass
digitization.

Today, then, the landscape of mass digitization has evolved considerably, and
we can now begin to make out the political contours that have shaped, and
continue to shape, the emergent contemporary knowledge infrastructures of mass
digitization, rife as they are with contestation, cooperation, and
competition. From this perspective, mass digitization appears as a preeminent
example of how knowledge politics are configured in today’s world of
“assemblages” as “multisited, transboundary networks” that connect
subnational, national, supranational, and global infrastructures and actors,
without, however, necessarily doing so through formal interstate systems.45 We
can also see that mass digitization projects did not arise as a result of a
sovereign decision, but rather emerged through a series of contingencies
shaped by late-capitalist and late-sovereign forces. Furthermore, mass
digitization presents us with an entirely new cultural memory paradigm—a
paradigm that requires a shift in thinking about cultural works, collections,
and contexts, from cultural records to be preserved and read by humans, to
ephemeral machine-readable entities. This change requires a shift in thinking
about the economy of cultural works, collections, and contexts, from scarce
institutional objects to ubiquitous flexible information. Finally, it requires
a shift from thinking about these same issues as belonging to national-global
domains to conceiving of them in terms of a set of political processes that may
well be placed in national settings, but are oriented toward global agendas
and systems.

## Interrogating Mass Digitization

Mass digitization is often elastic in definition and elusive in practice.
Concrete attempts have been made to delimit what mass digitization is, but
these rarely go into specifics. The two characteristics most commonly
associated with mass digitization are the relative lack of selectivity of
materials, as compared to smaller-scale digitization projects, and the high
speed and high volume of the process in terms of both digital conversion and
metadata creation, which are made possible through a high level of
automation.46 Mass digitization is thus concerned not only with preservation,
but also with what kind of knowledge practices and values technology allows
for and encourages, for example, in relation to de- and recontextualization,
automation, and scale.47

Studies of mass digitization are commonly oriented toward technology or
information policy issues close to libraries, such as copyright, the quality
of digital imagery, long-term preservation responsibility, standards and
interoperability, and economic models for libraries, publishers, and
booksellers, rather than, as here, the exploration of theory.48 This is not to
say that existing work on mass digitization is not informed by theoretical
considerations, but rather that the majority of research emphasizes policy and
technical implementation at the expense of a more fundamental understanding of
the cultural implications of mass digitization. In part, the reason for this
is the relative novelty of mass digitization as an identifiable field of
practice and policy, and its significant ramifications in the fields of law
and information science.49 In addition to scholarly elucidations, mass
digitization has also given rise to more ideologically fuelled critical books
and articles on the topic.50

Despite its disciplinary branching, work on mass digitization has mainly taken
place in the fields of information science, law, and computer science, and has
primarily problematized the “hows” of mass digitization and not the “whys.”51
As with technical work on mass digitization, most nontechnical studies of mass
digitization are “problem-solving” rather than “critical,” and this applies in
particular to work originating from within the policy analysis community. This
body seeks to solve problems within the existing social order—for example,
copyright or metadata—rather than to interrogate the assumptions that underlie
mass digitization programs, which would include asking what kinds of knowledge
production mass digitization gives rise to. How does mass digitization change
the ideological infrastructures of cultural heritage institutions? And from
what political context does the urge to digitize on an industrial scale
emerge? While the technical and problem-solving corpus on mass digitization is
highly valuable in terms of outlining the most important stakeholders and
technical issues of the field, it does not provide insight into the deeper
structures, social mechanisms, and political implications of mass
digitization. Moreover, it often fails to account for digitization as a force
that is deeply entwined with other dynamics that shape its development and
uses. It is this lack that the present volume seeks to mitigate.

## Assembling Mass Digitization

Mass digitization is a composite and fluctuating infrastructure of
disciplines, interests, and forces rooted in public-private assemblages,
driven by ideas of value extraction and distribution, and supported by new
forms of social organization. Google Books, for instance, is both a commercial
project covered by nondisclosure agreements _and_ an academic scholarly
project open for all to see. Similarly, Europeana is both a public
digitization project directed at “citizens” _and_ a public-private partnership
enterprise rife with profit motives. Nevertheless, while it is tempting to
speak about specific mass digitization projects such as Google Books and
Europeana in monolithic and contrastive terms, mass digitization projects are
anything but tightly organized, institutionally delineated, coherent wholes
that produce one dominant reading. We do not find one “essence” in mass
digitized archives. They are not “enlightenment projects,” “library services,”
“software applications,” “interfaces,” or “corporations.” Nor are they rooted
in one central location or single ideology. Rather, mass digitization is a
complex material and social infrastructure performed by a diverse
constellation of cultural memory professionals, computer scientists,
information specialists, policy personnel, politicians, scanners, and
scholars. Hence, this volume approaches mass digitization projects as
“assemblages,” that is, as contingent arrangements consisting of humans,
machines, objects, subjects, spaces and places, habits, norms, laws, politics,
and so on. These arrangements cross national-global and public-private lines,
producing what this volume calls “late-sovereign,” “posthuman,” and “late-
capitalist” assemblages.

To give an example, we can look at how the national and global aspects of
cultural memory institutions change with mass digitization. The national
museums and libraries we frequent today were largely erected during eras of
high nationalism, as supreme acts of cultural and national territoriality.
“The early establishment of a national collection,” as Belinda Tiffen notes,
“was an important step in the birth of the new nation,” since it signified
“the legitimacy of the nation as a political and cultural entity with its own
heritage and culture worthy of being recorded and preserved.”52 Today, as the
initial French incentive to build Europeana shows, we find similar
nationalization processes in mass digitization projects. However,
nationalizing a digital collection often remains more a performative gesture than a
practical feat, partly because the information environment in the digital
sphere differs significantly from that of the analog world in terms of
territory and materiality, and partly because the dichotomy between national
and global, an agreed-upon construction for centuries, is becoming more and
more difficult to uphold in theory and practice.53 Thus, both Google Books and
Europeana link to sovereign frameworks such as citizens and national
representation, while also undermining them with late-capitalist transnational
economic agreements.

A related example is the posthuman aspect of cultural memory politics.
Cultural memory artifacts have always been thought of as profoundly human
collections, in the sense that they were created by and for human minds and
human meaning-making. Previously, humans also organized collections. But with
the invention of computers, most cultural memory institutions also introduced
a machine element to the management of accelerating amounts of information,
such as computerized catalog systems and recollection systems. With the advent
of mass digitization, machines have gained a whole new role in the cultural
memory ecosystem, not only as managers, but also as interpreters. Thus,
collections are increasingly digitized to be read by machines instead of
humans, just as metadata is now becoming a question of machine analysis rather
than of human contextualization. Machines are taking on more and more tasks in
the realm of cultural memory that require a substantial amount of cognitive
insight (just as mass digitization has created the need for new robot-like,
and often poorly paid, human tasks, such as the monotonous work of book
scanning). Mass digitization has thereby given rise to an entirely new
cultural-legal category titled “non-consumptive research,” a term used to
describe the large-scale analysis of texts, and which has been formalized by
the Google Books Settlement, for instance, in the following way: “research in
which computational analysis is performed on one or more books, but not
research in which a researcher reads or displays.”54

Lastly, mass digitization connects the politics of cultural memory to
transnational late capitalism, and to one of its expressions in particular:
digital capitalism.55 Of course, cultural memory collections have a long
history with capitalism. The nineteenth century held very fuzzy boundaries
between the cultural functions of libraries and the commercial interests that
surrounded them, and, as historian of libraries Francis Miksa notes, Melvil
Dewey, inventor of the Dewey Decimal System, was a great admirer of the
corporate ideal, and was eager to apply it to the library system.56 Indeed,
library development in the United States was greatly advanced by the
philanthropy of capitalism, most notably by Andrew Carnegie.57 The question,
then, is not so much whether mass digitization has brought cultural memory
institutions, and their collections and users, into a capitalist system, but
_what kind_ of capitalist system mass digitization has introduced cultural
memory to: digital capitalism.

Today, elements of the politics of cultural memory are being reassembled into
novel knowledge configurations. As a consequence, their connections and
conjugations are being transformed, as are their institutional embeddings.
Indeed, mass digitization assemblages are a product of our time. They are new
forms of knowledge institutions arising from a sociopolitical environment
where vertical territorial hierarchies and horizontal networks entwine in a
new political mesh: where solid things melt into air, and clouds materialize
as material infrastructures, where boundaries between experts and laypeople
disintegrate, and where machine cognition operates on a par with human
cognition on an increasingly large scale. These assemblages enable new types
of political actors—networked assemblages—which hold particular forms of power
despite their informality vis-à-vis the formal political system; and in turn,
through their practices, these acto


his time, the stable structures of
modernist institutions began to give ground to postmodern forces: sovereign
systems entered into supra-, trans-, and international structures,
“globalization” became a buzzword, and privatizing initiatives drove wedges
into the foundations of state structures. The centralized power exercised by
disciplinary institutions was increasingly distributed along more and more
lines, weakening the walls of circumscribed centralized authority.60 This
disciplinary decomposition took place on all levels and across all fields of
society, including institutional cultural memory containers such as libraries
and museums. The forces of privatization, globalization, and digitization put
pressures not only on the authority of these institutions but also on a host
of related authoritative cultural memory elements, such as “librarians,”
“cultural works,” and “taxonomies,” and cultural memory practices such as
“curating,” “reading,” and “ownership.” Librarians were “disintermediated” by
technology, cultural works fragmented into flexible data, and curatorial
principles were revised and restructured just as reading was now beginning to
take place in front of screens, meaning-making to be performed by machines,
and ownership of works to be substituted by contractual renewals.

Thinking about mass digitization as an “assemblage” allows us to abandon the
image of a circumscribed entity in favor of approaching it as an aggregate of
many highly varied components and their contingent connections: scanners,
servers, reading devices, cables, algorithms; national, EU, and US
policymakers; corporate CEOs and employees; cultural heritage professionals
and laypeople; software developers, engineers, lobby organizations, and
unsalaried labor; legal settlements, academic conferences, position papers,
and so on. It gives us pause—every time we say “Google” or “Europeana,” we
might reflect on what we actually mean. Does the researcher employed by a
university library and working with Google Books also belong to Google Books?
Do the underpaid scanners? Do the users of Google? Or, when we refer to Google
Books, do we rather only mean to include the founders and CEOs of Google? Or
has Google in fact become a metaphor that expresses certain characteristics of
our time? The present volume suggests that all these components enter into the
new phenomenon of mass digitization and produce a new field of potentiality,
while at the same time they retain their original qualities and value systems,
at least to some extent. No assemblage is whole and imperturbable, nor
entirely reducible to its parts, but is simultaneously an accumulation of
smaller assemblages and a member of larger ones.61 Thus Google Books, for
example, is both an aggregation of smaller assemblages such as university
libraries, scanners (both humans and machines), and books, _and_ a member of
larger assemblages such as Google, Silicon Valley, neoliberal lobbies, and the
Internet, to name but a few.

While representations of assemblages such as the analyses performed in this
volume are always doomed to misrepresent empirical reality on some level, this
approach nevertheless provides a tool for grasping at least some of mass
digitization’s internal heterogeneity, and the mechanisms and processes that
enable each project’s continued assembled existence. The concept of the
assemblage allows us to grasp mass digitization as composed of ephemeral
projects that are uncertain by nature, and sometimes even made up of
contradictory components.62 It also allows us to recognize that they are more
than mere networks: while ephemeral and networked, something enables them to
cohere. Bruno Latour writes, “Groups are not silent things, but rather the
provisional product of a constant uproar made by the millions of contradictory
voices about what is a group and who pertains to what.”63 It is the “taming
and constraining of this multivocality,” in particular by communities of
knowledge and everyday practices, that enables something like mass
digitization to cohere as an assemblage.64 This book is, among other things,
about those communities and practices, and the politics they produce and are
produced by. In particular, it addresses the politics of mass digitization as
an infrapolitical activity that retreats into, and emanates from, digital
infrastructures and the network effects they produce.

## Politics in Mass Digitization: Infrastructure and Infrapolitics

If the concept of “assemblage” allows us to see the relational set-up of mass
digitization, it also allows us to inquire into its political infrastructures.
In political terms, assemblage thinking is partly driven by dissatisfaction
with state-centric dominant ontologies, including reified units such as state,
society, or capitalism, and the unilinear focus on state-centric politics over
other forms of politics.65 The assemblage perspective is therefore especially
useful for understanding the politics of late-sovereign and late-capitalist
data projects such as mass digitization. As we will see in part 2, the
epistemic frame of sovereignty continues to offer an organizing frame for the
constitution and regulation of mass digitization and the virtues associated
with it (such as national representation and citizen engagement). However, at
the same time, mass digitization projects are in direct correspondence with
neoliberal values such as privatization, consumerism, globalization, and
acceleration, and their technological features allow for a complete
restructuring of the disciplinary spaces of libraries to form vaster and even
global scales of integration and economic organization on a multinational
stage.

Mass digitization is a concrete example of what cultural memory projects look
like in a “late-sovereign” age, where globalization tests the political and
symbolic authority of sovereign cultural memory politics to its limits, while
sovereignty as an epistemic organizing principle for the politics of cultural
memory nonetheless persists.66 The politics of cultural memory, in particular
those practiced by cultural heritage institutions, often still cling to fixed
sovereign taxonomies and epistemic frameworks. This focus is partly determined
by their institutional anchoring in the framework of national cultural
policies. In mass digitization, however, the formal political apparatus of
cultural heritage institutions is adjoined by a politics that plays out in the
margins: in lobbies, software industries, universities, social media, etc.
Those evaluating mass digitization assemblages in macropolitical terms, that
is, those who are concerned with political categories, will glean little of
the real politics of mass digitization, since such politics at the margins
would escape this analytic matrix.67 Assemblage thinking, by contrast, allows
us to acknowledge the political mechanisms of mass digitization beyond
disciplinary regulatory models, in societies “where forces … not
categories, clash.”68

As Ian Hacking and many others have noted, the capacious usage of the notion
of “politics” threatens to strip the word of meaning.69 But talk of a politics
of mass digitization is no conceptual gimmick, since what is taking place in
the construction and practice of mass digitization assemblages plainly is
political. The question, then, is how best to describe the politics at work in
mass digitization assemblages. The answer advanced by the present volume is to
think of the politics of mass digitization as “infrapolitics.”

The notion of infrapolitics has until now primarily and profoundly been
advanced as a concept of hidden dissent or contestation (Scott, 1990).70 This
volume suggests shifting the lens to focus on a different kind of
infrapolitics, however, one that not only takes the shape of resistance but
also of maintenance and conformity, since the story of mass digitization is
both the story of contestation _and_ the politics of mundane and standard-
seeking practices.71 The infrapolitics of mass digitization is, then, a kind
of politics “premised not on a subject, but on the infra,” that is, the
“underlying rules of the world,” organized around glocal infrastructures.72
The infrapolitics of mass digitization is the building and living of
infrastructures, both as spaces of contestation and processes of
naturalization.

Geoffrey Bowker and Susan Leigh Star have argued that the establishment of
standards, categories, and infrastructures “should be recognized as the
significant site of political and ethical work that they are.”73 This applies
not least in the construction and development of knowledge infrastructures
such as mass digitization assemblages, structures that are upheld by
increasingly complex sets of protocols and standards. Attaching “politics” to
“infrastructure” endows the term—and hence mass digitization under this
rubric—with a distinct organizational form that connects various stages and
levels of politics, as well as a distinct temporality that relates mass
digitization to the forces and ideas of industrialization and globalization.

The notion of infrastructure has a surprisingly brief etymology. It first
entered the French language in 1875 in relation to the excavation of
railways.74 Over the following decades, it primarily designated fixed
installations designed to facilitate and foster mobility. It did not enter
English vocabulary until 1927, and as late as 1951, the word was still
described by English sources as “new” (OED).75 When NATO adopted the term in
the 1950s, it gained a military tinge. Since then, “infrastructure” has
proliferated into ever more contexts and disciplines, becoming a “plastic
word”76 often used to signify any vital and widely shared human-constructed
resource.77

What makes infrastructures central for understanding the politics of mass
digitization? Primarily, they are crucial to understanding how industrialism
has affected the ways in which we organize and engage with knowledge, but the
politics of infrastructures are also becoming increasingly significant in the
late-sovereign, late-capitalist landscape.

The infrastructures of mass digitization mediate, combine, connect, and
converge upon different institutions, social networks, and devices, augmenting
the actors that take part in them with new agential possibilities by expanding
the radius of their action, strengthening and prolonging the reach of their
performance, and setting them free for other activities through their
accelerating effects, time often reinvested in other infrastructures, such as,
for instance, social media activities. The infrastructures of mass
digitization also increase the demand for globalization and mobility, since
they expand the radius of using/reading/working.

The infrastructures of mass digitization are thus media of polities and
politics, at times visible and at others barely legible or felt, and home to
both dissent and standardizing measures. These include legal
infrastructures such as copyright, privacy, and trade law; material
infrastructures such as books, wires, scanners, screens, server parks, and
shelving systems; disciplinary infrastructures such as metadata, knowledge
organization, and standards; cultural infrastructures such as algorithms,
searching, reading, and downloading; and societal infrastructures such as the
realms of the public and private, national and global. These infrastructures
are, depending on the context, both the prerequisites for and the results of interactions
between the spatial, temporal, and social classes that take part in the
construction of mass digitization. The infrapolitics of mass digitization is
thus geared toward both interoperability and standardization, as well as
toward variation.78

Often when thinking of infrastructures, we conceive of them in terms of
durability and stability. Yet, while some infrastructures, such as railways
and Internet cables, are fairly solid and rigid constructions, others—such as
semantic links, time-limited contracts, and research projects—are more
contingent entities which operate not as “fully coherent, deliberately
engineered, end-to-end processes,” but rather as morphous contingent
assemblages, as “ecologies or complex adaptive systems” consisting of
“numerous systems, each with unique origins and goals, which are made to
interoperate by means of standards, socket layers, social practices, norms,
and individual behaviors that smooth out the connections among them.”79 This
contingency has direct implications for infrapolitics, which become equally
flexible and adaptive. These characteristics endow mass digitization
infrastructures with vulnerabilities but also with tremendous cultural power,
allowing them to distribute agency, and to create and facilitate new forms of
sociality and culture.

Building mass digitization infrastructures is a costly endeavor, and hence
mass digitization infrastructures are often backed by public-private
partnerships. Indeed infrastructures—and mass digitization infrastructures are
no exceptions—are often so costly that a certain mixture of political or
individual megalomania, state reach, and private capital is present in their
construction.80 This mixed foundation means that a lot of the political
decisions regarding mass digitization literally take place _beneath_ the radar
of “the representative institutions of the political system of nation-states,”
while also more or less aggressively filling out “gaps” in nation-state
systems, and even creating transnational zones with their own policies.81
Hence the notion of “infra”: the infrapolitics of mass digitization hover at a
frequency that lies _below_ and beyond formal sovereign state apparatus,
organized, as they are, around glocal—and often private or privatized—material
and social infrastructures.

While distinct from the formalized sovereign political system, infrapolitical
assemblages nevertheless often perform as late-sovereign actors by engaging in
various forms of “sovereignty games.”82 Take Google, for instance, a private
corporation that often defines itself as at odds with state practice, yet also
often more or less informally meets with state leaders, engages in diplomatic
discussions, and enters into agreements with state agencies and local
political councils. The infrapolitical


rapolitical forces can on the other hand also squeeze the life out of
existing parliamentary ways, promoting instead various forms of apolitical or
libertarian modes of life. The infrapolitical apparatus thus stands apart from
more formalized politics, not only in terms of political arena, but also in terms of the
constraints that are placed upon it in the form, for instance, of public
accountability.83 What is described here can in general terms be called the
infrapolitics of neoliberalism, whose scenery consists of lobby rooms, policy-
making headquarters, financial zones, public-private spheres, and is populated
by lobbyists, bureaucrats, lawyers, and CEOs.

But the infrapolitical dynamics of mass digitization also operate in more
mundane and less obvious settings, such as software design offices and
standardization agencies, and are enacted by engineers, statisticians,
designers, and even users. Infrastructures are—increasingly—essential parts of
our everyday lives, not only in mass digitization contexts, but in all walks
of life, from file formats and software programs to converging transportation
systems, payment systems, and knowledge infrastructures. Yet, what is most
significant about the majority of infrapolitical institutions is that they are
so mundane; if we notice them at all, they appear to us as boring “lists of
numbers and technical specifications.”84 And their maintenance and
construction often occur “behind the scenes.”85 There is a politics to these
naturalizing processes, since they influence and frame our moral, scientific,
and aesthetic choices. This is to say that these kinds of infrapolitical
activities often retire or withdraw into a kind of self-evidence in which the
values, choices, and influences of infrastructures are taken for granted and
accorded a kind of obviousness, which is universally accepted. It is therefore
all the more “politically and ethically crucial”86 to recognize the
infrapolitics of mass digitization, not only as contestation and privatized
power games, but also as a mode of existence that values professionalized
standardization measures and mundane routines, not least because these
infrapolitical modes of existence often outlast their material circumstances
(“software outlasts hardware” as John Durham Peters notes).87 In sum,
infrastructures and the infrapolitics they produce yield subtle but
significant world-making powers.

## Power in Mass Digitization

If mass digitization is a product of a particular social configuration and
political infrastructure, it is also, ultimately, a site and an instrument of
power. In a sense, mass digitization is an event that stages a fundamental
confrontation between state and corporate power, while pointing to the
reconfigurations of both as they become increasingly embedded in digital
infrastructures. For instance, such confrontation takes place at the
negotiating table, where cultural heritage directors face the seductive and
awe-inspiring riches of Silicon Valley, as well as its overwhelmingly
intricate contractual layouts and its intimidating entourage of lawyers.
Confrontation also takes place at the level of infrastructural ideology, in
the meeting between twentieth-century standardization ideals and the playful
and flexible network dynamics of the twenty-first century, as seen for
instance in the conjunction of institutionally fixed taxonomies and
algorithmic retrieval systems that include feedback mechanisms. And it takes
place at the level of users, as they experience a gain in some powers and the
loss of others in their identity transition from national patrons of cultural
memory institutions to globalized users of mass digitization assemblages.

These transformations are partly the results of society’s increasing reliance
on network power and its effects. Political theorists Michael Hardt and
Antonio Negri suggested almost two decades ago that, among other things, global
digital systems enabled a shift in power infrastructures from robust national
economies and core industrial sectors to interactive networks and flexible
accumulation, creating a “form of network power, which requires the wide
collaboration of dominant nation-states, major corporations, supra-national
economic and political institutions, various NGOs, media conglomerates and a
series of other powers.”88 From this landscape, according to their argum


diagnosis was one of several similar
arguments across the political spectrum that were formed within such a short
interval that “the network” arguably became the “defining concept of our
epoch.”89 Within this new epoch, the old centralized blocs of power crumbled
to make room for new forms of decentralized “bastard” power phenomena, such as
the extensive corporate/state mass surveillance systems revealed by Edward
Snowden and others, and new forms of human rights such as “the right to be
forgotten,” a right for which a more appropriate name would be “the right to
not be found by Google.”90 Network power and network effects are therefore
central to understanding how mass digitization assemblages operate, and why
some mass digitization assemblages are more powerful than others.

The power dynamics we find in Google Books, for instance, are directly related
to the ways in which digital technologies harness network effects: the power
of Google Books grows exponentially as its network expands.91 Indeed, as Siva
Vaidhyanathan noted in his critical work on Google’s role in society, what he
referred to as the “Googlization of books” was ultimately deeply intertwined
with the “Googlization of everything.”92 The networks of Google thus weren’t
external to the success and the challenges of Google, but deeply endemic
to them, from portals and ranking systems to anchoring (elite) institutions, and
so on. The better Goo


ically by a
mixture of all of the above, than by sheer quality.93 This explains not only
the success of Google Books, but also its traction with even its critics:
although Google Books was initially criticized heavily for its poor imagery
and faulty metadata,94 its possible harmful impact on the public sphere,95 and
later, over privacy concerns,96 it had already created a power hub to which,
although they could have navigated around it, masses of people were
nevertheless increasingly drawn.

Network power is endemic not only to concrete digital networks, but also to
globalization at large as a process that simultaneously gives rise to feelings
of freedom of choice and loss of choice.97 Mass digitization assemblages, and
their globalization of knowledge infrastructures, thus crystalize the more
general tendencies of globalization as a process in which people participate
by choice, but not necessarily voluntarily; one in which we are increasingly
pushed into a game of social coordination, where common standards allow more
effective coordination yet also entrap us in their pull for convergence.
Standardization is therefore a key technique of network power: on the one
hand, standardization is linked with globalization (and various neoliberal
regimes) and the attendant widespread contraction of the state, while on the
other hand, standardization implies a reconfiguration of everyday life.98
Stan


on
could even be said to be habit forming: through standardization, “inventions
become commonplace, novelties become mundane, and the local becomes
universal.”100

To be sure, standardization has long been a crucial tool of world-making
power, spanning both the early and late-capitalist eras.101 “Standard time,”
as John Durham Peters notes, “is a sine qua non for international
capitalism.”102 Without the standardized infrastructure of time there would be
no global transportation networks, no global trade channels, and no global
communication networks. Indeed, globalization is premised on standardization
processes.

What kind of standardization processes do we find, then, in mass digitization
assemblages? Internet use alone involves direct engagement with hundreds of
global standards, from Bluetooth to Wi-Fi standards, and from protocol standards
such as HTTP to file standards such as Word and MP4.103 Moreover, mass
digitization assemblages confront users with a series of additional standards,
from cultural standards of tagging to technical standards of interoperability,
such as the Europeana Data Model (EDM) and Google’s schema.org, or legal
standards such as copyright and privacy regulations. Yet, while these
standards share affinities with the standardization processes of
industrialization, in many respects they also deviate from them. Instead, we
experience in mass digitization “a new form of standardization,”104 in which
differentiation and flexibility gain increasing influence without, however,
dispensing with standardization processes.

Today’s standardization is increasingly coupled with demands for flexibility
and interoperability. Flexibility, as Joyce Kolko has shown, is a term that
gained traction in the 1970s, when it was employed to describe putative
solutions to the problems of Fordism.105 It was seen as an antidote to Fordist
“rigidity”—a serious offense in the neoliberal regime. Thus, while the digital
networks underlying mass digitization are geared toward standardization and
expansion, since “information technology rewards scale, but only to the extent
that practices are standardized,”106 they are also becoming increasingly
flexible, since too-rigid standards hinder network effects, that is, the
growth of additional networks. This is one reason why mass digitization
assemblages increasingly and intentionally break down the so-called “silo”
thinking of cultural memory institutions, and implement standard flexibility
and interoperability to increase their range.107 One area of such
reconfiguration in mass digitization is the taxonomic field, where stable
institutional taxonomic structures are converted to new flexible modes of
knowledge organization like linked data.108 Linked data can connect cultural
memory artifacts as well as metadata in new ways, and the move from a cultural
memory web of interlinked documents to a cultural memory web of interlinked
data can potentially “amplify the impact of the work of libraries and
archives.”109 However, in order to work effectively, linked data demands
standards and shared protocols.

Flexibility allows the user a freer range of actions, and thus potentially
also the possibility of innovation. These affordances often translate into
user freedom or empowerment


are made “fluid” in the sense
that they are divested of clear boundaries and allowed multiple identities,
and in that they enable continuity and dissolution.

While these new flexible standard-setting mechanisms are often localized in
national and subnational settings, they are also globalized systems “oriented
towards global agendas and systems.”111 Indeed, they are “glocal”
configurations with digital networks at their cores. The increasing
significance of these glocal configurations has not only cultural but also
democratic consequences, since they often leave users powerless when it comes
to influencing their cores.112 This more fundamental problematic also pertains
to mass digitization, a phenomenon that operates in an environment that
constructs and encourages less Habermasian public spheres than “relations of
sociability,” from which “aggregate outcomes emerge not from an act of
collective decision-making, but through the accumulation of decentralized,
individual decisions that, taken together, nonetheless conduce to a
circumstance that affects the entire group.”113 For example, despite the
flexibility Google Books allows us in terms of search and correlation, we have
very little sway over its construction, even though we arguably influence its
dynamics. The limitations of our influence on the cores of mass digitization
assemblages have implications not only for how we conceive of institutional
power, but also for our own power within these matrixes.

## Notes

1. Borghi 2012, 420. 2. Latour 2008. 3. For more on this, see Hicks 2018;
Abbate 2012; Ensmenger 2012. In the case of libraries, (white) women still
make up the majority of the workforce, but there is a disproportionate number
of men in senior positions, in comparison with their overall representation;
see, for example, Schonfeld and Sweeney 2017. 4. Meckler 1982. 5. Otlet and
Rayward 1990, chaps. 6 and 15. 6. For a historical and contemporary overview
of some milestones in the use of microfilms in a library context, see Canepi
et al. 2013, specifically “Historic Overview.” See also chap. 10 in Baker
2002. 7. Pfanner 2012. 8. 9. Medak et al.
2016. 10. Michael S. Hart, “The History and Philosophy of Project Gutenberg,”
Project Gutenberg, August 1992.
11. Ibid. 12. 13. Ibid. 14. Bruno Delorme,
“Digitization at the Bibliotheque Nationale De France, Including an Interview
with Bruno Delorme,” _Serials_ 24 (3) (2011): 261–265. 15. Alain Giffard,
“Dilemmas of Digitization in Oxford,” _AlainGiffard’s Weblog_, posted May 29,
2008. 16. Ibid. 17. Author’s interview with Alain Giffard, Paris, 2010.
18. Ibid. 19. Later, in 1997, François Mitterrand demanded that the digitized
books should be brought online, accessible as text from everywhere. This,
then, was what became known as Gallica, the digital library of BnF, which was
launched in 1997. Gallica contains documents primarily out of copyright from
the Middle Ages to the 1930s, with priority given to French-speaking culture,
hosting about 4 million documents. 20. Imerito 2009. 21. Ambati et al. 2006;
Chen 2005. 22. Ryan Singel, “Stop the Google Library, Net’s Librarian Says,”
_Wired_, May 19, 2009. 23. Alfred P. Sloan Foundation, Annual Report,
2006.
24. Leetaru 2008. 25. Amazon was also a major player in the early years of
mass digitization. In 2003 they gave access to a digital archive of more than
120,000 books with the professed goal of adding Amazon’s multimillion-title
catalog in the following years. As with all other mass digitization
initiatives, Jeff Bezos faced a series of copyright and technological
challenges. He met these with legal rhetorical ingenuity and the technical
skills of Udi Manber, who later became the lead engineer with Google; see, for
example, Wolf 2003. 26. Leetaru 2008. 27. John Markoff, “The Coming Search
Wars,” _New York Times_, February 1, 2004. 28.
Google press release, “Google Checks out Library Books,” December 14, 2004.
29. Vise and Malseed 2005, chap. 21. 30. Auletta 2009, 96. 31. Johann Wolfgang
Goethe, _Sprüche in Prosa_


is
to bring something that is complementary, to bring diversity. But this doesn’t
mean that Google is an enemy of diversity.” 36. Chrisafis 2008. 37. Béquet
2009. For more on the political potential of archives, see Foucault 2002;
Derrida 1996; and Tygstrup 2014. 38. “Comme vous soulignez, nos bibliothèques
et nos archives contiennent la mémoire de nos culture européenne et de
société. La numérisation de leur collection—manuscrits, livres, images et
sons—constitue un défi culturel et économique auquel il serait bon que
l’Europe réponde de manière concertée.” (As you point out, our libraries and
archives contain the memory of our European culture and society. Digitization
of their collections—manuscripts, books, images, and sounds—is a cultural and
economic challenge that it would be good for Europe to meet in a concerted
manner.) Manuel Barroso, open letter to Jacques Chirac, July 7, 2005,
[http://www.peps.cfwb.be/index.php?eID=tx_nawsecuredl&u=0&file=fileadmin/sites/numpat/upload/numpat_super_editor/numpat_editor/documents/Europe/Bibliotheques_numeriques/2005.07.07reponse_de_la_Commission_europeenne.pdf&hash=fe7d7c5faf2d7befd0894fd998abffdf101eecf1](http://www.peps.cfwb.be/index.php?eID=tx_nawsecuredl&u=0&file=fileadmin/sites/numpat/upload/numpat_super_editor/numpat_editor/documents/Europe/Bibliotheques_numeriques/2005.07.07reponse_de_la_Commission_europeenne.pdf&hash=fe7d7c5faf2d7befd0894fd998abffdf101eecf1).
39. Jøsevold 2016. 40. Janssen 2011. 41. Robert Darnton, “Google’s Loss: The
Public’s Gain,” _New York Review of Books_, April 28, 2011. 42.
Palfrey 2015, 104. 43. See, for example, DPLA’s Public Library
Partnerships Project. 44. Karaganis 2018. 45. Sassen 2008, 3. 46. Coyle 2006; Borghi
and Karapapa, _Copyright and Mass Digitization_ ; Patra, Kumar, and Pani,
_Progressive Trends in Electronic Resource Management in Libraries_. 47.
Borghi 2012. 48. Beagle et al. 2003; Lavoie and Dempsey 2004; Courant 2006;
Earnshaw and Vince 2007; Rieger 2008; Leetaru 2008; Deegan and Sutherland
2009; Conway 2010; Samuelson 2014. 49. The earliest textual reference to the
mass digitization of books dates to the early 1990s. Richard de Gennaro,
Librarian of Harvard College, in a panel on funding strategies, argued that an
existing preservation program called “brittle books” should take precedence
over other preservation strategies such as mass deacidification; see Sparks,
_A Roundtable on Mass Deacidification_ , 46. Later the word began to attain
the sense we recognize today, as referring to digitization on a large scale.
In 2010 a new word popped up, “ultramass digitization,” a concept used to
describe the efforts of Google vis-à-vis more modest large-scale digitization
projects; see Greene 2010. 50. Kevin Kelly, “Scan This Book!,” _New York
Times_, May 14, 2006; Hall 2008; Darnton 2009;
Palfrey 2015. 51. As Alain Giffard notes, “I am not very confident with the
programs of digitization full of technical and economical considerations, but
curiously silent on the intellectual aspects” (Alain Giffard, “Dilemmas of
Digitization in Oxford,” _AlainGiffard’s Weblog_, posted May 29, 2008).
52. Tiffen 2007, 344. See also Peatling 2004. 53. Sassen 2008. 54.
See _The Authors Guild et al. vs. Google, Inc._ , Amended Settlement Agreement
05 CV 8136, United States District Court, Southern District of New York,
(2009) sec 7(2)(d) (research corpus), sec. 1.91, 14. 55. Informational
capitalism is a variant of late capitalism, which is based on cognitive,
communicative, and cooperative labor. See Christian Fuchs, _Digital Labour and
Karl Marx_ (New York: Routledge, 2014), 135–152. 56. Miksa 1983, 93. 57.
Midbon 1980. 58. Said 19


king classical nudes with porn; and on the other hand, it
allows users and institutions to harness social information about patterns of
use. Linked data has ideological and economic underpinnings as much as
technical ones. 109. _The National Digital Platform: for Libraries, Archives
and Museums_, 2015. 110. Petter Nielsen and Ole Hanseth, “Fluid
Standards. A Case Study of a Norwegian Standard for Mobile Content Services,”
under review.
111. Sassen 2008, 3. 112. Grewal 2008. 113. Ibid., 9.

# II Mapping Mass Digitization

# 2 The Trials, Tribulations, and Transformations of Google Books

## Introduction

In a 2004 article in the cultural theory journal _Critical Inquiry_ , book
historian Roger Chartier argued that the electronic world had created a triple
rupture in the world of text: by providing new techniques for inscribing and
disseminating the written word, by inspiring new relationships with texts, and
by imposing new forms of organization onto them. Indeed, Chartier foresaw that
“the originality and the importance of the digital revolution must therefore
not be underestimated insofar as it forces the contemporary reader to
abandon—consciously or not—the various legacies that formed it.”1 Chartier’s
premonition was inspired by the ripples that digitization was already
spreading across the sea of texts. People were increasingly writing and
distributing electronically, interacting with texts in new ways, and operating
and implementing new textual economies.2 These textual transformations gave
rise to a range of emotional reactions in readers and publishers, from
catastrophizing attitudes and pessimism about “the end of the book” to the
triumphalist mythologizing of liquid virtual books that were shedding their
analog ties like butterflies shedding their cocoons.

The most widely publicized mass digitization project to date, Google Books,
precipitated the entire emotional spectrum that could arise from these textual
transversals: from fears that control over culture was slipping from authors
and publishers into the hands of large tech companies, to hopeful ideas about
the democratizing potential of releasing knowledge that was once locked up in
dusty tomes at places like Harvard and Stanford, and to a utopian
mythologizing of the transcendent potential of mass digitization. Moreover,
Google Books also effected legal and professional transformations of the
infrastructural set-up of the book, creating new precedents and a new
professional ethos. The cultural, legal, and political significance of Google
Books, whether positive or negative, not only emphasizes its fundamental role
in shaping current knowledge landscapes, it also allows us to see Google Books
as a prism that reflects more general political tendencies toward
globalization, privatization, and digitization, such as modulations in
institutional infrastructures, legal landscapes, and aesthetic and political
conventions. But how did the unlikely marriage between a tech company and
cultural memory institutions even come about? Who drove it forward, and around
and within which infrastructures? And what kind of cultural memory politics
did it produce? The following sections of this chapter will address some of
these problematics.

## The New Librarians

It was in the midst of a turbulent restructuring of the world of text, in
October 2004 at the Frankfurt International Book Fair, that Larry Page and
Sergey Brin of Google announced the launch of Google Print, a cooperation
between Google and leading



decade later, the traditional practices of reading, and the guardianship of
text and cultural works, had acquired entirely new meanings. In October 2004,
however, the publishing world was still unaware of Google’s pending influence
on the institutional world of cultural memory. Indeed, at that time, Amazon’s
mounting dominance in the field of books, which began a decade earlier in
1995, appeared to pose much more significant implications. The majority of
publishers therefore greeted Google’s plans in Frankfurt as a welcome
alternative to Jeff Bezos’s growing online behemoth.

Larry Page and Sergey Brin withheld a few details from their announcement at
Frankfurt, however; Google’s digitization plans would involve not only
cooperation with publishers, but also with libraries. As such, what would
later become Google Books would in fact consist of two separate, yet
interrelated, programs: Google Print (which would later become Google Partner
Program) and Google Library Project. In all secrecy, Google had for many
months prior to the Frankfurt Book Fair worked with select libraries in the US
and the UK to digitize their holdings. And in December 2004 the true scope of
Google’s mass digitization plans was revealed: what Page and Brin were
building was the foundation of a groundbreaking cultural memory archive,
inspired by the myth of Alexandria.3 The invocation of Alexandria situated the
nascent Google Books project in a cultural schema that historicized the
project as a utopian, even moral and idealist, project that could finally,
thanks to technology, exceed existing human constraints—legal, political, and
physical.4

Google’s utopian discourse was not foreign to mass digitization enthusiasts.
Indeed, it was the _langue du jour_ underpinning most large-scale digitization
projects, a discourse nurtured and influenced by the seemingly borderless
infrastructure of the web itself (which was often referred to in
universalizing terms).5 Yet, while the universalizing discourse of mass
digitization was familiar, it had until then seemed like aspirational talk at
best, and strategic policy talk in the face of limited public funding, complex
copyright landscapes, and lumbering infrastructures, at worst. Google,
however, faced the task with a fresh attitude of determination and a will to
disrupt, as well as a very different form of leverage in terms of
infrastructural set-up. Google was already the world’s preferred search
engine, having mastered the tactical skill of navigating its users through
increasingly complex information landscapes on the web, and harvesting their
metadata in the process to continuously improve Google’s feedback systems.
Essentially, ever-larger amounts of information (understood here as “users”)
were passing through Google’s crawling engines, and as the masses of
information in Google’s server parks grew, so did their computational power.
Google Books, then, as opposed to most existing digitization projects, which
were conceived mainly in terms of “access,” was embedded in the larger system
of Google that understood the power and value of “feedback,” collecting
information and entering it into feedback loops between users, machines, and
engineers. Google also understood that information power didn’t necessarily
lie in owning all the information they gave access to, but rather in
controlling the informational processes themselves.

Yet, despite Google’s advances in information-seeking behaviors, the idea of
Google Books appeared as an odd marriage. Why was a private company in Silicon
Valley, working in the futuristic and accelerating world of software and fluid
information


n fact returning home to its point of inception. Google was born of a
research project titled the Stanford Integrated Digital Library Project, which
was part of the NSF’s Digital Libraries Initiative (1994–1999). Larry Page and
Sergey Brin were students then, working on the Stanford component of this
project, intending to develop the base technologies required to overcome the
most critical barriers to effective digital libraries, of which there were
many.6 Page’s and Brin’s specific project, titled Google, was presented as a
technical solution to the increasing amount of information on the World Wide
Web.7 At Stanford, Larry Page also tried to facilitate a serious discussion of
mass digitization, and of whether or not it was feasible. But his
ideas received little support, and he was forced to leave the idea on the
drawing board in favor of developing search technologies.8

In September 1998, Sergey Brin and Larry Page left the library project to
found Google as a company and became immersed in search engine technologies.
However, a few years later, Page resuscitated the idea of mass digitization as
a part of their larger self-professed goal to change the world of information
by increasing access, scaling the amount of information available, and
improving computational power. They convinced Eric Schmidt, the new CEO of
Google, that the mass digitization of cultural works made sense not only from
an information perspective, but also from a business perspective, since the
vast amounts of information Google could extract from books would improve
Google’s ability to deliver information that was hitherto lacking, and this
new content would eventually also result in an increase in traffic and clicks
on ads.9

## The Scaling Techniques of Mass Digitization

A series of experiments followed on how to best approach the daunting task.
The emergence and decay of these experiments highlight the ways in which mass
digitization assemblages consist not only of thoughts, ideals, and materials,
but also a series of cultural techniques that entwine temporality,
materiality, and even corporeality. This perspective on mass digitization
emphasizes the mixed nature of mass digitization assemblages: what at first
glance appears as a relatively straightforward story about new technical
inventions, at a closer look emerges as complex entanglements of human and
nonhuman actors, with implications for how we approach it not only as a
legal-technical entity but also as an infrapolitical phenomenon. As the following
section shows, attending to the complex cultural techniques of mass
digitization (its “how”) enables us to see that its “minor” techniques are not
excluded from or irrelevant to, but rather are endemic to, larger questions of
the infrapolitics of digital capitalism. Thus, Google’s simple technique of
scaling scanning to make the digitization processes go faster becomes
entangled in the creation of new habits and techniques of acceleration and
rationalization that tie in with the politics of digital culture and digital
devices. The industrial scaling of mass digitization becomes a crucial part of
the industrial apparatus of big data, which provides new modes of inscription
for both individuals and digital industries that in turn can be capitalized on
via data-mining, just as it raises questions of digital labor and copyright.

Yet, what kinds of scaling techniques—and what kinds of investments—Google
would have to leverage to achieve its initial goals were still unclear to
Google in those early years. Larry Page and co-worker Marissa Mayer therefore
began to experiment with the best ways to proceed. First, they created a
makeshift scanning device, whereby Marissa Mayer would turn the page and Larry
Page would click the shutter of the camera, guided by the pace of a
metronome.10 These initial mass digitization experiments signaled the
industrial nature of the mass digitization process, providing a metronomic
rhythm governed by the implacable regularity of the machine, in addition to
the temporal horizon of eternity in cultural memory institutions (or at least
of material decay).11 After some experimentation with scale and time, Google
bought a consignment of books from a second-hand book store in Arizona. They
scanned them and subsequently experimented with how to best index these works
not only by using information from the book, but also by pulling data about
the books from various other sources on the web. These extractions allowed
them to calculate a work’s relevance and importance, for instance by looking
at the number of times it had been referred to.12

I


![11404_002_fig_001.jpg](images/11404_002_fig_001.jpg)

Figure 2.1 François-Marie Lefevere and Marin Saric. “Detection of grooves in
scanned images.” U.S. Patent 7508978B1. Assigned to Google LLC.

These new scanning technologies allowed Google to unsettle the fixed content
of cultural works on an industrial scale and enter them into new distribution
systems. The untethering and circulation of text already existed, of course,
but now text would mutate on an industrial scale, bringing into coexistence a
multiplicity of archiving modes and textual accumulation. Indeed, Google’s
systematic scaling-up of already existing technologies on an industrial and
accelerated scale posed a new paradigm in mass digitization, to a much larger
extent than, for instance, inventions of new technologies.14 Thus, while
Google’s new book scanners did expand the possibilities of capturing
information, Google couldn’t solve the problem of automating the process of
turning the pages of the books. For that they had to hire human scanners who
were asked to manually turn pages. The work of these human scanners was
largely invisible to the public, who could only see the books magically
appearing online as the digital archive accumulated. The scanners nevertheless
left ghostly traces, in the form of scanning errors such as pink fingers and
missing and crumbled pages—visual traces that underlined the historically
crucial role of human labor in industrializing and automating processes.15
Indeed, the question of how to solve human errors in the book scanning process
led to a series of inventive systems, such as the patent granted to Google in
2009 (filed in 2003), which describes a system that would minimize scanning
errors with the help of music.16 Later, Google open sourced plans for a book
scanner named “Linear Book Scanner” that would turn the pages automatically
with the help of a vacuum cleaner and a cleverly designed sheet metal
structure, after passing them over two image sensors taken from a desktop
scanner.17

Eventually, after much experimentation, Google consolidated its mass
digitization efforts in collaboration with select libraries.18 While some
institutions immediately and enthusiastically welcomed Google’s aspirations as
aligning with their own mission to improve access to information, others were
more hesitant, an institutional vacillation that hinted ominously at
controversy to come. Some libraries, such as the University of Michigan,
greeted the initiative with enthusiasm, whereas others, such as the Library of
Congress, saw a red flag pop up: copyright, one of the most fundamental
elements in the rights of texts and authors.19 The Library of Congress
questioned whether it was legal to scan and index books without a rights
holder’s permission. Google, in response,


neier, and Michael Chabon, and publishers, argued that although Google
Books was an “extremely exciting” project, it failed in its current form to
protect the privacy of readers, thus creating a “real risk of disclosure” of
sensitive information to “prying governmental entities and private litigants,”
potentially giving rise to a “chilling effect,” hurting not only readers but
also authors and publishers, not least those writing about sensitive or
controversial topics.32 The Association of Libraries also raised a set of
concerns, such as the cost of library subscriptions and privacy.33 And most
predictably, companies such as Amazon and Microsoft, who also had a stake in
mass digitization, opposed the settlement; Microsoft even funded some nuanced
research efforts into its implications.34 Finally, and most damningly, the
Department of Justice decided to get involved with an antitrust argument.

By this point, opposition to the Google Books project, as it was outlined in
the proposed settlement, wasn’t only motivated by commercial concerns; it was
now also motivated by a public that framed Google’s mass digitization project
as a parasitical threat to the public sphere itself. The framing of Google as
a potential menace was a jarring image that stood in stark contrast to Larry
Page’s and Sergey Brin’s philanthropic attitudes and to Google’s famous “Don’t
be evil” slogan. The public reaction thus signaled a change in Google’s
reputation as the company metamorphosed in the public eye from a small
underdog company to a multinational corporation with a near-monopoly in the
search industry. Google’s initially inspiring approach to information as a
realm of plenitude now appeared in the public view more similar to the actions
of megalomaniac land-grabbers.

Google, however, while maintaining it


igitizing the works they were digitizing, and that
their main goal was to enrich the public sphere with more information, not to
build an information monopoly. In July 2013 Judge Denny Chin issued a new
opinion confirming that Google Books was indeed fair use.37 Chin’s opinion was
later consolidated in a major victory for Google in 2015 when Judge Pierre
Leval in the Second Circuit Court legalized Google Books with the words
“Google’s unauthorized digitizing of copyright-protected works, creation of a
search functionality, and display of snippets from those works are
non-infringing fair uses.”38 Leval’s decision marked a new direction, not only for
Google Books, but also for mass digitization in general, as it signaled a
shift in cultural expectations about what it means to experience and
disseminate cultural artifacts.

Once again, the story of Google Books took a new turn. What was first
presented as a gift to cultural memory institutions and the public, and later
as theft from and threat to these same entities, on closer inspection revealed
itself as a much more complex circulatory system of expectations, promises,
risks, and blame. Google Books thus instigated a dynamic and forceful
connection between Google and cultural memory institutions, where the roles of
giver and receiver, and the first giver and second giver/returner, were
difficult to decode. Indeed, the binding natu


itutions globally, giving rise to
new institutional networks, in some cases increasing globalization and
mobility for both users and objects, and in other cases restricting the same.
The Google Books contracts display both technical and symbolic aspects: as
technical artifacts they establish intricate frameworks of procedures,
commitments, rights, and incentives for governing the transactions of cultural
memory artifacts and their digitized copies. As symbolic artifacts they evoke
normative principles, expressing different measures of good will toward
libraries, but also—as all contracts do—introduce the possibility of distrust,
conflict and betrayal.47

Despite their centrality to mass digitization assemblages, and although some
of them have been made available to the public,48 the content of these
particular contracts still suffers from the epistemic gap incurred in practical
and symbolic form by Google’s Agreements and Non-Disclosure Agreements (NDAs),
a kind of agreement most libraries are required to sign when entering the
partnership. Like all contracts, the individual contracts signed by the
partnership libraries vary in nature and have different implications. While
many of Google’s agreements may be publicly available, they have often only
been made public through requests and transparency mechanisms such as the
Freedom of Information Act. As the Open Rights Alliance notes in


, sharing the improved techniques
could benefit the company in the long run—inevitably, much of the output would
find its way onto the web, bolstering Google’s indexes. But in this case,
paranoia and a focus on short-term gain kept the machines under wraps.”55 The
nondisclosure agreements show that while boundaries may be blurred between
Google Books and libraries, we may still identify different regulatory models
and modes of existence within their networks, including the explicit _library
ethos_ (in the Weberian sense of the term) of public access, not only to the
front end but also to some areas of the back end, and the business world’s
secrecy practices.56

Entering into a mass digitization public-private partnership (PPP) with a
corporation such as Google is thus not only a logical and pragmatic next step
for cultural memory institutions, it is also a political step. As already
noted, Google Books, through its embedding in Google, injects cultural memory
objects into new economic and cultural infrastructures. These infrastructures
are governed less by the hierarchical world of curators, historians, and
politicians, and more by feedback networks of tech companies, users, and
algorithms. Moreover, they forge ever closer connections to data-driven market
logics, where computational rather than representational power counts. Mass
digitization PPPs such as Google Books are thus also symptoms of a much more
pervasive infrapolitical situation, in which cultural memory institutions are
increasingly forced to alter their identities from public caretakers of
cultural heritage to economic actors in the EU internal market, controlled by
the framework of competition law, time-limited contracts, and rules on state
aid.57 Moreover, mastering the rules of these new infrastructures is not
necessarily an easy feat for public institutions.58 Thus, while Google claims
to hold a core commitment regarding free digital access to information, and
while its financial apparatus could be construed as making Google an eligible
partner in accordance with


account Google’s previous
monopoly-building history.60

## The Politics of Google Books

A final aspect of Google Books relates to the universal aspiration of Google
Books’s collection, its infrapolitics, and what it empirically produces in
territorial terms. As this chapter’s previous sections have outlined, it was
an aspiration of Google Books to transcend the cultural and political
limitations of physical cultural memory collections by gathering the written
material of cultural memory institutions into one massive digitized
collection. Yet, while the collection spans millions of works in hundreds of
languages from hundreds of countries,61 it is also clear that even large-scale
mass digitization processes still entail procedures of selection on multiple
levels from libraries to works. These decisions produce a political reality
that in some respects reproduces and accentuates the existing politics of
cultural memory institutions in terms of territorial and class-based
representations, and in other respects gives rise to new forms of cultural
memory politics that part ways with the political regimes of traditional
curatorial apparatuses.

One obvious area in which to examine the politics produced by the Google Books
assemblage is in the selection of libraries that Google chooses to partner
with.62 While the full list of Google Books partners is not disclosed on
Google’s own webpage,


e gaps and biases of Google Books
reveal it to be less of a universal and monolithic collection, and more of an
impressive, but also specific and contingent, assemblage of works, texts, and
relations that is determined by the relations Google Books has entered into in
terms of class, discipline, and geographical scope.

Google Books is not only the result of selection processes on the level of
partnering institutions, but also on the level of organizational
infrastructure. While the infrastructures of Google Books in fact depart from
those of its parent company in many regards to avoid copyright infringement
charges, there is little doubt that people working actively on
Google’s digitization activities (included here are both users and Google
employees) are also globally distributed in networked constellations. The
central organization for cultural digitization, the Google Cultural Institute,
is located in Paris, France. Yet the people affiliated with this hub are
working across several countries. Moreover, people working on various aspects
of Google Books, from marketing to language technology, to software
development and manual scanning processes, are dispersed across the globe.
And it is perhaps in this way that we tend to think of Google in general—as a
networked global company—and for good reasons. Google has been operating
internationally for almost as long as it has been around. It has offices in
countries all over the globe, and works in numerous languages. Today it is one
of the most important global information institutions, and as m


Alexandria, claiming that “We’re going to scan all the
books in the world,” and explaining that for search to be truly comprehensive
“it must include every book ever published.” Page literally wanted Google to
be a “super librarian” (Auletta 2009, 96). 4. Constraints of a physical
character (how to digitize and organize all this knowledge in physical form);
legal character (how to do it in a way that suspends existing regulation); and
political character (how to transgress territorial systems). 5. Take, for
instance, project Bibliotheca Universalis, comprising American, Japanese,
German, and British libraries among others, whose professed aim was “to
exploit existing digitization programs in order to … make the major works of
the world’s scientific and cultural heritage accessible to a vast public via
multimedia technologies, thus fostering … exchange of knowledge and dialogue
over national and international borders.” It was a joint project of the French
Ministry of Culture, the National Library of France, the Japanese National
Diet Library, the Library of Congress, the National Library of Canada,
Discoteca di Stato, Deutsche Bibliothek, and the British Library:
. The project took its name
from the groundbreaking Medieval publication _Bibliotheca Universalis_
(1545–1549), a four-volume alphabetical bibliography


isions/isysquery/b3f81bc4-3798-476e-
81c0-23db25f3b301/1/doc/13-4829_opn.pdf>. In the aftermath of Pierre Leval’s
decision the Authors Guild has filed yet another petition for the
Supreme Court to reverse the appeals court decision, and has publicly
reiterated the framing of Google as a parasite rather than a benefactor. A
brief supporting the Guild’s petition and signed by a diverse group of authors
such as Malcolm Gladwell, Margaret Atwood, J. M. Coetzee, Ursula Le Guin, and
Yann Martel noted that the legal framework used to assess Google knew nothing
about “the digital reproduction of copyrighted works and their communication
on the Internet or the phenomenon of ‘mass digitization’ of vast collections
of copyrighted works”; nor, they argued, was the fair-use doctrine ever
intended “to permit a wealthy for-profit entity to digitize millions of works
and to cut off authors’ licensing of their reproduction, distribution, and
public display rights.” Amicus Curiae filed on behalf of Authors Guild
Petition, No. 15–849, February 1, 2016, content/uploads/2016/02/15-849-tsac-TAA-et-al.pdf>. 39. Oxford English
Dictionary,
[http://www.oed.com/view/Entry/40328?rskey=bCMOh6&result=1&isAdvanced=false#eid8462140](http://www.oed.com/view/Entry/40328?rskey=bCMOh6&result=1&isAdvanced=false#eid8462140).
40. The contract as we know it


nd Michel et
al. 2011. 66. Neubert 2008; and Weiss and James 2012, 1–3. 67. I am indebted
to Gayatri Spivak here, who makes this argument about New York in the context
of globalization; see Spivak 2000. 68. In this respect Google mirrors the
glocalization strategies of media companies in general; see Thussu 2007, 19.
69. Although the decisions of foreign legislation of course also affect the
workings of Google, as is clear from the growing body of European regulatory
casework on Google such as the right to be forgotten, competition law, tax,
etc.

# 3 Sovereign Soul Searching: The Politics of Europeana

## Introduction

In 2008, the European Commission launched the European mass digitization
project, Europeana, to great fanfare. Although the EC’s official
communications framed the project as a logical outcome of years of work on
converging European digital library infrastructures, the project was received
in the press as a European counterresponse to Google Books.1 The popular media
framings of Europeana were focused in particular on two narratives: that
Europeana was a public response to Google’s privatization of cultural memory,
and that Europeana was a territorial response to American colonization of
European information and culture. This chapter suggests that while both of
these sentiments were present in Europeana’s early years, the politics of what
Europeana was—and is—paints a more complicated picture. A closer glance at
Europeana’s social, economic, and legal infrastructures thus shows that the
European mass digitization project is neither an attempt to replicate Google’s
glocal model, nor is it a continuation of traditional European cultural
policies. Rather, Europeana produces a new form of cultural memory politics
that converges national and supranational imaginaries with global information
infrastructures.

If global information infrastructures and national politics today seemingly go
hand in hand in Europeana, it wasn’t always so. In fact, in the 1990s,
networked technologies and national imaginaries appeared to be mutually
exclusive modes of existence. The fall of the Berlin Wall in 1989 nourished a
new antisovereign sentiment, which gave way to recurring claims in the 1990s
that the age of sovereig


al infrastructures are quickly
supplementing, and in many cases even substituting, those national
communicative infrastructures that were instrumental in establishing a
national imagined community in the first place—infrastructures such as novels
and newspapers.5 The convergence of territorially bounded imaginaries and
global networks creates new cultural-political constellations of cultural
memory where the centripetal forces of nationalism operate alongside,
sometimes with and sometimes against, the centrifugal forces of digital
infrastructures. Europeana is a preeminent example of these complex
infrastructural and imaginary dynamics.

## A European Response

When Google announced their digitization program at the Frankfurt Book Fair in
2004, it instantly created ripples in the European cultural-political
landscape, in France in particular. Upon hearing the news about Google’s
plans, Jacques Chirac, president of France at the time, promptly urged the
then-culture minister, Renaud Donnedieu de Vabres, and Jean-Noël Jeanneney,
head of France’s Bibliothèque nationale, to commence a similar digitization
project and to persuade other European countries to join them.6 The seeds for
Europeana were sown by France, “the deepest, most sedimented reservoir of
anti-American arguments,”7 as an explicitly political reaction to Google
Books.

Europeana was thus from its inception laced with the ambiguous political
relationship between two historically competing universalist-exceptionalist
nations: the United States and France.8 It is a relationship that France sometimes
pictures as a question of Americanization, and at other times extends to an
image of a more diffuse Anglo-Saxon constellation. Highlighting the effects
Google Books would have on French culture, Jeanneney argued that Google’s mass
digitization efforts would pose several possible dangers to French cultural
memory such as bias in the collecting and organizing practices of Google Books
and an Anglicization of the cultural memory regulatory system. Explaining why
Google Books should be seen not only as an American, but also as an Anglo-
Saxon project, Jeanneney noted that while Google Books “was obviously an
American project,” it was nevertheless also one “that reached out to the
British.” The alliance between the Bodleian Library at Oxford and Google Books
was thus not only a professional partnership in Jeanneney’s eyes, but also a
symbolic bond where “the familiar Anglo-Saxon solidarity” manifested once
again vis-à-vis France, only this time in the digital sphere. Jeanneney even
paraphrased Churchill’s comment to Charles de Gaulle, noting that Oxford’s
alliance with Google Books yet again evidenced how British institutions,
“without consulting anyone on the other side of the English Channel,” favored
US-UK alliances over UK-Continental alliances “in search of European
patriotism for the adventure under way.”9

How can we understand Jeanneney’s framing of Google Books as an Anglo-Saxon
project and the function of this framing in his plea for a nation-based
digitization program? As historian Emile Chabal suggests, the concept of the
Anglo-Saxon mentality is a preeminently French construct that has a clear and
rich rhetorical function to strengthen the French self-understanding vis-à-vis
a stereotypical “other.”10 While fuzzy in its conceptual infrastructure, the
French rhetoric of the Anglo-Saxon is nevertheless “instinctively understood
by the vast majority of the French population” to denote “not simply a
socioeconomic vision loosely inspired by market liberalism and
multiculturalism” but also (and sometimes primarily) “an image of
individualism, enterprise, and atomization.”11 All these dimensions were at
play in Jeanneney’s anti-Google Books rhetoric. Indeed, Jeanneney suggested,
Google’s mass digitization project was not only Anglo-Saxon in its collecting
practices and organizational principles, but also in its regulatory framework:
“We know how Anglo-Saxon law competes with Latin law in international
jurisdictions and in those of new nations. I don’t want to see Anglo-Saxon law
unduly favored by Google as a result of the hierarchy that will be
spontaneously established on its lists.”12

What did Jeanneney suggest as infrastructural protection against the network
power of the Anglo-Saxon mass digitization project? According to Jeanneney,
the answer lay in territorial digitization programs: rather than simply
accepting the colonizing forces of the Anglo-Saxon matrix, Jeanneney argued, a
national digitization effort was needed. Such a national digitization project
would be a “ _contre-attaque_ ” against Google Books that should protect three
dimensions of French cultural sovereignty: its language, the role of the state
in cultural policy, and the cultural/intellectual order of knowledge in the
cultural collections.13 Thus Jeanneney suggested that any Anglo-Saxon mass
digitization project should be competed against and complemented by mass
digitization projects from other nations and cultures to ensure that cultural
works are embedded in meaningful cultural contexts and languages. While the
nation was the central base of mass digitization programs, Jeanneney noted,
such digitization programs necessarily needed to be embedded in a European, or
Continental, infrastructure. Thus, while Jeanneney’s rallying cry to protect
the French cultural memory was voiced from France, he gave it a European
signature, frequently addressing and including the rest of Europe as a natural
ally in his _contre-attaque_ against Google Books.14 Jeanneney’s extension of
French concerns to a European level was characteristic of France, which had
historically displayed a leadership role in formulating and shaping the EU.15
The EU, Jeanneney argued, could provide a resilient supranational
infrastructure that would enable French diversity to exist within the EU while
also providing a protective shield against unhampered Anglo-Saxon
globalization.

Other French officials took on a less combative tone, insisting that the
French digitization project should be seen not merely as a reaction to Google
but rather in the context of existing French and European efforts to make
information available online. “I really stress that it’s not anti-American,”
stated one official at the Ministry of Culture and Communication. Rather than
framing the French national initiatives as a reaction to Google Books, the
official instead noted that the prime objective was to “make more material
relevant to European patrimony available,” noting also that the national
digitization efforts were neither unique nor exclusionary—not even to
Google.16 The disjunction between Jeanneney’s discursive claims to mass
digitization sovereignty and the anonymous bureaucrat’s pragmatic and
networked approach to mass digitization indicates the late-sovereign landscape
of mass digitization as it unfolded between identity politics and pragmatic
politics, between discursive claims to sovereignty and economic global
cooperation. And as the next section shows, the intertwinement of these
discursive, ideological, and economic infrastructures produced a memory
politics in Europeana that was neither sovereign nor post-sovereign, but
rather late-sovereign.

## The Infrastructural Reality of Late-Sovereignty

Politically speaking, Europeana was always more than just an empty
countergesture or emulating response to Google. Rather, as soon as the EU
adopted Europeana as a prestige project, Europeana became embedded in the
political project of Europeanization and began to produce a politi


ts territorial imaginaries nevertheless took place by means of globalized
networked infrastructures. The circumscribed cultural imaginary of Europeana
was thus made interoperable with the networked logic of globalization. This
combination of a European imaginary and neoliberal infrastructure in Europeana
produced an uneasy balance between national and supranational infrastructural
imaginaries on the one hand and globalized infrastructures on the other.

If France saw Europeana primarily through the prism of sovereign competition,
the European Commission emphasized a different dispositive: economic
competition. In his 2005 response to Jacques Chirac, José Manuel Barroso
acknowledged that the digitization of European cultural heritage was an
important task not only for nation-states but also for the EU as a whole.
Instead of the defiant tone of Jeanneney and De Vabres, Barroso and the EU
institutions opted for a more neutral, pragmatic, and diplomatic mass
digitization discourse. Instead of focusing on Europeana as a lever to prop up
the cultural sovereignty of France, and by extension Europe, in the face of
Americanization, Barroso framed Europeana as an important economic element in
the construction of a knowledge economy.17

Europeana was thus still a competitive project, but it was now reframed as one
that would be much more easily aligned with, and integrated into, a global
market economy.18 One might see the difference in the French and the EU
responses as a question of infrastructural form and affordance. If French mass
digitization discourses were concerned with circumscribing the French cultural
heritage within the territory of the nation, the EC was in practice more
attuned to the networked aspects of the global economy and an accompanying
discourse of competition and potentiality. The infrastructural shift from
delineated sphere to globalized network changed the infrapolitics of cultural
memory from traditional nation-based issues such as identity politics
(including the formation of canons) to more globally aligned trade-related
themes such as copyright and public-private governance.

The shift from canon to copyright did not mean, however, that national
concerns dissipated. On the contrary, ministers from the Euro


in the US22), in which they argued against the inclusion of foreign
authors in the lawsuit.23 They further brought separate suits against Google
Books for their scanning activities and sought to exercise diplomatic pressure
against the advancement of Google Books.24

On an EU level, however, the territorial concerns were sidestepped in favor of
another matrix of concern: the question of public-private governance. Thus,
despite pressure from some member states, the EC decided not to write a
similar “amicus brief” on behalf of the EU.25 Instead, EC Commissioners
McCreevy and Reding emphasized the need for more infrastructures connecting
the public and private sectors in the field of mass digitization.26 Such PPPs
could range from relatively conservative forms of cooperation (e.g., private
sponsoring, or payments from the private sector for links provided by
Europeana) to more far-reaching involvement, such as turning the management of
Europeana over to the private sector.27 In a similar vein, a report authored
by a high-level reflection group (Comité des Sages) set down by the European
Commission opened the door for public-private partnerships and also set a time
frame for commercial exploitation.28 It was even suggested that Google could
play a role in the construction of Europeana. These considerations thus
contrasted both with the French resistance against Google and with previous
statements made by the EC, which were concerned with preserving the public
sector in the administration of Europeana.

Did the European Commission’s networked politics signal a post-sovereign
future for Europeana? This chapter suggests no: despite the EC’s strategies,
it would be wrong to label the infrapolitics of Europeana as post-sovereign.
Rather, Europeana draws up a _late-sovereign_ 29 mass digitization landscape,
where claims to national sovereignty exist alongside networked
infrastructures.30 Why not post-sovereign? Because, as legal scholar Neil
Walker noted in 2003,31 the logic of sovereignty never waned even in the face
of globalized capitalism and legal pluralism. Instead, it fused with these
more globalized infrastructures to produce a form of politics that displayed
considerable continuity with the old sovereign order, yet also had distinctive
features such as globalized trade networks and constitutional pluralisms. In
this new system, seemingly traditional claims to sovereignty are carried out
irrespective of political practices, showing that globally networked
infrastructures and


levance as nationalist imaginaries increase in strength and power through
increasingly globalized networks.

As the following section shows, Europeana is a product of political processes
that are concerned with both the construction of bounded spheres and canons
_and_ networked infrastructures of connectivity, competition, and potentiality
operating beyond, below, and between national societal structures. Europeana’s
late-sovereign framework produces an infrapolitics in which the discursive
political juxtaposition between Europeana and Google Books exists alongside
increased cooperation between Google Books and Europeana, making it necessary
to qualify the comparative distinctions in mass digitization projects on a
much more detailed level than merely territorial delineations, without,
however, disposing of the notion of sovereignty. The simultaneous
contestations and connections between Europeana and Google Books thus make
visible the complex economic, intellectual, and technological infrastructures
at play in mass digitization.

What form did these infrastructures take? In a sense, the complex
infrastructural set-up of Europeana as it played out in the EU’s framework
ended up extending along two different axes: a vertical axis of national and
supranational sovereignty, where the tectonic territorial plates of nation-
states and continents move relative to each other by converging, diverging,
and transforming; and a horizontal axis of deterritorializing flows that
stream within, between, and throughout sovereign territories consisting both
of capital interests (in the form of transnational lobby organizations working
to protect, promote, and advance the interests of multinational companies or
nongovernmental organizations) and the affective relations of users.

## Harmonizing Europe: From Canon to Copyright

Even if the EU is less concerned with upholding the regulatory boundaries of
the nation-state in mass digitization, bordering effects are still found in
mass digitized collections—this time in the form of copyright regulation. As
in the case of Google Books, mass digitization also raised questions in Europe
about the future role of copyright in the digital sphere. On the one hand,
cultural industries were concerned about the implications of mass digitization
for their production and copyrights32; on the other hand, educational
institutions and digital industries were interested in “unlocking” the
cognitive and cultural potentials that resided within the copyrighted
collections in cultural heritage institutions. Indeed, copyright was such a
crucial concern that the EC repeatedly stated the necessity to reform and
harmonize European copyright regulation across borders.

Why is copyright a concern for Europeana? Alongside economic challenges, the
current copyright legislation is _the_ greatest obstacle against mass
digitization. Copyright effectively prohibits mass digitization of any kind of
material that is still within copyright, creating large gaps in digitized
collections that are often referred to as “the twentieth-century black hole.”
These black holes appear as a result of the way European “copyright interacts
with the digitization of cultural heritage collections” and manifest
themselves as “marked lack of online availability of twentieth-century
collections.”33 The lack of a common copyright mechanism not only hinders
online availability, but also challenges European cross-border digitization
projects as well as the possibilities for data-mining collections à la Google
because of the difficulties connected to ascertaining the relevant
public domain and hence definitively flagging the public domain status of an
object.34

While Europeana’s twentieth-century black hole poses a problem, Europe would
not, as one worker in the EC’s Directorate-General (DG) Copyright unit noted,
follow Google’s opt-out mass digitization strategy because “the European
solution is not the Google solution. We do a diligent search for the rights
holder before digitizing the material. We follow the law.”35 By positioning
herself as on the right side of the law, the DG employee implicitly also
placed Google on the wrong side of the law. Yet, as another DG employee
explained with frustration, the right side of the law was looking increasingly
untenable in an age of mass digitization. Indeed, as she noted, the demand
for diligent search was making her work near impossible, not least due to the
different legal regimes in the US and the EU:

> Today if one wants to digitize a work, one has to go and ask the rights
holder individually. The problem is often that you can’t find the rights
holder. And sometimes it takes so much time. So there is a rights holder, you
know that he would agree, but it takes so much time to go and find out. And
not all countries have collective management … you have to go company by
company. In Europe we have producing companies that disappear after the film
has been made, because they are created only to make that film. So who are you
going


overlapping instances:
the exclusivity of national intellectual property laws, the economic interests
toward a common market, and the cultural interests in the free movement of
information and knowledge production—a tension that is further amplified by
the coexistence of different legal traditions across member states.37 Seeking
to resolve this tension, the European Parliament and certain units in the
European Commission have strategically used Europeana as a rhetorical lever to
increase harmonization of copyright legislation and thus make it easier for
institutions to make their collections available online.38 “Harmonization” has
thus become a key concept in the rights regime of mass digitization,
essentially signaling interoperability rather than standardization of national
copyright regimes. Yet stakeholders differ in their opinions concerning who
should hold what rights over what content, over what period of time, at what
price, and how things should be made available. Within the process of
harmonization thus lies a process that is less than harmonious, namely bringing
stakeholders to the table and getting them to commit. As the EC interviewee confirms,
harmonization requires not only technical but also political cooperation.

The question of harmonization illustrates the infrapolitical dimensions of
Europeana’s copyright systems, showing that they are not just technical
standards or “direct


g if not overtly political borders in the
collections, then certainly infrapolitical manifestations of the cultural
barriers that still exist between European countries.

## The Infrapolitics of Interoperability

Copyright is not the only infrastructural regime that upholds borders in
Europeana’s collections; technical standards also pose great challenges for
the dream of a European connective cultural memory.42 The notion of
_interoperability_ 43 has therefore become a key concern for mass
digitization, as interoperability is what allows digitized cultural memory
institutions to exchange and share documents, queries, and services.44

The rise of interoperability as a key concept in mass digitization is a side-
effect of the increasing complexity of economic, political, and technological
networks. In the twentieth century, most European cultural memory institutions
existed primarily as small “sovereign” institutions, closed spheres governed
by internal logics and with little impetus to open up their internal machinery
to other institutions and cooperate. The early 2000s signaled a shift in the
institutional infrastructural layout of cultural memory institutions, however.
One early significant articulation of this shift was a 324-page European
Commission report entitled _Technological Landscapes for Tomorrow’s Cultural
Economy: Unlocking the Value of Cultural Heritage_ (or the DigiC


ran deeper than technological
logic.46 The more complex cultural memory infrastructures become, the more
interoperability is needed if one wants the infrastructures to connect and
communicate with each other.47 As information scholar Christine Borgman notes,
interoperability has therefore long been “the holy grail of digital
libraries”—a statement echoed by Commissioner Reding on Europeana in 2005 when
she stated that “I am not suggesting that the Commission creates a single
library. I envisage a network of many digital libraries—in different
institutions, across Europe.”48 Reding’s statement shows that even at the
height of the French exceptionalist discourse on European mass digitization,
other political forces worked instead to reformat the sovereign sphere into a
network. The unravelling of the bounded spheres of cultural memory
institutions into networked infrastructures is therefore both an effect of,
and the further mobilization of, increased interoperability.

Interoperability is not only a concern for mass digitization projects,
however; rather, the calls for interoperability take place on a much more
fundamental level. A European Council Conclusion on Europeana identifies
interoperability as a key challenge for the future construction of Europeana,
but also embeds this concern within the overarching European interoperability
strategy, _European Interoperability Framework for pan-European eGovernment
services_. 49 Today, then, interoperability appears to be turning into a
social theory. The extension of the concept of interoperability into the
social sphere naturally follows the socialization of another technical term:
infrastructure. In the past decades, Susan Leigh Star, Geoffrey Bowker, and
others have


e McDonough notes that “we need to
cease viewing [interoperability] purely as a technical problem, and
acknowledge that it is the result of the interplay of technical and social
factors.”53 Pushing the concept of interoperability even further, legal
scholars Urs Gasser and John Palfrey have even argued for viewing the world
through a theory of interoperability, naming their project “interop theory,”54
while Internet governance scholar Laura DeNardis proposes a political theory
of interoperability.55

More than denoting a technical fact, then, interoperability emerges today as
an infrastructural logic, one that promotes openness, modularity, and
connectivity. Within the field of mass digitization, the notion of
interoperability is in particular promoted by the infrastructural workers of
cultural memory (e.g., archivists, librarians, software developers, digital
humanists, etc.) who dream of opening up the silos they work on to enrich them
with new meanings.56 As noted in chapter 1, European cultural memory
institutions had begun to address unconnected institutions as closed “silos.”
Mass digitization offered a way of thinking of these institutions anew—not as
frigid closed containers, but rather as vital connective infrastructures.
Interoperability thus gives rise to a new infrastructural form of cultural
memory: the traditional delineated sovereign spheres of expertise of analog
cultural memory institutions are pried open and reformatted as networked
ecosystems that consist not only of the traditional national public providers,
but also of additional components that have hitherto been alien in the
cultural memory industry, such as private individual users and commercial
industries.57

The logic of interoperability is also born of a specific kind of
infrapolitics: the politics of modul


oncepts we find on the Internet
in the age of digital capitalism, such as “prosumers,” “produsers,” and so on.
These concepts are becoming more and more pervasive in the digital environment
where “any format of sound can be mixed with any format of video, and then
supplemented with any format of text or images.”59 According to Lessig, the
challenge to this “open” vision comes from those “who don’t play in this
interoperability game,” and the contestation between the “open” and the
“closed” takes place in “the network,” which produces “a world where
anyone can clip and combine just about anything to make something new.”60

Despite its centrality in the mass digitization rhetoric, the concept of
interoperability and the politics it produces are rarely discussed in critical
terms. Yet, as Gasser and Palfrey readily conceded in 2007, interoperability
is not necessarily in itself an “unalloyed good.” Indeed, in “certain
instances,” Palfrey and Gasser noted, interoperability brings with it possible
drawbacks such as increased homogeneity, lack of security, and lack of
reliability.61 Today, ten years on, Urs Gasser’s and John Palfrey’s admissions
of the drawbacks of interoperability appear too modest, and it becomes clear
that while their theoretical apparatus was able to identify the centrality of
interoperability in a digital world, their social theory m


ic politics of speed that is also inherent in
connectivity and interoperability: “Connection implies smooth surfaces with no
margins of ambiguity … connections are optimized in terms of speed and have
the potential to accelerate with technological developments.”63 The
connectivity enabled by interoperability thus implies modularity with
components necessarily “open to interfacing and interoperability.”
Interoperability, then, is not only a question of openness, but also a way of
harnessing network effects by means of speed and resilience.

While interoperability may be an inherent infrastructural tenet of neoliberal
systems, increased interoperability does not automatically make mass
digitization projects neoliberal. Yet, interoperability does allow for
increased connectivity between individual cultural memory objects and a
neoliberal economy. And while the neoliberal economy may emulate critical
discourses on freedom and creativity, its main concern is profit. The same
systems that allow users to create and navigate collections more freely are
made interoperable with neoliberal systems of control.64

## The “Work” in Networking

What are the effects of interoperability for the user? The culture of
connectivity and interoperability has not only allowed Europeana’s collections
to become more visible to a wider public, it has also enabled these publics to
become intentionally or unintentionally involved in the act of describing and
ordering these same collections, for instance by inviting users to influence
existing collections as well as to generate their own collections. The
increased interaction with works also transforms them from stable to mobile
objects.65 Mass digitization has thus transformed curatorial practice,
expanding it beyond the closed spheres of cultural memory institutions into
much broader ecosystems and extending the focus of curatorial attention from
fixed objects to dynamic network systems. As a result, “curatorial work has
become more widely distributed between multiple agents including technological
networks and software.”66 From having played a central role in the curatorial
practice, the curator is now only part of this entire system and increasingly
not central to it. Sharing the curator’s place are users, algorithms, software
engineers, and a multitude of other factors.

At the same time, the information deluge generated by digitization has
enhanced the necessity of curation, both within and outside institutions. Once
considered as professional caretaking for collections, the curatorial concept
has now been modulated to encompass a whole host of activities and agents,
just as curatorial practices are now ever more engaged in epistemic meaning
making, selecting and organizing materials in an interpretive framework
through the aggregation of global connection.67 And as the already monumental
and ever accelerating digital collections exceed human curatorial capacity,
the computing power of machines and cognitive capabilities of ordinary
citizens are increasingly needed to penetrate and make meaning of the data
accumulations.

W


onal strategies of harvesting the “cognitive surplus” of users75 in
environments where play is increasingly taking on aspects of labor and vice
versa. As cultural theorist Angela Mitropoulos has noted, “networking is also
net-working.”76 Thus, while many of the participatory structures we find in
Europeana are participatory projects proper and not just what we might call
participation-lite—or minimal participation77—models, the new interoperable
infrastructures of cultural memory ecosystems make it increasingly difficult
to uphold clear-cut distinctions between civic practice and exploitation in
crowdsourcing projects.

## Collecting Europe

If Europeana is a late-sovereign mass digitization project that maintains
discursive ties to the national imaginary at the same time that it undercuts
this imaginary by means of networked infrastructures through increased
interoperability, the final question is: what does this late-sovereign
assemblage produce in cultural terms? As outlined above, it was an aspiration
of Europeana to produce and distribute European cultural memory by means of
mass digitization. Today, its collection gathers more than 50 million cultural
works in differing formats—from sound bites to photographs, textiles, films,
files, and books. As the previous sections show, however, the processes of
gathering the cultural artifacts have generated a lot of friction, producing a
political reality that in some respects reproduces and accentuates the
existing politics of cultural memory institutions in terms of representation
and ownership, and in other respects gives rise to new forms of cultural
memory politics that part ways with the political regimes of traditional
curatorial apparatuses.

The story of how Europeana’s initial collection was published and later
revised offer


’ de-Nazification program, the Bavarian state
allowed no one to republish the book. 80 Therefore, reissues of _Mein Kampf_
only reemerged in 2015, when the copyright was released. The premature digital
distribution of _Mein Kampf_ in Europeana was thus, according to copyright
legislation, illegal. While the _Mein Kampf_ case was extraordinary, it
flagged a more fundamental problem of how to police and analyze all the
incoming data from individual cultural heritage institutions.

On a more fundamental level, however, _Mein Kampf_ indicated not only a legal,
but also a political, issue for Europeana: how to deal with the expressions
that Europeana’s feedback mechanisms facilitated. Mass digitization promoted a
new kind of cultural memory logic, namely of feedback. Feedback mechanisms are
central to data-driven companies like Google because they offer us traces of
the inner worlds of people that would otherwise never appear in empirical
terms, but that can be catered to in commercial terms. 81 Yet, while the
traces might interest the corporation (or sociologist) on the hunt for
people’s hidden thoughts, a prestige project such as Europeana found it
untenable. What Europeana wanted was to present Europe’s cultural memory; what
they ended up showing was Europeans’ intense fascination with fascism and
porn. And this was problematic because Europeana was a political project of
represen


Europeana.83 So while Europeana is in principle representing
Europe’s collective cultural memory, in reality it represents a highly
fragmented image of Europe with a lot of European countries not even appearing
in the databases. Moreover, even these numbers are potentially misleading, as
one information scholar formerly working with Europeana notes: to pump up
their statistical representation, many institutions strategically invented
counting systems that would make their representation seem bigger than it
really is, for example, by declaring each scanned page in a medieval
manuscript as an object instead of as the entire work.84 The strategic acts of
volume increase are interesting mass digitization phenomena for many reasons:
first, they reveal the ultimately volume-based approach of mass digitization.
According to the scholar, this volume-based approach finds political support
in the EC system, for which “the object will always be quantitative” since
volume is “the only thing the commission can measure in terms of funding and
result.”85 In a way then, the statistics tell more than one story: in
political terms, they recount not only the classic tale of a fragmented Europe
but also how Europe is increasingly perceived, represented, and managed by
calculative technologies. In technical terms, they reveal the gray areas of
how to delineate and calculate data: what makes a data object? And in cultural
policy terms, they reflect the highly divergent prioritization of mass
digitization in European countries.

The final question is, then: how is this fragmented European collection
distributed? This is the point where Europeana’s territorial matrix reveals
its ultimately networked infrastructure. Europeana may be entered through
Google, Facebook, Twitter, and Pinterest, and vice versa. Therefore a click on
the aforementioned cake exhibition, for example, takes one straight to Google
Arts and Culture. The transportation from the Europeana platform to Google
happens smoothly, without any friction or notice, and if one didn’t look at
the change in URL, one would hardly notice the change at all since the
interface appears almost identical. Yet, what are the implications of thi


d the
networked infrastructures of Europeana show just how difficult it is to
collect Europe in the digital sphere. This is not to say that territorial
sentiments don’t have power, however—far from it. Within the digital sphere we
are already seeing territorial statements circulated in Europe on both
national and supranational scales, with potentially far-reaching implications
on both. Yet, there is little to suggest that the territorial sentiments will
reproduce sovereign spheres in practice. To the extent that reterritorializing
sentiments are circulated in globalizing networks, this chapter has sought to
counter both ideas about post-sovereignty and pure nationalization, viewing
mass digitization instead through the lens of late-sovereignty. As this
chapter shows, the notion of late-sovereignty allows us to conceptualize mass
digitization programs, such as Europeana, as globalized phenomena couched
within the language of (supra)national sovereignty. In the age where rampant
nationalist movements sweep through globalized communication networks, this
approach feels all the more urgent and applicable not only to mass
digitization programs, but also to reterritorializing communication phenomena
more broadly. Only if we consider the ways in which the nationalist imaginary
works in the infrastructural reality of late capitalism can we begin to
account for the infrapolitics of the highly mediated new territorial
imaginaries.

## Notes

1. Lefler 2007; Henry W., “Europe’s Digital Library versus Google,” Café
Babel, September 22, 2008, /europes-digital-library-versus-google.html>; Chrisafis 2008. 2. While
digitization did not stand apart from the political and economic developments
in the rapidly globalizing world, digital theorists and activists soon gave
rise to the Internet as an inherent metaphor for this integrative development,
a sign of the inevitability of an ultimately borderless world, where as
Negroponte notes, time zones would “probably play a bigger role in our digital
future than trade zones” (Negroponte 1995, 228). 3. Goldsmith and Wu 2006. 4.
Rogers 2012. 5. Anderson 1991. 6. “Jacques Chirac donne l’impulsion à la
création d’une bibliothèque numérique,” _Le Monde_ , March 16, 2005,
donne-l-impulsion-a-la-cr


porters immediately rendered _défie,_ connotes a
kind of violence or aggressiveness that isn’t implied by the French word. The
right word in English is ‘challenge,’ which has a different implication, more
sporting, more positive, more rewarding for both sides” (Jeanneney 2007, 85).
14. See pages 12, 22, and 24 for a few examples in Jeanneney 2007. 15. On the
issue of the common currency, see, for instance, Martin and Ross 2004. The
idea of France as an appropriate spokesperson for Europe was familiar already
in the eighteenth century when Voltaire declared French “la Langue de
l’Europe”; see Bivort 2013. 16. The official thus first noted that, “Everybody
is working on digitization projects … cooperation between Google and the
European project could therefore well occur,” and later added that “The worst
scenario we could achieve would be that we had two big digital libraries that
don’t communicate. … The idea is not to do the same thing, so maybe we could
cooperate, I don’t know. Frankly, I’m not sure they would be interested in
digitizing our patrimony. The idea is to bring something that is
complementary, to bring diversity. But this doesn’t mean that Google is an
enemy of diversity.” See Labi 2005. 17. Letter from Manuel Barroso to Jacques
Chirac, July 7, 2005,
[http://www.peps.cfwb.be/index.php?eID=tx_nawsecuredl&u=0&file=fileadmin/sites/numpat/upload/numpat_super_editor/numpat_editor/documents/Europe/Bibliotheques_numeriques/2005.07.07reponse_de_la_Commission_europeenne.pdf&hash=fe7d7c5faf2d7befd0894fd998abffdf101eecf1](http://www.peps.cfwb.be/index.php?eID=tx_nawsecuredl&u=0&file=fileadmin/sites/numpat/upload/numpat_super_editor/numpat_editor/documents/Europe/Bibliotheques_numeriques/2005.07.07reponse_de_la_Commission_europeenne.pdf&hash=fe7d7c5faf2d7befd0894fd998abffdf101eecf1).
18. As one EC communication noted, a digitization project on the scale of
Europeana could sharpen Europe’s competitive edge in digitization processes
compared to those in the US as well as India and China; see European Commission,
“i2010: Digital Libraries,” _COM(2005) 465 final_, September 30, 2005,
[eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52005DC0465&from=EN](http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52005DC0465&from=EN).
19. “Google Books raises concerns in some member states,” as an anonymous
Czech diplomatic source put it; see Paul Meller, “EU to Investigate Google
Books’ Copyright Policies,” _PCWorld_, May 28, 2009.
20. Pfanner 2011; Doward 2009; Samuel 2009. 21. Amicus brief is


ps://pro.europeana.eu/blogpost/eu-parliament-in-favour-of-copyright-
rules-better-fit-for-a-digital-age>. 39. Jasanoff 2013, 133. 40. Ibid. 41. Tate
2001. 42. It would be tempting to suggest the discussion on harmonization
above would apply to interoperability as well. But while the concepts of
harmonization and interoperability—along with the neighboring term
standardization—are used interchangeably and appear similar at first glance,
they nevertheless have precise cultural-legal meanings and implicate different
infrastructural set-ups. As noted above, the notion of harmonization is
increasingly used in the legal context of harmonizing regulatory
apparatuses—in the case of mass digitization especially copyright laws. But
the word has a richer semantic meaning, suggesting a search for commonalities,
literally by means of fitting together or arranging units into a whole. As
such the notion of harmony suggests something that is both pleasing and
presupposes a cohesive unit(y), for example, a door hinged to a frame, an arm
hinged to a body. While used in similar terms, the notion of interoperability
expresses a very different infrastructural modality. If harmonization suggests
unity, interoperability rather alludes to modularity. For more on the concepts
of standardization and harmonization in regulatory contexts, see Tay and
Parker 1990. 43. The notion of interoperability is oft


m’s ability to transfer, render and connect to useful information across
systems, and calls for interoperability have increased as systems have become
increasingly complex. 44. There are “myriad technical and engineering issues
associated with connecting together networks, databases, and other computer-
based systems”; digitized cultural memory institutions have the option of
providing “a greater array of services” than traditional libraries and
archives, from sophisticated search engines to document reformatting and rights
negotiations; digitized cultural memory materials are often more varied than
the material held in traditional libraries; and finally and most importantly,
mass digitization institutions are increasingly becoming platforms that
connect “a large number of loosely connected components” because no “single
corporation, professional organization, or government” would be able to
provide all that is necessary for a project such as Europeana; not least on an
international scale. EU-NSF Digital Library Working Group on Interoperability
between Digital Libraries Position Paper, 1998. 45. _The
Digicult Report: Technological Landscapes for Tomorrow’s Cultural Economy:
Unlocking the Value of Cultural Heritage: Executive Summary_ (Luxembourg:
Office for Official Publications of the European Commun


51. Tsilas 2011, 103. 52. Borgman
2015, 46. 53. McDonough 2009. 54. Palfrey and Gasser 2012. 55. DeNardis 2011.
56. The .txtual Condition: Digital Humanities, Born-Digital Archives, and the
Future Literary; Palfrey and Gasser 2012; Matthew Kirschenbaum, “Distant
Mirrors and the Lamp,” talk at the 2013 MLA Presidential Forum Avenues of
Access session on “Digital Humanities and the Future of Scholarly
Communication.” 57. Ping-Huang 2016. 58. Lessig 2005. 59. Ibid. 60. Ibid. 61.
Palfrey and Gasser 2012. 62. McPherson 2012, 29. 63. Berardi, Genosko, and
Thoburn 2011, 29–31. 64. For more on the nexus of freedom and control, see
Chun 2006. 65. The mere act of digitization of course inflicts mobility on an
object as digital objects are kept in a constant state of migration. 66. Krysa
2006. 67. See only the wealth of literature currently generated on the
“curatorial turn,” for example, O’Neill and Wilson 2010; and O’Neill and
Andreasen 2011. 68. Romeo and Blaser 2011. 69. Europeana Sound Connections,
collections-on-a-social-networking-platform.html>. 70. Ridge 2013. 71. Carolyn
Dinshaw has argued for the amateur’s ability in similar terms, focusing on her
potential to queer the archive (see Dinshaw 2012). 72. Stiegler 2003; Stiegler
n.d. The idea of the amateur as a subversive character precedes digitization,
of course. Think only of Roland Barthes’s idea of the amateur as a truly
subversive character that could lead to a break with existing ideologies in
disciplinary societies; see, for instance, Barthes’s celebration of the
amateur as a truly anti-bourgeois character (Barthes 1977 and Barthes 1981).
73. Not least in light of recent writings on experience, and even love
itself, as a form of labor (see Weigel 2016). The constellation of love as a
form of labor has a long history (see Lewis 1987). 74. Raddick et al. 2009;
Proctor 2013. 75. “Many companies and institutions, that are successful
online, are good at supporting and harnessing people’s cognitive surplus. …
Users get th


plomatic reputation. Yet by transferring Hitler’s
author’s rights to the Bavarian Ministry, they allocated _Mein Kampf_ to an
existence in a gray area between private and public law. Since then, the book
has been the center of attention in a rift between, on the one hand, the
Ministry of Finance, which has rigorously defended its position as the formal
rights holder, and, on the other hand, historians and intellectuals who,
supported by the Bavarian science minister Wolfgang Heubisch, have argued that an
academic annotated version of _Mein Kampf_ should be made publicly accessible
in the name of Enlightenment. 81. Latour 2007. 82. Europeana’s more
traditional curatorial approach to mass digitization was criticized not only
by the media, but also others involved in mass digitization projects, who
claimed that Europeana had fundamentally misunderstood the point of mass
digitization. One engineer working on mass digitization projects in the
influential cultural software developer organization IRI argued that
Europeana’s production pattern was comparable to “launching satellites”
without thinking of the messages that are returned by the satellites. Google,
he argued, was differently attuned to the importance of feedback, because
“feedback is their business.” 83. In the most recent published report, Germany
contributes with about 15 percent and France with around 16 percent of the
total amount of available works. At the same time, Belgium and Slovenia only
count around 1 percent and Denmark along with Greece, Luxembourg, Portugal,
and a slew of other countries doesn’t even achieve representation in the pie
chart; see “Europeana Content Report,” August 6, 2015,
/europeana-dsi-ms7-content-report-august.pdf>. 84. Europeana information
scholar interview, 2011. 85. Ibid. 86. Wiebe de Jager, “MS15: Annual traffic
report and analysis,” Europeana, May 31, 2014.

# 4
The Licit and Illicit Nature of Mass Digitization

## Introduction: Lurking in the Shadows

A friend has just recommended an academic book to you, and now you are dying
to read it. But you know that it is both expensive and hard to get your hands
on. You head down to your library to request the book, but you soon realize
that the wait list is enormous and that you will not be able to get your hands
on the book for a couple of weeks. Desperate, you turn to your friend for
help. She asks, “Why don’t you just go to a pirate library?” and provides you
with a link. A new world opens up. Twenty minutes later you have downloaded 30
books that you felt were indispensable to your bookshelf. You didn’t pay a
thing. You know what you did was i


“What are the moral
implications of my actions vis-à-vis the colonial framework that currently
dictates Europeana’s copyright policies?”

The existence of what this book terms shadow libraries raises difficult
questions, not only to your own moral compass but also to the field of mass
digitization. Political and popular discourses often reduce the complexity of
these questions to “right” and “wrong” and Hollywood narratives of pirates and
avengers. Yet, this chapter wishes to explore the deeper infrapolitical
implications of shadow libraries, setting out the argument that shadow
libraries offer us a productive framework for examining the highly complex
legal landscape of mass digitization. Rather than writing a chapter that
either supports or counters shadow libraries, the chapter seeks to chart the
complexity of the phenomenon and tease out its relevance for mass digitization
by framing it within what we might call an infrapolitics of parasitism.

In _The Parasite_ , a strange and fabulating book that brings together
information theory and cybernetics, physics, philosophy, economy, biology,
politics, and folk tales, French philosopher Michel Serres constructs an
argument about the conceptual figure of the parasite to explore the parasitic
nature of social relations. In a dizzying array of images and thought-
constructs, Serres argues against the idea of a balanced exchange of energy,
suggesting instead that our world is characterized by one parasite stealing
energy by feeding on another organism. For this purpose he reminds us of the
three meanings of parasite in


milieu.” In the following
sections, the lens of the parasite will help us explore the murky waters of
shadow libraries, not (only) as entities, but also as relational phenomena.
The point is to show how shadow libraries belong to the same infrapolitical
ecosystem as Google Books and Europeana, sometimes threatening them, but often
also strengthening them. Moreover, it seeks to show how visitors’ interactions
with shadow libraries are also marked by parasitical relations with Google,
which often mediates literature searches, thus entangling Google and shadow
libraries in a parasitical relationship where one feeds off the other and vice
versa.

Despite these entangled relations, the mass digitization strategies of shadow
libraries, Europeana, and Google Books differ significantly. Basically, we
might say that Google Books and Europeana each represent different strategies
for making material available on an industrial scale while maintaining claims
to legality. The sprawling and rapidly growing group of mass digitization
projects interchangeably termed shadow libraries represents a third set of
strategies. Shadow libraries5 share affinities with Europeana and Google Books
in the sense that they offer many of the same services: instant access to a
wealth of cultural works spanning journal articles, monographs, and textbooks
among others. Yet, while Google Books and Europeana promote visibility to
increase traffic, embed themselves in formal systems of communication, and
operate within the legal frameworks of public funding and private contracting,
shadow libraries in contrast operate in the shadows of formal visibility and
regulatory systems. Hence, while formal mass digitization projects such as
Google Books and Europeana publicly proclaim their desire to digitize the
world’s cultural memory, another layer of people, scattered across the globe
and belonging to very diverse environments, harbor the same aspirations, but
in much more subtle terms. Most of these people express an interest in the
written word, a moral conviction of free access, and a political view on
existing copyright regulations as unjust and/or untimely. Some also express
their fascination with the new wonders of technology and their new
infrastructural possibilities. Others merely wish to practice forms of access
that their finances, political regime, or geography otherwise prohibit them
from doing. And all of them are important nodes in a new shadowy
infrastructural system that provides free access worldwide to books and
articles on a scale that collectively far surpasses both Google and Europeana.

Because of the illicit nature of shadow libraries, most analyses of them have
centered on their legal transgressions. Yet, their cultural trajectories
contain nuances that far exceed legal binaries. Approaching shadow libraries
through the lens of infrapolitics is helpful for bringing forth these much
more complex cultural mass digitization systems. This chapter explores three
examples of shadow libraries, focusing in particular on their stories of
origin, their cultural economies, and their sociotechnical infrastructures.
Not all shadow libraries fit perfectly into the category of mass digitization.
Some of them are smaller in size, more selective, and less industrial.
Nevertheless, I include them because their open access strategies allow for
unlimited downloads. Thus, shadow libraries, while perhaps selective in size
themselves, offer the opportunity to reproduce works at a massive and
distributed scale. As such, they are the perfect example of a mass
digitization assemblage.

The first case centers on lib.ru, an early Russia-based file-sharing platform
for exchanging books that today has grown into a massive and distributed file-
sharing project. It is primarily run by individuals, but it has also received
public funding, which shows that what at first glance appears as a simple case
of piracy simultaneously serves as a much more complex infrapolitical
structure. The second case, Monoskop, distinguishes itself by its boutique
approach to digitization. Monoskop too is characterized by its territorial
trajectory, rooted in Bratislava’s digital scene as an attempt to establish an
intellectual platform for the study of avant-garde (digital) cultures that
could connect its Bratislava-based creators to a global scene. Finally, the
chapter looks at UbuWeb, a shadow library dedicated to avant-garde cultural
works ranging from text and audio to images and film. Founded in 1996 as a
US-based noncommercial file-sharing site by poet Kenneth Goldsmith in response to
the marginal distribution of crucial avant-garde material, UbuWeb today offers
a wealth of avant-garde sound art, video, and textual works.

As the case studies show, shadow libraries have become significant mass
digitization infrastructures that offer the user free access to academic
articles and books, often by means of illegal file-sharing. They are informal
and unstable networks that rely on active user participation across a wide
spectrum, from deeply embedded people who have established file-sharing sites
to the everyday user occasionally sending the odd book or article to a friend
or colleague. As Lars Eckstein notes, most shadow libraries are characterized
not only by their informal character, but also by the speed with which they
operate, providing “a velocity of media content” which challenges legal
attacks and other forms of countermeasures.6 Moreover, shadow libraries also
often operate in a much more widely distributed fashion than both Europeana
and Google, distributing and mirroring content across multiple servers, and
distributing labor and responsibility in a system that is on the one hand more
robust, more redundant, and more resistant to any single point of failure or
control, and on the other hand more ephemeral, without a central point of
back-up. Indeed, some forms of shadow libraries exist entirely without a
center, instead operating infrastructurally along communication channels in
social media; for example, the use of the Twitter hashtag #ICanHazPDF to help
pirate scientific papers.

Today, shadow libraries exist as timely reminders of the infrapolitical nature
of mass digitization. They appear as hypertrophied versions of the access
provided by Google Books and Europeana. More fundamentally, they also exist as
political symptoms of the ideologies of the digital, characterized by ideals
of velocity and connectivity. As such, we might say that although shadow
libraries often position themselves as subversives, in many ways they also
belong to the same storyline as other mass digitization projects such as
Google Books and Europeana. Significantly, then, shadow libraries are
infrapolitical in two senses: first, they have become central infrastructural
elements in what James C. Scott calls the “infrapolitics of subordinate
groups,” providing everyday resistance by creating entrance points to
hitherto-excluded knowledge zones.7 Second, they represent and produce the
infrapolitics of the digital _tout court_ with their ideals of real-time,
globalized, and unhindered access.

## Lib.ru

Lib.ru is one of the earliest known digital shadow libraries. It was
established by the Russian computer science professor Maxim Moshkov, who
complemented his academic practice of programming w


re then distributed, leaving only the Russian
literary classics on the original site.13 Neighboring sites hosted other
genres, ranging from user-generated texts and fan fiction on a shadow site
called [samizdat.lib.ru](http://samizdat.lib.ru) to academic books in a shadow
library titled Kolkhoz, named after the commons-based agricultural cooperative
of the early Soviet era and curated and managed by “amateur librarians.”14 The
steadily accumulating numbers of added works, digital distributors, and online
access points expanded not only the range of the shadow collections, but also
their networked affordances. Lib.ru and its offshoots thus grew into an
influential node in the global mass digitization landscape, attracting both
political and legal attention.

### Lib.ru and the Law

Until 2004, lib.ru deployed a practice of handling copyright complaints by
simply removing works at the first request from the authors.15 But in 2004 the
library received its first significant copyright claim from the big Russian
publisher Kirill i Mefody (KM). KM requested that Moshkov remove access to a
long list of books, claiming exclusive Internet rights on the books, along
with works that were considered public domain. Moshkov refused to honor the
request, and a lawsuit ensued. The Ostankino Court of Moscow initially denied
the lawsuit because the contracts for exclusive Internet rights were
considered i


e infrapolitics of samizdat not only referred to a specific social practice
but were also, as Ann Komaromi reminds us, a particular discourse network
rooted in the technology of the typewriter: “Because so many people had their
own typewriters, the production of samizdat was more individual and typically
less linked to ideology and organized political structures. … The circulation
of Samizdat was more rhizomatic and spontaneous than the underground
press—samizdat was like mushroom ‘spores.’”26 The technopolitical
infrastructure of samizdat changed, however, with the fall of the Berlin Wall
in 1989, the further decentralization of the Russian media landscape, and the
emergence of digitization. Now, new nodes emerged in the Russian information
landscape, and there was no centralized authority to regulate them. Moreover,
the transmission of the Western capitalist system gave rise to new types of
shadow activity that produced items instead of just sharing items, adding a
new consumerist dimension to shadow libraries. Indeed, as Kuznetsov notes, the
late-Soviet samizdat created a dynamic textual space that aligned with more
general tendencies in mass digitization where users were “both readers and
librarians, in contrast to a traditional library with its order, selection,
and strict catalogisation.”27

If many of the new shadow libraries that emerged in the 1990s and 2000s were
inspired by the infrapolitics of samizdat, then, they also became embedded in
an infrastructural apparatus that was deeply nested within a market economy.
Indeed, new digital libraries emerged under such names as Aldebaran,
Fictionbook, Litportal, Bookz.ru, and Fanzin, which developed new platforms
for the distribution of electronic books under the label “Liters,” offering
texts to be read free of charge on a computer screen or downloaded at a
cost.28 In both cases, th


obal phenomena,
yet one should be careful not to disregard the specific cultural-political
trajectories that shape each individual shadow library. Lib.ru demonstrates
how the infrapolitics of shadow libraries emerge as infrastructural
expressions of the convergence between historical sovereign trajectories,
global information infrastructures, and public-private governance structures.
Shadow libraries are not just globalized projects that exist in parallel to
sovereign state structures and global economic flows. Instead, they are
entangled in territorial public-private governance practices that produce
their own late-sovereign infrapolitics, which, paradoxically, are embedded in
larger mass digitization problematics, both on their own territory and on the
global scene.

## Monoskop

In contrast to the broad and distributed infrastructure of lib.ru, other
shadow libraries have emerged as specialized platforms that cater to a
specific community and encourage a specific practice. Monoskop is one such
shadow library. Like lib.ru, Monoskop started as a one-man project and in many
respects still reflects its creator, Dušan Barok, who is an artist, writer,
and cultural activist involved in critical practices in the fields of
software, art, and theory. Prior to Monoskop, his activities were mainly
focused on the Bratislava cultural media scene, and Monoskop was among other
things set up as an infr


terials” whose production was often “considered a task of
the state,”56 on the other hand it shows how intellectual content is
increasingly privatized, not only in corporate terms but also through
individuals, which in UbuWeb’s case is expressed in Kenneth Goldsmith, who
acts as the sole archival gatekeeper.57

## The Infrapolitics of Shadow Libraries

If the complexity of shadow libraries cannot be reduced to the contrastive
codes of “right” and “wrong” and global-local binaries, the question remains
how to theorize the cultural politics of shadow libraries. This final section
outlines three central infrapolitical aspects of shadow libraries: access,
speed, and gift.

Mass digitization poses two important questions to knowledge infrastructures:
a logistical question of access and a strategic question of to whom to
allocate that access. Copyright poses a significant logistical barrier between
users and works as a point of control in the ideal free flow of information.
In mass digitization, increased access to information stimulates projects,
whereas in publishing industries with monopoly possibilities, the drive is
toward restriction and control. The uneasy fit between copyright regulations
and mass digitization projects has, as already shown, given rise to several
conflicts, either as legal battles or as copyright reform initiatives arguing
that current copyright frameworks cast doubt upon the political ideal of total
access. As with Europeana and Google Books, the question of _access_ often
stands at the core of the infrapolitics of shadow libraries. Yet, the
strategic responses to the problem of copyright vary significantly: if
Europeana moves within the established realm of legality to reform copyright
regulations and Google Books produces claims to new cultural-legal categories
such as “nonconsumptive reading,” shadow libraries offer a third
infrastructural maneuver—bypassing copyright in


ng.” Serres contrasts the parasitic
model with established models of society based on notions such as exchange and
gift giving.71 Shadow libraries produce an infrapolitics that denies the
distinction between producers and subtractors of value, allowing us instead to
focus on the social roles infrastructural agents perform. Restoring a sense of
the wider context of parasitism to shadow libraries does not provide a
clear-cut solution as to when and where shadow libraries should be condemned and
when and where they should be tolerated. But it does help us ask questions in
a different way. And it certainly prevents us from regarding shadow libraries
as the “other” in the landscape of mass digitization. Shadow libraries
instigate new creative relations, the dynamics of which are infrastructurally
premised upon the medium they use. Just as typewriters were an important
component of samizdat practices in the Soviet Union, digital infrastructures
are central components of shadow libraries, and in many respects shadow
libraries bring to the fore the same cultural-political questions as other
forms of mass digitization: questions of territorial imaginaries,
infrastructures, regulation, speed, and ethics.

## Notes

1. Serres 1982, 55. 2. Serres 1982, 36. 3. Serres 1982, 36. 4. Samyn 2012. 5.
I stick with “shadow library,” a term that I first found in Lawrence Liang’s
(2012) writings on copyright and have since seen meaningfully unfolded in a
variety of contexts. Part of its strength is its sidestepping of the question
of the pirate and that term’s colonial connotations. 6. Eckstein and Schwarz
2014. 7. Scott 2009, 185–201. 8. See also Maxim Moshkov’s own website hosted
on lib.ru, . 9. Carey 2015. 10. Schmidt 2009. 11. Bodó
2016. “Libraries in the p


for instance, Larkin 2008; Castells and Cardoso
2012; Fredriksson and Arvanitakis 2014; Burkart 2014; and Eckstein and Schwarz
2014. 62. Liang 2009. 63. Larkin 2008. 64. John Bohannon, “Who’s Downloading
Pirated Papers? Everyone,” _Science Magazine_ , April 28, 2016,
everyone>. 65. “The Scientists Encouraging Online Piracy with a Secret
Codeword,” _BBC Trending_ , October 21, 2015, trending-34572462>. 66. Liu 2013. 67. Tenen and Foxman 2014. 68. See Kramer
2016. 69. Gardner and Gardner 2017. 70. Giesler 2006, 283. 71. Serres 2013, 8.

# III Diagnosing Mass Digitization

# 5 Lost in Mass Digitization

## The Desire and Despair of Large-Scale Collections

In 1995, founding editor of _Wired_ magazine Kevin Kelly mused upon how a
digital library would look:

> Two decades ago nonlibrarians discovered Borges’s Library in silicon
circuits of human manufacture. The poetic can imagine the countless rows of
hexagons and hallways stacked up in the Library corresponding to the
incomprehensible micro labyrinth of crystalline wires and gates stamped into a
silicon computer chip. A computer chip, blessed by the proper incantation of
software, creates Borges’s Library on command. … Pages from the books appear
on the screen one after another without delay. To search Borges’s Library of
all possible books, past, present, and future, one needs only to sit down (the
modern solution) and click the mouse.1

At the time of Kelly’s writing, book digitization on a massive scale had not
yet taken place. Building his chimerical dream around Jorge Luis Borges’s own
famous magic piece of speculation regarding the Library of Babel, Kelly not
only dreamed up a fantasy of what a digital library might be in an imaginary
dialogue with Borges; he also argued that Jorge Luis Borges’s vision had
already taken place, by grace of nonlibrarians, or—more
specifically—programmers. Specifically, Kelly mentions Karl Sims, a computer
scientist working on a supercomputer called Connection Machine 5 (you may
remember it from the set of _Jurassic Park_), who had created a simulated
version of Borges’s library.2

Twenty years after Kelly’s vision, a whole host of mass digitization projects
have sought, more or less explicitly, to fulfill it. Incidentally,
Brewster Kahle, one of the lead engineers of the aforementioned Connection
Machine, has become a key figure in the field. Kahle has long dreamed of
creating a universal digital library, and has worked to fulfill it in
practical terms through the nonprofit Internet Archive project, which he
founded in 1996 with the stated mission of creating “universal access to all
knowledge.” In an op-ed in 2017, Kahle lamented the recent lack of progress in
mass digitization and argued for the need to create a new vision for mass
digitization, stating, “The Internet Archive, working with library partners,
proposes bringing millions of books online, through purchase or digitization,
starting with the books most widely held and used in libraries and
classrooms.”3 Reminding us that three major entities have “already digitized
modern materials at scale: Google, Amazon, and the Internet Archive, probably
in that order of magnitude,”4 Kahle nevertheless notes that “bringing
universal access to books” has not yet been achieved because of a fractured
field that diverges on questions of money, technology, and legal clarity. Yet,
outlining his new vision for how a sustainable mass digitization project could
be achieved, Kahle remains convinced that mass digitization is both a
necessity and a possibility.

While Brewster Kahle, Kevin Kelly, Google, Amazon, Europeana’s member
institutions, and others disagree on how to achieve mass digitization, for
whom, and in what form, they are all united in their quest for digitization on
a massive scale. Many shadow libraries operate with the same quantitative
statements, proudly asserting the quantities of their massive holdings on the
front page.

Given the fractured field of mass digitization, and the lack of economic
models for how to actually make mass digitization sustainable, why does the
common dream of mass digitization persist? As this chapter shows, the desire
for quantity, which drives mass digitization, is—much like the Borges stories
to which Kelly also refers—laced with ambivalence. On the one hand, the
quantitative aspirations are driven forth by the basic assumption that “more
is more”: more data and more cultural memory equal better industrial and
intellectual progress. On the other hand, the sheer scale of ambition also
causes frustration, anxiety, and failed plans.

The sense that sheer size and big numbers hold the promise of progress and
greatness is nothing new, of course. And mass digitization brings together
three fields that have each historically grown out of scalar ambitions:
collecting practices, statistics, and industrialization processes.
Historically, as cultural theorist Couze Venn reminds us, most large
collections bear the imprint of processes of (cultural) colonization, human
desires, and dynamics of domination and superiority. We therefore find in
large collections the “impulses and yearnings that have conditioned the
assembling of most of the collections that today establish a monument to past
efforts to gather together knowledge of the world and its treasury of objects
and deeds.”5 The field of statistics, moreover, so vital to the evolution of
modern governance models, is also premised upon the accumulation of ever-more
information.6 And finally, we all recognize the signs of modern
industrialization processes as they appear in the form of globalization,
standardization, and acceleration. Indeed, as French sociologist Henri
Lefebvre once argued (with a nod to Marx), the history of modern society could
plainly and simply be seen as the history of accumulation: of space, of
capital, of property.7

In mass digitization, we hear the political echoes of these histories. From
Jeanneney’s war cry to defend European patrimonies in the face of Google’s
cultural colonization to Google’s megalomaniac numbers game and Europeana’s
territorial maneuverings, scale is used as a point of reference not only to
describe the space of cultural objects in themselves but also to outline a
realm of cultural command.

A central feature in the history of accumulation and scale is the development
of digital technology and the accompanying new modes of information
organization. But even before then, the invention of new technologies offered
not only new modes of producing and gathering information and new
possibilities of


gm, new teleologies were formed that emphasized the latent value of any
piece of information, expressed for instance by Joachim Jungius’s exclamation
that “no field was too remote, no author too obscure that it would not yield
some knowledge or other” and Gabriel Naudé’s observation that there is “no
book, however bad or decried, which will not be sought after by someone over
time.”9 The idea that any piece of information was latently valuable was later
remarked upon by Melvil Dewey, who noted at the beginning of the twentieth
century that a “normal librarian’s instinct is to keep every book and
pamphlet. He knows that possibly some day, somebody wants it.”10

Today, mass digitization repeats similar concerns. It reworks the old dream of
an all-encompassing and universal library and has foregrounded once again
questions about what to save and what to let go. What, one might ask, would
belong in such a library? One important field of interest is the question of
whether, and how, to preserve metadata—today’s marginalia. Is it sufficient to
digitize cultural works, or should all accompanying information about the
provenance of the work also be included? And how can we agree upon what
marginalia actually is across different disciplines? Mass digitization
projects in natural history rarely digitize marginalia such as logs and
written accounts, focusing only on what to that discipline is the main object
at hand, for example, a piece of rock, a fly specimen, a pressed plant. Yet,
in the history of science, logs are an invaluable source of information about
how the collected object ended up in the collection, the meaning it had to the
collector, and the place it takes in the collection.11 In this way, new
questions with old trajectories arise: What is important for understanding a
collection and its life? What should be included and excluded? And how will we
know what will turn out to be important in the future?

In the era of big data, the imperative is often to digitize and “save all.”
Prestige mass digitization projects such as Google Books and Europeana have
thus often contextualized their importance in terms of scale. Indeed, as we
saw in the previous chapters, the question of scale has been a central point
of political contestation used to signal infrastructural power. Thus the hype
around Google Books, as well as the political ire it drew, centered on the
scale of the project just as quantitative goals are used in Europeana to
signal progress and significance. Inherent in these quantitative claims are
not only ideas about political power, but also the widespread belief in
digital circles—and the political regimes that take inspiration from them—that
the more information the user is able to access, the more empowered the user
is to navigate and make meaning on their own. In recent years, the imaginaries
of freedom of navigation have also been joined by fantasies of freedom of
infrastructural construction through the image of the platform. Mass
digitization projects should therefore not only offer the user the potential
to navigate collections freely, but also to build new products and services on
top of them.12 Yet, as this chapter argues, the ethos of potentially unlimited
expansion also prompts a new set of infrapolitical questions about agency and
control. While these questions are inherently related to the larger questions
of territory and power explored in the previous chapters, they occur on a
different register, closer to the individual user and within the spatialized
imaginaries of digital information.

As many critics have noted, the logic of expansion and scale, and the
accompanying fantasies of the empowered user, often builds on ne


nt through platforming is often also
shot through with neoliberal ideals that not only fail to take into account
the complex infrapolitical realities of social interaction, but also rely on
an entrepreneurial epistemology that evokes “a flat, two-dimensional stage on
which resources are laid out for users to do stuff with” and which we are not
“inclined to look underneath or behind it, or to question its structure.”14

This chapter unfolds these central infrapolitical problematics of the spatial
imaginaries of knowledge in relation to a set of prevalent cultural spatial
tropes that have gained new life in digital theory and that have informed the
construction and development of mass digitization projects: the flaneur, the
labyrinth, and the platform. Cultural reports, policy papers, and digital
design strategies often use these three tropes to elicit images of pleasure
and playfulness in mass digitization projects; yet, as the following sections
show, they also raise significant questions of control and agency, not least
against the backdrop of ever-increasing scales of information production.

## Too Much—Never Enough

The question of scale in mass digitization is often posed as a rational quest
for knowledge accumulation and interoperability. Yet this section argues that
digitized collections are more than just rational projects; they strike deep
affective chords of desire, domination, and anxiety. As Couze Venn reminds us,
collections harbor an intimate connection between cognition and affective
economy. In this connection, the rationalized drive to collect is often
accompanied by a slippage, from a rationalized urge to a pathological drive
ultimately associated with desire, power, domination, anxiety, nostalgia,
excess, and—sometimes even—compulsion and repetition.15 The practice of
collecting objects thus not only signals a rational need but


ished
annually; and unless this mass be properly arranged, and the means furnished
by which its contents may be ascertained, literature and science will be
overwhelmed by their own unwieldy bulk.”22 The experience of feeling
overwhelmed by information and lacking the right tools to handle it is no
joke. Indeed, a number of German librarians actually went documentably insane
between 1803 and 1825 in the wake of the information glut that followed the
secularization of ecclesiastical libraries.23 The desire for grand collections
has thus always also been followed by an accompanying anxiety relating to
questions of infrastructure.

As the history of collecting pathologies shows, reducing mass digitization
projects to rational and technical information projects would deprive them of
their rich psychological dimensions. Instead of discounting these pathologies,
we should acknowledge them, and examine not only their nature, but also their
implications for the organization of mass digitization projects. As the
following section shows, the pathologies not only exist as psychological
forces, but also as infrastructural imaginaries that directly impact theories
on how best to organize information in mass digitization. If the scale of mass
digitization projects is potentially limitless, how should they be organized?
And how will we feel when moving about in their gargantuan archives?

## The Ambivalent Flaneur

In an article on cultures of archiving, sociologist Mike Featherstone asked
whether “the expansion of culture available at our fingertips” could be
“subjected to a meaningful ordering,” or whether the very “desire to remedy
fragmentation” should be “seen as clinging to a form of humanism with its
emphasis upon cultivation of the persona and unity which are now regarded as
merely nostalgic.”24 Featherstone raised the question in response to the
popularization of the Internet at the turn of the millennium. Yet, as the


n has shown, his question is probably as old as the collecting
practices themselves. Such questions have become no less significant with mass
digitization. How are organizational practices conceived of as meaningful
today? As we shall see, this question not only relates to technical
characteristics but is also informed by a strong spatial imaginary that often
takes the shape of labyrinthine infrastructures and often orients itself
toward the figure of the user. Indeed, the role of the organizer of knowledge,
and therefore the accompanying responsibility of making sense of collections,
has been transferred from knowledge professionals to individuals.

Today, as seen in all the examples of mass digitization we have explored in
the previous chapters, cultural memory institutions face a different paradigm
than that of the eighteenth- and nineteenth-century disciplining cultural
memory institution. In an age that encourages individualism, democratic
ideals, and cultural participation, the orientations of the cultural memory
institutions have shifted in discourse, practice, or both, toward an emphasis
on the importance of the subjective experience and active participation of the
individual visitor. As part of this shift, and as a result of the increasing
integration of the digital imaginary and production apparatus into the field
of cultural memory, the visitor has thus metamorphosed from a disciplinary
subject to a prosumer, produser, participant, and/or user.

The organizational shift in the cultural memory ecosystem means that
visionaries and builders of mass digitization infrastructures now pay
attention not only to how collections may reflect upon the institution that
holds the collection, but also to how the user experiences the informational
navigation of collections. This is not to say that making an impression, or
even disciplining the user, is not a concern for many mass digitization
projects. Mass digitization’s constant public claims to literal greatness
through numbers evidence this. Yet, today’s projects also have to contend with
the opinion of the public and must make their projects palatable and
consumable rather than elitist and intimidating. The concern of the builders
of mass digitization infrastructure is therefore not only to create an
internal logic to their collections, but also to maximize the user’s
experience of being offered a wealth of information, while mitigating the
danger of giving the visitor a sense of losing oneself, or even drowning, in
information. An important question for builders of mass digitization projects
has therefore been how to build visual and semantic infrastructures that offer
the user a sense of meaningful direction as well as a desire to keep browsing.

While digital collections are in principle no longer tethered to their
physical origins in spatial terms, we still encounter ideas about them in
spatialized terms, often using notions such as trails, paths, and alleyways to
visualize the spaces of digital collections.25 This form of spatialized logic
did not emerge with the mass digitization of cultural heritage collections,
however; it also resides at the heart of some of the most influential early
theories of the digital realm.26 These theorized and conceptualized
the web as a new form of architectural infrastructure, not only in material
terms (such as cables and servers) but also as a new experiential space.27 And
in this spatialized logic, the figure of the flaneur became a central
character. Thus, we saw in the 1990s the rise of a digital interpretation of
the flaneur, originally an emblematic figure of modern urban culture at the
turn of the twentieth century, in the form of the virtual flaneur or the
cyberflaneur. In 1994, German net artists Heiko Idensen and Ma


unending maze of desire stands in
contrast to the uncomplicated flaneur invoked in celebratory theories on the
digital flaneur. Yet, recent literature on the design of digital realms
suggests that the hesitant man caught in a drive for more information is a
much more accurate image of the digital flaneur than the man-in-the-know.34
Perhaps, then, the allegorical figure of the flaneur in digital design should
be used less to address pleasurable wandering and more to invoke “the most
characteristic response of all to the wholly new forms of life that seemed to
be developing: ambivalence.”35 Caught up in the commodified labyrinth of the
modern digitized archive, the digital flaneur of mass digitization might just
as easily get stuck in a repetitive, monotonous routine of scrolling and
downloading new things, forever suspended in a state of unfulfilled desire,
as move about in meaningful and pleasurable ways.36

Moreover, and just as importantly, the figure of the flaneur is also entangled
in a cultural matrix of assumptions about gender, capabilities, and colonial
implications. In short: the flaneur is a white, able-bodied male. As feminist
theory attests, the concept of the flaneur is male by definition. Some
feminists such as Griselda Pollock and Janet Wolff have denied the possibility
of a female variant altogether, because of women’s status as (often absent)
objects rather than


mystery and ambivalence, curiosity and risk-taking—is under
assault.”45 These two death sentences, separated by a century, link the
environment of the flaneur to significant questions about the commodification
of space and its infrapolitical implications.

Exploring the implications of this topography, the following section suggests,
will help us understand the infrapolitics of the spatial imaginaries of mass
digitization, not only in relation to questions of globalization and late
sovereignty, but also to cultural imaginaries of knowledge infrastructures.
Indeed, these two dimensions are far from mutually exclusive, but rather
belong to the same overarching tale of the politics of mass digitization.
Thus, while the material spatial infrastructures of mass digitization projects
may help us appreciate certain important political dynamics of Europeana,
Google Books, and shadow libraries (such as their territorializing features or
copyright contestations in relation to knowledge production), only an
inclusion of the infrastructural imaginaries of knowledge production will help
us understand the complex politics of mass digitization as it metamorphoses
from analog buildings, shelves, and cabinets to the circulatory networks of
digital platforms.

## Labyrinthine Imaginaries: Infrastructural Perspectives of Power and Knowledge Production

If the flaneur is a central early figure in the cultural imaginary of the
observer of cultural texts, the labyrinth has long served as a cultural
imaginary of the library, and, in larger terms, the spatialized
infrastructural conditions of knowledge and power. Thus, literature is rife
with works that draw on libraries and labyrinths to convey stories about
knowledge production and the power struggles thereof. Think only of the elderly
monk-librarian in Umberto Eco’s classic, _The Name of the Rose,_ who notes
that: “the library is a great labyrinth, sign of the labyrinth of the world.
You enter and you do not know whether you will come out”46; or consider the
haunting images of being lost in Jorge Luis Borges’s tales about labyrinthine
libraries.47 This section therefore turns to the infrastructural space of the
labyrinth, to show that this spatial imaginary, much like the flaneur, is
loaded with cultural ambivalence, and to explore the ways in which the
labyrinthine infrastructural imaginary emphasizes and crystallizes the
infrapolitical tension in mass digitization projects between power and
perspective, agency and environment, playful innovation and digital labor.

The labyrinth is a prevalent literary trope, found in authors from Ovid,
Virgil, and Dante to Dickens and Nietzsche, and it has been used particularly
in relation to issues of knowledge and agency, and in haunting and nightmarish
terms in modern literature.48 As the previous section indicates, the labyrinth
also provides a significant image for understanding our relationship to mass
digitization projects as sites of both knowledge production and experience.
Indeed, one shadow library is even named _Aleph_, which refers to the ancient
Hebrew letter and likely also nods at Jorge Luis Borges’s labyrinthine short
story, _The Aleph_, on infinite labyrinthine architectures. Yet, what kind of
infrastructure is a labyrinth, and how does it relate to the potentials and
perils of mass digitization?

In her rich historical study of labyrinths, Penelope Doob argues that the
labyrinth possesses a dual potentiality: on the one hand, if experienced from
within, the labyrinth is a sign of confusion; on the other, when viewed from
above, it is a sign of complex order.49 As Harold Bloom notes, “all of us have
had the experience of admiring a structure when outside it, but becoming
unhappy within it.”50 Envisioning the labyrinth from within links to a
claustrophobic sense of ignorance, while also implying the possibility of
progress if you just turn the next corner. What better way to describe one’s
experience in the labyrinthine infrastructures of mass digitization projects
such as Google Books with its infrastructural conditions and contexts of
experience and agency? On the one hand, Google Books appears to provide the
view from above, lending itself as a logistical aid in its information-rich
environment. On the other hand, Google Books also produces an alienating
effect of impenetrability on two levels. First, although Google presents
itself as a compass, its seemingly infinite and constantly rearranging
universe nevertheless creates a sense of vertigo, only reinforced by the
almost existential question “Do you feel lucky?” Second, Google Books also
feels impenetrable on a deeper level, with its black-boxed governing and
ordering principles, hidden behind complex layers of code, corporate cultures,
and nondisclosure agreements.51 But even less-commercial mass digitization
projects such as Europeana and Monoskop can produce a sense of
claustrophobia and alienation in the user. Think only of the frustration
encountered when reaching dead ends in the form of broken links or a lack of
access imposed by European copyright regulations. Or even the alienation and
dissatisfaction that can well up when there are seemingly no other limits to
knowledge, such as in Monoskop, than one’s own cognitive shortcomings.

The figure of the labyrinth also serves as a reminder that informational
strolling is not only a leisurely experience, but also a laborious process.
Penelope Doob thus points out the common medieval spelling of labyrinth as
_laborintus_, which foregrounds the concept of labor and “difficult process,”
whether frustrating, useful, or both.52 In an age in which “labor itself is
now play, just as play becomes more and more laborious,”53 Doob’s etymological
excursion serves to highlight the fact that in many mass digitization projects
it is indeed the user’s leisurely information scrolling that in the end
generates profit, cultural value, and budgetary justification for mass
digitization platforms. Jose van Dijck’s analysis of the valuation of traffic
in a digital environment is a timely reminder of how traffic is valued in a
cultural memory environment that increasingly orients itself toward social
media: “Even though communicative traffic on social media platforms seems
determined by social values such as popularity, attention, and connectivity,
they are impalpably translated into monetary values and redressed in business
models made possible by digital technology.”54 This is visible, for instance,
in Europeana’s usage statistics reports, which link the notions of _traffic_
and _performance_ together in an ontological equation (in this equation poor
performance inevitably means a mark of death).55 In a blogpost marking the
launch of the _Europeana Statistics Dashboard_, we are told that information
about mass digitization traffic is “vital information for a modern cultural
institution for both reporting and planning purposes and for public
accountability.”56 Thus, although visitors may feel solitary in their digital
wanderings, their digital footsteps are in fact obsessively traced and tracked
by mass digitization platforms and often also by numerous third parties.

Today, then, the user is indeed at work as she makes her way in the
labyrinthine infrastructures of mass digitization by scrolling, clicking,
downloading, connecting, and clearing and creating new paths. And while
“search” has become a keyword in digital knowledge environments, digital
infrastructures in mass digitization projects in fact distract as much as they
orient. This new economy of cultural memory raises the question: if mass
digitization projects, as labyrinthine infrastructures, invariably disorient
the wanderer as much as they aid her, how might we understand their
infrapolitics? After all, as the previous chapters have shown, mass
digitization projects often present a wide array of motivations for why
digitization should happen on a massive scale, with knowledge production and
cultural enlightenment usually featuring as the strongest arguments. But as
the spatialized heuristics of the flaneur and the labyrinth show, knowledge
production and navigation are anything but simple concepts. Rather, the
political dimensions of mass digitization discussed in previous chapters—such
as standardization, late sovereignty, and network power—are tied up with the
spatial imaginaries of what knowledge production and cultural memory are and
how they should and could be organized and navigated.

The question of the spatial imaginaries of knowledge production and
imagination has a long philosophic history. As historian David Bates notes,
knowledge in the Enlightenment era was often imagined as a labyrinthine
journey. A classic illustration of how this journey was imagined is provided
by Enlightenment philosopher Jean-Louis Castilhon, whose frustration is
palpable in this exclamation: “How cruel and painful is the situation of a
Traveller


s, as Kristin Veel points out, had a much more complex
relationship to the spatial organization of the truth. Eco and Deleuze and
Guattari thus conceived of their labyrinths as networks “in which all points
can be connected with one another” with “no center” but “an almost unlimited
multiplicity of alternative paths,” which makes it “impossible to rise above
the structure and observe it from the outside, because it transcends the
graphic two-dimensionality of the two earlier forms of labyrinths.”58 Deleuze
expressed the senselessness of these contemporary labyrinths as a “theater
where nothing is fixed, a labyrinth without a thread (Ariadne has hung
herself).”59

In mass digitization, this new infrastructural imaginary feeds a looming
concern over how best to curate and infrastructurate cultural collections. It
is this concern that we see at play in the aforementioned institutional
concerns over how to best create meaningful paths in the cultural collections.
The main question that resounds is: where should the paths lead if there is no
longer one truth, that is, if the labyrinth has no center? Some mass
digitization projects seem to revel in this new reality. As we have seen,
shadow libraries such as Monoskop and UbuWeb use the affordances of the
digital to create new cultural connections outside of the formal hierarchies
of cultural memory institutions. Yet, while embraced by some, predictably the
new distribution of authority generates anxiety in the cultural memory circles
that had hitherto been able to hold claim to knowledge organization expertise.
This is the dizzying perspective that haunts the cultural memory professionals
faced with Europeana’s data governance model. Thus, as one Europeana
professional explained to me in 2010, “Europeana aims at an open-linked-data
model with a number of im


course be explained by a rationalized fear of
job insecurity and territorial concerns. Yet, the fear of knowledge
infrastructures without a center may also run deeper. As Penelope Doob reminds
us, the center of the labyrinth historically played a central moral and
epistemological role in the labyrinthine topos, as the site that held the
epiphanous key to unravel whatever evils or secrets the labyrinth contained.
With no center, there is no key, no epiphany.61 From this perspective, then,
it is not only a job that is lost. It is also the meaning of knowledge
itself.62

What, then, can we take from these labyrinthine wanderings as we pursue a
greater understanding of the infrapolitics of mass digitization? Certainly, as
this section shows, the politics of mass digitization is entangled in
spatialized imaginaries that have a long and complex cultural and affective
trajectory interlinked with ontological and epistemological questions about
the very nature of knowledge. Cladding the walls of these trajectories are, of
course, the ever-present political questions of authority and territory, but
also deeper cultural and affective questions about the nature and meaning of
knowledge as it bandies about in our cultural imaginaries, between discoveries
and dead-ends, between freedom and control.

As the next section will show, one concept has in particular come to
encapsulate these concerns: the notion of serendipity. While the notion of
serendipity has a long history, it has gained new relevance with mass
digitization, where it is used to express the realm of possibilities opened up
by the new digital infrastructures of knowledge production. As such, it has
come to play a role, not only as a playful cultural imaginary, but also as an
architectural ideal in software developments for mass digitization. In the
following section, we will look at a few examples of these architectures, as
well as the knowledge politics they are entangled in.

## The Architecture of Serendipitous Platforms

Serendipity has long been a cherished word in archival studies, used to
describe a magical moment of “Eureka!” A fickle and fabulating concept, it
belongs to the world of discovery, capturing the moment when a meandering
soul, a flaneur, accidentally stumbles upon a valuable find. As such, the
moment of serendipity is almost always a happy circumstance of chance, and
never an unfortunate moment of risk. Serendipity also embodies the word in its
own origins. This section outlines the origins of this


he archive “in the
humanities” represents a “prime site for serendipitous discovery.”71 In most
of these cases, serendipity is taken to mean some form of archival insight,
and often even a critical intellectual process. Deb Verhoeven, Associate Dean
of Engagement and Innovation at the University of Technology Sydney, reminds
us in relation to feminist archival work that “stories of accidental
discovery” can even take on dimensions of feminist solace, consoling “the
researcher, and us, with the idea that no system, whatever its claims to
discipline, comprehensiveness, and structure, is exempt from randomness, flux,
overflow, and therefore potential collapse.”72

But with mass digitization processes, their fusion of probability theories and
archives, and their ideals of combined fun and fact-finding, the questions
raised in the hard sciences about serendipity, its connotations of freedom and
chance, engineering and control, now also haunt the archives of historians and
literary scholars. Serendipity has now often come to be used as a motivating
factor for digitization in the first place, based on arguments that mass
digitized archives allow not only for dedicated and target-oriented research,
but also for new modes of search, of reading haphazardly “on the diagonal”
across genres and disciplines, as well as across institutional and national
borders that hitherto kept works and insights apart. As one spokesperson from
a prominent mass digitization company states, “digital collections have been
designed both to assist researchers in accessing original primary source
materials and to enable them to make serendipitous discoveries and unexpected
connections between sources.”73 And indeed, this sentiment reverberates in all
mass digitization projects from Europeana and Google Books to smaller shadow
libraries such as UbuWeb and Monoskop. Some scholars even argue that
serendipity takes on new forms due to digitization.74

It seems only natural, then, that mass digitization projects, and their
actors, have actively adopted the discourse of serendipity, both as a selling
point and a strategic claim. Talking about Google’s digitization program, Dr.
Sarah Thomas, Bodley’s Librarian and Director of Oxford University Library
Services, notes: “Library users have always loved browsing books for the
serendipitous discoveries they provide. Digital books offer a similar thrill,
but on multiple levels—deep entry into the texts or the ability to browse the
virtual shelf of books assembled from the world's great libraries.”75 But it
has also raised questions for those people who are in charge, not only of
holding serendipity forth as an ideal, but also of building the architecture to
facilitate it. Dan Cohen, speaking on behalf of the DPLA, thus noted the
centrality of the concept, but also the challenges that mass digitization
raised in practical terms: “At DPLA, we’ve been thinking a lot about what’s
involved with serendipitous discovery. Since we started from scratch and
didn’t need to create a standard online library catalog experience, we were
free to experiment and provide novel ways into our collection of over five
million items. How to arrange a collection of that scale so that different
users can bump into items of unexpected interest to them?” While adopting the
language of serendipity is easy, its infrastructural construction is much
harder to envision. This challenge clearly troubles the strategic team
developing Europeana’s infrastructure, as it notes in a programmatic tone that
stands hila


a.eu8
deliverable—and in particular those of the “culture vultures”—one finds two
somewhat-opposed requirements. On the one hand, they need to be able to find
what they are looking for, and navigate through clear and well-structured
data. On the other hand, they also come to Europeana looking for
“inspiration”—that is to say, for something new and unexpected that points
them towards possibilities they had previously been unaware of; what, in the
formal literature of user experience and search design, is sometimes referred
to as “serendipity search.” Europeana’s users need the platform to be
structured and predictable—but not entirely so.76

To achieve serendipity, mass digitization projects have often sought to take
advantage of the labyrinthine infrastructures of digitization, relying not
only on their own virtual bookshelves, but also on the algorithmic highways
and back alleys of social media. Twitter, in particular, before it adopted
personalization methods, became a preferred infrastructure for mass
digitization projects, which took advantage of Twitter’s lack of personalized
search to create whimsical bots that injected randomness into the user’s feed.
One example was the Digital Public Library of America’s DPLA Bot, which grabs
a random noun and uses the DPLA API to share the first result it finds. The DPLA
Bot aims to “infuse what we all love about libraries—serendipitous
discovery—into the DPLA” and thus seeks to provide a “kind of ‘Surprise me!’
search function for DPLA.”77 It did not take the programmer Peter Meyr much
time to develop a similar bot for Europeana. In an interview with
EuropeanaPro, Peter Meyr directly related the EuropeanaBot to the
serendipitous affordances of Twitter and its rewards for mass digitization
projects, noting that:

> The presentation of digital resources is difficult for libraries. It is no
longer possible to just explore, browse the stacks and make serendipitous
findings. With Europeana, you don't even have a physical library to go to. So
I was interested in bringing a little bit of serendipity back by using a
Twitter bot. … If I just wanted to present (semi)random Europeana findings, I
wouldn’t have needed Twitter—an RSS-Feed or a web page would be enough.
However, I wanted to infuse EuropeanaBot with a little bit of “Twitter
culture” and give it a personality.78

The British Library also developed a Twitter bot titled the Mechanical
Curator, which posts random resources with no customization except a special
focus on images in the library’s seventeenth- to nineteenth-century
collections.79 But there were also many projects that existed outside social
media platforms and operated across mass digitization projects. One example
was the “serendipity engine,” Serendip-o-matic, which first examined the
user’s research interests and then, based on this data, identified “related
content in locations such as the Digital Public Library of America (DPLA),
Europeana, and Flickr Commons.”80 While this initiative was not endorsed by
any of these mass digitization projects, they nevertheless featured it on
their blogs, integrating it into the mass digitization ecosystem.

Yet, while mass digitization for some represents the opportunity to amplify
the chance of chance, other scholars increasingly wonder whether the
engineering processes of mass digitization would take serendipity out of the
archive. Indeed, to them, the digital is antithetical to chance. One such
viewpoint is voiced by historian Tristram Hunt in an op-ed directed against
Google’s British digitization program under the title, “Online is fine, but
history is best hands on.” In it, Hunt argues that the digital, rather than
providing a new means of chance finding, would impede historical discovery and
that only the analog archival environment could foster real historical
discoveries, since it is “… only with MS in hand that the real meaning of the
text becomes apparent: its rhythms and cadences, the relationship of image to
word, the passion of the argument or cold logic of the case. Then there is the
serendipity, the scholar’s eternal hope that something will catch his eye.”81
In similar terms, Graeme Davison describes the lack of serendipitous
errings in digital archives, as


another world, something to
lead your life down a path you didn't know was there.83

Common to all these statements is the sentiment that the engineering of
serendipity removes the very chance of serendipity. As Nicholas Carr notes,
“Once you create an engine—a machine—to produce serendipity, you destroy the
essence of serendipity. It becomes something expected rather than
unexpected.”84 It appears, then, that computational methods have introduced
historians and literary scholars to the same “beaverish efforts”85 to
domesticate serendipity as the hard sciences had to face at the beginning of
the twentieth century.

To my knowledge, few systematic studies exist about whether mass digitization
projects such as Europeana and Google Books hamper or foster creative and
original research in empirical terms. How one would go about such a study is
also an open question. The dichotomy between digital and analog does seem a
bit contrived, however. As Dan Cohen notes in a blogpost for DPLA, “bookstores
and libraries have their own forms of ‘serendipity engineering,’ from
storefront staff picks to behind-the-scenes cataloguing and shelving methods
that make for happy accidents.”86 Yet there is no doubt that the discourse of
serendipity has been infused with new life that sometimes veers toward a
“spectacle of serendipity.”87

Over the past decade, the digital infrastructures that organize our cultural
memory have become increasingly integrated in a digital economy that valuates
“experience” as a cultural currency that can be exchanged for profit, and our
affective meanderings as a form of industrial production. This digital economy
affects the architecture and infrastructure of digital archives. The archival
discourse on digital serendipity is thus now embroiled in a more deep-seated
infrapolitics of workspace architecture, influenced by Silicon Valley’s
obsession with networks, process, and connectivity.88 Think only of the
increasing importance of Google and Facebook to mass digitization projects:
most of these projects have a Facebook page on which they showcase their
material, just as they take pains to make themselves “algorithmically
recognizable”89 to Google and other search engines in the hope of reaching an
audience beyond the echo chamber of archives and to distribute their archival
material on leisurely tidbit platforms such as Pinterest and Twitter.90 If
serendipity is increasingly thought of as a platform problem, the final
question we might pose is what kind of infrapolitics this platform economy
generates and how it affects mass digitization projects.

## The Infrapolitics of Platform Power

As the previous sections show, mass digitization projects rely upon spatial
metaphors to convey ideas about, and ideals of, cultural memory
infrastructures, their knowledge production, and their serendipitous
potential. Thus, for mass digitization projects, the ideal scenario is that
the labyrinthine errings of the user result in serendipitous finds that in
turn bring about new forms of cultural value. From the point of view of the user,
however, being caught up in the labyrinth might just as easily give rise to an
experience of being confronted with a sense of lack of oversight and
alienation in the alleyways of commodified infrastructures. These two
scenarios co-exist because of what Penelope Doob (as noted in the section on
labyrinthine imaginaries) refers to as the dual potentiality of the labyrinth,
which when experienced from within can become a sign of confusion, and when
viewed from above becomes a sign of complex order.91

In this final section, I will turn to a new spatial metaphor, which appears to
have resolved this dual potentiality of the spatial perspective of mass
digitization projects: the platform. The platform has recently emerged as a
new buzzword in the digital economy, connoting simultaneously a perspective, a
business strategy, and a political ideology. Ideally the platform provides a
different perspective than the labyrinth, offering the user the possibility of
simultaneously constructing the labyrinth and viewing it from above. This
final section therefore explores how we might understand the infrapolitics of
the platform, and its role in the digital economy.

In its recent business strategy, Europeana claimed that it was moving from
operating as a “portal” to operating as a “platform.”92 The announcement was
part of a broader infrastructural tran


ion” of the web.94 The notion of the platform has thus recently
become an important heuristic for understanding the cultural development of
the web and its economy, fusing the computational understanding of the
platform as an environment in which code is executed95 and the political and
social understanding of a platform as a site of politics.96

While the infrapolitics of the platformization of the web has become a central
discussion in software and communication studies, little attention has been
paid to the implications of platforms for the politics of cultural memory.
Yet, Europeana’s business strategy illustrates the significant infrapolitical
role that platforms are given in mass digitization literature. Citing digital
historian Tim Sherratt’s claim that “portals are for visiting, platforms for
building on,”97 Europeana’s strategy argues that if cultural memory sites free
themselves and their content from the “prison of portals” in favor of more
openness and flexibility, this will in turn empower users to create their own
“pathways” through the digital cultural memory, instead of being forced to
follow predetermined “narrative journeys.”98 The business plan’s reliance on
Sherratt’s theory of platforms shows that although the platform has a
technical meaning in computation, Europeana’s discourse goes beyond mere
computational logic. It instead signifies


olitics of
collaboration, even subversion. Olga Goriunova, for instance, explores the
subversive dynamics of critical artistic platforms,110 and Trebor Scholz
promotes the term “platform cooperativism” to advance worker-based
cooperatives that would “design their own apps-based platforms, fostering
truly peer-to-peer ways of providing services and things, and speak truth to
the new platform capitalists.”111 Shadow libraries such as Monoskop appear as
perfect examples of such subversive platforms and evidence of Srnicek’s
reminder that not _all_ social interactions are co-opted into systems of
profit generation.112 Yet, as the territorial, legal, and social
infrastructures of mass digitization become increasingly labyrinthine, it
takes a lot of critical consciousness to properly interpret and understand their
infrapolitics. Engage with the shadow library Library Genesis on Facebook, for
instance, and you submit to platform capitalism.

A significant trait of platform-based corporations such as Google and Facebook
is that they more often than not present themselves as apolitical, neutral,
and empowering tools of connectivity, passive until picked up by the user.
Yet, as Lisa Nakamura notes, “reading’s economies, cultures of sharing, and
circuits of travel have never been passive.”113 One of digital platforms’ most
important infrapolitical traits is their dependence on network


the digital
economy. They not only gain access to data, but they also control the rules of
how the data is to be managed and governed. Therefore, when a user is surfing
Google Books, Google—and not the library—collects the user’s search queries,
including results that appeared in searches and pages the user visited from
the search. The browser, moreover, tracks the user’s activity, including pages
the user has visited and when, user data, and possibly user login details with
auto-fill features, user IP address, Internet service provider, device
hardware details, operating system and browser version, cookies, and cached
data from websites. The labyrinthine infrastructure of the mass digitization
ecosystem also means that if you access one platform through another, your
data will be collected in different ways. Thus, if you visit Europeana through
Facebook, it will be Facebook that collects your data, including name and
profile; biographical information such as birthday, hometown, work history,
and interests; username and unique identifier; subscriptions, location,
device, activity date, time and time-zone, activities; and likes, check-ins,
and events.115 As more platforms emerge from which one can access mass
digitized archives, such as social media sites like Facebook, Google+,
Pinterest, and Twitter, as well as mobile devices such as Android, gaining an
overview of who collects one’s data and how becomes more nebulous.

Europeana’s reminder illustrates the assemblatic infrastructural set-up of
mass digitization projects and how they operate with multiple entry points,
each of which may attach its own infrapolitical dynamics. It also illustrates
the labyrinthine infrastructures of privacy settings, over which a mapping is
increasingly difficult to attain because of constant changes and
reconfigurations. It furthermore illustrates the changing legal order from the
relatively stable sovereign order of human rights obligations to the
modulating landscape of privacy policies.

How then might we characterize the infrapolitics of the spatial imaginaries of
mass digitization? As this chapter has sought to convey, writings about mass
digitization projects are shot through with spatialized metaphors, from the
flaneur to the labyrinth and the platform, either in literal terms or in the
imaginaries they draw on. While this section has analyzed these imaginaries in
a somewhat chronological fashion, with the interactivity of the platform
increasingly replacing the more passive gaze of the spectator, they coexist in
that larger complex of spatial digital thinking. While often used to elicit
uncomplicated visions of empowerment, desire, curiosity, and productivity,
these infrapolitical imaginaries in fact show the complexity of mass
digitization projects in their reinscription of users and cultural memory
institutions in new constellations of power and politics.

## Notes

1. Kelly 1994, p. 263. 2. Connection Machines were developed by the
supercomputer manufacturer Thinking Machines, a concept that also appeared in
Jorge Luis Borges’s _The Total Library_. 3. Brewster Kahle, “Transforming Our
Libraries from Analog to Digital: A 2020 Vision,” _Educause Review_, March
13, 2017. 4. Ibid. 5. Couze Venn, “The
Collection,” _Theory, Culture & Society_ 23, no. 2–3 (2006), 36. 6. Hacking
2010. 7. Lefebvre 200


in use as early as
AD 361. 64. Letter to Horace Mann, 28 January 1754, in _Walpole’s
Correspondence_, vol. 20, 407–411. 65. As Robert Merton and Elinor Barber
note, it first made it into the OED in 1912 (Merton and Barber 2004, 72). 66.
Merton and Barber 2004, 40. 67. Lorraine Daston, “Are You Having Fun Today?,”
_London Review of Books_, September 23, 2004. 68. Ibid. 69. Ibid. 70.
Featherstone 2000, 594. 71. Nancy Lusignan Schulz, “Serendipity in the
Archive,” _Chronicle of Higher Education_, May 15, 2011. 72.
Verhoeven 2016, 18. 73. Caley 2017, 248. 74. Bishop 2016. 75. “Oxford-Google
Digitization Project Reaches Milestone,” Bodleian Library and Radcliffe
Camera, March 26, 2009. 76. Timothy
Hill, David Haskiya, Antoine Isaac, Hugo Manguinhas, and Valentine Charles
(eds.), _Europeana Search Strategy_, May 23, 2016.
77. “DPLAbot,” _Digital Public Library of America_.
78. “Q&A with EuropeanaBot developer,” _EuropeanaPro_, August 20, 2013. 79. There
are of course many other examples, some of which offer great


Automation, Growth, and Employment,”
_ETLA Reports_ 61, October 17, 2016. 115. Europeana’s privacy page explicitly notes
this, reminding the user that, “this site may contain links to other websites
that are beyond our control. This privacy policy applies solely to the
information you provide while visiting this site. Other websites which you
link to may have privacy policies that are different from this Privacy
Policy.” See “Privacy and Terms,” _Europeana Collections_.

# 6 Concluding Remarks

I opened this book claiming that the notion of mass digitization has shifted
from a professional concept to a cultural political phenomenon. If the former
denotes a technical way of duplicating analog material in digital form, mass
digitization as a cultural practice is a much more complex apparatus. On the
one hand, it offers the simple promise of heightened public and private access
to, and better preservation of, the past; on the other, it raises significant
political questions about ethics, politics, power, and care in the digital
sphere. I locate the emergence of these questions within the infrastructures
of mass digitization and the ways in which they not only offer new ways of
reading, viewing, and structuring cultural material, but also new models of
value and its extraction, and new infrastructures of control. The political
dynamic of this restructuring, I suggest, may meaningfully be referred to as a
form of infrapolitics, insofar as the political work of mass digitization
often happens at the level of infrastructure, in the form of standardization,
dissent, or both. While mass digitization entwines the cultural politics of
analog artifacts and institutions with the infrapolitical logics of the new
digital economies and technologies, there is no clear-cut distinction between
the analog and digital realms in this process. Rather, paraphrasing N.
Katherine Hayles, I suggest that mass digitization, like a Janus-figure,
“looks to past and future, simultaneously reinforcing and undermining both.”1

A persistent challenge in the study of mass digitization is the mutability of
the analytical object. The unstable nature of cultural memory archives is not
a new phenomenon. As Derrida points out, they have always been haunted by an
unintended instability, which he calls “archive fever.” Yet, mass digitization
appears to intensify this instability even further, both in its material and
cultural instantiations. Analog preservation practices that seek to stabilize
objects are in the digital realm replaced with dynamic processes of content
migration and software updates. Cultural memory objects become embedded in
what Wendy Chun has referred to as the enduring ephemerality of the digital as
well as the bleeding edge of obsolescence.2

Indeed, from the moment when the seed for this book was first planted to the
time of its publication, the landscape of mass digitization, and the political
battles waged on its maps, has changed considerably. Google Books—which a
decade ago attracted the attention, admiration, and animosity of all—recently
metamorphosed from a giant flood to a quiet trickle. After a spectacle of
press releases on quantitative milestones, epic legal battles, and public
criticisms, Google apparently lost interest in Google Books. Google’s gradual
abandonment of the project resembled more an act of prolonged public ghosting
than a clear-cut break-up, leaving the public to read between the lines
about where the company was headed: scanning activities dwindled; the Google
Books blog closed along with its Twitter feed; press releases dried


uiet life does
not necessarily equal death. Indeed, this is the lesson we learn from
attending to the subtle workings of infrastructure: the politics of
infrastructure is the politics of what goes on behind the curtains, not only
what is launched to the front page. Thus, as one engineer notes when
confronted with the fate of Google Books, “We’re not focused on shiny features
and things that are very visible to users. … It’s more like behind-the-scenes
work and perfecting the technology—acquiring content, processing it properly
so that we can view the entire book online, and adjusting the search
algorithm.”6 This is a timely reminder that any analysis of the infrapolitics
of mass digitization has to tend not only to the visible and loud politics of
construction, but also the quiet and ongoing politics of infrastructure
maintenance. It makes no sense to write an obituary for Google Books if the
infrastructure is still at work. Moreover, the assemblatic nature of mass
digitization also demands that we do not stop at the immediate borders of a
project when making analytical claims about its infrapolitics. Thus, while
Google Books may have stopped in its tracks, other trains of mass digitization
have pulled up instead, carrying the project of mass digitization forward
toward new, divergent, and experimental sites. Google’s different engagements
with cultural digitization show that an analysis of the politics of Google’s
memory work needs to operate with an assemblatic method, rather than a
delineating approach.7 Europeana and DPLA are also mutable analytical objects,
both in economic and cultural form. Europeana, for instance, leads a precarious
life from one EU budget framework to the next, and its cultural identity and
software instantiations have transformed from a digital library, to a portal,
to a platform over the course of only a few decades. Last, but not least,
shadow libraries are mediating and multiplying cultural memory objects from
servers and mirror links that sometimes die just as quickly as they emerged.
The question of institutionalization matters greatly in this respect,
outlining what we might call a spectrum of contingency. If a mass digitization
project lives in the margins of institutions, such as in the case of many
shadow libraries, its infrastructure is often fraught with uncertainties. Less
precarious, but nonetheless tumultuous, are the corporate institutions with
their increasingly short market-driven lifespans. And, at the other end of the
spectrum, we find mass digitization projects embedded in bureaucratic
apparatuses whose lumbering budget processes provide publicly funded mass
digitization projects with more stable infrastructures.

The temporal dimension of mass digitization projects also raises important
questions about the horizon of cultural memory in material terms. Should mass
digitization, one might ask, also mean whither analog cultural memory? This
question seems relevant not least in cases where institutions consider
digitization as a form of preservation that allows them to discard analog
artifacts once digitized. In digital form, we further have to contend with a
new temporal horizon of cultural memory itself, based not only on
remembrance but on anticipation in the manner of “If you liked this, you might
also like ….” Thus, while cultural memory objects link to objects of the
past, mass digitized cultural memory also gives rise to new methods of
prediction and preemption, for instance in the form of personalization. In
this anticipatory regime, cultural memory becomes subject to perpetual
calculatory activities, processing affects and activities in terms of
likelihoods and probabilistic outcomes.

Thus, cultural memory has today become embedded in new glocalized
infrastructures. On the one hand, these infrastructures present novel
opportunities. Cultural optimists have suggested that mass digitization has
the potential to give rise to new cosmopolitan public spheres untethered from
the straitjackets of national territorializing forces. On the other hand,
critics argue that there is little evidence that cosmopolitan dynamics are in
fact at work. Instead, new colonial and neoliberal platforms arise from a
complex infrastructural apparatus of private and public institutions and
become shaped by political, financial, and social struggles over
representation, control, and ownership of knowledge.

In summary, it is obvious that the scale of mass digitization, public and
private, licit and illicit, has transformed how we engage with texts, cultural
works, and cultural memory. People today have instant access to a wealth of
works that would previously have required large amounts of money, as well as
effort, to engage with. Most of us enjoy the new cultural freedoms we have
been given to roam the archives, collecting and exploring oddities along the
way, and making new connections between works that would previously have been
held separate by taxonomy, geography, and time in the labyrinthine material
and social infrastructures of cultural memory.

A special attraction of mass digitization no doubt lies in its unfathomable
scale and linked nature, and the fantasy and “spectacle of collecting.”8 The
new cultural environment allows the user to accelerate the pace of information
by accessing key works instantly as well as idly rambling in the exotic back
alleys of digitized culture. Mass digitized archives can be explored to
functional, hedonistic, and critical ends (sometimes all at the same time),
and can be used to exhume forgotten works, forgotten authors, and forgotten
topics. Within this paradigm, the user takes center stage—at least
discursively. Suddenly, a link made between a porn magazine and a Courbet
painting could well be a valued cultural connection instead of a frowned-upon
transgression in the halls of high culture. Users do not just download books;
they also upload new folksonomies, “ego-documents,” and new cultural
constellations, which are all welcomed in the name of “citizen science.”
Digitization also infuses texts with new life due to its new connective
properties that allow readers and writers to intimately and
exhibitionistically interact around cultural works, and it provides new ways
of engaging with texts as digital reading migrates toward service-based rather
than hardware-based models of consumption. Digitization allows users to
digitally collect works themselves and indulge in alluring archival riches in
new ways.

But mass digitization also gives rise to a range of new ethical, political,
aesthetic, and methodological questions concerning the spatio-temporality,
ownership, territoriality, re-use, and dissemination of cultural memory
artifacts. Some of those dimensions have been discussed in detail in the
present work and include questions about digital labor, platformization,
management of visibility, ownership, copyright, and other new forms of control
and de- and recentralization and privatization processes. Others have only
been alluded to but continue to gain in relevance as processes of mass
digitization excavate and make public sensitive and contested archival
material. Thus, as the cultural memories and artifacts of indigenous
populations, colonized territories and other marginalized groups are brought
online, as well as artifacts that attest to the violent regimes of colonialism
and patriarchy, an attendant need has emerged for an ethics of care that goes
beyond simplistic calls for the right to access, to instead attend to the
sensitivity of the digitized material and the ways in which we encounter these
materials.

Combined, these issues show that mass digitization is far from a
straightforward technical affair. Rather, the productive dimensions of mass
digitization emerge from the rubble of disruptive and turbulent political
processes that violently dislocate established frontiers and power dynamics
and give rise to new ones that are yet to be interpreted. Within these
turbulent processes, the familiar narratives of empowered users collecting and
connecting works and ideas in new and transgressive ways all too often leave
out the simultaneous and integrated story of how the labyrinthine
infrastructures of mass digitization also write themselves on the back of the
users, collecting them and their thoughts in the process, and subjecting them
to new economic logics and political regimes. As Lisa Nakamura reminds us, “by
availing ourselves of its networked virtual bookshelves to collect and display
our readerliness in a postprint age, we have become objects to be collected.”9
Thus, as we gather vintage images on Pinterest, collect books in Google Books,
and retweet sound files from Europeana, we do best not only to question the
cultural logic and ethics of these actions but also to remember that as we
collect and connect, we are also ourselves collected and connected.

If the power of mass digitization happens at the level of infrastructure,
political resistance will have to take the form of infrastructural
intervention. We play a role in the formulation of the ethics of such
interventions, and as such we have to be willing to abandon the predominant
tropes of scale, access, and acceleration in favor of an infrapolitics of
care—a politics that offers opportunities for mindful, slow, and focused
encounters.

## Notes

1. Hayles 1999, 17. 2. Chun 2008; Chun 2017. 3. Murrell 2017. 4. James
Somers, “Torching the Modern-Day Library of Alexandria,” _The Atlantic_ ,
April 20, 2017. 5. Jennifer Howard, “What Happened to Google’s Effort to Scan
Millions of University Library Bo


y.” In _The Aleph and Other Stories, 1933–1969: Together with Commentaries and an Autobiographical Essay_. New York: E. P. Dutton.
39. Borges, Jorge Luis. 2001. “The Total Library.” In _The Total Library: Non-fiction 1922–1986_. London: Penguin.
40. Borges, Jorge Luis, and L. S. Dembo. 1970. “An Interview with Jorge Luis Borges.” _Contemporary Literature_ 11 (3): 315–325.
41. Borghi, Maurizio. 2012. “Knowledge, Information and Values in the Age of Mass Digitisation.” In _Value: Sources and Readings on a Key Concept of the Globalized World_ , ed. Ivo de Gennaro. Leiden, the Netherlands: Brill.
42. Borghi, Maurizio, and Stavroula Karapapa. 2013. _Copyright and Mass Digitization: A Cross-Jurisdictional Perspective_. Oxford: Oxford University Press.
43. Borgman, Christine L. 2015. _Big Data, Little Data, No Data: Scholarship in the Networked World_. Cambridge, MA: MIT Press.
44. Bottando, Evelyn. 2012. _Hedging the Commons: Google Books, Libraries, and Open Access to Knowledge_. Iowa City: University of Iowa.
45. Bowker, Geoffrey C., Karen Baker, Florence Millerand, and David Ribes. 2010. “Toward Information Infrastructure Studies: Ways of Knowing in a Networked Environment.” In _The International Handbook of Internet Research_ , eds. Jeremy Hunsinger, Lisbeth Klastrup, and Matthew Allen. Dordrecht, the Netherlands: Springer.
46. Bowker, Geoffrey C, and Sus


6. “The flaneur, the Sandwichman and the Whore: The Politics of Loitering.” _New German Critique_ (39): 99–140.
51. Budds, Diana. 2016. “Rem Koolhaas: ‘Architecture Has a Serious Problem Today.’” _CoDesign_ 21 (May).
52. Burkart, Patrick. 2014. _Pirate Politics: The New Information Policy Contests_. Cambridge, MA: MIT Press.
53. Burton, James, and Daisy Tam. 2016. “Towards a Parasitic Ethics.” _Theory, Culture & Society_ 33 (4): 103–125.
54. Busch, Lawrence. 2011. _Standards: Recipes for Reality_. Cambridge, MA: MIT Press.
55. Caley, Seth. 2017. “Digitization for the Masses: Taking Users Beyond Simple Searching in Nineteenth-Century Collections Online.” _Journal of Victorian Culture : JVC_ 22 (2): 248–255.
56. Cadogan, Garnette. 2016. “Walking While Black.” Literary Hub. July 8.
57. Callon, Michel, Madeleine Akrich, Sophie Dubuisson-Quellier, Catherine Grandclément, Antoine Hennion, Bruno Latour, Alexandre Mallard, et al. 2016. _Sociologie des agencements marchands: Textes choisis_. Paris: Presses des Mines.
58. Cameron, Fiona, and Sarah Kenderdine. 2007. _Theorizing Digital Cultural Heritage: A Critical Discourse_. Cambridge, MA: MIT Press.
59. Canepi, Kitti, Becky Ryder, Michelle Sitko,


m/world/2008/nov/21/eu>.
71. Chun, Wendy H. K. 2006. _Control and Freedom: Power and Paranoia in the Age of Fiber Optics_. Cambridge, MA: MIT Press.
72. Chun, Wendy Hui Kyong. 2008. “The Enduring Ephemeral, or the Future Is a Memory.” _Critical Inquiry_ 35 (1): 148–171.
73. Chun, Wendy H. K. 2017. _Updating to Remain the Same_. Cambridge, MA: MIT Press.
74. Clarke, Michael Tavel. 2009. _These Days of Large Things: The Culture of Size in America, 1865–1930_. Ann Arbor: University of Michigan Press.
75. Cohen, Jerome Bernard. 2006. _The Triumph of Numbers: How Counting Shaped Modern Life_. New York: W.W. Norton.
76. Conway, Paul. 2010. “Preservation in the Age of Google: Digitization, Digital Preservation, and Dilemmas.” _The Library Quarterly: Information, Community, Policy_ 80 (1): 61–79.
77. Courant, Paul N. 2006. “Scholarship and Academic Libraries (and Their Kin) in the World of Google.” _First Monday_ 11 (8).
78. Coyle, Karen. 2006. “Mass Digitization of Books.” _Journal of Academic Librarianship_ 32 (6): 641–645.
79. Darnton, Robert. 2009. _The Case for Books: Past, Present, and Future_. New York: Public Affairs.
80. Daston, Lorraine. 2012. “The Sciences of the Archive.” _Osiris_ 27 (1): 156–187.
81. Davison, Graeme. 2009. “Speed-Relating: Family History in a Digital Age.” _History Australia_ 6 (2).
82. Deegan, Marilyn, and Kathryn Sutherland. 2009. _Transferred Illusions: Digital Technology and the Forms of Print_. Farnham, UK: Ashgate.
83. de la Durantaye, Katharine. 2011. “H Is for Harmonization: The Google Book Search Settlement and Orphan Works Legislat


l Library for the 21st Century—Knowledge and Cultural Heritage Online.” _Alexandria: The Journal of National and International Library and Information Issues_ 26 (1): 5–14.
161. Kang, Minsoo. 2011. _Sublime Dreams of Living Machines: The Automaton in the European Imagination_. Cambridge, MA: Harvard University Press.
162. Karaganis, Joe. 2011. _Media Piracy in Emerging Economies_. New York: Social Science Research Council.
163. Karaganis, Joe. 2018. _Shadow Libraries: Access to Educational Materials in Global Higher Education_. Cambridge, MA: MIT Press.
164. Kaufman, Peter B., and Jeff Ubois. 2007. “Good Terms—Improving Commercial-Noncommercial Partnerships for Mass Digitization.” _D-Lib Magazine_ 13 (11–12).
165. Kelley, Robin D. G. 1994. _Race Rebels: Culture, Politics, and the Black Working Class_. New York: Free Press.
166. Kelly, Kevin. 1994. _Out of Control: The Rise of Neo-Biological Civilization_. Reading, MA: Addison-Wesley.
167. Kenney, Anne R., Nancy Y. McGovern, Ida T. Martinez, and Lance J. Heidig. 2003. “Google Meets Ebay: What Academic Librarians Can Learn from Alternative Information Providers.” _D-Lib Magazine_ 9 (6).
168. Kiriya, Ilya. 2012. “The Culture of Subversion and Russian Media Landscape.” _International Jour


igeria_. Durham, NC: Duke University Press.
178. Latour, Bruno. 2005. _Reassembling the Social: An Introduction to Actor-Network Theory_. Oxford: Oxford University Press.
179. Latour, Bruno. 2007. “Beware, Your Imagination Leaves Digital Traces.” _Times Higher Literary Supplement_ , April 6.
180. Latour, Bruno. 2008. _What Is the Style of Matters of Concern?: Two Lectures in Empirical Philosophy_. Assen, the Netherlands: Koninklijke Van Gorcum.
181. Lavoie, Brian F., and Lorcan Dempsey. 2004. “Thirteen Ways of Looking at Digital Preservation.” _D-Lib Magazine_ 10 (July/August).
182. Leetaru, Kalev. 2008. “Mass Book Digitization: The Deeper Story of Google Books and the Open Content Alliance.” _First Monday_ 13 (10).
183. Lefebvre, Henri. 2009. _The Production of Space_. Malden, MA: Blackwell.
184. Lefler, Rebecca. 2007. “‘Europeana’ Ready for Maiden Voyage.” _Hollywood Reporter_ , March 23.
185. Lessig, Lawrence. 2005a. “Lawrence Lessig on Interoperability.” _Creative Commons_ , October 19.
186. Lessig, Lawrence. 2005b. _Free Culture: The Nature and Future of Creati


ries: Regulation and the Public Interest_ , ed. David Ward. Aldershot, UK: Ashgate.
208. Murrell, Mary. 2017. “Unpacking Google’s Library.” _Limn_ (6).
209. Nakamura, Lisa. 2002. _Cybertypes: Race, Ethnicity, and Identity on the Internet_. New York: Routledge.
210. Nakamura, Lisa. 2013. “‘Words with Friends’: Socially Networked Reading on Goodreads.” _PMLA_ 128 (1): 238–243.
211. Nava, Mica, and Alan O’Shea. 1996. _Modern Times: Reflections on a Century of English Modernity_ , 38–76. London: Routledge.
212. Negroponte, Nicholas. 1995. _Being Digital_. New York: Knopf.
213. Neubert, Michael. 2008. “Google’s Mass Digitization of Russian-Language Books.” _Slavic & East European Information Resources_ 9 (1): 53–62.
214. Nicholson, William. 1819. “Platform.” In _British Encyclopedia: Or, Dictionary of Arts and Sciences, Comprising an Accurate and Popular View of the Present Improved State of Human Knowledge_. Philadelphia: Mitchell, Ames, and White.
215. Niggemann, Elisabeth. 2011. _The New Renaissance: Report of the “Comité Des Sages.”_ Brussels: Comité des Sages.
216. Noble, Safiya Umoja, and Brendesha M. Tynes. 2016. _The Intersectional Internet: Race, Sex, Class and Culture Online_. New York: Peter Lang Publishing.
217. Nord, Deborah Epstein. 1995. _Walking the Victorian Streets: Women, Rep


go: University of Chicago Press.
245. Raddick, M., et al. 2009. “Galaxy Zoo: Exploring the Motivations of Citizen Science Volunteers.” _Astronomy Education Review_ 9 (1).
246. Ratto, Matt, and Megan Boler. 2014. _DIY Citizenship: Critical Making and Social Media_. Cambridge, MA: MIT Press.
247. Reichardt, Jasia. 1969. _Cybernetic Serendipity: The Computer and the Arts_. New York: Frederick A. Praeger.
248. Ridge, Mia. 2013. “From Tagging to Theorizing: Deepening Engagement with Cultural Heritage through Crowdsourcing.” _Curator_ 56 (4): 435–450.
249. Rieger, Oya Y. 2008. _Preservation in the Age of Large-Scale Digitization: A White Paper_. Washington, DC: Council on Library and Information Resources.
250. Rodekamp, Volker, and Bernhard Graf. 2012. _Museen zwischen Qualität und Relevanz: Denkschrift zur Lage der Museen_. Berlin: G+H Verlag.
251. Rogers, Richard. 2012. “Mapping and the Politics of Web Space.” _Theory, Culture & Society_ 29:193–219.
252. Romeo, Fiona, and Lucinda Blaser. 2011. “Bringing Citizen Scientists and Historians Together.” Museums and the Web.
253. Russell, Andrew L. 2014. _Open Standards and the Digital Age: History, Ideology, and Networks_. New York: Cambridge University P


ersity Press.
255. Samimian-Darash, Limor, and Paul Rabinow. 2015. _Modes of Uncertainty: Anthropological Cases_. Chicago: The University of Chicago Press.
256. Samuel, Henry. 2009. “Nicolas Sarkozy Fights Google over Classic Books.” _The Telegraph_ , December 14.
257. Samuelson, Pamela. 2010. “Google Book Search and the Future of Books in Cyberspace.” _Minnesota Law Review_ 94 (5): 1308–1374.
258. Samuelson, Pamela. 2011. “Why the Google Book Settlement Failed—and What Comes Next?” _Communications of the ACM_ 54 (11): 29–31.
259. Samuelson, Pamela. 2014. “Mass Digitization as Fair Use.” _Communications of the ACM_ 57 (3): 20–22.
260. Samyn, Jeanette. 2012. “Anti-Anti-Parasitism.” _The New Inquiry_ , September 18.
261. Sanderhoff, Merethe. 2014. _Sharing Is Caring: Åbenhed Og Deling I Kulturarvssektoren_. Copenhagen: Statens Museum for Kunst.
262. Sassen, Saskia. 2008. _Territory, Authority, Rights: From Medieval to Global Assemblages_. Princeton, NJ: Princeton University Press.
263. Schmidt, Henrike. 2009. “‘Holy Cow’ and ‘Eternal Flame’: Russian Online Libraries.” _Kultura_ 1, 4–8.
264. Schmitz, Dawn. 2008. _The Seamless Cyberinfrastructure: The Challenges of Studying Users of Mass Digitization and Institutional Repositories_. Washington, DC: Digital Library Federation, Council on Library and Information Resources.
265. Schonfeld, Roger, and Liam Sweeney. 2017. “Inclusion, Diversity, and Equity: Members of the Association of Research Libraries.” _Ithaka S+R_ , August 30.
266. Schüll, Natasha Dow. 2014. _Addiction by Design: Machine Gambling in Las Vegas_. Princeton, NJ: Princeton University Press.
267. Scott, James C. 2009. _Domination and the Arts of Resistance: Hidden Transcripts_. New Haven, CT: Yale University Press.
268. Seddon, Nicholas. 2013. _Government Contracts: Federal, State and Loc


cho Chambers, and the Front Page.” _Nieman Reports_ 62 (4).

© 2018 Massachusetts Institute of Technology

All rights reserved. No part of this book may be reproduced in any form by any
electronic or mechanical means (including photocopying, recording, or
information storage and retrieval) without permission in writing from the
publisher.

This book was set in ITC Stone Sans Std and ITC Stone Serif Std by Toppan
Best-set Premedia Limited. Printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Data

Names: Thylstrup, Nanna Bonde, author.

Title: The politics of mass digitization / Nanna Bonde Thylstrup.

Description: Cambridge, MA : The MIT Press, [2018] | Includes bibliographical
references and index.

Identifiers: LCCN 2018010472 | ISBN 9780262039017 (hardcover : alk. paper)

eISBN 9780262350044

Subjects: LCSH: Library materials--Digitization. | Archival materials--
Digitization. | Copyright and digital preservation.

Classification: LCC Z701.3.D54 T49 2018 | DDC 025.8/4--dc23 LC record
available at

 
