digitization in Bodo 2014


dreds of thousands of books and millions of
journal articles. In this contribution we try to understand the factors that led to the development of
these sites, and the sociocultural and legal conditions that enable them to operate under hostile legal
and political conditions. Through the reconstruction of the micro-histories of peer-produced online text
collections that played a central role in the history of RuNet, we are able to link the formal and informal
support for these sites to the specific conditions that developed in Soviet and post-Soviet times.

(pirate) libraries on the net
The digitization and collection of texts was one of the very first activities enabled by computers. Project
Gutenberg, the first in the line of digital libraries, was established as early as 1971. By the early nineties, a
number of online electronic text archives had emerged, all hoping to finally realize the dream that has been
chased by humans ever since the first library: the collection of everything (Battles, 2004), the Memex
(Bush, 1945), the Mundaneum (Rieusset-Lemarié, 1997), the Library of Babel (Borges, 1998). It did not
take long to realize that the dream was still beyond reach: the information storage and retri


the most important books,
novels that "everyone must read" and such stuff. People typed in poetry, smaller prose pieces. I have
myself read a sci-fi novel printed on a mainframe, which was obviously typed in. This novel was by
Strugatski brothers. It was not prohibited or dissident, but just impossible to buy in the stores. These
were culturally important, cult novels, so people typed them in. […] At this point it became clear that
there was a lot of value in having a plaintext file with some novels, and the most popular novels were first
digitized in this way.”
The next stage of text digitization started around 1994. By that time growing numbers of people had
computers, scanning peripherals, and OCR software. Russian internet and PC penetration, while extremely
low overall in the 1990s (0.1% of the population having internet access in 1994, growing to 8.3% by
2003), began to make inroads in educational and scientific institutions and among Moscow and
St. Petersburg elites, who were often the critical players in these networks. As access to technologies
increased, a much wider array of people began to digitize their favorite texts, and these collections began
to circulate, first via CD-ROMs, later via the internet.
One such collection belonged to Maxim Moshkov, who published his library under the name lib.ru in
1994. Moshkov was a graduate of the Moscow State University Department of Mechanics and
Mathematics, which played a large role in the digitization of scientific works. After graduation, he started
to work for the Scientific Research Institute of System Development, a computer science institute
associated with the Russian Academy of Sciences. He describes the early days of his collection as follows:
“I began to collect electronic texts in 1990, on a desktop computer. When I got on the Internet in 1994, I
found lots of sites with texts. It was like a dream came true: there they were, all the desired books. But
these collections were in a dreadful state! Incompatible formats, different encodings, missing content. I
had to spend hours sco


owly the library grew, and the audience increased with it. People started
to send books to me, because they were easier to read in my collection. And the time came when I
stopped surfing the internet for books: regular readers are now sending me the books. Day after day I get
about 100 emails, and 10-30 of them contain books. So many books were sent in, that I did not have time
to process them. Authors, translators and publishers also started to send texts. They all needed the
library.” (Мошков, 1999)

In the second half of the 1990s, the Russian Internet (RuNet) was awash in book digitization projects.
With the advent of scanners, OCR technology, and the Internet, the work of digitization eased
considerably. Texts migrated from print to digital and sometimes back to print again. They circulated
through different collections, which, in turn, merged, fell apart, and re-formed. Digital libraries with the
mission to collect and consolidate these free-floating texts sprang up by the dozens.
Such digital librarianship was the antithesis of official Soviet book culture: it was free, bottom-up,
democratic, and uncensored. It also offered a partial remedy to problems created by the post-Soviet
collapse of the economy: the impoverishment of libraries, readers, and publishers. In this context, book
digitization and collecting also offered a sense of political, economic and cultural agency, with parallels
to the copying and distribution of texts in Soviet times. The capacity to scale up these practices coincided
with the moment when anti-totalitarian social sentiments were the strongest, and economic needs the
direst.
The unprecedented bloom of digital librarianship was the result of the superimposition of multiple waves
of distinct transformations: technological, political, economic and social. “Maksim Moshkov's Library”
was ground zero for this convergence and soon became a central point of exchange for the community
engaged in text digitization and collection:
[At the outset] there were just a couple of people who started scanning books in large quantities. Literally
hundreds of books. Others started proofreading, etc. There was a huge hole in the market for books.
Science fiction, adventure, crime fiction, all of this was hugely in demand by the public. So lib.ru was to a
large part the response, and was filled by those books that people most desired and most valued.
For years, lib.ru integrated as much as it could of the different digital libraries flourishing in the RuNet. By
doing so, it preserved the collections of the many shor


ial support. The kolhoz group never had a web site with a database, like
most projects today. They had an ftp server with files, and the access to ftp was given by PM in a forum.
This ftp server was privately supported by one of the members (who was an academic researcher, like
most kolhoz members). The files were distributed directly by burning files on writable DVDs and giving the

4 DJVU is a file format that revolutionized online book distribution the way mp3 revolutionized online music
distribution. For books that contain graphs, images and mathematical formulae, scanning is the only digitization
option. However, the large number of resulting image files is difficult to handle. The DJVU file format allows the
images of scanned book pages to be stored in very small files, which makes it well suited to
the distribution of scanned e-books.


Draft Manuscript, 11/4/2014, DO NOT CITE!
DVDs away. Later, the ftp access was closed to the public, and only a temporary file-swapping ftp server
remained. Today the kolhoz DVD releases are mostly spread via torrents.” 5
Kolhoz amassed around fifty thousand documents, the mexmat collection of the Moscow State
Universi


he collection is represented not by the number of books but
by the amount of knowledge it contains. [ALEPH] does not need to grow more and I am not the only one
among us who thinks so. […]
We have absolutely no idea who sends books in. It is practically impossible to know, because there are a
million books. We gather huge collections which eliminate any traces of the original uploaders.
My expectation is that new arrivals will dry up. Not completely, as I described above, some books will
always be scanned or rescanned (it nowadays happens quite surprisingly often) and the overall process of
digitization cannot and should not be stopped. It is also hard to say when the slowdown will occur: I
expected it about a year ago, but then library.nu got shut down and things changed dramatically in many
respects. Now we are "in charge" (we had been the largest anyways, just now everyone thinks we are in
5 Anonymous source #1

charge) and there has been a temporary rise in the book inflow. At the moment, relatively small or
previously unseen collections are being integrated into [ALEPH]. Perhaps in a year it will saturate.
However, intuition is not a good g


had to adopt global norms, while the global norms struggled to adapt to the emergence of digital
copying.
The first post-Soviet decade produced new copyright laws that conformed with some of the international
norms advocated by Western rightsholders, but little legal clarity or enforceability (Sezneva & Karaganis,
2011). Under such conditions, informally negotiated copynorms set in to fill the void of non-existent,
unreasonable, or unenforceable laws. The pirate libraries in the RuNet are as much regulated by such
norms as by the actual laws themselves.
During most of the 1990s, user-driven digitization and archiving was legal, or, to be more exact, was not
illegal. The first Russian copyright law, enacted in 1993, did not cover “internet rights” until a 2006
amendment (Budylin & Osipova, 2007; Elst, 2005, p. 425). As a result, many (including the
Moscow prosecutor’s office) argued that the distribution of copyrighted works via the internet was not
copyright infringement. Authors and publishers who saw their works appear in digital form and
circulate via CD-ROMs and the internet had to rely on informal norms, still in development, to establish
control over their texts vis-à-vis en


am afraid to live in a world where no one reads
books. This is already the case in America, and it is speeding up with us. I don’t just want to derail this
process, I would like to turn it around.”

Moshkov played a crucial role in consolidating copynorms in the Russian digital publishing domain. His
reputation and place in the Russian literary domain are marked by a number of prizes 12 and by the library’s
continued existence. This place was secured by a number of closely intertwined factors:

• Framing and anchoring the digitization and distribution practice in the library tradition.
• The non-profit status of the enterprise.
• Respecting the wishes of the rights holders even when he was not legally obliged to do so.
• Maintaining active communication with the different stakeholders in the community,
  including authors and readers.
• Responding to a clear gap in affordable, legal access.
• Conservatism with regard to the book, anchored in the argument that digital texts are not
  substitutes for printed matter.

Many other digital libraries tried to follow Moshkov’s formula, but the times were changing. Internet and
computer access le

 
