kolhoz in Bodo 2014
the relevant subsections
of lib.ru and split off. Sites specializing in those genres quickly formed their own ecosystem. [L], the first
of its kind, now charges a monthly fee to provide access to the collection. The [f] community split off
from [L] the same way that [L] split off from lib.ru, to provide free and unrestricted access to a
fundamentally similar collection. Finally, some in the community felt the need to focus their efforts on a
separate collection of scientific works. This became the Kolhoz collection.
The genesis of a million-book scientific library
A Kolhoz (Russian: колхо́з) was one of the types of collective farm that emerged in the early Soviet
period. In the early days, it was a self-governing, community-owned collaborative enterprise, with many
of the features of a commons. For the Russian digital librarians, these historical resonances were
The kolhoz group was initially a community that scanned and processed scientific materials: books and,
occasionally, articles. The ethos was free sharing. Academic institutes in Russia were in dire need of
scientific texts; they xeroxed and scanned whatever they could. Usually, the files were then stored on the
institute's ftp site and could be downloaded freely. There were at least three major research institutes
that did this, back in early 2000s, unconnected to each other in any way, located in various faraway parts
of Russia. Most of these scans were appropriated by the kolhoz group and processed into DJVU [4].
The sources of files for kolhoz were, initially, several collections from academic institutes (downloaded
whenever the ftp servers were open for anonymous access; in one case, from one of the institutes of the
Chinese academy of sciences, but mostly from Russian academic institutes). At that time (around 2002),
there were also several commercialized collections of scanned books on sale in Russia (mostly, these were
college-level textbooks on math and physics); these files were also all copied to kolhoz and processed into
DJVU. The focus was on collecting the most important science textbooks and monographs of all time, in
all fields of natural science.
There was never any commercial support. The kolhoz group never had a web site with a database, like
most projects today. They had an ftp server with files, and the access to ftp was given by PM in a forum.
This ftp server was privately supported by one of the members (who was an academic researcher, like
most kolhoz members). The files were distributed directly by burning files on writable DVDs and giving the
DVDs away. Later, the ftp access was closed to the public, and only a temporary file-swapping ftp server
remained. Today the kolhoz DVD releases are mostly spread via torrents.” [5]
[4] DJVU is a file format that revolutionized online book distribution the way MP3 revolutionized online music
distribution. For books that contain graphs, images and mathematical formulae, scanning is the only digitization
option. However, the large number of resulting image files is difficult to handle. The DJVU file format allows the
images of scanned book pages to be stored in the smallest possible file size, which makes it the perfect medium for
the distribution of scanned e-books.
Draft Manuscript, 11/4/2014, DO NOT CITE!
Kolhoz amassed around fifty thousand documents; the mexmat collection of the Moscow State
University Department of Mechanics and Mathematics (Moshkov’s alma mater) was around the same
size; the “world of books” collection (mirknig) had around thirty thousand files; and there were around a
dozen other smaller archives, each with approximately ten thousand files in their respective collections.
The Kolhoz group dominated the science-minded ebook community in Russia well into the late 2000s.
Kolhoz, however, suffered from the same problems as the early Fidonet-based text collections. Since it
was distributed on DVDs, via ftp servers and on torrents, it was hard to search, it lacked a proper catalog
and it was prone to fragmentation. Parallel solutions soon emerged: around 2006-2007, an existing book site
called Gigapedia copied the English books from Kolhoz, set up a catalog, and soon became the most
influential pirate library on the English-speaking internet.
Similar cataloguing efforts soon emerged elsewhere. In 2007, someone on rutracker.ru, a Russian BBS
focusing on file sharing, posted torrent links to 91 DVDs containing science and technology titles
aggregated from various other Russian sources, including Kolhoz. This massive collection had no
categorization or particular order. But it soon attracted an archivist: a user of the forum started the
laborious task of organizing the texts into a usable, searchable format, first filtering duplicates and
organizing existing metadata into an Excel spreadsheet, and later moving to a more open, web-based database operating under the name Aleph.
Aleph inherited more than just books from Kolhoz and Moshkov’s lib.ru. It inherited their elitism with
regard to canonical texts, and their understanding of librarianship as a community effort. Like the earlier
sites, Aleph’s collections are complemented by a stream of user submissions. Like the other sites, the
number of submissions grew rapidly as the site’s visibility, reputation and trustworthiness were
established, and like the others it later fell, as more and more of what was perceived as canonical
literature was uploaded: