Medak, Sekulic & Mertens
Book Scanning and Post-Processing Manual Based on Public Library Overhead Scanner v1.2
2014


PUBLIC LIBRARY
&
MULTIMEDIA INSTITUTE

BOOK SCANNING & POST-PROCESSING MANUAL
BASED ON PUBLIC LIBRARY OVERHEAD SCANNER

Written by:
Tomislav Medak
Dubravka Sekulić
With help of:
An Mertens

Creative Commons Attribution - Share-Alike 3.0 Germany

TABLE OF CONTENTS

Introduction
I. Photographing a printed book
II. Getting the image files ready for post-processing
III. Transformation of source images into .tiffs
IV. Optical character recognition
V. Creating a finalized e-book file
VI. Cataloging and sharing the e-book
Quick workflow reference for scanning and post-processing
References

INTRODUCTION:
BOOK SCANNING - FROM PAPER BOOK TO E-BOOK
Initial considerations when deciding on a scanning setup
Book scanning tends to be a fragile and demanding process. Many things can go wrong or produce
results of varying quality from book to book or page to page, and resolving the issues that occur
requires experience or technical skill. Cameras can fail to trigger, components can fail to
communicate, files can get corrupted in transfer, storage cards may not get purged, focus can fail
to lock, lighting conditions can change. There are trade-offs between automation, which is prone to
instability, and robustness, which tends to be time-consuming.
Your initial choice of book scanning setup will have to take these trade-offs into consideration. If
your scanning community is confined to your hacklab, you won't be risking much if technological
sophistication and integration fail to function smoothly. But if you're aiming at a broad community
of users, with varying levels of technological skill and patience, you want to create as much
time-saving automation as possible on the condition of keeping maximum stability. Furthermore, if
the time that individual members of your scanning community can contribute is limited, you might
also want to divide some of the tasks between users and their different skill levels.
This manual breaks down the process of digitization into a general description of the steps in the
workflow leading from the printed book to a digital e-book. In a concrete situation, each step can be
addressed in various ways, depending on the scanning equipment, software, hacking skills and user
skill level available to your book scanning project. Several of those steps can be handled by a
single piece of equipment or software, or you might need to use a number of them - your mileage
will vary. Therefore, the manual will try to indicate the design choices you have in the process of
planning your workflow and should help you decide which design is best for your situation.
Introducing book scanner designs
Book scanning starts with the capturing of digital image files on the scanning equipment. There
are three principal types of book scanner designs:
- flatbed scanner
- single camera overhead scanner
- dual camera overhead scanner
Conventional flatbed scanners are widely available. However, they require the book to be spread
wide open and pressed down with the platen in order to break the resistance of the book binding and
sufficiently expose the inner margin of the text. This makes flatbed scanning the most destructive
approach for the book, as well as imprecise and slow.
Therefore, book scanning projects across the globe have taken to custom designing improvised
setups or scanner rigs that are less destructive and better suited for fast turning and capturing of
pages. Designs abound. Most include:
- one or two digital photo cameras of lesser or higher quality to capture the pages,
- transparent V-shaped glass or Plexiglas platen to press the open book against a V-shape
cradle, and
- a light source.

The go-to web resource to help you make an informed decision is the DIY book scanning
community at http://diybookscanner.org. A good place to start is their intro
(http://wiki.diybookscanner.org/) and scanner build list
(http://wiki.diybookscanner.org/scanner-build-list).
Book scanners with a single camera are substantially cheaper, but come with the added difficulty
of de-warping the page images, which are distorted by the angle at which the pages are
photographed. This can sometimes be difficult to correct in the post-processing. Hence, in this
introductory chapter we'll focus on two-camera designs where the camera lens stands relatively
parallel to the page. However, with a bit of adaptation these instructions can be used to work with
any other setup.
The Public Library scanner
The focus of this manual is the scanner built for the Public Library project, designed by Voja
Antonić (see Illustration 1). The Public Library scanner was built with the immediate use by a wide
community of users in mind. Hence, the principal consideration in designing the Public Library
scanner was less sophistication and more robustness, ease of use and a distributed process of
editing.
The board designs can be found here:
http://www.memoryoftheworld.org/blog/2012/10/28/our-beloved-bookscanner. The current iterations
are using two Canon 1100 D cameras with the kit lens Canon EF-S 18-55mm 1:3.5-5.6 IS. Cameras are
auto-charging.

Illustration 1: Public Library Scanner
The scanner operates by automatically lowering the Plexiglas platen, illuminating the page and then
triggering camera shutters. The turning of pages and the adjustments of the V-shaped cradle holding
the book are manual.
The scanner is operated by a two-button controller (see Illustration 2). The upper, smaller button
breaks the capture process in two steps: the first click lowers the platen, increases the light level and
allows you to adjust the book or the cradle, the second click triggers the cameras and lifts the platen.
The lower button has two modes. A quick click will execute the whole capture process in one go. But
if you hold it pressed longer, it will lower the platen, allowing you to adjust the book and the
cradle, and lift it without triggering cameras when you press again.

Illustration 2: A two-button controller

More on this manual: steps in the book scanning process
The book scanning process in general can be broken down into six steps, each of which will be dealt
with in a separate chapter in this manual:
I. Photographing a printed book
II. Getting the image files ready for post-processing
III. Transformation of source images into .tiffs
IV. Optical character recognition
V. Creating a finalized e-book file
VI. Cataloging and sharing the e-book
A step by step manual for Public Library scanner
This manual is primarily meant to provide a detailed description and step-by-step instructions for an
actual book scanning setup -- based on Voja Antonić's scanner design described above. This is a
two-camera overhead scanner, currently equipped with two Canon 1100 D cameras with the EF-S
18-55mm 1:3.5-5.6 IS kit lens. It can scan books of up to A4 page size.
The post-processing in this setup is based on a semi-automated transfer of files to a GNU/Linux
personal computer and on the use of free software for image editing, optical character recognition
and finalization of an e-book file. It was initially developed for the HAIP festival in Ljubljana in
2011 and perfected later at MaMa in Zagreb and Leuphana University in Lüneburg.
Public Library scanner is characterized by a somewhat less automated yet distributed scanning
process than highly automated and sophisticated scanner hacks developed at various hacklabs. A
brief overview of one such scanner, developed at the Hacker Space Bruxelles, is also included in
this manual.
The Public Library scanning process thus proceeds in the following discrete steps:

1. creating digital images of pages of a book,
2. manual transfer of image files to the computer for post-processing,
3. automated renaming of files, ordering of even and odd pages, rotation of images and upload to a
cloud storage,
4. manual transformation of source images into .tiff files in ScanTailor,
5. manual optical character recognition and creation of PDF files in gscan2pdf.
The detailed description of the Public Library scanning process follows below.
The Bruxelles hacklab scanning process
For purposes of comparison, here we'll briefly reference the scanner built by the Bruxelles hacklab
(http://hackerspace.be/ScanBot). It is a dual camera design too. With some differences in hardware functionality
(Bruxelles scanner has automatic turning of pages, whereas Public Library scanner has manual turning of pages), the
fundamental difference between the two is in the post-processing - the level of automation in the transfer of images
from the cameras and their transformation into PDF or DjVu e-book format.
The Bruxelles scanning process is different in so far as the cameras are operated by a computer and the images are
automatically transferred, ordered and made ready for further post-processing. The scanner is home-brew, but the
process is for advanced DIY'ers. If you want to know more on the design of the scanner, contact Michael Korntheuer at
contact@hackerspace.be.
The scanning and post-processing is automated by a single Python script that does all the work:
http://git.constantvzw.org/?p=algolit.git;a=tree;f=scanbot_brussel;h=81facf5cb106a8e4c2a76c048694a3043b158d62;hb=HEAD
The scanner uses two Canon point and shoot cameras. Both cameras are connected to the PC with USB. They both run
PTP/CHDK (Canon Hack Development Kit). The scanning sequence is the following:
1. Script sends CHDK command line instructions to the cameras
2. Script sorts out the incoming files. This part is tricky. There is no reliable way to make a distinction between the left
and right camera, only between which camera was recognized by USB first. So the protocol is to always power up the
left camera first. See the instructions with the source code.
3. Collect images in a PDF file
4. Run script to OCR a .PDF file to plain .TXT file:
http://git.constantvzw.org/?p=algolit.git;a=blob;f=scanbot_brussel/ocr_pdf.sh;h=2c1f24f9afcce03520304215951c65f58c0b880c;hb=HEAD

I. PHOTOGRAPHING A PRINTED BOOK
Technologically the most demanding part of the scanning process is creating digital images of the
pages of a printed book. It's a process that differs greatly from scanner design to scanner design
and from camera to camera. Therefore, here we will focus strictly on the process with the Public
Library scanner.
Operating the Public Library scanner
0. Before you start:
Better and more consistent photographs lead to a more optimized and faster post-processing and a
higher quality of the resulting digital e-book. In order to guarantee the quality of images, before you
start it is necessary to set up the cameras properly and prepare the printed book for scanning.
a) Loosening the book
Depending on the type and quality of binding, some books tend to be too resistant to opening fully
to reveal the inner margin under the pressure of the scanner platen. It is thus necessary to “break in”
the book before starting in order to loosen the binding. The best way is to open it as wide as
possible in multiple places in the book. This can be done against the table edge if the book is more
rigid than usual. (Warning – “breaking in” might create irreversible creasing of the spine or lead to
some pages breaking loose.)
b) Switch on the scanner
You start the scanner by pressing the main switch or plugging the power cable into the scanner.
This will also turn on the overhead LED lights.

c) Setting up the cameras
Place the cameras onto tripods. You need to move the lever on the tripod's head to allow the tripod
plate screwed to the bottom of the camera to slide into its place. Secure the lock by turning the lever
all the way back.
If automatic chargers for the cameras are provided, open the battery lid on the bottom of each
camera and plug in the automatic charger. Close the lid.
Switch on each camera using the lever on the top right side of the camera's body and set it to the
aperture priority (Av) mode on the mode dial above the lever (see Illustration 3). Use the main dial
just above the shutter button on the front side of the camera to set the aperture value to F8.0.

Illustration 3: Mode and main dial, focus mode switch, zoom
and focus ring
On the lens, turn the focus mode switch to manual (MF), turn the large zoom ring to set the value
exactly midway between 24 and 35 mm (see Illustration 3). Try to set both cameras the same.
To focus each camera, open a book on the cradle, lower the platen by holding the big button on the
controller, and turn on the live view on camera LCD by pressing the live view switch (see
Illustration 4). Now press the magnification button twice and use the focus ring on the front of the
lens to get a clear image view.

Illustration 4: Live view switch and magnification button

d) Connecting the cameras
Now connect the cameras to the remote shutter trigger cables that can be found lying on each side
of the scanner. They need to be plugged into a small round port hidden behind a protective rubber
cover on the left side of the cameras.
e) Placing the book into the cradle and double-checking the cameras
Open the book in the middle and place it on the cradle. Hold down the large button on the
controller to lower the Plexiglas platen without triggering the cameras. Move the cradle so that
the tip of the platen fits into the middle of the book.
Turn on the live view on the cameras' LCD to see if the pages fit into the image and if the
cameras are positioned parallel to the page.
f) Double-check storage cards and batteries
It is important that both storage cards on cameras are empty before starting the scanning in order
not to mess up the page sequence when merging photos from the left and the right camera in the
post-processing. To double-check, press the play button on each camera. If there are photos left
from the previous scan, erase them by pressing the menu button, selecting the fifth menu from
the left and then selecting 'Erase Images' -> 'All images on card' -> 'OK'.
If no automatic chargers are provided, double-check on the information screen that batteries are
charged. They should be fully charged before starting with the scanning of a new book.

g) Turn off the light in the room
Lighting conditions during scanning should be as constant as possible. To reduce glare and achieve
maximum quality, remove any source of light that might reflect off the Plexiglas platen. Preferably
turn off the light in the room or isolate the scanner with the black cloth provided.

1. Photographing a book
Now you are ready to start scanning. Place the book closed in the cradle and lower the platen by
holding down the large button on the controller (see Illustration 2). Adjust the position of the
cradle and lift the platen by pressing the large button again.
To scan, you can either use the small button on the controller to lower the platen, adjust the book,
and then press it again to trigger the cameras and lift the platen, or you can make a short press on
the large button to do it all in one go.
ATTENTION: When the cameras are triggered, the shutter sound has to be heard coming
from both cameras. If one camera is not working, it's best to reconnect both cameras (see
Section 0), make sure the batteries are charged or adapters are connected, erase all images
and restart.
A mistake made in the photographing requires a lot of work in the post-processing, so it's
much quicker to repeat the photographing process.
If you make a mistake while flipping pages, or any other mistake, go back and scan from the page
you missed or incorrectly scanned. Note down the page where the error occurred, and the redundant
images will be removed in the post-processing.
ADVICE: The scanner has a digital counter. By turning the dial forward and backward, you
can set it to tell you what page you should be scanning next. This should help you avoid
missing a page due to a distraction.
While scanning, move the cradle a bit to the left from time to time, making sure that the tip of the
V-shaped platen is aligned with the center of the book and the inner margin is exposed enough.

II. GETTING THE IMAGE FILES READY FOR POST-PROCESSING
Once the book pages have been photographed, they have to be transferred to the computer and
prepared for post-processing. With two-camera scanners, the capturing process will result in two
separate sets of images -- odd and even pages -- coming from the left and right cameras respectively
-- and you will need to rename and reorder them accordingly, rotate them into a vertical position
and collate them into a single sequence of files.
a) Transferring image files
For the transfer of files your principal design choices are either to remove the memory cards from
the cameras and copy the files to the computer via a card reader, or to transfer them via a USB
cable. The latter process can be automated by remotely operating your cameras from a computer.
However, this can be done only with a certain number of Canon cameras
(http://bit.ly/16xhJ6b) that can be hacked to run the open Canon Hack Development Kit firmware
(http://chdk.wikia.com).
After transferring the files, erase all the image files on the camera memory card, so that
they do not end up mixed into the scan of the next book.
b) Renaming image files
As the left and right camera are typically operated in sync, the photographing process results in two
separate sets of images, with even and odd pages respectively, that have completely different file
names and potentially the same time stamps. So before you collate the page images in the order in
which they appear in the book, you want to rename the files so that the first image comes from the
right camera, the second from the left camera, the third again from the right camera and so on.
You probably want to do a batch renaming, where your right camera files start with n and are offset
by an increment of 2 (e.g. page_0000.jpg, page_0002.jpg,...) and your left camera files start with
n+1 and are also offset by an increment of 2 (e.g. page_0001.jpg, page_0003.jpg,...).
Batch renaming can be completed either from your file manager, in command line or with a number
of GUI applications (e.g. GPrename, rename, cuteRenamer on GNU/Linux).
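The interleaved numbering described above can be sketched in a few lines of Python. This is only an illustration, not the script used with the Public Library scanner: the function names and folder arguments are hypothetical, and it assumes that each camera numbers its files in capture order, so that sorting by file name reproduces the page order.

```python
from pathlib import Path
import shutil


def plan_renames(right_dir, left_dir, out_dir, start=0):
    """Return (source, target) pairs that interleave the two camera sets.

    Right camera files get start, start+2, ...; left camera files get
    start+1, start+3, ... (hypothetical helper, not part of any tool).
    """
    out_dir = Path(out_dir)
    plan = []
    for offset, camera_dir in ((0, right_dir), (1, left_dir)):
        # Assumption: sorting by name matches the order of capture.
        shots = sorted(
            p for p in Path(camera_dir).iterdir() if p.suffix.lower() == ".jpg"
        )
        for i, src in enumerate(shots):
            number = start + offset + 2 * i
            plan.append((src, out_dir / f"page_{number:04d}.jpg"))
    return plan


def apply_renames(plan):
    """Move every source file to its new name in the target folder."""
    for src, dst in plan:
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
```

Building the plan first and applying it in a second step lets you print and check the mapping before any file is moved. Because both sets land in one target folder, this also takes care of the collating described under d).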
c) Rotating image files
Before you collate the renamed files, you might want to rotate them. This is a step that can be done
also later in the post-processing (see below), but if you are automating or scripting your steps this is
a practical place to do it. The images leaving your cameras will be positioned horizontally. In order
to position them vertically, the images from the camera on the right will have to be rotated by 90
degrees counter-clockwise, the images from the camera on the left will have to be rotated by 90
degrees clockwise.
Batch rotating can be completed in a number of photo-processing tools, in command line or
dedicated applications (e.g. Fstop, ImageMagick, Nautilus Image Converter on GNU/Linux).
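As a sketch of how the rotation could be scripted, the snippet below builds calls to ImageMagick's mogrify tool from Python. It assumes ImageMagick is installed and on your path; the helper names and the camera labels are hypothetical. ImageMagick's -rotate option turns an image clockwise by the given number of degrees, so 270 stands in for 90 degrees counter-clockwise.

```python
import subprocess

# Right camera: 90 degrees counter-clockwise (270 clockwise).
# Left camera: 90 degrees clockwise.
ROTATION_DEGREES = {"right": 270, "left": 90}


def rotation_command(camera, files):
    """Build a mogrify call that rotates the given files in place."""
    degrees = str(ROTATION_DEGREES[camera])
    return ["mogrify", "-rotate", degrees, *[str(f) for f in files]]


def rotate_in_place(camera, files):
    """Run the command. mogrify overwrites its input files."""
    subprocess.run(rotation_command(camera, files), check=True)
```

Since mogrify overwrites the files it is given, it is safer to run this on the copies on your computer and not on the memory card itself.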
d) Collating images into a single batch
Once you're done with the renaming and rotating of the files, you want to collate them into the same
folder for easier manipulation later.

Getting the image files ready for post-processing on the Public Library scanner
In the case of the Public Library scanner, a custom C++ script was written by Mislav Stublić to
facilitate the transfer, renaming, rotating and collating of the images from the two cameras.
The script prompts the user to place the memory card from the right camera into the card reader
first, gives a preview of the first and last four images and provides an entry field to create a
sub-folder in a local cloud storage folder (path: /home/user/Copy).
It transfers, renames and rotates the files, deletes them from the card and prompts the user to
replace the card with the one from the left camera in order to transfer the files from there and
place them in the same folder. The script was created for GNU/Linux systems and it can be
downloaded, together with its source code, from: https://copy.com/nLSzflBnjoEB
If you have cameras other than Canon, you can edit line 387 of the source file to change it to the
naming convention of your cameras, and recompile by running the following command in your
terminal: "gcc scanflow.c -o scanflow -ludev `pkg-config --cflags --libs gtk+-2.0`"
In the case of the Hacker Space Bruxelles scanner, this is handled by the same script that operates
the cameras, which can be downloaded from:
http://git.constantvzw.org/?p=algolit.git;a=tree;f=scanbot_brussel;h=81facf5cb106a8e4c2a76c048694a3043b158d62;hb=HEAD

III. TRANSFORMATION OF SOURCE IMAGES INTO .TIFFS
Images transferred from the cameras are high definition full color images. You want your cameras
to shoot at the largest possible .jpg resolution in order for resulting files to have at least 300 dpi (A4
at 300 dpi requires a 9.5 megapixel image). In the post-processing the size of the image files needs
to be reduced down radically, so that several hundred images can be merged into an e-book file of a
tolerable size.
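The resolution requirement can be checked with a little arithmetic. The sketch below computes the pixel dimensions a page needs at a given dpi; the function names are ours. For a bare A4 page at 300 dpi it gives roughly 8.7 megapixels, so the figure quoted above presumably leaves some headroom for the cradle and surroundings that end up in the frame (that reading is an assumption).

```python
A4_MM = (210, 297)  # width and height of an A4 page in millimetres
MM_PER_INCH = 25.4


def pixels_needed(size_mm, dpi):
    """Pixel width and height needed to cover a page at the given dpi."""
    return tuple(round(side / MM_PER_INCH * dpi) for side in size_mm)


def megapixels(size_mm, dpi):
    """Total pixel count in millions for a page at the given dpi."""
    width, height = pixels_needed(size_mm, dpi)
    return width * height / 1_000_000


print(pixels_needed(A4_MM, 300))         # (2480, 3508)
print(round(megapixels(A4_MM, 300), 1))  # 8.7
```

The same functions can be used the other way round: if the page fills only part of the frame, the effective dpi of the page drops accordingly.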
Hence, the first step in the post-processing is to crop the images from cameras only to the content of
the pages. The surroundings around the book that were captured in the photograph and the white
margins of the page will be cropped away, while the printed text will be transformed into black
letters on white background. The illustrations, however, will need to be preserved in their color or
grayscale form, and mixed with the black and white text. What were initially large .jpg files will
now become relatively small .tiff files that are ready for optical character recognition process
(OCR).
These tasks can be completed by a number of software applications. Our manual will focus on one
that can be used across all major operating systems -- ScanTailor. ScanTailor can be downloaded
from: http://scantailor.sourceforge.net/. A more detailed video tutorial of ScanTailor can be found
here: http://vimeo.com/12524529.
ScanTailor: from a photograph of a page to a graphic file ready for OCR
Once you have transferred all the photos from cameras to the computer, renamed and rotated them,
they are ready to be processed in the ScanTailor.
1) Importing photographs to ScanTailor
- start ScanTailor and open ‘new project’
- for ‘input directory’ choose the folder where you stored the transferred and renamed photo images
- you can leave ‘output directory’ as it is, it will place your resulting .tiffs in an 'out' folder inside
the folder where your .jpg images are
- select all files (if you followed the naming convention above, they will be named
‘page_xxxx.jpg’) in the folder where you stored the transferred photo images, and click 'OK'
- in the dialog box ‘Fix DPI’ click on All Pages, and for DPI choose preferably '600x600', click
'Apply', and then 'OK'
2) Editing pages
2.1 Rotating photos/pages
If you've rotated the photo images in the previous step using the scanflow script, skip this step.
- Rotate the first photo counter-clockwise, click Apply and for scope select ‘Every other page’
followed by 'OK'
- Rotate the following photo clockwise, applying the same procedure as in the previous step
2.2 Deleting redundant photographs/pages
- Remove redundant pages (photographs of the empty cradle at the beginning and the end of the
book scanning sequence; book cover pages if you don’t want them in the final scan; duplicate pages
etc.) by right-clicking on a thumbnail of that page in the preview column on the right side, selecting
‘Remove from project’ and confirming by clicking on ‘Remove’.

# If you remove the wrong page by accident, you can re-insert it by right-clicking on a page
before/after the missing page in the sequence, selecting 'insert after/before' (depending on which
page you selected) and choosing the file from the list. Before you finish adding, it is necessary to
go through the procedure of fixing DPI and rotating again.
2.3 Adding missing pages
- If you notice that some pages are missing, you can recapture them with the camera and insert them
manually at this point using the procedure described above under 2.2.
3) Split pages and deskew
Steps ‘Split pages’ and ‘Deskew’ should work automatically. Run them by clicking the ‘Play’ button
under the 'Select content' function. This will do the three steps automatically: splitting of pages,
deskewing and selection of content. After this you can manually re-adjust splitting of pages and deskewing.
4) Selecting content
Step ‘Select content’ works automatically as well, but it is important to revise the resulting selection
manually page by page to make sure the entire content is selected on each page (including the
header and page number). Where necessary, use your pointer device to adjust the content selection.
If the inner margin is cut, go back to 'Split pages' view and manually adjust the selected split area. If
the page is skewed, go back to 'Deskew' and adjust the skew of the page. After this go back to
'Select content' and readjust the selection if necessary.
This is the step where you do visual control of each page. Make sure all pages are there and
selections are as equal in size as possible.
At the bottom of the thumbnail column there is a sort option that can automatically arrange pages by
the height and width of the selected content, making the process of manual selection easier.
Extreme differences in height should be avoided. Try to make the selected areas as equal as
possible, particularly in height, across all pages. The exception should be the cover and back pages,
where we advise selecting the full page.
5) Adjusting margins
For best results, select the content of the full cover and back page in the previous step. Now go to
the 'Margins' step, set Top, Bottom, Left and Right under the Margins section all to 0.0 and do
'Apply to...' → 'All pages'.
In Alignment section leave 'Match size with other pages' ticked, choose the central positioning of
the page and do 'Apply to...' → 'All pages'.
6) Outputting the .tiffs
Now go to the 'Output' step. Ignore the 'Output Resolution' section.
Next review two consecutive pages from the middle of the book to see if the scanned text is too
faint or too dark. If the text seems too faint or too dark, use slider Thinner – Thicker to adjust. Do
'Apply to' → 'All pages'.
Next go to the cover page and select under Mode 'Color / Grayscale' and tick on 'White Margins'.
Do the same for the back page.
If there are any pages with illustrations, you can choose the 'Mixed' mode for those pages and then
under the tab 'Picture Zones' adjust the zones of the illustrations.
Now you are ready to output the files. Just press 'Play' button under 'Output'. Once the computer is
finished processing the images, just do 'File' → 'Save as' and save the project.

IV. OPTICAL CHARACTER RECOGNITION
Before the edited-down graphic files are finalized as an e-book, we want to transform the image of
the text into actual text that can be searched, highlighted, copied and transformed. That
functionality is provided by Optical Character Recognition. This is a technically difficult task -
dependent on language, script, typeface and quality of print - and there aren't that many OCR tools
that are good at it. There is, however, a relatively good free software solution - Tesseract
(http://code.google.com/p/tesseract-ocr/) - that has solid performance, good language data and can
be trained for an even better performance, although it has its problems. Proprietary solutions (e.g.
Abbyy FineReader) sometimes provide superior results.
Tesseract supports as input format primarily .tiff files. It produces a plain text file that can be, with
the help of other tools, embedded as a separate layer under the original graphic image of the text in
a PDF file.
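Run from the command line, Tesseract takes an input image, an output base name and a language code. The following sketch wraps that call in Python to OCR a folder of .tiff files one by one. It assumes the tesseract binary and the language data you need are installed; the helper names and the folder layout are hypothetical.

```python
from pathlib import Path
import subprocess


def tesseract_command(tiff_path, lang="eng"):
    """Build the call: tesseract <image> <output base> -l <language>."""
    tiff_path = Path(tiff_path)
    # Tesseract appends ".txt" to the output base name on its own.
    output_base = tiff_path.with_suffix("")
    return ["tesseract", str(tiff_path), str(output_base), "-l", lang]


def ocr_folder(folder, lang="eng"):
    """OCR every .tif/.tiff file in the folder, one text file per page."""
    for tiff in sorted(Path(folder).glob("*.tif*")):
        subprocess.run(tesseract_command(tiff, lang), check=True)
```

For example, ocr_folder("out", lang="deu") would write out/page_0000.txt next to out/page_0000.tif for a German book, provided the German language data is installed.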
With the help of other tools, OCR can also be performed against other input files, such as
graphic-only PDF files. This produces inferior results, depending again on the quality of graphic
files and the reproduction of text in them. One such tool is a bash script to OCR a PDF file that can
be found here: https://github.com/andrecastro0o/ocr/blob/master/ocr.sh
As mentioned in the 'before scanning' section, the quality of the original book will influence the
quality of the scan and thus the quality of the OCR. For a comparison, have a look here:
http://www.paramoulipist.be/?p=1303
Once you have your .txt file, there is still some work to be done. Because OCR has difficulty
interpreting particular elements in the layout and fonts, the .txt file comes with a lot of errors.
Recurrent problems are:
- combinations of specific letters in some fonts (it can mistake 'm' for 'n' or 'I' for 'i' etc.);
- headers become part of body text;
- footnotes are placed inside the body text;
- page numbers are not recognized as such.
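Some of these problems can be reduced with simple heuristics before correcting the text by hand. The sketch below, with hypothetical function names, assumes you have one text per page. It drops lines that contain nothing but a short number, and removes a first line that repeats across several pages, treating it as a running header. Both rules can misfire (a year standing alone on a line looks like a page number), so review the result.

```python
from collections import Counter
import re

# A line holding only one to four digits is treated as a page number.
PAGE_NUMBER = re.compile(r"^\s*\d{1,4}\s*$")


def drop_page_numbers(page_text):
    """Remove lines that consist of a bare number."""
    lines = page_text.splitlines()
    return "\n".join(line for line in lines if not PAGE_NUMBER.match(line))


def first_text_line(page_text):
    """Return the first non-blank line of a page, stripped, or None."""
    for line in page_text.splitlines():
        if line.strip():
            return line.strip()
    return None


def drop_running_headers(pages, min_repeats=3):
    """Remove a first line that opens at least min_repeats pages."""
    counts = Counter(first_text_line(page) for page in pages)
    headers = {
        line for line, n in counts.items()
        if line is not None and n >= min_repeats
    }
    cleaned = []
    for page in pages:
        lines = page.splitlines()
        for i, line in enumerate(lines):
            if line.strip():
                if line.strip() in headers:
                    del lines[i]
                break  # only the first non-blank line is examined
        cleaned.append("\n".join(lines))
    return cleaned
```

Footnotes and misread letters are harder to catch automatically and are not addressed by this sketch.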

V. CREATING A FINALIZED E-BOOK FILE
After the optical character recognition has been completed, the resulting text can be merged with
the images of pages and output into an e-book format. While increasingly the proper e-book file
formats such as ePub have been gaining ground, PDFs still remain popular because many people
tend to read on their computers, and they retain the original layout of the book on paper including
the absolute pagination needed for referencing in citations. DjVu is also an option, as an alternative
to PDF, used because of its purported superiority, but it is far less popular.
The export to PDF can again be done with a number of tools. In our case we'll complete the optical
character recognition and PDF export in gscan2pdf. Again, the proprietary Abbyy FineReader will
produce somewhat smaller PDFs.
If you prefer to use an e-book format that works better with e-book readers, obviously you will have
to remove some of the elements that appear in the book - headers, footers, footnotes and pagination.
This can be done earlier in the process of cropping down the original .jpg image files (see under III)
or later by transforming the PDF files. The latter can be done in Calibre (http://calibre-ebook.com)
by converting the PDF into an ePub, where it can be further tweaked to better accommodate or remove
the headers, footers, footnotes and pagination.
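Calibre also ships a command line converter, ebook-convert, which takes an input file and an output file and picks the formats from the file extensions. A minimal sketch of calling it from Python follows. It assumes Calibre is installed and ebook-convert is on your path; the helper names are hypothetical, and no conversion options are set, so headers and footers still need to be dealt with separately.

```python
from pathlib import Path
import subprocess


def ebook_convert_command(pdf_path, epub_path=None):
    """Build the call: ebook-convert <input file> <output file>."""
    pdf_path = Path(pdf_path)
    # By default, write the ePub next to the PDF under the same name.
    epub_path = Path(epub_path) if epub_path else pdf_path.with_suffix(".epub")
    return ["ebook-convert", str(pdf_path), str(epub_path)]


def convert_to_epub(pdf_path, epub_path=None):
    """Run the conversion and stop with an error if Calibre reports one."""
    subprocess.run(ebook_convert_command(pdf_path, epub_path), check=True)
```

The quality of the resulting ePub depends heavily on the text layer in the PDF, so expect to tweak the result in Calibre afterwards.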
Optical character recognition and PDF export in Public Library workflow
Optical character recognition with the Tesseract engine can be performed on GNU/Linux by a
number of command line and GUI tools. Many of those tools also exist for other operating systems.
For the users of the Public Library workflow, we recommend using gscan2pdf application both for
the optical character recognition and the PDF or DjVu export.
To do so, start gscan2pdf and open your .tiff files. To OCR them, go to 'Tools' and select 'OCR'. In
the dialog box select the Tesseract engine and your language, then click 'Start OCR'. Once the OCR
is finished, export the graphic files and the OCR text to PDF by selecting 'Save as'.
However, given that proprietary solutions sometimes produce better results, these tasks can also
be done, for instance, in Abbyy FineReader running on a Windows operating system inside
VirtualBox. The prerequisite is that you have both Windows and Abbyy FineReader available to
install in VirtualBox. Once you've got both installed, you need to designate a shared folder in
VirtualBox and place the .tiff files there. You can then open them from Abbyy FineReader running
in VirtualBox, OCR them and export them into a PDF.
To use Abbyy FineReader, transfer the output files from your 'out' folder to the shared folder of
VirtualBox. Then start VirtualBox, start the Windows image and, in Windows, start Abbyy
FineReader. Open the files and let Abbyy FineReader read them. Once it's done, output the
result into PDF.

VI. CATALOGING AND SHARING THE E-BOOK
Your road from a book on paper to an e-book is complete. If you want to maintain your library you
can use Calibre, a free software tool for e-book library management. You can add the metadata to
your book using the existing catalogues or you can enter metadata manually.
Now you may want to distribute your book. If the work you've digitized is in the public domain
(https://en.wikipedia.org/wiki/Public_domain), you might consider contributing it to Project
Gutenberg
(http://www.gutenberg.org/wiki/Gutenberg:Volunteers'_FAQ#V.1._How_do_I_get_started_as_a_Project_Gutenberg_volunteer.3F),
Wikibooks (https://en.wikibooks.org/wiki/Help:Contributing) or Archive.org.
If the work is still under copyright, you might explore a number of different options for sharing.

QUICK WORKFLOW REFERENCE FOR SCANNING AND
POST-PROCESSING ON PUBLIC LIBRARY SCANNER
I. PHOTOGRAPHING A PRINTED BOOK
0. Before you start:
- loosen the book binding by opening it wide in several places
- switch on the scanner
- set up the cameras:
    - place cameras on tripods and fit them tightly
    - plug the automatic chargers into the battery slot and close the battery lid
    - switch on the cameras
    - switch the lens to Manual Focus mode
    - switch the cameras to Av mode and set the aperture to 8.0
    - turn the zoom ring to set the focal length exactly midway between 24mm and 35mm
    - focus by turning on the live view, pressing the magnification button twice and adjusting the
      focus to get a clear view of the text
    - connect the cameras to the scanner by plugging the remote trigger cable into the port behind
      the protective rubber cover on the left side of the cameras
- place the book into the cradle
- double-check storage cards and batteries:
    - press the play button on the back of the camera to check if there are images on the
      camera - if there are, delete all the images from the camera menu
    - if using batteries, check that the batteries are fully charged
- switch off any light in the room that could reflect off the platen and cover the scanner with the
  black cloth
1. Photographing
- now you can start scanning, either by pressing the smaller button on the controller once to
lower the platen and adjust the book, and then pressing it again to increase the light intensity,
trigger the cameras and lift the platen; or by pressing the large button, which completes the
entire sequence in one go
- ATTENTION: Shutter sound should be coming from both cameras - if one camera is not
working, it's best to reconnect both cameras, make sure the batteries are charged or adapters
are connected, erase all images and restart.
- ADVICE: The scanner has a digital counter. By turning the dial forward and backward,
you can set it to tell you what page you should be scanning next. This should help you to
avoid missing a page due to a distraction.

II. Getting the image files ready for post-processing
- after you finish scanning a book, transfer the files to the post-processing computer
and purge the memory cards
- if transferring the files manually:
    - create two separate folders, one for each camera
    - transfer the image files from each card into its folder
    - using batch renaming software, rename the files from the right camera following the
      convention page_0001.jpg, page_0003.jpg, page_0005.jpg... and the files from the left
      camera following the convention page_0002.jpg, page_0004.jpg, page_0006.jpg...
    - collate the image files into a single folder
    - before ejecting each card, delete all the photo files on the card
- if using the scanflow script:
    - start the script on the computer
    - place the card from the right camera into the card reader
    - enter the name of the destination folder following the convention
      "Name_Surname_Title_of_the_Book" and transfer the files
    - repeat with the other card
    - the script will automatically transfer the files, rename, rotate and collate them in
      proper order, and delete them from the card
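For manual transfers, the renaming convention above can also be scripted. This is a minimal sketch and not the scanflow script itself: the folder names 'right', 'left' and 'book' and both function names are placeholders of our own. It assumes that each camera's files sort into shooting order by file name and that both cameras took the same number of pictures.

```python
import shutil
from pathlib import Path


def collation_plan(right_files, left_files):
    """Pair each source file with its page_NNNN.jpg name.

    Right-camera files get odd numbers (1, 3, 5...), left-camera files
    get even numbers (2, 4, 6...), following the convention above.
    """
    right, left = sorted(right_files), sorted(left_files)
    if len(right) != len(left):
        raise ValueError(f"{len(right)} right vs {len(left)} left images: counts differ")
    plan = []
    for index, (r, l) in enumerate(zip(right, left)):
        plan.append((r, f"page_{2 * index + 1:04d}.jpg"))
        plan.append((l, f"page_{2 * index + 2:04d}.jpg"))
    return plan


def jpgs(folder):
    """List the .jpg files in a folder, whatever the case of the extension."""
    return [p for p in Path(folder).iterdir() if p.suffix.lower() == ".jpg"]


def collate(right_dir, left_dir, target_dir):
    """Copy both cameras' images into one folder under their new names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for source, new_name in collation_plan(jpgs(right_dir), jpgs(left_dir)):
        shutil.copy2(source, target / new_name)


if __name__ == "__main__":
    if Path("right").is_dir() and Path("left").is_dir():
        collate("right", "left", "book")
```

The sketch copies rather than moves the files, so deleting the photos from the cards remains a separate, deliberate step. A mismatch in the number of images between the two cameras stops the script, since it usually means that one camera failed to trigger somewhere in the book.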
III. Transformation of source images into .tiffs
ScanTailor: from a photograph of page to a graphic file ready for OCR
1) Importing photographs to ScanTailor
- start ScanTailor and open ‘new project’
- for ‘input directory’ choose the folder where you stored the transferred photo images
- you can leave ‘output directory’ as it is, it will place your resulting .tiffs in an 'out' folder
inside the folder where your .jpg images are
- select all files (if you followed the naming convention above, they will be named
‘page_xxxx.jpg’) in the folder where you stored the transferred photo images, and click
'OK'
- in the dialog box ‘Fix DPI’ click on 'All Pages', for DPI choose preferably '600x600',
click 'Apply', and then 'OK'
2) Editing pages
2.1 Rotating photos/pages
If you've rotated the photo images in the previous step using the scanflow script, skip this step.
- rotate the first photo counter-clockwise, click Apply and for scope select ‘Every other
page’ followed by 'OK'
- rotate the following photo clockwise, applying the same procedure as in the previous
step

2.2 Deleting redundant photographs/pages
- remove redundant pages (photographs of the empty cradle at the beginning and the end;
book cover pages if you don’t want them in the final scan; duplicate pages etc.) by
right-clicking on a thumbnail of that page in the preview column on the right, selecting ‘Remove
from project’ and confirming by clicking on ‘Remove’.
# If you remove a wrong page by accident, you can re-insert it by right-clicking on a page
before/after the missing page in the sequence, selecting 'insert after/before' and choosing the file
from the list. Before you finish adding, it is necessary to go through the procedure of fixing DPI and
rotating again.
2.3 Adding missing pages
- If you notice that some pages are missing, you can recapture them with the camera and
insert them manually at this point using the procedure described above under 2.2.
3) Split pages and deskew
- Functions ‘Split Pages’ and ‘Deskew’ should work automatically. Run them by
clicking the ‘Play’ button under the 'Select content' step. This will do the three steps
automatically: splitting of pages, deskewing and selection of content. After this you can
manually re-adjust splitting of pages and de-skewing.

4) Selecting content and adjusting margins
- Step ‘Select content’ works automatically as well, but it is important to revise the
resulting selection manually page by page to make sure the entire content is selected on
each page (including the header and page number). Where necessary use your pointer device
to adjust the content selection.
- If the inner margin is cut, go back to 'Split pages' view and manually adjust the selected
split area. If the page is skewed, go back to 'Deskew' and adjust the skew of the page. After
this go back to 'Select content' and readjust the selection if necessary.
- This is the step where you do visual control of each page. Make sure all pages are there
and selections are as equal in size as possible.
- At the bottom of the thumbnail column there is a sort option that can automatically arrange
pages by the height and width of the selected content, making the process of manual
selection easier. Extreme differences in height should be avoided: try to make the
selected areas as equal as possible, particularly in height, across all pages. The
exception is the cover and back pages, where we advise selecting the full page.

5) Adjusting margins
- Now go to the 'Margins' step and, in the Margins section, set Top, Bottom, Left and
Right all to 0.0 and do 'Apply to...' → 'All pages'.
- In the Alignment section leave 'Match size with other pages' ticked, choose the central
positioning of the page and do 'Apply to...' → 'All pages'.
6) Outputting the .tiffs
- Now go to the 'Output' step.
- Review two consecutive pages from the middle of the book to see if the scanned text is
too faint or too dark. If it is, use the 'Thinner – Thicker' slider to
adjust. Do 'Apply to' → 'All pages'.
- Next go to the cover page, select 'Color / Grayscale' under Mode and tick 'White
Margins'. Do the same for the back page.
- If there are any pages with illustrations, you can choose the 'Mixed' mode for those
pages and then adjust the zones of the illustrations under the 'Picture Zones' tab.
- To output the files press 'Play' button under 'Output'. Save the project.
IV. Optical character recognition & V. Creating a finalized e-book file
If using all free software:
1) open gscan2pdf (if not already installed on your machine, install gscan2pdf from the
repositories, Tesseract and data for your language from https://code.google.com/p/tesseract-ocr/)
- point gscan2pdf to open your .tiff files
- for Optical Character Recognition, select 'OCR' under the drop down menu 'Tools',
select the Tesseract engine and your language, start the process
- once OCR is finished, output to a PDF: go to 'File' and select 'Save', edit the
metadata, select the format and save
If using non-free software:
2) open Abbyy FineReader in VirtualBox (note: only Abbyy FineReader 10 installs and works - with some limitations - under GNU/Linux)
- transfer files in the 'out' folder to the folder shared with the VirtualBox
- point it to the readied .tiff files and it will complete the OCR
- save the file

REFERENCES
For more information on the book scanning process in general and making your own book scanner
please visit:
DIY Book Scanner: http://diybookscanner.org
Hacker Space Bruxelles scanner: http://hackerspace.be/ScanBot
Public Library scanner: http://www.memoryoftheworld.org/blog/2012/10/28/our-belovedbookscanner/
Other scanner builds: http://wiki.diybookscanner.org/scanner-build-list
For more information on automation:
Konrad Voelkel's post-processing script (From Scan to PDF/A):
http://blog.konradvoelkel.de/2013/03/scan-to-pdfa/
Johannes Baiter's automation of scanning to PDF process: http://spreads.readthedocs.org
For more information on applications and tools:
Calibre e-book library management application: http://calibre-ebook.com/
ScanTailor: http://scantailor.sourceforge.net/
gscan2pdf: http://sourceforge.net/projects/gscan2pdf/
Canon Hack Development Kit firmware: http://chdk.wikia.com
Tesseract: http://code.google.com/p/tesseract-ocr/
Python script of Hacker Space Bruxelles scanner:
http://git.constantvzw.org/?p=algolit.git;a=tree;f=scanbot_brussel;h=81facf5cb106a8e4c2a76c048694a3043b158d62;hb=HEAD


Sollfrank, Francke & Weinmayr
Piracy Project
2013


Giving What You Don't Have

Andrea Francke, Eva Weinmayr
Piracy Project

Birmingham, 6 December 2013

[00:12]
Eva Weinmayr: When we talk about the word piracy, it causes a lot of problems
to quite a few institutions to deal with it. So events that we’ve organised
have been announced by Central Saint Martins without using the word piracy.
That’s interesting, the problems it still causes…

Cornelia Sollfrank: And how do you announce the project without “Piracy”? The
Project?

E. W.: It’s a project about intellectual property.

C. S.: The P Project.

Andrea Francke, Eva Weinmayr: [laugh] Yes.

[00:52]
Andrea Francke: The Piracy Project is a knowledge platform, and it is based
around a collection of pirated books, of books that have been copied by
people. And we use it to raise discussion about originality, authorship,
intellectual property questions, and to produce new material, new essays and
new questions.

[01:12]
E. W.: So the Piracy Project includes several aspects. One is that it is an
act of piracy in itself, because it is located in an art school, in a library,
in an officially built-up collection of pirated books. [01:30] So that’s the
second aspect, it’s a collection of books which have been copied,
appropriated, modified, improved, which live in this library. [01:40] And the
third part is that it is a collection of physical books, which is touring. We
create reading rooms and invite people to explore the books and discuss issues
raised by cultural piracy.
[01:58] The Piracy Project started in an art college library, which was
supposed to be closed down. And the Piracy Project is one project of And
Publishing. And Publishing is a publishing activity exploring print-on-demand
and new modes of production and of dissemination, the immediacy of
dissemination. [02:20] And Publishing is a collaboration between myself and
Lynn Harris, and we were hosted by Central Saint Martins College of Art and
Design in London. And the campus where this library was situated was the
campus we were working at. [02:40] So when the library was being closed, we
moved in the library together with other members of staff, and kept the
library open in a self-organised way. But we were aware that there’s no budget
to buy new books, and we wanted to have this as a lively space, so we created
an open call for submissions and we asked people to select a book which is
really important to them and make a copy of it. [03:09] So we weren’t
interested in piling up a collection of second hand books, we were really
interested in this process: what happens when you make a copy of a book, and
how does this copy sit next to the original authoritative copy of the book.
This is how it started.

[03:31]
A. F.: I met Eva at the moment when And Publishing was helping to set up this
new space in the library, and they were trying to think how to make the
library more alive inside that university. [03:44] And I was doing research on
Peruvian book piracy at that time, and I had found this book that was modified
and was in circulation. And it was a very exciting moment for us to think what
happens if we can promote this type of production inside this academic
library.

[04:05] Piracy Project
Collection / Reading Room / Research

[04:11]
The Collection

[04:15]
E. W.: We asked people to make a copy of a book which is important to them and
send it to us, and so with these submissions we started to build up the
collections. Lots of students were getting involved, but also lots of people
who work in this topic, and were interested in these topics. [04:38] So we
received about one hundred books in a couple of months. And then, parallel to
this, we started to do research ourselves. [04:50] We had a residency in
China, so we went to China, to Beijing and Shanghai, to meet illegal
booksellers of pirated architecture books. And we had a residency in Turkey,
in Istanbul, where we did lots of interviews with publishers and artists on
book piracy. [05:09] So the collection is a mix of our own research and cases
from the real book markets, and creative work, artistic work which is produced
in the context of an art college and the wider cultural realm.

[05:29]
A. F.: And it is an ongoing project.

E. W.: The project is ongoing, we still receive submissions. The collection is
growing, and at the moment here we have about 180 books, here at Grand Union
(Birmingham).

[05:42]
A. F.: When we did the open call, something that was really important to us
was to make clear for people that they have a space of creativity when they
are making a copy. So we wrote, please send us a copy of a book, and be aware
that things happen when you copy a book. [05:57] Whether you do it
intentionally or not a copy is never the same. So you can use that space, take
ownership of that space and make something out of that; or you can take a step
back and allow things to happen without having control. And I think that is
something that is quite important for us in the project. [06:12] And it is
really interesting how people have embraced that in different measures, like
subtle things, or material things, or adding text, taking text out, mixing
things, judging things. Sometimes just saying, I just want it to circulate, I
don’t mind what happens in the space, I just want the subject to be in the
world again.

[06:35]
E. W.: I think this is one which I find interesting in terms of making a copy,
because it’s not so much about my own creativity, it’s more about exploring
how technology edits what you can see. It’s Jan van Toorn’s Critical Practice,
and the artist is Hester Barnard, a Canadian artist. [07:02] She sent us these
three copies, and we thought, that’s really generous, three copies. But they
are not identical copies, they are very different. Some have a lot of empty
pages in the book. And this book has been screen-captured on a 3.5 inch
iPhone, whereas this book has been screen-captured on a desktop, and this one
has been screen-captured with a laptop. [07:37] So the device you use to
access information online determines what you actually receive. And I find
this really interesting, that she translated this back into a hardcopy, the
online edited material. [07:53] And this is kind of taught by this book,
standard International Copyright. She went to Google Books, and screen-
captured all the pages Google Books are showing. So we are all familiar with
blurry text pages, but then it starts that you get the message “Page 38 is not
shown in this preview.” [08:18] And then it’s going through the whole book, so
she printed every page basically, omitting the actual information. But the
interesting thing is that we are all aware that this is happening on Google,
on screen online, but the fact that she’s translating this back into an
object, into a printed book, is interesting.

[08:44]
Reading Room

[08:48]
A. F.: We create these reading rooms with the collection as a way to tour the
collection, and meet people and have conversations around the books. And that
is something quite important to us, that we go with the physical books to a
place, either for two or three months, and meet different people that have
different interests in relation to the collection in that locality. We’ve been
doing that for the last two years, I think, three years. [09:12] And it’s
quite interesting because different places have very different experiences of
piracy. So you can go to a country where piracy is something very common, or a
different place where people have a very strong position against piracy, or a
different legal framework. And I feel the type of conversations and the
quality of interactions is quite different from being present on the space and
with the books. [09:36] And that’s why we don’t call these exhibitions,
because we always have places where people can come and they can stay, and
they can come again. Sometimes people come three or four times and they
actually read the books. And a few times they go back to their houses and they
bring books back, and they said, I’m going to contact this friend who has been
to Russia and he told me about this book – so we can add it to the collection.
I think that makes a big difference to how the research in the project
functions.

[10:06]
E. W.: One of the most interesting events we did with the Piracy collection
was at the Show Room where we had a residency for the last year. There were
three events, and one was A Day At The Courtroom. This was an afternoon where
we invited three copyright lawyers coming from different legal systems: the
US, the UK, and the Continental European, Athens. And we presented ten
selected cases from the collection and the three copyright lawyers had to
assess them in the eyes of the law, and they had to agree where to put this
book in a scale from legal to illegal. [10:51] So we weren’t interested really
to say, this is legal and this is illegal, we were interested in all the
shades in between. And then they had to discuss where they would place the
book. But then the audience had the last verdict, and then the audience placed
the book. [11:05] And this was an extremely interesting discussion, because it
was interesting to see how different the legal backgrounds are, how blurry the
whole field is, how you can assess when is the moment where a work becomes a
transformative work, or when it stays a derivative work, and this whole
discussion.
[11:30] When we do these reading rooms – and we had one in New York, for
example, at the New York Art Book Fair – people are coming, and they are
coming to see the physical books in a physical space, so this creates a social
encounter and we have these conversations. [11:47] For example, a woman stood
up to us in New York and she told us about a piracy project she ran where she
was working in a juvenile detention centre, and she produced a whole shadow
library of books because the incarcerated kids couldn’t take the books in
their cells, so she created these copies, individual chapters, and they could
circulate. [12:20] I’m telling this because the fact that we are having this
reading room and that we are meeting people, and that we are having these
conversations, really furthers our research. We find out about these projects
by sharing knowledge.

[12:38]
Categories

[12:42]
A. F.: Whenever we set our reading room for the Piracy Project we need to
organise the books in a certain way. What we started to do now is that we’ve
created these different categories, and the first set of categories came from
the legal event. [12:56] So we set up, we organised the books in different
categories that would help us have questions for the lawyers, that would work
for groups of books instead of individual works. [13:07] And the idea is that,
for example, we are going to have our next events with librarians, and a new
set of categories would come. So the categories change as our interest or
research in the project is changing. [13:21] The current categories are:
Pirated Design, so books where the look of the book has been copied but not
the content; recirculation, books that have been copied trying to be
reproduced exactly as they were, because they need to be circulating again;
transformation, books that have been modified; First Sale Doctrine, so we
receive quite a few books where people haven’t actually made a copy but they
have cut the book or drawn inside the book, and legally you are allowed to do
anything with a book except copy it, so we thought that it was quite important
so that we didn’t have to discuss that with the lawyers; [14:03] Public
Domain, which are works that are already out of copyright, again, so whatever
you do with those books is legal; and collation, books gathered from different
sources, and who owns the copyright, which was a really interesting question,
which is when you have a book that has many authors – it’s really interesting.
Different systems in different countries have different ways to deal with who
owns the copyright and what are the rights of the owners of the different
works.

[14:36]
E. W.: Ahmet Şık is a journalist who published a book about the Ergenekon
scandal and the Turkish government, and connects that kind of mafioso
structures. Before the book could be published he was arrested and put in jail
for a whole year without trial, and he sent the PDF to friends, and the PDF
was circulating on many different computers so it couldn’t be taken. [15:06]
They published the PDF, and as authors they put over a hundred different
author names, so there was not just one author who could be taken into
responsibility.

[15:22] We have in the collection this book, it’s Teignmouth Electron by
Tacita Dean. This is the original, it’s published by Book Works and Steidl.
And to this round table, to this event, we invited also Jane Rolo, director of
Book Works (and she published this book). [15:41] And we invited her saying,
do you know that your book has been pirated? So she was really interested and
she came along. This is the pirated version, it’s Alias, [by] Damián Ortega in
Mexico. It’s a series of books where he translates texts and theory into
Spanish, which are not available in Spanish. So it’s about access, it’s about
circulation. [16:07] But actually he redesigned the book. The pirated version
looks very different, and it has a small film roll here, from Tacita Dean’s
book. And it was really amazing that Jane Rolo flipped the pirated book and
she said, well, actually this is really very nice.

[16:31] This is kind of a standard academic publishing format, it’s Gilles
Deleuze’s Proust and Signs, and the contributor, the artist who produced the
book is Neil Chapman, a writer based in London. And he made a facsimile of his
copy of this book, including the binding mistakes – so there’s one chapter
upside down printed in the book. [17:04] But the really interesting thing is
that he scanned it on his home inkjet printer – he scanned it on his scanner
and then printed it on his home inkjet printer. And the feel of it is very
crafty, because the inkjet has a very different typographic appearance than
the official copy. [17:28] And this makes you read the book in quite a
different way, you relate differently to the actual text. So it’s not just
about the information conveyed on this page, it’s really about how I can
relate to it visually. I find this really interesting when we put this book
into the library, in our collection in the library, and it sat next to the
original, [17:54] it raises really interesting questions about what kind of
authority decides which book can access the library, because this is
definitely and obviously a self-made copy – so if this self-made copy can
enter the library, any self-made text and self-published copy could enter the
library. So it was raising really interesting questions about gatekeepers of
knowledge, and hierarchies and authorities.

[18:26]
On-line catalogue

[18:30]
E. W.: We created this online catalogue to give an overview of what we have in
the collection. We have a cover photograph and then we have a short text where
we try to frame and to describe the approach taken, like the strategy, what’s
been pirated and what was the strategy. [18:55] And this is quite a lot,
because it’s giving you the framework of it, the conceptual framework. But
it’s not giving you the book, and this is really important because lots of the
books couldn’t be digitised, because it’s exactly their material quality which
is important, and which makes the point. [19:17] So if I would… if I have a
project which is working about mediation, and then I put another layer of
mediation on top of it by scanning it, it just wouldn’t work anymore.
[19:29] The purpose of the online catalogue isn’t to give you insight into all
the books to make actually all the information available, it’s more to talk
about the approach taken and the questions which are raised by this specific
book.

[19:47]
Cultures of the copy

[19:51]
A topic of cultural difference became really obvious when we went to Istanbul.
A copy shop which had many academic titles on the shelves, copied, pirated
titles... The fact is that in London, where I’m based, you can access anything
in any library, and it’s not too expensive to get the original book. [20:27]
But in Istanbul it’s very expensive, and the whole academic community thrives
on pirated, copied academic titles.

[20:39]
A. F.: So this is the original Jaime Bayly [No se lo digas a nadie], and this
is the pirated copy of the Jaime Bayly. This book is from Peru, it was bought
on the street, on a street market. [20:53] And Peru has a very big pirated
book market, most books in Peru are pirated. And we found this because there
was a rumour that books in Peru had been modified, pirated books. And this
version, the pirated version, has two extra chapters that are not in the
original one. [21:13] It’s really hard to understand the motivation behind it.
There’s no credit, so the person is inhabiting this author’s identity in a
sense. They are not getting any cultural capital from it. They are not getting
extra money, because if they are found out, nobody would buy books from this
publisher anymore. [21:33] The chapters are really well written, so you as a
reader would not realise that you are reading something that has been pirated.
And that was really fascinating in terms of what space you create. So when you
have this technology that allows you to have the book open and print it so
easily – how can you take advantage of that, and take ownership or inhabit
these spaces that technology is opening up for you.

[22:01]
E. W.: Book piracy in China is really important when it comes to architecture
books, Western architecture books. Lots of architecture studios, but even
university libraries would buy from pirate book sellers, because it’s just so
much cheaper. [22:26] And we’ve found this Mark magazine with one of the
architecture sellers, and it’s supposed to be a bargain because you have six
magazines in one. [22:41] And we were really interested in the question, what
are the criteria for the editing? How do you edit six issues into one? But
basically everything is in here, from advertisement, to text, to images, it’s
all there. But then a really interesting question arises when it comes to
technology, because in this magazine there are pages in Italian language
clearly taken from other magazines.

[23:14]
A. F.: But it was also really interesting to go there, and actually interview
the distributor and go through the whole experience. We had to meet the
distributor in a neutral place, and he interviewed us to see if he was going
to allow us to go into the shop and buy his books. [23:31] And then going
through the catalogue and realising how Rem Koolhaas is really popular among
the pirates, but actually Chinese architecture is not popular, so there’s only
like three pirated books on Chinese architecture; or that from all the
architecture universities in the world only the AA books are copied – the
Architectural Association books. [23:51] And I think those small things are
really things that are worth spending time and reflecting on.

[23:58]
E. W.: We found this pirate copy of Tintin when we visited Beijing, and
obviously compared to the original, it looks different, a different format.
But also it’s black and white, but it’s not a photocopy of the original full-
colour. [24:23] It’s redrawn by hand, so all the drawings are redrawn and
obviously translated into Chinese. This is quite a labour of love, which is
really amazing. I can compare the two. The space is slightly differently
interpreted.

[24:50]
A. F.: And it’s really incredible, because at some point in China there were
14 or 15 different publishers publishing Tintin, and they all have their
versions. They are all hand-drawn by different people, so in the back, in
Chinese, it’s the credit. So you can buy it by deciding which person does the
best drawings of the production of Tintin, which I thought it was really…
[25:14] It’s such a different cultural way to actually give credit to the
person that is copying it, and recognise the labour, and the intention and the
value of that work.

[25:24]
Why books?

[25:28]
E. W.: Books have always been very important in my practice, in my artistic
practice, because lots of my projects culminated in a book, or led into a
book. And publications are important because they can circulate freely, they
can circulate much easier than artworks in a gallery. [25:50] So this question
of how to make things public and how to create an audience… not how to create
an audience – how to reach a reader and how to create a dialogue. So the book
is the perfect tool for this.

[26:04]
A. F.: My interest in books comes from making art, or thinking about art as a
way to interact with the world, so outside art settings, and I found books
really interesting in that. And that’s how I met Eva, in a sense, because I
was interested in that part of her practice. [26:26] When I found the Jaime
Bayly book, for me that was a real moment of excitement, of this person that
was doing these things in the world without taking any credit, but was having
such a profound effect on so many readers. I’m quite fascinated by that.
[26:44] I'm also really interested in research and using events – research
that works with people. So it kind of creates communities around certain
subjects, and then it uses that to explore different issues and to interact
with different areas of knowledge. And I think books are a privileged space to
do that.

[27:11]
E. W.: The books in the Piracy collection, because they are objects you can
grab, and because they need a place, they are a really important tool to start
a dialogue. When we had this reading room in the New York Art Book Fair, it
was really the book that created this moment when you started a conversation
with somebody else. And I think this is a very important moment in the Piracy
collection as a tool to start this discussion. [27:44] In the Piracy
collection the books are not so important to circulate, because they don’t
circulate. They only travel with us, in a way, or they travel here to Grand
Union to be installed in this reading room. But they are not meant to be
printed in a run of thousands and circulated in the world.

C. S.: So what is their function?

[28:08]
E. W.: The functions of the books here in the Piracy collection are to create
a dialogue, debate about these issues they are raising, and they are a tool
for a direct encounter, for a social encounter. As Andrea said, building a
community which is debating these issues which they are raising. [28:32] And I
also find it really interesting – when we were in China we also talked with
lots of publishers and artists, and they said that the book, in comparison to
an online file, is a really important tool in China, because it can’t be
controlled as easily as online communication. [28:53] So a book is an
autonomous object which can be passed on from one hand to the other, without
the state or another authority to intervene. I think that is an important
aspect when you talk about books in comparison with circulating information
online.

[29:13]
Passion for piracy

[29:17]
A. F.: I’m quite interested in enclosures, and people that jump those
enclosures. I’m kind of interested in these imposed… Maybe because I come from
Peru and we have a different relation to rules, and I’m in Britain where rules
seem to have so much strength. And I’m quite interested in this agency of
taking personal responsibility and saying, I’m going to obey this rule, I’m
not going to obey this one, and what does that mean. [29:42] That makes me
really interested in all these different strategies, and also to find a way to
value them and show them – how when you make this decision to jump a rule, you
actually help bring up questions, modifications, and propose new models or new
ways of thinking about things. [30:02] And I think that is something that is part
of all the other projects that I do: stating the rules and the people that
break them.

[30:12]
E. W.: The pirate as a trickster who tries to push the boundaries which are
being set. And I think the interesting, or the complex part of the Piracy
Project is that we are not saying, I’m for piracy or I’m against piracy, I’m
for copyright, I’m against copyright. It’s really about testing out these
decisions and one’s own boundaries, the legal boundaries, the moral limits – to
push them and find them. [30:51] I mean, the Piracy Project as a whole is a
project which is pushing the boundaries because it started in this academic
library, and it’s assessed by copyright lawyers as illegal, so to run such a
project is an act of piracy in itself.

[31:17]
This method of doing or approaching this art project is to create a
collaboration to instigate this discourse, and this discourse is happening on
many different levels. One of them is conversation, debate. But the other one
is this material outcome, and then this material outcome is creating a new
debate.

Medak, Mars & WHW
Public Library
2015


Public Library

may • 2015
price 50 kn

This publication is realized along with the exhibition
Public Library • 27/5 –13/06 2015 • Gallery Nova • Zagreb
Izdavači / Publishers
Editors
Tomislav Medak • Marcell Mars •
What, How & for Whom / WHW
ISBN 978-953-55951-3-7 [Što, kako i za koga/WHW]
ISBN 978-953-7372-27-9 [Multimedijalni institut]
A CIP catalog record for this book is available from the
National and University Library in Zagreb under 000907085

With the support of the Creative Europe Programme of the
European Union

ZAGREB • ¶ May • 2015

Public Library

1. Marcell Mars, Manar Zarroug & Tomislav Medak
Public Library (essay)

2. Paul Otlet
Transformations in the Bibliographical Apparatus of the Sciences
(Repertory — Classification — Office of Documentation)

3. McKenzie Wark
Metadata Punk

4. Tomislav Medak
The Future After the Library
UbuWeb and Monoskop’s Radical Gestures

Marcell Mars,
Manar Zarroug
& Tomislav Medak

Public library (essay)

In What Was Revolutionary about the French Revolution? 01 Robert Darnton considers how a complete collapse of the social order (when absolutely
everything — all social values — is turned upside
down) would look. Such trauma happens often in
the life of individuals but only rarely on the level
of an entire society.
In 1789 the French had to confront the collapse of
a whole social order—the world that they defined
retrospectively as the Ancien Régime — and to find
some new order in the chaos surrounding them.
They experienced reality as something that could
be destroyed and reconstructed, and they faced
seemingly limitless possibilities, both for good and
evil, for raising a utopia and for falling back into
tyranny.02
The revolution bootstraps itself.
01 Robert H. Darnton, What Was Revolutionary about the
French Revolution? (Waco, TX: Baylor University Press,
1996), 6.
02 Ibid.

In the dictionaries of the time, the word revolution was said to derive from the verb to revolve and
was defined as “the return of the planet or a star to
the same point from which it parted.” 03 French political vocabulary spread no further than the narrow
circle of the feudal elite in Versailles. The citizens,
revolutionaries, had to invent new words, concepts
… an entire new language in order to describe the
revolution that had taken place.
They began with the vocabulary of time and space.
In the French revolutionary calendar used from 1793
until 1805, time started on 1 Vendémiaire, Year 1, a
date which marked the abolition of the old monarchy on (the Gregorian equivalent) 22 September
1792. With a decree in 1795, the metric system was
adopted. As with the adoption of the new calendar,
this was an attempt to organize space in a rational
and natural way. Gram became a unit of mass.
In Paris, 1,400 streets were given new names.
Every reminder of the tyranny of the monarchy
was erased. The revolutionaries even changed their
names and surnames. Le Roy or Leveque, commonly
used until then, were changed to Le Loi or Liberté.
To address someone, out of respect, with vous was
forbidden by a resolution passed on 24 Brumaire,
Year 2. Vous was replaced with tu. People are equal.
The watchwords Liberté, égalité, fraternité (freedom, equality, brotherhood)04 were built through
literacy, new epistemologies, classifications, declarations, standards, reason, and rationality. What first
comes to mind about the revolution will never again
be the return of a planet or a star to the same point
from which it departed. Revolution bootstrapped,
revolved, and hermeneutically circularized itself.
03 Ibid.
04 Slogan of the French Republic, France.fr, n.d.,
http://www.france.fr/en/institutions-and-values/slogan-french-republic.html.
Melvil Dewey was born in the state of New York in
1851.05 His thirst for knowledge found its satisfaction in libraries. His knowledge about how to
gain knowledge was developed by studying libraries.
Grouping books on library shelves according to the
color of the covers, the size and thickness of the spine,
or by title or author’s name did not satisfy Dewey’s
intention to develop appropriate new epistemologies in the service of the production of knowledge
about knowledge. At the age of twenty-four, he had
already published the first of nineteen editions of
A Classification and Subject Index for Cataloguing
and Arranging the Books and Pamphlets of a Library,06 the classification system that still bears its
author’s name: the Dewey Decimal System. Dewey
had a dream: for his twenty-first birthday he had
announced, “My World Work [will be] Free Schools
and Free Libraries for every soul.”07
05 Richard F. Snow, “Melvil Dewey”, American Heritage 32,
no. 1 (December 1980),
http://www.americanheritage.com/content/melvil-dewey.
06 Melvil Dewey, A Classification and Subject Index for Cataloguing and Arranging the Books and Pamphlets of a
Library (1876), Project Gutenberg e-book 12513 (2004),
http://www.gutenberg.org/files/12513/12513-h/12513-h.htm.
07 Snow, “Melvil Dewey”.

His dream came true. Public Library is an entry
in the catalog of History where a fantastic decimal08
describes a category of phenomenon that—together
with free public education, free public healthcare,
the scientific method, the Universal Declaration of
Human Rights, Wikipedia, and free software, among
others—we, the people, are most proud of.
The public library is a part of these invisible infrastructures that we start to notice only once they
begin to disappear. A utopian dream—about the
place from which every human being will have access to every piece of available knowledge that can
be collected—looked impossible for a long time,
until the egalitarian impetus of social revolutions,
the Enlightenment idea of universality of knowledge,
and the exceptional suspension of the commercial
barriers to access to knowledge made it possible.
The internet has, as in many other situations, completely changed our expectations and imagination
about what is possible. The dream of a catalogue
of the world — a universal approach to all available
knowledge for every member of society — became
realizable. A question merely of the meeting of
curves on a graph: the point at which the line of
global distribution of personal computers meets
that of the critical mass of people with access to
the internet. Today nobody lacks the imagination
necessary to see public libraries as part of a global infrastructure of universal access to knowledge
for literally every member of society. However, the
emergence and development of the internet is taking place precisely at the point at which an institutional crisis—one with traumatic and inconceivable
consequences—has also begun.
08 “Dewey Decimal Classification: 001.”, Dewey.info, 27 October 2014, http://dewey.info/class/001/2009-08/about.en.
The internet is a new challenge, creating experiences commonly proffered as ‘revolutionary’. Yet, a
true revolution of the internet is the universal access
to all knowledge that it makes possible. However,
unlike the new epistemologies developed during
the French Revolution, the tendency is to keep the
‘old regime’ (of intellectual property rights, market
concentration and control of access). The new possibilities for classification, development of languages,
invention of epistemologies which the internet poses,
and which might launch off into new orbits from
existing classification systems, are being suppressed.
In fact, the reactionary forces of the ‘old regime’
are staging a ‘Thermidor’ to suppress the public libraries from pursuing their mission. Today public
libraries cannot acquire, cannot even buy digital
books from the world’s largest publishers.09 The
small number of e-books that they have already managed to acquire must be destroyed after only twenty-six
lendings.10 Libraries and the principle of universal
access to all existing knowledge that they embody
are losing, in every possible way, the battle with a
market dominated by new players such as Amazon.com,
Google, and Apple.
09 “American Library Association Open Letter to Publishers on
E-Book Library Lending”, Digital Book World, 24 September
2012, http://www.digitalbookworld.com/2012/american-library-association-open-letter-to-publishers-on-e-book-library-lending/.
10 Jeremy Greenfield, “What Is Going On with Library E-Book
Lending?”, Forbes, 22 June 2012, http://www.forbes.com/sites/jeremygreenfield/2012/06/22/what-is-going-on-with-library-e-book-lending/.
In 2012, Canada’s Conservative Party–led government cut financial support for Library and
Archives Canada (LAC) by Can$9.6 million, which
resulted in the loss of 400 archivist and librarian
jobs, the shutting down of some of LAC’s internet
pages, and the cancellation of the further purchase
of new books.11 In only three years, from 2010 to
2012, some 10 percent of public libraries were closed
in Great Britain.12
The commodification of knowledge, education,
and schooling (which are the consequences of a
globally harmonized, restrictive legal regime for intellectual property), combined with neoliberal austerity politics,
curtails the possibilities of adapting to new sociotechnological conditions, let alone further development, innovation, or even basic maintenance of
public libraries’ infrastructure.
Public libraries are an endangered institution,
doomed to extinction.
Petit bourgeois denial prevents society from confronting this disturbing insight. As in many other
fields, the only way out offered is innovative market-based
entrepreneurship. Some have even suggested that the public library should become an
open software platform on top of which creative
developers can build app stores13 or Internet cafés
for the poorest, ensuring that they are only a click
away from the Amazon.com catalog or the Google
search bar. But these proposals overlook, perhaps
deliberately, the fundamental principles of access
upon which the idea of the public library was built.
11 Aideen Doran, “Free Libraries for Every Soul: Dreaming
of the Online Library”, The Bear, March 2014, http://www.thebear-review.com/#!free-libraries-for-every-soul/c153g.
12 Alison Flood, “UK Lost More than 200 Libraries in 2012”,
The Guardian, 10 December 2012, http://www.theguardian.com/books/2012/dec/10/uk-lost-200-libraries-2012.
Those who are well-meaning, intelligent, and
tactful will try to remind the public of all the many
sides of the phenomenon that the public library is:
major community center, service for the vulnerable,
center of literacy, informal and lifelong learning; a
place where hobbyists, enthusiasts, old and young
meet and share knowledge and skills.14 Fascinating. Unfortunately, for purely tactical reasons, this
reminder to the public does not always contain an
explanation of how these varied effects arise out of
the foundational idea of a public library: universal
access to knowledge for each member of the society produces knowledge, produces knowledge about
knowledge, produces knowledge about knowledge
transfer: the public library produces sociability.
The public library does not need the sort of creative crisis management that wants to propose what
the library should be transformed into once our society, obsessed with market logic, has made it impossible for the library to perform its main mission. Such
proposals, if they do not insist on universal access
to knowledge for all members, are Trojan horses for
the silent but galloping disappearance of the public
library from the historical stage. Sociability—produced by public libraries, with all the richness of its
various appearances—will be best preserved if we
manage to fight for the values upon which we have
built the public library: universal access to knowledge for each member of our society.
13 David Weinberger, “Library as Platform”, Library Journal,
4 September 2012, http://lj.libraryjournal.com/2012/09/future-of-libraries/by-david-weinberger/.
14 Shannon Mattern, “Library as Infrastructure”, Design
Observer, 9 June 2014, http://places.designobserver.com/entryprint.html?entry=38488.
Freedom, equality, and brotherhood need brave librarians practicing civil disobedience.
Library Genesis, aaaaarg.org, Monoskop, UbuWeb
are all examples of fragile knowledge infrastructures
built and maintained by brave librarians practicing
civil disobedience, on which the world of researchers
in the humanities relies. These projects are re-inventing the public library in the gap left by today’s
institutions in crisis.
Library Genesis15 is an online repository with over
a million books and is the first project in history to
offer everyone on the Internet free download of its
entire book collection (as of this writing, about fifteen terabytes of data), together with all the metadata
(MySQL dump) and the PHP/HTML/JavaScript code
for its webpages. The most popular earlier repositories, such as Gigapedia (later Library.nu), handled
their upload and maintenance costs by selling advertising space to the pornographic and gambling
industries. Legal action was initiated against them,
and they were closed.16 News of the termination of
Gigapedia/Library.nu strongly resonated in
academic and book-enthusiast circles and was
even noted in the mainstream Internet media, just
like other major world events. The decision by Library Genesis to share its resources has resulted
in a network of identical sites (so-called mirrors)
through the development of an entire range of Net
services of metadata exchange and catalog maintenance, thus ensuring an exceptionally resistant
survival architecture.
15 See http://libgen.org/.
aaaaarg.org, started by the artist Sean Dockray, is
an online repository with over 50,000 books and
texts. A community of enthusiastic researchers from
critical theory, contemporary art, philosophy, architecture, and other fields in the humanities maintains,
catalogs, annotates, and initiates discussions around
it. It also serves as a courseware extension to the self-organized education platform The Public School.17
16 Andrew Losowsky, “Library.nu, Book Downloading Site,
Targeted in Injunctions Requested by 17 Publishers,” Huffington Post, 15 February 2012,
http://www.huffingtonpost.com/2012/02/15/librarynu-book-downloading-injunction_n_1280383.html.
17 “The Public School”, The Public School, n.d.,
https://www.thepublicschool.org/.

UbuWeb18 is the most significant and largest online
archive of avant-garde art; it was initiated and is led
by conceptual artist Kenneth Goldsmith. UbuWeb,
although still informal, has grown into a relevant
and recognized critical institution of contemporary
art. Artists want to see their work in its catalog and
thus agree to a relationship with UbuWeb that has
no formal contractual obligations.
Monoskop is a wiki for the arts, culture, and media
technology, with a special focus on the avant-garde,
conceptual, and media arts of Eastern and Central
Europe; it was launched by Dušan Barok and others.
In the form of a blog, Dušan uploads to Monoskop.org/log
an online catalog of curated titles (at the
moment numbering around 3,000), and, as with
UbuWeb, it is becoming more and more relevant
as an online resource.
Library Genesis, aaaaarg.org, Kenneth Goldsmith,
and Dušan Barok show us that the future of the
public library does not need crisis management,
venture capital, start-up incubators, or outsourcing but simply the freedom to continue extending
the dreams of Melvil Dewey, Paul Otlet19 and other
visionary librarians, just as it did before the emergence of the internet.

18 See http://ubu.com/.
19 “Paul Otlet”, Wikipedia, 27 October 2014,
http://en.wikipedia.org/wiki/Paul_Otlet.

With the emergence of the internet and software
tools such as Calibre and “[let’s share books],”20 librarianship has been given an opportunity, similar to astronomy and the project SETI@home21, to
include thousands of amateur librarians who will,
together with the experts, build a distributed peer-to-peer network to care for the catalog of available
knowledge, because
a public library is:
— free access to books for every member of society
— library catalog
— librarian
With books ready to be shared, meticulously
cataloged, everyone is a librarian.
When everyone is librarian, library is
everywhere.22


20 “Tools”, Memory of the World, n.d.,
https://www.memoryoftheworld.org/tools/.
21 See http://setiathome.berkeley.edu/.
22 “End-to-End Catalog”, Memory of the World, 26 November 2012,
https://www.memoryoftheworld.org/end-to-end-catalog/.

Paul Otlet

Transformations
in the Bibliographical Apparatus
of the Sciences [1]
Repertory — Classification — Office
of Documentation
1. Because of its length, its extension to all countries,
the profound harm that it has created in everyone’s
life, the War has had, and will continue to have, repercussions for scientific productivity. The hour for
the revision of the old order is about to strike. Forced
by the need for economies of men and money, and
by the necessity of greater productivity in order to
hold out against all the competition, we are going to
have to introduce reforms into each of the branches
of the organisation of science: scientific research, the
preservation of its results, and their wide diffusion.
Everything happens simultaneously and the distinctions that we will introduce here are only to
facilitate our thinking. Always adjacent areas, or
even those that are very distant, exert an influence
on each other. This is why we should recognize the
impetus, growing each day even greater in the organisation of science, of the three great trends of
our times: the power of associations, technological
progress and the democratic orientation of institutions. We would like here to draw attention to some
of their consequences for the book in its capacity
as an instrument for recording what has been discovered and as a necessary means for stimulating
new discoveries.
The Book, the Library in which it is preserved,
and the Catalogue which lists it, have seemed for
a long time as if they had achieved their heights of
perfection or at least were so satisfactory that serious
changes need not be contemplated. This may have
been so up to the end of the last century. But for a
score of years great changes have been occurring
before our very eyes. The increasing production of
books and periodicals has revealed the inadequacy of
older methods. The increasing internationalisation
of science has required workers to extend the range
of their bibliographic investigations. As a result, a
movement has occurred in all countries, especially
Germany, the United States and England, for the
expansion and improvement of libraries and for
an increase in their numbers. Publishers have been
searching for new, more flexible, better-illustrated,
and cheaper forms of publication that are better-coordinated with each other. Cataloguing enterprises
on a vast scale have been carried out, such as the
International Catalogue of Scientific Literature and
the Universal Bibliographic Repertory. [2]
Three facts, three ideas, especially merit study
for they represent something really new which in
the future can give us direction in this area. They
are: The Repertory, Classification and the Office of
Documentation.
•••

2. The Repertory, like the book, has gradually been
increasing in size, and improvements in it suggest
the emergence of something new which will radically modify our traditional ideas.
From the point of view of form, a book can be
defined as a group of pages cut to the same format
and gathered together in such a way as to form a
whole. It was not always so. For a long time the
Book was a roll, a volumen. The substances which
then took the place of paper — papyrus and parchment — were written on continuously from beginning to end. Reading required unrolling. This was
certainly not very practical for the consultation of
particular passages or for writing on the verso. The
codex, which was introduced in the first centuries of
the modern era and which is the basis of our present
book, removed these inconveniences. But its faults
are numerous. It constitutes something completed,
finished, not susceptible of addition. The Periodical
with its successive issues has given science a continuous means of concentrating its results. But, in
its turn, the collections that it forms run into the
obstacle of disorder. It is impossible to link similar
or connected items; they are added to one another
pell-mell, and research requires handling great masses of heavy paper. Of course indexes are a help and
have led to progress — subject indexes, sometimes
arranged systematically, sometimes analytically,
and indexes of names of persons and places. These
annual indexes are preceded by monthly abstracts
and are followed by general indexes cumulated every
five, ten or twenty-five years. This is progress, but
the Repertory constitutes much greater progress.

The aim of the Repertory is to detach what the
book amalgamates, to reduce all that is complex to
its elements and to devote a page to each. Pages, here,
are leaves or cards according to the format adopted.
This is the “monographic” principle pushed to its
ultimate conclusion. No more binding or, if it continues to exist, it will become movable, that is to
say, at any moment the cards held fast by a pin or a
connecting rod or any other method of conjunction
can be released. New cards can then be intercalated,
replacing old ones, and a new arrangement made.
The Repertory was born of the Catalogue. In
such a work, the necessity for intercalations was
clear. Nor was there any doubt as to the unitary or
monographic notion: one work, one title; one title,
one card. As a result, registers which listed the same
collections of books for each library but which had
constantly to be re-done as the collections expanded,
have gradually been discarded. This was practical
and justified by experience. But upon reflection one
wonders whether the new techniques might not be
more generally applied.
What is a book, in fact, if not a single continuous line which has initially been cut to the length
of a page and then cut again to the size of a justified
line? Now, this cutting up, this division, is purely
mechanical; it does not correspond to any division
of ideas. The Repertory provides a practical means
of physically dividing the book according to the
intellectual division of ideas.
Thus, the manuscript library catalogue on cards
has been quickly followed by catalogues printed on
cards (American Library Bureau, the Catalogue of
the Library of Congress in Washington) [3]; then by
bibliographies printed on cards (International Institute of Bibliography, Concilium Bibliographicum)
[4]; next, indices of species have been published on
cards (Index Speciorum) [5]. We have moved from
the small card to the large card, the leaf, and have
witnessed compendia abandoning the old form for
the new (Jurisclasseur, or legal digests in card form).
Even the idea of the encyclopedia has taken this
form (Nelson’s Perpetual Cyclopedia [6]).
Theoretically and technically, we now have in
the Repertory a new instrument for analytically or
monographically recording data, ideas, information. The system has been improved by divisionary cards of various shapes and colours, placed in
such a way that they express externally the outline
of the classification being used and reduce search
time to a minimum. It has been improved further
by the possibility of using, by cutting and pasting,
materials that have been printed on large leaves or
even books that have been published without any
thought of repertories. Two copies, the first providing the recto, the second the verso, can supply
all that is necessary. One has gone even further still
and, from the example of statistical machines like
those in use at the Census of Washington (sic) [7],
extrapolated the principle of “selection machines”
which perform mechanical searches in enormous
masses of materials, the machines retaining from
the thousands of cards processed by them only those
related to the question asked.
•••

3. But such a development, like the Repertory before it, presupposes a classification. This leads us to
examine the second practical idea that is bringing
about the transformation of the book.
Classification plays an enormous role in scientific thought. If one could say that a science was a
well-made language, one could equally assert that
it is a completed classification. Science is made up
of verified facts which are organised in a structure
of systems, hypotheses, theories, laws. If there is
a certain order in things, it is necessary to have it
also in science which reflects and explains nature.
That is why, since the time of Greek thought until
the present, constant efforts have been made to improve classification. These have taken three principal directions: classification studied as an activity
of the mind; the general classification and sequence
of the sciences; the systematization appropriate to
each discipline. The idea of order, class, genus and
species has been studied since Aristotle, in passing
by Porphyrus, by the scholastic philosophers and by
modern logicians. The classification of knowledge
goes back to the Greeks and owes much to the contributions of Bacon and the Renaissance. It was posed
as a distinct and separate problem by D’Alembert
and the Encyclopédie, and by Ampère, Comte, and
Spencer. The recent work of Manouvrier, Durand
de Cros, Goblot, Naville, de la Grasserie, has focussed on various aspects of it. [8] As to systematics,
one can say that this has become the very basis of
the organisation of knowledge as a body of science.
When one has demonstrated the existence of 28 million stars, a million chemical compounds, 300,000
vegetable species, 200,000 animal species, etc., it is
necessary to have a means, an Ariadne’s thread, of
finding one’s way through the labyrinth formed by
all these objects of study. Because there are sciences of beings as well as sciences of phenomena, and
because they intersect with each other as we better
understand the whole of reality, it is necessary that
this means be used to retrieve both. The state of development of a science is reflected at any given time
by its systematics, just as the general classification
of the sciences reflects the state of development of
the encyclopedia, of the philosophy of knowledge.
The need has been felt, however, for a practical
instrument of classification. The classifications of
which we have just spoken are constantly changing, at least in their detail if not in broad outline. In
practice, such instability, such variability which is
dependent on the moment, on schools of thought
and individuals, is not acceptable. Just as the Repertory had its origin in the catalogue, so practical
classification originated in the Library. Books represent knowledge and it is necessary to arrange them
in collections. Schemes for this have been devised
since the Middle Ages. The elaboration of grand
systems occurred in the 17th and 18th centuries
and some new ones were added in the 19th century. But when bibliography began to emerge as an
autonomous field of study, it soon began to develop
along the lines of the catalogue of an ideal library
comprising the totality of what had been published.
From this to drawing on library classifications was
but a step, and it was taken under certain conditions
which must be stressed.

Up to the present time, 170 different classifications
have been identified. Now, no cooperation is possible if everyone stays shut up in his own system. It
has been necessary, therefore, to choose a universal
classification and to recommend it as such in the
same way that the French Convention recognized
the necessity of a universal system of weights and
measures. In 1895 the first International Conference
of Bibliography chose the Decimal Classification
and adopted a complete plan for its development. In
1904, the edition of the expanded tables appeared. A
new edition was being prepared when the war broke
out. Brussels, headquarters of the International Institute of Bibliography, which was doing this work,
was part of the invaded territory.
In its latest state, the Decimal Classification has
become an instrument of great precision which
can meet many needs. The printed tables contain
33,000 divisions and they have an alphabetical index consisting of about 38,000 words. Learning is
here represented in its entire sweep: the encyclopedia of knowledge. Its principle is very simple. The
empiricism of an alphabetical classification by subject-heading cannot meet the need for organising
and systematizing knowledge. There is scattering;
there is also the difficulty of dealing with the complex expressions which one finds in the modern terminology of disciplines like medicine, technology,
and the social sciences. Above all, it is impossible
to achieve any international cooperation on such
a national basis as language. The Decimal Classification is a vast systematization of knowledge, “the
table of contents of the tables of contents” of all
treatises. But, as it would be impossible to find a
particular subject’s relative place by reference to
another subject, a system of numbering is needed.
This is decimal, which an example will make clear.
Optical Physiology would be classified thus:
5th Class           Natural Sciences
3rd Group           Physics
5th Division        Optics
7th Sub-division    Optical Physiology
or 535.7
This number 535.7 is called decimal because all
knowledge is taken as one of which each science is
a fraction and each individual subject is a decimal
subdivided to a lesser or greater degree. For the sake
of abbreviation, the zero of the complete number,
which would be 0.5357, has been suppressed because
the zero would be repeated in front of each number.
The numbers 5, 3, 5, 7 (which one could call five hundred and thirty-five point seven and which could
be arranged in blocks of three as for the telephone,
or in groups of twos) form a single number when
the implied words, “class, group, division and subdivision,” are uttered.
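A minimal sketch in Python (an editorial illustration, not part of Otlet's text) of how the number 535.7 is read digit by digit. The names `LABELS`, `read_class_number` and `as_fraction` are assumptions made for this sketch, and the table holds only the four headings of the example above.

```python
from fractions import Fraction

# Headings taken from the worked example in the text; the printed tables
# would supply many thousands more. This tiny table is an assumption.
LABELS = {
    "5": "Natural Sciences",
    "53": "Physics",
    "535": "Optics",
    "5357": "Optical Physiology",
}

LEVELS = ["class", "group", "division", "sub-division"]


def read_class_number(number: str) -> list[tuple[str, str, str]]:
    """Return (level, digits so far, heading) for each digit of a class number."""
    digits = number.replace(".", "")  # the point is only a visual separator
    steps = []
    for i in range(1, len(digits) + 1):
        prefix = digits[:i]
        level = LEVELS[i - 1] if i <= len(LEVELS) else f"level {i}"
        steps.append((level, prefix, LABELS.get(prefix, "(not in this sketch)")))
    return steps


def as_fraction(number: str) -> Fraction:
    """Restore the suppressed leading zero: 535.7 stands for 0.5357."""
    digits = number.replace(".", "")
    return Fraction(int(digits), 10 ** len(digits))


for level, prefix, heading in read_class_number("535.7"):
    print(f"{level:>12}  {prefix:<5} {heading}")
print(float(as_fraction("535.7")))
```

Each added digit narrows the heading reached so far, which is why further subdivision never disturbs the numbers already assigned.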
The classification is also called decimal because
all subjects are divided into ten classes, then each
of these into at least ten groups, and each group
into at least ten divisions. All that is needed for the
number 535.7 always to have the same meaning is
to translate the tables into all languages. All that is
needed to deal with future scientific developments
in optical physiology in all of its ramifications is to
subdivide this number by further decimal numbers
corresponding to the subdivisions of the subject.
Finally, all that is needed to ensure that any document or item pertaining to optical physiology finds
its place within the sum total of scientific subjects
is to write this number on it. In the alphabetic index
to the tables references are made from each word
to the classification number just as the index of a
book refers to page numbers.
This first remarkable principle of the decimal
classification is generally understood. Its second,
which has been introduced more recently, is less
well known: the combination of various classification numbers whenever there is some utility in expressing a compound or complex heading. In the
social sciences, statistics is 31 and salaries, 331.2. By
a convention these numbers can be joined by the
simple sign : and one may write 31:331.2 statistics
of salaries.01
This indicates a general relationship, but a subject also has its place in space and time. The subject
may be salaries in France limited to a period such as
the 18th century (that is to say, from 1700 to 1799).
01 The first ten divisions are: 0 Generalities, 1 Philosophy, 2
Religion, 3 Social Sciences, 4 Philology, Language, 5 Pure
Sciences, 6 Applied Science, Medicine, 7 Fine Arts, 8 Literature, 9 History and Geography. The Index number 31 is
derived from: 3rd class social sciences, 1st group statistics. The
Index number 331.2 is derived from 3rd class social sciences,
3rd group political economy, 1st division topics about work,
2nd subdivision salaries.

The sign that characterises division by place being
the parenthesis and that by time quotation marks
or double parentheses, one can write:
31:331.2 (44) «17» statistics — of salaries — in
France — in the 17th century
or ten figures and three signs to indicate, in terms
of the universe of knowledge, four subordinated
headings comprising 42 letters. And all of these
numbers are reversible and can be used for geographic or chronologic classification as well as for
subject classification:
(44) 31:331.2 «17»
France — Statistics — Salaries — 17th Century
«17» (44) 31:331.2
17th Century — France — Statistics — Salaries
The subdivisions of relation and location explained
here, are completed by documentary subdivisions
for the form and the language of the document (for
example, periodical, in Italian), and by functional
subdivisions (for example, in zoology all the divisions by species of animal being subdivided by biological aspects). It follows by virtue of the law of
permutations and combinations that the present
tables of the classification permit the formulation
at will of millions of classification numbers. Just as
arithmetic does not give us all the numbers readymade but rather a means of forming them as we
need them, so the classification gives us the means
of creating classification numbers insofar as we have
compound headings that must be translated into a
notation of numbers.
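The forming of compound numbers can be sketched in a few lines of Python. This is an editorial illustration under stated assumptions: the helper names `compose` and `all_orders` are hypothetical, and the facets are the ones given in the text (31 statistics, 331.2 salaries, (44) France, «17» the time division used in the example).

```python
from itertools import permutations

# Facets of the example in the text. The subject part joins two class
# numbers with the colon; place goes in parentheses, time in guillemets.
subject = ":".join(["31", "331.2"])   # statistics : salaries
place = "(44)"                        # France
time = "«17»"                         # the time division used in the text


def compose(parts: list[str]) -> str:
    """Join already-marked facets into one compound number."""
    return " ".join(parts)


def all_orders(parts: list[str]) -> list[str]:
    """Every filing order of the same facets (the numbers are reversible)."""
    return [compose(list(p)) for p in permutations(parts)]


compound = compose([subject, place, time])
print(compound)
for order in all_orders([subject, place, time]):
    print(order)

# Count the figures (digits) in the compound number.
figures = sum(ch.isdigit() for ch in compound)
print(figures)
```

Three facets already give six possible orders, and the three orders shown in the text are among them. The count of figures comes to ten, as the text states.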
Like chemistry, mathematics and music, bibliography thus has its own extremely simple notations:
numbers. Immediately and without confusion, it
allows us to find a place for each idea, for each thing
and consequently for each book, article, or document and even for each part of a book or document.
Thus it allows us to take our bearings in the midst
of the sources of knowledge, just as the system of
geographic coordinates allows us to take our bearings on land or sea.
One may well imagine the usefulness of such a
classification to the Repertory. It has rid us of the
difficulty of not having continuous pagination. Cards
to be intercalated can be placed according to their
class number and the numbering is that of tables
drawn up in advance, once and for all, and maintained with an unvarying meaning. As the classification has a very general use, it constitutes a true
documentary classification which can be used in
various kinds of repertories: bibliographic repertories; catalogue-like repertories of objects, persons,
phenomena; and documentary repertories of files
made up of written or printed materials of all kinds.
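As a hedged illustration in Python (the drawer contents and the name `filing_key` are assumptions of this sketch, not part of Otlet's text), filing by class number amounts to comparing the digits as decimal fractions, so a new card always finds a place between existing ones without any renumbering.

```python
import bisect


def filing_key(class_number: str) -> str:
    """Digits read as a decimal fraction: '535.7' files as 0.5357."""
    return class_number.replace(".", "")


# Hypothetical drawer of cards kept in order: (filing key, class number, title).
drawer = []
for number, title in [("31", "Statistics"), ("535", "Optics"), ("331.2", "Salaries")]:
    bisect.insort(drawer, (filing_key(number), number, title))

# New cards are intercalated at their place; the others keep their numbers.
bisect.insort(drawer, (filing_key("535.7"), "535.7", "Optical Physiology"))
bisect.insort(drawer, (filing_key("53"), "53", "Physics"))

order = [number for _, number, _ in drawer]
print(order)
```

The resulting order is 31, 331.2, 53, 535, 535.7: the general heading comes first and each subdivision follows it directly.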
The possibility can be envisaged of encyclopedic
repertories in which are registered and integrated
the diverse data of a scientific field and which draw
for this purpose on materials published in periodicals. Let each article, each report, each item of news
henceforth carry a classification number and, automatically, by clipping, encyclopedias on cards can
be created in which all the results of international
scientific cooperation are brought together at the
same number. This constitutes a profound change
in the technology of the Book, since the repertory
thus formed is simultaneously a constantly up-dated book and a cooperative book in which are found
printed elements produced in all locations.
•••
4. If we can realize the third idea, the Office of Documentation, then reform will be complete. Such an
office is the old library, but adapted to a new function. Hitherto the library has been a museum of
books. Works were preserved in libraries because
they were precious objects. Librarians were keepers.
Such establishments were not organised primarily
for the use of documents. Moreover, their outmoded
regulations if they did not exclude the most modern
forms of publication at least did not admit them.
They have poor collections of journals; collections
of newspapers are nearly nonexistent; photographs,
films, phonograph discs have no place in them, nor
do film negatives, microscopic slides and many other “documents.” The subject catalogue is considered
secondary in the library so long as there is a good
register for administrative purposes. Thus there is
little possibility of developing repertories in the
library, that is to say of taking publications to pieces and redistributing them in a more directly and
quickly accessible form. For want of personnel to
arrange them, there has not even been a place for
the cards that are received already printed.
The Office of Documentation, on the contrary, is
conceived of in such a way as to achieve all that is
lacking in the library. Collections of books are the
necessary basis for it, but books, far from being
considered as finished products, are simply materials which must be developed more fully. This
development consists in establishing the connections each individual book has with all of the other
books and forming from them all what might be
called The Universal Book. It is for this that we use
repertories: bibliographic repertories; repertories of
documentary dossiers gathering pamphlets and extracts together by subject; catalogues; chronological
repertories of facts or alphabetical ones of names;
encyclopedic repertories of scientific data, of laws,
of patents, of physical and technical constants, of
statistics, etc. All of these repertories will be set up
according to the method described above and arranged by the same universal classification. As soon
as an organisation to contain these repertories is
created, the Office of Documentation, one may be
sure that what happened to the book when libraries
first opened — scientific publication was regularised
and intensified — will happen to them. Then there
will be good reason for producing in bibliographies,
catalogues, and above all in books and periodicals
themselves, the rational changes which technology and the creative imagination suggest. What is
still an exception today will be common tomorrow.
New possibilities will exist for cooperative work
and for the more effective organisation of science.
•••
5. Repertory, Classification, Office of Documentation are therefore the three related elements of a
single reform in our methods of registering scientific discoveries and making them available to the
greatest number of people. Already one must speak
less of experiments and uncertain trials than of the
beginning of serious achievement. The International Institute of Bibliography in Brussels constitutes
a vast intellectual cooperative whose members are
becoming more numerous each day. Associations,
scientific establishments, periodical publications,
scientific and technical workers of every kind are
affiliating with it. Its repertories contain millions of
cards. There are sections in several countries.02 But
this was before the War. Since its outbreak, a movement in France, England and the United States has
been emerging everywhere to improve the organisation of the Book. The Office of Documentation has
been suggested as the solution for the requirements
that have been discussed.
It is important that the world of science and
technology should support this movement and
above all that it should endeavour to apply the new
methods to the works which it will be necessary to
re-organise. Among the most important of these is
the International Catalogue of Scientific Literature,
that fine and great work begun at the initiative of the
Royal Society of London. Until now, this work has
been carried on without relation to other works of
the same kind: it has not recognised the value of a
card repertory or a universal classification. It must
recognise them in the future.03 ❧

02 In France, the Bureau Bibliographique de Paris and great
associations such as the Société pour l’encouragement de
l’industrie nationale, l’Association pour l’avancement des
sciences, etc., are affiliated with it.

03 See Paul Otlet, “La Documentation et l’information au service de l’industrie”, Bulletin de la Société d’encouragement
de l’industrie nationale, June 1917. — La Documentation au
service de l’invention. Euréka, October 1917. — L’Institut
International de Bibliographie, Bibliographie de la France,
21 December 1917. — La Réorganisation du Catalogue international de la littérature scientifique. Revue générale des
sciences, 15 February 1918. The publications of the Institute,
especially the expanded tables of the Decimal Classification,
have been deposited at the Bureau Bibliographique de Paris,
44 rue de Rennes at the apartments of the Société de l’encouragement. — See also the report presented by General
Sebert [9] to the Congrès du Génie civil, in March 1918 and
whose conclusions about the creation in Paris of a National
Office of Technical Documentation have been adopted.

Editor’s Notes
[1] “Transformations opérées dans l’appareil bibliographique
des sciences,” Revue scientifique 58 (1918): 236-241.
[2] The International Catalogue of Scientific Literature, an enormous work, was compiled by a Central Bureau under the
sponsorship of the Royal Society from material sent in from
Regional Bureaus around the world. It was published annually beginning in 1902 in 17 parts each corresponding to
a major subject division and comprising one or more volumes. Publication was effectively suspended in 1914. By the
time war broke out, the Universal Bibliographic Repertory
contained over 11 million entries.
[3] For card publication by the Library Bureau and Library of
Congress, see Edith Scott, “The Evolution of Bibliographic
Systems in the United States, 1876–1945” and Editor’s Note
36 to the second paper and Note 5 to the seventh paper in
International Organisation and Dissemination of Knowledge; Selected Essays of Paul Otlet, translated and edited by
W. Boyd Rayward. Amsterdam: Elsevier, 1990: 148–156.
[4] Otlet refers to the Concilium Bibliographicum also in Paper
No. 7, “The Reform of National Bibliographies...” in International Organisation and Dissemination of Knowledge; Selected
Essays of Paul Otlet. See also Editor’s Note 5 in that paper
for the major bibliographies published by the Concilium
Bibliographicum.
[5] A possible example of what Otlet is referring to here is the
Gray Herbarium Index. This was “planned to provide cards
for all the names of vascular plant taxa attributable to the
Western Hemisphere beginning with the literature of 1886”
(Gray Herbarium Index, Preface, p. iii). Under its first compiler, 20 instalments consisting in all of 28,000 cards were
issued between 1894 and 1903. It has been continued after
that time and was for many years “issued quarterly at the
rate of about 4,000 cards per year.” At the time the cards
were reproduced in a printed catalogue by G. K. Hall in 1968,
there were 85 subscribers to the card sets.
[6] Nelson’s Perpetual Loose-Leaf Encyclopedia was a popular,
12-volume work which went through many editions, its
principle being set down at the beginning of the century.
It was published in binders and the publisher undertook to
supply a certain number of pages of revisions (or renewals)
semi-annually after each edition, the first of which appeared
in 1905. An interesting reference presumably to this work
occurs in a notice, “An Encyclopedia on the Card-Index System,” in the Scientific American 109 (1913): 213. The Berlin
Correspondent of the journal reports a proposal made in
Berlin which contains “an idea, in a sense ... already carried
out in an American loose-leaf encyclopedia, the publishers
of which supply new pages to take the place of those that
are obsolete” (Nelsons, an English firm, set up a New York
branch in 1896. Publication in the U.S. of works to be widely
circulated there was a requirement of the copyright law.)
The reporter observes that the principle suggested “affords
a means of recording all facts at present known as well as
those to be discovered in the future, with the same safety
and ease as though they were registered in our memory, by
providing a universal encyclopedia, incessantly keeping
abreast of the state of human knowledge.” The “bookish”
form of conventional encyclopedias acts against its future
success. “In the case of a mere storehouse of facts the infinitely
more mobile form of the card index should however
be adopted, possibly,” the author goes on making a most interesting reference, “in conjunction with Dr. Goldschmidt’s
Microphotographic Library System.” The need for a central
institute, the nature of its work, the advantages of the work
so organised are described in language that is reminiscent
of that of Paul Otlet (see also the papers of Goldschmidt
and Otlet translated in International Organisation and
Dissemination of Knowledge; Selected Essays of Paul Otlet).
[7] These machines were derived from Herman Hollerith’s
punched cards and tabulating machines. Hollerith had
introduced them under contract into the U.S. Bureau of
the Census for the 1890 census. This equipment was later
modified and developed by the Bureau. Hollerith, his invention and his business connections lie at the roots of the
present IBM company. The equipment and its uses in the
census from 1890 to 1910 are briefly described in John H.
Blodgett and Claire K. Schultz, “Herman Hollerith: Data
Processing Pioneer,” American Documentation 20 (1969):
221-226. As they observe, suggesting the accuracy of Otlet’s
extrapolation, “his was not simply a calculating machine,
it performed selective sorting, an operation basic to all information retrieval.”
[8] The history of the classification of knowledge has been treated
in English in detail by E.C. Richardson in his Classification
Theoretical and Practical, the first edition of which appeared
in 1901 and was followed by editions in 1912 and 1930. A
different treatment is given in Robert Flint’s Philosophy as
Scientia Scientiarum: a History of the Classification of the
Sciences which appeared in 1904. Neither of these works
deals with Manouvrier, a French anthropologist, or Durand
de Cros. Joseph-Pierre Durand, sometimes called Durand
de Cros after his birth place, was a French physiologist and
philosopher who died in 1900. In his Traité de documentation,
in the context of his discussion of classification, Otlet refers
to an Essai de taxonomie by Durand published by Alcan. It
seems that this is an error for Aperçus de taxonomie (Alcan,
1899).
[9] General Hippolyte Sebert was President of the Association française pour l’avancement des sciences, and the Société d’encouragement pour l’industrie nationale. He had
been active in the foundation of the Bureau bibliographique
de Paris. For other biographical information about him see
Editor’s Note 9 to Paper no 17, “Henri La Fontaine”, in International Organisation and Dissemination of Knowledge;
Selected Essays of Paul Otlet.

English translation of Paul Otlet’s text published with the
permission of W. Boyd Rayward. The translation was originally
published as Paul Otlet, “Transformations in the Bibliographical
Apparatus of the Sciences: Repertory–Classification–Office of
Documentation”, in International Organisation and Dissemination of Knowledge; Selected Essays of Paul Otlet, translated and
edited by W. Boyd Rayward, Amsterdam: Elsevier, 1990: 148–156.

public library

http://aaaaarg.org/

McKenzie Wark

Metadata Punk

So we won the battle but lost the war. By “we”, I
mean those avant-gardes of the late twentieth century whose mission was to free information from the
property form. It was always a project with certain
nuances and inconsistencies, but overall it succeeded beyond almost anybody’s wildest dreams. Like
many dreams, it turned into a nightmare in the end,
the one from which we are now trying to awake.
The place to start is with what the situationists
called détournement. The idea was to abolish the
property form in art by taking all of past art and
culture as a commons from which to copy and correct. We see this at work in Guy Debord’s texts and
films. They do not quote from past works, as to do
so acknowledges their value and their ownership.
The elements of détournement are nothing special.
They are raw materials for constructing theories,
narratives, affects of a subjectivity no longer bound
by the property form.
Such a project was recuperated soon enough
back into the art world as “appropriation.” Richard
Prince is the dialectical negation of Guy Debord,
in that appropriation both values the original fragment and contributes not to a subjectivity outside of
property but rather to a career as an art world
star for the appropriating artist. Of such dreams is
mediocrity made.
If there was a more promising continuation of
détournement it had little to do with the art world.
Détournement became a social movement in all but
name. Crucially, it involved an advance in tools,
from Napster to BitTorrent and beyond. It enabled
the circulation of many kinds of what Hito Steyerl
calls the poor image. Often low in resolution, these
détourned materials circulated thanks both to the
compression of information and to the
addition of information. There might be less data
but there’s added metadata, or data about data, enabling its movement.
Needless to say the old culture industries went
into something of a panic about all this. As I wrote
over ten years ago in A Hacker Manifesto, “information wants to be free but is everywhere in chains.”
It is one of the qualities of information that it is indifferent to the medium that carries it and readily
escapes being bound to things and their properties.
Yet it is also one of its qualities that access to it can
be blocked by what Alexander Galloway calls protocol. The late twentieth century was — among other
things — about the contradictory nature of information. It was a struggle between détournement and
protocol. And protocol nearly won.
The culture industries took both legal and technical steps to strap information once more to fixity
in things and thus to property and scarcity. Interestingly,
those legal steps were not just a question of
pressuring governments to make free information
a crime. It was also a matter of using international
trade agreements as a place outside the scope of
democratic oversight to enforce the old rules of property. Here the culture industries join hands with the
drug cartels and other kinds of information-based
industry to limit the free flow of information.
But laws are there to be broken, and so are protocols of restriction such as encryption. These were
only ever delaying tactics, meant to shore up old
monopoly business for a bit longer. The battle to
free information was the battle that the forces of
détournement largely won. Our defeat lay elsewhere.
While the old culture industries tried to put information back into the property form, there were
other kinds of strategy afoot. The winners were not
the old culture industries but what I call the vulture
industries. Their strategy was not to try to stop the
flow of free information but rather to see it as an
environment to be leveraged in the service of creating a new kind of business. “Let the data roam free!”
says the vulture industry (while quietly guarding
their own patents and trademarks). What they aim
to control is the metadata.
It’s a new kind of exploitation, one based on an
unequal exchange of information. You can have the
little scraps of détournement that you desire, in exchange for performing a whole lot of free labor—and
giving up all of the metadata. So you get your little
bit of data; they get all of it, and more importantly,
any information about that information, such as
the where and when and what of it.
It is an interesting feature of this mode of exploitation that you might not even be getting paid for your
labor in making this information—as Trebor Scholz
has pointed out. You are working for information
only. Hence exploitation can be extended far beyond
the workplace and into everyday life. Only it is not
so much a social factory, as the autonomists call it.
This is more like a social boudoir. The whole of social
space is in some indeterminate state between public
and private. Some of your information is private to
other people. But pretty much all of it is owned by
the vulture industry — and via them ends up in the
hands of the surveillance state.
So this is how we lost the war. Making information free seemed like a good idea at the time. Indeed, one way of seeing what transpired is that we
forced the ruling class to come up with these new
strategies in response to our own self-organizing
activities. Their actions are reactions to our initiatives. In this sense the autonomists are right, only
it was not so much the actions of the working class
to which the ruling class had to respond in this case,
as what I call the hacker class. They had to recuperate a whole social movement, and they did. So our
tactics have to change.
In the past we were acting like data-punks. Not
so much “here’s three chords, now form your band.”
More like: “Here’s three gigs, now go form your autonomous art collective.” The new tactic might be
more a question of being metadata-punks. On the one
hand, it is about freeing information about information rather than the information itself. We need
to move up the order of informational density and

114

McKenzie Wark

control. On the other hand, it might be an idea to
be a bit discreet about it. Maybe not everyone needs
to know about it. Perhaps it is time to practice what
Zach Blas calls informatic opacity.
Three projects seem to embody much of this
spirit to me. One I am not even going to name or
discuss, as discretion seems advisable in that case.
It takes matters off the internet and out of circulation among strangers. Ask me about it in person if
we meet in person.
The other two are Monoskop Log and UbuWeb.
It is hard to know what to call them. They are websites, archives, databases, collections, repositories,
but they are also a bit more than that. They could be
thought of also as the work of artists or of curators;
of publishers or of writers; of archivists or researchers. They contain lots of files. Monoskop is mostly
books and journals; UbuWeb is mostly video and
audio. The work they contain is mostly by or about
the historic avant-gardes.
Monoskop Log bills itself as “an educational
open access online resource.” It is a component part
of Monoskop, “a wiki for collaborative studies of
art, media and the humanities.” One commenter
thinks they see the “fingerprint of the curator” but
nobody is named as its author, so let’s keep it that
way. It is particularly strong on Eastern European
avant-garde material. UbuWeb is the work of Kenneth Goldsmith, and is “a completely independent
resource dedicated to all strains of the avant-garde,
ethnopoetics, and outsider arts.”
There are two aspects to consider here. One is the
wealth of free material both sites collect. For anybody
trying to teach, study or make work in the
avant-garde tradition these are very useful resources.
The other is the ongoing selection, presentation and
explanation of the material going on at these sites
themselves. Both of them model kinds of ‘curatorial’
or ‘publishing’ behavior.
For instance, Monoskop has wiki pages, some
better than Wikipedia, which contextualize the work
of a given artist or movement. UbuWeb offers “top
ten” lists by artists or scholars which give insight
not only into the collection but into the work of the
person making the selection.
Monoskop and UbuWeb are tactics for intervening in three kinds of practices, those of the artworld, of publishing and of scholarship. They respond to the current institutional, technical and
political-economic constraints of all three. As it
says in the Communist Manifesto, the forces for social change are those that ask the property question.
While détournement was a sufficient answer to that
question in the era of the culture industries, they try
to formulate, in their modest way, a suitable tactic
for answering the property question in the era of
the vulture industries.
This takes the form of moving from data to metadata, expressed in the form of the move from writing
to publishing, from art-making to curating, from
research to archiving. Another way of thinking this,
suggested by Hiroki Azuma, would be the move from
narrative to database. The object of critical attention
acquires a third dimension, a kind of informational
depth. The objects before us are not just a text or an
image but databases of potential texts and images,
with metadata attached.
The object of any avant-garde is always to practice the relation between aesthetics and everyday
life with a new kind of intensity. UbuWeb and
Monoskop seem to me to be intimations of just
such an avant-garde movement. One that does not
offer a practice but a kind of meta-practice for the
making of the aesthetic within the everyday.
Crucial to this project is the shifting of aesthetic
intention from the level of the individual work to the
database of works. They contain a lot of material, but
not just any old thing. Some of the works available
here are very rare, but not all of them are. It is not
just rarity, or that the works are available for free.
It is more that these are careful, artful, thoughtful
collections of material. There are the raw materials here with which to construct a new civilization.
So we lost the battle, but the war goes on. This
civilization is over, and even its defenders know it.
We live among ruins that accrete in slow motion.
It is not so much a civil war as an incivil war, waged
against the very conditions of existence of life itself.
So even if we have no choice but to use its technologies and cultures, the task is to build another way
of life among the ruins. Here are some useful practices, in and on and of the ruins. ❧

public library

http://midnightnotes.memoryoftheworld.org/

Tomislav Medak

The Future After the Library
UbuWeb and Monoskop’s
Radical Gestures

The institution of the public library has crystallized,
developed and advanced around historical junctures
unleashed by epochal economic, technological and
political changes. A series of crises since the advent
of print have contributed to the configuration of the
institutional entanglement of the public library as
we know it today:01 defined by a publicly available
collection, housed in a public building, indexed and
made accessible with the help of a public catalog, serviced by trained librarians and supported through
public financing. Libraries today embody the idea
of universal access to all knowledge, acting as custodians of a culture of reading, archivists of material
and ephemeral cultural production, go-betweens
of information and knowledge. However, libraries have also embraced a broader spirit of public
service and infrastructure: providing information,
01 For the concept and the full scope of the contemporary library
as institutional entanglement see Shannon Mattern, “Library
as Infrastructure”, Places Journal, accessed April 9, 2015,
https://placesjournal.org/article/library-as-infrastructure/.

education, skills, assistance and, ultimately, shelter
to their communities — particularly their most vulnerable members.
This institutional entanglement, consisting in
a comprehensive organization of knowledge, universally accessible cultural goods and social infrastructure, historically emerged with the rise of (information) science, social regulation characteristic
of modernity and cultural industries. Established
in its social aspect as the institutional exemption
from the growing commodification and economic
barriers in the social spheres of culture, education
and knowledge, it is a result of struggles for institutionalized forms of equality that still reflect the
best in solidarity and universality that modernity
had to offer. Yet, this achievement is marked by
contradictions that beset modernity at its core. Libraries and archives can be viewed as an organon
through which modernity has reacted to the crises
unleashed by the growing production and fixation
of text, knowledge and information through a history of transformations that we will discuss below.
They have been an epistemic crucible for the totalizing formalizations that have propelled both the
advances and pathologies of modernity.
Positioned at a slight monastic distance and indolence toward the forms of pastoral, sovereign or
economic domination that defined the surrounding world that sustained them, libraries could never
close the rift between the universalist aspirations
of knowledge and their institutional compromise.
Hence, they could never avoid being the battlefield
where their own, and modernity’s, ambivalent epistemic
and social character was constantly re-examined and ripped asunder. It is this ambivalent
character that has been a potent motor for critical theory, artistic and political subversion — from
Marx’s critique of political economy, psychoanalysis
and historic avant-gardes, to revolutionary politics.
Here we will examine the formation of the library
as an epistemic and social institution of modernity
and the forms of critical engagement that continue
to challenge the totalizing order of knowledge and
appropriation of culture in the present.
Here Comes the Flood02
Prior to the advent of print, the collections held in
monastic scriptoria, royal courts and private libraries
typically contained a limited number of canonical
manuscripts, scrolls and incunabula. In Medieval
and early Renaissance Europe the canonized knowledge considered necessary for the administration of
heavenly and worldly affairs was premised on reading and exegesis of biblical and classical texts. It is
estimated that by the 15th century in Western Europe
there were no more than 5 million manuscripts held
mainly in the scriptoria of some 21,000 monasteries and a small number of universities. While the
number of volumes had grown sharply from less
than 0.8 million in the 12th century, the number of
monasteries had remained constant throughout that
period. The number of manuscripts read averaged
around 1,000 per million inhabitants, with the total
population of Europe peaking around 60 million.03
All in all, the book collections were small, access was
limited and reading culture played a marginal role.

02 The metaphor of the information flood, here incanted in the
words of Peter Gabriel’s song with apocalyptic overtones, as
well as a good part of the historic background of the development of the index card catalog in the following paragraphs
are based on Markus Krajewski, Paper Machines: About
Cards & Catalogs, 1548–1929 (MIT Press, 2011). The organizing idea of Krajewski’s historical account, that the index
card catalog can be understood as a Turing machine avant
la lettre, served as a starting point for the understanding
of the library as an epistemic institution developed here.


03 For an economic history of the book in Western Europe
see Eltjo Buringh and Jan Luiten Van Zanden, “Charting
the ‘Rise of the West’: Manuscripts and Printed Books in
Europe, A Long-Term Perspective from the Sixth through
Eighteenth Centuries”, The Journal of Economic History 69,
No. 02 (June 2009): 409–45, doi:10.1017/S0022050709000837,
particularly Tables 1-5.

The proliferation of written matter after the invention of mechanical movable type printing would
greatly increase the number of books, but also transform the
patterns of literacy and knowledge production. Already in the first fifty years after Gutenberg’s invention, 12 million volumes were printed, and from
this point onwards the output of printing presses
grew exponentially to 700 million volumes in the
18th century. In the aftermath of the explosion in
book production the cost of producing and buying
books fell drastically, reducing the economic barriers to literacy, but also creating a material vector
for a veritable shift of the epistemic paradigm. The
emerging reading public was gaining access to the
new works of a nascent Enlightenment movement,
ushering in the modern age of science. In parallel
with those larger epochal transformations, the explosion of print also created a rising tide of new books
that suddenly inundated the libraries. The libraries
now had to contend both with the orders-of-magnitude greater volume of printed matter and the
growing complexity of systematically storing, ordering, classifying and tracking all of the volumes
in their collection. A once almost static collection
of canonical knowledge became an ever expanding
dynamic flux. This flood of new books, the first of
three to follow, presented principled, infrastructural and organizational challenges to the library that
radically transformed and coalesced its functions.
The epistemic shift created by this explosion of
library holdings led to a revision of the assumption
that the library is organized around a single holy
scripture and a small number of classical sources.
Coextensive with the emergence and multiplication of new sciences, the books that were entering
the library now covered an ever diversified scope
of topics and disciplines. And the sheer number of
new acquisitions demanded the physical expansion of libraries, which in turn required a radical
rethinking of the way the books were stored, displayed and indexed. In fact, the flood caused by the
printing press was nothing short of a revolution in
the organization, formalization and processing of
information and knowledge. This becomes evident
in the changes that unfolded between the 16th and
the early 20th century in the cataloging of library collections.

The initial listings of books were kept in bound
volumes, books in their own right. But as the number of items arriving into the library grew, the constant need to insert new entries made the bound
book format increasingly impractical for library
catalogs. To make things more complicated still,
the diversification of the printed matter demanded
a richer bibliographic description that would allow
better comprehension of what was contained in the
volumes. Alongside the name of the author and the
book’s title, the description now needed to include
the format of the volume, the classification of the
subject matter and the book’s location in the library.
As the pace of new arrivals accelerated, the effort to
create a library catalog became unending, causing a
true crisis in the emerging librarian profession. This
would result in a number of physical and epistemic
innovations in the organization and formalization
of information and knowledge. The requirement
to constantly rearrange the order of entries in the
listing led to the eventual unbinding of the bound
catalog into separate slips of paper and finally to the
development of the index card catalog. The unbound
index cards and their floating rearrangement, not
unlike that of the movable type, would in turn result in the design of filing cabinets. From Conrad
Gessner’s Bibliotheca Universalis, a three-volume
book-format catalog of around 3,000 authors and
10,000 texts, arranged alphabetically and topically,
published in the period 1545–1548, through Gottfried Wilhelm Leibniz’s proposals for a universal library
during his tenure at the Wolfenbüttel library in the
late 17th century, to Gottfried van Swieten’s catalog
of the Viennese court library, the index card catalog and the filing cabinets would develop almost to
their present form.04
The unceasing inflow of new books into the library
prompted the need to spatially organize and classify
the arrangement of the collection. The simple addition of new books to the shelves by size, canonical
relevance or alphabetical order made little sense
in a situation where the corpus of printed matter
was quickly expanding and no individual librarian
could retain an intimate overview of the library’s
entire collection. The inflow of books required that
the brimming shelf-space be planned ahead, while
the increasing number of expanding disciplines required that the collection be subdivided into distinct
sections by fields. First the shelves became classified
and then the books individually received a unique
identifier. With the completion of the Josephinian
catalog in the Viennese court library, every book became compartmentalized according to a systematic
plan of sciences and assigned a unique sequence of
a Roman numeral, a Roman letter and an Arabic
numeral by which it could be tracked down regardless of its physical location.05 The physical location
of the shelves in the library no longer needed to be
reflected in the ordering of the catalog, and the catalog became a symbolic representation of the freely
re-arrangeable library. In the technological lingo of
today, the library required storage, index, search
and address in order to remain navigable. It is this
formalization of a universal system of classification
of objects in the library with the relative location of
objects and re-arrangeable index that would then in
1876 receive its present standardized form in Melvil
Dewey’s Decimal System.

04 Krajewski, Paper Machines, op. cit., chapter 2.
05 Ibid., 30.

The development of the library as an institution of
public access and popular literacy did not proceed
apace with the development of its epistemic aspects.
It was only a series of social upheavals and transformations in the course of the 18th and 19th centuries
that would bring about another flood of books and
political demands, pushing the library to become
embedded in an egalitarian and democratic political culture. The first big step in that direction came
with the decision of the French revolutionary National Assembly of 2 November 1789 to seize all
book collections from the Church and aristocracy.
Millions of volumes were transferred to the Bibliothèque Nationale and local libraries across France.
In parallel, particularly in England, capitalism was
on the rise. It massively displaced the impoverished rural population into growing urban centers,
propelled the development of industrial production and, by the mid-19th century, introduced the
steam-powered rotary press into the book business.
As books became easier to produce, and were mass-produced,
the commercial subscription libraries catering to the
better-off parts of society blossomed. This brought
the class aspect of the nascent demand for public
access to books to the fore. After the failed attempts
to introduce universal suffrage and end the system
of political representation based on property entitlements in the 1830s and 1840s, the English Chartist
movement started to open reading rooms and cooperative lending libraries that would quickly become
a popular hotbed of social exchanges between the
lower classes. In the aftermath of the revolutionary
upheavals of 1848, the fearful ruling classes heeded
the demand for tax-financed public libraries, hoping
that the access to literature and edification would
ultimately hegemonize the working class for the
benefit of capitalism’s culture of self-interest and
competition.06
The Avant-gardes in the Library
As we have just demonstrated, the public library
in its epistemic and social aspects coalesced in the
context of the broader social transformations of
modernity: early capitalism and processes of nation-building in Europe and the USA. These transformations were propelled by the advancement of
political and economic rationalization, public and
business administration, statistical and archival
procedures. Archives underwent a corresponding and largely concomitant development with the
libraries, responding with a similar apparatus of
classification and ordering to the exponential expansion of administrative records documenting the
social world and to the historicist impulse to capture the material traces of past events. Overlaying
the spatial organization of documentation, rules
of its classification and symbolic representation of
the archive in reference tools, they tried to provide
a formalization adequate to the passion for capturing historical or present events. Characteristic
of the ascendant positivism of the 19th century, the
archivists’ and librarians’ epistemologies harbored
a totalizing tendency that would become subject to
subversion and displacement in the first decades of
the 20th century.

06 For the social history of the public library see Matthew Battles,
Library: An Unquiet History (Random House, 2014) chapter
5: “Books for all”.

The assumption that the classificatory form can
fully capture the archival content would become
destabilized over and over by the early avant-gardist
permutations of formal languages of classification:
dadaist montage of the contingent compositional
elements, surrealist insistence on the unconscious
surpluses produced by automatized formalized language, constructivist foregrounding of dynamic and
spatialized elements in the acts of perception and
cognition of an artwork.07 The material composition
of the classified and ordered objects already contained formalizations deposited into those objects
by the social context of their provenance or projected onto them by the social situation of encounter
with them. Form could become content and content
could become form. The appropriations, remediations and displacements exacted by the neo-avant-gardes in the second half of the 20th century produced
subversions, resignifications and simulacra
that only further blurred the lines between histories
and their construction, dominant classifications and
their immanent instabilities.

07 Sven Spieker, The Big Archive: Art from Bureaucracy (MIT
Press, 2008) provides a detailed account of strategies that
the historic avant-gardes and post-war art have developed toward the classificatory and ordering regime of the
archive.

Where does the library fit into this trajectory? Operating around an uncertain and politically embattled universal principle of public access to knowledge
and organization of information, libraries continued being sites of epistemic and social antagonisms,
adaptations and resilience in response to the challenges created by the waves of radical expansion of
textuality and conflicting social interests between
the popular reading culture and the commodification of cultural consumption. This precarious position is presently being made evident by the third
big flood — after those unleashed by movable type
printing and the social context of industrial book
production — that is unfolding with the transition
of the book into the digital realm. Both the historic
mode of the institutional regulation of access and
the historic form of epistemic classification are
swept up in this transformation. While the internet
has made possible a radically expanded access to
digitized culture and knowledge, the vested interests of cultural industries reliant on copyright for
their control over cultural production have deepened the separation between cultural producers and
their readers, listeners and viewers. While the hypertextual capacity for cross-reference has blurred
the boundaries of the book, digital rights management technologies have transformed e-books into
closed silos. Both the decommodification of access
and the overcoming of the reified construct of the
self-enclosed work in the form of a book come at
the cost of illegality.
Even the avant-gardes in all their inappropriable
and idiosyncratic recalcitrance fall no less under
the legally delimited space of copyrightable works.
As they shift format, new claims of ownership and
appropriation are built. Copyright is a normative
classification that is totalizing, regardless of the
effects of leaky networks speaking to the contrary.
Few efforts have insisted on the subverting of juridical classification by copyright more lastingly than
the UbuWeb archive. Espousing the avant-gardes’
ethos of appropriation, for almost 20 years it has
collected and made accessible the archives of the
unknown, outsider, rare and canonized avant-gardes and contemporary art that would otherwise have remained reserved for the vaults and restricted access
channels of esoteric markets, selective museological
presentations and institutional archives. Knowing
that asking to publish would amount to aligning itself with the totalizing logic of copyright, UbuWeb
has shunned the permission culture. At the level of
poetical operation, as a gesture of displacing the cultural archive from a regime of limited, into a regime
of unlimited access, it has created provocations and
challenges directed at the classifying and ordering
arrangements of property over cultural production.
One can only assume that as such it has become a
mechanism for small acts of treason for the artists,
who, short of turning their back fully on the institutional arrangements of the art world they inhabit,
use UbuWeb to release their own works into unlimited circulation on the net. Sometimes there might
be no way or need to produce a work outside the
restrictions imposed by those institutions, just as
sometimes it is impossible for academics to avoid
the contradictory world of academic publishing,
yet that is still no reason to keep one’s allegiance to
their arrangements.
At the same time UbuWeb has played the game
of avant-gardist subversion: “If it doesn’t exist on
the internet, it doesn’t exist”. Provocation is most
effective when it is ignorant of the complexities of
the contexts that it is directed at. Its effect starts
where fissures in the defense of the opposition start
to show. By treating UbuWeb as massive evidence
for the internet as a process of reappropriation, a
process of “giving to all”, its volunteering spiritus
movens, Kenneth Goldsmith, has been constantly rubbing copyright apologists up the wrong way.
Rather than producing qualifications, evasions and
ambivalences, straightforward affirmation of copying,
plagiarism and reproduction as a dominant
yet suppressed mode of operation of digital culture re-enacts the avant-gardes’ gesture of taking
no hostages from the officially sanctioned systems
of classification. By letting the incumbents of control over cultural production react to the norm of
copying, you let them struggle to dispute the norm
rather than you having to try to defend the norm.
UbuWeb was an early-comer. Starting in 1996
and still functioning today on seemingly similar
technology, it’s a child of the early days of the World
Wide Web and the promissory period of the experimental internet. It’s resolutely Web 1.0, with
a single maintainer, idiosyncratically simple in its
layout and programmatically committed to
eventual obsolescence and sudden abandonment.
No platform, no generic design, no widgets, no
kludges and no community features. Only Beckett
avec links. Endgame.
A Book is an Index is an Index is an Index...
Since the first book flood, the librarian dream of
epistemological formalization has revolved around
the aspiration to cross-reference all the objects in
the collection. Within the physical library the topical designation has been relegated to the confines of
the index card catalog that remained isolated from the
structure of citations and indexes in the books themselves. With the digital transition of the book, the
time-shifted hypertextuality of citations and indexes
became realizable as the immediate cross-referentiality of the segments of individual texts to segments
of other texts and other digital artifacts across the now
permeable boundaries of the book.
Developed as a wiki for collaborative studies of
art, media and the humanities, Monoskop.org took
up the task of mapping and describing avant-gardes and media art in Europe. In its approach both
indexical and encyclopedic, it is an extension of
the collaborative editing made possible by wiki
technology. Wikis rose to prominence in the early
2000s allowing everyone to edit and extend websites running on that technology by mastering a
very simple markup language. Wikis have been the
harbinger of a democratization of web publishing
that would eventually produce the largest collaborative
website on the internet, Wikipedia, as
well as a number of other collaborative platforms.
Monoskop.org embraces the encyclopedic spirit of
Wikipedia, focusing on its own specific topical and
topological interests. However, from its earliest days
Monoskop.org has also developed as a form of index
that maps out places, people, artworks, movements,
events and venues that compose the dense network
of European avant-gardes and media art.
If we take the index as a formalization of cross-referential relations between names of people, titles
of works and concepts that exist in the books and
across the books, what emerges is a model of a relational database reflecting the rich mesh of cultural
networks. Each book can serve as an index linking
its text to people, other books, segments in them.
To provide a paradigmatic demonstration of that
idea, Monoskop.org has assembled an index of all
persons in Friedrich Kittler’s Discourse Networks,
with each index entry linking both to its location
in the digital version of the book displayed on the
aaaaarg.org archive and to relevant resources for
those persons on Monoskop.org and the internet. Hence, each object in the library, an index
in its own right, potentially allows one to initiate
the relational re-classification and re-organization
of all other works in the library through linkable
information.
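
The idea of a book index as a relational structure can be made concrete with a minimal sketch. The record type, the function names and the sample rows below are all hypothetical placeholders invented for illustration; they do not reproduce Monoskop.org's actual index of Discourse Networks or any real data. The sketch only assumes that each index entry ties a name to a location in a book.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """One row of a book index: a name and where it occurs."""
    name: str   # person, work or concept being indexed
    book: str   # book in which the name occurs
    page: int   # location of the occurrence within that book


# Invented placeholder rows, not taken from any real index.
ENTRIES = [
    IndexEntry("Person X", "Book A", 12),
    IndexEntry("Person X", "Book B", 101),
    IndexEntry("Person Y", "Book A", 12),
    IndexEntry("Person Y", "Book C", 7),
]


def occurrences(entries):
    """Map each name to every (book, page) location where it appears."""
    by_name = defaultdict(list)
    for entry in entries:
        by_name[entry.name].append((entry.book, entry.page))
    return dict(by_name)


def related_books(entries, book):
    """Return the books that share at least one indexed name with `book`."""
    names = {e.name for e in entries if e.book == book}
    return sorted({e.book for e in entries if e.name in names and e.book != book})


if __name__ == "__main__":
    print(occurrences(ENTRIES)["Person X"])
    print(related_books(ENTRIES, "Book A"))
```

Starting from any single book, the second function reaches every other book that mentions the same names, which is one modest way of reading the claim that each object in the library can initiate a re-organization of all the others.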
Fundamental to the works of the post-socialist
retro-avant-gardes of the last couple of decades has
been the re-writing of a history of art in reverse.
In the works of IRWIN, Laibach or Mladen Stilinović, or comparable work of Komar & Melamid,
totalizing modernity is detourned by re-appropriating the forms of visual representation and classification that the institutions of modernity used to
construct a linear historical narrative of evolutions
and breaks in the 19th and 20th centuries. Genealogical
tables, events, artifacts and discourses of the past
were re-enacted, over-affirmed and displaced to
open up the historic past relegated to the archives
to an understanding that transformed the present
into something radically uncertain. The efforts of
Monoskop.org in digitizing the artifacts of the
20th century avant-gardes and playing with the
epistemic tools of early book culture are a parallel
gesture, with a technological twist. If big data and
the control over information flows of today increasingly naturalize and re-affirm the 19th century
positivist assumptions of the steerability of society,
then the endlessly recombinant relations and affiliations between cultural objects threaten to overflow
that recurrent epistemic framework of modernity’s
barbarism in its cybernetic form.
The institution of the public library finds itself
today under a double attack. One unleashed by
the dismantling of the institutionalized forms of
social redistribution and solidarity. The other by
the commodifying forces of expanding copyright
protections and digital rights management, control
over the data flows and command over the classification and order of information. In a world of
collapsing planetary boundaries and unequal development, those who control the epistemic order
control the future.08 The Googles and the NSAs run
on capturing totality — the world’s knowledge and
communication made decipherable, organizable and
controllable. The instabilities of the epistemic order
that the library continues to instigate at its margins
contribute to keeping the future open beyond the
script of ‘commodify and control’. In their acts of
re-appropriation UbuWeb and Monoskop.org are
but a reminder of the resilience of libraries’ instability that signals toward a future that can be made
radically open. ❧

08 In his article “Controlling the Future—Edward Snowden and
the New Era on Earth” (accessed April 13, 2015,
http://www.eurozine.com/articles/2014-12-19-altvater-en.html), Elmar
Altvater makes a comparable argument that the efforts of
the “Five Eyes” to monitor the global communication flows,
revealed by Edward Snowden, and the control of the future
social development defined by the urgency of mitigating the
effects of the planetary ecological crisis cannot be thought
apart.


public library

http://kok.memoryoftheworld.org

Public Library
www.memoryoftheworld.org

Publishers
What, How & for Whom / WHW
Slovenska 5/1 • HR-10000 Zagreb
+385 (0) 1 3907261
whw@whw.hr • www.whw.hr
ISBN 978-953-55951-3-7 [Što, kako i za koga/WHW]
Multimedia Institute
Preradovićeva 18 • HR-10000 Zagreb
+385 (0)1 4856400
mi2@mi2.hr • www.mi2.hr
ISBN 978-953-7372-27-9 [Multimedijalni institut]
Editors
Tomislav Medak • Marcell Mars • What, How & for Whom / WHW
Copy Editor
Dušanka Profeta [Croatian]
Anthony Iles [English]
Translations
Una Bauer
Tomislav Medak
Dušanka Profeta
W. Boyd Rayward
Design & layout
Dejan Kršić @ WHW
Typography
MinionPro [robert slimbach • adobe]

English translation of Paul
Otlet’s text published with the permission of W. Boyd
Rayward. The translation was originally published as
Paul Otlet, “Transformations in the Bibliographical
Apparatus of the Sciences: Repertory–Classification–Office
of Documentation”, in International Organisation and
Dissemination of Knowledge; Selected Essays of Paul Otlet,
translated and edited by W. Boyd Rayward, Amsterdam:
Elsevier, 1990: 148–156. ❧
format / size
120 × 200 mm
pages
144
Paper
Agrippina 120 g • Rives Laid 300 g
Printed by
Tiskara Zelina d.d.
Print Run
1000
Price
50 kn
May • 2015

This publication, realized along with the exhibition
Public Library in Gallery Nova, Zagreb 2015, is a part of
the collaborative project This Is Tomorrow. Back to Basics:
Forms and Actions in the Future organized by What, How
& for Whom / WHW, Zagreb, Tensta Konsthall, Stockholm
and Latvian Center for Contemporary Art / LCCA, Riga, as a
part of the book edition Art As Life As Work As Art. ❧

Supported by
Office of Culture, Education and Sport of the City of Zagreb
Ministry of Culture of the Republic of Croatia
Croatian Government Office for Cooperation with NGOs
Creative Europe Programme of the European Commission.
National Foundation for Civil Society Development
Kultura Nova Foundation

This project has been funded with support
from the European Commission. This publication reflects
the views only of the authors, and the Commission
cannot be held responsible for any use which may be
made of the information contained therein. ❧
Publishing of this book is enabled by financial support of
the National Foundation for Civil Society Development.
The content of the publication is the responsibility of
its authors and as such does not necessarily reflect
the views of the National Foundation. ❧
This project is financed
by the Croatian Government Office for Cooperation
with NGOs. The views expressed in this publication
are the sole responsibility of the publishers. ❧

This book is licensed under a Creative
Commons Attribution–ShareAlike 4.0
International License. ❧



 
