Medak, Sekulic & Mertens
Book Scanning and Post-Processing Manual Based on Public Library Overhead Scanner v1.2
2014

PUBLIC LIBRARY
&
MULTIMEDIA INSTITUTE

BOOK SCANNING & POST-PROCESSING MANUAL
BASED ON PUBLIC LIBRARY OVERHEAD SCANNER

Written by:
Tomislav Medak
Dubravka Sekulić
With help of:
An Mertens

Creative Commons Attribution - Share-Alike 3.0 Germany

TABLE OF CONTENTS

Introduction
I. Photographing a printed book
II. Getting the image files ready for post-processing
III. Transformation of source images into .tiffs
IV. Optical character recognition
V. Creating a finalized e-book file
VI. Cataloging and sharing the e-book
Quick workflow reference for scanning and post-processing
References

INTRODUCTION:
BOOK SCANNING - FROM PAPER BOOK TO E-BOOK
Initial considerations when deciding on a scanning setup
Book scanning tends to be a fragile and demanding process. Many things can go wrong or produce
results of varying quality from book to book or page to page, and resolving the issues that occur
requires experience or technical skill. Cameras can fail to trigger, components can fail to
communicate, files can get corrupted in transfer, storage cards may not get purged, focus can fail
to lock, lighting conditions can change. There is a trade-off between automation, which is prone to
instability, and robustness, which tends to be time-consuming.
Your initial choice of book scanning setup will have to take these trade-offs into consideration. If
your scanning community is confined to your hacklab, you won't be risking much if technological
sophistication and integration fail to function smoothly. But if you're aiming at a broad community
of users, with varying levels of technological skill and patience, you want to create as much
time-saving automation as possible on the condition of keeping maximum stability. Furthermore, if
the time that individual members of your scanning community can contribute is limited, you might
also want to divide some of the tasks between users according to their different skill levels.
This manual breaks down the process of digitization into a general description of the steps in the
workflow leading from the printed book to a digital e-book. In a concrete situation, each step can
be addressed in various ways depending on the scanning equipment, software, hacking skills and
user skill level available to your book scanning project. Several of those steps can be handled by
a single piece of equipment or software, or you might need to use a number of them - your mileage
will vary. Therefore, the manual will try to indicate the design choices you have in the process of
planning your workflow and should help you decide which design is best for your situation.
Introducing book scanner designs
Book scanning starts with capturing digital image files on the scanning equipment. There are three
principal types of book scanner designs:
• flatbed scanner
• single camera overhead scanner
• dual camera overhead scanner
Conventional flatbed scanners are widely available. However, they require the book to be spread
wide open and pressed down against the platen in order to break the resistance of the binding and
sufficiently expose the inner margin of the text. This makes flatbed scanning the approach most
destructive to the book, as well as imprecise and slow.
Therefore, book scanning projects across the globe have taken to custom designing improvised
setups or scanner rigs that are less destructive and better suited for fast turning and capturing of
pages. Designs abound. Most include:
• one or two digital photo cameras of lesser or higher quality to capture the pages,
• a transparent V-shaped glass or Plexiglas platen to press the open book against a V-shaped
cradle, and
• a light source.

The go-to web resource to help you make an informed decision is the DIY book scanning
community at http://diybookscanner.org. A good place to start is their intro
(http://wiki.diybookscanner.org/) and scanner build list
(http://wiki.diybookscanner.org/scanner-build-list).
Book scanners with a single camera are substantially cheaper, but they come with the added
difficulty of de-warping page images distorted by the angle at which the pages are photographed,
which can sometimes be hard to correct in post-processing. Hence, in this introductory chapter we'll
focus on two-camera designs, where the camera lens stands roughly parallel to the page. However,
with a bit of adaptation these instructions can be used to work with any other setup.
The Public Library scanner
The focus of this manual is the scanner built for the Public Library project, designed by Voja
Antonić (see Illustration 1). The Public Library scanner was built with immediate use by a wide
community of users in mind. Hence, the principal considerations in designing it were robustness,
ease of use and a distributed editing process, rather than sophistication.
The board designs can be found here:
http://www.memoryoftheworld.org/blog/2012/10/28/our-beloved-bookscanner. The current iterations
are using two Canon 1100 D cameras with the kit lens Canon EF-S 18-55mm 1:3.5-5.6 IS. Cameras are
auto-charging.

Illustration 1: Public Library Scanner
The scanner operates by automatically lowering the Plexiglas platen, illuminating the page and then
triggering camera shutters. The turning of pages and the adjustments of the V-shaped cradle holding
the book are manual.
The scanner is operated by a two-button controller (see Illustration 2). The upper, smaller button
breaks the capture process into two steps: the first click lowers the platen, increases the light
level and allows you to adjust the book or the cradle; the second click triggers the cameras and
lifts the platen.
The lower button has two modes. A quick click will execute the whole capture process in one go. But
if you hold it pressed longer, it will lower the platen, allowing you to adjust the book and the
cradle, and lift it without triggering the cameras when you press again.

Illustration 2: A two-button controller

More on this manual: steps in the book scanning process
The book scanning process in general can be broken down into six steps, each of which is dealt with
in a separate chapter of this manual:
I. Photographing a printed book
II. Getting the image files ready for post-processing
III. Transformation of source images into .tiffs
IV. Optical character recognition
V. Creating a finalized e-book file
VI. Cataloging and sharing the e-book
A step by step manual for Public Library scanner
This manual is primarily meant to provide a detailed description and step-by-step instructions for an
actual book scanning setup -- based on Voja Antonić's scanner design described above. This is a
two-camera overhead scanner, currently equipped with two Canon 1100 D cameras with the EF-S
18-55mm 1:3.5-5.6 IS kit lens. It can scan books of up to A4 page size.
The post-processing in this setup is based on a semi-automated transfer of files to a GNU/Linux
personal computer and on the use of free software for image editing, optical character recognition
and finalization of an e-book file. It was initially developed for the HAIP festival in Ljubljana in
2011 and perfected later at MaMa in Zagreb and Leuphana University in Lüneburg.
Compared with the highly automated and sophisticated scanner hacks developed at various hacklabs,
the Public Library scanner is characterized by a somewhat less automated yet distributed scanning
process. A brief overview of one such scanner, developed at the Hacker Space Bruxelles, is also
included in this manual.
The Public Library scanning process proceeds in the following discrete steps:

1. creating digital images of pages of a book,
2. manual transfer of image files to the computer for post-processing,
3. automated renaming of files, ordering of even and odd pages, rotation of images and upload to a
cloud storage,
4. manual transformation of source images into .tiff files in ScanTailor,
5. manual optical character recognition and creation of PDF files in gscan2pdf.
The detailed description of the Public Library scanning process follows below.
The Bruxelles hacklab scanning process
For purposes of comparison, here we'll briefly reference the scanner built by the Bruxelles hacklab
(http://hackerspace.be/ScanBot). It is a dual camera design too. With some differences in hardware functionality
(Bruxelles scanner has automatic turning of pages, whereas Public Library scanner has manual turning of pages), the
fundamental difference between the two is in the post-processing - the level of automation in the transfer of images
from the cameras and their transformation into PDF or DjVu e-book format.
The Bruxelles scanning process is different in so far as the cameras are operated by a computer and the images are
automatically transferred, ordered and made ready for further post-processing. The scanner is home-brew, but the
process is for advanced DIY'ers. If you want to know more on the design of the scanner, contact Michael Korntheuer at
contact@hackerspace.be.
The scanning and post-processing is automated by a single Python script that does all the work:
http://git.constantvzw.org/?p=algolit.git;a=tree;f=scanbot_brussel;h=81facf5cb106a8e4c2a76c048694a3043b158d62;hb=HEAD
The scanner uses two Canon point and shoot cameras. Both cameras are connected to the PC with USB. They both run
PTP/CHDK (Canon Hack Development Kit). The scanning sequence is the following:
1. Script sends CHDK command line instructions to the cameras
2. Script sorts out the incoming files. This part is tricky. There is no reliable way to make a distinction between the left
and right camera, only between which camera was recognized by USB first. So the protocol is to always power up the
left camera first. See the instructions with the source code.
3. Collect images in a PDF file
4. Run script to OCR a .PDF file to plain .TXT file:
http://git.constantvzw.org/?p=algolit.git;a=blob;f=scanbot_brussel/ocr_pdf.sh;h=2c1f24f9afcce03520304215951c65f58c0b880c;hb=HEAD

I. PHOTOGRAPHING A PRINTED BOOK
Technologically the most demanding part of the scanning process is creating digital images of the
pages of a printed book. It's a process that differs greatly from scanner design to scanner design
and from camera to camera. Therefore, here we will focus strictly on the process with the Public
Library scanner.
Operating the Public Library scanner
0. Before you start:
Better and more consistent photographs lead to a more optimized and faster post-processing and a
higher quality of the resulting digital e-book. In order to guarantee the quality of images, before you
start it is necessary to set up the cameras properly and prepare the printed book for scanning.
a) Loosening the book
Depending on the type and quality of binding, some books tend to be too resistant to opening fully
to reveal the inner margin under the pressure of the scanner platen. It is thus necessary to “break in”
the book before starting in order to loosen the binding. The best way is to open it as wide as
possible in multiple places in the book. This can be done against the table edge if the book is more
rigid than usual. (Warning – “breaking in” might create irreversible creasing of the spine or lead to
some pages breaking loose.)
b) Switch on the scanner
You start the scanner by pressing the main switch or plugging the power cable into the scanner.
This will also turn on the overhead LED lights.

c) Setting up the cameras
Place the cameras onto tripods. You need to move the lever on the tripod's head to allow the tripod
plate screwed to the bottom of the camera to slide into its place. Secure the lock by turning the lever
all the way back.
If automatic chargers for the cameras are provided, open the battery lid on the bottom of each
camera, plug in the automatic charger and close the lid.
Switch on each camera using the lever on the top right side of the camera's body and set it to the
aperture priority (Av) mode on the mode dial above the lever (see Illustration 3). Use the main dial
just above the shutter button on the front side of the camera to set the aperture value to F8.0.

Illustration 3: Mode and main dial, focus mode switch, zoom
and focus ring
On the lens, turn the focus mode switch to manual (MF) and turn the large zoom ring to set the value
exactly midway between 24 and 35 mm (see Illustration 3). Try to set both cameras the same.
To focus each camera, open a book on the cradle, lower the platen by holding the big button on the
controller, and turn on the live view on the camera LCD by pressing the live view switch (see
Illustration 4). Now press the magnification button twice and use the focus ring on the front of the
lens to get a clear image view.

Illustration 4: Live view switch and magnification button

d) Connecting the cameras
Now connect the cameras to the remote shutter trigger cables that can be found lying on each side
of the scanner. They need to be plugged into a small round port hidden behind a protective rubber
cover on the left side of the cameras.
e) Placing the book into the cradle and double-checking the cameras
Open the book in the middle and place it on the cradle. Press and hold the large button on the
controller to lower the Plexiglas platen without triggering the cameras. Move the cradle so that
the platen fits into the middle of the book.
Turn on the live view on the cameras' LCD to see if the pages fit into the image and if the
cameras are positioned parallel to the page.
f) Double-check storage cards and batteries
It is important that both storage cards in the cameras are empty before you start scanning, so as
not to mess up the page sequence when merging photos from the left and the right camera in
post-processing. To double-check, press the play button on each camera. If there are photos left
from a previous scan, erase them: press the menu button, select the fifth menu from the left and
then select 'Erase Images' -> 'All images on card' -> 'OK'.
If no automatic chargers are provided, double-check on the information screen that the batteries
are charged. They should be fully charged before you start scanning a new book.

g) Turn off the light in the room
Lighting conditions during scanning should be as constant as possible. To reduce glare and achieve
maximum quality, remove any source of light that might reflect off the Plexiglas platen. Preferably
turn off the light in the room or isolate the scanner with the black cloth provided.

1. Photographing a book
Now you are ready to start scanning. Place the book closed in the cradle and lower the platen by
holding the large button on the controller pressed (see Illustration 2). Adjust the position of the
cradle and lift the platen by pressing the large button again.
To scan, you can either use the small button on the controller to lower the platen, adjust the book
and then press it again to trigger the cameras and lift the platen, or you can make a short press
on the large button to do it all in one go.
ATTENTION: When the cameras are triggered, the shutter sound has to be heard coming
from both cameras. If one camera is not working, it's best to reconnect both cameras (see
Section 0), make sure the batteries are charged or adapters are connected, erase all images
and restart.
A mistake made in the photographing requires a lot of work in the post-processing, so it's
much quicker to repeat the photographing process.
If you make a mistake while flipping pages, or any other mistake, go back and scan from the page
you missed or incorrectly scanned. Note down the page where the error occurred; the redundant
images will be removed in post-processing.
ADVICE: The scanner has a digital counter. By turning the dial forward and backward, you
can set it to tell you what page you should be scanning next. This should help you avoid
missing a page due to a distraction.
While scanning, move the cradle a bit to the left from time to time, making sure that the tip of
the V-shaped platen is aligned with the center of the book and the inner margin is exposed enough.

II. GETTING THE IMAGE FILES READY FOR POST-PROCESSING
Once the book pages have been photographed, they have to be transferred to the computer and
prepared for post-processing. With two-camera scanners, the capturing process will result in two
separate sets of images -- odd and even pages -- coming from the left and right cameras
respectively. You will need to rename and reorder them accordingly, rotate them into a vertical
position and collate them into a single sequence of files.
a) Transferring image files
For the transfer of files, your principal design choices are either to remove the memory cards
from the cameras and copy the files to the computer via a card reader, or to transfer them via a
USB cable. The latter process can be automated by remotely operating your cameras from a computer.
However, this can be done only with a certain number of Canon cameras (http://bit.ly/16xhJ6b) that
can be hacked to run the open Canon Hack Development Kit firmware (http://chdk.wikia.com).
After transferring the files, you want to erase all the image files on the camera memory card, so
that they do not end up mixed into the scan of the next book.
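Since files can get corrupted in transfer, it may be worth erasing the card only after the copies
have been checked. The following is a minimal Python sketch of that idea, not the procedure used
by the Public Library workflow: the function names and folder layout are assumptions made for
illustration, and it simply compares checksums of each original and its copy before deleting
anything.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_verified(card_dir, target_dir, erase=False):
    """Copy every .jpg from card_dir to target_dir and verify each copy.

    Originals are deleted only if erase=True and only after every copy has
    been verified. Both folder arguments are hypothetical examples.
    """
    card_dir, target_dir = Path(card_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in sorted(card_dir.iterdir()):
        if source.suffix.lower() != ".jpg":
            continue  # leave non-image files on the card untouched
        destination = target_dir / source.name
        shutil.copy2(source, destination)
        if sha256_of(source) != sha256_of(destination):
            raise IOError(f"copy of {source.name} does not match the original")
        copied.append(destination)
    if erase:
        for destination in copied:
            (card_dir / destination.name).unlink()
    return copied
```

A call such as transfer_verified("/media/card/DCIM/100CANON", "book_right", erase=True) would then
stand in for the manual copy-and-erase step; the paths here are placeholders, not the ones your
system will use.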
b) Renaming image files
As the left and right cameras are typically operated in sync, the photographing process results in
two separate sets of images, with even and odd pages respectively, that have completely different
file names and potentially the same time stamps. So before you collate the page images in the order
in which they appear in the book, you want to rename the files so that the first image comes from
the right camera, the second from the left camera, the third again from the right camera and so on.
You probably want to do a batch renaming, where your right camera files start with n and are offset
by an increment of 2 (e.g. page_0000.jpg, page_0002.jpg,...) and your left camera files start with
n+1 and are also offset by an increment of 2 (e.g. page_0001.jpg, page_0003.jpg,...).
Batch renaming can be completed from your file manager, on the command line or with a number of
dedicated tools (e.g. GPrename, rename, cuteRenamer on GNU/Linux).
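The numbering scheme above can be sketched in a few lines of Python. This is only an illustration
under stated assumptions: the function names are made up for this example, and it assumes that the
camera file names sort alphabetically in the order in which the pages were captured.

```python
import shutil
from pathlib import Path


def plan_renames(right_files, left_files, start=0):
    """Pair each source file with its new name.

    Right-camera files get start, start+2, ...; left-camera files get
    start+1, start+3, ... Files are taken in sorted order, which is assumed
    to match the capture order.
    """
    plan = []
    for index, source in enumerate(sorted(right_files)):
        plan.append((source, f"page_{start + 2 * index:04d}.jpg"))
    for index, source in enumerate(sorted(left_files)):
        plan.append((source, f"page_{start + 2 * index + 1:04d}.jpg"))
    return plan


def apply_renames(plan, target_dir):
    """Move every file in the plan into target_dir under its new name."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    for source, new_name in plan:
        shutil.move(str(source), str(target_dir / new_name))
```

Building the plan separately from applying it lets you print and inspect the pairs before any file
is touched, which is useful because a wrong offset is tedious to undo.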
c) Rotating image files
Before you collate the renamed files, you might want to rotate them. This is a step that can be done
also later in the post-processing (see below), but if you are automating or scripting your steps this is
a practical place to do it. The images leaving your cameras will be positioned horizontally. In order
to position them vertically, the images from the camera on the right will have to be rotated by 90
degrees counter-clockwise, the images from the camera on the left will have to be rotated by 90
degrees clockwise.
Batch rotating can be completed in a number of photo-processing tools, on the command line or in
dedicated applications (e.g. Fstop, ImageMagick, Nautilus Image Converter on GNU/Linux).
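As an illustration of scripting this step, the Python sketch below builds ImageMagick "mogrify"
commands for the two rotations described above. It assumes ImageMagick is installed and on the
PATH; the function names are invented for this example. ImageMagick treats positive angles as
clockwise, so the right camera gets -90 and the left camera gets 90. Note that mogrify overwrites
files in place, so it is safer to run it on copies.

```python
import subprocess

# Rotation angle per camera, in ImageMagick's convention (positive = clockwise).
ANGLES = {"right": "-90", "left": "90"}


def rotate_command(path, camera):
    """Build the mogrify command that rotates one image in place."""
    return ["mogrify", "-rotate", ANGLES[camera], str(path)]


def rotate_all(paths, camera, dry_run=True):
    """Return the commands for all paths; run them unless dry_run is True."""
    commands = [rotate_command(path, camera) for path in paths]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands
```

With dry_run left at True the function only returns the command lists, so you can check them
before letting anything modify your images.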
d) Collating images into a single batch
Once you're done with the renaming and rotating of the files, you want to collate them into the same
folder for easier manipulation later.
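Once both sets are in one folder, a quick check for gaps in the numbering can reveal a missed
trigger on one camera before you invest time in post-processing. A minimal Python sketch, assuming
the page_xxxx.jpg naming convention from above (the function name is hypothetical):

```python
import re
from pathlib import Path

PAGE_NAME = re.compile(r"^page_(\d{4})\.jpg$")


def missing_page_numbers(folder):
    """Return the numbers absent between the lowest and highest page file."""
    numbers = []
    for entry in Path(folder).iterdir():
        match = PAGE_NAME.match(entry.name)
        if match:
            numbers.append(int(match.group(1)))
    if not numbers:
        return []
    present = set(numbers)
    return [n for n in range(min(numbers), max(numbers) + 1) if n not in present]
```

An empty list means the sequence is continuous. It does not prove that the pages are in the right
order, only that no number in the range is missing.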

Getting the image files ready for post-processing on the Public Library scanner
In the case of the Public Library scanner, a custom C++ script was written by Mislav Stublić to
facilitate the transfer, renaming, rotating and collating of the images from the two cameras.
The script prompts the user to place the memory card from the right camera into the card reader
first, gives a preview of the first and last four images and provides an entry field to create a
sub-folder in a local cloud storage folder (path: /home/user/Copy).
It transfers, renames and rotates the files, deletes them from the card and prompts the user to
replace the card with the one from the left camera in order to transfer the files from there and
place them in the same folder. The script was created for the GNU/Linux system and it can be
downloaded, together with its source code, from: https://copy.com/nLSzflBnjoEB
If you have cameras other than Canon, you can edit line 387 of the source file to match the
naming convention of your cameras, and recompile by running the following command in your
terminal: "gcc scanflow.c -o scanflow -ludev `pkg-config --cflags --libs gtk+-2.0`"
In the case of the Hacker Space Bruxelles scanner, this is handled by the same script that operates
the cameras, which can be downloaded from:
http://git.constantvzw.org/?p=algolit.git;a=tree;f=scanbot_brussel;h=81facf5cb106a8e4c2a76c048694a3043b158d62;hb=HEAD

III. TRANSFORMATION OF SOURCE IMAGES INTO .TIFFS
Images transferred from the cameras are high definition full color images. You want your cameras
to shoot at the largest possible .jpg resolution in order for resulting files to have at least 300 dpi (A4
at 300 dpi requires a 9.5 megapixel image). In the post-processing the size of the image files needs
to be reduced radically, so that several hundred images can be merged into an e-book file of a
tolerable size.
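The resolution requirement can be checked with simple arithmetic. The Python sketch below is an
illustration only: it computes the pixel dimensions of an A4 page (210 x 297 mm) at a given dpi,
and then the size of a sensor with a 3:2 aspect ratio that would just cover that page. The 3:2
ratio is an assumption typical of DSLR cameras; it is not stated in this manual.

```python
MM_PER_INCH = 25.4
A4_MM = (210, 297)


def page_pixels(page_mm, dpi):
    """Pixel width and height of a page of the given size at the given dpi."""
    width_mm, height_mm = page_mm
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))


def page_megapixels(page_mm, dpi):
    """Megapixels needed for the page area alone."""
    width_px, height_px = page_pixels(page_mm, dpi)
    return width_px * height_px / 1_000_000


def sensor_megapixels(page_mm, dpi, aspect=3 / 2):
    """Megapixels of the smallest sensor of this aspect ratio covering the page."""
    short_px, long_px = sorted(page_pixels(page_mm, dpi))
    # The sensor's long side must cover the page's long side, and its short
    # side (long / aspect) must cover the page's short side.
    sensor_long = max(long_px, short_px * aspect)
    sensor_short = sensor_long / aspect
    return sensor_long * sensor_short / 1_000_000


print(page_pixels(A4_MM, 300))                  # (2480, 3508)
print(round(page_megapixels(A4_MM, 300), 1))    # 8.7
print(round(sensor_megapixels(A4_MM, 300), 1))  # 9.2
```

Under these assumptions the page itself needs about 8.7 megapixels and a 3:2 frame that covers it
about 9.2, so the 9.5 megapixel figure quoted above leaves a small margin around the page.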
Hence, the first step in the post-processing is to crop the images from cameras only to the content of
the pages. The surroundings around the book that were captured in the photograph and the white
margins of the page will be cropped away, while the printed text will be transformed into black
letters on white background. The illustrations, however, will need to be preserved in their color or
grayscale form, and mixed with the black and white text. What were initially large .jpg files will
now become relatively small .tiff files that are ready for optical character recognition process
(OCR).
These tasks can be completed by a number of software applications. Our manual will focus on one
that can be used across all major operating systems -- ScanTailor. ScanTailor can be downloaded
from: http://scantailor.sourceforge.net/. A more detailed video tutorial of ScanTailor can be found
here: http://vimeo.com/12524529.
ScanTailor: from a photograph of a page to a graphic file ready for OCR
Once you have transferred all the photos from cameras to the computer, renamed and rotated them,
they are ready to be processed in the ScanTailor.
1) Importing photographs to ScanTailor
- start ScanTailor and open ‘new project’
- for ‘input directory’ choose the folder where you stored the transferred and renamed photo images
- you can leave ‘output directory’ as it is, it will place your resulting .tiffs in an 'out' folder inside
the folder where your .jpg images are
- select all files (if you followed the naming convention above, they will be named
‘page_xxxx.jpg’) in the folder where you stored the transferred photo images, and click 'OK'
- in the dialog box ‘Fix DPI’ click on All Pages, and for DPI choose preferably '600x600', click
'Apply', and then 'OK'
2) Editing pages
2.1 Rotating photos/pages
If you've rotated the photo images in the previous step using the scanflow script, skip this step.
- Rotate the first photo counter-clockwise, click Apply and for scope select ‘Every other page’
followed by 'OK'
- Rotate the following photo clockwise, applying the same procedure as in the previous step
2.2 Deleting redundant photographs/pages
- Remove redundant pages (photographs of the empty cradle at the beginning and the end of the
book scanning sequence; book cover pages if you don’t want them in the final scan; duplicate pages
etc.) by right-clicking on a thumbnail of that page in the preview column on the right side, selecting
‘Remove from project’ and confirming by clicking on ‘Remove’.

# If you remove a wrong page by accident, you can re-insert it by right-clicking on the page
before/after the missing page in the sequence, selecting 'insert after/before' (depending on which
page you selected) and choosing the file from the list. Before you finish adding, it is necessary
to go through the procedure of fixing DPI and rotating again.
2.3 Adding missing pages
- If you notice that some pages are missing, you can recapture them with the camera and insert them
manually at this point using the procedure described above under 2.2.
3) Split pages and deskew
Steps ‘Split pages’ and ‘Deskew’ should work automatically. Run them by clicking the ‘Play’ button
under the 'Select content' function. This will do the three steps automatically: splitting of pages,
deskewing and selection of content. After this you can manually re-adjust splitting of pages and deskewing.
4) Selecting content
Step ‘Select content’ works automatically as well, but it is important to revise the resulting selection
manually page by page to make sure the entire content is selected on each page (including the
header and page number). Where necessary, use your pointer device to adjust the content selection.
If the inner margin is cut, go back to 'Split pages' view and manually adjust the selected split area. If
the page is skewed, go back to 'Deskew' and adjust the skew of the page. After this go back to
'Select content' and readjust the selection if necessary.
This is the step where you do visual control of each page. Make sure all pages are there and
selections are as equal in size as possible.
At the bottom of the thumbnail column there is a sort option that can automatically arrange pages
by the height and width of the selected content, making the process of manual selection easier.
Extreme differences in height should be avoided: try to make the selected areas as equal as
possible, particularly in height, across all pages. The exceptions are the cover and back pages,
where we advise selecting the full page.
5) Adjusting margins
For best results, in the previous step select the full page as content on the cover and back page.
Now go to the 'Margins' step, set Top, Bottom, Left and Right in the Margins section all to 0.0 and
do 'Apply to...' → 'All pages'.
In the Alignment section leave 'Match size with other pages' ticked, choose the central positioning
of the page and do 'Apply to...' → 'All pages'.
6) Outputting the .tiffs
Now go to the 'Output' step. Ignore the 'Output Resolution' section.
Next review two consecutive pages from the middle of the book to see if the scanned text is too
faint or too dark. If the text seems too faint or too dark, use slider Thinner – Thicker to adjust. Do
'Apply to' → 'All pages'.
Next go to the cover page and select under Mode 'Color / Grayscale' and tick on 'White Margins'.
Do the same for the back page.
If there are any pages with illustrations, you can choose the 'Mixed' mode for those pages and then
adjust the zones of the illustrations under the 'Picture Zones' tab.
Now you are ready to output the files. Just press the 'Play' button under 'Output'. Once the
computer has finished processing the images, do 'File' → 'Save as' and save the project.

IV. OPTICAL CHARACTER RECOGNITION
Before the edited-down graphic files are finalized as an e-book, we want to transform the image of
the text into actual text that can be searched, highlighted, copied and transformed. That
functionality is provided by Optical Character Recognition. This is a technically difficult task -
dependent on language, script, typeface and quality of print - and there aren't that many OCR tools
that are good at it. There is, however, a relatively good free software solution - Tesseract
(http://code.google.com/p/tesseract-ocr/) - that has solid performance, good language data and can
be trained for an even better performance, although it has its problems. Proprietary solutions (e.g.
ABBYY FineReader) sometimes provide superior results.
Tesseract supports as input format primarily .tiff files. It produces a plain text file that can be, with
the help of other tools, embedded as a separate layer under the original graphic image of the text in
a PDF file.
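For readers who prefer to script this step, the following Python sketch builds a Tesseract command
line for a single .tiff file. It assumes the tesseract executable and the relevant language data
are installed; the file names and function names are placeholders invented for this example. In
current Tesseract releases, adding the "pdf" config name at the end makes it write a searchable
PDF instead of plain text.

```python
import subprocess
from pathlib import Path


def tesseract_command(tiff_path, language="eng", searchable_pdf=False):
    """Build the command: tesseract <image> <output base> -l <language> [pdf]."""
    tiff_path = Path(tiff_path)
    # Tesseract appends .txt (or .pdf) to the output base name by itself.
    output_base = str(tiff_path.with_suffix(""))
    command = ["tesseract", str(tiff_path), output_base, "-l", language]
    if searchable_pdf:
        command.append("pdf")
    return command


def ocr_folder(folder, language="eng", dry_run=True):
    """Return the commands for every .tif in folder; run them unless dry_run."""
    commands = [tesseract_command(path, language)
                for path in sorted(Path(folder).glob("*.tif"))]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands
```

Language codes are three-letter codes such as "eng" or "deu". This sketch is an alternative to the
GUI route and is not part of the Public Library workflow itself.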
With the help of other tools, OCR can also be performed against other input files, such as
graphic-only PDF files. This produces inferior results, depending again on the quality of the
graphic files and the reproduction of text in them. One such tool is a bash script to OCR a PDF
file that can be found here: https://github.com/andrecastro0o/ocr/blob/master/ocr.sh
As mentioned in the 'before scanning' section, the quality of the original book will influence the
quality of the scan and thus the quality of the OCR. For a comparison, have a look here:
http://www.paramoulipist.be/?p=1303
Once you have your .txt file, there is still some work to be done. Because OCR has difficulty
interpreting particular elements of the layout and fonts, the TXT file comes with a lot of errors.
Recurrent problems are:
- combinations of specific letters in some fonts (it can mistake 'm' for 'n' or 'I' for 'i' etc.);
- headers become part of body text;
- footnotes are placed inside the body text;
- page numbers are not recognized as such.
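Some of these problems can be reduced with a simple script before manual proofreading. The Python
sketch below is a rough heuristic, not a complete solution: it drops lines that consist only of a
short number (likely page numbers) and lines that repeat several times across the text (likely
running headers). The function name and the threshold are assumptions chosen for illustration, and
a legitimate line that happens to match either rule would be removed too.

```python
import re
from collections import Counter

PAGE_NUMBER = re.compile(r"^\s*\d{1,4}\s*$")


def clean_ocr_text(text, header_min_repeats=3):
    """Remove bare page-number lines and frequently repeated lines."""
    lines = text.splitlines()
    counts = Counter(line.strip() for line in lines if line.strip())
    cleaned = []
    for line in lines:
        stripped = line.strip()
        if PAGE_NUMBER.match(line):
            continue  # a line holding only a number: probably a page number
        if stripped and counts[stripped] >= header_min_repeats:
            continue  # identical line seen many times: probably a header
        cleaned.append(line)
    return "\n".join(cleaned)
```

Letter confusions and footnotes mixed into the body text are not addressed here and still need to
be corrected by hand.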

V. CREATING A FINALIZED E-BOOK FILE
After the optical character recognition has been completed, the resulting text can be merged with
the images of pages and output into an e-book format. While proper e-book file formats such as
ePub have increasingly been gaining ground, PDFs still remain popular because many people tend to
read on their computers, and because PDFs retain the original layout of the book on paper,
including the absolute pagination needed for referencing in citations. DjVu is also an option, as
an alternative to PDF, used because of its purported superiority, but it is far less popular.
The export to PDF can again be done with a number of tools. In our case we'll complete the optical
character recognition and PDF export in gscan2pdf. Again, the proprietary ABBYY FineReader will
produce somewhat smaller PDFs.
If you prefer to use an e-book format that works better with e-book readers, you will have to
remove some of the elements that appear in the book - headers, footers, footnotes and pagination.
This can be done earlier in the process, when cropping down the original .jpg image files (see
under III), or later by transforming the PDF files. The latter can be done in Calibre
(http://calibre-ebook.com) by converting the PDF into an ePub, where it can be further tweaked to
better accommodate or remove the headers, footers, footnotes and pagination.
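Calibre also ships a command line converter, ebook-convert, which takes an input file and an
output file and picks the formats from the file extensions. A minimal Python sketch for calling
it, assuming Calibre is installed and ebook-convert is on the PATH (the function names and file
names are placeholders for this example):

```python
import subprocess


def ebook_convert_command(pdf_path, epub_path):
    """Build the command: ebook-convert <input file> <output file>."""
    return ["ebook-convert", str(pdf_path), str(epub_path)]


def convert_to_epub(pdf_path, epub_path, dry_run=True):
    """Return the command; run it as well unless dry_run is True."""
    command = ebook_convert_command(pdf_path, epub_path)
    if not dry_run:
        subprocess.run(command, check=True)
    return command
```

The converted ePub will usually still need the manual clean-up of headers, footers, footnotes and
pagination described above.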
Optical character recognition and PDF export in Public Library workflow
Optical character recognition with the Tesseract engine can be performed on GNU/Linux by a
number of command line and GUI tools. Many of those tools also exist for other operating systems.
For the users of the Public Library workflow, we recommend using the gscan2pdf application both for
the optical character recognition and the PDF or DjVu export.
To do so, start gscan2pdf and open your .tiff files. To OCR them, go to 'Tools' and select 'OCR'. In
the dialog box select the Tesseract engine and your language. 'Start OCR'. Once the OCR is
finished, export the graphic files and the OCR text to PDF by selecting 'Save as'.
However, given that proprietary solutions sometimes produce better results, these tasks can also
be done, for instance, in ABBYY FineReader running on a Windows operating system inside
VirtualBox. The prerequisites are that you have both Windows and ABBYY FineReader available to
install in VirtualBox. Once you've got both installed, designate a shared folder in VirtualBox.
To use ABBYY FineReader, transfer the output files from your 'out' folder to the shared folder of
VirtualBox. Then start VirtualBox, start the Windows image and in Windows start ABBYY
FineReader. Open the files and let ABBYY FineReader read them. Once it's done, output the
result into PDF.

VI. CATALOGING AND SHARING THE E-BOOK
Your road from a book on paper to an e-book is complete. If you want to maintain your library you
can use Calibre, a free software tool for e-book library management. You can add the metadata to
your book using the existing catalogues or you can enter metadata manually.
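If you catalogue many books, adding them can be scripted with calibredb, the command-line tool that ships with Calibre. The sketch below only assembles and runs a basic 'calibredb add' call with a title and an author; the function names and the example values are illustrative assumptions, and any further metadata can still be edited in Calibre afterwards.

```python
import subprocess


def calibredb_add_command(book_path, title, authors):
    """Build a 'calibredb add' call that sets the title and authors on import."""
    return ["calibredb", "add", "--title", title, "--authors", authors, str(book_path)]


def add_to_library(book_path, title, authors):
    """Add the book to the default Calibre library (Calibre must be installed)."""
    subprocess.run(calibredb_add_command(book_path, title, authors), check=True)
```

For example, add_to_library("book.pdf", "Title of the Book", "Name Surname") would import the finished PDF with that metadata.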
Now you may want to distribute your book. If the work you've digitized is in the public domain
(https://en.wikipedia.org/wiki/Public_domain), you might consider contributing it to Project
Gutenberg
(http://www.gutenberg.org/wiki/Gutenberg:Volunteers'_FAQ#V.1._How_do_I_get_started_as_a_Project_Gutenberg_volunteer.3F),
Wikibooks (https://en.wikibooks.org/wiki/Help:Contributing) or Archive.org.
If the work is still under copyright, you might explore a number of different options for sharing.

QUICK WORKFLOW REFERENCE FOR SCANNING AND
POST-PROCESSING ON PUBLIC LIBRARY SCANNER
I. PHOTOGRAPHING A PRINTED BOOK
0. Before you start:
- loosen the book binding by opening it wide in several places
- switch on the scanner
- set up the cameras:
- place cameras on tripods and fit them tightly
- plug the automatic chargers into the battery slot and close the battery lid
- switch on the cameras
- switch the lens to Manual Focus mode
- switch the cameras to Av mode and set the aperture to 8.0
- turn the zoom ring to set the focal length exactly midway between 24mm and 35mm
- focus by turning on the live view, pressing the magnification button twice and adjusting the
focus to get a clear view of the text
- connect the cameras to the scanner by plugging the remote trigger cable into the port behind the
protective rubber cover on the left side of the cameras
- place the book into the cradle
- double-check storage cards and batteries
- press the play button on the back of the camera to double-check if there are images on the
camera - if there are, delete all the images from the camera menu
- if using batteries, double-check that batteries are fully charged
- switch off the light in the room that could reflect off the platen and cover the scanner with the
black cloth
1. Photographing
- now you can start scanning, either by pressing the smaller button on the controller once to
lower the platen and adjust the book, and then pressing it again to increase the light intensity, trigger the
cameras and lift the platen; or by pressing the large button to complete the entire sequence in one
go;
- ATTENTION: Shutter sound should be coming from both cameras - if one camera is not
working, it's best to reconnect both cameras, make sure the batteries are charged or adapters
are connected, erase all images and restart.
- ADVICE: The scanner has a digital counter. By turning the dial forward and backward,
you can set it to tell you what page you should be scanning next. This should help you to
avoid missing a page due to a distraction.

II. Getting the image files ready for post-processing
- after finishing with scanning a book, transfer the files to the post-processing computer
and purge the memory cards
- if transferring the files manually:
- create two separate folders,
- transfer the files from the folders with image files on cards, using a batch
renaming software rename the files from the right camera following the convention
page_0001.jpg, page_0003.jpg, page_0005.jpg... -- and the files from the left camera
following the convention page_0002.jpg, page_0004.jpg, page_0006.jpg...
- collate image files into a single folder
- before ejecting each card, delete all the photo files on the card
- if using the scanflow script:
- start the script on the computer
- place the card from the right camera into the card reader
- enter the name of the destination folder following the convention
"Name_Surname_Title_of_the_Book" and transfer the files
- repeat with the other card
- script will automatically transfer the files, rename, rotate, collate them in proper
order and delete them from the card
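For those transferring files manually, the renaming convention above can be sketched in a few lines of Python instead of using batch renaming software. This is an illustration only, not the scanflow script: the function names are made up for the example, and it assumes each card's files have already been copied into their own folder and that sorting by file name reproduces the shooting order. It copies rather than moves, so deleting the files from the cards remains a manual step.

```python
import shutil
from pathlib import Path


def collated_names(count, first_page):
    """Names for one camera: every other page number, starting at first_page."""
    return [f"page_{first_page + 2 * i:04d}.jpg" for i in range(count)]


def collate(right_dir, left_dir, dest_dir):
    """Copy both cameras' files into dest_dir under interleaved page names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Right camera gets odd page numbers, left camera gets even ones.
    for source_dir, first_page in ((right_dir, 1), (left_dir, 2)):
        files = sorted(
            p for p in Path(source_dir).iterdir() if p.suffix.lower() == ".jpg"
        )
        for source, name in zip(files, collated_names(len(files), first_page)):
            shutil.copy2(source, dest / name)
    return sorted(p.name for p in dest.iterdir())
```

The images are not rotated here, so the rotation step in ScanTailor (2.1 below) still applies.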
III. Transformation of source images into .tiffs
ScanTailor: from a photograph of page to a graphic file ready for OCR
1) Importing photographs to ScanTailor
- start ScanTailor and open ‘new project’
- for ‘input directory’ choose the folder where you stored the transferred photo images
- you can leave ‘output directory’ as it is, it will place your resulting .tiffs in an 'out' folder
inside the folder where your .jpg images are
- select all files (if you followed the naming convention above, they will be named
‘page_xxxx.jpg’) in the folder where you stored the transferred photo images, and click
'OK'
- in the dialog box ‘Fix DPI’ click on All Pages, and for DPI choose preferably '600x600',
click 'Apply', and then 'OK'
2) Editing pages
2.1 Rotating photos/pages
If you've rotated the photo images in the previous step using the scanflow script, skip this step.
- rotate the first photo counter-clockwise, click Apply and for scope select ‘Every other
page’ followed by 'OK'
- rotate the following photo clockwise, applying the same procedure as in the previous
step

2.2 Deleting redundant photographs/pages
- remove redundant pages (photographs of the empty cradle at the beginning and the end;
book cover pages if you don’t want them in the final scan; duplicate pages etc.) by right-clicking
on a thumbnail of that page in the preview column on the right, selecting ‘Remove
from project’ and confirming by clicking on ‘Remove’.
# If you remove a wrong page by accident, you can re-insert it by right-clicking on a page
before/after the missing page in the sequence, selecting 'insert after/before' and choosing the file
from the list. Before you finish adding, it is necessary to go through the procedure of fixing DPI and
rotating again.
2.3 Adding missing pages
- If you notice that some pages are missing, you can recapture them with the camera and
insert them manually at this point using the procedure described above under 2.2.
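A quick check of the file names can flag some missing pages before you start editing. The sketch below is an assumption-laden helper, not part of ScanTailor: it expects the page_xxxx.jpg naming convention from section II, reports any gaps in the numbering, and counts odd (right camera) and even (left camera) pages, which should normally match because both cameras fire together. It cannot detect a spread that was never photographed by either camera.

```python
import re

PAGE_NAME = re.compile(r"page_(\d{4})\.jpg")


def page_report(file_names):
    """Report numbering gaps and per-camera counts for page_xxxx.jpg names."""
    numbers = set()
    for name in file_names:
        match = PAGE_NAME.fullmatch(name)
        if match:  # ignore files that do not follow the convention
            numbers.add(int(match.group(1)))
    if not numbers:
        return {"missing": [], "right_camera": 0, "left_camera": 0}
    missing = [n for n in range(1, max(numbers) + 1) if n not in numbers]
    right = sum(1 for n in numbers if n % 2 == 1)
    return {"missing": missing, "right_camera": right, "left_camera": len(numbers) - right}
```

Feeding it the names from the collated folder, for instance with os.listdir, gives a report you can compare against the scanner's page counter.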
3) Split pages and deskew
- Functions ‘Split Pages’ and ‘Deskew’ should work automatically. Run them by
clicking the ‘Play’ button under the 'Select content' step. This will do the three steps
automatically: splitting of pages, deskewing and selection of content. After this you can
manually re-adjust splitting of pages and de-skewing.

4) Selecting content and adjusting margins
- Step ‘Select content’ works automatically as well, but it is important to revise the
resulting selection manually page by page to make sure the entire content is selected on
each page (including the header and page number). Where necessary use your pointer device
to adjust the content selection.
- If the inner margin is cut, go back to 'Split pages' view and manually adjust the selected
split area. If the page is skewed, go back to 'Deskew' and adjust the skew of the page. After
this go back to 'Select content' and readjust the selection if necessary.
- This is the step where you do visual control of each page. Make sure all pages are there
and selections are as equal in size as possible.
- At the bottom of thumbnail column there is a sort option that can automatically arrange
pages by the height and width of the selected content, making the process of manual
selection easier. Extreme differences in height should be avoided: try to make the
selected areas as equal as possible, particularly in height, across all pages. The
exception should be the cover and back pages, where we advise selecting the full page.

5) Adjusting margins
- Now go to the 'Margins' step, set Top, Bottom, Left and Right in the Margins section all
to 0.0 and do 'Apply to...' → 'All pages'.
- In the Alignment section leave 'Match size with other pages' ticked, choose the central
positioning of the page and do 'Apply to...' → 'All pages'.
6) Outputting the .tiffs
- Now go to the 'Output' step.
- Review two consecutive pages from the middle of the book to see if the scanned text is
too faint or too dark. If it is, use the Thinner – Thicker slider to
adjust. Do 'Apply to' → 'All pages'.
- Next go to the cover page, select 'Color / Grayscale' under Mode and tick 'White
Margins'. Do the same for the back page.
- If there are any pages with illustrations, you can choose the 'Mixed' mode for those
pages and then adjust the zones of the illustrations under the 'Picture Zones' tab.
- To output the files press 'Play' button under 'Output'. Save the project.
IV. Optical character recognition & V. Creating a finalized e-book file
If using all free software:
1) open gscan2pdf (if not already installed on your machine, install gscan2pdf from the
repositories, Tesseract and data for your language from https://code.google.com/p/tesseract-ocr/)
- point gscan2pdf to open your .tiff files
- for Optical Character Recognition, select 'OCR' under the drop down menu 'Tools',
select the Tesseract engine and your language, start the process
- once OCR is finished, output to a PDF: go to 'File' and select 'Save', edit the
metadata, select the format and save
If using non-free software:
2) open Abbyy FineReader in VirtualBox (note: only Abbyy FineReader 10 installs and works - with some limitations - under GNU/Linux)
- transfer files in the 'out' folder to the folder shared with the VirtualBox
- point it to the readied .tiff files and it will complete the OCR
- save the file

REFERENCES
For more information on the book scanning process in general and making your own book scanner
please visit:
DIY Book Scanner: http://diybookscanner.org
Hacker Space Bruxelles scanner: http://hackerspace.be/ScanBot
Public Library scanner: http://www.memoryoftheworld.org/blog/2012/10/28/our-belovedbookscanner/
Other scanner builds: http://wiki.diybookscanner.org/scanner-build-list
For more information on automation:
Konrad Voelkel's post-processing script (From Scan to PDF/A):
http://blog.konradvoelkel.de/2013/03/scan-to-pdfa/
Johannes Baiter's automation of scanning to PDF process: http://spreads.readthedocs.org
For more information on applications and tools:
Calibre e-book library management application: http://calibre-ebook.com/
ScanTailor: http://scantailor.sourceforge.net/
gscan2pdf: http://sourceforge.net/projects/gscan2pdf/
Canon Hack Development Kit firmware: http://chdk.wikia.com
Tesseract: http://code.google.com/p/tesseract-ocr/
Python script of Hacker Space Bruxelles scanner:
http://git.constantvzw.org/?p=algolit.git;a=tree;f=scanbot_brussel;h=81facf5cb106a8e4c2a76c048694a3043b158d62;hb=HEAD

Sollfrank, Francke & Weinmayr
Piracy Project
2013

Giving What You Don't Have

Andrea Francke, Eva Weinmayr
Piracy Project

Birmingham, 6 December 2013

[00:12]
Eva Weinmayr: When we talk about the word piracy, it causes a lot of problems
to quite a few institutions to deal with it. So events that we’ve organised
have been announced by Central Saint Martins without using the word piracy.
That’s interesting, the problems it still causes…

Cornelia Sollfrank: And how do you announce the project without “Piracy”? The
Project?

E. W.: It’s a project about intellectual property.

C. S.: The P Project.

Andrea Francke, Eva Weinmayr: [laugh] Yes.

[00:52]
Andrea Francke: The Piracy Project is a knowledge platform, and it is based
around a collection of pirated books, of books that have been copied by
people. And we use it to raise discussion about originality, authorship,
intellectual property questions, and to produce new material, new essays and
new questions.

[01:12]
E. W.: So the Piracy Project includes several aspects. One is that it is an
act of piracy in itself, because it is located in an art school, in a library,
in an officially built-up collection of pirated books. [01:30] So that’s the
second aspect, it’s a collection of books which have been copied,
appropriated, modified, improved, which live in this library. [01:40] And the
third part is that it is a collection of physical books, which is touring. We
create reading rooms and invite people to explore the books and discuss issues
raised by cultural piracy.
[01:58] The Piracy Project started in an art college library, which was
supposed to be closed down. And the Piracy Project is one project of And
Publishing. And Publishing is a publishing activity exploring print-on-demand
and new modes of production and of dissemination, the immediacy of
dissemination. [02:20] And Publishing is a collaboration between myself and
Lynn Harris, and we were hosted by Central Saint Martins College of Art and
Design in London. And the campus where this library was situated was the
campus we were working at. [02:40] So when the library was being closed, we
moved in the library together with other members of staff, and kept the
library open in a self-organised way. But we were aware that there’s no budget
to buy new books, and we wanted to have this as a lively space, so we created
an open call for submissions and we asked people to select a book which is
really important to them and make a copy of it. [03:09] So we weren’t
interested in piling up a collection of second hand books, we were really
interested in this process: what happens when you make a copy of a book, and
how does this copy sit next to the original authoritative copy of the book.
This is how it started.

[03:31]
A. F.: I met Eva at the moment when And Publishing was helping to set up this
new space in the library, and they were trying to think how to make the
library more alive inside that university. [03:44] And I was doing research on
Peruvian book piracy at that time, and I had found this book that was modified
and was in circulation. And it was a very exciting moment for us to think what
happens if we can promote this type of production inside this academic
library.

[04:05] Piracy Project
Collection / Reading Room / Research

[04:11]
The Collection

[04:15]
E. W.: We asked people to make a copy of a book which is important to them and
send it to us, and so with these submissions we started to build up the
collections. Lots of students were getting involved, but also lots of people
who work in this topic, and were interested in these topics. [04:38] So we
received about one hundred books in a couple of months. And then, parallel to
this, we started to do research ourselves. [04:50] We had a residency in
China, so we went to China, to Beijing and Shanghai, to meet illegal
booksellers of pirated architecture books. And we had a residency in Turkey,
in Istanbul, where we did lots of interviews with publishers and artists on
book piracy. [05:09] So the collection is a mix of our own research and cases
from the real book markets, and creative work, artistic work which is produced
in the context of an art college and the wider cultural realm.

[05:29]
A. F.: And it is an ongoing project.

E. W.: The project is ongoing, we still receive submissions. The collection is
growing, and at the moment here we have about 180 books, here at Grand Union
(Birmingham).

[05:42]
A. F.: When we did the open call, something that was really important to us
was to make clear for people that they have a space of creativity when they
are making a copy. So we wrote, please send us a copy of a book, and be aware
that things happen when you copy a book. [05:57] Whether you do it
intentionally or not a copy is never the same. So you can use that space, take
ownership of that space and make something out of that; or you can take a step
back and allow things to happen without having control. And I think that is
something that is quite important for us in the project. [06:12] And it is
really interesting how people have embraced that in different measures, like
subtle things, or material things, or adding text, taking text out, mixing
things, judging things. Sometimes just saying, I just want it to circulate, I
don’t mind what happens in the space, I just want the subject to be in the
world again.

[06:35]
E. W.: I think this is one which I find interesting in terms of making a copy,
because it’s not so much about my own creativity, it’s more about exploring
how technology edits what you can see. It’s Jan van Toorn’s Critical Practice,
and the artist is Hester Barnard, a Canadian artist. [07:02] She sent us these
three copies, and we thought, that’s really generous, three copies. But they
are not identical copies, they are very different. Some have a lot of empty
pages in the book. And this book has been screen-captured on a 3.5 inch
iPhone, whereas this book has been screen-captured on a desktop, and this one
has been screen-captured with a laptop. [07:37] So the device you use to
access information online determines what you actually receive. And I find
this really interesting, that she translated this back into a hardcopy, the
online edited material. [07:53] And this is kind of taught by this book,
standard International Copyright. She went to Google Books, and screen-
captured all the pages Google Books are showing. So we are all familiar with
blurry text pages, but then it starts that you get the message “Page 38 is not
shown in this preview.” [08:18] And then it’s going through the whole book, so
she printed every page basically, omitting the actual information. But the
interesting thing is that we are all aware that this is happening on Google,
on screen online, but the fact that she’s translating this back into an
object, into a printed book, is interesting.

[08:44]
Reading Room

[08:48]
A. F.: We create these reading rooms with the collection as a way to tour the
collection, and meet people and have conversations around the books. And that
is something quite important to us, that we go with the physical books to a
place, either for two or three months, and meet different people that have
different interests in relation to the collection in that locality. We’ve been
doing that for the last two years, I think, three years. [09:12] And it’s
quite interesting because different places have very different experiences of
piracy. So you can go to a country where piracy is something very common, or a
different place where people have a very strong position against piracy, or a
different legal framework. And I feel the type of conversations and the
quality of interactions is quite different from being present on the space and
with the books. [09:36] And that’s why we don’t call these exhibitions,
because we always have places where people can come and they can stay, and
they can come again. Sometimes people come three or four times and they
actually read the books. And a few times they go back to their houses and they
bring books back, and they said, I’m going to contact this friend who has been
to Russia and he told me about this book – so we can add it to the collection.
I think that makes a big difference to how the research in the project
functions.

[10:06]
E. W.: One of the most interesting events we did with the Piracy collection
was at the Show Room where we had a residency for the last year. There were
three events, and one was A Day At The Courtroom. This was an afternoon where
we invited three copyright lawyers coming from different legal systems: the
US, the UK, and the Continental European, Athens. And we presented ten
selected cases from the collection and the three copyright lawyers had to
assess them in the eyes of the law, and they had to agree where to put this
book in a scale from legal to illegal. [10:51] So we weren’t interested really
to say, this is legal and this is illegal, we were interested in all the
shades in between. And then they had to discuss where they would place the
book. But then the audience had the last verdict, and then the audience placed
the book. [11:05] And this was an extremely interesting discussion, because it
was interesting to see how different the legal backgrounds are, how blurry the
whole field is, how you can assess when is the moment where a work becomes a
transformative work, or when it stays a derivative work, and this whole
discussion.
[11:30] When we do these reading rooms – and we had one in New York, for
example, at the New York Art Book Fair – people are coming, and they are
coming to see the physical books in a physical space, so this creates a social
encounter and we have these conversations. [11:47] For example, a woman stood
up to us in New York and she told us about a piracy project she ran where she
was working in a juvenile detention centre, and she produced a whole shadow
library of books because the incarcerated kids couldn’t take the books in
their cells, so she created these copies, individual chapters, and they could
circulate. [12:20] I’m telling this because the fact that we are having this
reading room and that we are meeting people, and that we are having these
conversations, really furthers our research. We find out about these projects
by sharing knowledge.

[12:38]
Categories

[12:42]
A. F.: Whenever we set our reading room for the Piracy Project we need to
organise the books in a certain way. What we started to do now is that we’ve
created these different categories, and the first set of categories came from
the legal event. [12:56] So we set up, we organised the books in different
categories that would help us have questions for the lawyers, that would work
for groups of books instead of individual works. [13:07] And the idea is that,
for example, we are going to have our next events with librarians, and a new
set of categories would come. So the categories change as our interest or
research in the project is changing. [13:21] The current categories are:
Pirated Design, so books where the look of the book has been copied but not
the content; recirculation, books that have been copied trying to be
reproduced exactly as they were, because they need to be circulating again;
transformation, books that have been modified; First Sale Doctrine, so we
receive quite a few books where people haven’t actually made a copy but they
have cut the book or drawn inside the book, and legally you are allowed to do
anything with a book except copy it, so we thought that it was quite important
so that we didn’t have to discuss that with the lawyers; [14:03] Public
Domain, which are works that are already out of copyright, again, so whatever
you do with those books is legal; and collation, books gathered from different
sources, and who owns the copyright, which was a really interesting question,
which is when you have a book that has many authors – it’s really interesting.
Different systems in different countries have different ways to deal with who
owns the copyright and what are the rights of the owners of the different
works.

[14:36]
E. W.: Ahmet Şık is a journalist who published a book about the Ergenekon
scandal and the Turkish government, and connects that kind of mafioso
structures. Before the book could be published he was arrested and put in jail
for a whole year without trial, and he sent the PDF to friends, and the PDF
was circulating on many different computers so it couldn’t be taken. [15:06]
They published the PDF, and as authors they put over a hundred different
author names, so there was not just one author who could be taken into
responsibility.

[15:22] We have in the collection this book, it’s Teignmouth Electron by
Tacita Dean. This is the original, it’s published by Book Works and Steidl.
And to this round table, to this event, we invited also Jane Rolo, director of
Book Works (and she published this book). [15:41] And we invited her saying,
do you know that your book has been pirated? So she was really interested and
she came along. This is the pirated version, it’s Alias, [by] Damián Ortega in
Mexico. It’s a series of books where he translates texts and theory into
Spanish, which are not available in Spanish. So it’s about access, it’s about
circulation. [16:07] But actually he redesigned the book. The pirated version
looks very different, and it has a small film roll here, from Tacita Dean’s
book. And it was really amazing that Jane Rolo flipped the pirated book and
she said, well, actually this is really very nice.

[16:31] This is kind of a standard academic publishing format, it’s Gilles
Deleuze’s Proust and Signs, and the contributor, the artist who produced the
book is Neil Chapman, a writer based in London. And he made a facsimile of his
copy of this book, including the binding mistakes – so there’s one chapter
upside down printed in the book. [17:04] But the really interesting thing is
that he scanned it on his home inkjet printer – he scanned it on his scanner
and then printed it on his home inkjet printer. And the feel of it is very
crafty, because the inkjet has a very different typographic appearance than
the official copy. [17:28] And this makes you read the book in quite a
different way, you relate differently to the actual text. So it’s not just
about the information conveyed on this page, it’s really about how I can
relate to it visually. I find this really interesting when we put this book
into the library, in our collection in the library, and it sat next to the
original, [17:54] it raises really interesting questions about what kind of
authority decides which book can access the library, because this is
definitely and obviously a self-made copy – so if this self-made copy can
enter the library, any self-made text and self-published copy could enter the
library. So it was raising really interesting questions about gatekeepers of
knowledge, and hierarchies and authorities.

[18:26]
On-line catalogue

[18:30]
E. W.: We created this online catalogue to give an overview of what we have in
the collection. We have a cover photograph and then we have a short text where
we try to frame and to describe the approach taken, like the strategy, what’s
been pirated and what was the strategy. [18:55] And this is quite a lot,
because it’s giving you the framework of it, the conceptual framework. But
it’s not giving you the book, and this is really important because lots of the
books couldn’t be digitised, because it’s exactly their material quality which
is important, and which makes the point. [19:17] So if I would… if I have a
project which is working about mediation, and then I put another layer of
mediation on top of it by scanning it, it just wouldn’t work anymore.
[19:29] The purpose of the online catalogue isn’t to give you insight into all
the books to make actually all the information available, it’s more to talk
about the approach taken and the questions which are raised by this specific
book.

[19:47]
Cultures of the copy

[19:51]
A topic of cultural difference became really obvious when we went to Istanbul.
A copy shop which had many academic titles on the shelves, copied, pirated
titles... The fact is that in London, where I’m based, you can access anything
in any library, and it’s not too expensive to get the original book. [20:27]
But in Istanbul it’s very expensive, and the whole academic community thrives
on pirated, copied academic titles.

[20:39]
A. F.: So this is the original Jaime Bayly [No se lo digas a nadie], and this
is the pirated copy of the Jaime Bayly. This book is from Peru, it was bought
on the street, on a street market. [20:53] And Peru has a very big pirated
book market, most books in Peru are pirated. And we found this because there
was a rumour that books in Peru had been modified, pirated books. And this
version, the pirated version, has two extra chapters that are not in the
original one. [21:13] It’s really hard to understand the motivation behind it.
There’s no credit, so the person is inhabiting this author’s identity in a
sense. They are not getting any cultural capital from it. They are not getting
extra money, because if they are found out, nobody would buy books from this
publisher anymore. [21:33] The chapters are really well written, so you as a
reader would not realise that you are reading something that has been pirated.
And that was really fascinating in terms of what space you create. So when you
have this technology that allows you to have the book open and print it so
easily – how can you take advantage of that, and take ownership or inhabit
these spaces that technology is opening up for you.

[22:01]
E. W.: Book piracy in China is really important when it comes to architecture
books, Western architecture books. Lots of architecture studios, but even
university libraries would buy from pirate book sellers, because it’s just so
much cheaper. [22:26] And we’ve found this Mark magazine with one of the
architecture sellers, and it’s supposed to be a bargain because you have six
magazines in one. [22:41] And we were really interested in the question, what
are the criteria for the editing? How do you edit six issues into one? But
basically everything is in here, from advertisement, to text, to images, it’s
all there. But then a really interesting question arises when it comes to
technology, because in this magazine there are pages in Italian language
clearly taken from other magazines.

[23:14]
A. F.: But it was also really interesting to go there, and actually interview
the distributor and go through the whole experience. We had to meet the
distributor in a neutral place, and he interviewed us to see if he was going
to allow us to go into the shop and buy his books. [23:31] And then going
through the catalogue and realising how Rem Koolhaas is really popular among
the pirates, but actually Chinese architecture is not popular, so there’s only
like three pirated books on Chinese architecture; or that from all the
architecture universities in the world only the AA books are copied – the
Architectural Association books. [23:51] And I think those small things are
really things that are worth spending time and reflecting on.

[23:58]
E. W.: We found this pirate copy of Tintin when we visited Beijing, and
obviously compared to the original, it looks different, a different format.
But also it’s black and white, but it’s not a photocopy of the original full-
colour. [24:23] It’s redrawn by hand, so all the drawings are redrawn and
obviously translated into Chinese. This is quite a labour of love, which is
really amazing. I can compare the two. The space is slightly differently
interpreted.

[24:50]
A. F.: And it’s really incredible, because at some point in China there were
14 or 15 different publishers publishing Tintin, and they all have their
versions. They are all hand-drawn by different people, so in the back, in
Chinese, it’s the credit. So you can buy it by deciding which person does the
best drawings of the production of Tintin, which I thought it was really…
[25:14] It’s such a different cultural way to actually give credit to the
person that is copying it, and recognise the labour, and the intention and the
value of that work.

[25:24]
Why books?

[25:28]
E. W.: Books have always been very important in my practice, in my artistic
practice, because lots of my projects culminated in a book, or led into a
book. And publications are important because they can circulate freely, they
can circulate much easier than artworks in a gallery. [25:50] So this question
of how to make things public and how to create an audience… not how to create
an audience – how to reach a reader and how to create a dialogue. So the book
is the perfect tool for this.

[26:04]
A. F.: My interest in books comes from making art, or thinking about art as a
way to interact with the world, so outside art settings, and I found books
really interesting in that. And that’s how I met Eva, in a sense, because I
was interested in that part of her practice. [26:26] When I found the Jaime
Bayly book, for me that was a real moment of excitement, of this person that
was doing these things in the world without taking any credit, but was having
such a profound effect on so many readers. I’m quite fascinated by that.
[26:44] I'm also really interested in research and using events – research
that works with people. So it kind of creates communities around certain
subjects, and then it uses that to explore different issues and to interact
with different areas of knowledge. And I think books are a privileged space to
do that.

[27:11]
E. W.: The books in the Piracy collection, because they are objects you can
grab, and because they need a place, they are a really important tool to start
a dialogue. When we had this reading room in the New York Art Book Fair, it
was really the book that created this moment when you started a conversation
with somebody else. And I think this is a very important moment in the Piracy
collection as a tool to start this discussion. [27:44] In the Piracy
collection the books are not so important to circulate, because they don’t
circulate. They only travel with us, in a way, or they travel here to Grand
Union to be installed in this reading room. But they are not meant to be
printed in a print run of thousands and circulated in the world.

C. S.: So what is their function?

[28:08]
E. W.: The functions of the books here in the Piracy collection are to create
a dialogue, debate about these issues they are raising, and they are a tool
for a direct encounter, for a social encounter. As Andrea said, building a
community which is debating these issues which they are raising. [28:32] And I
also find it really interesting – when we were in China we also talked with
lots of publishers and artists, and they said that the book, in comparison to
an online file, is a really important tool in China, because it can’t be
controlled as easily as online communication. [28:53] So a book is an
autonomous object which can be passed on from one hand to the other, without
the state or another authority to intervene. I think that is an important
aspect when you talk about books in comparison with circulating information
online.

[29:13]
Passion for piracy

[29:17]
A. F.: I’m quite interested in enclosures, and people that jump those
enclosures. I’m kind of interested in these imposed… Maybe because I come from
Peru and we have a different relation to rules, and I’m in Britain where rules
seem to have so much strength. And I’m quite interested in this agency of
taking personal responsibility and saying, I’m going to obey this rule, I’m
not going to obey this one, and what does that mean. [29:42] That makes me
really interested in all these different strategies, and also to find a way to
value them and show them – how when you make this decision to jump a rule, you
actually help bring up questions, modifications, and propose new models or new
ways about thinking things. [30:02] And I think that is something that is part
of all the other projects that I do: stating the rules and the people that
break them.

[30:12]
E. W.: The pirate as a trickster who tries to push the boundaries which are
being set. And I think the interesting, or the complex part of the Piracy
Project is that we are not saying, I’m for piracy or I’m against piracy, I’m
for copyright, I’m against copyright. It’s really about testing out these
decisions and one’s own boundaries, the legal boundaries, the moral limits – to
push them and find them. [30:51] I mean, the Piracy Project as a whole is a
project which is pushing the boundaries because it started in this academic
library, and it’s assessed by copyright lawyers as illegal, so to run such a
project is an act of piracy in itself.

[31:17]
This method of doing or approaching this art project is to create a
collaboration to instigate this discourse, and this discourse is happening on
many different levels. One of them is conversation, debate. But the other one
is this material outcome, and then this material outcome is creating a new
debate.

Sollfrank
The Surplus of Copying
2018

## essay #11

The Surplus of Copying
How Shadow Libraries and Pirate Archives Contribute to the
Creation of Cultural Memory and the Commons
By Cornelia Sollfrank

Digital artworks tend to have a problematic relationship with the white
cube—in particular, when they are intended and optimized for online
distribution. While curators and exhibition-makers usually try to avoid
showing such works altogether, or at least aim at enhancing their sculptural
qualities to make them more presentable, the exhibition _Top Tens_ featured an
abundance of web quality digital artworks, thus placing emphasis on the very
media condition of such digital artifacts. The exhibition took place at the
Onassis Cultural Centre in Athens in March 2018 and was part of the larger
festival _Shadow Libraries: UbuWeb in Athens_,1 an event to introduce the
online archive UbuWeb2 to the Greek audience and discuss related cultural,
ethical, technical, and legal issues. This text takes the event—and the
exhibition in particular—as a starting point for a closer look at UbuWeb and
the role an artistic approach can play in building cultural memory within the
neoliberal knowledge economy.

_UbuWeb—The Cultural Memory of the Avant-Garde_

Since Kenneth Goldsmith started Ubu in 1997 the site has become a major point
of reference for anyone interested in exploring twentieth-century avant-garde
art. The online archive provides free and unrestricted access to a remarkable
collection of thousands of artworks—among them almost 700 films and videos,
over 1000 sound art pieces, dozens of filmed dance productions, an
overwhelming amount of visual poetry and conceptual writing, critical
documents, but also musical scores, patents, electronic music resources, plus
an edition of vital new literature, the /ubu editions. Ubu contextualizes the
archived objects within curated sections and also provides framing academic
essays. Although it is a project run by Goldsmith without a budget, it has
built a reputation for making all the things available one would not find
elsewhere. The focus on “avant-garde” may seem a bit pretentious at first, but
when you look closer at the project, its operator and the philosophy behind
it, it becomes obvious how much sense this designation makes. Understanding
the history of the twentieth-century avant-garde as “a history of subversive
takes on creativity, originality, and authorship,”3 such spirit is not only
reflected in terms of the archive’s contents but also in terms of the project
as a whole. Theoretical statements by Goldsmith in which he questions concepts
such as authorship, originality, and creativity support this thesis4—and with
that a conflictual relationship with the notion of intellectual property is
preprogrammed. Therefore it comes as no surprise that the increasing
popularity of the project goes hand-in-hand with a growing discussion about
its ethical justification.

At the heart of Ubu, there is the copy! Every item in the archive is a digital
copy, either of another digital item or, as a digitized version, of an analog
object.5 That is to say, the creation of a digital collection is
inevitably based on copying the desired archive records and storing them on
dedicated media. However, making a copy is in itself a copyright-relevant act,
if the respective item is an original creation and as such protected under
copyright law.6 Hence, “any reproduction of a copyrighted work infringes the
copyright of the author or the corresponding rights of use of the copyright
holder”.7 Whether the existence of an artwork within the Ubu collection is a
case of copyright infringement varies with each individual case and depends on
the legal status of the respective work, but also on the way the rights
holders decide to act. As with all civil law, there is no judge without a
plaintiff, which means even if there is no express consent by the rights
holders, the work can remain in the archive as long as there is no request for
removal.8 Its status, however, is precarious. We find ourselves in the
notorious gray zone of copyright law where nothing is clear and many things
are possible—until somebody decides to challenge this status. Exploring the
borders of this experimental playground involves risk-taking, but, at the same
time, it is the only way to preserve existing freedoms and make a case for
changing cultural needs, which have not been considered in current legal
settings. And as the 20 years of Ubu’s existence demonstrate, the practice may
be experimental and precarious, but with growing cultural relevance and
reputation it is also gaining in stability.

_Fair Use and Public Interest_

At all public appearances and public presentations Goldsmith and his
supporters emphasize the educational character of the project and its non-
commercial orientation.9 Such a characterization is clearly intended to take
the wind out of the sails of its critics from the start and to shift the
attention away from the notion of piracy and toward questions of public
interest and the common good.

From a cultural point of view, the project unquestionably is of inestimable
value; a legal defense, however, would be a difficult undertaking. Copyright
law, in fact, has a built-in opening, the so-called copyright exceptions or
fair use regulations. They vary according to national law and cultural
traditions and allow for the use of copyrighted works under certain, defined
provisions without permission of the owner. The exceptions basically apply to
the areas of research and private study (both non-commercial), education,
review, and criticism and are described through general guidelines. “These
defences exist in order to restore the balance between the rights of the owner
of copyright and the rights of society at large.”10

A very powerful provision in most legislations is the permission to make
“private copies”, digital and analog ones, in small numbers, but they are
limited to non-commercial and non-public use, and passing on to a third party
is also excluded.11 As Ubu is an online archive that makes all of its records
publicly accessible and, not least, also provides templates for further
copying, it exceeds the notion of a “private copy” by far. Regarding further
fair use provisions, the four factors that are considered in a decision-making
process in US copyright provisions, for instance, refer to: 1) the purpose and
character of the use, including whether such use is of a commercial nature or
is for non-profit educational purposes; 2) the nature of the copyrighted work;
3) the amount and substantiality of the portion used in relation to the
copyrighted work as a whole; and 4) the effect of the use upon the potential
market for or value of the copyrighted work (US Copyright Act, 1976, 17 U.S.C.
§107, online, n.pag.). Applying these fair use provisions to Ubu, one might
consider that the main purposes of the archive relate to education and
research, that it is by its very nature non-commercial, and it largely does
not collide with any third party business interests as most of the material is
not commercially available. However, proving this in detail would be quite an
endeavor. And what complicates matters even more is that the archival material
largely consists of original works of art, which are subject to strict
copyright law protection, that all the works have been copied without any
transformative or commenting intention, and last but not least, that the
aspect of the appropriateness of the amount of used material becomes absurd
with reference to an archive whose quality largely depends on
comprehensiveness: the more the merrier. As Simon Stokes points out, legally
binding decisions can only be made on a case-by-case basis, which is why it is
difficult to make a general evaluation of Ubu’s legal situation.12 The ethical
defense tends to invoke the cultural value of the archive as a whole and its
invaluable contribution to cultural memory, while the legal situation does not
consider the value of the project as a whole and necessitates breaking it down
into all the individual items within the collection.

This very brief, if not abridged, discussion of the possibilities of fair use
already demonstrates how complex it would be to apply them to Ubu. How
pointless it would be to attempt a serious legal discussion for such a
privately run archive becomes even clearer when looking at the problems public
libraries and archives have to face. While in theory such official
institutions may even have a public mission to collect, preserve, and archive
digital material, in practice, copyright law largely prevents the execution of
this task, as Steinhauer explains.13 The legal expert introduces the example
of the German National Library, which in 2006 was assigned the task of making
back-up copies of all websites published under the .de top-level domain, a
task that turned out to be illegal.14 Identifying a deficient legal situation
when it comes to collecting, archiving, and providing access to digital
cultural goods, Steinhauer even speaks of a “legal obligation to amnesia”.15
And it is
particularly striking that, from a legal perspective, the collecting of
digitalia is more strictly regulated than the collecting of books, for
example, where the property status of the material object comes into play.
Given the imbalance between cultural requirements, copyright law, and the
technical possibilities, it is not surprising that private initiatives are
being founded with the aim to collect and preserve cultural memory. These
initiatives make use of the affordability and availability of digital
technology and its infrastructures, and they take responsibility for the
preservation of cultural goods by simply ignoring copyright-induced
restrictions, i.e. opposing the insatiable hunger of the IP regime for
control.

_Shadow Libraries_

Ubu was presented and discussed in Athens at an event titled _Shadow
Libraries: UbuWeb in Athens_, thereby making clear reference to the ecosystem
of shadow libraries. A library, in general, is an institution that collects,
orders, and makes published information available while taking into account
archival, economic, and synoptic aspects. A shadow library does exactly the
same thing, but its mission is not an official one. Usually, the
infrastructure of shadow libraries is conceived, built, and run by a private
initiative, an individual, or a small group of people, who often prefer to
remain anonymous for obvious reasons. In terms of the media content provided,
most shadow libraries are peer-produced in the sense that they are based on
the contributions of a community of supporters, sometimes referred to as
“amateur librarians”. The two key attributes of any proper library, according
to Amsterdam-based media scholar Bodo Balazs, are the catalog and the
community: “The catalogue does not just organize the knowledge stored in the
collection; it is not just a tool of searching and browsing. It is a critical
component in the organisation of the community of librarians who preserve and
nourish the collection.”16 What is specific about shadow libraries, however,
is the fact that they make available anything their contributors consider to
be relevant—regardless of its legal status. That is to say, shadow libraries
also provide unauthorized access to copyrighted publications, and they make
the material available for download without charge and without any other
restrictions. And because there is a whole network of shadow libraries whose
mission is “to remove all barriers in the way of science,”17 experts speak of
an ecosystem fostering free and universal access to knowledge.

The notion of the shadow library enjoyed popularity in the early 2000s when
the wide availability of digital networked media contributed to the emergence
of large-scale repositories of scientific materials, the most famous one
having been Gigapedia, which later transformed into library.nu. This project
was famous for hosting approximately 400,000 (scientific) books and journal
articles but had to be shut down in 2012 as a consequence of a series of
injunctions from powerful publishing houses. The now leading shadow library in
the field, Library Genesis (LibGen), can be considered as its even more
influential successor. As of November 2016 the database contained 25 million
documents (42 terabytes), of which 2.1 million were books, with digital copies
of scientific articles published in 27,134 journals by 1342 publishers.18 The
large majority of the digital material is of a scientific and educational
nature (95%), while only 5% serves recreational purposes.19 The repository is based
on various ways of crowd-sourcing, i.e. social and technical forms of
accessing and sharing academic publications. Despite a number of legal cases
and court orders, the site is still available under various and changing
domain names.20

The related project Sci-Hub is an online service that processes requests for
pay-walled articles by providing systematic, automated, but unauthorized
backdoor access to proprietary scholarly journal databases. Users requesting
papers not present in LibGen are advised to download them through Sci-Hub; the
respective PDF files are served to users and automatically added to LibGen (if
not already present). According to _Nature_ magazine, Sci-Hub hosts around 60
million academic papers and was able to serve 75 million downloads in 2016. On
a daily basis 70,000 users access approximately 200,000 articles.

The founder of the meta library Sci-Hub is Kazakh programmer Alexandra
Elbakyan, who has been sued by large publishing houses and was twice ordered
to pay almost US$20 million in compensation for the losses her activities
allegedly caused, which is why she had to go underground in Russia. For
illegally leaking millions of documents the _New York Times_ compared her to
Edward Snowden in 2016: “While she didn’t reveal state secrets, she took a
stand for the public’s right to know by providing free online access to just
about every scientific paper ever published, ranging from acoustics to
zymology.”21 In the same year the prestigious _Nature_ magazine named her
one of the ten most influential people in science.22 Unlike other
persecuted people, she went on the offensive and started to explain her
actions and motives in court documents and blog posts. Sci-Hub encourages new
ways of distributing knowledge, beyond any commercial interests. It provides a
radically open infrastructure, thus creating an inviting atmosphere. “It is a
knowledge infrastructure that can be freely accessed, used and built upon by
anyone.”23

As both projects, LibGen and Sci-Hub, are based in post-Soviet countries,
Balazs reconstructs the history and spirit of Russian reading culture and
relates the two projects to it.24 Interestingly, the author also establishes a
connection to the Kolhoz (Russian: колхо́з), an early Soviet collective farm
model that was self-governing, community-owned, and a collaborative
enterprise, which he considers to be a major inspiration for the digital
librarians. He also identifies parallels between this Kolhoz model and the
notion of the “commons”—a concept that will be discussed in more detail with
regard to shadow libraries further below.

According to Balazs, these sorts of libraries and collections are part of the
Guerilla Open Access movement (GOA) and thus practical manifestations of Aaron
Swartz’s “Guerilla Open Access Manifesto”.25 In this manifesto the American
hacker and activist pointed out the flaws of open access politics and aimed at
recruiting supporters for the idea of “radical” open access. Radical in this
context means to completely ignore copyright and simply make as much
information available as possible. “Information is power” is how the manifesto
begins. Basically, it addresses those whom Swartz calls the “privileged”, in
the sense that they have access to information as academic staff or
librarians, and it calls on them to support the building of a system of freely
available information by using their privilege, downloading and making
information available. Swartz and Elbakyan have both become the “iconic leaders”26 of a
global movement that fights for scientific knowledge to be(come) freely
accessible and whose protagonists usually prefer to operate unrecognized.
While their particular projects may be of a more or less temporary nature, the
discursive value of the work of the “amateur librarians” and their projects
will have a lasting impact on the development of access politics.

_Cultural and Knowledge Commons_

The above discussion illustrates that the phenomenon of shadow libraries
cannot be reduced to its copyright infringing aspects. It needs to be
contextualized within a larger sociopolitical debate that situates the demand
for free and unrestricted access to knowledge within the struggle against the
all-co-opting logic of capital, which currently aims to economize all aspects
of life.

In his analysis of the Russian shadow libraries Balazs has drawn a parallel to
the commons as an alternative mode of ownership and a collective way of
dealing with resources. The growing interest in the discourses around the
commons demonstrates the urgency and timeliness of this concept. The
structural definition of the commons conceived by political economist Massimo
de Angelis allows for its application in diverse fields: “Commons are social
systems in which resources are pooled by a community of people who also govern
these resources to guarantee the latter’s sustainability (if they are natural
resources) and the reproduction of the community. These people engage in
‘commoning,’ that is a form of social labour that bears a direct relation to
the needs of the people, or the commoners”.27 While the model originates in
historical ways of sharing natural resources, it has gained new momentum in
relation to very different resources, thus constituting a third paradigm of
production—beyond state and private—however, with all commoning activities
today still being embedded in the surrounding economic system.

As a reason for the newly aroused interest in the commons, de Angelis cites
the crisis of global capital, which has maneuvered itself into a systemic
impasse. While constantly expanding through its inherent logic of growth and
accumulation, it is the very same logic that destroys the two systems capital
relies on: non-market-shaped social reproduction and the ecological system.
Within this scenario de Angelis describes capital as being in need of the
commons as a “fix” for the most urgent systemic failures: “It needs a ‘commons
fix,’ especially in order to deal with the devastation of the social fabric as
a result of the current crisis of reproduction. Since neoliberalism is not
about to give up its management of the world, it will most likely have to ask
the commons to help manage the devastation it creates. And this means: if the
commons are not there, capital will have to promote them somehow.”28

This rather surprising entanglement of capital and the commons, however, is
not the only perspective. Commons, at the same time, have the potential to
create “a social basis for alternative ways of articulating social production,
independent from capital and its prerogatives. Indeed, today it is difficult
to conceive emancipation from capital—and achieving new solutions to the
demands of _buen vivir_, social and ecological justice—without at the same
time organizing on the terrain of commons, the non-commodified systems of
social production. Commons are not just a ‘third way’ beyond state and market
failures; they are a vehicle for emerging communities of struggle to claim
ownership to their own conditions of life and reproduction.”29 It is their
purpose to satisfy people’s basic needs and empower them by providing access
to alternative means of subsistence. In that sense, commons can be understood
as an _experimental zone_ in which participants can learn to negotiate
responsibilities, social relations, and peer-based means of production.

_Art and Commons_

Projects such as UbuWeb, Monoskop,30 aaaaarg,31 Memory of the World,32 and
0xdb33 vary in size, they have different forms of organization and foci, but
they all care for specific cultural goods and make sure these goods remain
widely accessible—be it digital copies of artworks and original documents,
books and other text formats, videos, film, or sound and music. Unlike the
large shadow libraries introduced above, which aim to provide access to
hundreds of thousands, if not millions of mainly academic papers and books,
thus trying to fully cover the world of scholarly and academic works, the
smaller artist-run projects are of a different nature. While UbuWeb’s founder,
for instance, also promotes a generally unrestricted access to cultural goods,
his approach with UbuWeb is to build a curated archive with copies of artworks
that he considers to be relevant for his very context.34 The selection is
based on personal assessment and preference and cared for affectionately.
Despite its comprehensiveness, it still can be considered a “personal website”
on which the artist shares things relevant to him. As such, he is in good
company with similar “artist-run shadow libraries”, which all provide a
technical infrastructure with which they share resources, while the resources
are of specific relevance to their providers.

Just like the large pirate libraries, these artistic archiving and library
practices challenge the notion of culture as private property and remind us
that it is not an unquestionable absolute. As Jonathan Lethem contends,
“[culture] rather is a social negotiation, tenuously forged, endlessly
revised, and imperfect in its every incarnation.”35 Shadow libraries, in
general, are symptomatic of the cultural battles and absurdities around access
and copyright within an economic logic that artificially tries to limit the
abundance of digital culture, in which sharing does not mean dividing but
rather multiplying. They have become a cultural force, one that can be
represented in Foucauldian terms, as symptomatic of broader power struggles as
well as systemic failures inherent in the cultural formation. As Marczewska
puts it, “Goldsmith moves away from thinking about models of cultural
production in proprietary terms and toward paradigms of creativity based on a
culture of collecting, organizing, curating, and sharing content.”36 And by
doing so, he produces major contradictions, or rather he allows the already
existing contradictions to come to light. The artistic archives and libraries
are precarious in terms of their legal status, while it is exactly due to
their disregard of copyright that cultural resources could be built that
exceed the relevance of most official archives that are bound to abide by the
law. In fact, there are no comparable official resources, which is why the
function of these projects is at least twofold: education and preservation.37

Maybe UbuWeb and the other, smaller or larger, shadow libraries do not qualify
as commons in the strict sense of involving not only a non-market exchange of
goods but also a community of commoners who negotiate the terms of use among
themselves. This would require collective, formalized, and transparent types
of organization. Furthermore, most of the digital items they circulate are
privately owned and therefore cannot simply be transferred to become commons
resources. These projects, in many respects, are in a preliminary stage by
pointing to the _ideal of culture as a commons_. By providing access to
cultural goods and knowledge that would otherwise not be available at all or
inaccessible for large parts of the general public, they might even fulfill
the function of a “commons fix”, to a certain degree, but at the same time
they are the experimental zone needed to unlearn copyright and relearn new
ways of cultural production and dissemination beyond the property regime. In
any case, they can function as perfect entry points for the discussion and
investigation of the transformative force art can have within the current
global neoliberal knowledge society.

_Top Tens—Showcasing the Copy as an Aesthetic and Political Statement_

The exhibition _Top Tens_ provided an experimental setting to explore the
possibilities of translating the abundance of a digital archive into a “real
space”, by presenting one hundred artworks from the Ubu archive.38 Although
all works were properly attributed in the exhibition, the artists whose works
were shown neither had a say about their participation in the exhibition nor
about the display formats. Tolerating the presence of a work in the archive is
one thing; tolerating its display in such circumstances is something else,
which might even touch upon moral rights and the integrity of the work.
However, the exhibition was not so much about the individual works on display
but the archiving condition they are subject to. So the discussion here has
nothing to do with the abiding art theory question of original and copy.
Marginally, it is about the question of high-quality versus low-quality
copies. In reproducible media the value of an artwork cannot be based on its
originality any longer—the core criterion for sales and market value. This is
why many artists use the trick of high-resolution and limited edition, a kind
of distributed originality status for several authorized objects, which all
are not 100 percent original but still a bit more original than an arbitrary
unlimited edition. Leaving this whole discussion aside was a clear indication
that something else was at stake. The conceptual statement made by the
exhibition and its makers foregrounded the nature of the shadow library, which
visitors were able to experience when entering the gallery space. Instead of
viewing the artworks in the usual way—online—they had the opportunity to
physically immerse themselves in the cultural condition of proliferated acts
of copying, something that “affords their reconceptualization as a hybrid
creative-critical tool and an influential aesthetic category.”39

Appropriation and copying as longstanding methods of subversive artistic
production, where the reuse of existing material serves as a tool for
commentary, social critique, and a means of making a political statement, have
expanded here to the art of exhibition-making. The individual works serve to
illustrate a curatorial concept, thus radically shifting the avant-garde
gesture which copying used to be in the twentieth century, to breathe new life
into the “culture of collecting, organizing, curating, and sharing content.”
Organizing this conceptually concise exhibition was a brave and bold statement
by the art institution: The Onassis Cultural Centre, one of Athens’ most
prestigious cultural institutions, dared to adopt a resolutely political
stance for a—at least in juridical terms—questionable project, as Ubu lives
from the persistent denial of copyright. Neglecting the concerns of the
individual authors and artists for a moment was a necessary precondition in
order to make space for rethinking the future of cultural production.

________________
Special thanks to Eric Steinhauer and all the artists and amateur librarians
who are taking care of our cultural memory.

1 Festival program online: Onassis Cultural Centre, “Shadow Libraries: UbuWeb
in Athens,” (accessed on Sept. 30, 2018).
2 _UbuWeb_ is a massive online archive of avant-garde art created over the
last two decades by New York-based artist and writer Kenneth Goldsmith.
Website of the archive: (accessed on Sept. 30, 2018).
3 Kaja Marczewska, _This Is Not a Copy. Writing at the Iterative Turn_ (New
York: Bloomsbury Academic, 2018), 22.
4 For further reading: Kenneth Goldsmith, _Uncreative Writing: Managing
Language in the Digital Age_ (New York: Columbia University Press, 2011).
5 Many works in the archive stem from the pre-digital era, and there is no
precise knowledge of the sources where Ubu obtains its material, but it is
known that Goldsmith also digitizes a lot of material himself.
6 In German copyright law, for example, §17 and §19a grant the exclusive right
to reproduce, distribute, and make available online to the author. See also:
(accessed on Sept. 30,
2018).
7 Eric Steinhauer, “Rechtspflicht zur Amnesie: Digitale Inhalte, Archive und
Urheberrecht,” _iRightsInfo_ (2013), /rechtspflicht-zur-amnesie-digitale-inhalte-archive-und-urheberrecht/18101>
(accessed on Sept. 30, 2018).
8 In particularly severe cases of copyright infringement state prosecutors
can also become active, which in practice, however, remains the
exception. The circumstances in which criminal law must be applied are
described in §109 of German copyright law.
9 See, for example, “Shadow Libraries” for a video interview with Kenneth
Goldsmith.
10 Paul Torremans, _Intellectual Property Law_ (Oxford: Oxford University
Press, 2010), 265.
11 See also §53 para. 1–3 of the German Act on Copyright and Related Rights
(UrhG), §42 para. 4 in the Austrian UrhG, and Article 19 of Swiss Copyright
Law.
12 Simon Stokes, _Art & Copyright_ (Oxford: Hart Publishing, 2003).
13 Steinhauer, “Rechtspflicht zur Amnesie”.
14 This discrepancy between a state mandate for cultural preservation and
copyright law was only fixed in 2018 with the introduction of a special
law, §16a DNBG.
15 Steinhauer, “Rechtspflicht zur Amnesie”.
16 Bodo Balazs, “The Genesis of Library Genesis: The Birth of a Global
Scholarly Shadow Library,” Nov. 4, 2014, _SSRN_ , (accessed on
Sept. 30, 2018).
17 Motto of Sci-Hub: “Sci-Hub,” _Wikipedia_ , /Sci-Hub> (accessed on Sept. 30, 2018).
18 Guillaume Cabanac, “Bibliogifts in LibGen? A study of a text-sharing
platform driven by biblioleaks and crowdsourcing,” _Journal of the Association
for Information Science and Technology_ , 67, 4 (2016): 874–884.
19 Ibid.
20 The current address is (accessed on Sept. 30, 2018).
21 Kate Murphy, “Should All Research Papers Be Free?” _New York Times Sunday
Review_ , Mar. 12, 2016, /should-all-research-papers-be-free.html> (accessed on Sept. 30, 2018).
22 Richard Van Noorden, “Nature’s 10,” _Nature_ , Dec. 19, 2016,
(accessed on Sept. 30,
2018).
23 Bodo Balazs, “Pirates in the library – an inquiry into the guerilla open
access movement,” paper for the 8th Annual Workshop of the International
Society for the History and Theory of Intellectual Property, CREATe,
University of Glasgow, UK, July 6–8, 2016. Available online at:
https://adrien-chopin.weebly.com/uploads/2/1/7/6/21765614/2016_bodo_-_pirates.pdf
(accessed on Sept. 30, 2018).
24 Balazs, “The Genesis of Library Genesis”.
25 Aaron Swartz, “Guerilla Open Access Manifesto,” _Internet Archive_ , July
2008, (accessed on Sept. 30, 2018).
26 Balazs, “Pirates in the library”.
27 Massimo De Angelis, “Economy, Capital and the Commons,” in: _Art,
Production and the Subject in the 21st Century_ , eds. Angela Dimitrakaki and
Kirsten Lloyd (Liverpool: Liverpool University Press, 2015), 201.
28 Ibid., 211.
29 Ibid.
30 See: (accessed on Sept. 30, 2018).
31 Accessible with invitation. See:
[https://aaaaarg.fail/](https://aaaaarg.fail) (accessed on Sept. 30, 2018).
32 See: (accessed on Sept. 30, 2018).
33 See: (accessed on Sept. 30, 2018).
34 Kenneth Goldsmith in conversation with Cornelia Sollfrank, _The Poetry of
Archiving_ , 2013, (accessed on Sept. 30, 2018).
35 Jonathan Lethem, _The Ecstasy of Influence: Nonfictions, etc._ (London:
Vintage, 2012), 101.
36 Marczewska, _This Is Not a Copy_ , 2.
37 The research project _Creating Commons_ , based at Zurich University of the
Arts, is dedicated to the potential of art projects for the creation of
commons: “creating commons,” (accessed on
Sept. 30, 2018).
38 One of Ubu’s online features has been the “top ten,” the idea of inviting
guests to pick their ten favorite works from the archive and thus introduce a
mix between chance operation and subjectivity in order to reveal hidden
treasures. The curators of the festival in Athens, Ilan Manouach and Kenneth
Goldsmith, decided to elevate this principle to the curatorial concept of the
exhibition and invited ten guests to select their ten favorite works. The
Athens-based curator Elpida Karaba was commissioned to work on an adequate
concept for the realization, which turned out to be a huge black box divided
into ten small cubicles with monitors and seating areas, supplemented by a
large wall projection illuminating the whole space.
39 Marczewska, _This Is Not a Copy_ , 7.

This text is under a _Creative Commons_ license: CC BY NC SA 3.0 Austria
