Medak, Sekulic & Mertens
Book Scanning and Post-Processing Manual Based on Public Library Overhead Scanner v1.2
2014


PUBLIC LIBRARY
&
MULTIMEDIA INSTITUTE

BOOK SCANNING & POST-PROCESSING MANUAL
BASED ON PUBLIC LIBRARY OVERHEAD SCANNER

Written by:
Tomislav Medak
Dubravka Sekulić
With help of:
An Mertens

Creative Commons Attribution - Share-Alike 3.0 Germany

TABLE OF CONTENTS

Introduction
I. Photographing a printed book
II. Getting the image files ready for post-processing
III. Transformation of source images into .tiffs
IV. Optical character recognition
V. Creating a finalized e-book file
VI. Cataloging and sharing the e-book
Quick workflow reference for scanning and post-processing
References

INTRODUCTION:
BOOK SCANNING - FROM PAPER BOOK TO E-BOOK
Initial considerations when deciding on a scanning setup
Book scanning tends to be a fragile and demanding process. Many things can go wrong or produce
results of varying quality from book to book or page to page, and resolving the issues that occur
requires experience or technical skill. Cameras can fail to trigger, components can fail to
communicate, files can get corrupted in transfer, storage cards may not get purged, focus can fail
to lock, and lighting conditions can change. There is a trade-off between automation, which is prone
to instability, and robustness, which tends to be time-consuming.
Your initial choice of book scanning setup will have to take these trade-offs into consideration. If
your scanning community is confined to your hacklab, you won't be risking much if technological
sophistication and integration fail to function smoothly. But if you're aiming at a broad community
of users, with varying levels of technological skill and patience, you want to create as much
time-saving automation as possible on the condition of keeping maximum stability. Furthermore, if the
time that individual members of your scanning community can contribute is limited, you might also
want to divide some of the tasks between users according to their different skill levels.
This manual breaks down the process of digitization into a general description of the steps in the
workflow leading from the printed book to a digital e-book. In a concrete situation, each step can
be addressed in various ways depending on the scanning equipment, software, hacking skills and user
skill level that are available to your book scanning project. Several of those steps can be handled
by a single piece of equipment or software, or you might need to use a number of them - your mileage
will vary. Therefore, the manual will try to indicate the design choices you have in the process of
planning your workflow and should help you decide which design is best for your situation.
Introducing book scanner designs
Book scanning starts with the capturing of digital image files on the scanning equipment. There
are three principal types of book scanner designs:
- flatbed scanner
- single camera overhead scanner
- dual camera overhead scanner
Conventional flatbed scanners are widely available. However, given that they require the book to be
spread wide open and pressed down against the platen in order to break the resistance of the book
binding and sufficiently expose the inner margin of the text, they are the most destructive approach
for the book, as well as imprecise and slow.
Therefore, book scanning projects across the globe have taken to custom-designing improvised
setups or scanner rigs that are less destructive and better suited to fast turning and capturing of
pages. Designs abound. Most include:
- one or two digital photo cameras of lower or higher quality to capture the pages,
- a transparent V-shaped glass or Plexiglas platen to press the open book against a V-shaped
cradle, and
- a light source.

The go-to web resource to help you make an informed decision is the DIY book scanning
community at http://diybookscanner.org. A good place to start is their intro
(http://wiki.diybookscanner.org/) and scanner build list
(http://wiki.diybookscanner.org/scanner-build-list).
Book scanners with a single camera are substantially cheaper, but come with the added difficulty
of de-warping the page images, which are distorted by the angle at which the pages are photographed
and can sometimes be difficult to correct in post-processing. Hence, in this introductory chapter
we'll focus on two-camera designs where the camera lens stands roughly parallel to the page. However,
with a bit of adaptation these instructions can be used to work with any other setup.
The Public Library scanner
The focus of this manual is the scanner built for the Public Library project, designed by Voja
Antonić (see Illustration 1). The Public Library scanner was built with immediate use by a wide
community of users in mind. Hence, the principal consideration in designing the Public Library
scanner was less sophistication and more robustness, ease of use and a distributed process of
editing.
The board designs can be found here:
http://www.memoryoftheworld.org/blog/2012/10/28/our-beloved-bookscanner. The current iterations use
two Canon 1100 D cameras with the kit lens Canon EF-S 18-55mm 1:3.5-5.6 IS. The cameras are
auto-charging.

Illustration 1: Public Library Scanner
The scanner operates by automatically lowering the Plexiglas platen, illuminating the page and then
triggering camera shutters. The turning of pages and the adjustments of the V-shaped cradle holding
the book are manual.
The scanner is operated by a two-button controller (see Illustration 2). The upper, smaller button
breaks the capture process into two steps: the first click lowers the platen, increases the light
level and allows you to adjust the book or the cradle; the second click triggers the cameras and
lifts the platen.
The lower, larger button has two modes. A quick click will execute the whole capture process in one
go. But if you hold it pressed longer, it will lower the platen, allowing you to adjust the book and
the cradle, and lift it without triggering the cameras when you press again.

Illustration 2: A two-button controller

More on this manual: steps in the book scanning process
The book scanning process in general can be broken down into six steps, each of which will be dealt
with in a separate chapter of this manual:
I. Photographing a printed book
II. Getting the image files ready for post-processing
III. Transformation of source images into .tiffs
IV. Optical character recognition
V. Creating a finalized e-book file
VI. Cataloging and sharing the e-book
A step by step manual for Public Library scanner
This manual is primarily meant to provide a detailed description and step-by-step instructions for an
actual book scanning setup -- based on Voja Antonić's scanner design described above. This is a
two-camera overhead scanner, currently equipped with two Canon 1100 D cameras with the
EF-S 18-55mm 1:3.5-5.6 IS kit lens. It can scan books of up to A4 page size.
The post-processing in this setup is based on a semi-automated transfer of files to a GNU/Linux
personal computer and on the use of free software for image editing, optical character recognition
and finalization of an e-book file. It was initially developed for the HAIP festival in Ljubljana in
2011 and perfected later at MaMa in Zagreb and Leuphana University in Lüneburg.
The Public Library scanner is characterized by a somewhat less automated yet distributed scanning
process compared to the highly automated and sophisticated scanner hacks developed at various
hacklabs. A brief overview of one such scanner, developed at the Hacker Space Bruxelles, is also
included in this manual.
The Public Library scanning process thus proceeds in the following discrete steps:

1. creating digital images of pages of a book,
2. manual transfer of image files to the computer for post-processing,
3. automated renaming of files, ordering of even and odd pages, rotation of images and upload to a
cloud storage,
4. manual transformation of source images into .tiff files in ScanTailor,
5. manual optical character recognition and creation of PDF files in gscan2pdf.
The detailed description of the Public Library scanning process follows below.
The Bruxelles hacklab scanning process
For purposes of comparison, here we'll briefly reference the scanner built by the Bruxelles hacklab
(http://hackerspace.be/ScanBot). It is a dual camera design too. Apart from some differences in hardware functionality
(the Bruxelles scanner turns pages automatically, whereas the Public Library scanner requires manual turning of pages), the
fundamental difference between the two is in the post-processing - the level of automation in the transfer of images
from the cameras and their transformation into PDF or DjVu e-book format.
The Bruxelles scanning process is different in so far as the cameras are operated by a computer and the images are
automatically transferred, ordered and made ready for further post-processing. The scanner is home-brew, but the
process is for advanced DIY'ers. If you want to know more on the design of the scanner, contact Michael Korntheuer at
contact@hackerspace.be.
The scanning and post-processing is automated by a single Python script that does all the work:
http://git.constantvzw.org/?p=algolit.git;a=tree;f=scanbot_brussel;h=81facf5cb106a8e4c2a76c048694a3043b158d62;hb=HEAD
The scanner uses two Canon point-and-shoot cameras. Both cameras are connected to the PC via USB, and both run
PTP/CHDK (Canon Hack Development Kit). The scanning sequence is the following:
1. Script sends CHDK command line instructions to the cameras
2. Script sorts out the incoming files. This part is tricky. There is no reliable way to make a distinction between the left
and right camera, only between which camera was recognized by USB first. So the protocol is to always power up the
left camera first. See the instructions with the source code.
3. Collect images in a PDF file
4. Run script to OCR a .PDF file to plain .TXT file:
http://git.constantvzw.org/?p=algolit.git;a=blob;f=scanbot_brussel/ocr_pdf.sh;h=2c1f24f9afcce03520304215951c65f58c0b880c;hb=HEAD

I. PHOTOGRAPHING A PRINTED BOOK
Technologically, the most demanding part of the scanning process is creating digital images of the
pages of a printed book. It's a process that differs greatly from scanner design to scanner design
and from camera to camera. Therefore, here we will focus strictly on the process with the Public
Library scanner.
Operating the Public Library scanner
0. Before you start:
Better and more consistent photographs lead to a more optimized and faster post-processing and a
higher quality of the resulting digital e-book. In order to guarantee the quality of images, before you
start it is necessary to set up the cameras properly and prepare the printed book for scanning.
a) Loosening the book
Depending on the type and quality of binding, some books tend to be too resistant to opening fully
to reveal the inner margin under the pressure of the scanner platen. It is thus necessary to “break in”
the book before starting in order to loosen the binding. The best way is to open it as wide as
possible in multiple places in the book. This can be done against the table edge if the book is more
rigid than usual. (Warning – “breaking in” might create irreversible creasing of the spine or lead to
some pages breaking loose.)
b) Switch on the scanner
You start the scanner by pressing the main switch or plugging the power cable into the scanner.
This will also turn on the overhead LED lights.

c) Setting up the cameras
Place the cameras onto tripods. You need to move the lever on the tripod's head to allow the tripod
plate screwed to the bottom of the camera to slide into its place. Secure the lock by turning the lever
all the way back.
If automatic chargers for the cameras are provided, open the battery lid on the bottom of the
camera and plug in the automatic charger. Close the lid.
Switch on the cameras using the lever on the top right side of the camera's body and set them to
aperture priority (Av) mode on the mode dial above the lever (see Illustration 3). Use the main dial
just above the shutter button on the front side of the camera to set the aperture value to F8.0.

Illustration 3: Mode and main dial, focus mode switch, zoom
and focus ring
On the lens, turn the focus mode switch to manual (MF) and turn the large zoom ring to set the value
exactly midway between 24 and 35 mm (see Illustration 3). Try to set both cameras the same.
To focus each camera, open a book on the cradle, lower the platen by holding the big button on the
controller, and turn on the live view on the camera LCD by pressing the live view switch (see
Illustration 4). Now press the magnification button twice and use the focus ring on the front of the
lens to get a clear view of the image.

Illustration 4: Live view switch and magnification button

d) Connecting the cameras
Now connect the cameras to the remote shutter trigger cables that can be found lying on each side
of the scanner. They need to be plugged into a small round port hidden behind a protective rubber
cover on the left side of the cameras.
e) Placing the book into the cradle and double-checking the cameras
Open the book in the middle and place it on the cradle. Hold down the large button on the
controller to lower the Plexiglas platen without triggering the cameras. Move the cradle so that the
tip of the platen fits into the middle of the book.
Turn on the live view on the cameras' LCD to see if the pages fit into the image and if the
cameras are positioned parallel to the page.
f) Double-check storage cards and batteries
It is important that both storage cards in the cameras are empty before you start scanning, so as
not to mess up the page sequence when merging photos from the left and the right camera in
post-processing. To double-check, press the play button on each camera. If there are photos left
from a previous scan, erase them by pressing the menu button, selecting the fifth menu from the left
and then selecting 'Erase Images' -> 'All images on card' -> 'OK'.
If no automatic chargers are provided, double-check on the information screen that the batteries are
charged. They should be fully charged before you start scanning a new book.

g) Turn off the light in the room
Lighting conditions during scanning should be as constant as possible. To reduce glare and achieve
maximum quality, remove any source of light that might reflect off the Plexiglas platen. Preferably
turn off the light in the room or isolate the scanner with the black cloth provided.

1. Photographing a book
Now you are ready to start scanning. Place the book closed in the cradle and lower the platen by
holding down the large button on the controller (see Illustration 2). Adjust the position of the
cradle and lift the platen by pressing the large button again.
To scan, you can either use the small button on the controller to lower the platen, adjust the book
and then press it again to trigger the cameras and lift the platen, or you can just make a short
press on the large button to do it all in one go.
ATTENTION: When the cameras are triggered, the shutter sound has to be heard coming
from both cameras. If one camera is not working, it's best to reconnect both cameras (see
Section 0), make sure the batteries are charged or adapters are connected, erase all images
and restart.
A mistake made in the photographing requires a lot of work in the post-processing, so it's
much quicker to repeat the photographing process.
If you make a mistake while flipping pages, or any other mistake, go back and scan from the page
you missed or incorrectly scanned. Note down the page where the error occurred; the redundant images
will be removed in the post-processing.
ADVICE: The scanner has a digital counter. By turning the dial forward and backward, you
can set it to tell you what page you should be scanning next. This should help you avoid
missing a page due to a distraction.
While scanning, move the cradle a bit to the left from time to time, making sure that the tip of the
V-shaped platen is aligned with the center of the book and the inner margin is exposed enough.

II. GETTING THE IMAGE FILES READY FOR POST-PROCESSING
Once the book pages have been photographed, the images have to be transferred to the computer and
prepared for post-processing. With two-camera scanners, the capturing process will result in two
separate sets of images -- odd and even pages -- coming from the left and right cameras respectively
-- and you will need to rename and reorder them accordingly, rotate them into a vertical position
and collate them into a single sequence of files.
a) Transferring image files
For the transfer of files, your principal design choices are either to remove the memory cards from
the cameras and copy the files to the computer via a card reader, or to transfer them via a USB
cable. The latter process can be automated by remotely operating your cameras from a computer;
however, this can be done only with a certain number of Canon cameras (http://bit.ly/16xhJ6b) that
can be hacked to run the open Canon Hack Development Kit firmware (http://chdk.wikia.com).
After transferring the files, you want to erase all the image files on the camera memory card, so that
they do not end up messing up the scan of the next book.
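Before erasing the cards, it can be worth checking that both cameras delivered the same number of
frames, since a camera that failed to trigger even once will shift every later page. What follows is
only a minimal Python sketch, assuming the two cards were copied into two separate folders; the
function names and the folder layout are illustrative and not part of the Public Library tooling.

```python
from pathlib import Path


def count_jpegs(folder):
    """Count .jpg/.jpeg files (case-insensitive) directly inside folder."""
    return sum(
        1
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in {".jpg", ".jpeg"}
    )


def cameras_in_sync(right_folder, left_folder):
    """Return (ok, right_count, left_count) for the two camera folders."""
    right = count_jpegs(right_folder)
    left = count_jpegs(left_folder)
    return right == left, right, left
```

If the counts differ, it is usually quicker to find the missing frame now than after the two sets
have been merged.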
b) Renaming image files
As the left and right camera are typically operated in sync, the photographing process results in two
separate sets of images, with even and odd pages respectively, that have completely different file
names and potentially the same time stamps. So before you collate the page images in the order in
which they appear in the book, you want to rename the files so that the first image comes from the
right camera, the second from the left camera, the third again from the right camera, and so on.
You probably want to do a batch renaming, where your right camera files start with n and increase
in steps of 2 (e.g. page_0000.jpg, page_0002.jpg, ...) and your left camera files start with
n+1 and also increase in steps of 2 (e.g. page_0001.jpg, page_0003.jpg, ...).
Batch renaming can be done from your file manager, on the command line or with a number
of GUI applications (e.g. GPrename, rename, cuteRenamer on GNU/Linux).
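The interleaving scheme described above can be sketched in a few lines of Python. This is an
illustration only, not the script used on the Public Library scanner: the function name is made up,
and it assumes that sorting each camera's file names alphabetically reproduces the capture order
(which can fail if the camera's file counter wraps around during a scan).

```python
def interleaved_names(right_files, left_files, start=0):
    """Pair each source file with its target name in reading order.

    Right-camera files get start, start+2, ...; left-camera files get
    start+1, start+3, ... The result is sorted by target name.
    """
    pairs = []
    for i, src in enumerate(sorted(right_files)):
        pairs.append((src, f"page_{start + 2 * i:04d}.jpg"))
    for i, src in enumerate(sorted(left_files)):
        pairs.append((src, f"page_{start + 2 * i + 1:04d}.jpg"))
    return sorted(pairs, key=lambda pair: pair[1])
```

The function only computes the plan. The actual copying or renaming (for instance with
shutil.copy2 or os.rename) is best done after you have looked over the resulting pairs.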
c) Rotating image files
Before you collate the renamed files, you might want to rotate them. This is a step that can be done
also later in the post-processing (see below), but if you are automating or scripting your steps this is
a practical place to do it. The images leaving your cameras will be positioned horizontally. In order
to position them vertically, the images from the camera on the right will have to be rotated by 90
degrees counter-clockwise, the images from the camera on the left will have to be rotated by 90
degrees clockwise.
Batch rotating can be done in a number of photo-processing tools, on the command line or in
dedicated applications (e.g. Fstop, ImageMagick, Nautilus Image Converter on GNU/Linux).
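If you script this step with ImageMagick, the rotation directions above translate into its -rotate
option, where positive angles turn the image clockwise. The Python sketch below only builds the
command lines and does not run them; the function name is illustrative, and it assumes the mogrify
tool from ImageMagick is installed under that name. Note that mogrify overwrites files in place and
re-encodes JPEGs, so you may prefer to work on copies.

```python
def rotation_commands(right_files, left_files):
    """Build ImageMagick mogrify command lines (not executed here).

    Right-camera images are turned 90 degrees counter-clockwise (-90),
    left-camera images 90 degrees clockwise (90).
    """
    commands = []
    for path in right_files:
        commands.append(["mogrify", "-rotate", "-90", path])
    for path in left_files:
        commands.append(["mogrify", "-rotate", "90", path])
    return commands
```

Each command can then be passed to subprocess.run(cmd, check=True) once you are happy with the list.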
d) Collating images into a single batch
Once you're done with the renaming and rotating of the files, you want to collate them into the same
folder for easier manipulation later.
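Once everything sits in one folder, a quick check for gaps in the numbering can catch a frame that
went missing during transfer or renaming. A minimal Python sketch, assuming the page_xxxx.jpg naming
convention from above (the function name is illustrative):

```python
import re

PAGE_RE = re.compile(r"^page_(\d{4})\.jpg$")


def missing_indices(filenames):
    """Return page indices absent between the lowest and highest index found."""
    found = sorted(
        int(m.group(1)) for m in map(PAGE_RE.match, filenames) if m
    )
    if not found:
        return []
    present = set(found)
    return [n for n in range(found[0], found[-1] + 1) if n not in present]
```

File names that do not follow the convention are simply ignored, so the check can be run on a plain
directory listing such as os.listdir(folder).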

Getting the image files ready for post-processing on the Public Library scanner
In the case of the Public Library scanner, a custom C++ script was written by Mislav Stublić to
facilitate the transfer, renaming, rotating and collating of the images from the two cameras.
The script prompts the user to place the memory card from the right camera into the card reader
first, gives a preview of the first and last four images and provides an entry field to create a
sub-folder in a local cloud storage folder (path: /home/user/Copy).
It transfers, renames and rotates the files, deletes them from the card and prompts the user to
replace the card with the one from the left camera in order to transfer the files from there and
place them in the same folder. The script was created for GNU/Linux systems and it can be downloaded,
together with its source code, from: https://copy.com/nLSzflBnjoEB
If you have cameras other than Canon, you can edit line 387 of the source file to match the
naming convention of your cameras, and recompile by running the following command in your
terminal: "gcc scanflow.c -o scanflow -ludev `pkg-config --cflags --libs gtk+-2.0`"
In the case of the Hacker Space Bruxelles scanner, this is handled by the same script that operates the cameras. It can be
downloaded from:
http://git.constantvzw.org/?p=algolit.git;a=tree;f=scanbot_brussel;h=81facf5cb106a8e4c2a76c048694a3043b158d62;hb=HEAD

III. TRANSFORMATION OF SOURCE IMAGES INTO .TIFFS
Images transferred from the cameras are high-resolution full-color images. You want your cameras
to shoot at the largest possible .jpg resolution in order for the resulting files to have at least
300 dpi (A4 at 300 dpi requires a 9.5 megapixel image). In the post-processing the size of the image
files needs to be reduced radically, so that several hundred images can be merged into an e-book
file of a tolerable size.
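The relation between page size, dpi and pixel count is simple arithmetic: each side in inches times
the dpi gives the pixels along that side. The Python sketch below (the function name is
illustrative) works this out for the bare page area. For A4 at 300 dpi it gives roughly 8.7
megapixels, somewhat below the figure quoted above; in practice the photograph also contains some of
the surroundings of the book, so more pixels are needed than the page alone would suggest.

```python
MM_PER_INCH = 25.4


def pixels_for_page(width_mm, height_mm, dpi):
    """Return (width_px, height_px, megapixels) needed to cover a page at dpi."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px, width_px * height_px / 1_000_000
```

Calling pixels_for_page(210, 297, 300) covers the A4 case; the same function can be used to estimate
what dpi a given camera can deliver for smaller book formats.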
Hence, the first step in the post-processing is to crop the images from the cameras down to the
content of the pages. The surroundings of the book that were captured in the photograph and the
white margins of the page will be cropped away, while the printed text will be transformed into black
letters on a white background. The illustrations, however, will need to be preserved in their color
or grayscale form, and mixed with the black and white text. What were initially large .jpg files will
now become relatively small .tiff files that are ready for the optical character recognition (OCR)
process.
These tasks can be completed by a number of software applications. Our manual will focus on one
that can be used across all major operating systems -- ScanTailor. ScanTailor can be downloaded
from: http://scantailor.sourceforge.net/. A more detailed video tutorial of ScanTailor can be found
here: http://vimeo.com/12524529.
ScanTailor: from a photograph of a page to a graphic file ready for OCR
Once you have transferred all the photos from the cameras to the computer, and renamed and rotated
them, they are ready to be processed in ScanTailor.
1) Importing photographs to ScanTailor
- start ScanTailor and open ‘new project’
- for ‘input directory’ choose the folder where you stored the transferred and renamed photo images
- you can leave ‘output directory’ as it is; it will place your resulting .tiffs in an 'out' folder
inside the folder where your .jpg images are
- select all files (if you followed the naming convention above, they will be named
‘page_xxxx.jpg’) in the folder where you stored the transferred photo images, and click 'OK'
- in the dialog box ‘Fix DPI’ click on All Pages, and for DPI choose preferably '600x600', click
'Apply', and then 'OK'
2) Editing pages
2.1 Rotating photos/pages
If you've rotated the photo images in the previous step using the scanflow script, skip this step.
- Rotate the first photo counter-clockwise, click Apply and for scope select ‘Every other page’
followed by 'OK'
- Rotate the following photo clockwise, applying the same procedure as in the previous step
2.2 Deleting redundant photographs/pages
- Remove redundant pages (photographs of the empty cradle at the beginning and the end of the
book scanning sequence; book cover pages if you don’t want them in the final scan; duplicate pages
etc.) by right-clicking on a thumbnail of that page in the preview column on the right side, selecting
‘Remove from project’ and confirming by clicking on ‘Remove’.

NOTE: If you accidentally remove the wrong page, you can re-insert it by right-clicking on the page
before/after the missing page in the sequence, selecting 'insert after/before' (depending on which
page you selected) and choosing the file from the list. Before you finish adding, it is necessary to
go through the procedure of fixing DPI and rotating again.
2.3 Adding missing pages
- If you notice that some pages are missing, you can recapture them with the camera and insert them
manually at this point using the procedure described above under 2.2.
3) Split pages and deskew
The ‘Split pages’ and ‘Deskew’ steps should work automatically. Run them by clicking the ‘Play’ button
under the 'Select content' step. This will perform three steps automatically: splitting of pages,
deskewing and selection of content. After this you can manually re-adjust the splitting of pages and
the deskewing.
4) Selecting content
Step ‘Select content’ works automatically as well, but it is important to revise the resulting selection
manually page by page to make sure the entire content is selected on each page (including the
header and page number). Where necessary, use your pointer device to adjust the content selection.
If the inner margin is cut, go back to 'Split pages' view and manually adjust the selected split area. If
the page is skewed, go back to 'Deskew' and adjust the skew of the page. After this go back to
'Select content' and readjust the selection if necessary.
This is the step where you visually check each page. Make sure all pages are there and that the
selections are as equal in size as possible.
At the bottom of the thumbnail column there is a sort option that can automatically arrange pages by
the height and width of the selected content, making the process of manual selection easier.
Extreme differences in height should be avoided: try to make the selected areas as equal as
possible, particularly in height, across all pages. The exceptions are the cover and back pages,
where we advise selecting the full page.
5) Adjusting margins
For best results, select the content of the full cover and back page in the previous step. Now go to
the 'Margins' step, set Top, Bottom, Left and Right under the Margins section all to 0.0 and do
'Apply to...' → 'All pages'.
In the Alignment section leave 'Match size with other pages' ticked, choose the central positioning
of the page and do 'Apply to...' → 'All pages'.
6) Outputting the .tiffs
Now go to the 'Output' step. Ignore the 'Output Resolution' section.
Next, review two consecutive pages from the middle of the book to see if the scanned text is too
faint or too dark. If it is, use the 'Thinner – Thicker' slider to adjust. Do
'Apply to' → 'All pages'.
Next, go to the cover page, select 'Color / Grayscale' under Mode and tick 'White Margins'.
Do the same for the back page.
If there are any pages with illustrations, you can choose the 'Mixed' mode for those pages and then
adjust the zones of the illustrations under the 'Picture Zones' tab.
Now you are ready to output the files. Just press the 'Play' button under 'Output'. Once the computer
has finished processing the images, do 'File' → 'Save as' and save the project.

IV. OPTICAL CHARACTER RECOGNITION
Before the edited-down graphic files are finalized as an e-book, we want to transform the image of
the text into actual text that can be searched, highlighted, copied and transformed. That
functionality is provided by optical character recognition (OCR). This is a technically difficult
task - dependent on language, script, typeface and quality of print - and there aren't that many OCR
tools that are good at it. There is, however, a relatively good free software solution - Tesseract
(http://code.google.com/p/tesseract-ocr/) - that has solid performance, good language data and can
be trained for even better performance, although it has its problems. Proprietary solutions (e.g.
Abbyy FineReader) sometimes provide superior results.
Tesseract primarily supports .tiff files as its input format. It produces a plain text file that can,
with the help of other tools, be embedded as a separate layer under the original graphic image of
the text in a PDF file.
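For those who prefer the command line, Tesseract is invoked as "tesseract IMAGE OUTPUTBASE -l LANG",
which writes OUTPUTBASE.txt. The Python sketch below only assembles such command lines for a batch
of .tiff files; the function name is illustrative, and the optional trailing "pdf" argument assumes
a Tesseract version recent enough to ship the pdf output config (3.03 or later), which writes a
searchable PDF instead of a text file.

```python
import os


def tesseract_commands(tiff_files, lang="eng", searchable_pdf=False):
    """Build one tesseract command line per input image (not executed here)."""
    commands = []
    for tiff in tiff_files:
        base = os.path.splitext(tiff)[0]  # tesseract appends the extension itself
        cmd = ["tesseract", tiff, base, "-l", lang]
        if searchable_pdf:
            cmd.append("pdf")
        commands.append(cmd)
    return commands
```

The language code must match an installed language data file, e.g. "eng" for English or "deu" for
German.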
With the help of other tools, OCR can also be performed against other input files, such as
graphic-only PDF files. This produces inferior results, depending again on the quality of the
graphic files and the reproduction of text in them. One such tool is a bash script to OCR a PDF file
that can be found here: https://github.com/andrecastro0o/ocr/blob/master/ocr.sh
As mentioned in the 'before scanning' section, the quality of the original book will influence the
quality of the scan and thus the quality of the OCR. For a comparison, have a look here:
http://www.paramoulipist.be/?p=1303
Once you have your .txt file, there is still some work to be done. Because OCR has difficulty
interpreting particular elements of the layout and fonts, the .txt file comes with a lot of errors.
Recurrent problems are:
- combinations of specific letters in some fonts (it can mistake 'm' for 'n' or 'I' for 'i' etc.);
- headers become part of body text;
- footnotes are placed inside the body text;
- page numbers are not recognized as such.
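Some of these problems can be reduced with simple text processing. As one example, the Python sketch
below drops lines that consist of nothing but a short number, which is how unrecognized page numbers
typically show up in the text file. It is a rough heuristic under that assumption, not a general
fix: it will also drop legitimate lines that happen to contain only a number, so the result should
be proofread. The function name is illustrative.

```python
import re

PAGE_NUMBER_RE = re.compile(r"^\s*\d{1,4}\s*$")


def drop_page_number_lines(text):
    """Remove lines that contain nothing but a 1-4 digit number."""
    kept = [line for line in text.splitlines() if not PAGE_NUMBER_RE.match(line)]
    return "\n".join(kept)
```

Headers and footnotes that ended up in the body text are harder to detect automatically and usually
need manual editing.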

V. CREATING A FINALIZED E-BOOK FILE
After the optical character recognition has been completed, the resulting text can be merged with
the images of pages and output into an e-book format. While proper e-book file formats such as ePub
have increasingly been gaining ground, PDFs remain popular because many people tend to read on their
computers, and because PDFs retain the original layout of the printed book, including
the absolute pagination needed for referencing in citations. DjVu is also an option, as an alternative
to PDF, used because of its purported superiority, but it is far less popular.
The export to PDF can again be done with a number of tools. In our case we'll complete the optical
character recognition and PDF export in gscan2pdf. Again, the proprietary Abbyy FineReader will
produce somewhat smaller PDFs.
If you prefer to use an e-book format that works better with e-book readers, you will have
to remove some of the elements that appear in the book - headers, footers, footnotes and pagination.
This can be done earlier in the process, when cropping down the original .jpg image files (see under III),
or later by transforming the PDF files. The latter can be done in Calibre (http://calibre-ebook.com) by
converting the PDF into an ePub, where it can be further tweaked to better accommodate or remove
the headers, footers, footnotes and pagination.
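Besides its graphical interface, Calibre ships a command-line converter, ebook-convert, which takes
an input file and an output file and picks the formats from their extensions. The Python sketch
below just assembles such a command; the function name is illustrative, and any further conversion
options are left out on purpose.

```python
import os


def ebook_convert_command(pdf_path, output_format="epub"):
    """Build a Calibre ebook-convert command line (not executed here)."""
    base, _ = os.path.splitext(pdf_path)
    return ["ebook-convert", pdf_path, f"{base}.{output_format}"]
```

The quality of an ePub converted from a PDF depends heavily on the OCR text layer, so expect to do
some manual clean-up afterwards.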
Optical character recognition and PDF export in Public Library workflow
Optical character recognition with the Tesseract engine can be performed on GNU/Linux by a
number of command line and GUI tools. Many of those tools also exist for other operating systems.
For the users of the Public Library workflow, we recommend using the gscan2pdf application both for
the optical character recognition and the PDF or DjVu export.
To do so, start gscan2pdf and open your .tiff files. To OCR them, go to 'Tools' and select 'OCR'. In
the dialog box select the Tesseract engine and your language, then click 'Start OCR'. Once the OCR is
finished, export the graphic files and the OCR text to PDF by selecting 'Save as'.
However, given that proprietary solutions sometimes produce better results, these tasks can also
be done, for instance, in Abbyy FineReader running on a Windows operating system inside
VirtualBox. The prerequisite is that you have both Windows and Abbyy FineReader available to
install in VirtualBox. Once both are installed, designate a shared folder in VirtualBox.
To use Abbyy FineReader, transfer the output files from your 'out' folder to the VirtualBox shared
folder. Then start VirtualBox, start the Windows image and in Windows start Abbyy FineReader. Open
the files and let Abbyy FineReader read them. Once it's done, output the result into PDF.

VI. CATALOGING AND SHARING THE E-BOOK
Your road from a book on paper to an e-book is complete. If you want to maintain your library you
can use Calibre, a free software tool for e-book library management. You can add the metadata to
your book using the existing catalogues or you can enter metadata manually.
Now you may want to distribute your book. If the work you've digitized is in the public domain
(https://en.wikipedia.org/wiki/Public_domain), you might consider contributing it to Project
Gutenberg
(http://www.gutenberg.org/wiki/Gutenberg:Volunteers'_FAQ#V.1._How_do_I_get_started_as_a_Project_Gutenberg_volunteer.3F),
Wikibooks (https://en.wikibooks.org/wiki/Help:Contributing) or Archive.org.
If the work is still under copyright, you might explore a number of different options for sharing.

QUICK WORKFLOW REFERENCE FOR SCANNING AND
POST-PROCESSING ON PUBLIC LIBRARY SCANNER
I. PHOTOGRAPHING A PRINTED BOOK
0. Before you start:
- loosen the book binding by opening it wide in several places
- switch on the scanner
- set up the cameras:
- place cameras on tripods and fit them tightly
- plug the automatic chargers into the battery slots and close the battery lids
- switch on the cameras
- switch the lens to Manual Focus mode
- switch the cameras to Av mode and set the aperture to 8.0
- turn the zoom ring to set the focal length exactly midway between 24mm and 35mm
- focus by turning on the live view, pressing the magnification button twice and adjusting the
focus to get a clear view of the text
- connect the cameras to the scanner by plugging the remote trigger cable into the port behind the
protective rubber cover on the left side of each camera
- place the book into the cradle
- double-check storage cards and batteries
- press the play button on the back of the camera to double-check if there are images on the
camera - if there are, delete all the images from the camera menu
- if using batteries, double-check that batteries are fully charged
- switch off the light in the room that could reflect off the platen and cover the scanner with the
black cloth
1. Photographing
- now you can start scanning, either by pressing the smaller button on the controller once to
lower the platen and adjust the book, and then pressing it again to increase the light intensity, trigger the
cameras and lift the platen, or by pressing the large button to complete the entire sequence in one
go
- ATTENTION: Shutter sound should be coming from both cameras - if one camera is not
working, it's best to reconnect both cameras, make sure the batteries are charged or adapters
are connected, erase all images and restart.
- ADVICE: The scanner has a digital counter. By turning the dial forward and backward,
you can set it to tell you what page you should be scanning next. This should help you to
avoid missing a page due to a distraction.

II. Getting the image files ready for post-processing
- after finishing with scanning a book, transfer the files to the post-processing computer
and purge the memory cards
- if transferring the files manually:
- create two separate folders,
- transfer the files from the image folders on the cards; then, using batch
renaming software, rename the files from the right camera following the convention
page_0001.jpg, page_0003.jpg, page_0005.jpg... and the files from the left camera
following the convention page_0002.jpg, page_0004.jpg, page_0006.jpg...
- collate image files into a single folder
- before ejecting each card, delete all the photo files on the card
- if using the scanflow script:
- start the script on the computer
- place the card from the right camera into the card reader
- enter the name of the destination folder following the convention
"Name_Surname_Title_of_the_Book" and transfer the files
- repeat with the other card
- the script will automatically transfer the files, rename, rotate and collate them in proper
order, and delete them from the card
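The manual transfer steps above (odd page numbers for the right camera, even for the left, collated into one folder) can be sketched in a few lines. This is not the scanflow script, only an illustration; it assumes that each camera's file names sort in the order the photos were taken, and the function names are hypothetical.

```python
import shutil
from pathlib import Path


def rename_plan(right_files, left_files):
    """Pair each source photo with its page_NNNN.jpg name.

    Right-camera photos get odd numbers (1, 3, 5, ...) and left-camera
    photos even numbers (2, 4, 6, ...), following the convention above.
    """
    plan = []
    for offset, files in ((1, right_files), (2, left_files)):
        for i, src in enumerate(sorted(files)):
            plan.append((src, f"page_{2 * i + offset:04d}.jpg"))
    # Sort by target name so the plan reads in page order.
    return sorted(plan, key=lambda pair: pair[1])


def collate(right_dir, left_dir, dest_dir):
    """Copy both cameras' photos into dest_dir under their new names."""
    def photos(folder):
        return [p for p in Path(folder).iterdir() if p.suffix.lower() == ".jpg"]

    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for src, new_name in rename_plan(photos(right_dir), photos(left_dir)):
        shutil.copy2(src, dest / new_name)
```

The sketch copies rather than moves the files, so the photos on the cards still have to be deleted before the next scanning session.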
III. Transformation of source images into .tiffs
ScanTailor: from a photograph of page to a graphic file ready for OCR
1) Importing photographs to ScanTailor
- start ScanTailor and open ‘new project’
- for ‘input directory’ choose the folder where you stored the transferred photo images
- you can leave ‘output directory’ as it is, it will place your resulting .tiffs in an 'out' folder
inside the folder where your .jpg images are
- select all files (if you followed the naming convention above, they will be named
‘page_xxxx.jpg’) in the folder where you stored the transferred photo images, and click
'OK'
- in the dialog box ‘Fix DPI’ click on All Pages, and for DPI choose preferably '600x600',
click 'Apply', and then 'OK'
2) Editing pages
2.1 Rotating photos/pages
If you've rotated the photo images in the previous step using the scanflow script, skip this step.
- rotate the first photo counter-clockwise, click Apply and for scope select ‘Every other
page’ followed by 'OK'
- rotate the following photo clockwise, applying the same procedure as in the previous
step
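The alternating rotation described above can be expressed as a simple rule on the position of each photo in the project. The sketch below assumes quarter-turn (90 degree) rotations and uses positive values for counter-clockwise, negative for clockwise; the function names are hypothetical.

```python
def rotation_degrees(position):
    """Rotation for the photo at a 0-based position in the project.

    The first photo and every other one after it turn counter-clockwise
    (+90); the photos in between turn clockwise (-90).
    """
    return 90 if position % 2 == 0 else -90


def rotation_schedule(file_names):
    """Pair each file name, taken in sorted order, with its rotation."""
    return [(name, rotation_degrees(i)) for i, name in enumerate(sorted(file_names))]


if __name__ == "__main__":
    for name, degrees in rotation_schedule(["page_0002.jpg", "page_0001.jpg"]):
        print(name, degrees)
```

If the first photo in your project needs a clockwise turn instead, swap the two values.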

2.2 Deleting redundant photographs/pages
- remove redundant pages (photographs of the empty cradle at the beginning and the end;
book cover pages if you don’t want them in the final scan; duplicate pages etc.) by
right-clicking on a thumbnail of that page in the preview column on the right, selecting ‘Remove
from project’ and confirming by clicking on ‘Remove’.
# If you remove a wrong page by accident, you can re-insert it by right-clicking on a page
before/after the missing page in the sequence, selecting 'insert after/before' and choosing the file
from the list. Before you finish adding, it is necessary to go through the procedure of fixing DPI and
rotating again.
2.3 Adding missing pages
- If you notice that some pages are missing, you can recapture them with the camera and
insert them manually at this point using the procedure described above under 2.2.
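One quick way to notice missing pages before inspecting thumbnails is to compare how many photos each camera contributed. The sketch below assumes the page_NNNN.jpg naming convention from the transfer step (odd numbers from the right camera, even from the left); the function name is hypothetical.

```python
import re

PAGE_NAME = re.compile(r"page_(\d{4})\.jpg$")


def camera_counts(file_names):
    """Count right-camera (odd) and left-camera (even) photos by file name."""
    right = left = 0
    for name in file_names:
        match = PAGE_NAME.search(name)
        if not match:
            continue  # ignore files outside the naming convention
        if int(match.group(1)) % 2 == 1:
            right += 1
        else:
            left += 1
    return right, left


if __name__ == "__main__":
    right, left = camera_counts(["page_0001.jpg", "page_0002.jpg", "page_0003.jpg"])
    if right != left:
        print(f"Camera counts differ: right={right}, left={left}")
```

Equal counts do not prove that nothing is missing, since both cameras may have skipped the same spread, so the page-by-page visual check is still needed.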
3) Split pages and deskew
- Functions ‘Split Pages’ and ‘Deskew’ should work automatically. Run them by
clicking the ‘Play’ button under the 'Select content' step. This will do the three steps
automatically: splitting of pages, deskewing and selection of content. After this you can
manually re-adjust splitting of pages and de-skewing.

4) Selecting content and adjusting margins
- Step ‘Select content’ works automatically as well, but it is important to revise the
resulting selection manually page by page to make sure the entire content is selected on
each page (including the header and page number). Where necessary use your pointer device
to adjust the content selection.
- If the inner margin is cut, go back to 'Split pages' view and manually adjust the selected
split area. If the page is skewed, go back to 'Deskew' and adjust the skew of the page. After
this go back to 'Select content' and readjust the selection if necessary.
- This is the step where you do visual control of each page. Make sure all pages are there
and selections are as equal in size as possible.
- At the bottom of thumbnail column there is a sort option that can automatically arrange
pages by the height and width of the selected content, making the process of manual
selection easier. Extreme differences in height should be avoided: try to make the
selected areas as equal as possible, particularly in height, across all pages. The
exception should be the cover and back pages, where we advise selecting the full page.

5) Adjusting margins
- Now go to the 'Margins' step and, under the Margins section, set Top, Bottom, Left and
Right to 0.0 and do 'Apply to...' → 'All pages'.
- In the Alignment section leave 'Match size with other pages' ticked, choose the central
positioning of the page and do 'Apply to...' → 'All pages'.
6) Outputting the .tiffs
- Now go to the 'Output' step.
- Review two consecutive pages from the middle of the book to see if the scanned text is
too faint or too dark. If it is, use the 'Thinner – Thicker' slider to
adjust. Do 'Apply to' → 'All pages'.
- Next go to the cover page and select under Mode 'Color / Grayscale' and tick on 'White
Margins'. Do the same for the back page.
- If there are any pages with illustrations, you can choose the 'Mixed' mode for those
pages and then under the 'Picture Zones' tab adjust the zones of the illustrations.
- To output the files press 'Play' button under 'Output'. Save the project.
IV. Optical character recognition & V. Creating a finalized e-book file
If using all free software:
1) open gscan2pdf (if not already installed on your machine, install gscan2pdf from the
repositories, and Tesseract and the data for your language from https://code.google.com/p/tesseract-ocr/)
- point gscan2pdf to open your .tiff files
- for Optical Character Recognition, select 'OCR' under the drop down menu 'Tools',
select the Tesseract engine and your language, start the process
- once OCR is finished, to output to a PDF go to 'File' and select 'Save', edit the
metadata, select the format and save
If using non-free software:
2) open Abbyy FineReader in VirtualBox (note: only Abbyy FineReader 10 installs and works - with some limitations - under GNU/Linux)
- transfer files in the 'out' folder to the folder shared with the VirtualBox
- point it to the readied .tiff files and it will complete the OCR
- save the file

REFERENCES
For more information on the book scanning process in general and making your own book scanner
please visit:
DIY Book Scanner: http://diybookscanner.org
Hacker Space Bruxelles scanner: http://hackerspace.be/ScanBot
Public Library scanner: http://www.memoryoftheworld.org/blog/2012/10/28/our-belovedbookscanner/
Other scanner builds: http://wiki.diybookscanner.org/scanner-build-list
For more information on automation:
Konrad Voelkel's post-processing script (From Scan to PDF/A):
http://blog.konradvoelkel.de/2013/03/scan-to-pdfa/
Johannes Baiter's automation of scanning to PDF process: http://spreads.readthedocs.org
For more information on applications and tools:
Calibre e-book library management application: http://calibre-ebook.com/
ScanTailor: http://scantailor.sourceforge.net/
gscan2pdf: http://sourceforge.net/projects/gscan2pdf/
Canon Hack Development Kit firmware: http://chdk.wikia.com
Tesseract: http://code.google.com/p/tesseract-ocr/
Python script of Hacker Space Bruxelles scanner:
http://git.constantvzw.org/?p=algolit.git;a=tree;f=scanbot_brussel;h=81facf5cb106a8e4c2a76c048694a3043b158d62;hb=HEAD


Sollfrank & Kleiner
Telekommunisten
2012


Dmytri Kleiner
Telekommunisten

Berlin, 20 November 2012

[00:12]
My name is Dmytri Kleiner. I work with Telekommunisten, which is an art
collective based in Berlin that investigates the social relations in bettering
communication technologies.

[00:24]
Peer-To-Peer Communism

[00:29]
Cornelia Sollfrank: I would like to start with the theory, which I think is
very strong, and which actually informs the practice that you are doing. For
me it's like the background where the practice comes from. And I think the
most important and well-known book or paper you've written is The
Telekommunist Manifesto. This is something that you authored personally,
Dmytri Kleiner. It's not written by the Telekommunisten. And I would like to
ask you what the main ideas and the main principles are that you explain, and
maybe you come up with a few things, and I have some bullet points here, and
then we can discuss.

[01:14]
The book has two sections. The first section is called "Peer-To-Peer Communism
Vs. The Client-Server Capitalist State," and that actually explains – using
the history of the Internet as a sort of a basis – it explains the
relationship between modes of production on one hand, like capitalism and
communism, with network topologies on the other hand, mesh networks and star
networks. [01:39] And it explains why the original design of the Internet,
which was supposed to be a decentralised system where everybody could
communicate with everybody without any kind of mediation, or control or
censorship – why that has been replaced with centralised, privatised
platforms, from an economic basis. [02:00] So that the need for capitalist
capture of user data, and user interaction, in order to allow investors to
recoup profits, is the driving force behind centralisation, and so it explains
that.

[02:15]
Copyright Myth

[02:19]
C.S.: The framework of these whole interviews is the relation between cultural
production, artistic production in particular, and copyright, as a regulatory
mechanism. In one of your presentations, you mention, or you made the
assumption or the claim, that the fact that copyright is there to protect, or
to foster or enable artistic cultural production is a myth. Could you please
elaborate a bit on that?

[02:57]
Sure. That's the second part of the manifesto. The second part of the
manifesto is called "A Contribution to the Critique of Free Culture." And in
that title I don't mean to be critiquing the practice of free culture, which I
actively support and participate in. [03:13] I am critiquing the theory around
free culture, and particularly as it's found in the Creative Commons
community. [03:20] And this is one of the myths that you often see in that
community: that copyright somehow was created in order to empower artists, but
it's gone wrong somehow, at some point it's got wrong. [03:34] It went in the
wrong direction and now it needs to be corrected. This is a kind of a
plotline, so to speak, in a lot of creative commons oriented community
discussion about copyright. [03:46] But actually, of course, the history of
copyright is the same as the history of labour and capital and markets in
every other field. So just like the kind of Lockean idea of property
attributes the product of the worker's labour to the worker, so that the
capitalist can appropriate it, so it commodifies the products of labour,
copyright was created for exactly the same reasons, at exactly the same time,
as part of exactly the same process, in order to create a commodity form of
knowledge, so that knowledge could play in markets. [04:21] That's why
copyright was invented. That was the social reason why it needed to exist.
Because as industrial capitalism was manifesting, they required a way to
commodify knowledge work in the same way they commodified other kinds of
labour. [04:37] So the artist was only given the authorship of their work in
exactly the same way as the factory worker supposedly owns the product of
their labour. [04:51] Because the artist doesn't have the means of production,
so the artist has to give away that product, and actually legitimizes the
appropriation of the product of labour from the labourer, whether it's a
cultural labourer or a physical labourer.

[05:07]
(Intellectual) Labour

[05:10]
C.S.: And why do you think that this myth is so persistent? Or, who created
it, and for what reasons?

[05:18]
I think that a lot of kind of liberal criticism sort of starts that way. I
mean, I haven't really researched this, so that's kind of an open question
that you are asking, I don’t really have a specific position. [05:30] But my
impression is always that people that come at things from a liberal critique,
not a critical critique, sort of assume that things were once good and now
they’re bad. That’s kind of a common sort of assumption. [05:42] So instead of
looking at the core structural origin of something, they sort of have an
assumption that at some point this must have served a useful function or it
wouldn’t exist. And so therefore it must have been good and now it’s bad.
[05:57] And also because of the rhetoric, of course, just like the Lockean
rhetoric of property: give the ownership of the product of labour to the
worker. Ideologically speaking, it’s been framed this way since the beginning.
[06:14] But of course, everybody understands that in the market system the
worker is only given the rights to own their labour if they can sell it.

[06:22]
Author Function

[06:26]
C.S.: Based on this assumption, a certain function of the author developed.
Could you please elaborate on this a bit more? The invention of the individual
author.

[06:39]
The author – in a certain point of history, in line of the development of, you
know, as modern society – capitalist industrial society – began to emerge, so
did with it the author. [06:53] Previous to this, the concept of the author
was not nearly so engrained. So the author hasn't always existed in this
static sense, as unique source of new creativity and new knowledge, creating
work ex nihilo from their imagination. [07:10] Previous to this there was
always a more social understanding of authorship, where authors were in a
continuous cultural dialogue with previous authors, contemporary authors,
later authors. [07:20] And authors would frequently reuse themes, plots,
characters, from other authors. For instance, Goethe’s Faust is a good example
that has been used by authors before and after Goethe, in their own stories.
And just like the Homeric traditions of ancient literature. [07:42] Culture
was always seen to be much about dialogue, where each generation of authors
would contribute to a common creative stock of characters, plots, ideas. But
that, of course, is not conducive to making knowledge into a commodity that
can be sold in the market. [08:00] So as we got into a market-based society,
in order to create this idea of intellectual property, of copyright, creating
something that can be sold on the market, the artist and the author had to
become individuals all of a sudden. [08:16] Because this kind of iterative
social dialogue doesn’t work well in a commodity form, because how do you
properly buy it and sell it?

[08:28]
Anti-Copyright

[08:33]
C.S.: The next concept I would like to talk about is the anti-copyright. Could
you please explain a little bit what it actually is, and where it comes from?

[08:46]
From the very beginning of copyright many artists and authors rejected it from
ideological grounds, right from the beginning. [08:35] Because, of course,
what was now plagiarism, what was now illegal, and a violation of intellectual
property had been in many cases traditional practices that writers took for
granted forever. [09:09] The ability to reuse characters; the ability to take
plots, themes and ideas from other authors and reuse them. [09:16] So many
artists rejected this idea from the beginning. And this was the idea of
copyright. But, of course, because the dominant system that was emerging – the
market capitalist system – required the commodity form to make a living, this
was always a marginal community. [09:37] So it was radical artists, like the
Situationist International, or artists that had strong political beliefs, the
American folk musicians like Woody Guthrie – another famous example. [09:47]
And all of these people were not only against intellectual property. They were
not only against the commodification of cultural work. They were against the
commodification of work, period. [09:57] There was a proletarian movement.
They were very much against capitalism as well as intellectual property.

[10:04]
Examples of Anti-Copyright

[10:08]
C.S.: Could you give also some examples in the artworld for this
anti-copyright, or in the cultural world?

[10:15]
DK: Well, you know Lautréamont’s famous text, “plagiarism is necessary: it
takes a wrong idea and replaces it with the right idea.” [10:29] And
Lautréamont was a huge influence on a bunch of radical French artists
including, most famously, the Situationist International, who published their
journal with no copyright, denying copyright. [10:44] I guess that Woody
Guthrie has a famous thing that I quote in some article or other, maybe even
in the [Telekommunist] Manifesto, I don’t remember if it made it in – where he
expressly says, he openly supports people performing, copying, modifying his
songs. That was a note that he made in a song book of his. [11:11] And many
others – the whole practice is associated with communises, from Dada to
Neoism. [11:18] Much later, up to the mid-1990s, this was the dominant form.
So from the birth of copyright, up to the mid-1990s, the intellectual property
was being questioned on the radical fringes of artists. [11:34] For me
personally, as an artist, I started to become involved with artists like
Negativland and Plunderpalooza – sorry, Plunderpalooza was an act we did;
Plunderphonics is an album by John Oswald – the Neoist movements and the
Festival of Plagiarism. [11:51] This was the area that I personally
experienced in the 1990s, but it has a long history going back to Lautréamont,
if not earlier.

[12:01]
On the Fringe

[12:05]
C.S.: But you already mentioned the term fringe, so this kind of
anti-copyright attitude automatically implied that it could only happen on the
fringe, not in the actual cultural world.

[12:15]
Exactly. It is fundamentally incompatible with capitalism, because it denies
the value-form of culture. [12:22] And without the commodity form, it can’t
make a living, it has nothing to sell in the market. Because it’s not allowed
to sell on the market, it’s necessarily marginal. [12:34] So it’s necessarily
people who support themselves through “non-art” income, by other kinds of
work, or the small percentage of artists that can be supported by cultural
funding or universities, which is, you know, a relatively small group compared
to the proper cultural industries that are supported by copyright licensing.
[12:54] That includes the major movie houses, the major record labels, the
major publishing houses. Which is, you know, in orders of magnitude, a larger
number of artists.

[13:05]
Anti-Copyright Attitude

[13:10]
C.S.: So what would you say are the two, three, main characteristics of the
anti-copyright attitude?

[13:16]
Well, it completely rejects copyright as being legitimate. That’s a complete
denial of copyright. And usually it’s a denial of the existence of a unique
author as well. [13:28] So one of the things that is very characteristic is
the blurring of the distinction between producer and consumer. [13:37] So that
art is considered to be a dialogue, an interactive process where every
producer is also a consumer of art. So everybody is an artist in that sense,
everybody potentially can be. And it’s an ongoing process. [13:52] There’s no
distinction between producer and consumer. It’s just a transient role that one
plays in a process.

[13:59]
C.S.: And in that sense it relates back to the earlier ideas of cultural
production.

[14:04]
Exactly, to the pre-commodity form of culture.

[14:11]
Copyleft

[14:15]
C.S.: Could you please explain what copyleft is, where it comes from.

[14:20]
Copyleft comes out of the software community, the hacker community. It doesn’t
come out of artistic practice per se. And it comes out of the need to share
software. [14:30] Famously, Richard Stallman and the Free Software Foundation
started this project called GNU (GNU’s Not Unix), which is the, kind of, very
famous and important project. [14:44] And they publish the license called the
GPL, which sort of defined the copyleft idea. And copyleft is a very clever
kind of a hack, as they say in the hacker community. [14:53] What it does is
that it asserts copyright, full copyright, in order to provide a public
license, a free license. And it requires that any derivative work also carries
the same license. That’s what is different about it to anti-copyright. It’s
that, rather than denying copyright outright, copyleft is a copyright license
– it is a copyright – but then the claim is used in order to publicly make the
work available to anybody that wants it under very open terms. [15:28] The key
requirement, the distinctive requirement, is that any derivative work must
also be licenced under the same terms, under the copyleft terms. [15:38] This
is what we call viral, in that it perpetuates license. This is very clever,
because it takes copyright law, and it uses copyright law to create
intellectual property freedom, within a certain context. [15:55] But the
difference is, of course, that we are talking about software. And software,
economically speaking, from the point of view of the way software developers
actually make a living, is very different. [16:11] Because within the
productive cycle – the productive cycle can be said to have two phases,
sometimes called "department one" and "department two" in Marxian language or
in classical political economics. Producer’s goods and consumer’s goods; or
capital’s goods and consumer's goods models. [16:17] The idea is that some
goods are produced not for consumers but for producers. And these goods are
called capital. So they are goods that are used in production. And because
they are used in production, it’s not as important for capitalists to make a
profit on their circulation because they are input to production. [16:47] They
make their profits up stream, by actually using those goods in production, and
then creating goods that can be sold to the masses, circulated to the masses.
[16:56] And so because culture – art and culture – is normally a “department
two” good, consumer’s good, it’s completely, fundamentally incompatible with
capitalism because capitalism requires the capture of profits and the
circulation of consumer’s goods. But because software is largely a “department
one” good, producer’s good, it has no incompatibility with capitalism at all.
[17:18] In fact, capitalists very much like having their capital costs
reduced, because the vast majority of capitalists do not make commercial
software – license it. That’s only a very small class of capitalists. For the
vast majority of capitalists, the availability of free software as an input to
their production is a wonderful thing. [17:39] So this creates a sort of a
paradox, where under capitalism, only capital can be free. And because
software is capital, free software, and the GNU project, the Linux and the
vanilla projects exploded and became huge. [17:39] So, unlike the marginal-by-
necessity anti-copyright, free software became a mass movement, that has a
billion dollar industry, that has conferences all over the world that are
attended by tens of thousands of people. And everybody is for it. It’s this
really great big thing. [18:26] So it’s been rather different than
anti-copyright in terms of its place in society. It’s become very prominent, very
successful. But, unfortunately – and I guess this is where we have to go next
– the reason why it is successful is because software is a producer’s good,
not a consumer’s good.

[18:38]
Copyleft Criticism

[18:42]
C.S.: So what is your basic criticism of copyleft?

[18:47]
I have no criticism of copyleft, except for the fact that some people think
that the model can be expanded into culture. It can’t be, and that’s the
problem. It's that a lot of people from the arts community then kind of came back
to this original idea of questioning copyright through free software. [19:12]
So they maybe had some relationship with the original anti-copyright
tradition, or sometimes not at all. They are fresh out of design school, and
they never had any relationship with the radical tradition of anti-copyright.
And they encounter free software – they are like, yeah, that's great. [19:29]
And the spirit of sharing and cooperation inspires them. And they think that
the model can be taken from free software and applied to art and artists as
well, just like that. [19:41] But of course, there is a problem, because in a
capitalist society there has to be some economic sustainability behind the
practice, and because free culture modelled out of the GPL can’t work, because
the artists can’t make a living that way. [20:02] While capital will fund free
software, because they need free software – it’s a producer’s good, it’s input
to their production – capital has no need for free art. So they have also no
need to finance free art. [20:15] So if they can’t be financed by capital,
that automatically gives them a very marginal role in today’s society. [20:19]
Because that means that it has to be funded by something other than capital.
And those means are – back to the anti-copyright model – those are either non-
art income, meaning you do some other kind of work to self-finance your
artistic production, or the relatively small amount of public cultural
financing that is available – or now we have new things, like crowd funding –
all these kinds of things that create some opportunities. But still
marginally small compared to the size of the capitalist economy. [20:52] So
the only criticism of copyleft is that it is inapplicable to cultural
production.

[21:00]
Copyleft and Cultural Production

[21:04]
C.S.: Why can this principle of free software production, the GPL principles, not
be applied to cultural production? Just again, to really point this out.

[21:20]
The difference is really the difference between “department one” goods,
producer's goods, and “department two” goods, consumer’s goods. [21:27] It’s
that capitalists, which obviously control the vast majority of investment in
this economy – so the vast majority of money that is spent to allow people to
realise projects of any kind. The source of this money is capital investment.
[21:42] And capital is happy to invest in producer’s goods, even if they are
free. Because they need these goods. So they have no requirement to seek these
goods. [21:53] If you are running a company like Amazon, you are not making
any money selling Linux, you are making money selling web services, books and
other kinds of derivative products. You need free software to run your data
centre, to run your computer. [22:08] So the cost of software to you is a
cost, and so you're happy to have free software and support it. Because it
makes a lot more sense for you to contribute to some project that is also
used by five other companies. [22:21] And in the end all of you have this tool
that you can run on your computer, and run your business with, than actually
either buying a license from some company, which can be expensive, inflexible,
and you can't control it, and if it doesn't work the way you want, you cannot
change it. [22:36] So free software has a great utility for producers. That's
why it's a capital good, a producer's good, a "department one" good. [22:45]
But art and culture do not have the same economic role. Capital is not
interested in developing free culture and free art. They don't need it, they
don't do anything with it. And the capitalist that produces art and culture
requires it to have a commodity form, which is what copyright is. [23:00] So
they require a form that they can sell on the market, which requires it to
have the exclusive, non-reproducible commodity form – that copyright was
developed in order to commodify culture. [23:14] So that is why the copyleft
tradition won't work for free culture – because even though free culture and
anti-copyright predates it, it predates it as a radical fringe. And the
radical fringe isn't supported by capital. It's supported, as we said, by
outside income, non-art income, and other kind of things like small cultural
funds.

[23:38]
Creative Commons

[23:42]
C.S.: In the last ten years we have seen new business models that very much
depend on free content as well. Could you please elaborate on this a bit?

[23:56]
Well, that’s the thing. Now we have the kind of Web 2.0/Facebook world.
[24:00] The entire copyright law – the so-called "good copyright" that
protected artists – was all based on the idea of the mechanical copy. And the
mechanical copy made a lot of sense in the printing press era where, if you
had some intellectual property, you could license it through mechanical
copies. So every time it was copied, somebody owed you a royalty. Very simple.
[24:26] But in a Web 2.0 world, where we have YouTube, Facebook, Twitter and
things like that, this doesn't really work very well. Because if you post
something online and then you need to get paid a royalty every time it gets
copied (and it gets copied millions of times), this becomes very impractical.
[24:44] And so this is where the Creative Commons really comes in. Because the
Creative Commons comes in just exactly at this time – as the Internet is kind
of bursting out of its original military and NGO roots, and really hitting the
general public. At the same time free software is something that is becoming
better known, and inspiring more people – so the ideas of questioning
copyright are becoming more prominent. [25:16] So Creative Commons seizes on
this kind of principled approach that anti-copyright and copyleft take. And
again, one of the single most important things about anti-copyright and
copyleft is that in both cases the freedom that they are talking about – the
free culture that they represent – is the freedom of the consumer to become
the producer. It's the denial of the distinction between consumer and
producer. [25:41] So even though the Creative Commons has a lot of different
licenses, including some that are GPL compatible – they're approved for free
cultural work, or whatever it's called – there is one license in particular
that makes up the vast majority of the works in the Creative Commons, one
license in particular which is like the signature license of the Creative
Commons – it's the non-commercial license. And this is obviously... The
utility of that is very clear because, as we said, artists can't make a living
in a copyleft sense. [26:18] In order for artists to make a living in the
capitalist system, they have to be able to negotiate non-free rights with
their publishers. And if they can't do that, they simply can't make a living.
At least, not in the mainstream community. There is a certain small place for
artists to make a living in the alternative and fringe elements of the
artworld. [26:42] But if you are talking about making a movie, a novel, a
record, then you at some point are going to need to negotiate a contract with
the publisher. Which means, you're going to have to be able to negotiate non-free
terms. [27:00] So what non-commercial [licensing] does, is that it allows
people to share your stuff, making you more famous, getting more people to
know you – building its value, so to speak. But they can't actually do
anything commercial with it. And if they want to do anything commercial with
it, they have to come back to you and they have to negotiate a non-free
license. [27:19] So this is very practical, because it solves a lot of
problems for artists that want to make work available online in order to get
better known, but still want to eventually, at some point in the future,
negotiate non-free terms with a publishing company. [27:34] But while it's
very practical, it fundamentally violates the idea that copyleft and
anti-copyright set out to challenge – and this is the distinction between the producer
and the consumer. Because of this, the consumer cannot become the producer.
And that is the criticism of the Creative Commons. [27:52] That's why I want
to talk about this thing, I often say, a tragedy in three parts. The first
part is a tragedy because it has to remain fringe, because of its complete
incompatibility with the dominant capitalism. [28:04] The second part,
copyleft, is a tragedy because while it works great for software, it can't and
it won't work for art. [28:10] And the third part is a tragedy because it
actually undermines the whole idea and brings the author back to the surface,
back from the dead. But the author kind of re-emerges as a sort of useful idiot,
because the "some rights reserved" are basically the rights to sell your
intellectual property to the publisher in exactly the same way as the early
industrial factory worker would have sold their labour to the factory.

[28:36]
C.S.: And that creates by no means a commons.

[28:41]
It by no means creates a commons, right. Because a primary function of a
commons is that it would be available for use by other producers, and the
Creative Commons isn't because you don't have any right to create your own
work to make a living from the works in the commons – because of the non-
commercial clause that covers a large percentage of the works there.

[29:09]
Peer Production License

[29:13]
C.S.: But you were thinking of an alternative. What is the alternative?

[29:19]
There is no easy alternative. The fact is that, so long as we have a cultural
industry that is dominated by market capitalism, then the majority of artists
working within it will have to work in that form. We can't arbitrarily, as
artists, simply pretend that the industry as it is doesn't exist. [29:41] But
at the same time we can hope that alternatives will develop – that alternative
ways of producing and sharing cultural works will develop. So that the
copyfarleft license... [29:52] I describe the Creative Commons as
copyjustright. It's not copyright, it's copyjustright – you can tune it, you
can tailor it to your specific interests or needs. But it is still copyright,
just a more fine-tuneable copyright that is better for a Web 2.0 distribution
model. [30:12] The alternative is what I call copyfarleft, which also starts
off with the Creative Commons non-commercial model for the simple reason that,
as we discussed, if you are an actually existing artist in the actually
existing cultural industries of today, you are going to have to make a living,
on the most part, by selling non-free works to publishers, non-free licenses
to publishers. That's simply the way the industry works. [30:37] But in order
not to close the door on another industry developing – a different kind of
industry developing – after denying commercial works blankly (so it has a non-
commercial clause), then it expressly allows commercial usage by non-
capitalist organisations, independent cooperatives, non-profits –
organisations that are not structured around investment capital and wage
labour, and so forth; that are not for-profit organisations that are enriching
private individuals and appropriating value from workers. [31:15] So this
allows you to succeed, at least potentially succeed as a commercial artist in
the commercial world as it is right now. But at the same time it doesn't close
the door on another kind of community from developing, another kind of industry
from developing. [31:35] And we have to understand that we are not going to be
able to get rid of the cultural industries as they exist today, until we have
another set of institutions that can play those same roles. They're not going
to magically vanish, and be magically replaced. [31:52] We have to, at the
same time as those exist, build up new kinds of institutions. We have to think
of new ways to produce and share cultural works. And only when we've done
that, will the cultural institutions as they are today potentially go away.
[32:09] So the copyfarleft license tries to bridge that gap by allowing the
commons to grow, but at the same time allowing the commons producers to make a
living as they normally would within the regular cultural industry. [32:25]
Some good examples where you can see something like this – might be clear –
are some of the famous novelists like Wu Ming or Cory Doctorow, people that
have done very well by publishing their works under Creative Commons non-
commercial licenses. [32:42] Wu Ming's books, which are published, I believe,
by Random House or some big publisher, are available under a Creative Commons
non-commercial license. So if you want to download them for personal use, you
can. But if you are Random House, and you want to publish them and put them in
bookstores, and manufacture them in huge supply, you have to negotiate non-
free terms with Wu Ming. And this allows Wu Ming to make a living by licensing
their work to Random House. [33:10] But while it does do that, what it doesn't
do is allow that book to be manufactured any other way. So that means that
this capitalist form of production becomes the only form that you can
commercially produce this book – except for independents, just for their own
personal use. [33:25] Whereas if their book was instead under a copyfarleft
license, what we call the "peer production" licence, then not only could they
continue to work as they do, but also potentially their book could be made
available through other means as well. Like, independent workers cooperatives
could start manufacturing it, selling it and distributing it locally in their
own areas, and make a commercial living out of it. And then perhaps if those
were to actually succeed, then they could grow and start to provide some of
the functions that capitalist institutions do now.

[34:00]
Miscommunication Technology

[34:05]
The artworks that we do are more related to the topologies side of the theory
– the relationship between network topologies, communication topologies, and
the social relations embedded in communication systems with the political
economy and economic ideas, and people's relationships to each other. [34:24]
The Miscommunication Technologies series has been going on for quite a while
now, I guess since 2006 or so. Most of the works were pretty obscure, but the
more recent works are getting more attention and better known. And I guess
that the ones that we're talking about and exhibiting the most are deadSwap,
Thimbl and R15N, and these all attempt to explore some of the ideas.

[35:01]
deadSwap

[35:06]
deadSwap is a file sharing system. It's playing on the kind of
circumventionist technologies that are coming out of the file sharing
community, and this idea that technology can make us be able to evade the
legal and economic structures. So deadSwap wants to question this by creating
a very extreme parody of what it would actually mean to really be private.
[35:40] It is a file sharing system, that in order to be private it only
exists on one USB stick. And this USB stick is hidden in public space, and its
users send text messages to an anonymous SMS gateway in order to tell other
users where they've hidden the stick. When you have the stick you can upload
and download files to it – it's a file sharing system. It has a Wiki and file
space, essentially. Then you hide the stick somewhere, and you text the system
and it forwards your message to the next person that is waiting to share data.
And this continues like that, so then that person can share data on it, they
hide it somewhere and send an SMS to the system which then gets forwarded
to the next person. [36:28] This work serves a few different functions at
once. First, it starts to get people to understand networks and all the basic
components. The participants in the artwork actually play a network node – you
are passing on information as if you are part of a network. So this gets
people to start thinking about how networks work, because they are playing the
network. [36:52] But on the other hand, it also tries to get across the idea
that the behaviour of the user is much more important than the technology, when it
comes to security and privacy. So how difficult it is – the system is very
private – how difficult it is to actually use it, not lose the stick, not to
get discovered. [37:11] It's actually very difficult to actually use. Even
though it seems so simple, normally people lose the USB key within like an
hour or two of starting the system. It doesn't... All the secret agent manuals
that say, be a secret agent spy – isn't easy, and it tries to get this across,
that actually it's not nearly as easy to evade the economic and political
dimensions of our society as it should be. [37:45] Maybe it's better that we
politically fight to avoid having to share information only by hiding USB
sticks in public space, sticking around and acting like spies.
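
The relay described above, where one stick moves between users and the gateway forwards each hiding place to the next person waiting, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Gateway` class, its method names, and the rule that a sender rejoins the back of the queue are hypothetical and are not taken from the actual deadSwap software.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Gateway:
    """Hypothetical stand-in for the anonymous SMS gateway."""
    waiting: deque = field(default_factory=deque)  # users waiting for the stick
    holder: Optional[str] = None                   # user who currently has the stick
    log: list = field(default_factory=list)        # (recipient, forwarded text) pairs

    def join(self, user: str) -> None:
        """Register a user; the very first user starts out holding the stick."""
        if self.holder is None:
            self.holder = user
        else:
            self.waiting.append(user)

    def report_hidden(self, sender: str, location: str) -> Optional[str]:
        """The holder texts the hiding place; forward it to the next waiting user."""
        if sender != self.holder:
            raise ValueError("only the current holder knows where the stick is")
        if not self.waiting:
            return None  # nobody is waiting, so nothing is forwarded
        next_user = self.waiting.popleft()
        self.log.append((next_user, location))
        self.holder = next_user
        self.waiting.append(sender)  # assumption: the sender queues up again
        return next_user
```

Note that the sketch has no way to represent a lost stick: once the physical object goes missing, no message to the gateway can bring the network back, which is the point made above about user behaviour mattering more than the technology.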

[37:57]
Thimbl

[38:02]
Thimbl is another work, and it is completely online. This work in some ways
has become a signature work for us, even though it doesn't really have any
physical presence. It's a purely conceptual work. [38:15] One of the arguments
that the Manifesto makes is that the Internet was a fully distributed social
media platform – that's what the Internet was, and then it was replaced,
because of capitalism and because of the economic logic of the market, with
centralised communication platforms like Twitter and Facebook. [38:40] And
despite that, within the free software community and the hacker community,
there's the opposite myth, just like the copyright myth. There's this idea
that we are moving towards decentralised software. [38:54] You see people like
Eben Moglen making this point a lot, when he says, now we have Facebook, but
because of FreedomBox, Diaspora and a laundry list of other projects, we're
eventually going to reach a decentralised software. [39:07] But this makes two
assumptions that are incorrect. The first is that we are starting with
centralised media and we are going to decentralised media, which actually is
incorrect. We started with a decentralised social media platform and we moved
to a centralised one. [39:40] And the second thing that is incorrect is that
we can move from a centralised platform to a decentralised platform if we just
create the right technology, so the problem is technological. [39:34] With
Thimbl we wanted to make the point that that wasn't true, that the problem was
actually political. The technological problem is trivial. The computer
sciences have been around forever. The problem is political. [39:43] The
problem is that these systems will not be financed by capital, because capital
requires profit in order to sustain itself. In order to capture profit it
needs to have control of user interaction and users' data. [39:57] To
illustrate this, we created a micro-blogging platform like Twitter, but using
a protocol of the 1970s called Finger. So we've used the protocol that has
been around since the 1970s and made a micro-blogging platform out of it –
a fully, totally distributed micro-blogging platform. And then promoted it as if
it was a real thing, with videos and website, and stuff like that. But of
course, there is no way to sign up for it, because it's just a concept.
[40:22] And then there are some scripts that other people wrote that actually
made it to a certain degree real. For us it was just a concept, but then
people actually took it and made working implementations of it, and there are
several working implementations of Thimbl. [40:38] But the point remains that
the problem is not technical, the problem is political. So we came up with
this idea of the economic fiction, or the social fiction. [40:47] Because in
science fiction you often have situations where something that eventually
became a real technology was originally introduced in a fictional context as a
science fiction. [40:59] The reason it's fictional is because science at the
time was not able to create the thing, but as science transcends its
limitations, what was once fictional technology became real technology. So we
have this idea of a social or economic fiction. [41:15] Thimbl is not science
fiction. Technologically speaking it demonstrably works – it's a demonstrably
working concept. The problem is economic. [41:23] For Thimbl to become a
reality, society has to transcend its economic limitations – its social and
economic limitations in order to find ways to create communication systems
that are not simply funded by the capture of user data and information, which
Thimbl can't do because it is a distributed system. You can't control the
users, you can't know who is using it or what they are doing, because it's
fully distributed.
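
To make the "technologically trivial" claim concrete, here is a minimal sketch of a Finger query in Python. Finger (RFC 1288) works by opening a TCP connection to port 79 on the user's own host, sending the user name followed by CRLF, and reading the reply until the server closes the connection. The function names below are hypothetical, and how Thimbl encodes posts inside the reply is not described in the interview, so the sketch stops at fetching the raw text.

```python
import socket
from typing import Tuple

FINGER_PORT = 79  # port assigned to the Finger protocol (RFC 1288)


def split_address(address: str) -> Tuple[str, str]:
    """Split 'user@host' into (user, host)."""
    user, sep, host = address.rpartition("@")
    if not sep or not user or not host:
        raise ValueError(f"expected user@host, got {address!r}")
    return user, host


def build_query(user: str) -> bytes:
    """A Finger query is the user name followed by CRLF."""
    return user.encode("ascii") + b"\r\n"


def finger(address: str, port: int = FINGER_PORT, timeout: float = 5.0) -> str:
    """Ask the user's own host for whatever that user publishes about themselves."""
    user, host = split_address(address)
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_query(user))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Each request goes straight to the host named in the address, so there is no central server that could observe who follows whom. That is the property the passage above identifies as incompatible with funding through the capture of user data.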

[41:47]
R15N

[41:52]
The R15N has elements of both of those things. We wanted to create a system
that was basically drawn a little from deadSwap, but I wanted to take out the
secret agent element of it. Because I was really... [42:08] The first place it
was commissioned to be in was actually in Tel Aviv, in Israel, the [Israeli]
Center for Digital Art. And this kind of spy aesthetic that deadSwap had, I
didn't think it would be an appropriate aesthetic in that context. [42:22] The
idea of trying to convince young people in a poor area in Tel Aviv to act
like spies and hide USB sticks in public space didn't seem like a good idea.
[42:34] So I wanted to go the other way, and I wanted to really emphasise the
collaboration, and create a kind of system that is pretty much totally
impossible to use, but only if you really cooperate you can make it work.
[42:45] So I took another old approach called the telephone tree. I don't know
if you remember telephone trees. Telephone trees existed for years before the
Internet, when schools and army reserves needed to be quickly dispatched, and
it worked with a very simple tree topology. [43:01] You had a few people that
were the top nodes, that then called the list of two or three people, that
then called the list of two or three people, that then called the list of two
or three people... And the message can be sent through the community very
rapidly through a telephone tree. [43:14] It is often used in Canada for
announcing snow days at school, for instance. If the school was closed, they
would call three parents, who would each call three parents, who would each
call three parents, and so forth. So that all the parents knew that the school
was closed. That's one aspect. [43:30] Another aspect of it is that
telephones, especially mobile phones, are really advertised as a very freedom
enabling kind of a thing. Things that you can go anywhere... [43:41] I don't
know if you remember some of the early telephone ads where there are always
businessmen on the beach. I remember this one where this woman's daughter
wants to make an appointment with her because she only has time for her
colleague appointments, and so it's this whole thing about spending more time
with her daughter – so she takes her daughter to the beach, which she is able
to do because she can still conduct business on her mobile phone. So it's this
freedom kind of a thing. [44:04] But in areas like the Jessi Cohen area in Tel
Aviv where we were working, and other areas where the project has been
exhibited, like Johannesburg – other places like that, the telephone has a
very different role, because it's free to receive phone calls, but it costs
much to make phone calls, in most parts of the world, especially in these poor
areas. [44:25] So the telephone is a very asymmetric power relationship based
on your availability of credit. So rather than being a freedom enabling thing,
it's a control technology. So young people and poor people that carry them
can't actually make any calls, they can't call anybody. They can only receive
calls. [44:40] So it's used as a tether, a control system from their parents,
their teachers, their employers, so they can know where they are at any time
and say, hey why aren't you at work, or where are you, what are you doing.
It's actually a control technology. [44:54] We wanted to invert that too. So
the way the phone tree system works is that, when you have a message you
initiate a phone call, so you initiate a new tree, the system phones you...
[45:05] And you can initiate a new tree in the modern versions by pushing a
button in the gallery. There's a physical button in the gallery, you push the
button, there's a phone beside it, it rings a random person, you tell them
your message, and then it creates an ad hoc telephone tree. It takes all the
subscribers and arranges them in a tree, just like in the old telephone tree,
and each person calls each person, until your message, in theory, gets through
the community. [45:28] But of course in reality nobody answers their phones,
you get voicemail, and then you get voicemail talking to voicemail. Of course,
voice over the Internet is flaky to begin with, so calls fail. So it actually
becomes this really frenetic system where people actually don’t know what's
going on, and the message is constantly lost. [45:44] And of course, you have
all of these missed phone calls, this high pressure of the always-on world.
You are always getting these phone calls, and you're missing phone calls, and
actually nobody ever knows what the message is. So it actually creates this
kind of mass confusion. [46:00] This once again demonstrates that the users –
what we call jokingly in the R15N literature, the diligence of the users, is
so much required for these systems to work. Technologically, the system is
actually more or less hindered. [46:21] But they also serve not only to make
that message, which is a more general message – but also, like in the other
ones, in R15N you are a node in the network. So when you don’t answer a call
you know that a message is dropped. [46:36] So you can imagine how volatile
information is in networks. When you pass your information through a third
party, you realise that they can drop it, they can change it, they can
introduce their own information. [46:50] And that is true in R15N, but is also
true in Facebook, in Twitter, and any time you send messages through some
third party. That is one of the messages that is core to the series.
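
The telephone tree and its fragility can be sketched in Python as follows. This is an illustration under assumptions, not R15N's actual implementation: the fan-out of three is borrowed from the snow-day example above, and the function names and the single `answer_probability` parameter are hypothetical simplifications.

```python
import random


def build_tree(subscribers, fanout=3):
    """Arrange subscribers so that each person calls up to `fanout` others.

    Returns a dict mapping each caller to the list of people they call.
    The first subscriber is the root, the person who first hears the message.
    """
    calls = {person: [] for person in subscribers}
    for index, person in enumerate(subscribers[1:], start=1):
        parent = subscribers[(index - 1) // fanout]
        calls[parent].append(person)
    return calls


def spread(calls, root, answer_probability, rng):
    """Pass the message down the tree; an unanswered call drops that whole branch."""
    reached = {root}
    frontier = [root]
    while frontier:
        caller = frontier.pop()
        for callee in calls[caller]:
            if rng.random() < answer_probability:
                reached.add(callee)
                frontier.append(callee)
    return reached


if __name__ == "__main__":
    people = [f"subscriber-{i}" for i in range(40)]
    random.Random(1).shuffle(people)  # an ad hoc tree: a random order every time
    tree = build_tree(people)
    heard = spread(tree, people[0], answer_probability=0.6, rng=random.Random(2))
    print(f"{len(heard)} of {len(people)} subscribers got the message")
```

Because every missed call cuts off everyone below it in the tree, coverage falls off quickly as the chance of answering drops. This is one way to see why the diligence of each node matters more than the technology connecting them.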


 
