Giorgetta, Nicoletti & Adema
A Conversation on Digital Archiving Practices
2015


# A Conversation on Digital Archiving Practices

A couple of months ago Davide Giorgetta and Valerio Nicoletti (both ISIA
Urbino) interviewed me for their MA in Design of Publishing. Silvio
Lorusso was so kind as to publish the interview on the fantastic
[p-dpa.net](http://p-dpa.net/a-conversation-on-digital-archiving-practices-
with-janneke-adema/). I am reblogging it here.

* * *

[Davide Giorgetta](http://p-dpa.net/creator/davide-giorgetta/) and [Valerio
Nicoletti](http://p-dpa.net/creator/valerio-nicoletti/) are both students from
[ISIA Urbino](http://www.isiaurbino.net/home/), where they attend the Master
Course in Design for Publishing. They are currently investigating the
independent side of digital archiving practices within the scope of the
publishing world.

As part of their research, they asked some questions to Janneke Adema, who is
Research Fellow in Digital Media at Coventry University, with a PhD in Media
(Coventry University) and a background in History (MA) and Philosophy (MA)
(both University of Groningen) and Book and Digital Media Studies (MA) (Leiden
University). Janneke’s PhD thesis focuses on the future of the scholarly book
in the humanities. She has been conducting research for the
[OAPEN](http://project.oapen.org/index.php/about-oapen) project, and
subsequently the OAPEN foundation, from 2008 until 2013 (including research
for OAPEN-NL and DOAB). Her research for OAPEN focused on user needs and
publishing models concerning Open Access books in the Humanities and Social
Sciences.

**Davide Giorgetta & Valerio Nicoletti: Does a way out of the debate between
publishers and independent digital libraries (Monoskop Log, Ubuweb,
Aaaarg.org) exist, in terms of copyright? Is there an alternative solution
able to solve the issue and to provide equal opportunities to everyone? Would
publishers' fear of a possible reduction in income be legitimate if access
to their digital publications were open and free?**

Janneke Adema: This is an interesting question, since for many academics this
‘way out’ (at least insofar as it concerns scholarly publications) has been
envisioned in or through the open access movement and the use of Creative
Commons licenses. However, the open access movement, a rather plural and
loosely defined group of people, institutions and networks, in its more
moderate instantiations tends to distance itself from piracy and copyright
infringement or copy(far)left practices. Through its use and favoring of
Creative Commons licenses, one could even argue that it has been mainly
concerned with a reform of copyright rather than a radical critique and
rethinking of the common and the right to copy (Cramer 2013, Hall
2014).[1](http://p-dpa.net/a-conversation-on-digital-archiving-practices-
with-janneke-adema/#fn:1 "see footnote") Nonetheless, in its more radical
guises open access can be more closely aligned with the practices associated
with digital pirate libraries such as the ones listed above, for instance
through Aaron Swartz’s notion of [Guerilla Open
Access](https://archive.org/stream/GuerillaOpenAccessManifesto/Goamjuly2008_djvu.txt):

> We need to take information, wherever it is stored, make our copies and
share them with the world. We need to take stuff that’s out of copyright and
add it to the archive. We need to buy secret databases and put them on the
Web. We need to download scientific journals and upload them to file sharing
networks. We need to fight for Guerilla Open Access. (Swartz 2008)

However, whatever form or vision of open access you prefer, I do not think it
is a ‘solution’ to any problem—such as copyright/fight—but I would rather
see it, as I have written
[elsewhere](http://blogs.lse.ac.uk/impactofsocialsciences/2014/11/18
/embracing-messiness-adema-pdsc14/), ‘as an ongoing processual and critical
engagement with changes in the publishing system, in our scholarly
communication practices and in our media and technologies of communication.’
And in this sense open access practices offer us the possibility to critically
reflect upon the politics of knowledge production, including copyright and
piracy, openness and the commons, indeed, even upon the nature of the book
itself.

With respect to the second part of your question, again, where it concerns
scholarly books, [research by Ronald
Snijder](https://scholar.google.com/citations?view_op=view_citation&hl=en&user=PuDczakAAAAJ&citation_for_view=PuDczakAAAAJ:u-x6o8ySG0sC)
shows no decline in sales or income for publishers once they release their
scholarly books in open access. The open availability does however lead to
more discovery and online consultation, meaning that it actually might lead to
more ‘impact’ for scholarly books (Snijder 2010).

**DG, VN: In which way, if any, are digital archiving practices stimulating
new publishing phenomena? Are there any innovative outcomes, apart from the
obvious relation to p.o.d. tools (or interesting new projects in this
field)?**

JA: Beyond extending access, I am mostly interested in how digital archiving
practices have the potential to stimulate the following practices or phenomena
(which in no way are specific to digital archiving or publishing practices, as
they have always been a potential part of print publications too): reuse and
remix; processual research and iterative publishing; and collaborative forms
of knowledge production. These practices interest me mainly as they have the
potential to critique the way the (printed) book has been commodified and
essentialised over the centuries, in a bound, linear and fixed format, a
practice which is currently being replicated in a digital context. Indeed, the
book has been fixed in this way both discursively and through a system of
material production within publishing and academia—which includes our
institutions and practices of scholarly communication—that prefers book
objects as quantifiable and auditable performance indicators and as marketable
commodities and objects of symbolic value exchange. The practices and
phenomena mentioned above, i.e. remix, versioning and collaboration, have the
potential to help us to reimagine the bound nature of the book and to explore
both a spatial and temporal critique of the book as a fixed object; they can
aid us to examine and experiment with various different incisions that can be
made in our scholarship as part of the informal and formal publishing and
communication of our research that goes beyond the final research commodity.
In this sense I am interested in how these specific digital archiving,
research and publishing practices offer us the possibility to imagine a
different, perhaps more ethical humanities, a humanities that is processual,
contingent, unbound and unfinished. How can these practices aid us in cutting
well in the ongoing unfolding of our research; how can they help us
explore how to make potentially better interventions? How can we take
responsibility as scholars for our entangled becoming with our research and
publications? (Barad 2007, Kember and Zylinska 2012)

Examples that I find interesting in the realm of the humanities in this
respect include projects that experiment with such a critique of our fixed,
print-based practices and institutions in an affirmative way: for example Mark
Amerika’s [remixthebook](http://www.remixthebook.com/) project; Open
Humanities’ [Living Books about Life](http://www.livingbooksaboutlife.org/)
series; projects such as
[Vectors](http://vectors.usc.edu/issues/index.php?issue=7) and
[Scalar](http://scalar.usc.edu/); and collaborative knowledge production,
archiving and creation projects, from wiki-based research projects to AAAARG.

**DG, VN: In which way does a digital container influence its content? Does
the same book — if archived on different platforms, such as _Internet Archive_
, _The Pirate Bay_ , _Monoskop Log_ — still remain the same cultural item?**

JA: In short my answer to this question would be ‘no’. Books are embodied
entities, which are materially established through their specific affordances
in relationship to their production, dissemination, reception and
preservation. This means that the specific materiality of the (digital) book
is partly an outcome of these ongoing processes. Katherine Hayles has argued
in this respect that materiality is an emergent property:

> In this view of materiality, it is not merely an inert collection of
physical properties but a dynamic quality that emerges from the interplay
between the text as a physical artifact, its conceptual content, and the
interpretive activities of readers and writers. Materiality thus cannot be
specified in advance; rather, it occupies a borderland— or better, performs as
connective tissue—joining the physical and mental, the artifact and the user.
(2004: 72)

Similarly, Matthew Kirschenbaum points out that the preservation of digital
objects is:

> _logically inseparable_ from the act of their creation (…) The lag between
creation and preservation collapses completely, since a digital object may
only ever be said to be preserved _if_ it is accessible, and each individual
access creates the object anew. One can, in a very literal sense, _never_
access the “same” electronic file twice, since each and every access
constitutes a distinct instance of the file that will be addressed and stored
in a unique location in computer memory. (Kirschenbaum 2013)

Every time we access a digital object, we thus duplicate it, we copy it and we
instantiate it. And this is exactly why, in our strategies of conservation,
every time we access a file we also (re)create these objects anew over and
over again. The agency of the archive, of the software and hardware, is also
apparent here, where archives are themselves ‘active ‘‘archaeologists’’ of
knowledge’ (Ernst 2011: 239) and, as Kirschenbaum puts it, ‘the archive writes
itself’ (2013).

In this sense a book can be seen as an apparatus, consisting of an
entanglement of relationships between, among other things, authors, books, the
outside world, readers, the material production and political economy of book
publishing, its preservation and material instantiations, and the discursive
formation of scholarship. Books as apparatuses are thus reality shaping, they
are performative. This relates to Johanna Drucker’s notion of ‘performative
materiality’, where Drucker argues for an extension of what a book _is_ (i.e.
from a focus on its specific properties and affordances), to what a book
_does_ : ‘Performative materiality suggests that what something _is_ has to be
understood in terms of what it _does_ , how it works within machinic,
systemic, and cultural domains.’ For, as Drucker argues, ‘no matter how
detailed a description of material substrates or systems we have, their use is
performative whether this is a reading by an individual, the processing of
code, the transmission of signals through a system, the viewing of a film,
performance of a play, or a musical work and so on. Material conditions
provide an inscriptional base, a score, a point of departure, a provocation,
from which a work is produced as an event’ (Drucker 2013).

So, to come back to your question, these specific digital platforms (Monoskop,
The Pirate Bay etc.) become integral aspects of the apparatus of the book and
each in their own different way participates in the performance and
instantiation of the books in their archives. Not only does a digital book
therefore differ as a material or cultural object from a printed book, a
digital object also has materially distinct properties related to the platform
on which it is made available. Indeed, building further on the theories
described above, a book is a different object every time it is instantiated or
read, be it by a human or machinic entity; they become part of the apparatus
of the book, a performative apparatus. Therefore, as Silvio Lorusso has
stated:

[![The-Post-Digital-Publishing-Archive-An-Inventory-of-Speculative-Strategies
-----Coventry-University-----June-11th-2014-21](https://i2.wp.com/p-dpa.net
/wp-content/uploads/2015/06/The-Post-Digital-Publishing-Archive-An-Inventory-
of-Speculative-Strategies-Coventry-University-June-
11th-2014-21.png)](http://p-dpa.net/wp-content/uploads/2015/06/The-Post-
Digital-Publishing-Archive-An-Inventory-of-Speculative-Strategies-Coventry-
University-June-11th-2014-21.png)

**DG, VN: In your opinion, can scholarly publishing, in particular self-
archiving practices, constitute a bridge over the gap between authors and
users in terms of access to knowledge? Could we hope that these practices will
find a broader use, moving from very specific fields (academic papers) to book
publishing in general?**

JA: On the one hand, yes. Self-archiving, or the ‘green road’ to open access,
offers a way for academics to make their research available in a preprint form
via open access repositories in a relatively simple and straightforward way,
making it easily accessible to other academics and more general audiences.
However, it can be argued that as a strategy the green road is not
very subversive, since it doesn’t actively rethink, re-imagine, or
experiment with the system of scholarly knowledge production in a more
substantial way, including peer review and the print-based publication forms
this system continues to promote. With its emphasis on achieving universal,
free, online access to research, a rigorous critical exploration of the form
of the book itself doesn’t seem to be a main priority of green open access
activists. Stevan Harnad, one of the main proponents of green open access and
self-archiving, has for instance stated that ‘it’s time to stop letting the
best get in the way of the better: Let’s forget about Libre and Gold OA until
we have managed to mandate Green Gratis OA universally’ (Harnad 2012). This is
where the self-archiving strategy in its current implementation falls short, I
think, with respect to the breaking down of barriers between authors and
users: it isn’t necessarily committed to following a libre open access
strategy, which, one could argue, would be more open to adopting and promoting
forms of open access that are designed to make material available for others
to (re)use, copy, reproduce, distribute, transmit, translate, modify, remix
and build upon. Surely this would be a more substantial strategy to bridge the
gap between authors and users with respect to the production, dissemination
and consumption of knowledge?

With respect to the second part of your question, could these practices find a
broader use? I am not sure, mainly because of the specific characteristics of
academia and scholarly publishing, where scholars are directly employed and
paid by their institutions for the research work they do. Hence, self-
archiving this work would not directly lead to any or much loss of income for
academics. In other fields, such as literary publishing, this issue of
remuneration can, however, become quite urgent, even though many [free
culture](https://en.wikipedia.org/wiki/Free_culture_movement) activists (such
as Lawrence Lessig and Cory Doctorow) have argued that freely sharing cultural
goods online, or even self-publishing, doesn’t necessarily need to lead to any
loss of income for cultural producers. So in this respect I don’t think we can
lift something like open access self-archiving out of its specific context and
apply it to other contexts all that easily, although we should certainly
experiment with it in different domains of digital culture.

**DG, VN: Following your answers, we would also like to ask you for
suggestions. Do you notice any unresolved or emerging questions in the
contemporary context of digital archiving practices and their relation to the
publishing realm?**

JA: So many :). Just to name a few: the politics of search and filtering
related to information overload; the ethics and politics of publishing in
relationship to when, where, how and why we decide to publish our research,
for what reasons and with what underlying motivations; the continued text- and
object-based focus of our archiving and publishing practices and platforms,
where there is a lack of space to publish and develop more multimodal,
iterative, diagrammatic and speculative forms of scholarship; and issues of
free labor and the problem of remuneration of intellectual labor in sharing
economies, etc.

**Bibliography**

* Adema, J. (2014) ‘Embracing Messiness’. [online] available from [17 November 2014]
* Adema, J. and Hall, G. (2013) ‘The Political Nature of the Book: On Artists’ Books and Radical Open Access’. _New Formations_ 78 (1), 138–156
* Barad, K. (2007) _Meeting the Universe Halfway: Quantum Physics and the Entanglement of Matter and Meaning_. Duke University Press
* Cramer, F. (2013) _Anti-Media: Ephemera on Speculative Arts_. Rotterdam : New York, NY: nai010 publishers
* Drucker, J. (2013) _Performative Materiality and Theoretical Approaches to Interface_. [online] 7 (1). available from [4 April 2014]
* Ernst, W. (2011) ‘Media Archaeography: Method and Machine versus History and Narrative of Media’. in _Media Archaeology: Approaches, Applications, and Implications_. ed. by Huhtamo, E. and Parikka, J. University of California Press
* Hall, G. (2014) ‘Copyfight’. in _Critical Keywords for the Digital Humanities_ , [online] Lueneburg: Centre for Digital Cultures (CDC). available from [5 December 2014]
* Harnad, S. (2012) ‘Open Access: Gratis and Libre’. [3 May 2012] available from [4 March 2014]
* Hayles, N.K. (2004) ‘Print Is Flat, Code Is Deep: The Importance of Media-Specific Analysis’. _Poetics Today_ 25 (1), 67–90
* Kember, S. and Zylinska, J. (2012) _Life After New Media: Mediation as a Vital Process_. MIT Press
* Kirschenbaum, M. (2013) ‘The .txtual Condition: Digital Humanities, Born-Digital Archives, and the Future Literary’. _DHQ: Digital Humanities Quarterly_ [online] 7 (1). available from [20 July 2014]
* Lorusso, S. (2014) _The Post-Digital Publishing Archive: An Inventory of Speculative Strategies_. in ‘The Aesthetics of the Humanities: Towards a Poetic Knowledge Production’ [online] held 11 June 2014 at Coventry University. available from [31 May 2015]
* Snijder, R. (2010) ‘The Profits of Free Books: An Experiment to Measure the Impact of Open Access Publishing’. _Learned Publishing_ 23 (4), 293–301
* Swartz, A. (2008) _Guerilla Open Access Manifesto_ [online] available from [31 May 2015]


Murtaugh
A bag but is language nothing of words
2016


## A bag but is language nothing of words

### From Mondotheque


(language is nothing but a bag of words)

[Michael Murtaugh](/wiki/index.php?title=Michael_Murtaugh "Michael Murtaugh")

In text indexing and other machine reading applications the term "bag of
words" is frequently used to underscore how processing algorithms often
represent text using a data structure (word histograms or weighted vectors)
where the original order of the words in sentence form is stripped away. While
"bag of words" might well serve as a cautionary reminder to programmers of the
essential violence perpetrated against a text and a call to critically question
the efficacy of methods based on subsequent transformations, the expression's
use seems in practice more like a badge of pride or a schoolyard taunt that
would go: Hey language: you're nothin' but a big BAG-OF-WORDS.

## Bag of words

In information retrieval and other so-called _machine-reading_ applications
(such as text indexing for web search engines) the term "bag of words" is used
to underscore how in the course of processing a text the original order of the
words in sentence form is stripped away. The resulting representation is then
a collection of each unique word used in the text, typically weighted by the
number of times the word occurs.

Bags of words, also known as word histograms or weighted term vectors, are a
standard part of the data engineer's toolkit. But why such a drastic
transformation? The utility of "bag of words" is in how it makes text amenable
to code: first, in that it's very straightforward to implement the translation
from a text document to a bag of words representation; more significantly,
this transformation then opens up a wide collection of tools and techniques
for further transformation and analysis. For instance, a number of
libraries available in the booming field of "data science" work with "high
dimension" vectors; bag of words is a way to transform a written document into
a mathematical vector where each "dimension" corresponds to the (relative)
quantity of each unique word. While physically unimaginable and abstract
(imagine each of Shakespeare's works as points in a 14-million-dimensional
space), from a formal mathematical perspective it's quite a comfortable idea,
and many complementary techniques (such as principal component analysis) exist
to reduce the resulting complexity.
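The translation from document to bag is indeed straightforward; the following
Python sketch illustrates the principle (lowercase, tokenize, count) and
nothing more — it is not how any production indexing system implements it:

```python
import re
from collections import Counter

def bag_of_words(text):
    """Strip word order away: lowercase, tokenize, and count occurrences."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

doc = "The cat sat on the mat. The mat sat on the floor."
bag = bag_of_words(doc)
# Only (word, count) pairs remain, e.g. bag["the"] == 4, bag["mat"] == 2;
# the sentence structure has been discarded.
```

Each unique word becomes one "dimension"; lining the counts up against a fixed
vocabulary turns the bag into the weighted term vector described above.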

What's striking about a bag of words representation, given its centrality in so
many text retrieval applications, is its irreversibility. Given a bag of words
representation of a text, producing the original text would require in essence
the "brain" of a writer to recompose sentences, working with the patience of a
devoted cryptogram puzzler to draw from the precise stock of available words.
While "bag of words" might well serve as a cautionary reminder to programmers
of the essential violence perpetrated against a text and a call to critically
question the efficacy of methods based on subsequent transformations, the
expression's use seems in practice more like a badge of pride or a schoolyard
taunt that would go: Hey language: you're nothing but a big BAG-OF-WORDS.
Following this spirit of the term, "bag of words" celebrates a perfunctory step
of "breaking" a text into a purer form amenable to computation, stripping
language of its silly redundant repetitions and foolishly contrived stylistic
phrasings to reveal a purer inner essence.
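The irreversibility is easy to demonstrate: two sentences with opposite
meanings can yield exactly the same bag. A minimal sketch:

```python
from collections import Counter

def bag(text):
    # The crudest possible bag of words: split on whitespace and count.
    return Counter(text.lower().split())

a = bag("the dog bit the man")
b = bag("the man bit the dog")
# Identical bags, opposite meanings: the word order, and with it the
# sense of the sentence, is irrecoverable from the representation.
assert a == b
```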

## Book of words

Lieber's Standard Telegraphic Code, first published in 1896 and republished in
various updated editions through the early 1900s, is an example of one of
several competing systems of telegraph code books. The idea was for both
senders and receivers of telegraph messages to use the books to translate
their messages into a sequence of code words, which could then be sent for
less money, as telegraph messages were paid for by the word. In the front of
the book, a list of examples gives a sampling of how a message like "Have
bought for your account 400 bales of cotton, March delivery, at 8.34" can be
conveyed by a telegram with the message "Ciotola, Delaboravi". In each case
the reduction in the number of transmitted words is highlighted to underscore
the efficacy of the method. Like a dictionary or thesaurus, the book is
primarily organized around key words, such as _act_ , _advice_ , _affairs_ ,
_bags_ , _bail_ , and _bales_ , under which exhaustive lists of useful phrases
involving the corresponding word are provided in the main pages of the
volume. [1]
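Functionally, such a code book is a lookup table. The toy model below uses the
two code words from Lieber's own sample telegram quoted above; how the full
phrase divides between the two code words is an assumption made here purely
for illustration, not taken from the book itself:

```python
# A toy model of a telegraph code book as a lookup table.
# The phrase split between the two code words is hypothetical.
codebook = {
    "Ciotola": "Have bought for your account 400 bales of cotton",
    "Delaboravi": "March delivery, at 8.34",
}

def decode(telegram):
    """Expand a telegram of code words back into the full message."""
    return ", ".join(codebook[word] for word in telegram.split(", "))

message = decode("Ciotola, Delaboravi")
words_sent = 2                      # billable words on the wire
words_meant = len(message.split())  # words in the expanded message
# Two billable words stand in for a thirteen-word message: the tariff
# savings the book's front matter is at pains to demonstrate.
```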

[![Liebers
P1016847.JPG](/wiki/images/4/41/Liebers_P1016847.JPG)](/wiki/index.php?title=File:Liebers_P1016847.JPG)

[![Liebers
P1016859.JPG](/wiki/images/3/35/Liebers_P1016859.JPG)](/wiki/index.php?title=File:Liebers_P1016859.JPG)

[![Liebers
P1016861.JPG](/wiki/images/3/34/Liebers_P1016861.JPG)](/wiki/index.php?title=File:Liebers_P1016861.JPG)

[![Liebers
P1016869.JPG](/wiki/images/f/fd/Liebers_P1016869.JPG)](/wiki/index.php?title=File:Liebers_P1016869.JPG)

> [...] my focus in this chapter is on the inscription technology that grew
parasitically alongside the monopolistic pricing strategies of telegraph
companies: telegraph code books. Constructed under the bywords “economy,”
“secrecy,” and “simplicity,” telegraph code books matched phrases and words
with code letters or numbers. The idea was to use a single code word instead
of an entire phrase, thus saving money by serving as an information
compression technology. Generally economy won out over secrecy, but in
specialized cases, secrecy was also important.[2]

In Katherine Hayles' chapter devoted to telegraph code books she observes how:

> The interaction between code and language shows a steady movement away from
a human-centric view of code toward a machine-centric view, thus anticipating
the development of full-fledged machine codes with the digital computer. [3]

[![Liebers
P1016851.JPG](/wiki/images/1/13/Liebers_P1016851.JPG)](/wiki/index.php?title=File:Liebers_P1016851.JPG)
Aspects of this transitional moment are apparent in a notice prominently
inserted in the Lieber's code book:

> After July, 1904, all combinations of letters that do not exceed ten will
pass as one cipher word, provided that it is pronounceable, or that it is
taken from the following languages: English, French, German, Dutch, Spanish,
Portuguese or Latin -- International Telegraphic Conference, July 1903 [4]

Conforming to international conventions regulating telegraph communication at
the time, the stipulation that code words be actual words drawn from a
variety of European languages (many of Lieber's code words are indeed
arbitrary Dutch, German, and Spanish words) underscores this particular moment
of transition, as reference to the human body in the form of "pronounceable"
speech from representative languages begins to yield to the inherent potential
for arbitrariness in digital representation.

What telegraph code books remind us of is the relation of language in
general to economy. Whether they be economies of memory, attention, costs
paid to a telecommunications company, or of computer processing time
or storage space, encoding language or knowledge in any form of writing is a
form of shorthand and always involves an interplay with what one expects to
perform or "get out" of the resulting encoding.

> Along with the invention of telegraphic codes comes a paradox that John
Guillory has noted: code can be used both to clarify and occlude. Among the
sedimented structures in the technological unconscious is the dream of a
universal language. Uniting the world in networks of communication that
flashed faster than ever before, telegraphy was particularly suited to the
idea that intercultural communication could become almost effortless. In this
utopian vision, the effects of continuous reciprocal causality expand to
global proportions capable of radically transforming the conditions of human
life. That these dreams were never realized seems, in retrospect, inevitable.
[5]

[![Liebers
P1016884.JPG](/wiki/images/9/9c/Liebers_P1016884.JPG)](/wiki/index.php?title=File:Liebers_P1016884.JPG)

[![Liebers
P1016852.JPG](/wiki/images/7/74/Liebers_P1016852.JPG)](/wiki/index.php?title=File:Liebers_P1016852.JPG)

[![Liebers
P1016880.JPG](/wiki/images/1/11/Liebers_P1016880.JPG)](/wiki/index.php?title=File:Liebers_P1016880.JPG)

Far from providing a universal system of encoding messages in the English
language, Lieber's code is quite clearly designed for the particular needs and
conditions of its use. In addition to the phrases ordered by keywords, the
book includes a number of tables of terms for specialized use. One table lists
a set of words used to describe all possible permutations of numeric grades of
coffee (Choliam = 3,4, Choliambos = 3,4,5, Choliba = 4,5, etc.); another table
lists pairs of code words to express the respective daily rise or fall of the
price of coffee at the port of Le Havre in increments of a quarter of a Franc
per 50 kilos ("Chirriado = prices have advanced 1 1/4 francs"). From an
archaeological perspective, Lieber's code book reveals a cross section of
the needs and desires of early 20th-century business communication between the
United States and its trading partners.

The advertisements lining Lieber's Code book further situate its use and
that of commercial telegraphy. Among the many advertisements for banking and
law services, office equipment, and alcohol are several ads for gun powder and
explosives, drilling equipment, and metallurgic services, all with specific
applications to mining. Extending telegraphy's formative role in ship-to-
shore and ship-to-ship communication for reasons of safety, commercial
telegraphy widened this network of communication to include those parties
coordinating the "raw materials" being mined, grown, or otherwise extracted
from overseas sources and shipped back for sale.

## "Raw data now!"

From [La ville intelligente - Ville de la connaissance](/wiki/index.php?title
=La_ville_intelligente_-_Ville_de_la_connaissance "La ville intelligente -
Ville de la connaissance"):

Given that new modernist forms and the use of materials propagated an
abundance of decorative elements, Paul Otlet believed in the possibility of
language as a model of '[raw
data](/wiki/index.php?title=Bag_of_words "Bag of words")', reducing it to
essential information and unambiguous facts, while discarding all
inefficient and subjective elements.


From [The Smart City - City of Knowledge](/wiki/index.php?title
=The_Smart_City_-_City_of_Knowledge "The Smart City - City of Knowledge"):

As new modernist forms and use of materials propagated the abundance of
decorative elements, Otlet believed in the possibility of language as a model
of '[raw data](/wiki/index.php?title=Bag_of_words "Bag of words")', reducing
it to essential information and unambiguous facts, while removing all
inefficient assets of ambiguity or subjectivity.


> Tim Berners-Lee: [...] Make a beautiful website, but first give us the
unadulterated data, we want the data. We want unadulterated data. OK, we have
to ask for raw data now. And I'm going to ask you to practice that, OK? Can
you say "raw"?

>

> Audience: Raw.

>

> Tim Berners-Lee: Can you say "data"?

>

> Audience: Data.

>

> TBL: Can you say "now"?

>

> Audience: Now!

>

> TBL: Alright, "raw data now"!

>

> [...]

>

> So, we're at the stage now where we have to do this -- the people who think
it's a great idea. And all the people -- and I think there's a lot of people
at TED who do things because -- even though there's not an immediate return on
the investment because it will only really pay off when everybody else has
done it -- they'll do it because they're the sort of person who just does
things which would be good if everybody else did them. OK, so it's called
linked data. I want you to make it. I want you to demand it. [6]

## Un/Structured

As graduate students at Stanford, Sergey Brin and Lawrence (Larry) Page had an
early interest in producing "structured data" from the "unstructured" web. [7]

> The World Wide Web provides a vast source of information of almost all
types, ranging from DNA databases to resumes to lists of favorite restaurants.
However, this information is often scattered among many web servers and hosts,
using many different formats. If these chunks of information could be
extracted from the World Wide Web and integrated into a structured form, they
would form an unprecedented source of information. It would include the
largest international directory of people, the largest and most diverse
databases of products, the greatest bibliography of academic works, and many
other useful resources. [...]

>

> **2.1 The Problem**
> Here we define our problem more formally:
> Let D be a large database of unstructured information such as the World
Wide Web [...] [8]

In a paper titled _Dynamic Data Mining_, Brin and Page situate their research
as looking for _rules_ (statistical correlations) between words used in web
pages. The "baskets" they mention stem from the origins of "market basket"
techniques developed to find correlations between the items recorded in the
purchase receipts of supermarket customers. In their case, they deal with web
pages rather than shopping baskets, and words instead of purchases. In
transitioning to the much larger scale of the web, they describe the
usefulness of their research in terms of its computational economy, that is,
the ability to tackle the scale of the web and still perform, with
contemporary computing power, completing the task in a reasonably short amount
of time.

> A traditional algorithm could not compute the large itemsets in the lifetime
of the universe. [...] Yet many data sets are difficult to mine because they
have many frequently occurring items, complex relationships between the items,
and a large number of items per basket. In this paper we experiment with word
usage in documents on the World Wide Web (see Section 4.2 for details about
this data set). This data set is fundamentally different from a supermarket
data set. Each document has roughly 150 distinct words on average, as compared
to roughly 10 items for cash register transactions. We restrict ourselves to a
subset of about 24 million documents from the web. This set of documents
contains over 14 million distinct words, with tens of thousands of them
occurring above a reasonable support threshold. Very many sets of these words
are highly correlated and occur often. [9]
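The market-basket idea the paper adapts can be sketched in a few lines: treat each document as a "basket" of its distinct words, count how often word pairs co-occur across baskets, and keep only the pairs above a support threshold. The following is a minimal sketch, not the paper's actual algorithm (which samples the rule space rather than enumerating it); the sample documents and threshold are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(documents, min_support):
    """Count co-occurring word pairs across document "baskets".

    Each document is reduced to its set of distinct words (a basket);
    a pair's support is the number of baskets containing both words.
    """
    counts = Counter()
    for doc in documents:
        basket = sorted(set(doc.lower().split()))
        counts.update(combinations(basket, 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}

docs = [
    "the web is a vast source of information",
    "the web scatters information across many hosts",
    "supermarket receipts record items per basket",
]
print(frequent_pairs(docs, min_support=2))
```

Even in this toy form the combinatorial pressure is visible: a basket of 150 distinct words (the paper's average per document) yields over 11,000 pairs, which is why naive enumeration breaks down at web scale.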

## Un/Ordered

In programming, I've encountered a recurring "problem" that's quite
symptomatic. It goes something like this: you (the programmer) have managed to
cobble together a lovely "content management system" (either from scratch, or
using any number of helpful frameworks) where your user can enter some "items"
into a database, for instance to store bookmarks. The entered items are then
automatically presented in list form (say on a web page). The author: it's
great, except... could this bookmark come before that one? The problem stems
from the fact that the database ordering (a core functionality provided by any
database) somehow applies a sorting logic that's almost but not quite right. A
typical example is the sorting of names, where the details (where to place a
name that starts with a Norwegian "Ø", for instance) are language-specific,
and when a mixture of languages occurs, no single ordering is necessarily
"correct". The (often) exasperated programmer might hastily add an additional
database field so that each item can also have an "order" (perhaps in the form
of a date or some other kind of (alpha)numerical "sorting" value) to be used
to correctly order the resulting list. Now the author has a means, awkward and
indirect but workable, to control the order of the presented data on the start
page. But one might well ask: why not just edit the resulting listing as a
document? Not possible! Contemporary content management systems are based on a
data flow from a "pure" source, the database, through controlling code and
templates, to produce a document as a result. The document isn't the data;
it's the end result of an irreversible process. This problem, in this and many
variants, is widespread, and it reveals an essential backwardness in a
particular "computer scientist" mindset about what constitutes "data", and in
particular about data's relationship to order, that turns what might be a
straightforward question of editing a document into an over-engineered
database.
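Both halves of the anecdote are easy to demonstrate. A minimal sketch in Python (the names and "order" values are invented): the default sort compares Unicode code points, whereas traditional Norwegian collation treats "Aa" as "Å", the last letter of the alphabet; hence the workaround of a hand-maintained "order" field.

```python
names = ["Østergaard", "Aagaard", "Olsen"]

# sorted() compares Unicode code points: "Ø" (U+00D8) lands after "Z",
# and "Aagaard" lands first, even though Norwegian collation treats
# "Aa" as "Å", the final letter of the alphabet.
print(sorted(names))  # ['Aagaard', 'Olsen', 'Østergaard']

# The workaround described above: an extra "order" field per item,
# set by hand, that the listing is sorted on instead of the title.
bookmarks = [
    {"title": "Østergaard", "order": 2},
    {"title": "Aagaard", "order": 3},  # "Aa" sorts as "Å": last
    {"title": "Olsen", "order": 1},
]
listing = [b["title"] for b in sorted(bookmarks, key=lambda b: b["order"])]
print(listing)  # ['Olsen', 'Østergaard', 'Aagaard']
```

The "order" column works, but it is exactly the awkward indirection described above: the author edits a number in a database in order to move a line in a document.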

Nikolaos Vogiatzis, whose research explores playful and radically subjective
alternatives to the list, and with whom I have recently been working, was
struck by how the earliest specifications of HTML (still valid today) have
separate elements (OL and UL) for "ordered" and "unordered" lists.

> The representation of the list is not defined here, but a bulleted list for
unordered lists, and a sequence of numbered paragraphs for an ordered list
would be quite appropriate. Other possibilities for interactive display
include embedded scrollable browse panels. [10]

Vogiatzis' surprise lay in the idea of a list ever being considered
"unordered" (or, in the language used in the specification, of order ever
being considered "insignificant"). Indeed, in its suggested representation,
still followed by modern web browsers, the only visual difference between the
two is that UL items are preceded by a bullet symbol, while OL items are
numbered.

The idea of ordering runs deep in programming practice, where essentially
different data structures are employed depending on whether order is to be
maintained. The indexes of a "hash" table, for instance (also known as an
associative array), are ordered in an unpredictable way governed by the
particular implementation. This data structure, extremely prevalent in
contemporary programming practice, sacrifices order to offer other kinds of
efficiency (fast text-based retrieval, for instance).
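The trade-off is visible directly in Python, whose built-in set is backed by a hash table (a minimal illustration; the word list is invented, and the iteration-order caveat describes CPython behaviour):

```python
words = ["cotton", "oil", "words", "data", "mining"]

# A list preserves the order items were added, but a membership
# test must scan the elements: O(n).
assert words[2] == "words"

# A set stores items in a hash table: membership tests are O(1)
# on average, but iteration order is an artifact of the hashing,
# not something to rely on.
word_set = set(words)
assert "data" in word_set                 # fast lookup
assert sorted(word_set) == sorted(words)  # same contents...
# ...but list(word_set) may come out in any order.
```

(CPython's dict, by contrast, has guaranteed insertion order only since Python 3.7; before that it behaved like the set above.)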

## Data mining

In announcing Google's impending data center in Mons, Belgian prime minister
Di Rupo invoked the link between the history of the mining industry in the
region and the present and future interest in "data mining" as practiced by IT
companies such as Google.

Whether speaking of bales of cotton, barrels of oil, or bags of words, what
links these subjects is the way in which the notion of "raw material" obscures
the labor and power structures employed to secure them. "Raw" is always
relative: "purity" depends on processes of "refinement" that typically carry
social/ecological impact.

Stripping language of order is an act of "disembodiment", detaching it from
the acts of writing and reading. The shift from (human) reading to machine
reading involves a shift of responsibility from the individual human body to
the obscured responsibilities and seemingly inevitable forces of the
"machine", be it the machine of a market or the machine of an algorithm.

From [X = Y](/wiki/index.php?title=X_%3D_Y "X = Y"):

Still, it is reassuring to know that the products hold traces of the work,
that even with the progressive removal of human signs in automated processes,
the workers' presence never disappears completely. This presence is proof of
the materiality of information production, and becomes a sign of the economies
and paradigms of efficiency and profitability that are involved.


The computer scientist's view of textual content as "unstructured", be it in a
webpage or the OCR-scanned pages of a book, reflects a disregard for the
processes and labor of writing, editing, design, layout, typesetting, and
eventually publishing, collecting and cataloging [11].

"Unstructured" to the computer scientist, means non-conformant to particular
forms of machine reading. "Structuring" then is a social process by which
particular (additional) conventions are agreed upon and employed. Computer
scientists often view text through the eyes of their particular reading
algorithm, and in the process (voluntarily) blind themselves to the work
practices which have produced and maintain these "resources".

Berners-Lee, in chastising his audience of web publishers not only to publish
online but to release "unadulterated" data, betrays a lack of imagination in
considering how language is itself structured, and a blindness to the need for
more than additional technical standards to connect to existing publishing
practices.

Last Revision: 2.08.2016

1. ↑ Benjamin Franklin Lieber, _Lieber's Standard Telegraphic Code_, New York, 1896
2. ↑ N. Katherine Hayles, "Technogenesis in Action: Telegraph Code Books and the Place of the Human", in _How We Think: Digital Media and Contemporary Technogenesis_, 2006
3. ↑ Hayles
4. ↑ Lieber's
5. ↑ Hayles
6. ↑ Tim Berners-Lee, "The next web", TED Talk, February 2009
7. ↑ "Research on the Web seems to be fashionable these days and I guess I'm no exception." From Brin's [Stanford webpage](http://infolab.stanford.edu/~sergey/)
8. ↑ Sergey Brin, "Extracting Patterns and Relations from the World Wide Web", Proceedings of the WebDB Workshop at EDBT, 1998
9. ↑ Sergey Brin and Lawrence Page, "Dynamic Data Mining: Exploring Large Rule Spaces by Sampling", 1998, p. 2
10. ↑ Tim Berners-Lee and Daniel Connolly, "Hypertext Markup Language (HTML): Internet Draft", June 1993
11. ↑

Retrieved from
[https://www.mondotheque.be/wiki/index.php?title=A_bag_but_is_language_nothing_of_words&oldid=8480](https://www.mondotheque.be/wiki/index.php?title=A_bag_but_is_language_nothing_of_words&oldid=8480)

Fuller & Dockray
In the Paradise of Too Many Books: An Interview with Sean Dockray
2011


# In the Paradise of Too Many Books: An Interview with Sean Dockray

By Matthew Fuller, 4 May 2011


If the appetite to read comes with reading, then open text archive Aaaaarg.org
is a great place to stimulate and sate your hunger. Here, Matthew Fuller talks
to long-term observer Sean Dockray about the behaviour of text and
bibliophiles in a text-circulation network

Sean Dockray is an artist and a member of the organising group for the LA
branch of The Public School, a geographically distributed and online platform
for the self-organisation of learning.1 Since its initiation by Telic Arts, an
organisation which Sean directs, The Public School has also been taken up as a
model in a number of cities in the USA and Europe.2

We met to discuss the growing phenomenon of text-sharing. Aaaaarg.org has
developed over the last few years as a crucial site for the sharing and
discussion of texts drawn from cultural theory, politics, philosophy, art and
related areas. Part of this discussion is about the circulation of texts,
scanned and uploaded to other sites that it provides links to. Since
participants in The Public School often draw from the uploads to form readers
or anthologies for specific classes or events series, this project provides a
useful perspective from which to talk about the nature of text in the present
era.

**Sean Dockray:** People usually talk about three key actors in discussions
about publishing, which all play fairly understandable roles: readers,
publishers, and authors.

**Matthew Fuller:** Perhaps it could be said that Aaaaarg.org suggests some
other actors that are necessary for a real culture of text; firstly that books
also have some specific kind of activity to themselves, even if in many cases
it is only a latent quality, of storage, of lying in wait and, secondly, that
within the site, there is also this other kind of work done, that of the
public reception and digestion, the response to the texts, their milieu, which
involves other texts, but also systems and organisations, and platforms, such
as Aaaaarg.

![](/sites/www.metamute.org/files/u73/Roland_Barthes_web.jpg)

Image: A young Roland Barthes, with space on his bookshelf

**SD:** Where even the three actors aren't stable! The people that are using
the site are fulfilling some role that usually the publisher has been doing or
ought to be doing, like marketing or circulation.

**MF:** Well it needn't be seen as promotion necessarily. There's also this
kind of secondary work with critics, reviewers and so on - which we can say is
also taken on by universities, for instance, and reading groups, magazines,
reviews - that gives an additional life to the text or brings it particular
kinds of attention, certain kind of readerliness.

**SD:** Situates it within certain discourses, makes it intelligible in a way,
in a different way.

**MF:** Yes, exactly, there's this other category of life to the book, which
is that of the kind of milieu or the organisational structure in which it
circulates and the different kind of networks of reference that it implies and
generates. Then there's also the book itself, which has some kind of agency,
or at least resilience and salience, when you think about how certain books
have different life cycles of appearance and disappearance.

**SD:** Well, in a contemporary sense, you have something like _Nights of
Labour_ by Rancière - which is probably going to be republished or reprinted
imminently - but which has been sort of invisible, out of print, until, by
surprise, it becomes much more visible within the art world or something.

**MF:** And it's also been interesting to see how the art world plays a role
in the reverberations of text which isn't the same as that in cultural theory
or philosophy. Certainly _Nights of Labour_, something that is very close to
the role that cultural studies plays in the UK, but which (cultural studies)
has no real equivalent in France, so then, geographically and linguistically,
and therefore also in a certain sense conceptually, the life of a book
exhibits these weird delays and lags and accelerations, so that's a good
example. I'm interested in what role Aaaaarg plays in that kind of
proliferation, the kind of things that books do, where they go and how they
become manifest. So I think one of the things Aaaaarg does is to make books
active in different ways, to bring out a different kind of potential in
publishing.

**SD:** Yes, the debate has tended so far to get stuck in those three actors
because people tend to end up picking a pair and placing them in opposition to
one another, especially around intellectual property. The discussion is very
simplistic and ends up in that way, where it's the authors against readers, or
authors against their publishers, with the publishers often introducing
scarcity, where the authors don't want it to be - that's a common argument.
There's this situation where the record industry is suing its own audience.
That's typically the field now.

**MF:** So within that kind of discourse of these three figures, have there
been cases where you think it's valid that there needs to be some form of
scarcity in order for a publishing project to exist?

**SD:** It's obviously not for me to say that there does or doesn't need to be
scarcity but the scarcity that I think we're talking about functions in a
really specific way: it's usually within academic publishing, the book or
journal is being distributed to a few libraries and maybe 500 copies of it are
being printed, and then the price is something anywhere from $60 to $500, and
there's just sort of an assumption that the audience is very well defined and
stable and able to cope with that.

**MF:** Yeah, which recognises that the audiences may be stable as an
institutional form, but not that over time the individual parts of say that
library user population change in their relationship to the institution. If
you're a student for a few years and then you no longer have access, you lose
contact with that intellectual community...

**SD:** Then people just kind of have to cling to that intellectual community.
So when scarcity functions like that, I can't think of any reason why that
_needs_ to happen. Obviously it needs to happen in the sense that there's a
relatively stable balance that wants to perpetuate itself, but what you're
asking is something else.

**MF:** Well there are contexts where the publisher isn't within that academic
system of very high costs, sustained by volunteer labour by academics, the
classic peer review system, but if you think of more of a trade publisher like
a left or a movement or underground publisher, whose books are being
circulated on Aaaaarg...

**SD:** They're in a much more precarious position obviously than a university
press whose economics are quite different, and with the volunteer labour or
the authors are being subsidised by salary - you have to look at the entire
system rather than just the publication. But in a situation where the
publisher is much more precarious and relying on sales and a swing in one
direction or another makes them unable to pay the rent on a storage facility,
one can definitely see why some sort of predictability is helpful and
necessary.

**MF:** So that leads me to wonder whether there are models of publishing that
are emerging that work with online distribution, or with the kind of thing
that Aaaaarg does specifically. Are there particular kinds of publishing
initiatives that really work well in this kind of context where free digital
circulation is understood as an a priori, or is it always in this kind of
parasitic or cyclical relationship?

**SD:** I have no idea how well they work actually; I don't know how well,
say, Australian publisher re.press works, for example.3 I like a lot of what
they publish, it's given visibility when re.press distributes it and that's a
lot of what a publisher's role seems to be (and what Aaaaarg does as well).
But are you asking how well it works in terms of economics?

**MF:** Well, just whether there's new forms of publishing emerging that work
well in this context that cut out some of the problems ?

**SD:** Well, there's also the blog. Certain academic discourses, philosophy
being one, that are carried out on blogs really work to a certain extent, in
that there is an immediacy to ideas, their reception and response. But there's
other problems, such as the way in which, over time, the posts quickly get
forgotten. In this sense, a publication, a book, is kind of nice. It
crystallises and stays around.

**MF:** That's what I'm thinking, that the book is a particular kind of thing
which has its own quality as a form of media. I also wonder whether there
might be intermediate texts, unfinished texts, draft texts that might
circulate via Aaaaarg for instance or other systems. That, at least to me,
would be kind of unsatisfactory but might have some other kind of life and
readership to it. You know, as you say, the blog is a collection of relatively
occasional texts, or texts that are a work in progress, but something like
Aaaaarg perhaps depends upon texts that are finished, that are absolutely the
crystallisation of a particular thought.

![](/sites/www.metamute.org/files/u73/tree_of_knowledge_web.jpg)

Image: The Tree of Knowledge as imagined by Hans Sebald Beham in his 1543
engraving _Adam and Eve_

**SD:** Aaaaarg is definitely not a futuristic model. I mean, it occurs at a
specific time, which is while we're living in a situation where books exist
effectively as a limited edition. They can travel the world and reach certain
places, and yet the readership is greatly outpacing the spread and
availability of the books themselves. So there's a disjunction there, and
that's obviously why Aaaaarg is so popular. Because often there are maybe no
copies of a certain book within 400 miles of a person that's looking for it,
but then they can find it on that website, so while we're in that situation it
works.

**MF:** So it's partly based on a kind of asymmetry, that's spatial, that's
about the territories of publishers and distributors, and also a kind of
asymmetry of economics?

**SD:** Yeah, yeah. But others too. I remember when I was affiliated with a
university and I had JSTOR access and all these things and then I left my job
and then at some point not too long after that my proxy access expired and I
no longer had access to those articles which now would cost $30 a pop just to
even preview. That's obviously another asymmetry, even though, geographically
speaking, I'm in an identical position, just that my subject position has
shifted from affiliated to unaffiliated.

**MF:** There's also this interesting way in which Aaaaarg has gained
different constituencies globally, you can see the kind of shift in the texts
being put up. It seems to me anyway there are more texts coming from non-
western authors. This kind of asymmetry generates a flux. We're getting new
alliances between texts and you can see new bibliographies emerge.

**SD:** Yeah, the original community was very American and European and
gradually people were signing up at other places in order to have access to a
lot of these texts that didn't reach their libraries or their book stores or
whatever. But then there is a danger of US and European thought becoming
central. A globalisation where a certain mode of thought ends up just erasing
what's going on already in the cities where people are signing up, that's a
horrible possible future.

**MF:** But that's already something that's _not_ happening in some ways?

**SD:** Exactly, that's what seems to be happening now. It goes on to
translations that are being put up and then texts that are coming from outside
of the set of US and western authors and so, in a way, it flows back in the
other direction. This hasn't always been so visible, maybe it will begin to
happen some more. But think of the way people can list different texts
together as ‘issues' - a way that you can make arbitrary groupings - and
they're very subjective, you can make an issue named anything and just lump a
bunch of texts in there. But because, with each text, you can see what other
issues people have also put it in, it creates a trace of its use. You can see
that sometimes the issues are named after the reading groups, people are using
the issues format as a collecting tool, they might gather all Portuguese
translations, or The Public School uses them for classes. At other times it's
just one person organising their dissertation research but you see the wildly
different ways that one individual text can be used.

**MF:** So the issue creates a new form of paratext to the text, acting as a
kind of meta-index, they're a new form of publication themselves. To publish a
bibliography that actively links to the text itself is pretty cool. That also
makes me think within the structures of Aaaaarg it seems that certain parts of
the library are almost at breaking point - for instance the alphabetical
structure.

**SD:** Which is funny because it hasn't always been that alphabetical
structure either, it used to just be everything on one page, and then at some
point it was just taking too long for the page to load up A-Z. And today A is
as long as the entire index used to be, so yeah these questions of density and
scale are there but they've always been dealt with in a very ad hoc kind of
way, dealing with problems as they come. I'm sure that will happen. There
hasn't always been a search and, in a way, the issues, along with
alphabetising, became ways of creating more manageable lists, but even now the
list of issues is gigantic. These are problems of scale.

**MF:** So I guess there's also this kind of question that emerges in the
debate on reading habits and reading practices, this question of the breadth
of reading that people are engaging in. Do you see anything emerging in
Aaaaarg that suggests a new consistency of handling reading material? Is there
a specific quality, say, of the issues? For instance, some of them seem quite
focused, and others are very broad. They may provide insights into how new
forms of relationships to intellectual material may be emerging that we don't
quite yet know how to handle or recognise. This may be related to the lament
for the classic disciplinary road of deep reading of specific materials with a
relatively focused footprint whereas, it is argued, the net is encouraging a
much wider kind of sampling of materials with not necessarily so much depth.

**SD:** It's partially driven by people simply being in the system, in the
same way that the library structures our relationship to text, the net does it
in another way. One comment I've heard is that there's too much stuff on
Aaaaarg, which wasn't always the case. It used to be that I read every single
thing that was posted because it was slow enough and the things were short
enough that my response was, ‘Oh something new, great!' and I would read it.
But now, obviously that is totally impossible, there's too much; but in a way
that's just the state of things. It does seem like certain tactics of making
sense of things, of keeping things away and letting things in and queuing
things for reading later become just a necessary part of even navigating. It's
just the terrain at the moment, but this is only one instance. Even when I was
at the university and going to libraries, I ended up with huge stacks of books
and I'd just buy books that I was never going to read just to have them
available in my library, so I don't think feeling overwhelmed by books is
particularly new, just maybe the scale of it is. In terms of how people
actually conduct themselves and deal with that reality, it's difficult to say.
I think the issues are one of the few places where you would see any sort of
visible answers on Aaaaarg, otherwise it's totally anecdotal. At The Public
School we have organised classes in relationship to some of the issues, and
then we use the classes to also figure out what texts we are going to be
reading in the future, to make new issues and new classes. So it becomes an
organising group, reading and working its way through subject matter and
material, then revisiting that library and seeing what needs to be there.

**MF:** I want to follow that kind of strand of habits of accumulation,
sorting, deferring and so on. I wonder, what is a kind of characteristic or
unusual reading behavior? For instance are there people who download the
entire list? Or do you see people being relatively selective? How does the
mania of the net, with this constant churning of data, map over to forms of
bibliomania?

**SD:** Well, in Aaaaarg it's again very specific. Anecdotally again, I have
heard from people how much they download and sometimes they're very selective,
they just see something that's interesting and download it, other times they
download everything and occasionally I hear about this mania of mirroring the
whole site. What I mean about being specific to Aaaaarg is that a lot of the
mania isn't driven by just the need to have everything; it's driven by the
acknowledgement that the source is going to disappear at some point. That
sense of impending disappearance is always there, so I think that drives a lot
of people to download everything because, you know, it's happened a couple
times where it's just gone down or moved or something like that.

**MF:** It's true, it feels like something that is there even for a few weeks
or a few months. By a sheer fluke it could last another year, who knows.

**SD:** It's a different kind of mania, and usually we get lost in this
thinking that people need to possess everything but there is this weird
preservation instinct that people have, which is slightly different. The
dominant sensibility of Aaaaarg at the beginning was the highly partial and
subjective nature to the contents and that is something I would want to
preserve, which is why I never thought it to be particularly exciting to have
lots of high quality metadata - it doesn't have the publication date, it
doesn't have all the great metadata that say Amazon might provide. The system
is pretty dismal in that way, but I don't mind that so much. I read something
on the Internet which said it was like being in the porn section of a video
store with all black text on white labels, it was an absolutely beautiful way
of describing it. Originally Aaaaarg was about trading just those particular
moments in a text that really struck you as important, that you wanted other
people to read so it would be very short, definitely partial, it wasn't a
completist project, although some people maybe treat it in that way now. They
treat it as a thing that wants to devour everything. That's definitely not the
way that I have seen it.

**MF:** And it's so idiosyncratic I mean, you know it's certainly possible
that it could be read in a canonical mode, you can see that there's that
tendency there, of the core of Adorno or Agamben, to take the a's for
instance. But of the more contemporary stuff it's very varied, that's what's
nice about it as well. Alongside all the stuff that has a very long-term
existence, like historical books that may be over a hundred years old, what
turns up there is often unexpected, but certainly not random or
uninterpretable.

![](/sites/www.metamute.org/files/u1/malraux_web3_0.jpg)

Image: French art historian André Malraux lays out his _Musée Imaginaire_ ,
1947

**SD:** It's interesting to think a little bit about what people choose to
upload, because it's not easy to upload something. It takes a good deal of
time to scan a book. I mean obviously some things are uploaded which are, have
always been, digital. (I wrote something about this recently about the scan
and the export - the scan being something that comes out of a labour in
relationship to an object, to the book, and the export is something where the
whole life of the text has sort of been digital from production to circulation
and reception.) I happen to think of Aaaaarg in the realm of the scan and the
bootleg. When someone actually scans something they're potentially spending
hours: they're doing the work on the book, they're doing something with
software, they're uploading.

**MF:** Aaaaarg hasn't introduced file quality thresholds either.

**SD:** No, definitely not. Where would that go?

**MF:** You could say with PDFs they have to be searchable texts?

**SD:** I'm sure a lot of people would prefer that. Even I would prefer it a
lot of the time. But again there is the idiosyncratic nature of what appears,
and there is also the idiosyncratic nature of the technical quality and
sometimes it's clear that the person that uploads something just has no real
experience of scanning anything. It's kind of an inevitable outcome. There are
movie sharing sites that are really good about quality control both in the
metadata and what gets up; but I think that if you follow that to the end,
then basically you arrive at the exported version being the Platonic text, the
impossible, perfect, clear, searchable, small - totally eliminating any trace
of what is interesting, the hand of reading and scanning, and this is what you
see with a lot of the texts on Aaaaarg. You see the hand of the person who's
read that book in the past, you see the hand of the person who scanned it.
Literally, their hand is in the scan. This attention to the labour of both
reading and redistributing, it's important to still have that.

**MF:** You could also find that in different ways for instance with a pdf, a
pdf that was bought directly as an ebook that's digitally watermarked will
have traces of the purchaser coded in there. So then there's also this work of
stripping out that data which will become a new kind of labour. So it doesn't
have this kind of humanistic refrain, the actual hand, the touch of the
labour. This is perhaps more interesting, the work of the code that strips it
out, so it's also kind of recognising that code as part of the milieu.

**SD:** Yeah, that is a good point, although I don't know that it's more
interesting labour.

**MF:** On a related note, The Public School as a model is interesting in that
it's kind of a convention, it has a set of rules, an infrastructure, a
website, it has a very modular being. Participants operate with a simple
organisational grammar which allows them to say ‘I want to learn this' or ‘I
want to teach this' and to draw in others on that basis. There's lots of
proposals for classes, some of them don't get taken up, but it's a process and
a set of resources which allow this aggregation of interest to occur. I just
wonder how you saw that kind of ethos of modularity in a way, as a set of
minimum rules or set of minimum capacities that allow a particular set of
things to occur?

**SD:** This may not respond directly to what you were just talking about, but
there's various points of entry to the school and also having something that
people feel they can take on as their own and I think the minimal structure
invites quite a lot of projection as to what that means and what's possible
with it. If it's not doing what you want it to do or you think, ‘I'm not sure
what it is', there's the sense that you can somehow redirect it.

**MF:** It's also interesting that projection itself can become a technical
feature, so in a way the work of the imagination is done also through this
kind of tuning of the software structure. The governance that is handled by
the technical infrastructure actually elicits this kind of projection, elicits
the imagination in an interesting way.

**SD:** Yeah, yeah, I totally agree and, not to put too much emphasis on the
software, although I think that there's good reason to look at both the
software and the conceptual diagram of the school itself, but really in a way
it would grind to a halt if it weren't for the very traditional labour of
people - like an organising committee. In LA there's usually around eight of
us (now Jordan Biren, Solomon Bothwell, Vladada Gallegos, Liz Glynn, Naoko
Miyano, Caleb Waldorf, and me) who are deeply involved in making that
translation of these wishes - thrown onto the website that somehow attract the
other people - into actual classes.

**MF:** What does the committee do?

**SD:** Even that's hard to describe and that's what makes it hard to set up.
It's always very particular to even a single idea, to a single class proposal.
In general it'd be things like scheduling, finding an instructor if an
instructor is what's required for that class. Sometimes it's more about
finding someone who will facilitate, other times it's rounding up materials.
But it could be helping an open proposal take some specific form. Sometimes
it's scanning things and putting them on Aaaaarg. Sometimes, there will be a
proposal - I proposed a class in the very, very beginning on messianic time, I
wanted to take a class on it - and it didn't happen until more than a year and
a half later.

**MF:** Well that's messianic time for you.

**SD:** That and the internet. But other times it will be only a week later.
You know we did one on the Egyptian revolution and its historical context,
something which demanded a very quick turnaround. Sometimes the committee is
going to classes and there will be a new conflict that arises within a class,
that they then redirect into the website for a future proposal, which becomes
another class: a point of friction where it's not just like next, and next,
and next, but rather it's a knot that people can't quite untie, something that
you want to spend more time with, but you may want to move on to other things
immediately, so instead you postpone that to the next class. A lot of The
Public School works like that: it's finding momentum then following it. A lot
of our classes are quite short, but we try and string them together. The
committee are the ones that orchestrate that. In terms of governance, it is
run collectively, although with the committee, every few months people drop
off and new people come on. There are some people who've been on for years.
Other people who stay on just for that point of time that feels right for
them. Usually, people come on to the committee because they come to a lot of
classes, they start to take an interest in the project and before they know it
they're administering it.

**Matthew Fuller's <[m.fuller@gold.ac.uk](mailto:m.fuller@gold.ac.uk)> most
recent book, _Elephant and Castle_, is forthcoming from Autonomedia.**



 
