Murtaugh
A bag but is language nothing of words
2016

## A bag but is language nothing of words

### From Mondotheque

(language is nothing but a bag of words)

[Michael Murtaugh](/wiki/index.php?title=Michael_Murtaugh "Michael Murtaugh")

In text indexing and other machine reading applications the term "bag of
words" is frequently used to underscore how processing algorithms often
represent text using a data structure (word histograms or weighted vectors)
where the original order of the words in sentence form is stripped away. While
"bag of words" might well serve as a cautionary reminder to programmers of the
essential violence perpetrated to a text and a call to critically question the
efficacy of methods based on subsequent transformations, the expression's use
seems in practice more like a badge of pride or a schoolyard taunt that would
go: Hey language: you're nothin' but a big BAG-OF-WORDS.

## Bag of words

In information retrieval and other so-called _machine-reading_ applications
(such as text indexing for web search engines) the term "bag of words" is used
to underscore how in the course of processing a text the original order of the
words in sentence form is stripped away. The resulting representation is then
a collection of each unique word used in the text, typically weighted by the
number of times the word occurs.

Bags of words, also known as word histograms or weighted term vectors, are a
standard part of the data engineer's toolkit. But why such a drastic
transformation? The utility of "bag of words" is in how it makes text amenable
to code, first in that it's very straightforward to implement the translation
from a text document to a bag of words representation. More significantly,
this transformation then opens up a wide collection of tools and techniques
for further transformation and analysis purposes. For instance, a number of
libraries available in the booming field of "data sciences" work with "high
dimension" vectors; bag of words is a way to transform a written document into
a mathematical vector where each "dimension" corresponds to the (relative)
quantity of each unique word. While physically unimaginable and abstract
(imagine each of Shakespeare's works as points in a 14 million dimensional
space), from a formal mathematical perspective, it's quite a comfortable idea,
and many complementary techniques (such as principal component analysis) exist
to reduce the resulting complexity.
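To make the transformation described above concrete, here is a minimal sketch in Python. The sample sentence, the tokenizing rule (lowercase runs of letters and apostrophes) and the function names `bag_of_words` and `to_vector` are illustrative assumptions, not taken from any particular library or from the systems discussed in this text.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase the text, split it into words, and count each unique word."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

def to_vector(bag, vocabulary):
    """Lay the counts out on fixed axes: one dimension per vocabulary word."""
    return [bag[word] for word in vocabulary]

# Hypothetical sample text; any document would do.
text = "the cat sat on the mat and the dog sat too"

bag = bag_of_words(text)          # word order is discarded at this point
vocabulary = sorted(bag)          # one axis per unique word
vector = to_vector(bag, vocabulary)
```

With only eight unique words the vector has eight dimensions; a vocabulary of millions of words yields the "high dimension" vectors mentioned above in exactly the same way.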

What's striking about a bag of words representation, given its centrality in so
many text retrieval applications, is its irreversibility. Reproducing the
original text from its bag of words representation would require, in essence,
the "brain" of a writer to recompose sentences, working with the patience of a
devoted cryptogram puzzler to draw from the precise stock of available words.
While "bag of words" might well serve as a
cautionary reminder to programmers of the essential violence perpetrated to a
text and a call to critically question the efficacy of methods based on
subsequent transformations, the expression's use seems in practice more like a
badge of pride or a schoolyard taunt that would go: Hey language: you're
nothing but a big BAG-OF-WORDS. In this spirit, "bag of words" celebrates the
perfunctory step of "breaking" a text into a form amenable to computation, of
stripping language of its silly redundant repetitions and foolishly contrived
stylistic phrasings to reveal a purer inner essence.
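The irreversibility can be illustrated with the two phrasings of this essay's own title. The sketch below is only a toy under stated assumptions: splitting on whitespace stands in for real tokenizing, and `possible_orderings` is a hypothetical helper that counts the distinct word sequences sharing one bag.

```python
from collections import Counter
from math import factorial, prod

def bag_of_words(text):
    """Count each whitespace-separated word, ignoring case and order."""
    return Counter(text.lower().split())

def possible_orderings(bag):
    """Number of distinct word sequences that produce exactly this bag."""
    total = sum(bag.values())
    return factorial(total) // prod(factorial(n) for n in bag.values())

original = "language is nothing but a bag of words"
shuffled = "a bag but is language nothing of words"

# Both sentences collapse to the same bag, so the bag cannot tell them apart.
same_bag = bag_of_words(original) == bag_of_words(shuffled)
orderings = possible_orderings(bag_of_words(original))
```

Eight distinct words already allow 40320 orderings, and the bag alone gives no way of picking out the one that was written.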

## Book of words

Lieber's Standard Telegraphic Code, first published in 1896 and republished in
various updated editions through the early 1900s, is an example of one of
several competing systems of telegraph code books. The idea was for both
senders and receivers of telegraph messages to use the books to translate
their messages into a sequence of code words which can then be sent for less
money as telegraph messages were paid by the word. In the front of the book, a
list of examples gives a sampling of how messages like: "Have bought for your
account 400 bales of cotton, March delivery, at 8.34" can be conveyed by a
telegram with the message "Ciotola, Delaboravi". In each case the reduction of
number of transmitted words is highlighted to underscore the efficacy of the
method. Like a dictionary or thesaurus, the book is primarily organized around
key words, such as _act_ , _advice_ , _affairs_ , _bags_ , _bail_ , and
_bales_ , under which exhaustive lists of useful phrases involving the
corresponding word are provided in the main pages of the volume. [1]
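The economy of the cotton example can be worked through as simple arithmetic. The sketch below assumes that a telegram is billed per whitespace-separated word, which simplifies the actual tariff rules; `billable_words` is a hypothetical helper. It does not attempt to say which code word stands for which phrase, since the example quoted above gives only the complete message and its coded form.

```python
def billable_words(message):
    """Count words as a telegraph clerk might: one per space-separated token."""
    return len(message.split())

# The two messages are the ones quoted from the front of the code book.
plain = "Have bought for your account 400 bales of cotton, March delivery, at 8.34"
coded = "Ciotola, Delaboravi"

plain_count = billable_words(plain)
coded_count = billable_words(coded)
words_saved = plain_count - coded_count
```

Under this assumption the sender pays for 2 words instead of 13.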

[![Liebers
P1016847.JPG](/wiki/images/4/41/Liebers_P1016847.JPG)](/wiki/index.php?title=File:Liebers_P1016847.JPG)

[![Liebers
P1016859.JPG](/wiki/images/3/35/Liebers_P1016859.JPG)](/wiki/index.php?title=File:Liebers_P1016859.JPG)

[![Liebers
P1016861.JPG](/wiki/images/3/34/Liebers_P1016861.JPG)](/wiki/index.php?title=File:Liebers_P1016861.JPG)

[![Liebers
P1016869.JPG](/wiki/images/f/fd/Liebers_P1016869.JPG)](/wiki/index.php?title=File:Liebers_P1016869.JPG)

> [...] my focus in this chapter is on the inscription technology that grew
parasitically alongside the monopolistic pricing strategies of telegraph
companies: telegraph code books. Constructed under the bywords “economy,”
“secrecy,” and “simplicity,” telegraph code books matched phrases and words
with code letters or numbers. The idea was to use a single code word instead
of an entire phrase, thus saving money by serving as an information
compression technology. Generally economy won out over secrecy, but in
specialized cases, secrecy was also important.[2]

In Katherine Hayles' chapter devoted to telegraph code books she observes how:

> The interaction between code and language shows a steady movement away from
a human-centric view of code toward a machine-centric view, thus anticipating
the development of full-fledged machine codes with the digital computer. [3]

[![Liebers
P1016851.JPG](/wiki/images/1/13/Liebers_P1016851.JPG)](/wiki/index.php?title=File:Liebers_P1016851.JPG)

Aspects of this transitional moment are apparent in a notice prominently
inserted in Lieber's code book:

> After July, 1904, all combinations of letters that do not exceed ten will
pass as one cipher word, provided that it is pronounceable, or that it is
taken from the following languages: English, French, German, Dutch, Spanish,
Portuguese or Latin -- International Telegraphic Conference, July 1903 [4]

Conforming to international conventions regulating telegraph communication at
that time, the stipulation that code words be actual words drawn from a
variety of European languages (many of Lieber's code words are indeed
arbitrary Dutch, German, and Spanish words) underscores this particular moment
of transition as reference to the human body in the form of "pronounceable"
speech from representative languages begins to yield to the inherent potential
for arbitrariness in digital representation.

What telegraph code books do is remind us of the relation of language in
general to economy. Whether the economies in question are those of memory,
attention, costs paid to a telecommunications company, or computer processing
time and storage space, encoding language or knowledge in any form of writing
is a form of shorthand and always involves an interplay with what one expects
to perform or "get out" of the resulting encoding.

> Along with the invention of telegraphic codes comes a paradox that John
Guillory has noted: code can be used both to clarify and occlude. Among the
sedimented structures in the technological unconscious is the dream of a
universal language. Uniting the world in networks of communication that
flashed faster than ever before, telegraphy was particularly suited to the
idea that intercultural communication could become almost effortless. In this
utopian vision, the effects of continuous reciprocal causality expand to
global proportions capable of radically transforming the conditions of human
life. That these dreams were never realized seems, in retrospect, inevitable.
[5]

[![Liebers
P1016884.JPG](/wiki/images/9/9c/Liebers_P1016884.JPG)](/wiki/index.php?title=File:Liebers_P1016884.JPG)

[![Liebers
P1016852.JPG](/wiki/images/7/74/Liebers_P1016852.JPG)](/wiki/index.php?title=File:Liebers_P1016852.JPG)

[![Liebers
P1016880.JPG](/wiki/images/1/11/Liebers_P1016880.JPG)](/wiki/index.php?title=File:Liebers_P1016880.JPG)

Far from providing a universal system of encoding messages in the English
language, Lieber's code is quite clearly designed for the particular needs and
conditions of its use. In addition to the phrases ordered by keywords, the
book includes a number of tables of terms for specialized use. One table lists
a set of words used to describe all possible permutations of numeric grades of
coffee (Choliam = 3,4, Choliambos = 3,4,5, Choliba = 4,5, etc.); another table
lists pairs of code words to express the respective daily rise or fall of the
price of coffee at the port of Le Havre in increments of a quarter of a Franc
per 50 kilos ("Chirriado = prices have advanced 1 1/4 francs"). From an
archaeological perspective, Lieber's code book reveals a cross section of
the needs and desires of early 20th century business communication between the
United States and its trading partners.

The advertisements lining Lieber's code book further situate its use and
that of commercial telegraphy. Among the many advertisements for banking and
law services, office equipment, and alcohol are several ads for gunpowder and
explosives, drilling equipment and metallurgic services, all with specific
applications to mining. Building on telegraphy's formative role in
ship-to-shore and ship-to-ship communication for reasons of safety, commercial
telegraphy extended this network of communication to include those parties
coordinating the "raw materials" being mined, grown, or otherwise extracted
from overseas sources and shipped back for sale.

## "Raw data now!"

From [La ville intelligente - Ville de la connaissance](/wiki/index.php?title=La_ville_intelligente_-_Ville_de_la_connaissance "La ville intelligente - Ville de la connaissance") and its English version, [The Smart City - City of Knowledge](/wiki/index.php?title=The_Smart_City_-_City_of_Knowledge "The Smart City - City of Knowledge"):

As new modernist forms and use of materials propagated the abundance of
decorative elements, Otlet believed in the possibility of language as a model
of '[raw data](/wiki/index.php?title=Bag_of_words "Bag of words")', reducing
it to essential information and unambiguous facts, while removing all
inefficient assets of ambiguity or subjectivity.

> Tim Berners-Lee: [...] Make a beautiful website, but first give us the
unadulterated data, we want the data. We want unadulterated data. OK, we have
to ask for raw data now. And I'm going to ask you to practice that, OK? Can
you say "raw"?

>

> Audience: Raw.

>

> Tim Berners-Lee: Can you say "data"?

>

> Audience: Data.

>

> TBL: Can you say "now"?

>

> Audience: Now!

>

> TBL: Alright, "raw data now"!

>

> [...]

>

> So, we're at the stage now where we have to do this -- the people who think
it's a great idea. And all the people -- and I think there's a lot of people
at TED who do things because -- even though there's not an immediate return on
the investment because it will only really pay off when everybody else has
done it -- they'll do it because they're the sort of person who just does
things which would be good if everybody else did them. OK, so it's called
linked data. I want you to make it. I want you to demand it. [6]

## Un/Structured

As graduate students at Stanford, Sergey Brin and Lawrence (Larry) Page had an
early interest in producing "structured data" from the "unstructured" web. [7]

> The World Wide Web provides a vast source of information of almost all
types, ranging from DNA databases to resumes to lists of favorite restaurants.
However, this information is often scattered among many web servers and hosts,
using many different formats. If these chunks of information could be
extracted from the World Wide Web and integrated into a structured form, they
would form an unprecedented source of information. It would include the
largest international directory of people, the largest and most diverse
databases of products, the greatest bibliography of academic works, and many
other useful resources. [...]

>

> **2.1 The Problem**
> Here we define our problem more formally:
> Let D be a large database of unstructured information such as the World
Wide Web [...] [8]

In a paper titled _Dynamic Data Mining_ Brin and Page situate their research
looking for _rules_ (statistical correlations) between words used in web
pages. The "baskets" they mention stem from the origins of "market basket"
techniques developed to find correlations between the items recorded in the
purchase receipts of supermarket customers. In their case, they deal with web
pages rather than shopping baskets, and words instead of purchases. In
transitioning to the much larger scale of the web, they describe the
usefulness of their research in terms of its computational economy, that is,
the ability to handle the scale of the web on contemporary computing power and
still complete the task in a reasonably short amount of time.
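The "market basket" idea carried over to web pages can be sketched as follows. This is the naive approach of counting every pair of words, not the sampling method of the paper, and the four toy "pages", the `frequent_pairs` function and the support threshold of 2 are all illustrative assumptions. "Support" here simply means the number of baskets in which a pair of words occurs together.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy "web pages"; each is treated as a basket of distinct words.
pages = [
    "raw data now",
    "raw cotton bales",
    "linked data now",
    "raw data mining",
]

def frequent_pairs(documents, min_support):
    """Return word pairs that co-occur in at least min_support documents."""
    pair_counts = Counter()
    for doc in documents:
        basket = sorted(set(doc.lower().split()))   # order and repeats dropped
        pair_counts.update(combinations(basket, 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}

pairs = frequent_pairs(pages, min_support=2)
```

Counting pairs this way grows roughly with the square of the number of distinct words per basket, and larger word sets grow far faster, which gives a sense of why scale is the central concern when baskets hold 150 words rather than 10 items.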

> A traditional algorithm could not compute the large itemsets in the lifetime
of the universe. [...] Yet many data sets are difficult to mine because they
have many frequently occurring items, complex relationships between the items,
and a large number of items per basket. In this paper we experiment with word
usage in documents on the World Wide Web (see Section 4.2 for details about
this data set). This data set is fundamentally different from a supermarket
data set. Each document has roughly 150 distinct words on average, as compared
to roughly 10 items for cash register transactions. We restrict ourselves to a
subset of about 24 million documents from the web. This set of documents
contains over 14 million distinct words, with tens of thousands of them
occurring above a reasonable support threshold. Very many sets of these words
are highly correlated and occur often. [9]

## Un/Ordered

In programming, I've encountered a recurring "problem" that's quite
symptomatic. It goes something like this: you (the programmer) have managed to
cobble out a lovely "content management system" (either from scratch, or using
any number of helpful frameworks) where your user can enter some "items" into
a database, for instance to store bookmarks. These items are then
automatically presented as an ordered list (say on a web page). The author: It's
great, except... could this bookmark come before that one? The problem stems
from the fact that the database ordering (a core functionality provided by any
database) somehow applies a sorting logic that's almost but not quite right. A
typical example is the sorting of names where details (where to place a name
that starts with a Norwegian "Ø" for instance), are language-specific, and
when a mixture of languages occurs, no single ordering is necessarily
"correct". The (often) exasperated programmer might hastily add an additional
database field so that each item can also have an "order" (perhaps in the form
of a date or some other kind of (alpha)numerical "sorting" value) to be used
to correctly order the resulting list. Now the author has a means, awkward and
indirect but workable, to control the order of the presented data on the start
page. But one might well ask, why not just edit the resulting listing as a
document? Not possible! Contemporary content management systems are based on a
data flow from a "pure" source of a database, through controlling code and
templates to produce a document as a result. The document isn't the data, it's
the end result of an irreversible process. This problem, in this and many
variants, is widespread and reveals an essential backwardness in a particular
"computer scientist" mindset about what constitutes "data", and in particular
its relationship to order, a mindset that turns what might be a straightforward
question of editing a document into an over-engineered database.
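The workaround described above can be sketched in a few lines. The `Bookmark` class, its `order` field and the three names are hypothetical; the point is only the contrast between the sorting a system applies by default and the explicit "order" value bolted on afterwards.

```python
from dataclasses import dataclass

@dataclass
class Bookmark:
    name: str
    order: int  # hypothetical extra field holding the author's preferred position

bookmarks = [
    Bookmark("Zimmer", order=3),
    Bookmark("Østby", order=2),
    Bookmark("Olsen", order=1),
]

# Default string sorting compares Unicode code points, so "Ø" lands after "Z".
by_name = [b.name for b in sorted(bookmarks, key=lambda b: b.name)]

# The workaround: ignore the names and sort on the hand-assigned number.
by_hand = [b.name for b in sorted(bookmarks, key=lambda b: b.order)]
```

Here `by_name` gives Olsen, Zimmer, Østby, while `by_hand` gives Olsen, Østby, Zimmer. Neither is "correct" in general; the second merely records what this particular author wanted, one number at a time.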

While I was recently working with Nikolaos Vogiatzis, whose research explores
playful and radically subjective alternatives to the list, he was struck by how
the earliest specifications of HTML (still valid today) have separate elements
(OL and UL) for "ordered" and "unordered" lists.

> The representation of the list is not defined here, but a bulleted list for
unordered lists, and a sequence of numbered paragraphs for an ordered list
would be quite appropriate. Other possibilities for interactive display
include embedded scrollable browse panels. [10]

Vogiatzis' surprise lay in the idea of a list ever being considered
"unordered" (or in opposition to the language used in the specification, for
order to ever be considered "insignificant"). Indeed in its suggested
representation, still followed by modern web browsers, the only difference
between the two visually is that UL items are preceded by a bullet symbol,
while OL items are numbered.

The idea of ordering runs deep in programming practice where essentially
different data structures are employed depending on whether order is to be
maintained. The indexes of a "hash" table, for instance (also known as an
associative array), are ordered in an unpredictable way governed by a
representation's particular implementation. This data structure, extremely
prevalent in contemporary programming practice, sacrifices order to offer other
kinds of efficiency (fast text-based retrieval for instance).
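A deliberately tiny hash table shows how stored order comes to depend on the implementation rather than on the order of entry. Everything in this sketch is an assumption made for illustration: the class `ToyHashTable`, the choice of eight buckets, and the toy hash (the sum of a key's bytes modulo the bucket count), which no real system would use.

```python
class ToyHashTable:
    """A minimal hash table: each key lives in a bucket chosen by a hash."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _index(self, key):
        # Toy hash for illustration only: sum of UTF-8 bytes modulo bucket count.
        return sum(key.encode("utf-8")) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, value)   # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key):
        # Fast retrieval: only one bucket is searched, never the whole table.
        for existing, value in self.buckets[self._index(key)]:
            if existing == key:
                return value
        raise KeyError(key)

    def keys(self):
        # Keys come back bucket by bucket, not in the order they were entered.
        return [key for bucket in self.buckets for key, _ in bucket]

table = ToyHashTable()
inserted = ["words", "language", "is", "a", "bag"]
for position, word in enumerate(inserted):
    table.put(word, position)

stored = table.keys()
```

With this particular hash, `stored` comes back as a, bag, language, is, words: neither the order of entry nor alphabetical order. Some languages add bookkeeping on top to hide this (Python's built-in `dict` has preserved insertion order since version 3.7), but that is an extra guarantee layered over the same underlying structure.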

## Data mining

In announcing Google's impending data center in Mons, Belgian prime minister
Di Rupo invoked the link between the history of the mining industry in the
region and the present and future interest in "data mining" as practiced by IT
companies such as Google.

Whether speaking of bales of cotton, barrels of oil, or bags of words, what
links these subjects is the way in which the notion of "raw material" obscures
the labor and power structures employed to secure them. "Raw" is always
relative: "purity" depends on processes of "refinement" that typically carry
social/ecological impact.

Stripping language of order is an act of "disembodiment", detaching it from
the acts of writing and reading. The shift from (human) reading to machine
reading involves a shift of responsibility from the individual human body to
the obscured responsibilities and seemingly inevitable forces of the
"machine", be it the machine of a market or the machine of an algorithm.

From [X = Y](/wiki/index.php?title=X_%3D_Y "X = Y"):

Still, it is reassuring to know that the products hold traces of the work,
that even with the progressive removal of human signs in automated processes,
the workers' presence never disappears completely. This presence is proof of
the materiality of information production, and becomes a sign of the economies
and paradigms of efficiency and profitability that are involved.

The computer scientists' view of textual content as "unstructured", be it in a
webpage or the OCR scanned pages of a book, reflects a neglect of the
processes and labor of writing, editing, design, layout, typesetting, and
eventually publishing, collecting and cataloging [11].

"Unstructured" to the computer scientist, means non-conformant to particular
forms of machine reading. "Structuring" then is a social process by which
particular (additional) conventions are agreed upon and employed. Computer
scientists often view text through the eyes of their particular reading
algorithm, and in the process (voluntarily) blind themselves to the work
practices which have produced and maintain these "resources".

Berners-Lee, in exhorting his audience of web publishers not only to publish
online but to release "unadulterated" data, betrays a lack of imagination in
considering how language is itself structured and a blindness to the need for
more than additional technical standards to connect to existing publishing
practices.

Last Revision: 2.08.2016

1. Benjamin Franklin Lieber, Lieber's Standard Telegraphic Code, New York, 1896.
2. Katherine Hayles, "Technogenesis in Action: Telegraph Code Books and the Place of the Human", in How We Think: Digital Media and Contemporary Technogenesis, 2012.
3. Hayles.
4. Lieber's.
5. Hayles.
6. Tim Berners-Lee, "The next web", TED Talk, February 2009.
7. "Research on the Web seems to be fashionable these days and I guess I'm no exception." From Brin's [Stanford webpage](http://infolab.stanford.edu/~sergey/).
8. Sergey Brin, "Extracting Patterns and Relations from the World Wide Web", Proceedings of the WebDB Workshop at EDBT 1998.
9. Sergey Brin and Lawrence Page, "Dynamic Data Mining: Exploring Large Rule Spaces by Sampling", 1998, p. 2.
10. Tim Berners-Lee and Daniel Connolly, "Hypertext Markup Language (HTML)", Internet Draft, June 1993.
11.

Retrieved from

[https://www.mondotheque.be/wiki/index.php?title=A_bag_but_is_language_nothing_of_words&oldid=8480](https://www.mondotheque.be/wiki/index.php?title=A_bag_but_is_language_nothing_of_words&oldid=8480)

Sollfrank & Dockray
Expanded Appropriation
2013

Sean Dockray
Expanded Appropriation

Berlin, 4 January 2013

[00:13]
Public School

[00:17]
We decided to give up doing a gallery because… Well, for one, the material
conditions weren’t so great for it. But I think people who open up galleries
do it in really challenging conditions, so there is no reason why we couldn’t
have done a gallery in that basement. [00:37] I think we were actually
disinterested in exhibition as a format. After a few years – I mean, we did
something like 35 things that could easily be called exhibitions, in a span of
5 years leading up to that. [00:55] I think we just wanted to try something
else. And so we already had started a project called The Public School a year
prior, so we decided that we would use our space primarily as a school.
[01:10] At that time those two things happened. We eliminated the gallery and
then ended up with two new galleries and a school instead!

[01:20] What The Public School is… it’s been going now for five years. It
began just as a structure or even a diagram, or an idea or something. [01:43]
And the idea is that people would propose things that they wanted to learn
about, or to teach to other people. And then there would be a kind of process
where we use our space or the Internet to allow people to sign up to say they
are also interested in this idea. And then the School’s job would be to turn
those ideas into real meetings of people, real classes where people got
together. [02:15] So in that sense the curriculum would be developed in
public. It wouldn't be public just simply in the sense that anyone could go to
it, but it’d be public in the sense that anyone could produce the form of it.
[02:32] And again, I need a lot more time, I think, to talk about all the
dimensions to it, but in broad strokes that’s kind of what it is. [02:43]
Although we started in Los Angeles, in the basement of our original gallery
five years ago, it’s now been in around a dozen cities around the world, where
people are operating according to the same process, and then sometimes in
conversation with one another. And there’ve been 500-600 classes, and 2000 or
so proposals made in that time.

[03:18]
Motivation

[03:22]
It was in the air at the time already, so I don’t think it’d be an entirely
independent impulse – number one. But I had actually tried to start a couple
of things that had failed. [03:41] Like Aaaaarg – I tried to set up some
physical reading groups that would complement the online archive. So, in Los
Angeles the idea would be that we’d meet and talk about things that were being
posted to the website. So, yes, reading groups. But they never really went
anywhere. They were always really small, and they kind of ran out of steam
quite quickly because no one was interested. [04:10] So in a way The Public
School was a later iteration of something that I’d already been trying for a
while. But the other thing was that by doing these reading groups,
intuitively, I knew what was wrong. [04:31] Although I like to read, that is
not all of what education is to me. To me learning and education is something
that is more inclusive of a lot more of what we experience in life, than
simply theoretical discussions. The structures didn’t really allow that in a
way. [04:56] The Public School came out of just trying to imagine what kind of
structure would be inclusive to overcome some of those self-imposed
limitations. [05:14] I’m very interested in technology in a hands-on way. I
like to code and electronics – hacking around with electronics. And at the
same time, I like to read and I like to write. And then once you go down that
line then you think, well, I like music a lot and I like to play chess as
well. [05:46] I think about all these things that I like to do, and I just
thought about how a lot of these gestures towards education that I tried to do
previously, in no way embraced me as a whole person. So in that sense, it was
based in personal interest. [06:22] But the other personal interest had to do
with personal motivation, it had to do with running an art space for, at that
point, four years. And actually seeing the way that that happened, because I’m
not a curator. [06:38] And so the act of putting on exhibitions for me was
less about making value judgments, and more about trying to contribute to the
cultural life of my city, and also provide opportunities that didn’t exist in
Los Angeles. [06:57] For example, no one really knew how to show work with
technology, and we were able to, because, for instance I knew how to set up
projectors, fix electronics or get things to start and stop, and that kind of
stuff. [07:13] But over the course of running it, because it is an exhibition
space, I found myself put into the role of being a curator – Fiona and I both
did. And it was kind of an uncomfortable role to be deciding what became
visible and what wouldn’t be. [07:32] And one thing that was never visible was
the sort of mechanisms by which an institution made certain things visible.
[07:40] So the public in The Public School actually in a way is trying to
eliminate that whole apparatus, or at least, put that apparatus as something
that we didn’t want to be solely the ones interacting with. We wanted that
apparatus to be… that our entire community, the community of people who is
participating in the programme – that they were the ones responsible for it.
[08:14] So that would shift programming, but also accountability and all these
things, to the people who are actually participating in the life of the space.

[08:28]
Technical Infrastructure

[08:32]
The technical infrastructure is incredibly important because at the moment
that’s people’s primary experience of the project. They make proposals on the
website, and then the classes are actually organised by people through the
website. So the website, the entire technical infrastructure becomes the
engine for getting events to happen. [09:01] It’s not an essential part. At
the very beginning we did it on paper, and we had the website and the paper
kind of simultaneously. And we’d print things out onto paper that would be
accessible by coming into the space, and vice versa, we'd enter things from
the paper back into the website. [09:26] But at the moment it’s mostly
orchestrated through the website. And it’s been three versions of it, like
three separate pieces of software, and the last two it’s been Kayla Waldorf
and myself who have been programming it. And we have… [09:45] Number one,
we’ve organised lots of classes, so we’re very involved in the life of the
school. And in a way we try to programme the site according to (A) what would
make things work, but (B), like you say, in a way that expresses the politics,
as we see them, of the site. [10:14] And so almost at every level, at every
design decision that Kayla might be making, or every kind of code or database
decision, you know, interactive decision that I might be making – those
conversations and those ideas are finding their way into that. [10:45] And
vice versa, that you see code, in a certain way, as not determining politics,
but certainly influencing what people see as possible and also choices that
they see available to them, and things like that. [11:09] I guess as users of
the site, as organisers of The Public School and as programmers, this kind of
relationship between the project and the software is quite intertwined.
[11:28] And I don’t think that… I think that typically art institutions use a
website as a kind of publicity vehicle, as a kind of postcard or something
that fits into their broadcasting of a programme, as something as a glue
between their space and their audience. [11:49] And I think for us the website
is actually integral to the space and to the audience. There is more of a
continuum between the space, programme, website and audience.

[12:04]
Aaaaarg.org

[12:08]
It started out small. In a way, it was an extension of what I think of as a
practice that all of us are familiar with, which is sharing books that we’ve
read, or sharing articles that we’ve read, especially if your work is somehow
in relationship to things that you might be reading. [12:41] In my
architecture school, for instance, we would read lots and lots, and then we’d
be making work in parallel. It wouldn’t be that either would determine the
other, but in the end, there is a strong relationship between the ideas that
you have and what you see as possible, and the things that you are reading.
[13:07] So as part of the student culture, especially among my friends, the
people that I identified with in school, we’d be discovering different parts
of the library independently. And then when we found something that was quite
moving in whatever way then we would photocopy it to keep it for ourselves
later. [13:34] And we’d also give it to each other as a kind of secret tool,
or something like that, you know, like you have the sense that when you found
something that is really good – and especially if other people aren’t even
interested – then you feel really empowered by having access to that, by being
able to read it and reread it. [14:02] And then you feel more empowered when
there is a community of other people. It may be a small one, but who have read
that thing as well, because then you start building a kind of shared frame of
reference, a shared vocabulary and a shared way of seeing the world, and
seeing what you’re working on. [14:22] And I think out of that comes projects,
like you actually work on projects together, you collaborate, you correspond
with other people or you actually share the work. And that’s what happened.
[14:41] I started Aaaaarg.org after I moved from New York to Los Angeles, so I
was quite far away from some of the people that I was working with – and just
continuing with that very basic activity of sharing reading material in order
to have that shared vocabulary to be able to work together.

[15:08]
Content

[15:12]
It turned out to be architecture at the very beginning. But we all had really
broad understandings of what architecture meant and what it included, so there
was a lot of media theory, art history and philosophy, and occasionally some
architecture too. [15:38] And so that became the initial kind of seed. And I
think everything has, as the site expanded from there, to be not just me and
some collaborators, or then collaborators of collaborators, and then friends
of those people, and so on. [16:03] It’s kind of a ripple effect outwards.
What happened was something that is quite common to almost any platform, which
is this kind of feedback. Even in an open structure, it's never truly open.
There’re always rules in place, there’s always a past history, and those two
things go a long way to influence what happens in the future. [16:33] I’m sure
a lot of people will come to the site who are interested in one thing, and
then find nothing in the site that speaks to them, and then disappear. Whereas
other people, the site really spoke to them, and so what they would contribute
can also fit according to that sense, to that inclination.

[16:59]
Dynamics of growth and community-building

[17:04]
Especially when I’m involved in this kind of projects, I don’t like being
alone. Obviously it contributes a lot to the work, not only because there’s
more people, but actually the kind of relationships and negotiations that
happen in that work are interesting in themselves. [17:29] So anyway, it was
never all that interesting for it to be a private library. I mean, we all have
private libraries, but there is this potential as well, which I think wasn’t
part of the project at the beginning, it really was a tool for sharing in a
particular kind of context. [17:56] But I think, obviously, you know, once
people saw it then they saw a sort of potential in it, because you see what
happens on the Internet and you know that in certain cases you can read from
it and you can write to it. [18:18] And you also know that, although there
still [are] various forms of digital exclusion, that it's quite accessible
relative to other forms, other libraries, like university libraries, for
instance.

[18:37]
Cornelia Sollfrank: It’s not just about having access to certain material, but
what is related to it, and what’s really important, is the dynamics of
building a community and the context, and even smaller discourses around
certain issues, which you don’t have necessarily if you just download a text.
Then you have the text but you don’t have somebody to talk to, or you don’t
write your opinion about it to someone. So that’s, I think, what comes with
the project, which makes it very valuable to a lot of people.

[19:13]
Yes. That’s going back to what I was saying about some of the failures before
The Public School, which was... As the site was growing, as Aaaaarg was
growing, all of a sudden there would be things in there that I didn’t know
about before, that someone felt it was important to share. [19:37] And because
someone felt that it was important to share it, I felt it was important to
read it. And I did, but then I wanted to read it with other people. [19:51]
So, some of those reading groups were always attempts to produce some social
context for the theory.
[20:06] Having a library as if the archive itself is the library – but having
that isn't really that interesting to me. What's interesting is having some
social context that I can feel involved in (not that I ‘have’ to be involved
in it), but having some social context to make use of that reading material.

[20:42]
Copyright

[20:47]
At the beginning it was never a component of the project, because of that sort
of natural extension between what I see as a perfectly… something that I think
that we all do already. And especially in architecture and art, if you are
involved in reading you give books to people. Like you gave me your book… And
I’ve passed on a number of books. [21:34] If I print out something to read and
I’m done with it, then I’m more likely to pass it on than I am to shred it – I
have to keep it in my closet forever, what do I do with it? If I think I’m
truly done with it, even for a moment, then I’m more likely to pass it on.
[22:00] So at the beginning it had nothing to do with piracy, it had
everything to do with wanting to share things with other people. And a lot of
times it's not just in this abstract “I kind of like to share,” but it was
project-based, and I think it became a little bit more abstract. [22:24] But I
think actually over time, when people were sharing things, sometimes they did
it with this sort of abstract recipient of that sharing, and that they would
think, “I have access to this and I know that other people want access to it,
and so that’s going to be why I share it.” [22:46] In other cases, I know that
people were trying to organise a reading group, and this is quite common,
which is that people would be organising something and then how are they going
to distribute the reading material. Yes, they could give everyone a link to
Amazon so they all order their own book, maybe that would be better for
Amazon. [23:13] But there are other ways that they would organise the
reading material there. A lot of times the stuff they wanted to read was
already on Aaaaarg. Sometimes they had to upload a few new things. [23:26] And
so that’s how a lot of it grew and that’s why people are involved. And I think
sharing was what drove the project. And then it really wasn’t for 3 years that
there was anything even relating to copyright issues. No one complained
for all that time. [23:53] And then when complaints came in then, you know, we
responded by taking it down. It was quite simple. [24:05] But then later in
the life of the project, the copyright problems sort of, in a way,
retroactively made the project more about piracy than about sharing.

[24:22]
Attempts to control file-sharing

[24:26]
Either through making activity which used to be legal, illegal, or which used
to be in a kind of grey area because there wasn’t a framework in place for it,
that sort of draw hard lines to say that something is now illegal. [24:46] And
then there are the technological forms of negation, I think, which is to
actually make it impossible for people to do something that they used to be
able to do – signing copies of a file and not allowing it to open if it’s not
opening in the right place, or through the cloud, through this kind of new
marketing opportunities of centralising a lot of files in one place, and then
sort of governing the access through sites like Spotify. [25:29] Amazon does
the same thing, you know, also with their e-books, where they own the device,
the distribution network and the servers. And so by controlling the entire
pipeline, there’s a lot more control over what people do. [25:51] For
instance, you have to jailbreak the Kindle in order to share a book. Again,
something that we used to be able to do, now we actually have to break the law
or break our devices. [26:05] So these two things, I think, are how it gets
dealt with. And of course, there’s always responses to those things. [26:12] I
think the technological one is a big [one] ... to me that’s the more
challenging one, especially now, because what’s been produced is much more
miniaturised and a lot more difficult to...

C.S.: Hack?

[26:30] Yes. And also you can’t hack the server farm that’s located in, you
know, this really remote part of some country that you’ve never been to.
Shouldn’t say never. In fact, I’ll say never, just to see if someone can.
[26:50] Positive things would be to say, if we take a more expansive view of
the economy, look at who is making money, and then make an appeal for that.
Because there are people who are making money, like Apple is making a lot of
money, and other people who aren’t making money. [27:15] And I don’t think you
can blame the readers, for instance, for the fact that writers and publishers
aren’t making money, because the readers are going into that too, because of
the same forces. [27:28] So you look at who is making the money, and I think
that is a political argument that needs to be made, that this money is
actually being kind of hoarded by some of these companies, because they are
sort of gaming the system and the restructuring of the economy, but also how
we consume entertainment, and all this kind of things, and the restructuring
of production around the globe.
[27:59] I don’t think sites like Aaaaarg do anything more than point out a
kind of dynamic that is existing in the world – to think that somehow you can
sort of turn that into something positive, you know, in a way that gets
capitalism to stop exploiting people – like it seems silly to me, capitalism
exploits people...

[28:31]
Publishing landscape

[28:35]
I think that the role of the publishers [is] already changing, because of the
Internet and because of companies like Amazon, who changed not only selling
books. They changed not only the bookstore, but also changed the entire
distribution model, which then changes the way publishers work – and more and
more, even the entire life cycle of a book, you know, from the writing to the
sort of organisation and communication, to the distribution to the
consumption. [29:09] The entire life cycle of a book is happening through
these networks, from the software that we write it on, and where is that stuff
stored, you know – is it Google Docs or some other thing? – and our e-mails
that are circulating, and the accounting software. [29:31] A lot of it is
changing through the entire pipeline anyway, so to me, it’s really difficult
to say how publishing is changing because the entire flow, the entire
apparatus is changing.
[29:48] At the beginning, Aaaaarg was a way of bringing readers together, and
to allow readers to sort of give value to certain things that they were
reading. And I think that’s always been a form of publishing to me. [30:09]
Yes, someone is responsible for having the book edited, having it printed,
distributing it, there’s a huge material expense in all of that. [30:21] But
then you also have the life of the book after it gets to the store. And it
continues to have a life, like sometimes it lives for decades and decades, and
it goes between readers, it goes through sidewalk vendors, and used book
stores, and sits in people’s libraries, and goes to public libraries. [30:44]
And I would say that Aaaaarg is sort of in that part of the life cycle.
[30:54] These platforms become sort of new publishers themselves, but I
haven’t really thought that kind of statement through enough. In a way, if
publishing is to make something public and to create publics, then of course,
that’s something that Aaaaarg has done since the beginning. [31:22] It made
things public to people for whom they maybe didn’t exist before, and it also
produced communities of people around books – I mean, if that’s what a
publication and a publisher does, then, of course, it kind of does that within
the context of the Internet, and it does that by both using and producing
social relations between people.

[31:50]
Reading / books

[31:54]
I have lots of books, and I buy them from anywhere. I buy them, as much as it
pains me to admit it, I buy them from Amazon, I buy them from bookstores, I
buy them from used books stores, I buy them on the street, I find them in
trash, I’ve photocopied so many parts of books at the library, because they
didn’t circulate or something, or because I only had four hours to look at the
book; I’ve gotten things from my friends, I’ve gotten things from classes that
I used to take when I was a student but I still have. [32:37] And then with
the Internet, then I'd see it on a screen, sometimes I print that out, you
know. I’m not a purist in any way about reading or about books, I’m not
particularly sentimental about ‘the book.’ Even though I love books and I see
what’s nice about them, I think that every sort of form a book takes has its
own kind of… there’s something unique about it. [33:11] Honestly, this kind
of, let’s say, increase in e-Pubs and PDFs hasn’t really changed my
relationship to books at all. It’s the same as it’s always been, which is,
I’ll read it however I can get it. And maybe there are slightly new forms, and
sometimes I read on a little… I bought a touchpad when they had a fire sale a
while ago, so I read on that.

[33:44] And maybe I’m making an obvious argument here, but you see, if you've
ever scanned a book you know that it takes time, and you know that you screw
up quite a lot, and sometimes those screw ups find their way in, and the
labour that goes into making a scan finds its way in. [34:02] And it’s only
through really good scans that you can manage to sort of eliminate a lot of
that, a lot of the traces of that labour. But I know that, in the entire
history of Aaaaarg, the files will always show the labour of the person who is
trying to get something up to share it with other people. It’s not a
frictionless easy activity, there is work that’s involved in it. [34:31] And I
find some of the scans were quite beautiful in that way, even when they
weren’t necessarily so good to read.
[34:41] There’s actually, if we go to scale… Again, I have way more books than
I could possibly read, physical books. And I’m going to continue buying more,
acquiring more through my entire life, I’m sure of it. And I think that’s just
part of loving books and loving to read, you have more than you can possibly
deal with. [35:11] And I think, on a level of scale, maybe, with the Internet
we find ourselves, in orders of magnitude, [with] more than we could possibly
deal with. But in a way, it’s the same kind of anxiety, and the limits are
more or less the same. [35:29] But then there are maybe even new opportunities
for new ways of reading that weren’t available before. I could flip through a
book in a certain way, but maybe now with the possibility of indexing the
whole content of a book, and doing searches, and creating ways of visually
displaying books and relationships between books, and between parts of books,
and this kind of things, and also making lists, and making lists with other
people – all of these maybe provide new ways of reading which weren't
available. [36:13] And of course it means that then other ways of reading
get sort of buried and, you know, lost. And I’m sure that that's true too,
that slow deep reading maybe isn’t as prevalent as different types of
referencing and stuff. [36:32] Not to say that it’s totally identical, but
certainly an evolution. I don’t think that progression is so linear, that it’s
pure loss, or anything like that.

[36:44]
Form and content

[36:49] For me what’s interesting is to try and examine how structure and
form, or structure and content, form and content – I mean, that’s kind of
another on-going question, how structure is not divorced from content.
Structure is not simply a container for the content, any more than the mind
and body are distinct entities – but that the structure that something takes
influences the shape that content takes, and also the ways that people might
approach that context, or use it in this kind of things. And likewise, the
content begins to affect the structure as well. [37:47] Why I’m interested in
structures is because they aren’t deterministic, they don’t determine what’s
going to happen. And all the projects that you mention are things that I think
of, let’s say, as platforms or something, in the sense that they have… they
involve a lot of people quite often, more than just me, and they also have…
the duration is not specified in advance, and what’s going to happen in them
is not specified in advance. [38:30] So they’re experimental in that way, and
they have that in common. And that is what’s interesting to me, is the
production of situations where we don’t know what’s going to happen. [38:51]
And sometimes when focusing on a work you have a vision for what that work is
going to be, and then all your work goes into realising that, and, of course,
you have surprises along the way, but then you get something that surprisingly
ends up like what you kind of imagined at the beginning – that way of working
doesn’t really interest me. I sort of become bored pretty early on in that
process. [39:23] Whereas the kind of longer term thing where the initial
conditions actually produce a situation that’s a little unstable, and
therefore what happens is also kind of unpredictable and unstable, to me this
is about opening up other possibilities for things as small as being together
for a short time, but also as big as ways of living.

[40:00] On the one level, these are structural projects, but on another level
they are all kind of structural appropriations in a way, or appropriations of
structures, like from a gallery, a library, a school, another gallery. [40:23]
And I was actually thinking about that I kind of wish that (and I imagine
soon, maybe in the next decade or two) an art historian will make this kind of
argument for evolving the concept of appropriation, to go beyond objects to…
Because in a way appropriation enters into the discourse when reproduction…
[40:52] I think appropriation has been something, let’s say, that maybe is a
historical concept. So at a certain point in history maybe it even has a
different name, there’s different ways that it happens, there are different
cultural responses to it. [41:09] And I think that in the twentieth century,
especially with mechanical reproduction, appropriation becomes quite clear
what it is, because images or sounds, you know, things became distributed and
available for people to actually materially use. [41:30] And the tools that
people have available to make work as well allow for this type of reuse of
what’s being circulated through the world. [41:45] And I guess what I’m sort
of saying is, if that’s appropriation of objects, then there might even be a
time now, especially as the economy sort of shifted from being simply about
commodity – the production, and sale and consumption of commodities – to now,
if we try to understand critically the economy now, it’s something that’s much
more complicated – it involves financialization, debt and derivative trading,
and all this kind of things. [42:25] And so, perhaps also if appropriation is
a historical idea, then appropriation also needs to be updated, and this would
mean – for me this would mean appropriation of systems. [42:46] So rather than
the appropriation of what’s been distributed, it’s the appropriation of the
system of distribution. And to me these are also projects that I get excited
about at the moment. [43:04] In a way it also makes sense, because if
photographs were circulating around the world, and that was, you know, a new
thing, to see that sort of imagery circulating in that way, at a certain point
in time a century ago; then now I think we are even having a similar reaction
to something like Facebook, which to me kind of comes out of nowhere, and
suddenly it exists in the world as a structure that is organising a certain
part of the activity of, you know, hundreds of millions of people. [43:47] And
so I think, in a way, that’s the level on which maybe we can start thinking of
appropriation, at a level of this kind of large scale systems. But then that
brings up a whole new set of questions, like what do you call that, number
one. Number two, obviously the legal framework that’s in place, obviously that
will cause problems.
