Sekulic
Legal Hacking and Space
2015


# Legal hacking and space

## What can urban commons learn from the free software hackers?

* [Dubravka Sekulic](https://www.eurozine.com/authors/sekulic-dubravka/)

4 November 2015

There is now a need to readdress urban commons through the lens of the digital
commons, writes Dubravka Sekulic. The lessons to be drawn from the free
software community and its resistance to the enclosure of code will likely
prove particularly valuable where participation and regulation are concerned.

> Commons are a particular type of institutional arrangement for governing the
use and disposition of resources. Their salient characteristic, which defines
them in contradistinction to property, is that no single person has exclusive
control over the use and disposition of any particular resource. Instead,
resources governed by commons may be used or disposed of by anyone among some
(more or less defined) number of persons, under rules that may range from
"anything goes" to quite crisply articulated formal rules that are effectively
enforced.
> (Benkler 2003: 6)

The above definition of commons, from the seminal paper "The political economy
of commons" by Yochai Benkler, addresses any type of commons, whether analogue
or digital. In fact, the concept of commons entered the digital realm from
physical space in order to interpret the type of communities, relationships
and production that started to appear with the development of free, as
opposed to proprietary, software. Peter Linebaugh charted in his excellent book
_Magna Carta Manifesto_ how the creation and development of the concept of
commons were closely connected to constantly changing relationships of people
and communities to the physical space. Here, I argue that the concept was
enriched when it was implemented in the digital field. Readdressing urban
space through the lens of digital commons can enable another imagination and
knowledge to appear around urban commons.

[![](http://www.eurozine.com/UserFiles/illustrations/sekulic_commons_220w.jpg)](http://www.derive.at/)

The notion of commons in (urban) space is often complicated by archaic models of
organization and management - "the pasture we knew how to share". There is a
tendency to give the impression that the solution is in reverting to the past
models. In the realm of digital though, there is no "pasture" from the Middle
Ages to fall back on. Digital commons had to start from scratch and define its
own protocols of production and reproduction (caring and sharing). Therefore,
the digital commons and free software community can be the one to turn to, not
only for inspiration and advice, but also as a partner when addressing
questions of urban commons. Or, as Marcell Mars would put it, "if we could
start again with (regulating and defining) land, knowing what we know now
about digital networks, we could come up with something much better and
appropriate for today's world. That property wouldn't be private, maybe not
even property, but something else. Only then can we say we have learned
something from the digital" (2013).

## Enclosure as the trigger for action

The moment we turn to commons in relation to (urban) space is the moment in
which the pressure to privatize public space and to commodify every aspect of
urban life has become so strong that it can be argued that it mirrors a moment
in which Magna Carta Libertatum was introduced to protect the basic
reproduction of life for those whose sustenance was connected to the common
pastures and forests of England in the thirteenth century. At the end of the
twentieth century, urban space became the ultimate commodity, and increasing
privatization not only endangered the reproduction of everyday life in the
city; the rent extraction through privatized public space and housing
endangered bare life itself. Additionally, the city's continuous
privatization of its amenities transformed almost every action in the city, no
matter how mundane (drinking a glass of water from a tap, for example),
into an action that creates profit for some private entity and extracts it
from the community. Thus every activity became labour, which a citizen-worker
is not only alienated from, but also unaware of. David Harvey's statement
about the city replacing the factory as a site of class war seems to be not
only an apt description of the condition of life in the city, but also a cry
for action.

When Richard Stallman turned to the foundational gesture of the creation of
free software, GNU/GPL (General Public Licence) was his reaction to the
artificially imposed logic of scarcity on the world of code - and the
increasing and systematic enclosure that took place in the late 1970s and
1980s as "a tidal wave of commercialization transformed software from a
technical object into a commodity, to be bought and sold on the open market
under the alleged protection of intellectual property law" (Coleman 2012:
138). Stallman, who worked as a researcher at MIT's Artificial Intelligence
Laboratory, detected how "[m]any programmers are unhappy about the
commercialization of system software. It may enable them to make more money,
but it requires them to feel in conflict with other programmers in general
rather than feel as comrades. The fundamental act of friendship among
programmers is the sharing of programs; marketing arrangements now typically
used essentially forbid programmers to treat others as friends. The purchaser
of software must choose between friendship and obeying the law. Naturally,
many decide that friendship is more important. But those who believe in law
often do not feel at ease with either choice. They become cynical and think
that programming is just a way of making money" (Stallman 2002: 32).

In the period between 1980 and 1984, "one man [Stallman] envisioned a crusade
to change the situation" (Moglen 1999). Stallman understood that in order to
subvert the system, he would have to intervene in the protocols that regulate
the conditions under which the code is produced, and not the code itself;
although he did contribute some of the best lines of code into the compiler
and text editor - the foundational infrastructure for any development. The
gesture that enabled the creation of a free software community that yielded
the complex field of digital commons was not a perfect line of code. The
creation of the GNU General Public License (GPL) was a legal hack to counteract
the imposition of intellectual property law on code. At that time, the only
license available for programmers wanting to keep the code free was public
domain, which gave no protection against the code being appropriated and
closed. The GPL enabled free code to become self-perpetuating. Everything built
using free code had to be made available under the same conditions, in order
to secure the freedom for programmers to continue sharing without breaking the
law. "By working on and using GNU rather than proprietary programs, we can be
hospitable to everyone and obey the law. In addition, GNU serves as an example
to inspire and as a banner to rally others to join in sharing. This can give
us a feeling of harmony, which is impossible if we use software that is not
free. For about half the programmers I talk to, this is an important happiness
that money cannot replace" (Stallman 2002: 33).

Architects and planners as well as environmental designers have for too long
believed the opposite, that a good enough design can subvert the logic of
enclosure that dominates the production and reproduction of space; that a good
enough design can keep space open and public by the sheer strength of spatial
intervention. Stallman rightfully understands that no design is strong enough
to keep private ownership from claiming what it believes belongs to it.
Digital and urban commons, despite operating in completely different realms
and economies, are under attack from the same threat of "market processes"
that "crucially depend upon the individual monopoly of capitalists (of all
sorts) over ownership of the means of production, including finance and land.
All rent, recall, is a return to the monopoly power of private ownership of
some crucial asset, such as land or a patent. The monopoly power of private
property is therefore both the beginning-point and the end-point of all
capitalist activity" (Harvey 2012: 100). Stallman envisioned a bleak future
(2003: 26-28) but found a way to "relate the means to the ends". He understood
that the emancipatory task of a struggle "is not only what has to be done, but
also how it will be done and who will do it" (Stavrides & De Angelis 2010: 7).
Thus, to produce the necessary requirements - both for a community to emerge,
but also for the basis of future protocols - tools and methodologies are
needed for the community to create both free software and itself.

## Renegotiating (undoing) property, hacking the law, creating community

Property, as an instrument of allocation of resources, is a right that is
negotiated within society and by society and not written in stone or given as
such. The digital, more than any other field, discloses property as being
inappropriate for contemporary relationships between production and
reproduction and, additionally, proves how it is possible to fundamentally
rethink it. The digital offers this possibility as it is non-material,
non-rival and non-exclusive (Meretz 2010), unlike anything in the physical world.
And Elinor Ostrom's lifelong empirical research gives ground to the belief
that eschewing property as the sole instrument of allocation can work as
a tool of management even for rival, excludable goods.

The value of information in digital form is not flat, but property is not the
way to protect that value, as the music industry realized during the course of
the last ten years. Once the copy is _out there_, the cost of protecting its
exclusivity on the grounds of property becomes too high in relation to the
potential value to be extracted. For example, the value is extracted from
information through controlling the moment of its release and not through
subsequent exploitation. Stallman decided to tackle the imposition of the
concept of property on computer code (and by extension to the digital realm as
a whole) by articulating it in another field: just as property is the product
of constant negotiations within a society, so are legal regulations. After
some time, he was joined by "[m]any free software developers [who] do not
consider intellectual property instruments as the pivotal stimulus for a
marketplace of ideas and knowledge. Instead, they see them as a form of
restriction so fundamental (or poorly executed) that they need to be
counteracted through alternative legal agreements that treat knowledge,
inventions, and other creative expressions not as property but rather as
speech to be freely shared, circulated, and modified" (Coleman 2012: 26).

The digital sphere can give a valid example of how renegotiating regulation
can transform a resource from scarce to abundant. When the change from
analogue signal to packet switching began to take effect, the distribution of
finite territory and the way the radio frequency spectrum was managed were
renegotiated, and the number of slots of space to be allocated grew by an order
of magnitude while the absolute size of the spectrum stayed the same. This
shift enabled Brecht's dream of a two-sided radio to become reality, thus
enabling what he had suggested: "change this apparatus over from distribution
to communication".[1]

According to Lawrence Lessig, what regulates behavior in cyberspace is an
interdependence of four constraints: market, law, architecture and norms
(Lessig 2012: 121-25). Analogously, space can be put in place of cyberspace,
as the regulation of space is the sum of these four constraints. These four
constraints are in a dynamic relationship in which the balance can be tilted
towards one, depending on how much each of these categories puts pressure on
the other three. Changes in any one reflect the regulation of the whole.
"Architecture" in Lessig's theory should be understood broadly as the "built
environment" that regulates behaviour in (cyber)space. In the last few decades
we have experienced the domination of the market reconfiguring the basis of
norms, law and architecture. In order to counteract this, the other three
constraints need to be re-negotiated. In digital space, this reconfiguration
happened by declaring the code - that is, the set of instructions written as
highly formalized text in a specific programming language to be executed
(usually) by the computer - to be considered as speech in front of the law,
and by hacking the law in order to disrupt the way that property relationships
are formed.

To put it simply, in order to create a change in dynamics between the
architecture, norms and the market, the law had to be addressed first. This is
not a novel procedure, "legal hacking is going on all the time, it is just
that politics is doing it under the veil of legality because they are the
parliament, they are Microsoft, which can hire a whole law firm to defend them
and find all the legal loopholes. Legal hacking is the norm actually" (Bailey
2013). When it comes to physical space, one of the most obvious examples of
the reconfiguration of regulations under the influence of the market is to
create legal provisions, norms and architecture to sustain the concept of
developing (and privatizing) public space through public-private partnerships.
The decision of the Italian parliament that the privatization of services
(specifically of water management) is legal and does not obstruct one's access
to water as a human right, is another example of a crude manipulation of the
law by the state in favour of the market. Unlike legal hacks by corporations
that aim to create a favourable legal climate for another round of
accumulation through dispossession, Stallman's hack tries to limit the impact
of the market and to create a space of freedom for the creation of a code and
of sharable knowledge, by questioning one of the central pillars of liberal
jurisprudence: (intellectual) property law.

Similarly, translated into physical space, one of the initiatives in Europe
that comes closest to creating a real existing urban commons, Teatro Valle
Occupato in Rome, is doing the same, "pushing the borders of legality of
private property" by legally hacking the institution of a foundation to "serve
a public, or common, purpose" and having "notarized [a] document registered
with the Italian state, that creates a precedent for other people to follow in
its way" (Bailey 2013). This sounds similar to Stallman's hack as the fundamental
gesture by which a community and a whole eco-system can be formed.

It is obvious that, in order to create and sustain that type of legal hack, it
is a necessity to have a certain level of awareness and knowledge of how
systems, both political and legal, work, i.e. to be politically literate.
"While in general", says Italian commons-activist and legal scholar Saki
Bailey, "we've become extremely lazy [when it comes to politics]. We've
started to become a kind of society of people who give up their responsibility
to participate by handing it over to some charismatic leaders, experts of [a]
different type" (2013). Free software hackers, in order to understand and take
part in a constant negotiation that takes place on a legal level between the
market that seeks to cloister the code and hackers who want to keep it free,
had to become literate in an arcane legal language. Gabriella Coleman notes in
_Coding Freedom_ that hacker forums sometimes tend to produce legal analysis
that is just as serious as one would expect to find in a law office. Like the
occupants of Teatro Valle, free software hackers understand the importance of
devoting time and energy to understand constraints and to find ways to
structurally divert them.

This type of knowledge is not shared and created in isolation, but in
socialization, in discussions in physical or cyber spaces (such as #irc chat
rooms, forums, mailing lists…), the same way free software hackers share their
knowledge about code. Through this process of socializing knowledge, "the
community is formed, developed, and reproduced through practices focused on
common space. To generalize this principle: the community is developed through
commoning, through acts and forms of organization oriented towards the
production of the common" (Stavrides 2012: 588). Thus forming a community is
another crucial element of the creation of digital commons, but even more
important are its development and resilience. The emerging community was not
given something to manage, it created something together, and together devised
rules of self-regulation and decision-making.

The prime example of this principle in the free software community is the
Debian Project, formed around the development of the Debian Linux
distribution. It is a volunteer organization consisting of around 3,000
developers that since its inception in 1993 has defined a set of basic
principles by which the project and its members conduct their affairs. This
includes the introduction of new people into the community, who are asked to
agree to the Debian Social Contract (DSC). A special part of the DSC defines the criteria
for "free software", thus regulating technical aspects of the project and also
technical relations with the rest of a free software community. The Debian
Constitution, another document created by the community so it can govern
itself, describes the organizational structure for formal decision-making
within the project.

Another example is Wikipedia, where the community that makes the online
encyclopedia also takes part in creating regulations, with some aspects
debated almost endlessly on forums. It is even possible to detect a loose
community of "Internet users" who took to the streets all over the world when
SOPA (Stop Online Piracy Act) and PIPA (Preventing Real Online Threats to
Economic Creativity and Theft of Intellectual Property Act) threatened to
enclose the Internet as we know it; the proposed legislation was successfully
contested.

Free software projects that represent the core of the digital commons are most
of the time born of the initiative of individuals, but their growth and life
cycle depend on the fact that they get picked up by a community or generate
community around them that is allowed to take part in their regulation and in
decisions about which shape and forms the project will take in the future.
This is an important lesson to be transferred to the physical space in which
many projects fail because they do not get picked up by the intended
community, as the community is not offered a chance to partake in its creation
and, more importantly, its regulation.

## Building common infrastructure and institutions

"The expansion of intellectual property law" as the main vehicle of the trend
to enclose the code that leads to the act of the creation of free software
and, thus, digital commons, "is part and parcel of a broader neoliberal trend
to privatize what was once under public or under the state's aegis, such as
health provision, water delivery, and military services" (Coleman 2012: 16).
The structural fight headed by the GNU/GPL against the enclosure of code
"defines the contractual relationship that serves to secure the freedom of
means of production and to constitute a community of those participating in
the production and reproduction of free resources. And it is this constitutive
character, as an answer to an every time singular situation of appropriation
by the capital, that is a genuine political emancipation striving for an equal
and free collective production" (Mars & Medak 2004). Thus digital commons "is
based on the _communication_ among _singularities_ and emerges through
collaborative social processes of production" (Negri & Hardt 2005: 204).

The most important lesson urban commons can take from its digital counterpart
is at the same time the most difficult one: how to make a structural hack in
the moment of the creation of an urban commons that will enable it to become
structurally self-perpetuating, thus creating fertile ground not only for a
singular spatialization of urban commons to appear, but to multiply and create
a whole new eco-system. Digital commons was the first field in which what
Negri and Hardt (2009: 3-21) called the "republic of property" was challenged.
Urban commons, in order to really emerge as a spatialization of a new type of
relationship, need to start undoing property as well in order to socially re-
appropriate the city. Or in the words of Stavros Stavrides "the most urgent
and promising task, which can oppose the dominant governance model, is the
reinvention of common space. The realm of the common emerges in a constant
confrontation with state-controlled 'authorized' public space. This is an
emergence full of contradictions, perhaps, quite difficult to predict, but
nevertheless necessary. Behind a multifarious demand for justice and dignity,
new roads to collective emancipation are tested and invented. And, as the
Zapatistas say, we can create these roads only while walking. But we have to
listen, to observe, and to feel the walking movement. Together" (Stavrides
2012: 594).

The big task for both digital and urban commons is "[b]uilding a core common
infrastructure [which] is a necessary precondition to allow us to transition
away from a society of passive consumers buying what a small number of
commercial producers are selling. It will allow us to develop into a society
in which all can speak to all, and in which anyone can become an active
participant in political, social and cultural discourse" (Benkler 2003: 9).
This core common infrastructure has to be porous enough to include people that
are not similar, to provide "a ground to build a public realm and give
opportunities for discussing and negotiating what is good for all, rather than
the idea of strengthening communities in their struggle to define their own
commons. Relating commons to groups of "similar" people bears the danger of
eventually creating closed communities. People may thus define themselves as
commoners by excluding others from their milieu, from their own privileged
commons" (Stavrides 2010). If they learn carefully from digital commons, urban
commons need to be conceptualized on the basis of the public, with a
self-regulating community that is open for others to join, one that socializes
knowledge and thus produces and reproduces the commons, creating a space for
political emancipation that is capable of judicial arguments for the
protection and extension of regulations that are counter-market oriented.

## References

Bailey, Saki (2013): Interview by Dubravka Sekulic and Alexander de Cuveland.

Benkler, Yochai (2003): "The political economy of commons". _Upgrade_ IV, no.
3, 6-9, [www.benkler.org/Upgrade-Novatica%20Commons.pdf](http://www.benkler.org/Upgrade-Novatica%20Commons.pdf).

Benkler, Yochai (2006): _The Wealth of Networks: How Social Production
Transforms Markets and Freedom_. New Haven: Yale University Press.

Brecht, Bertolt (2000): "The radio as a communications apparatus". In: _Brecht
on Film and Radio_ , edited by Marc Silberman. Methuen, 41-6.

Coleman, E. Gabriella (2012): _Coding Freedom: The Ethics and Aesthetics of
Hacking_. Princeton University Press / Kindle edition.

Hardt, Michael and Antonio Negri (2005): _Multitude: War and Democracy in the
Age of Empire_. Penguin Books.

Hardt, Michael and Antonio Negri (2011): _Commonwealth_. Belknap Press of
Harvard University Press.

Harvey, David (2012): "The art of rent". In: _Rebel Cities: From the Right to
the City to the Urban Revolution_, 1st ed. Verso, 94-118.

Hill, Benjamin Mako (2012): "Freedom for users, not for software". In: Bollier,
David & Helfrich, Silke (Ed.): _The Wealth of the Commons: a World Beyond
Market and State_. Levellers Press / E-book.

Lessig, Lawrence (2012): _Code: Version 2.0_. Basic Books.

Linebaugh, Peter (2008): _The Magna Carta Manifesto: Liberties and Commons for
All_. University of California Press.

Mars, Marcell (2013): Interview by Dubravka Sekulic.

Mars, Marcell and Tomislav Medak (2004): "Both devil and gnu",
[www.desk.org:8080/ASU2/newsletter.Zarez.N5M.MedakRomicTXT.EnGlish](http://www.desk.org:8080/ASU2/newsletter.Zarez.N5M.MedakRomicTXT.EnGlish).

Martin, Reinhold (2013): "Public and common(s): Places: Design observer",
[placesjournal.org/article/public-and-commons](https://placesjournal.org/article/public-and-commons).

Meretz, Stefan (2010): "Commons in a taxonomy of goods",
[keimform.de/2010/commons-in-a-taxonomy-of-goods](http://keimform.de/2010/commons-in-a-taxonomy-of-goods/).

Mitrasinovic, Miodrag (2006): _Total Landscape, Theme Parks, Public Space_ ,
1st ed. Ashgate.

Moglen, Eben (1999): "Anarchism triumphant: Free software and the death of
copyright", First Monday,
[firstmonday.org/ojs/index.php/fm/article/view/684/594](http://firstmonday.org/ojs/index.php/fm/article/view/684/594).

Stallman, Richard and Joshua Gay (2002): _Free Software, Free Society:
Selected Essays of Richard M. Stallman_. GNU Press.

Stallman, Richard and Joshua Gay (2003): "The Right to Read". _Upgrade_ IV,
no. 3, 26-8.

Stavrides, Stavros (2012): "Squares in movement". _South Atlantic Quarterly_
111, no. 3, 585-96.

Stavrides, Stavros (2013): "Contested urban rhythms: From the industrial city
to the post-industrial urban archipelago". _The Sociological Review_ 61,
34-50.

Stavrides, Stavros, and Massimo De Angelis (2010): "On the commons: A public
interview with Massimo De Angelis and Stavros Stavrides". _e-flux_ 17, 1-17,
[www.e-flux.com/journal/on-the-commons-a-public-interview-with-massimo-de-angelis-and-stavros-stavrides/](http://www.e-flux.com/journal/on-the-commons-a-public-interview-with-massimo-de-angelis-and-stavros-stavrides/).

[1] "[...] radio is one-sided when it should be two-. It is purely an apparatus
for distribution, for mere sharing out. So here is a positive suggestion:
change this apparatus over from distribution to communication". See "The radio
as a communications apparatus", Brecht 2000.

Published 4 November 2015
Original in English
First published by derive 61 (2015)

Contributed by dérive © Dubravka Sekulic / dérive / Eurozine



Murtaugh
A bag but is language nothing of words
2016


## A bag but is language nothing of words

### From Mondotheque


(language is nothing but a bag of words)

[Michael Murtaugh](/wiki/index.php?title=Michael_Murtaugh "Michael Murtaugh")

In text indexing and other machine reading applications the term "bag of
words" is frequently used to underscore how processing algorithms often
represent text using a data structure (word histograms or weighted vectors)
where the original order of the words in sentence form is stripped away. While
"bag of words" might well serve as a cautionary reminder to programmers of the
essential violence perpetrated to a text and a call to critically question the
efficacy of methods based on subsequent transformations, the expression's use
seems in practice more like a badge of pride or a schoolyard taunt that would
go: Hey language: you're nothin' but a big BAG-OF-WORDS.

## Bag of words

In information retrieval and other so-called _machine-reading_ applications
(such as text indexing for web search engines) the term "bag of words" is used
to underscore how in the course of processing a text the original order of the
words in sentence form is stripped away. The resulting representation is then
a collection of each unique word used in the text, typically weighted by the
number of times the word occurs.
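
As a minimal sketch of the transformation just described, the following Python function reduces a text to its unique words and their counts. The function name `bag_of_words` and the tokenizer (lowercasing, then keeping runs of letters and apostrophes) are assumptions made for illustration; real indexing systems tokenize in many different ways.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Return a Counter mapping each unique word to its number of occurrences."""
    # Assumed tokenizer: lowercase, then keep runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


bag = bag_of_words("To be, or not to be, that is the question")
print(bag.most_common(2))
```

Note that nothing in the resulting `Counter` records where in the sentence a word appeared; only the tally survives.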

Bags of words, also known as word histograms or weighted term vectors, are a
standard part of the data engineer's toolkit. But why such a drastic
transformation? The utility of "bag of words" is in how it makes text amenable
to code, first in that it's very straightforward to implement the translation
from a text document to a bag of words representation. More significantly,
this transformation then opens up a wide collection of tools and techniques
for further transformation and analysis purposes. For instance, a number of
libraries available in the booming field of "data sciences" work with "high
dimension" vectors; bag of words is a way to transform a written document into
a mathematical vector where each "dimension" corresponds to the (relative)
quantity of each unique word. While physically unimaginable and abstract
(imagine each of Shakespeare's works as points in a 14 million dimensional
space), from a formal mathematical perspective, it's quite a comfortable idea,
and many complementary techniques (such as principal component analysis) exist
to reduce the resulting complexity.
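
To make the vector picture a little more concrete, here is a hedged sketch in plain Python. Each document becomes a list of relative word frequencies over a shared vocabulary, with one "dimension" per unique word. The names `vectorize` and `docs`, the whitespace tokenizer, and the two toy documents are assumptions for illustration; data science libraries typically do this with sparse matrices rather than lists.

```python
from collections import Counter


def vectorize(documents):
    """Turn each document into a vector of relative word frequencies."""
    # Assumed tokenizer: lowercase and split on whitespace.
    bags = [Counter(doc.lower().split()) for doc in documents]
    # One dimension per unique word across all documents, in sorted order.
    vocabulary = sorted(set().union(*bags))
    vectors = []
    for bag in bags:
        # Guard against empty documents to avoid dividing by zero.
        total = sum(bag.values()) or 1
        vectors.append([bag[word] / total for word in vocabulary])
    return vocabulary, vectors


docs = ["the cat sat", "the cat saw the dog"]
vocabulary, vectors = vectorize(docs)
print(vocabulary)
for vector in vectors:
    print(vector)
```

With only five unique words the space is easy to picture; the same construction over a large corpus simply yields far more dimensions.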

What's striking about a bag of words representation, given its centrality in so
many text retrieval applications, is its irreversibility. Producing the original
text from a bag of words representation would require in essence the "brain" of
a writer to recompose sentences,
working with the patience of a devoted cryptogram puzzler to draw from the
precise stock of available words. While "bag of words" might well serve as a
cautionary reminder to programmers of the essential violence perpetrated to a
text and a call to critically question the efficacy of methods based on
subsequent transformations, the expression's use seems in practice more like a
badge of pride or a schoolyard taunt that would go: Hey language: you're
nothing but a big BAG-OF-WORDS. Following this spirit of the term, "bag of
words" celebrates a perfunctory step of "breaking" a text into a purer form
amenable to computation, to stripping language of its silly redundant
repetitions and foolishly contrived stylistic phrasings to reveal a purer
inner essence.
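
The irreversibility can be illustrated with this essay's own title. In the small sketch below (whitespace splitting is an assumed tokenizer), the phrase "language is nothing but a bag of words" and its alphabetically sorted rearrangement collapse to the same bag, so the bag alone cannot say which ordering it came from.

```python
from collections import Counter

original = "language is nothing but a bag of words"
shuffled = "a bag but is language nothing of words"

# Both orderings collapse to the same bag of words.
same_bag = Counter(original.split()) == Counter(shuffled.split())
print(same_bag)

# Sorting the words alphabetically is one of many orderings of the bag.
print(" ".join(sorted(original.split())))
```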

## Book of words

Lieber's Standard Telegraphic Code, first published in 1896 and republished in
various updated editions through the early 1900s, is an example of one of
several competing systems of telegraph code books. The idea was for both
senders and receivers of telegraph messages to use the books to translate
their messages into a sequence of code words which could then be sent for less
money, as telegraph messages were charged by the word. In the front of the book, a
list of examples gives a sampling of how messages like: "Have bought for your
account 400 bales of cotton, March delivery, at 8.34" can be conveyed by a
telegram with the message "Ciotola, Delaboravi". In each case the reduction of
number of transmitted words is highlighted to underscore the efficacy of the
method. Like a dictionary or thesaurus, the book is primarily organized around
key words, such as _act_ , _advice_ , _affairs_ , _bags_ , _bail_ , and
_bales_ , under which exhaustive lists of useful phrases involving the
corresponding word are provided in the main pages of the volume. [1]
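
The economy of such a code book can be sketched as a simple lookup table. The example below uses only the message and code words quoted above. As a loudly hypothetical simplification, it stores the whole message as a single entry, whereas the real book composes messages from phrases listed under key words, and that split is not given here. The names `codebook`, `encode` and `word_count`, and the rule of counting words by splitting on whitespace, are likewise assumptions.

```python
# The example message and its coded form, as quoted from the front of the book.
plain = "Have bought for your account 400 bales of cotton, March delivery, at 8.34"
coded = "Ciotola, Delaboravi"

# Hypothetical simplification: the whole message is one code book entry.
codebook = {plain: coded}


def word_count(message):
    """Count words by splitting on whitespace (an assumed tariff rule)."""
    return len(message.split())


def encode(message, book):
    """Replace a message with its code words if the book has an entry for it."""
    return book.get(message, message)


telegram = encode(plain, codebook)
saved = word_count(plain) - word_count(telegram)
print(telegram, "saves", saved, "words")
```

Decoding on the receiving end would use the same table in reverse, which is why both sender and receiver needed a copy of the book.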

[![Liebers
P1016847.JPG](/wiki/images/4/41/Liebers_P1016847.JPG)](/wiki/index.php?title=File:Liebers_P1016847.JPG)

[![Liebers
P1016859.JPG](/wiki/images/3/35/Liebers_P1016859.JPG)](/wiki/index.php?title=File:Liebers_P1016859.JPG)

[![Liebers
P1016861.JPG](/wiki/images/3/34/Liebers_P1016861.JPG)](/wiki/index.php?title=File:Liebers_P1016861.JPG)

[![Liebers
P1016869.JPG](/wiki/images/f/fd/Liebers_P1016869.JPG)](/wiki/index.php?title=File:Liebers_P1016869.JPG)

> [...] my focus in this chapter is on the inscription technology that grew
parasitically alongside the monopolistic pricing strategies of telegraph
companies: telegraph code books. Constructed under the bywords “economy,”
“secrecy,” and “simplicity,” telegraph code books matched phrases and words
with code letters or numbers. The idea was to use a single code word instead
of an entire phrase, thus saving money by serving as an information
compression technology. Generally economy won out over secrecy, but in
specialized cases, secrecy was also important.[2]

In Katherine Hayles' chapter devoted to telegraph code books she observes how:

> The interaction between code and language shows a steady movement away from
a human-centric view of code toward a machine-centric view, thus anticipating
the development of full-fledged machine codes with the digital computer. [3]

[![Liebers
P1016851.JPG](/wiki/images/1/13/Liebers_P1016851.JPG)](/wiki/index.php?title=File:Liebers_P1016851.JPG)
Aspects of this transitional moment are apparent in a notice prominently
inserted in Lieber's code book:

> After July, 1904, all combinations of letters that do not exceed ten will
pass as one cipher word, provided that it is pronounceable, or that it is
taken from the following languages: English, French, German, Dutch, Spanish,
Portuguese or Latin -- International Telegraphic Conference, July 1903 [4]

Conforming to international conventions regulating telegraph communication at
that time, the stipulation that code words be actual words drawn from a
variety of European languages (many of Lieber's code words are indeed
arbitrary Dutch, German, and Spanish words) underscores this particular moment
of transition as reference to the human body in the form of "pronounceable"
speech from representative languages begins to yield to the inherent potential
for arbitrariness in digital representation.

What telegraph code books do is remind us of the relation of language in
general to economy. Whether these are economies of memory, attention, costs
paid to a telecommunications company, or computer processing time or storage
space, encoding language or knowledge in any form of writing is a form of
shorthand and always involves an interplay with what one expects to perform or
"get out" of the resulting encoding.

> Along with the invention of telegraphic codes comes a paradox that John
Guillory has noted: code can be used both to clarify and occlude. Among the
sedimented structures in the technological unconscious is the dream of a
universal language. Uniting the world in networks of communication that
flashed faster than ever before, telegraphy was particularly suited to the
idea that intercultural communication could become almost effortless. In this
utopian vision, the effects of continuous reciprocal causality expand to
global proportions capable of radically transforming the conditions of human
life. That these dreams were never realized seems, in retrospect, inevitable.
[5]

[![Liebers
P1016884.JPG](/wiki/images/9/9c/Liebers_P1016884.JPG)](/wiki/index.php?title=File:Liebers_P1016884.JPG)

[![Liebers
P1016852.JPG](/wiki/images/7/74/Liebers_P1016852.JPG)](/wiki/index.php?title=File:Liebers_P1016852.JPG)

[![Liebers
P1016880.JPG](/wiki/images/1/11/Liebers_P1016880.JPG)](/wiki/index.php?title=File:Liebers_P1016880.JPG)

Far from providing a universal system of encoding messages in the English
language, Lieber's code is quite clearly designed for the particular needs and
conditions of its use. In addition to the phrases ordered by keywords, the
book includes a number of tables of terms for specialized use. One table lists
a set of words used to describe all possible permutations of numeric grades of
coffee (Choliam = 3,4, Choliambos = 3,4,5, Choliba = 4,5, etc.); another table
lists pairs of code words to express the respective daily rise or fall of the
price of coffee at the port of Le Havre in increments of a quarter of a Franc
per 50 kilos ("Chirriado = prices have advanced 1 1/4 francs"). From an
archaeological perspective, Lieber's code book reveals a cross section of
the needs and desires of early 20th century business communication between the
United States and its trading partners.
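
The lookup a code book performs can be sketched in a few lines of Python. The
four entries below are the ones quoted above from Lieber's tables; everything
else (the dictionary layout, the function names, the word "grades" used when
expanding a grade code, and counting words by splitting on spaces) is an
assumption made for illustration, not a description of how the book or the
telegraph tariffs actually worked.

```python
# Entries quoted in the text above; the data layout is an illustrative assumption.
PHRASES = {"Chirriado": "prices have advanced 1 1/4 francs"}
GRADES = {"Choliam": (3, 4), "Choliambos": (3, 4, 5), "Choliba": (4, 5)}


def decode(telegram):
    """Expand each known code word; unknown words pass through unchanged."""
    expanded = []
    for word in telegram.split():
        if word in PHRASES:
            expanded.append(PHRASES[word])
        elif word in GRADES:
            # "grades" is a hypothetical wording chosen for this sketch.
            grades = ", ".join(str(g) for g in GRADES[word])
            expanded.append("grades " + grades)
        else:
            expanded.append(word)
    return " ".join(expanded)


def words_saved(telegram):
    """Words in the expanded message minus words actually transmitted."""
    return len(decode(telegram).split()) - len(telegram.split())


telegram = "Chirriado Choliba"
print(decode(telegram))       # the message as the receiver reads it
print(words_saved(telegram))  # the economy of the encoding, in words
```

The saving only exists because sender and receiver hold the same book: the
"compression" is a shared convention rather than a property of the message.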

The advertisements lining Lieber's code book further situate its use and that
of commercial telegraphy. Among the many advertisements for banking and law
services, office equipment, and alcohol are several ads for gunpowder and
explosives, drilling equipment, and metallurgic services, all with specific
applications to mining. Building on telegraphy's formative role in
ship-to-shore and ship-to-ship communication for reasons of safety, commercial
telegraphy extended this network of communication to include those parties
coordinating the "raw materials" being mined, grown, or otherwise extracted
from overseas sources and shipped back for sale.

## "Raw data now!"

From [La ville intelligente - Ville de la connaissance](/wiki/index.php?title=La_ville_intelligente_-_Ville_de_la_connaissance "La ville intelligente - Ville de la connaissance"):

Étant donné que les nouvelles formes modernistes et l'utilisation de matériaux
propageaient l'abondance d'éléments décoratifs, Paul Otlet croyait en la
possibilité du langage comme modèle de « [données
brutes](/wiki/index.php?title=Bag_of_words "Bag of words") », le réduisant aux
informations essentielles et aux faits sans ambiguïté, tout en se débarrassant
de tous les éléments inefficaces et subjectifs.


From [The Smart City - City of Knowledge](/wiki/index.php?title=The_Smart_City_-_City_of_Knowledge "The Smart City - City of Knowledge"):

As new modernist forms and use of materials propagated the abundance of
decorative elements, Otlet believed in the possibility of language as a model
of '[raw data](/wiki/index.php?title=Bag_of_words "Bag of words")', reducing
it to essential information and unambiguous facts, while removing all
inefficient assets of ambiguity or subjectivity.


> Tim Berners-Lee: [...] Make a beautiful website, but first give us the
unadulterated data, we want the data. We want unadulterated data. OK, we have
to ask for raw data now. And I'm going to ask you to practice that, OK? Can
you say "raw"?

>

> Audience: Raw.

>

> Tim Berners-Lee: Can you say "data"?

>

> Audience: Data.

>

> TBL: Can you say "now"?

>

> Audience: Now!

>

> TBL: Alright, "raw data now"!

>

> [...]

>

> So, we're at the stage now where we have to do this -- the people who think
it's a great idea. And all the people -- and I think there's a lot of people
at TED who do things because -- even though there's not an immediate return on
the investment because it will only really pay off when everybody else has
done it -- they'll do it because they're the sort of person who just does
things which would be good if everybody else did them. OK, so it's called
linked data. I want you to make it. I want you to demand it. [6]

## Un/Structured

As graduate students at Stanford, Sergey Brin and Lawrence (Larry) Page had an
early interest in producing "structured data" from the "unstructured" web. [7]

> The World Wide Web provides a vast source of information of almost all
types, ranging from DNA databases to resumes to lists of favorite restaurants.
However, this information is often scattered among many web servers and hosts,
using many different formats. If these chunks of information could be
extracted from the World Wide Web and integrated into a structured form, they
would form an unprecedented source of information. It would include the
largest international directory of people, the largest and most diverse
databases of products, the greatest bibliography of academic works, and many
other useful resources. [...]

>

> **2.1 The Problem**
> Here we define our problem more formally:
> Let D be a large database of unstructured information such as the World
Wide Web [...] [8]

In a paper titled _Dynamic Data Mining_, Brin and Page describe their search
for _rules_ (statistical correlations) between the words used in web pages.
The "baskets" they mention derive from "market basket" techniques, developed
to find correlations between the items recorded in the purchase receipts of
supermarket customers. In their case, they deal with web pages rather than
shopping baskets, and words instead of purchases. In moving to the much larger
scale of the web, they describe the usefulness of their research in terms of
its computational economy: the ability to tackle the scale of the web with
contemporary computing power and still complete the task in a reasonably short
amount of time.

> A traditional algorithm could not compute the large itemsets in the lifetime
of the universe. [...] Yet many data sets are difficult to mine because they
have many frequently occurring items, complex relationships between the items,
and a large number of items per basket. In this paper we experiment with word
usage in documents on the World Wide Web (see Section 4.2 for details about
this data set). This data set is fundamentally different from a supermarket
data set. Each document has roughly 150 distinct words on average, as compared
to roughly 10 items for cash register transactions. We restrict ourselves to a
subset of about 24 million documents from the web. This set of documents
contains over 14 million distinct words, with tens of thousands of them
occurring above a reasonable support threshold. Very many sets of these words
are highly correlated and occur often. [9]
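
To make the "basket" vocabulary concrete, here is a minimal sketch in Python
that treats each document as a basket of distinct words and counts how often
pairs of words occur together. It is the naive, exhaustive kind of counting
that the quotation says cannot scale to the web, not the sampling method Brin
and Page propose; the four toy "pages" and all function names are assumptions
invented for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy "web pages"; a real data set would hold millions.
documents = [
    "raw data now",
    "we want raw data",
    "linked data now",
    "raw material",
]


def baskets(docs):
    """Reduce each document to its set of distinct words: order and repeats are lost."""
    return [set(doc.lower().split()) for doc in docs]


def pair_support(docs):
    """Fraction of baskets in which each pair of words appears together."""
    counts = Counter()
    for basket in baskets(docs):
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: count / len(docs) for pair, count in counts.items()}


def frequent_pairs(docs, min_support):
    """Keep only pairs whose support meets the chosen threshold."""
    return {p: s for p, s in pair_support(docs).items() if s >= min_support}


print(frequent_pairs(documents, 0.5))
```

Even in this toy, the number of candidate pairs grows with the square of the
basket size, which is why a document of 150 distinct words is so much harder
to mine than a receipt of 10 items.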

## Un/Ordered

In programming, I've encountered a recurring "problem" that's quite
symptomatic. It goes something like this: you (the programmer) have managed to
cobble together a lovely "content management system" (either from scratch, or
using any number of helpful frameworks) where your user can enter some "items"
into a database, for instance to store bookmarks. These items are then
automatically ordered and presented in list form (say on a web page). The
author responds: "It's great, except... could this bookmark come before that
one?" The problem stems from the fact that the database ordering (a core
functionality provided by any database) applies a sorting logic that's almost
but not quite right. A typical example is the sorting of names, where details
(where to place a name that starts with a Norwegian "Ø", for instance) are
language-specific, and when a mixture of languages occurs, no single ordering
is necessarily "correct". The (often) exasperated programmer might hastily add
an additional database field so that each item can also have an "order"
(perhaps in the form of a date or some other kind of (alpha)numerical
"sorting" value) to be used to correctly order the resulting list. Now the
author has a means, awkward and indirect but workable, to control the order of
the presented data on the start page. But one might well ask, why not just
edit the resulting listing as a document? Not possible! Contemporary content
management systems are based on a data flow from a "pure" source of a
database, through controlling code and templates, to produce a document as a
result. The document isn't the data, it's the end result of an irreversible
process. This problem, in this and many variants, is widespread and reveals an
essential backwardness in a particular "computer scientist" mindset about what
constitutes "data", and in particular its relationship to order, that turns
what might be a straightforward question of editing a document into an
over-engineered database.
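
The scenario can be sketched in Python. The bookmark records, the field name
"order" and the hand-written Norwegian alphabet key are all assumptions made
for illustration; a production system would more likely rely on a collation
library or on the database's own locale settings.

```python
import unicodedata

# Hypothetical bookmark records; "order" is the hand-maintained sorting value.
bookmarks = [
    {"name": "Zimmer", "order": 3},
    {"name": "Østby", "order": 1},
    {"name": "Olsen", "order": 2},
    {"name": "Åse", "order": 4},
]


def nfc(text):
    """Normalize to composed form so accented letters are single characters."""
    return unicodedata.normalize("NFC", text)


# 1. Default sort: compares Unicode code points and knows nothing of alphabets.
by_code_point = sorted(nfc(b["name"]) for b in bookmarks)

# 2. Language-specific sort: in the Norwegian alphabet, Æ, Ø and Å follow Z.
NORWEGIAN = "abcdefghijklmnopqrstuvwxyzæøå"


def norwegian_key(name):
    # Characters outside the alphabet are simply placed last in this sketch.
    return [NORWEGIAN.index(ch) if ch in NORWEGIAN else len(NORWEGIAN)
            for ch in nfc(name).lower()]


by_norwegian = sorted((nfc(b["name"]) for b in bookmarks), key=norwegian_key)

# 3. The workaround: ignore the names and sort on the extra "order" field.
by_hand = [nfc(b["name"]) for b in sorted(bookmarks, key=lambda b: b["order"])]

print(by_code_point)
print(by_norwegian)
print(by_hand)
```

The three listings disagree with one another, and only the third reflects what
the author actually wants, at the price of maintaining a number for every item.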

While I was recently working with Nikolaos Vogiatzis, whose research explores
playful and radically subjective alternatives to the list, he was struck by
how HTML, from its earliest specifications (still valid today), has separate
elements (OL and UL) for "ordered" and "unordered" lists.

> The representation of the list is not defined here, but a bulleted list for
unordered lists, and a sequence of numbered paragraphs for an ordered list
would be quite appropriate. Other possibilities for interactive display
include embedded scrollable browse panels. [10]

Vogiatzis' surprise lay in the idea of a list ever being considered
"unordered" (or in opposition to the language used in the specification, for
order to ever be considered "insignificant"). Indeed in its suggested
representation, still followed by modern web browsers, the only difference
between the two visually is that UL items are preceded by a bullet symbol,
while OL items are numbered.

The idea of ordering runs deep in programming practice, where essentially
different data structures are employed depending on whether order is to be
maintained. The keys of a "hash" table (also known as an associative array),
for instance, are ordered in an unpredictable way governed by the particular
implementation. This data structure, extremely prevalent in contemporary
programming practice, sacrifices order to offer other kinds of efficiency
(fast text-based retrieval, for instance).
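
A toy hash table in Python shows where the order goes. Python's built-in
`dict` has preserved insertion order since version 3.7, so the sketch builds
its own deliberately simple table instead; `ToyHashTable` and `toy_hash` are
hypothetical names, and the four words are borrowed from the key words of
Lieber's code book mentioned earlier.

```python
def toy_hash(key, size):
    """Deliberately simple hash: sum of character codes modulo the table size."""
    return sum(ord(ch) for ch in key) % size


class ToyHashTable:
    def __init__(self, size):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def put(self, key, value):
        bucket = self.buckets[toy_hash(key, self.size)]
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        # Retrieval jumps straight to one bucket: this is the efficiency gained.
        for existing, value in self.buckets[toy_hash(key, self.size)]:
            if existing == key:
                return value
        raise KeyError(key)

    def keys(self):
        # Listing walks the buckets in storage order, not insertion order.
        return [key for bucket in self.buckets for key, _ in bucket]


words = ["bales", "bags", "bail", "act"]
small, large = ToyHashTable(size=5), ToyHashTable(size=8)
for position, word in enumerate(words):
    small.put(word, position)
    large.put(word, position)

print(words)         # the order in which the words were entered
print(small.keys())  # the order a 5-bucket table happens to store them in
print(large.keys())  # a different order again with 8 buckets
```

The same four words come back in a different sequence depending only on the
table size, a detail of the implementation that the person entering the words
never sees.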

## Data mining

In announcing Google's impending data center in Mons, Belgian prime minister
Di Rupo invoked the link between the history of the mining industry in the
region and the present and future interest in "data mining" as practiced by IT
companies such as Google.

Whether speaking of bales of cotton, barrels of oil, or bags of words, what
links these subjects is the way in which the notion of "raw material" obscures
the labor and power structures employed to secure them. "Raw" is always
relative: "purity" depends on processes of "refinement" that typically carry
social/ecological impact.

Stripping language of order is an act of "disembodiment", detaching it from
the acts of writing and reading. The shift from (human) reading to machine
reading involves a shift of responsibility from the individual human body to
the obscured responsibilities and seemingly inevitable forces of the
"machine", be it the machine of a market or the machine of an algorithm.

From [X = Y](/wiki/index.php?title=X_%3D_Y "X = Y"):

Still, it is reassuring to know that the products hold traces of the work,
that even with the progressive removal of human signs in automated processes,
the workers' presence never disappears completely. This presence is proof of
the materiality of information production, and becomes a sign of the economies
and paradigms of efficiency and profitability that are involved.


The computer scientists' view of textual content as "unstructured", be it in a
webpage or the OCR scanned pages of a book, reflects a neglect of the
processes and labor of writing, editing, design, layout, typesetting, and
eventually publishing, collecting and cataloging [11].

"Unstructured", to the computer scientist, means non-conformant to particular
forms of machine reading. "Structuring" then is a social process by which
particular (additional) conventions are agreed upon and employed. Computer
scientists often view text through the eyes of their particular reading
algorithm, and in the process (voluntarily) blind themselves to the work
practices which have produced and maintain these "resources".

Berners-Lee, in exhorting his audience of web publishers not only to publish
online but to release "unadulterated" data, betrays a lack of imagination in
considering how language is itself structured, and a blindness to the need for
more than additional technical standards to connect to existing publishing
practices.

Last Revision: 2*08*2016

1. Benjamin Franklin Lieber, Lieber's Standard Telegraphic Code, 1896, New York.
2. Katherine Hayles, "Technogenesis in Action: Telegraph Code Books and the Place of the Human", How We Think: Digital Media and Contemporary Technogenesis, 2012.
3. Hayles
4. Lieber's
5. Hayles
6. Tim Berners-Lee: The next web, TED Talk, February 2009.
7. "Research on the Web seems to be fashionable these days and I guess I'm no exception." from Brin's [Stanford webpage](http://infolab.stanford.edu/~sergey/)
8. Extracting Patterns and Relations from the World Wide Web, Sergey Brin, Proceedings of the WebDB Workshop at EDBT 1998.
9. Dynamic Data Mining: Exploring Large Rule Spaces by Sampling; Sergey Brin and Lawrence Page, 1998; p. 2
10. Hypertext Markup Language (HTML): "Internet Draft", Tim Berners-Lee and Daniel Connolly, June 1993.
11.

Retrieved from
[https://www.mondotheque.be/wiki/index.php?title=A_bag_but_is_language_nothing_of_words&oldid=8480](https://www.mondotheque.be/wiki/index.php?title=A_bag_but_is_language_nothing_of_words&oldid=8480)

 
