Adema
Scanners, collectors and aggregators. On the underground movement of (pirated) theory text sharing
2009
# Scanners, collectors and aggregators. On the ‘underground movement’ of (pirated) theory text sharing
_“But as I say, let’s play a game of science fiction and imagine for a moment:
what would it be like if it were possible to have an academic equivalent to
the peer-to-peer file sharing practices associated with Napster, eMule, and
BitTorrent, something dealing with written texts rather than music? What would
the consequences be for the way in which scholarly research is conceived,
communicated, acquired, exchanged, practiced, and understood?”_
Gary Hall – [Digitize this
book!](http://www.upress.umn.edu/Books/H/hall_digitize.html) (2008)
![ubuweb](https://openreflections.files.wordpress.com/2009/09/ubuweb.jpg?w=547)
UbuWeb was founded in 1996 by poet [Kenneth
Goldsmith](http://en.wikipedia.org/wiki/Kenneth_Goldsmith "Kenneth Goldsmith")
and has developed from ‘a repository for visual, concrete and (later) sound
poetry’ to a site that ‘embraced all forms of the avant-garde and beyond. Its
parameters continue to expand in all directions.’ As
[Wikipedia](http://en.wikipedia.org/wiki/UbuWeb) states, Ubu is non-commercial
and operates on a gift economy. All the same - by forming an amazing resource
and repository for the avant-garde movement, and by offering and hosting these
works on its platform, Ubu is violating copyright laws. As they state however:
‘ _should something return to print, we will remove it from our site
immediately. Also, should an artist find their material posted on UbuWeb
without permission and wants it removed, please let us know. However, most of
the time, we find artists are thrilled to find their work cared for and
displayed in a sympathetic context. As always, we welcome more work from
existing artists on site_.’
In the more affluent and popular media realms of blockbuster movies and pop
music, the [Piratebay](http://thepiratebay.org/) and other download sites
(or p2p networks) like [Mininova](http://www.mininova.org/) are being sued and
charged with copyright infringement. Yet the powers that be seem to turn a
blind eye when it comes to Ubu and many other online resource sites that offer
digital versions of hard-to-come-by materials, ranging from books to
documentaries.
This is not, and has not always been, the case: in 2002 [Sebastian
Lütgert](http://www.wizards-of-os.org/archiv/wos_3/sprecher/l_p/sebastian_luetgert.html)
from Berlin/New York
was sued by the "Hamburger Stiftung zur Förderung von Wissenschaft und Kultur"
for putting two downloadable texts by Theodor W. Adorno online on his
website [textz.com](http://www.medienkunstnetz.de/artist/textz-com/biography/),
an underground archive for literature. According to
[this](http://de.indymedia.org/2004/03/76975.shtml) Indymedia interview with
Lütgert, textz.com was referred to as ‘the Napster for books’, offering about
700 titles and focusing on, as Lütgert states, _‘theory, novels, science
fiction, Situationists, cinema, the French, Douglas Adams, Critical Theory,
net critique, etc.’_
The interview becomes even more interesting when Lütgert remarks that one can
still easily download both Adorno texts without much ado if one wants to. This
raises the bigger question of the real reasons behind the charge against
textz.com: why was textz.com sued? As Lütgert says in the interview: “_You can
do that anyway_ [referring to the still available Adorno texts]. _But there
has long been a clear difference between open availability and the
underground. The free distribution of content cannot be stopped, but they seem
to want to prevent it from happening too openly and too much as a matter of
course. That is what bothers them._”
_![I don't have any secrets](https://openreflections.files.wordpress.com/2009/09/i-dont-have-any-secrets.jpg?w=547)_
But how can something be truly underground in an online environment whilst
still trying to spread or disseminate texts as widely as possible? This seems
to be the paradox of many - not quite legal and/or copyright protected -
resource sharing and collecting communities and platforms nowadays. However,
multiple scenarios are available to evade this dilemma: being frankly open
about the ‘status’ of the content on offer, as Ubu does; using little
‘tricks’ like an easy website registration or classifying oneself as a reading
group; or relieving oneself of responsibility by stating that one is only
aggregating sources from elsewhere (linking) and not hosting the content on
one’s own website or blog. One can also state that the offered texts or
multimedia files form a special issue or collection of resources, emphasizing
their educational and not-for-profit value.
Most of the ‘underground’ text and content sharing communities seem to follow
the concept of (the inevitability of) ‘[information wants to be
free](https://openreflections.wordpress.com/tag/information-wants-to-be-free/)’,
especially on the Internet. As Lütgert states: “_And above all they are not
well informed about Walter Benjamin, who faced the same problem of the
reproducibility of works of all kinds at the beginning of the last century and
recognised that the masses have the right to reappropriate all of this. They
have the right to copy, and the right to be copied. In any case it is a pretty
uncomfortable situation that his estate is now being administered by such a
bureaucrat._ _A: Do you think it is at all legitimate to "own" intellectual
content? Or to be its proprietor?_ _S: It is *impossible*. "Intellectual"
anything always keeps spreading. Reemtsma’s ancestors would never have come
down from the trees or crawled out of the swamp if "intellectual" anything had
not spread._”
![646px-Book_scanner_svg.jpg](https://openreflections.files.wordpress.com/2009/09/646px-book_scanner_svg-jpg1.png?w=547)
What seems increasingly obvious, as the interview also states, is that
one can find virtually all the ebooks and texts one needs via p2p networks and
other file sharing communities (the true
[Darknet](http://en.wikipedia.org/wiki/Darknet_\(file_sharing\)) in a way).
More and more people are offering (and asking for!) selections of texts and
books (including the ones by Adorno) on openly available websites and blogs,
or they are scanning them and offering them for (educational) use on their
domains. Although the Internet is mostly known for the dissemination of
pirated movies and music, copyright protected textual content
has (of course) always been spread too. But with the rise of ‘born digital’
text content, with the help of massive digitization efforts like Google
Books (and accompanying Google Books [download
tools](http://www.codeplex.com/GoogleBookDownloader)), and with the
appearance of better (and cheaper) scanning equipment, the movement of
‘openly’ spreading (pirated) texts (whether or not focusing on education and
‘fair use’) seems to be growing fast.
The direct harm (to both the producers and their publishers) of the free
online availability of (in copyright) texts is perhaps also less clear than it
is, for instance, with music and films. Many feel that readers will still
prefer to read texts and books in print, making the free online availability
of text nothing more than a marketing tool for the sales of the printed
version: once they have discovered a text, those truly interested will find
and buy the print book. Also, more than with music and film, sharing
information is felt to be essential, as a cultural good and right, to prevent
censorship and to improve society.
![Piracy by Mikel Casal](https://openreflections.files.wordpress.com/2009/09/piracy-by-mikel-casal.jpg?w=432&h=312)
This is one of the reasons the [Open
Access](http://en.wikipedia.org/wiki/Open_access_\(publishing\)) movement for
scientific research was initiated. But while the number of people and
institutions supportive of this movement is gradually growing (especially
where it concerns articles and journals in the Sciences), the spread of Open
Access (or even digital availability) to monographs in the
Humanities and Social Sciences (which make up the majority of the resources on
offer in the underground text sharing communities) has only just
started.
This has led to a situation in which some have decided that change is not
coming fast enough. Instead of waiting for this utopian Open Access future to
come about gradually, they are actively spreading, copying, scanning and
pirating scholarly texts/monographs online. Their sites are often accompanied
by lengthy disclaimers about why they are violating copyright (to make the
content more widely accessible, for one), and many state they will take down
the content if asked. Following the
[copyleft](http://en.wikipedia.org/wiki/Copyleft) movement, what has in a way
thus arisen is a more ‘progressive’ or radical branch of the Open Access
movement. The people who spread these texts deem it inevitable that they will
be online eventually; they are just speeding up the process. As Lütgert
states: ‘_The desire of an ever larger section of the population for 100
percent of information is irreversible. In the worst case the way there can be
slowed down, but it cannot be stopped._’
![scribd-logo](https://openreflections.files.wordpress.com/2009/09/scribd-logo.jpg?w=547)
Still we have not yet answered the question of why publishers (and their
pirated authors) are not more upset about these kinds of websites and
platforms. It is not simply a question of them not being aware that these
kinds of textual dissemination are occurring. As mentioned before, the harm to
producers (scholars) and their publishers (in the Humanities and Social
Sciences mainly not-for-profit university presses) is less clear. First of
all, their main customers are libraries (compare this to the software business
model: free for the consumer, companies pay), who are still buying the legal
content and mostly follow the policy of buying either print or both print and
ebook, so there are no lost sales there for the publishers. In addition, it is
not certain that the piracy is harming sales. Unlike in literary publishing,
the authors (academics) are already paid and do not lose money (or very
little, in royalties) from the online availability. Perhaps some publishers
also see the Open Access movement as something inevitably growing and thus
don’t feel the urge to step up or organize a collaborative effort against
scholarly text piracy (most of the presses also lack the scale to initiate
this).
There has been somewhat more commotion and worry about _[textbook
piracy](http://bookseller-association.blogspot.com/2008/07/textbook-piracy.html)_
(since this is of course the area where individual consumers –
students – do directly buy the material) and about websites like
[Scribd](http://www.scribd.com/). This mostly has to do with the fact that
these kinds of platforms also host non-scholarly content and actively promote
the uploading of texts (whereas many of the text ‘sharing’ platforms merely
offer downloading facilities). In the case of Scribd, the size of the platform
(or the amount of content available on it) has also caused concern
and much [media coverage](http://labnol.blogspot.com/2007/04/scribd-youtube-for-pirated-ebooks-but.html).
All of this gives a lot of potential power to text sharing communities, and I
guess they know this. Only authors might be directly upset (especially famous
ones gathering a lot of royalties on their work) or in the case of Lütgert,
their beneficiaries, who still do see a lot of money coming directly from
individual customers.
Still, it is not only the lack of fear of possible retaliation that is
feeding the upsurge of text sharing communities. There is a strong ideological
commitment to the inherent good of these developments, and a moral and
political striving towards institutional and societal change when it comes to
knowledge production and dissemination.
![Information Libre](https://openreflections.files.wordpress.com/2009/09/information-libre.jpg?w=547)
As Adrian Johns states in his
[article](http://www.culturemachine.net/index.php/cm/article/view/345/348)
_Piracy as a business force_, ‘today’s pirate philosophy is a moral
philosophy through and through’. As Jonas Andersson
[states](http://www.culturemachine.net/index.php/cm/article/view/346/359), the
idea of piracy has mostly lost its negative connotations in these communities
and is seen as a positive development, where these movements ‘have begun to
appear less as a reactive force (i.e. ‘breaking the rules’) and more as a
proactive one (‘setting the rules’). Rather than complain about the
conservatism of established forms of distribution they simply create new,
alternative ones.’ Although Andersson states this kind of activism is mostly
_occasional_, it can be seen expressed clearly in the texts accompanying the
text sharing sites and blogs. However, copyright is perhaps not so much _an
issue_ on most of these sites (though it is on some of them) as it is
something that seems to be simply ignored for the larger good of aggregating
and sharing resources on the web. This is stated clearly, for instance, in an
[interview](http://blog.sfmoma.org/2009/08/four-dialogues-2-on-aaaarg/) with
Sean Dockray, who maintains AAAARG:
_"The project wasn’t about criticizing institutions, copyright, authority,
and so on. It was simply about sharing knowledge. This wasn’t as general as it
sounds; I mean literally the sharing of knowledge between various individuals
and groups that I was in correspondence with at the time but who weren’t
necessarily in correspondence with each other."_
Back to Lütgert. The files from textz.com have been saved and are still
[accessible](http://web.archive.org/web/20031208043421/textz.gnutenberg.net/index.php3?enhanced_version=http://textz.com/index.php3)
via [The Internet Archive Wayback
Machine](http://web.archive.org/collections/web.html). In the case of
textz.com, these files contain ‘typed out text’, so no scanned content or
PDFs. Textz.com (or rather its shadow or mirror) offers an amazing
collection of texts, including artists’ statements/manifestos and screenplays
by, for instance, David Lynch.
The text sharing community has evolved and now has many players. Two other
large members of this kind of ‘pirate theory base network’ that are still
active today (although – and I have to make that clear! – they offer many, and
even mostly, legal and out of copyright texts) are
[Monoskop/Burundi](http://burundi.sk/monoskop/log/) and
[AAAARG.ORG](http://a.aaaarg.org/). These kinds of platforms all seem to
disseminate similar content (often even down to the same titles), focusing
mostly on Continental Philosophy and Critical Theory, Cultural Studies and
Literary Theory, the Frankfurt School, Sociology/Social Theory, Psychology,
Anthropology and Ethnography, Media Art and Studies, Music Theory, and
critical and avant-garde writers like Kafka, Beckett, Burroughs, Joyce,
Baudrillard, etc.
[Monoskop](http://www.burundi.sk/monoskop/index.php/Main_Page) is, as they
state, a collaborative research wiki on the social history of media art, or a
‘living archive of writings on art, culture and media technology’. At the
sitemap of their log, or under the categories section, you can browse their
resources by genre: book, journal, e-zine, report, pamphlet etc. As I found
[here](http://www.slovakia.culturalprofiles.net/?id=7958), Burundi originated
in 2003 as a (Slovakian) media lab working between the arts, science and
technologies, which spread out to a European city-based cultural network. They
even functioned as a press, publishing the Anthology of New Media Literature
(in Slovak) in 2006, and they hosted media events and curated festivals.
Burundi dissolved in June 2005, although the
[Monoskop](http://www.slovakia.culturalprofiles.net/?id=7964) research wiki on
media art has continued to run since then.
![AAAARG](https://openreflections.files.wordpress.com/2009/09/aaaarg.jpg?w=547)
As is stated on their website, AAAARG is a conversation platform, or
alternatively, a school, reading group or journal, maintained by Los Angeles
artist [Sean Dockray](http://www.design.ucla.edu/people/faculty.php?ID=64
"Sean Dockray"). In the true spirit of Critical Theory, its aim is to ‘develop
critical discourse outside of an institutional framework’. Or, put even more
beautifully, it operates in the spaces in between: ‘ _But rather than
thinking of it like a new building, imagine scaffolding that attaches onto
existing buildings and creates new architectures between them_.’ To
access the texts and resources that are being ‘discussed’ at AAAARG, you need
to register, after which you will be able to browse the
[library](http://a.aaaarg.org/library). From this library, you can download
resources, but you can also upload content. You can subscribe to their
[feed](http://aaaarg.org/feed) (RSS/XML) and, [like
Monoskop](http://twitter.com/monoskop), AAAARG.org also maintains a [Twitter
account](http://twitter.com/aaaarg) on which updates are posted. The most
interesting part, though, is the ‘extra’ functions the platform offers: after
you have made an account, you can make your own collections, aggregations or
issues out of the texts in the library or the texts you add. This offers an
alternative (thematically ordered) way into the texts archived on the site.
You can also comment on the texts or start a discussion about them. See for
instance their elaborate [discussion
lists](http://a.aaaarg.org/discussions). The AAAARG community thus serves both
as a sharing and a feedback community, and in this way operates in a true p2p
fashion, much as p2p seems to have been originally intended. The difference is
that AAAARG is not based on a distributed network of computers but
on one platform, to which registered users are able to upload files (which is
not the case on Monoskop, for instance, which only offers downloading).
Via
[mercerunionhall](http://mercerunionhall.blogspot.com/2009/06/aaaargorg.html),
I found the image below, which depicts AAAARG.ORG's article index
organized as a visual map, showing the connections between the different
texts. According to mercerunionhall, this map was created and posted by AAAARG
user john.
![Connections-v1 by John](https://openreflections.files.wordpress.com/2009/09/connections-v1-by-john.jpg?w=547)
Where AAAARG.ORG focuses again on the text itself - typed out versions of
books - Monoskop works with more modern forms of textual distribution:
scanned versions or full ebooks/PDFs with all the possibilities they offer,
taking a lot of content from Google Books or (Open Access) publishers’
websites. Monoskop also links back to the publishers’ websites or Google
Books for information about the books or texts (which again shows that the
publishers should know about its activities). To download the text, however,
Monoskop links to [Sharebee](http://www.sharebee.com/), keeping the actual
text and the real downloading activity away from its platform.
Another part of the text sharing network consists of platforms offering
documentaries and lectures (so multimedia content) online. One example of the
latter is the [Discourse Notebook Archive](http://www.discoursenotebook.com/),
which describes itself as an effort whose main goal is ‘to make
available lectures in contemporary continental philosophy’ and which is
maintained by Todd Kesselman, a PhD student at The New School for Social
Research. Here
you can find lectures from Badiou, Kristeva and Zizek (both audio and video)
and lectures aggregated from the European Graduate School. Kesselman also
links to resources on the web dealing with contemporary continental
philosophy.
![Eule - Society of Control](https://openreflections.files.wordpress.com/2009/09/eule-society-of-control.gif?w=547)
Society of Control is a website maintained by [Stephan
Dillemuth](http://www.kopenhagen.dk/fileadmin/oldsite/interviews/solmennesker.htm),
an artist living and working in Munich, Germany, offering amongst other things
an overview of his work and scientific research. According to
[this](http://www2.khib.no/~hovedfag/akademiet_05/tekster/interview.html)
interview conducted by Kristian Ø Dahl and Marit Flåtter, his work is a
response to the increased influence of the neo-liberal world order on
education, creating a culture industry that is more often than not driven by
commercial interests. He asks the question ‘How can dissidence grow in the
blind spots of the ‘society of control’ and articulate itself?’ His website,
the [Society of Control](http://www.societyofcontrol.com/disclaimer1.htm), is,
as he states, ‘an independent organization whose profits are entirely devoted
to research into truth and meaning.’
Society of Control has a [library
section](http://www.societyofcontrol.com/library/) which contains works by
some of the biggest thinkers of the twentieth century: Baudrillard, Adorno,
Debord, Bourdieu, Deleuze, Habermas, Sloterdijk and so on, and much
more, a lot of it in German, and all as ‘typed out’ texts. The library section
offers a direct search function, a category function and an A-Z browse
function. Dillemuth states that he offers this material under fair use,
focusing on its not-for-profit character, on freedom of speech and
information, and on making information accessible to all:
_“The Societyofcontrol website site contains information gathered from many
different sources. We see the internet as public domain necessary for the free
flow and exchange of information. However, some of these materials contained
in this site maybe claimed to be copyrighted by various unknown persons. They
will be removed at the copyright holder's request within a reasonable period
of time upon receipt of such a request at the email address below. It is not
the intent of the Societyofcontrol to have violated or infringed upon any
copyrights.”_
![Vilem Flusser, Andreas Strohl, Erik Eisel Writings \(2002\)](https://openreflections.files.wordpress.com/2009/09/vilem-flusser-andreas-strohl-erik-eisel-writings-2002.jpg?w=547)
Important in this respect is
that he places the responsibility for reading/using/downloading the texts on
his site on the viewers, and not on himself: _“Anyone reading or looking at
copyright material from this site does so at his/her own peril, we disclaim
any participation or liability in such actions.”_
Fark Yaraları = [Scars of Différance](http://farkyaralari.blogspot.com/) and
[Multitude of blogs](http://multitudeofblogs.blogspot.com/) are maintained by
the same author, Renc-u-ana, a philosophy and sociology student from Istanbul.
The first is his personal blog (which also has many links to downloadable
texts), focusing on ‘creating an e-library for a Heideggerian philosophy and
Bourdieuan sociology’, on which he writes that ‘market-created inequalities
must be overthrown in order to close knowledge gap.’ The second site has a
clear aggregating function, with the aim ‘to give united feedback for e-book
publishing sites so that tracing and finding may become easier’, and it
includes a call for similar blogs or websites offering free ebook content. The
blog is accompanied by a nice picture of a woman warning to keep quiet, which
is paradoxically very appropriate to the context. Here again, there is a
statement from the host on possible copyright infringement: _‘None of the PDFs
are my own productions. I've collected them from web (e-mule, avax, libreremo,
socialist bros, cross-x, gigapedia..) What I did was thematizing.’_ The same
goes for [pdflibrary](http://pdflibrary.wordpress.com/) (which seems to be by
the same author), offering texts by Derrida, Benjamin, Deleuze and the like:
_‘None of the PDFs you find here are productions of this blog. They are
collected from different places in the web (e-mule, avax, libreremo, all
socialist bros, cross-x, …). The only work done here is thematizing and
tagging.’_
[![GRUP_Z~1](https://openreflections.files.wordpress.com/2009/09/grup_z11.jpg?w=547)](http://multitudeofblogs.blogspot.com/)
Our student from Istanbul lists many text sharing sites on Multitude of blogs,
including [Inishark](http://danetch.blogspot.com/) (amongst others Badiou,
Zizek and Derrida),
[Revelation](http://revelation-online.blogspot.com/2009/02/keeping-ten-commandments.html)
(a lot of history and bible study), [Museum of
accidents](http://museumofaccidents.blogspot.com/) (many resources relating,
again, to critical theory, political theory and continental philosophy) and
[Makeworlds](http://makeworlds.net/) (initiated from the [make world
festival](http://www.makeworlds.org/1/index.html) 2001).
[Mariborchan](http://mariborchan.wordpress.com/) is mainly a Zizek resource
site (also Badiou and Lacan) and offers, next to ebooks, video and audio
(lectures and documentaries) and text files, all via links to file sharing
platforms.
What is clear is that the text sharing network described above (I am sure
there are many more related to other fields and subjects) is also formed and
maintained by the fact that the blogs and resource sites link to each other in
their blogrolls. This is what in the end makes up the network of text
sharing, enhanced by RSS feeds and Twitter accounts that keep
direct communication streams open with the rest of the community. That there
has not been one major platform or aggregation site linking them together and
uploading all the texts is logical if we take into account the text sharing
history described before, and can thus be seen as a clear tactic. It is driven
by fear: fear of what happened to textz.com, fear of the issue of scale, and
fear of no longer operating at the borders, on the outside or at the fringes,
because a larger scale means they might really get noticed. The idea of
secrecy and exclusivity that makes for the idea of the underground is very
practically combined with the idea that in this way the texts are available in
a multitude of places and thus cannot be withdrawn or disappear so easily.
This is the paradox of the underground: staying small means not being noticed
(widely), but probably also being able to exist for an extended period of
time. Becoming (too) big will mean reaching more people and spreading the
texts further into society; however, it will also probably mean being noticed
as a threat, as a ‘network of text-piracy’. The true strategy is to retain
this balance of openly dispersed subversivity.
Update 25 November 2009: Another interesting resource site came to my
attention recently: [Bedeutung](http://www.bedeutung.co.uk/index.php),
a philosophical and artistic initiative consisting of three projects:
[Bedeutung
Magazine](http://www.bedeutung.co.uk/index.php?option=com_content&view=article&id=1&Itemid=3),
[Bedeutung
Collective](http://www.bedeutung.co.uk/index.php?option=com_content&view=article&id=67&Itemid=4)
and [Bedeutung Blog](http://bedeutung.wordpress.com/), hosts a
[library](http://www.bedeutung.co.uk/index.php?option=com_content&view=article&id=85&Itemid=45)
section which links to freely downloadable online e-books, articles, audio
recordings and videos.
### 17 comments on "Scanners, collectors and aggregators. On the ‘underground movement’ of (pirated) theory text sharing"
1. Pingback: [Humanism at the fringe « Snarkmarket](http://snarkmarket.com/2009/3428)
2. Pingback: [Scanners, collectors and aggregators. On the 'underground movement' of (pirated) theory text sharing « Mariborchan](http://mariborchan.wordpress.com/2009/09/20/scanners-collectors-and-aggregators-on-the-underground-movement-of-pirated-theory-text-sharing/)
3. Mariborchan
September 20, 2009
I took the liberty to pirate this article.
4. [jannekeadema1979](http://www.openreflections.wordpress.com)
September 20, 2009
Thanks, it's all about the sharing! Hope you liked it.
5. Pingback: [links for 2009-09-20 « Blarney Fellow](http://blarneyfellow.wordpress.com/2009/09/21/links-for-2009-09-20/)
6. [scars of différance](http://farkyaralari.blogspot.com)
September 30, 2009
hi there, I'm the owner of the Scars of Différance blog, I'm grateful for your
reading which nurtures self-reflexivity.
text-sharers phylum is a Tardean phenomena, it works through imitation and
differences differentiate styles and archives. my question was inherited from
aby warburg who is perhaps the first kantian librarian (not books, but the
nomenclatura of books must be thought!), I shape up a library where books
speak to each other, each time fragmentary.
you are right about the "fear", that's why I don't reupload books that are
deleted from mediafire. blog is one of the ways, for ex there are e-mail
groups where chain-sharings happen and there are forums where people ask each
other from different parts of the world, to scan a book that can't be found in
their library/country. I understand publishers' qualms (I also work in a
turkish publishing house and make translations). but they miss a point, it was
the very movement which made book a medium that de-posits "book" (in the
Blanchotian sense): these blogs do indeed a very important service, they save
books from the databanks. I'm not going to make a easy rider argument and
decry technology. what I mean is this: these books are the very bricks which
make up resistance -they are not compost-, it is a sharing "partage" and these
fragmentary impartations (the act in which 'we' emancipate books from the
proper names they bear: author, editor, publisher, queen,…) make words blare.
our work: to disenfranchise.
to get larger, to expand: these are too ambitious terms, one must learn to
stay small, remain finite. a blog can not supplant the non-place of the
friendships we make up around books.
the epigraph at the top of my blog reads: "what/who exorbitates mutates into
its opposite" from a Turkish poet Cahit Zarifoğlu. and this logic is what
generates the slithering of the word. we must save books from its own ends.
thanks again, best.
p.s. I'm not the owner of pdf library.
7. Bedeutung
November 24, 2009
Here, an article that might interest:
sharing-free-piracy>
8. [jannekeadema1979](http://www.openreflections.wordpress.com)
November 24, 2009
Thanks for the link, good article, agree with the contents, especially like
the part 'Could, for instance, the considerable resources that might be
allocated to protecting, policing and, ultimately, sanctioning online file-
sharing not be used for rendering it less financially damaging for the
creative sector?'
I like this kind of pragmatic reasoning, and I know more people do.
By the way, checked Bedeutung, great journal, and love your
[library](http://www.bedeutung.co.uk/index.php?option=com_content&view=article&id=86&Itemid=46)
section! Will add it to the main article.
9. Pingback: [Borderland › Critical Readings](http://borderland.northernattitude.org/2010/01/07/critical-readings/)
10. Pingback: [Mariborchan » Scanners, collectors and aggregators. On the 'underground movement' of (pirated) theory text sharing](http://mariborchan.com/scanners-collectors-and-aggregators-on-the-underground-movement-of-pirated-theory-text-sharing/)
11. Pingback: [Urgh! AAAARG dead? « transversalinflections](http://transversalinflections.wordpress.com/2010/05/29/urgh-aaaarg-dead/)
12. [nick knouf](http://turbulence.org/Works/JJPS)
June 18, 2010
This is Nick, the author of the JJPS project; thanks for the tweet! I actually
came across this blog post while doing background research for the project and
looking for discussions about AAAARG; found out about a lot of projects that I
didn't already know about. One thing that I haven't been able to articulate
very well is that I think there's an interesting relationship between, say,
Kenneth Goldsmith's own poetry and his founding of Ubu Web; a collation and
reconfiguration of the detritus of culture (forgotten works of the avant-
gardes locked up behind pay walls of their own, or daily minutiae destined to
be forgotten), which is something that I was trying to do, in a more
circumscribed space, in JJPS Radio. But the question of distribution of
digital works is something I find fascinating, as there are all sorts of
avenues that we could be investigating but we are not. The issue, as it often
is, is one of technical ability, and that's why one of the future directions
of JJPS is to make some of the techniques I used easier to use. Those who want
to can always look into the code, which is of course freely available, but
that cannot and should not be a prerequisite.
13. [jannekeadema1979](http://www.openreflections.wordpress.com)
June 18, 2010
Hi Nick, thanks for your comment. I love the JJPS and it would be great if the
technology you mention would be easily re-usable. What I find fascinating is
how you use another medium (radio) to translate/re-mediate and in a way also
unlock textual material. I see you also have an Open Access and a Cut-up hour.
I am very much interested in using different media to communicate scholarly
research and even more in remixing and re-mediating textual scholarship. I
think your project(s) is a very valuable exploration of these themes while at
the same time being a (performative) critique of the current system. I am in
awe.
14. Pingback: [Text-sharing "in the paradise of too many books" – SLOTHROP](http://slothrop.com/2012/11/16/text-sharing-in-the-paradise-of-too-many-books/)
15. [Jason Kennedy](http://www.facebook.com/903035234)
May 6, 2015
Some obvious fails suggest major knowledge gaps regarding sourcing texts
online (outside of legal channels).
And featuring Scribd doesn't help.
Q: What's the largest pirate book site on the net, with an inventory almost as
large as Amazon?
And it's not L_____ G_____
16. [Janneke Adema](http://www.openreflections.wordpress.com)
May 6, 2015
Do enlighten us Jason… And might I remind you that this post was written in
2009?
17. Mike Andrews
May 7, 2015
Interesting topic, but also odd in some respects. Not translating the German
quotes is very unthoughtful and maybe even arrogant. If you are interested in
open access, accessibility needs to be your top priority. I can read German,
but many of my friends (and most of the world) can't. It takes a little effort
to fix this, but you can do it.
Barok
Poetics of Research
2014
_An unedited version of a talk given at the conference [Public
Library](http://www.wkv-stuttgart.de/en/program/2014/events/public-library/)
held at Württembergischer Kunstverein Stuttgart, 1 November 2014._
_Bracketed sequences are to be reformulated._
Poetics of Research
In this talk I'm going to attempt to identify [particular] cultural
algorithms, i.e. processes in which cultural practices and software meet. With
them a sphere is implied in which algorithms gather to form bodies of
practices and in which cultures gather around algorithms. I'm going to
approach them through the perspective of my practice as a cultural worker,
editor and artist, considering practice in the same rank as theory and
poetics, and where theorization of practice can also lead to the
identification of poetical devices.
The primary motivation for this talk is an attempt to figure out where we
stand as operators, users [and communities] gathering around infrastructures
containing a massive body of text (among other things) and what sort of things
might be considered to make a difference [or to keep making difference].
The talk mainly [considers] the role of text and the word in research, by way
of several figures.
A
A reference, list, scheme, table, index; those things that intervene in the
flow of narrative, illustrating the point, perhaps in a more economic way than
the linear text would do. Yet they don't function as pictures, they are
primarily texts, arranged in figures. Their forms have been
standardised[normalised] over centuries, withstood the transition to the
digital without any significant change, being completely intuitive to the
modern reader. Compared to the body of text they are secondary, run parallel
to it. Their function is however different to that of the punctuation. They
are there neither to shape the narrative nor to aid structuring the argument
into logical blocks. Nor is their function spatial, like in visual poems.
Their positions within a document are determined according to the sequential
order of the text, [standing as attachments] and are there to clarify the
nature of relations among elements of the subject-matter, or to establish
relations with other documents. The [premise] of my talk is that these
_textual figures_ also came to serve as the abstract[relational] models
determining possible relations among documents as such, and in consequence [to
structure conditions [of research]].
B
It can be said that research, as inquiry into a subject-matter, consists of
discrete queries. A query, such as a question about what something is, what
kinds, parts and properties it has, and so on, can be consulted in
existing documents or generate new documents based on collection of data [in]
the field and through experiment, before proceeding to reasoning [arguments
and deductions]. Formulation of a query is determined by protocols providing
access to documents, which means that there is a difference between collecting
data outside the archive (the undocumented, i.e. in the field and through
experiment), consulting with a person--an archivist (expert, librarian,
documentalist), and consulting with a database storing documents.
Phenomena such as [deepening] of specialization and thorough digitization
[have given] privilege to the database as [a|the] [fundamental] means for
research. Obviously, this is a very recent [phenomenon]. Queries were once
formulated in natural language; now, given the fact that databases are queried
[using] SQL, their interfaces are mere extensions of it and
researchers pose their questions by manipulating dropdowns, checkboxes and
input boxes mashed together on a flat screen, run by software that in
turn translates them into a long line of conditioned _SELECTs_ and _JOINs_
performed on tables of data.
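The translation from form controls to _SELECTs_ and _JOINs_ can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration: the `authors` and `documents` tables, their columns, the form field names and the sample rows are hypothetical and do not describe any actual library system.

```python
import sqlite3

def build_query(form):
    """Translate hypothetical form fields into a parameterised SQL query."""
    sql = ("SELECT documents.title FROM documents "
           "JOIN authors ON authors.id = documents.author_id")
    conditions, params = [], []
    if form.get("author"):            # value picked from a dropdown
        conditions.append("authors.name = ?")
        params.append(form["author"])
    if form.get("after_year"):        # value typed into an input box
        conditions.append("documents.year > ?")
        params.append(form["after_year"])
    if form.get("only_digitized"):    # a ticked checkbox
        conditions.append("documents.digitized = 1")
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql + " ORDER BY documents.title", params

def demo():
    """Run the generated query against a tiny in-memory sample database."""
    db = sqlite3.connect(":memory:")
    # Illustrative sample rows only; years and flags are made up for the demo.
    db.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE documents (id INTEGER PRIMARY KEY, author_id INTEGER,
                                title TEXT, year INTEGER, digitized INTEGER);
        INSERT INTO authors VALUES (1, 'Porphyry'), (2, 'Ephraim Chambers');
        INSERT INTO documents VALUES
            (1, 1, 'Isagoge', 270, 1),
            (2, 2, 'Cyclopaedia', 1728, 1),
            (3, 2, 'Unscanned notes', 1730, 0);
    """)
    sql, params = build_query({"author": "Ephraim Chambers",
                               "after_year": 1700, "only_digitized": True})
    return [row[0] for row in db.execute(sql, params)]
```

Each control on the screen contributes one condition to the `WHERE` clause; the researcher never sees the resulting statement, only the rows it returns.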
Specialization, digitization and networking have changed the language of
questioning. Inquiry, once attached to the flesh and paper, has been
[entrusted] to the digital and networked. Researchers are querying the black
box.
C
Searching in a collection of [amassed/assembled] [tangible] documents (i.e. a
bookshelf) is different from searching in a systematically structured
repository (library) and even more so from searching in a digital repository
(digital library). Not that they are mutually exclusive. One can devise
structures and algorithms to search through a printed text, or read books in a
library one by one. They are rather [models] [embodying] various [processes]
associated with the query. These properties of the query might be called [the
sequence], the structure and the index. If they are present in the ways of
querying documents, and we will return to this issue, are they persistent
within the inquiry as such? [wait]
D
This question itself is a rupture in the sequence. It makes a demand to depart
from one narrative [a continuous flow of words] to another, to figure out,
while remaining bound to it [it would be even more as a so-called rhetorical
question]. So there has been one sequence, or line, of the inquiry--about the
kinds of the query and its properties. That sequence itself is a digression,
from within the sequence about what is research and describing its parts
(queries). We are thus returning to it and continuing with the question whether
the properties of the inquiry are the same as the properties of the query.
E
But isn't it true that every single utterance occurring in a sequence yields a
query as well? Let's consider the word _utterance_. [wait] It can produce a
number of associations, for example with how Foucault employs the notion of
_énoncé_ in his _Archaeology of Knowledge_, giving a hard time to his English
translators wondering whether _utterance_ or _statement_ is more appropriate,
or whether they are interchangeable, and what impact each choice would have on
his reception in the Anglophone world. Limiting ourselves to textual forms for
now (and not translating his work but pursuing a different inquiry), let us say
the utterance is a word [or a phrase or an idiom] in a sequence such as a
sentence, a paragraph, or a document.
## (F) The structure
This distinction is as old as recorded Western thought since both Plato and
Aristotle differentiate between a word on its own ("the said", a thing said)
and words in the company of other words. For example, Aristotle's _Categories_
[lay] on the [notion] of words on their own, and they are made the subject-
matter of that inquiry. [For him], the ambiguity of connotation words
[produce] lies in their synonymity, understood differently from the moderns--
not as more words denoting a similar thing but rather one word denoting
various things. Categories were outlined as a device to differentiate among
words according to kinds of these things. Every word as such belonged to exactly
one of ten categories.
So it happens to the word _utterance_ , as to any other word uttered in a
sequence, that it poses a question, a query about what share of the spectrum
of possibly denoted things might yield as the most appropriate in a given
context. The more context the more precise share comes to the fore. When taken
out of the context ambiguity prevails as the spectrum unveils in its variety.
Thus single words [as any other utterances] are questions, queries,
themselves, and by occurring in statements, in context, their [means] are being
singled out.
This process is _conditioned_ by what has been formalized as the techniques of
_regulating_ definitions of words.
### (G) The structure: words as words
* [![](/images/thumb/c/c8/Philitas_in_P.Oxy.XX_2260_i.jpg/144px-Philitas_in_P.Oxy.XX_2260_i.jpg)](/File:Philitas_in_P.Oxy.XX_2260_i.jpg)
P.Oxy.XX 2260 i: Oxyrhynchus papyrus XX, 2260, column i, with quotation from
Philitas, early 2nd c. CE. [1](http://163.1.169.40/cgi-bin/library?e=q-000-00---0POxy--00-0-0--0prompt-10---4------0-1l--1-en-50---20-about-2260--00031-001-0-0utfZz-8-00&a=d&c=POxy&cl=search&d=HASH13af60895d5e9b50907367)
[2](http://en.wikipedia.org/wiki/File:POxy.XX.2260.i-Philitas-highlight.jpeg)
* [![](/images/thumb/9/9e/Cyclopaedia_1728_page_210_Dictionary_entry.jpg/88px-Cyclopaedia_1728_page_210_Dictionary_entry.jpg)](/File:Cyclopaedia_1728_page_210_Dictionary_entry.jpg)
Ephraim Chambers, _Cyclopaedia, or an Universal Dictionary of Arts and
Sciences_, 1728, p. 210. [3](http://digicoll.library.wisc.edu/cgi-bin/HistSciTech/HistSciTech-idx?type=turn&entity=HistSciTech.Cyclopaedia01.p0576&id=HistSciTech.Cyclopaedia01&isize=L)
* [![](/images/thumb/b/b8/Detail_from_the_Liddell-Scott_Greek-English_Lexicon_c1843.jpg/160px-Detail_from_the_Liddell-Scott_Greek-English_Lexicon_c1843.jpg)](/File:Detail_from_the_Liddell-Scott_Greek-English_Lexicon_c1843.jpg)
Detail from the Liddell-Scott Greek-English Lexicon, c1843.
Dictionaries have had a long life. The ancient Greek scholar and poet Philitas
of Cos living in the 4th c. BCE wrote a vocabulary explaining the meanings of
rare Homeric and other literary words, words from local dialects, and
technical terms. The vocabulary, called _Disorderly Words_ (Átaktoi glôssai),
has been lost, with a few fragments quoted by later authors. One example is
that the word πέλλα (pélla) meant "wine cup" in the ancient Greek region of
Boeotia; contrasted to the same word meaning "milk pail" in Homer's _Iliad_.
Not much has changed in the way dictionaries constitute order. Selected
archives of statements are queried to yield occurrences of particular words,
various _criteria[indicators]_ are applied to filtering and sorting them and
in turn the spectrum of [denoted] things allocated in this way is structured
into groups and subgroups which are then given, according to another set of
rules, shorter or longer names. These constitute facets of [potential]
meanings of a word.
So there are at least _four_ sets of conditions [structuring] dictionaries.
One is required to delimit an archive[corpus of texts], one to select and give
preference[weights] to occurrences of a word, another to cluster them, and yet
another to abstract[generalize] the subject-matter of each of these clusters.
Needless to say, this is a craft of a few and these criteria are rarely
disclosed, despite their impact on research, and more generally, their
influence as conditions for production[making] of a so-called _common sense_.
It doesn't take that much to reimagine what a dictionary is and what it could
be, especially having large specialized corpora of texts at hand. These can
also serve as aids in production of new words and new meanings.
### (H) The structure: words as knowledge and the world
* [![](/images/thumb/0/02/Boethius_Porphyrys_Isagoge.jpg/120px-Boethius_Porphyrys_Isagoge.jpg)](/File:Boethius_Porphyrys_Isagoge.jpg)
Boethius's rendering of a classification tree described in Porphyry's Isagoge
(3rd c.), [6th c.] 10th c.
[4](http://www.e-codices.unifr.ch/en/sbe/0315/53/medium)
* [![](/images/thumb/d/d0/Cyclopaedia_1728_page_ii_Division_of_Knowledge.jpg/94px-Cyclopaedia_1728_page_ii_Division_of_Knowledge.jpg)](/File:Cyclopaedia_1728_page_ii_Division_of_Knowledge.jpg)
Ephraim Chambers, _Cyclopaedia, or an Universal Dictionary of Arts and
Sciences_, London, 1728, p. II. [5](http://digicoll.library.wisc.edu/cgi-bin/HistSciTech/HistSciTech-idx?type=turn&entity=HistSciTech.Cyclopaedia01.p0015&id=HistSciTech.Cyclopaedia01&isize=L)
* [![](/images/thumb/d/d6/Encyclopedie_1751_Systeme_figure_des_connaissances_humaines.jpg/116px-Encyclopedie_1751_Systeme_figure_des_connaissances_humaines.jpg)](/File:Encyclopedie_1751_Systeme_figure_des_connaissances_humaines.jpg)
Système figuré des connaissances humaines, _Encyclopédie ou Dictionnaire
raisonné des sciences, des arts et des métiers_, 1751.
[6](http://encyclopedie.uchicago.edu/content/syst%C3%A8me-figur%C3%A9-des-connaissances-humaines)
* [![](/images/thumb/9/96/Haeckel_Ernst_1874_Stammbaum_des_Menschen.jpg/96px-Haeckel_Ernst_1874_Stammbaum_des_Menschen.jpg)](/File:Haeckel_Ernst_1874_Stammbaum_des_Menschen.jpg)
Haeckel - Darwin's tree.
Another _formalized_ and [internalized] process at play when figuring
out a word is its [containment]. A word is not only structured by way of things
it potentially denotes but also by words it is potentially part of and those
it contains.
The fuss around categorization of knowledge _and_ the world in Western
thought can be traced back to Porphyry, if not further. In his introduction to
Aristotle's _Categories_ this 3rd century AD Neoplatonist began expanding the
notions of genus and species into their hypothetical consequences. Aristotle's
brief work outlines ten categories of 'things that are said' (legomena,
λεγόμενα), namely substance (or substantive, {not the same as matter!},
οὐσία), quantity (ποσόν), qualification (ποιόν), a relation (πρός), where
(ποῦ), when (πότε), being-in-a-position (κεῖσθαι), having (or state,
condition, ἔχειν), doing (ποιεῖν), and being-affected (πάσχειν). In a
different work, _Topics_, Aristotle outlines four kinds of subjects/materials
indicated in propositions/problems from which arguments/deductions start.
These are a definition (ὅρος), a genus (γένος), a property (ἴδιον), and an
accident (συμβεβηκός). Porphyry does not explicitly refer to _Topics_, and says
he omits speaking "about genera and species, as to whether they subsist (in
the nature of things) or in mere conceptions only"
[8](http://www.ccel.org/ccel/pearse/morefathers/files/porphyry_isagogue_02_translation.htm#C1),
which means he avoids explicating whether he talks about kinds of concepts or
kinds of things in the sensible world. However, the work sparked confusion, as
the following passage [suggests]:
> "[I]n each category there are certain things most generic, and again, others
most special, and between the most generic and the most special, others which
are alike called both genera and species, but the most generic is that above
which there cannot be another superior genus, and the most special that below
which there cannot be another inferior species. Between the most generic and
the most special, there are others which are alike both genera and species,
referred, nevertheless, to different things, but what is stated may become
clear in one category. Substance indeed, is itself genus, under this is body,
under body animated body, under which is animal, under animal rational animal,
under which is man, under man Socrates, Plato, and men particularly." (Owen 1853,
[9](http://www.ccel.org/ccel/pearse/morefathers/files/porphyry_isagogue_02_translation.htm#C2))
Porphyry took one of Aristotle's ten categories of the word, substance, and
dissected it using one of his four rhetorical devices, genus. Employing
Aristotle's categories, genera and species as means for logical operations,
for dialectic, Porphyry's interpretation resulted in having more resemblance
to the perceived _structures_ of the world. So they began to bloom.
There were earlier examples, but Porphyry was the most influential in
injecting the _universalist_ version of classification [implying] the figure
of a tree into the [locus] of Aristotle's thought. Knowledge became
monotheistic.
Classification schemes [growing from one point] play a major role in
untangling the format of the modern encyclopedia from that of the dictionary
governed by alphabet. Two of the most influential encyclopedias of the 18th
century are cases in point. Although still keeping 'dictionary' in their
titles, they are conceived not to represent words but knowledge. The [upper-
most] genus of the body was set as the body of knowledge. The English
_Cyclopaedia, or an Universal Dictionary of Arts and Sciences_ (1728) splits
into two main branches: "natural and scientifical" and "artificial and
technical"; these further split down to 47 classes in total, each carrying a
structured list (on the following pages) of thematic articles, serving as
table of contents. The French _Encyclopedia: or a Systematic Dictionary of the
Sciences, Arts, and Crafts_ (1751) [unwinds] from understanding ( _entendement_ ),
branches into memory as history, reason as philosophy, and imagination as
poetry. The logic of containers was employed as an aid not only to deal with
the enormous task of naming and not omitting anything from what is known, but
also for the management of labour of hundreds of writers and researchers, to
create a mechanism for delegating work and the distribution of
responsibilities. Flesh was also more present, in the field research, with
researchers attending workshops and sites of everyday life to annotate it.
The world came forward to outshine the word in other schemes. Darwin's tree of
evolution and some of the modern document classification systems such as
Charles A. Cutter's _Expansive Classification_ (1882) set out to classify the
world itself and set the field for what has come to be known as authority
lists structuring metadata in today's computing.
### The structure (summary)
Facetization of meaning and branching of knowledge are both the domain of the
unit of utterance.
While lexicographers[dictionarists] structure thought through multi-layered
processes of abstraction of the written record, knowledge growers dissect it
into hierarchies of [mutually] contained notions.
One seeks to describe the word as a faceted list of small worlds, another to
describe the world as a structured list of words. One plays a primary role in the
domain of epistemology, in what is known, controlling the vocabulary, another
in the domain of ontology, in what is, controlling reality.
Every [word] has its given things, every thing has its place, closer or
further from a single word.
The schism between classifying words and classifying the world implies it is
not possible to construct a universal classification scheme[system]. On top of
that, any classification system of words is bound to a corpus of texts it is
operating upon and any classification system of the world again operates with
words which are bound to a vocabulary[lexicon] which is again bound to a
corpus [of texts]. This does not prevent people from trying.
Classifications function as descriptors of and 'inscriptors' upon the world,
imprinting their authority. They operate from [a locus of] their
corpus[context]-specificity. The larger the corpus, the more power it has in
shaping the world, as far as the word shapes it (yes, I do imply Google here,
for which it is a domain to be potentially exploited).
## (J) The sequence
The structure-yielding query [of] the single word [shrinks][narrows, becomes
more precise] with preceding and following words. Inquiry proceeds in the flow
that establishes another kind[mode] of relationality, chaining words into the
sequence. While the structuring property of the query brings words apart from
each other, its sequential property establishes continuity and brings these
units into an ordered set.
This is what is responsible for attaching textual figures mentioned earlier
(lists, schemes, tables) to the body of the text. Associations can be also
stated explicitly, by indexing tables and then referring to them from a
particular point in the text. The same goes for explicit associations made
between blocks of the text by means of indexed paragraphs, chapters or pages.
From this it follows that all utterances point to the following utterance by the
nature of sequential order, and indexing provides means for pointing elsewhere
in the document as well.
A lot can be said about references to other texts. Here, to spare time, I
would refer you to a talk I gave a few months ago and which is online
[10](http://monoskop.org/Talks/Communing_Texts).
This is still the realm of print. What happens to a document when it is
digitized?
Digitization breaks a document into units of which each is assigned a numbered
position in the sequence of the document. From this perspective digitization
can be viewed as a total indexation of the document. It is converted into
units rendered for machine operations. This sequentiality is made explicit, by
means of an underlying index.
Sequences and chains are orders of one dimension. Their one-dimensional
ordering allows addressability of each element and [random] access. [Jumps]
between [random] addresses are still sequential, processing elements one at a
time.
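A minimal sketch of this "total indexation", assuming that the unit is the whitespace-separated word (a digitized document could just as well be cut into characters, bytes or pages). The sample sentence and the names `index_units`, `units` and `positions` are illustrative.

```python
def index_units(text):
    """Split a text into word units and pair each with its position."""
    return list(enumerate(text.split()))

units = index_units("every word is a query about what it denotes")

# The explicit index: each numbered position addresses exactly one unit.
positions = dict(units)

# Direct ("random") access by address, without reading the preceding words.
word_at_4 = positions[4]

# The reverse lookup still walks the sequence one element at a time.
addresses_of_query = [pos for pos, unit in units if unit == "query"]
```

The sequence is one-dimensional: a single integer is enough to address any unit, and any jump between addresses lands on one element at a time.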
## (K) The index
* [![](/images/thumb/2/27/Summa_confessorum.1310.jpg/103px-Summa_confessorum.1310.jpg)](/File:Summa_confessorum.1310.jpg)
Summa confessorum [1297-98], 1310.
[7](http://www.bl.uk/onlinegallery/onlineex/illmanus/roymanucoll/j/011roy000008g11u00002000.html)
[The] sequencing not only weaves words into statements but activates other
temporalities, and _presents occurrences of words from past statements_. As
now when I am saying the word _utterance_ , each time there surface contexts
in which I have used it earlier.
A long quote from Frederick G. Kilgour, _The Evolution of the Book_, 1998, pp.
76-77:
> "A century of invention of various types of indexes and reference tools
preceded the advent of the first subject index to a specific book, which
occurred in the last years of the thirteenth century. The first subject
indexes were "distinctions," collections of "various figurative or symbolic
meanings of a noun found in the scriptures" that "are the earliest of all
alphabetical tools aside from dictionaries." (Richard and Mary Rouse supply an
example: "Horse = Preacher. Job 39: 'Hast thou given the horse strength, or
encircled his neck with whinnying?')
>
> [Concordance] By the end of the third decade of the thirteenth century Hugh
de Saint-Cher had produced the first word concordance. It was a simple word
index of the Bible, with every location of each word listed by [its position
in the Bible specified by book, chapter, and letter indicating part of the
chapter]. Hugh organized several dozen men, assigning to each man an initial
letter to search; for example, the man assigned M was to go through the entire
Bible, list each word beginning with M and give its location. As it was soon
perceived that this original reference work would be even more useful if words
were cited in context, a second concordance was produced, with each word in
lengthy context, but it proved to be unwieldy. [Soon] a third version was
produced, with words in contexts of four to seven words, the model for
biblical concordances ever since.
>
> [Subject index] The subject index, also an innovation of the thirteenth
century, evolved over the same period as did the concordance. Most of the
early topical indexes were designed for writing sermons; some were organized,
while others were apparently sequential without any arrangement. By midcentury
the entries were in alphabetical order, except for a few in some classified
arrangement. Until the end of the century these alphabetical reference works
indexed a small group of books. Finally John of Freiburg added an alphabetical
subject index to his own book, _Summa Confessorum_ (1297—1298). As the Rouses
have put it, 'By the end of the [13]th century the practical utility of the
subject index is taken for granted by the literate West, no longer solely as
an aid for preachers, but also in the disciplines of theology, philosophy, and
both kinds of law.'"
In one sense neither the subject-index nor the concordance is an index: they are words
or groups of words selected according to given criteria from the body of the
text, each accompanied by a list of identifiers. These identifiers are
elements of an index, whether they represent a page, chapter, column, or other
[kind of] block of text. Every identifier is a unique _address_.
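The concordance described in the quote, words listed with their locations and a few words of context, can be sketched in a few lines. This is an illustration under assumptions, not a reconstruction of the medieval procedure: the address here is simply the word's position in the sequence, and the context width of three words on each side is an arbitrary choice.

```python
import re
from collections import defaultdict

def concordance(text, width=3):
    """Map each word to a list of (address, context) pairs.

    The address is the word's position in the sequence; the context is
    the word together with up to `width` words on either side.
    """
    words = re.findall(r"\w+", text.lower())
    entries = defaultdict(list)
    for pos, word in enumerate(words):
        start = max(0, pos - width)
        context = " ".join(words[start:pos + width + 1])
        entries[word].append((pos, context))
    return dict(entries)

sample = "the word is a query and the query is a word"
table = concordance(sample)
```

Looking up `table["query"]` returns both occurrences with their addresses, which is the sense in which a concordance is a table of search results prepared in advance.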
The index is thus an ordering of a sequence by means of associating its
elements with a set of symbols, where each element is given a unique combination
of symbols. Different sizes of sets yield different numbers of variations.
Symbol sets such as an alphabet, Arabic numerals, Roman numerals, and binary
digits have different proportions between the length of a string of symbols
and the number of possible variations it can contain. Thus two symbols of the
English alphabet can store 26^2 various values, of Arabic numerals 10^2, of
Roman numerals 7^2 and of binary digits 2^2.
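The arithmetic can be checked with a short sketch. It assumes a purely positional use of the symbols, in which any symbol may stand at any place in the string; the function names are illustrative.

```python
def capacity(symbols, length):
    """Number of distinct strings of `length` drawn from `symbols` symbols."""
    return symbols ** length

def min_length(symbols, items):
    """Shortest string length that gives `items` distinct addresses."""
    length = 1
    while symbols ** length < items:
        length += 1
    return length

two_letters = capacity(26, 2)   # English alphabet
two_digits = capacity(10, 2)    # Arabic numerals
two_bits = capacity(2, 2)       # binary digits

# How many binary digits are needed to address as many elements
# as two letters of the alphabet can?
bits_needed = min_length(2, two_letters)
```

The smaller the symbol set, the longer the identifier needed to address the same number of elements.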
Indexation is segmentation, a breaking into segments. From as early as the
13th century the index, such as that of sections, has served as an enabler of
search. The more [detailed] the indexation, the more precise the search results
it enables.
The subject-index and concordance are tables of search results. There is a
direct lineage from the 13th-century biblical concordances to the birth of
computational linguistic analysis: both were initiated and realised by
priests.
During World War II, Jesuit Father Roberto Busa began to look for machines
for the automation of the linguistic analysis of the 11 million-word Latin
corpus of Thomas Aquinas and related authors.
Working on his Ph.D. thesis on the concept of _praesens_ in Aquinas he
realised two things:
> "I realized first that a philological and lexicographical inquiry into the
verbal system of an author has to precede and prepare for a doctrinal
interpretation of his works. Each writer expresses his conceptual system in
and through his verbal system, with the consequence that the reader who
masters this verbal system, using his own conceptual system, has to get an
insight into the writer's conceptual system. The reader should not simply
attach to the words he reads the significance they have in his mind, but
should try to find out what significance they had in the writer's mind.
Second, I realized that all functional or grammatical words (which in my mind
are not 'empty' at all but philosophically rich) manifest the deepest logic of
being which generates the basic structures of human discourse. It is this
basic logic that allows the transfer from what the words mean today to what
they meant to the writer.
>
> In the works of every philosopher there are two philosophies: the one which
he consciously intends to express and the one he actually uses to express it.
The structure of each sentence implies in itself some philosophical
assumptions and truths. In this light, one can legitimately criticize a
philosopher only when these two philosophies are in contradiction."
[11](http://www.alice.id.tue.nl/references/busa-1980.pdf)
Busa began collaborating with IBM in New York in 1949, and the work, a
concordance of all the words of Thomas Aquinas, was finally published in the
1970s in 56 printed volumes (a version has been online since 2005
[12](http://www.corpusthomisticum.org/it/index.age)). Besides that, an
electronic lexicon for automatic lemmatization of Latin words was created by a
team of ten priests within two years (in two phases: grouping all the
forms of an inflected word under their lemma, and coding the morphological
categories of each form and lemma), containing 150,000 forms
[13](http://www.alice.id.tue.nl/references/busa-1980.pdf#page=4). Father
Busa has been dubbed the father of humanities computing and recently also of
digital humanities.
The subject-index has a crucial role in the printed book. It is the only means
for search the book offers. Subjects composing an index can be selected
according to a classification scheme (specific to a field of inquiry), for
example as elements of a certain degree (with a given minimum number of
subclasses).
Its role seemingly vanishes in the digital text. But it can be easily
transformed. Besides serving as a table of pre-searched results the
subject-index also gives a distinct idea about the content of the book. Two
patterns give us a clue: numbers of occurrences of selected words give subjects
weights, while words that seem specific to the book outweigh others even if they don't
occur very often. A selection of these words then serves as a descriptor of
the whole text, and can be thought of as a specific kind of 'tags'.
This process was formalized in a mathematical function in the 1970s, thanks to
a formula by Karen Spärck Jones which she entitled 'inverse document
frequency' (IDF), or in other words, "term specificity". It is measured as the
ratio of the total number of texts in the corpus to the number of texts where
the word appears at least once, usually taken as a logarithm. When multiplied
by the frequency of the word _in_ the text (divided by the maximum frequency
of any word in the text), we get _term frequency-inverse document frequency_
(tf-idf). In this way we can get an automated list of subjects which are
particular to the text when compared to a group of texts.
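As a minimal sketch of this weighting, assuming plain-text files and the
logarithmic form of IDF, the weight of a word w in a text d can be written as
tf-idf(w, d) = (f(w, d) / max f(d)) × log(N / n(w)), where f is the raw count,
N the number of texts and n(w) the number of texts containing w. The shell
function below computes it with awk; the name `tfidf`, the crude ASCII-only
tokenisation and the natural logarithm are assumptions made for illustration,
not part of the original formula.

```shell
# tfidf TARGET FILE...
# Prints "weight word" for every word of TARGET, highest weight first.
# TARGET must also appear, with the same path, exactly once among FILE...
tfidf() {
  target=$1
  shift
  LC_ALL=C awk -v target="$target" '
    FNR == 1 { n++ }                        # N: number of texts in the corpus
    {
      for (i = 1; i <= NF; i++) {
        w = tolower($i)
        gsub(/[^a-z]/, "", w)               # crude: keep ASCII letters only
        if (w == "") continue
        key = FILENAME SUBSEP w
        if (!(key in seen)) { seen[key] = 1; df[w]++ }  # n(w): texts with w
        if (FILENAME == target) {
          tf[w]++                           # f(w, d): raw count in TARGET
          if (tf[w] > max) max = tf[w]      # highest count of any word
        }
      }
    }
    END {
      for (w in tf)
        printf "%.4f %s\n", (tf[w] / max) * log(n / df[w]), w
    }
  ' "$@" | LC_ALL=C sort -k1,1nr -k2,2
}
```

Called as `tfidf text.1.txt text.*.txt`, it lists the words of the first text
by weight. A word that occurs in every text gets a weight of zero, however
frequent it is.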
We have come to learn it through the practice of searching the web. It is a
mechanism not dissimilar to the thought process involved in retrieving
particular information online. Search engines have it built into their
indexing algorithms as well.
There is a paper proposing to attach words generated by tf-idf to hyperlinks
when referring to websites
14(http://bscit.berkeley.edu/cgi-bin/pl_dochome?query_src=&format=html&collection=Wilensky_papers&id=3&show_doc=yes).
This would make it possible to find the referred content even after the link
is dead. Hyperlinks in the references of the paper use this feature, and it
can be easily tested:
15(http://www.cs.berkeley.edu/~phelps/papers/dissertation-abstract.html?lexical-signature=notemarks+multivalent+semantically+franca+stylized).
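The idea can be sketched in a few lines of shell, under the assumption that a
ranked list of `weight word` lines (highest weight first) is already available
on standard input. The function name `lexical_signature`, the default of five
terms and the handling of an existing query string are illustrative choices;
only the `lexical-signature=` parameter with `+`-joined words is taken from the
example link above.

```shell
# lexical_signature URL [K]
# Reads "weight word" lines (highest weight first) from standard input and
# prints URL with the K top words appended as a lexical signature.
lexical_signature() {
  url=$1
  k=${2:-5}                                 # assumed default: five terms
  # Keep the first K lines and join their second field with "+".
  sig=$(head -n "$k" | awk '{ printf "%s%s", (NR > 1 ? "+" : ""), $2 }
                            END { print "" }')
  case $url in
    *\?*) printf '%s&lexical-signature=%s\n' "$url" "$sig" ;;  # has a query
    *)    printf '%s?lexical-signature=%s\n' "$url" "$sig" ;;
  esac
}
```

If the address later stops resolving, the appended words can still be pasted
into a search engine to look for a surviving copy of the text.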
There is another measure, cosine similarity, which takes tf-idf further and
can be applied to cluster texts according to similarities in their
specificity. This might be interesting as a feature for digital libraries, or
even as a way of organising a library bottom-up into novel categories from
which new discourses could emerge. It could also aid researchers in sorting
through texts, or editors in producing interesting anthologies.
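A minimal sketch of the measure, assuming each text has already been reduced
to a file of `weight word` lines (for instance tf-idf weights): the cosine
similarity is the dot product of the two weight vectors divided by the product
of their lengths. The function name `cosine` and the file format are
assumptions made for this illustration.

```shell
# cosine FILE_A FILE_B
# Each file holds "weight word" lines, one word per line.
# Prints the cosine similarity of the two weight vectors.
cosine() {
  LC_ALL=C awk '
    NR == FNR { a[$2] = $1; na += $1 * $1; next }   # first file: vector A
    { nb += $1 * $1                                 # second file: vector B
      if ($2 in a) dot += a[$2] * $1 }              # shared words only
    END {
      if (na == 0 || nb == 0) print "0.0000"        # empty vector: no match
      else printf "%.4f\n", dot / sqrt(na * nb)
    }
  ' "$1" "$2"
}
```

With non-negative weights, a value near 1 means the two texts are specific in
the same words, and 0 means they share no weighted words. Clustering would then
group texts whose pairwise values are high.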
## Final remarks
1
New disciplines emerge all the time - most recently, for example, cultural
techniques, software studies, or media archaeology. It takes years, even
decades, before they gain dedicated shelves in libraries or a category in
interlibrary digital repositories. Not that it matters that much. They are not
only sites of academic opportunities but, firstly, frameworks of new
perspectives of looking at the world, new domains of knowledge. From the
perspective of a researcher, partaking in a discipline involves negotiating
its vocabulary, classifications, corpus, reference field, and specific
terms [subjects]. Creating new fields involves all that, and more. Even when
one goes against all disciplines.
2
Google can still surprise us.
3
Knowledge has been in the making for millennia. There have been (abstract)
mechanisms established that govern its conditions. We now possess specialized
corpora of texts which are interesting enough to serve as a ground to discuss
and experiment with dictionaries, classifications, indexes, and tools for
reference retrieval. These all belong to the poetic devices of
knowledge-making.
4
Command-line example of tf-idf and concordance in 3 steps.
* 1\. Process the files text.1-5.txt and produce freq.1-5.txt with lists of (nonlemmatized) words (in respective texts), ordered by frequency:
> for i in {1..5}; do tr '[A-Z]' '[a-z]' < text.$i.txt | tr -c '[a-z]'
'[\012*]' | tr -d '[:punct:]' | sort | uniq -c | sort -k 1nr | sed '1,1d' >
temp.txt; max=$(awk -vvar=1 -F" " 'NR […]
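
For the concordance part, a keyword-in-context listing can be sketched with
standard tools. This is a hypothetical helper, not the author's command: the
name `concordance` and its arguments are assumptions, the word is matched as a
plain substring regardless of case, and `grep -o` is a common extension found
in GNU and BSD grep rather than part of POSIX.

```shell
# concordance WORD N FILE...
# Prints every occurrence of WORD with up to N characters of context
# on each side, one occurrence per line.
concordance() {
  word=$1
  n=$2
  shift 2
  # Join the lines so that context can run across line breaks, then let
  # grep print only the matching stretches.
  cat "$@" | tr '\n' ' ' | grep -o -i ".\{0,$n\}$word.\{0,$n\}"
}
```

For example, `concordance subject 200 text.1.txt` would show each occurrence
with 200 characters on either side. Occurrences that lie closer together than
the chosen context may end up on a single line.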