Barok
Poetics of Research
2014


_An unedited version of a talk given at the conference [Public
Library](http://www.wkv-stuttgart.de/en/program/2014/events/public-library/)
held at Württembergischer Kunstverein Stuttgart, 1 November 2014._

_Bracketed sequences are to be reformulated._

Poetics of Research

In this talk I'm going to attempt to identify particular cultural
algorithms, i.e. processes in which cultural practices and software meet. With
them a sphere is implied in which algorithms gather to form bodies of
practices and in which cultures gather around algorithms. I'm going to
approach them through the perspective of my practice as a cultural worker,
editor and artist, considering practice to be of the same rank as theory and
poetics, and where theorization of practice can also lead to the
identification of poetical devices.

The primary motivation for this talk is an attempt to figure out where we
stand as operators, users and communities gathering around infrastructures
containing a massive body of text (among other things), and what sort of things
might be considered to make a difference, or to keep making a difference.

The talk mainly considers the role of text and the word in research, by way
of several figures.

A

A reference, list, scheme, table, index; those things that intervene in the
flow of narrative, illustrating the point, perhaps more economically than
linear text would do. Yet they don't function as pictures; they are
primarily texts, arranged in figures. Their forms have been standardised
over centuries and withstood the transition to the
digital without any significant change, being completely intuitive to the
modern reader. Compared to the body of text they are secondary, running parallel
to it. Their function is however different from that of punctuation. They
are there neither to shape the narrative nor to aid structuring the argument
into logical blocks. Nor is their function spatial, as in visual poems.
Their positions within a document are determined according to the sequential
order of the text, standing as attachments, and they are there to clarify the
nature of relations among elements of the subject-matter, or to establish
relations with other documents. The premise of my talk is that these
_textual figures_ also came to serve as the abstract, relational models
determining possible relations among documents as such, and in consequence to
structure the conditions of research.

B

It can be said that research, as inquiry into a subject-matter, consists of
discrete queries. A query, such as a question about what something is, what
kinds, parts and properties it has, and so on, can be consulted in
existing documents, or it can generate new documents based on the collection of
data in the field and through experiment, before proceeding to reasoning,
arguments and deductions. The formulation of a query is determined by the
protocols providing access to documents, which means that there is a difference
between collecting data outside the archive (the undocumented, i.e. in the field
and through experiment), consulting with a person--an archivist (expert,
librarian, documentalist), and consulting with a database storing documents.
Phenomena such as the deepening of specialization and thorough digitization
have privileged the database as the fundamental means for research. Obviously,
this is a very recent development. Queries were once formulated in natural
language; now, given the fact that databases are queried in the SQL language,
their interfaces are mere extensions of it, and researchers pose their
questions by manipulating dropdowns, checkboxes and input boxes mashed together
on a flat screen run by software that in turn translates them into a long line
of conditioned _SELECTs_ and _JOINs_ performed on tables of data.
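
To make that translation concrete, here is a minimal sketch of the kind of statement such an interface might emit, written for the sqlite3 command-line shell; the database, table and column names are hypothetical:

> sqlite3 library.db "SELECT documents.title, authors.name FROM documents JOIN authors ON authors.id = documents.author_id WHERE documents.year >= 1900 AND documents.subject = 'poetics' ORDER BY documents.year;"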

Specialization, digitization and networking have changed the language of
questioning. Inquiry, once attached to flesh and paper, has been
entrusted to the digital and networked. Researchers are querying the black
box.

C

Searching in a collection of amassed tangible documents (i.e. a
bookshelf) is different from searching in a systematically structured
repository (a library) and even more so from searching in a digital repository
(a digital library). Not that they are mutually exclusive. One can devise
structures and algorithms to search through a printed text, or read books in a
library one by one. They are rather models embodying various processes
associated with the query. These properties of the query might be called the
sequence, the structure and the index. If they are present in the ways of
querying documents, and we will return to this issue, are they persistent
within the inquiry as such? [wait]

D

This question itself is a rupture in the sequence. It makes a demand to depart
from one narrative, a continuous flow of words, to another, to figure something
out while remaining bound to it (it would be even more so with a so-called
rhetorical question). So there has been one sequence, or line, of the
inquiry--about the kinds of the query and its properties. That sequence itself
is a digression from within the sequence about what research is and what its
parts (queries) are. We are thus returning to it and continue with the question
whether the properties of the inquiry are the same as the properties of the query.

E

But isn't it true that every single utterance occurring in a sequence yields a
query as well? Let's consider the word _utterance_. [wait] It can produce a
number of associations, for example with how Foucault employs the notion of
_énoncé_ in his _Archaeology of Knowledge_, giving a hard time to his English
translators wondering whether _utterance_ or _statement_ is more appropriate,
or whether they are interchangeable, and what impact each choice would have on
his reception in the Anglophone world. Limiting ourselves to textual forms for
now (and not translating his work but pursuing a different inquiry), let us say
the utterance is a word, a phrase or an idiom in a sequence such as a
sentence, a paragraph, or a document.

## (F) The structure

This distinction is as old as recorded Western thought, since both Plato and
Aristotle differentiate between a word on its own ("the said", a thing said)
and words in the company of other words. For example, Aristotle's _Categories_
rests on the notion of words on their own, and they are made the subject-
matter of that inquiry. For him, the ambiguity of the connotations words
produce lies in their synonymity, understood differently from the moderns--
not as more words denoting a similar thing but rather one word denoting
various things. Categories were outlined as a device to differentiate among
words according to the kinds of these things. Every word as such belonged to no
less and no more than one of ten categories.

So it happens to the word _utterance_, as to any other word uttered in a
sequence, that it poses a question, a query about which share of the spectrum
of possibly denoted things might turn out to be the most appropriate in a given
context. The more context, the more precise the share that comes to the fore.
When taken out of context, ambiguity prevails as the spectrum unveils in its variety.

Thus single words, like any other utterances, are questions, queries,
themselves, and by occurring in statements, in context, their meanings are being
singled out.

This process is _conditioned_ by what has been formalized as the techniques of
_regulating_ definitions of words.

### (G) The structure: words as words

* P.Oxy.XX 2260 i: Oxyrhynchus papyrus XX, 2260, column i, with quotation from Philitas, early 2nd c. CE. 1(http://163.1.169.40/cgi-bin/library?e=q-000-00---0POxy--00-0-0--0prompt-10---4------0-1l--1-en-50---20-about-2260--00031-001-0-0utfZz-8-00&a=d&c=POxy&cl=search&d=HASH13af60895d5e9b50907367) 2(http://en.wikipedia.org/wiki/File:POxy.XX.2260.i-Philitas-highlight.jpeg)

* Ephraim Chambers, _Cyclopaedia, or an Universal Dictionary of Arts and Sciences_, 1728, p. 210. 3(http://digicoll.library.wisc.edu/cgi-bin/HistSciTech/HistSciTech-idx?type=turn&entity=HistSciTech.Cyclopaedia01.p0576&id=HistSciTech.Cyclopaedia01&isize=L)

* Detail from the Liddell-Scott Greek-English Lexicon, c1843.

Dictionaries have had a long life. The ancient Greek scholar and poet Philitas
of Cos, living in the 4th c. BCE, wrote a vocabulary explaining the meanings of
rare Homeric and other literary words, words from local dialects, and
technical terms. The vocabulary, called _Disorderly Words_ (Átaktoi glôssai),
has been lost, save for a few fragments quoted by later authors. One example is
that the word πέλλα (pélla) meant "wine cup" in the ancient Greek region of
Boeotia, in contrast to the same word meaning "milk pail" in Homer's _Iliad_.

Not much has changed in the way dictionaries constitute order. Selected
archives of statements are queried to yield occurrences of particular words,
various criteria are applied to filter and sort them, and in turn the spectrum
of denoted things allocated in this way is structured into groups and subgroups,
which are then given, according to another set of rules, shorter or longer
names. These constitute facets of the potential meanings of a word.

So there are at least _four_ sets of conditions structuring dictionaries.
One is required to delimit an archive, a corpus of texts; one to select and
give weights to occurrences of a word; another to cluster them; and yet
another to generalize the subject-matter of each of these clusters.
Needless to say, this is a craft of a few, and these criteria are rarely
disclosed, despite their impact on research and, more generally, their
influence as conditions for the production of so-called _common sense_.

It doesn't take that much to reimagine what a dictionary is and what it could
be, especially having large specialized corpora of texts at hand. These can
also serve as aids in the production of new words and new meanings.
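
As a minimal sketch of such an aid, assuming a directory of plain-text files corpus/*.txt and an example word such as _utterance_, the occurrences of a word can be pulled out with a few words of context and ranked by frequency:

> grep -ohiE '([[:alnum:]]+ ){0,4}utterance( [[:alnum:]]+){0,4}' corpus/*.txt | sort | uniq -c | sort -rn | head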

### (H) The structure: words as knowledge and the world

* Boethius's (6th c.) rendering of a classification tree described in Porphyry's Isagoge (3rd c.), 10th-c. manuscript. 4(http://www.e-codices.unifr.ch/en/sbe/0315/53/medium)

* Ephraim Chambers, _Cyclopaedia, or an Universal Dictionary of Arts and Sciences_, London, 1728, p. II. 5(http://digicoll.library.wisc.edu/cgi-bin/HistSciTech/HistSciTech-idx?type=turn&entity=HistSciTech.Cyclopaedia01.p0015&id=HistSciTech.Cyclopaedia01&isize=L)

* Système figuré des connaissances humaines, _Encyclopédie ou Dictionnaire raisonné des sciences, des arts et des métiers_, 1751. 6(http://encyclopedie.uchicago.edu/content/syst%C3%A8me-figur%C3%A9-des-connaissances-humaines)

* Ernst Haeckel, Stammbaum des Menschen, 1874 (Darwin's tree).

Another _formalized_ and internalized process at play when figuring out a word
is its containment. A word is structured not only by way of the things it
potentially denotes but also by the words it is potentially part of and those
it contains.

The fuzz around the categorization of knowledge _and_ the world in Western
thought can be traced back to Porphyry, if not further. In his introduction to
Aristotle's _Categories_, this 3rd-century AD Neoplatonist began expanding the
notions of genus and species into their hypothetical consequences. Aristotle's
brief work outlines ten categories of 'things that are said' (legomena,
λεγόμενα), namely substance (or substantive, not the same as matter,
οὐσία), quantity (ποσόν), qualification (ποιόν), relation (πρός τι), where
(ποῦ), when (πότε), being-in-a-position (κεῖσθαι), having (or state,
condition, ἔχειν), doing (ποιεῖν), and being-affected (πάσχειν). In another
work, _Topics_, Aristotle outlines four kinds of subjects/materials
indicated in the propositions/problems from which arguments/deductions start.
These are definition (ὅρος), genus (γένος), property (ἴδιος), and
accident (συμβεβηκός). Porphyry does not explicitly refer to _Topics_, and says
he omits speaking "about genera and species, as to whether they subsist (in
the nature of things) or in mere conceptions only"
8(http://www.ccel.org/ccel/pearse/morefathers/files/porphyry_isagogue_02_translation.htm#C1),
which means he avoids explicating whether he talks about kinds of concepts or
kinds of things in the sensible world. The work nevertheless sparked confusion,
as the following passage suggests:

> "[I]n each category there are certain things most generic, and again, others
most special, and between the most generic and the most special, others which
are alike called both genera and species, but the most generic is that above
which there cannot be another superior genus, and the most special that below
which there cannot be another inferior species. Between the most generic and
the most special, there are others which are alike both genera and species,
referred, nevertheless, to different things, but what is stated may become
clear in one category. Substance indeed, is itself genus, under this is body,
under body animated body, under which is animal, under animal rational animal,
under which is man, under man Socrates, Plato, and men particularly." (Owen
1853,
9(http://www.ccel.org/ccel/pearse/morefathers/files/porphyry_isagogue_02_translation.htm#C2))

Porphyry took one of Aristotle's ten categories of the word, substance, and
dissected it using one of his four rhetorical devices, genus. By employing
Aristotle's categories, genera and species as means for logical operations,
for dialectic, Porphyry's interpretation came to bear more resemblance
to the perceived _structures_ of the world. So they began to bloom.

There were earlier examples, but Porphyry was the most influential in
injecting the _universalist_ version of classification, implying the figure
of a tree, into the locus of Aristotle's thought. Knowledge became
monotheistic.

Classification schemes growing from a single point play a major role in
untangling the format of the modern encyclopedia from that of the dictionary
governed by the alphabet. Two of the most influential encyclopedias of the 18th
century are cases in point. Although still keeping 'dictionary' in their
titles, they are conceived to represent not words but knowledge. The uppermost
genus of the body was set as the body of knowledge. The English
_Cyclopaedia, or an Universal Dictionary of Arts and Sciences_ (1728) splits
into two main branches, "natural and scientifical" and "artificial and
technical"; these further split down into 47 classes in total, each carrying a
structured list (on the following pages) of thematic articles, serving as a
table of contents. The French _Encyclopedia: or a Systematic Dictionary of the
Sciences, Arts, and Crafts_ (1751) unwinds from understanding (_entendement_)
and branches into memory as history, reason as philosophy, and imagination as
poetry. The logic of containers was employed as an aid not only to deal with
the enormous task of naming, and of not omitting anything from what is known,
but also for the management of the labour of hundreds of writers and
researchers, to create a mechanism for delegating work and distributing
responsibilities. Flesh was also more present, in field research, with
researchers attending workshops and sites of everyday life to annotate them.

The world came forward to outshine the word in other schemes. Darwin's tree of
evolution and some of the modern document classification systems, such as
Charles A. Cutter's _Expansive Classification_ (1882), set out to classify the
world itself and set the stage for what has come to be known as authority
lists structuring metadata in today's computing.

### The structure (summary)

Facetization of meaning and branching of knowledge are both the domain of the
unit of utterance.

While lexicographers structure thought through multi-layered
processes of abstraction of the written record, knowledge growers dissect it
into hierarchies of mutually contained notions.

One seeks to describe the word as a faceted list of small worlds, the other to
describe the world as a structured list of words. One plays prime in the
domain of epistemology, in what is known, controlling the vocabulary; the other
in the domain of ontology, in what is, controlling reality.

Every word has its given things, every thing has its place, closer or
further from a single word.

The schism between classifying words and classifying the world implies that it
is not possible to construct a universal classification scheme. On top of
that, any classification system of words is bound to the corpus of texts it
operates upon, and any classification system of the world again operates with
words which are bound to a vocabulary, which is again bound to a
corpus of texts. Not that this prevents people from trying.
Classifications function as descriptors of and 'inscriptors' upon the world,
imprinting their authority. They operate from the locus of their
corpus-specificity. The larger the corpus, the more power it has in
shaping the world, as far as the word shapes it (yes, I do imply Google here,
for which this is a domain to be potentially exploited).

## (J) The sequence

The structure-yielding query of the single word narrows and becomes more
precise with preceding and following words. Inquiry proceeds in a flow
that establishes another mode of relationality, chaining words into the
sequence. While the structuring property of the query sets words apart from
each other, its sequential property establishes continuity and brings these
units into an ordered set.

This is what is responsible for attaching the textual figures mentioned earlier
(lists, schemes, tables) to the body of the text. Associations can also be
stated explicitly, by indexing tables and then referring to them from a
particular point in the text. The same goes for explicit associations made
between blocks of the text by means of indexed paragraphs, chapters or pages.

From this it follows that all utterances point to the following utterance by the
nature of sequential order, and that indexing provides the means for pointing
elsewhere in the document as well.

A lot can be said about references to other texts. Here, to spare time, I
would refer you to a talk I gave a few months ago and which is online
10(http://monoskop.org/Talks/Communing_Texts).

This is still the realm of print. What happens to a document when it is
digitized?

Digitization breaks a document into units, each of which is assigned a numbered
position in the sequence of the document. From this perspective digitization
can be viewed as a total indexation of the document. It is converted into
units rendered for machine operations. Its sequentiality is made explicit, by
means of an underlying index.

Sequences and chains are orders of one dimension. Their one-dimensional
ordering allows the addressability of each element and random access. Jumps
between arbitrary addresses are still sequential, processing elements one at a
time.
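
A crude way to see this total indexation at work, assuming a digitized text in a file text.txt and GNU grep, is to ask for the byte offset of every occurrence of a word; each hit comes back as an address into the one-dimensional sequence:

> grep -ob 'utterance' text.txt | head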

## (K) The index

* Summa confessorum [1297-98], 1310. 7(http://www.bl.uk/onlinegallery/onlineex/illmanus/roymanucoll/j/011roy000008g11u00002000.html)

Sequencing not only weaves words into statements but activates other
temporalities, and _presents occurrences of words from past statements_. As
now, when I say the word _utterance_, each time contexts surface in which I
have used it earlier.

A long quote from Frederick G. Kilgour, _The Evolution of the Book_, 1998,
pp. 76-77:

> "A century of invention of various types of indexes and reference tools
preceded the advent of the first subject index to a specific book, which
occurred in the last years of the thirteenth century. The first subject
indexes were "distinctions," collections of "various figurative or symbolic
meanings of a noun found in the scriptures" that "are the earliest of all
alphabetical tools aside from dictionaries." (Richard and Mary Rouse supply an
example: "Horse = Preacher. Job 39: 'Hast thou given the horse strength, or
encircled his neck with whinning?')

>

> [Concordance] By the end of the third decade of the thirteenth century Hugh
de Saint-Cher had produced the first word concordance. It was a simple word
index of the Bible, with every location of each word listed by [its position
in the Bible specified by book, chapter, and letter indicating part of the
chapter]. Hugh organized several dozen men, assigning to each man an initial
letter to search; for example, the man assigned M was to go through the entire
Bible, list each word beginning with M and give its location. As it was soon
perceived that this original reference work would be even more useful if words
were cited in context, a second concordance was produced, with each word in
lengthy context, but it proved to be unwieldy. [Soon] a third version was
produced, with words in contexts of four to seven words, the model for
biblical concordances ever since.

>

> [Subject index] The subject index, also an innovation of the thirteenth
century, evolved over the same period as did the concordance. Most of the
early topical indexes were designed for writing sermons; some were organized,
while others were apparently sequential without any arrangement. By midcentury
the entries were in alphabetical order, except for a few in some classified
arrangement. Until the end of the century these alphabetical reference works
indexed a small group of books. Finally John of Freiburg added an alphabetical
subject index to his own book, _Summa Confessorum_ (1297—1298). As the Rouses
have put it, 'By the end of the [13]th century the practical utility of the
subject index is taken for granted by the literate West, no longer solely as
an aid for preachers, but also in the disciplines of theology, philosophy, and
both kinds of law.'"

In one sense neither the subject index nor the concordance are indexes; they
are words or groups of words selected according to given criteria from the body
of the text, each accompanied by a list of identifiers. These identifiers are
elements of an index, whether they represent a page, chapter, column, or other
kind of block of text. Every identifier is a unique _address_.

The index is thus an ordering of a sequence by means of associating its
elements with a set of symbols, whereby each element is given a unique
combination of symbols. Different sizes of symbol sets yield different numbers
of variations. Symbol sets such as the alphabet, Arabic numerals, Roman
numerals, and binary digits have different proportions between the length of a
string of symbols and the number of possible variations it can contain. Thus
two symbols of the English alphabet can hold 26^2 different values, of Arabic
numerals 10^2, of Roman numerals 7^2, and of binary digits 2^2.
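
A quick check of these counts in shell arithmetic (taking the Roman numeral set to be the seven symbols I, V, X, L, C, D, M):

> echo "alphabet: $((26**2)), arabic: $((10**2)), roman: $((7**2)), binary: $((2**2))"

which prints 676, 100, 49 and 4 respectively.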

Indexation is segmentation, a breaking into segments. From as early as the
13th century an index, such as that of sections, has served as an enabler of
search. The more detailed the indexation, the more precise the search results
it enables.

The subject index and the concordance are tables of search results. There is a
direct lineage from the 13th-century biblical concordances to the birth of
computational linguistic analysis; both were initiated and realised by
priests.
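
A permuted, keyword-in-context index of the kind these concordances pioneered can still be produced with the GNU ptx tool used in the command-line example at the end of this text; a minimal sketch, assuming a file text.txt and the word _utterance_:

> ptx -f -w 79 text.txt | grep -i utterance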

During World War II, the Jesuit Father Roberto Busa began to look for machines
to automate the linguistic analysis of the 11-million-word Latin
corpus of Thomas Aquinas and related authors.

Working on his Ph.D. thesis on the concept of _praesens_ in Aquinas, he
realised two things:

> "I realized first that a philological and lexicographical inquiry into the
verbal system of an author has t o precede and prepare for a doctrinal
interpretation of his works. Each writer expresses his conceptual system in
and through his verbal system, with the consequence that the reader who
masters this verbal system, using his own conceptual system, has to get an
insight into the writer's conceptual system. The reader should not simply
attach t o the words he reads the significance they have in his mind, but
should try t o find out what significance they had in the writer's mind.
Second, I realized that all functional or grammatical words (which in my mind
are not 'empty' at all but philosophically rich) manifest the deepest logic of
being which generates the basic structures of human discourse. It is .this
basic logic that allows the transfer from what the words mean today t o what
they meant to the writer.

>

> In the works of every philosopher there are two philosophies: the one which
he consciously intends to express and the one he actually uses to express it.
The structure of each sentence implies in itself some philosophical
assumptions and truths. In this light, one can legitimately criticize a
philosopher only when these two philosophies are in contradiction."
11(http://www.alice.id.tue.nl/references/busa-1980.pdf)

Busa began collaborating with IBM in New York in 1949, and the work, a
concordance of all the words of Thomas Aquinas, was finally published in the
1970s in 56 printed volumes (a version has been online since 2005
12(http://www.corpusthomisticum.org/it/index.age)). Besides that, an
electronic lexicon for the automatic lemmatization of Latin words was created
by a team of ten priests within two years (in two phases: grouping all the
forms of an inflected word under their lemma, and coding the morphological
categories of each form and lemma), containing 150,000 forms
13(http://www.alice.id.tue.nl/references/busa-1980.pdf#page=4). Father
Busa has been dubbed the father of humanities computing and recently also of
digital humanities.

The subject index has a crucial role in the printed book. It is the only means
of search the book offers. The subjects composing an index can be selected
according to a classification scheme (specific to a field of inquiry), for
example as elements of a certain degree (with a given minimum number of
subclasses).

Its role seemingly vanishes in the digital text. But it can be easily
transformed. Besides serving as a table of pre-searched results, the subject
index also gives a distinct idea about the content of the book. Two patterns
give us a clue: the numbers of occurrences of selected words give subjects
weights, while words that seem specific to the book outweigh others even if
they don't occur very often. A selection of these words then serves as a
descriptor of the whole text, and can be thought of as a specific kind of 'tags'.

This process was formalized in a mathematical function in the 1970s, thanks to
a formula by Karen Spärck Jones which she entitled 'inverse document
frequency' (IDF), or in other words, "term specificity". It is measured as the
(logarithmically scaled) proportion of the total number of texts in the corpus
to the number of texts in which the word appears at least once. When multiplied
by the frequency of the word _in_ the text (divided by the maximum frequency of
any word in the text), we get _term frequency-inverse document frequency_
(tf-idf). In this way we can get an automated list of subjects which are
particular to a text when compared to a group of texts.
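
As a sketch, the variant computed in the command-line example at the end of this text (an augmented term frequency multiplied by a smoothed inverse document frequency) can be written as

\[
\mathrm{tfidf}(t,d) = \Big(0.5 + 0.5\,\frac{f(t,d)}{\max_{t'} f(t',d)}\Big)\Big(1 + \ln\frac{N}{n_t}\Big)
\]

where f(t,d) is the number of occurrences of word t in text d, N is the number of texts in the corpus, and n_t is the number of texts in which t appears at least once.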

We have come to learn it through the practice of searching the web. It is a
mechanism not dissimilar to the thought process involved in retrieving
particular information online. And search engines have it built into their
indexing algorithms as well.

There is a paper proposing attaching words generated by tf-idf to hyperlinks
when referring to websites 14(http://bscit.berkeley.edu/cgi-
bin/pl_dochome?query_src=&format=html&collection=Wilensky_papers&id=3&show_doc=yes).
This would enable finding the referred content even after the link is dead.
The hyperlinks in the paper's references use this feature and it can be easily
tested: 15(http://www.cs.berkeley.edu/~phelps/papers/dissertation-
abstract.html?lexical-
signature=notemarks+multivalent+semantically+franca+stylized).

There is another measure, cosine similarity, which takes tf-idf further and
can be applied to clustering texts according to similarities in their
specificity. This might be interesting as a feature for digital libraries, or
even a way of organising a library bottom-up into novel categories, from which
new discourses could emerge. Or as an aid for researchers sorting through
texts, or even for editors producing interesting anthologies.
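
A minimal sketch of that measure: treating each text as a vector of tf-idf weights, the similarity of two texts d_1 and d_2 is the cosine of the angle between their vectors,

\[
\cos(d_1,d_2) = \frac{\sum_t w_{t,d_1}\, w_{t,d_2}}{\sqrt{\sum_t w_{t,d_1}^2}\,\sqrt{\sum_t w_{t,d_2}^2}},
\]

which is 1 for texts with identical weight profiles and 0 for texts sharing no weighted terms.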

## Final remarks

1

New disciplines emerge all the time - most recently, for example, cultural
techniques, software studies, or media archaeology. It takes years, even
decades, before they gain dedicated shelves in libraries or a category in
interlibrary digital repositories. Not that it matters that much. They are not
only sites of academic opportunities but, first of all, frameworks of new
perspectives of looking at the world, new domains of knowledge. From the
perspective of the researcher, partaking in a discipline involves negotiating
its vocabulary, classifications, corpus, reference field, and specific
terms. Creating new fields involves all that, and more. Even when
one goes against all disciplines.

2

Google can still surprise us.

3

Knowledge has been in the making for millennia. There have been (abstract)
mechanisms established that govern its conditions. We now possess specialized
corpora of texts which are interesting enough to serve as a ground on which to
discuss and experiment with dictionaries, classifications, indexes, and tools
for reference retrieval. These all belong to the poetic devices of knowledge-
making.

4

Command-line example of tf-idf and concordance in 3 steps.

* 1\. Process the files text.1-5.txt and produce freq.1-5.txt with lists of (nonlemmatized) words (in respective texts), ordered by frequency:

> for i in {1..5}; do tr '[A-Z]' '[a-z]' < text.$i.txt | tr -c '[a-z]' '[\012*]' | tr -d '[:punct:]' | sort | uniq -c | sort -k 1nr | sed '1,1d' > temp.txt; max=$(awk -vvar=1 -F" " 'NR==1 {print $var}' temp.txt); awk -vmaxx=$max -F' ' '{printf "%-7.7f %s\n", $1=0.5+($1/(maxx*2)), $2}' temp.txt > freq.$i.txt; done && rm temp.txt

* 2\. Process the files freq.1-5.txt and produce tfidf.1-5.txt containing a list of words (out of 500 most frequent in respective lists), ordered by weight (specificity for each text):

> for j in {1..5}; do rm -f freq.$j.txt.temp; lines=$(wc -l freq.$j.txt) && for i in {1..500}; do word=$(awk -vline="$i" -vfield=2 -F" " 'NR==line {print $field}' freq.$j.txt); tf=$(awk -vline="$i" -vfield=1 -F" " 'NR==line {print $field}' freq.$j.txt); count=$(egrep -lw $word freq.?.txt | wc -l); idf=$(echo "1+l(5/$count)" | bc -l); tfidf=$(echo $tf*$idf | bc); echo $word $tfidf >> freq.$j.txt.temp; done; sort -k 2nr < freq.$j.txt.temp > tfidf.$j.txt; done

* 3\. Process the files tfidf.1-5.txt and their source texts text.1-5.txt, and produce occ.txt with a concordance of the top 3 words from each of them:

> rm -f occ.txt && for j in {1..5}; do echo "$j" >> occ.txt; ptx -f -w 150 text.$j.txt > occ.$j.txt; for i in {1..3}; do word=$(awk -vline="$i" -vfield=1 -F" " 'NR==line {print $field}' tfidf.$j.txt); egrep -i "[[:alpha:]] $word" occ.$j.txt >> occ.txt; done; done

Dušan Barok

_Written 23 October - 1 November 2014 in Bratislava and Stuttgart._


Sollfrank & Snelting
Performing Graphic Design Practice
2014


Femke Snelting
Performing Graphic Design Practice

Leipzig, 7 April 2014

[00:12]
What is Libre Graphics?

[00:16]
Libre Graphics is quite a large ecosystem of software tools, of people –
people that develop these tools, but also people that use these tools;
practices, like how do you then work with them, not just how you make things
quickly and in an impressive way, but also these tools might change your
practice and the cultural artefacts that result from it. So it’s all these
elements that come together, and we call Libre Graphics. [00:53] The term
“Libre” is chosen deliberately. It’s slightly more mysterious than the term
“free”, especially when it turns up in the English language. It sort of hints
that there’s something different, that there’s something done on purpose.
[01:16] And it is a group of people that are inspired by free software
culture, by free culture, by thinking about how to share both their tools,
their recipes and the outcomes of all this. [01:31] So Libre Graphics is quite
wild, it goes in many directions, but it’s an interesting context to work in,
that for me it has been quite inspiring for a few years now.

[01:46]
The context of Libre Graphics

[01:50]
The context of Libre Graphics is multiple. I think that’s part of why I’m
excited about it, and also part of why it’s sometimes difficult to describe it
in a short sentence. [02:04] The context is design – so people that are
interested in design, in creating visuals, in creating animations, videos,
typography. And that is already a multiple context, because each of these
disciplines have their own histories, and their own sort of types of people
that get touched by them. [02:23] Then there is software, people that are
interested in the digital material – so, let’s say, excited about raw bits and
the way a vector gets produced. So that’s a very, almost formal interest in
how graphics are made. [02:47] Then there’s people that do software, so they
are interested in programming, in programming languages, in thinking about
interfaces and thinking about ways software can become a tool. And then
there’s people that are interested in free software, so how can you make
digital tools that can be shared, but also how can you produce processes that
can be shared. [03:11] So there you have from free software activists to
people that are interested in developing specific tools for sharing design and
software development processes, like Git or [Apache] Subversion, or those
kinds of things. So I think that multiple context is really special and rich
in Libre Graphics.

[03:34]
Free software culture

[03:38]
Free software culture… And I use the term culture because I’m more interested
in, let’s say, the cultural aspect of it, and this includes software, for me
software is a cultural object – but I think it’s important to emphasise this,
because it's easily turned into a very technocentric approach which I think is
important to stay away from. [04:01] So free software culture is the thinking
that, when you develop technology – and I’m using technology in the sense that
is cultural as well, to me, deeply cultural – you need to take care of sharing
the recipes for how this technology has been developed as well. [04:28] And
this produces many different other tools, ways of working, ways of speaking,
vocabularies, because it changes radically the way we make and the way we
produce hierarchies. [04:49] So it means, for example, if you produce a
graphic design artefact, for example, that you share all the source files that
were necessary to make it. But you also share, as much as you can,
descriptions and narrations of how it came to be, which does include, maybe,
how much was paid for it, what difficulties were in negotiating with the
printer, and what elements were included – because the graphic design object
is usually a compilation of different elements –, what software was used to
make it and where it might have resisted. [05:34] So the consequences of
taking free software culture seriously in a graphic design or a design
context, means that you care about all these different layers of the work, all
the different conditions that actually make the work happen.

[05:50]
Free culture

[05:54]
The relationship from Libre Graphics to free culture is not always that
explicit. For some people it’s enough to work with tools that are released
under GPL (GNU General Public License), or like an open content license, and
there it stops. So even their work would be released under proprietary
licenses. [06:18] For others it’s important to make the full circle and to
think about what the legal status is of the work they release. So that’s the
more general one. [06:34] Then free culture – we can use that very loosely, as
in everything that is circulating under conditions that it can be reused and
remade, that would be my position – free culture, of course, also refers to
the very specific idea of how that would work, namely Creative Commons.
[06:56] For myself, Creative Commons is problematic, although I value the fact
that it exists and has really created a broader discussion around licenses in
creative practices, so I value that. [07:11] For me, the distinction Creative
Commons makes, almost for all the licenses they promote, between commercial
and non-commercial work, and as a consequence between professional and amateur
work – I find that very problematic, because I think one of the most important
elements of free software culture, for me, is the possibility of people from
different backgrounds, with different skill sets, to actually engage the
digital artefacts they are surrounded with. [07:47] And so by making this
quite lazy separation between commercial and non-commercial, which, especially
in the context of the web as it is right now, since it’s not very easy to hold
up, seems really problematic, because it creates an illusion of clarity that I
think actually makes more trouble than clarity. [08:15] So I use free culture
licenses, I use licenses that are more explicit about the fact that anyone can
use whatever I produce, in any context, because I think that’s where the real
power is of free software culture. [08:31] For me, free software licenses and
all the licenses around them – because I think there are many different types,
and that’s interesting – is that they have a viral power built in. So if you
apply a free software license to, for example, a typeface, it means that
someone else, even someone else you don’t know, has the permission, and
doesn’t have to ask for the permission to reuse the typeface, to change it, to
mix it with something else, to distribute it and to sell it. [09:08] That’s
one part that is already very powerful. But the real secret of such a license
is that once this person re-releases a typeface, it means that they need to
keep the same license. So it means that it propagates across the network, and
that is where it’s really powerful.

[09:31]
Free tools

[09:35]
It’s important to have tools that are released under conditions that allow me
to look further than its surface, for many reasons. There is an ethical
reason. It’s very problematic, I think, to, as a friend explained last week,
to feel like you are renting a room in a hotel – because that is often the way
practitioners nowadays relate to their tools, they have no right to remove the
furniture, they’ve no right to invite friends to their hotel room, they have
to check out at 11, etc. So it’s a very sterile relationship to your tools. So
that’s one part. [10:24] The other is that there is little way of coming into
contact with the cultural aspects of the tools. Something that I suspected
before I started to use free software tools for my practice, but has been
already for almost ten years continuously exciting, is the whole… let’s say,
all the other elements around it: the way people organise themselves in
conferences, mailing lists, the fact that the kinds of communications that
happens, the vocabularies, the histories, the connections between different
disciplines. [11:07] And all that is available to look at, to work with, to
come into contact with, even to speak to people that do these tools and ask
them, why is it like this and not like that. And so to me it seems obvious that
artists want to have that kind of, let’s say, layered relation with their
tools, and not just accept whatever comes out of the next-door shop. [11:36] I
have a very different, almost different physical experience of these tools,
because I can enter on many levels. And that makes them part of my practice
and not just means to an end, I really can take them into my practice, and
that I find interesting as an artist and as a designer.

[11:56] Artefacts

[12:00] The outcomes of this type of practice are different, or at least the
kind of work I make, try to make, and the people I like to work with. There’s
obviously also a group of people that would like to do Hollywood movies with
those tools. And, you know, that’s kind of interesting too, that that happens.
[12:21] For me, somehow the technological context or conditions that made the
work possible will always occur in the final result. So that’s one part.
[12:38] And the other is that the, let’s say, the product is never the end. So
it means that because, in whatever way, source materials would be released,
would be made available, it means that the product is always the beginning of
another project or product, either by me or by other people. [13:02] So I
think that’s two things that you can always see in the kind of works we make
when we do Libre Graphics – my style.

[13:15] Libre Fonts

[13:18] A very exciting part of Libre Graphics is the Libre Font movement,
which is strong, and has been strong for a long time. Fonts are the basic
building block of how a graphic comes to life. I mean, when you type
something, it’s there. [13:40] And the fact that that part of the work is free
is important in many levels. Things that you often don’t think about when we
speak English and we stay within a limited character set, is that when you
live in, let’s say, India, the language you speak is not available as a
digital typeface, meaning that when you want to produce a book in the tools that
are available, or publish it online, your language has no way of expressing
itself. [14:26] And so it’s important, and that has to do with commercial
interests, laws, ways that the technical infrastructure has been built. And so
by understanding that it’s important that you can express yourself in the
language and with the characters you need, it’s also obvious that that part
needs to be free. [14:53] Fonts are also interesting because they exist on
many levels. They exist on your system. They are almost software, because they
are quite complicated objects. They appear in your screen, when you print a
document – they are there all the time. [15:17] But at the same time it’s the
alphabet. It’s the most, let’s say… we consider it as a totally accessible,
available and universal right, to have the alphabet at our disposal. [15:29]
So I think, politically and, let’s say, from a sort of interest in that kind
of practice that is very technical but at the same time also very basic, in
the sense that is about “freeing an A,” that’s quite a beautiful energy – I
think that that has made the Libre Font movement very strong.

[15:55] Free artefacts / open standards

[15:59] It took me a while to figure out myself – that for me it was so
obvious that if you do free software, that you would produce free artefacts, I
mean, it seems kind of obvious, but that is not at all the case. [16:12] There
is full-fledged commercial production happening with these tools. But one
thing that sort of keeps the results, the outcomes of these projects, freer
than most commercial tools is that there is really an emphasis on open
document formats. [16:34] And that is extremely important because, first of
all, through this sort of free software thinking it’s very obvious that the
documents that you produce with the tool should not belong to the software
vendor, they are yours. [16:49] And to be able to own your own documents you
need to be able to look, to inspect how they are produced. I know many tragic
stories of designers that with several upgrades of “their” tool set lost
documents, because they could never open them again. [17:12] So there’s really
an emphasis and a lot of work in making sure that the documents produced from
these tools remain inspectable, are documented, so that either you can open
them in another tool, or could develop a tool to open them in, to have these
files available for you. [17:38] So it’s really part and parcel of free
software culture, it’s that you care about that what generates your artefact,
but also about the materiality of your artefact. And so there, open standards
are extremely important – or maybe, let’s say, that file formats are
documented and can be understood. [18:04] And what’s interesting to see is
that in this whole Libre Graphics world there is also a very strong group of
reverse engineers, that are document formats, document activists, I would
say. [18:19] And I think that’s really interesting. They claim, they say,
documents need to be free, and so we would go against… let’s say, we would
risk breaking the law to be able to understand how non-free documents actually
are constructed. [18:37] So they are really working to be able to understand
non-free documents, to be able to read them, and to be able to develop tools
for them, so that they can be reused and remade. [18:54] So the difference
between a free and a non-free document is that, for example, an InDesign file,
which is the result of a commercial product, there’s no documentation
available to how this file works. [19:10] This means that the only way to open
the file is with that particular program. So there is a connection between
that what you’ve made and the software you’ve used to produce it. [19:24] It
also means that if the software updates, or the license runs out, you will not
have access to your own file. It means it’s fixed, you can never change it,
and you can never allow anyone else to change it. [19:39] And open document
format has documentation. That means that not only the software that created
it is available, and so that way you can understand how it was made, but also
there’s independent documentation available. [19:55] So that whenever a
project, like a software, doesn’t work anymore or it’s too old to be run, or
you don’t have it available, you have other ways of understanding the document
and being able to open it, and reuse and remake it. [20:11] Examples of open
document formats are, for example, SVG (Scalable Vector Graphics), ODT (Open
Document Text format), or OGG, a format for video that allows you to look at
all the elements that are packed into the video format. [20:31] What’s
important is that, around these open formats, you see a whole ecosystem exists
of tools to inspect, to create, to read, to change, to manipulate these
formats. And I think it’s very easy to see how around InDesign files this
culture does not exist at all.

[20:55] Getting started

[20:59] If you would be interested to start using Libre Graphics, you can
enter it in different levels. There’s well-developed tools that look a bit
like commercial photo manipulation tools, or layout tools. [21:19] There’s
something called Gimp, which is a well-developed software for treating photos.
There’s Blender, which is a fast-developing animation software, that’s being
used by thousands of thousands of people, and even it’s being used in
commercial productions, Pixar-style stuff. [21:43] These tools can be
installed on any system, so you don’t have to run a Linux system to be able to
use them. You can install them on a Macintosh or on a Windows, for example. Of
course, they are usually more powerful when you run them on a system that
recognises that power.

[22:09] Sharing practice / re-learn

[22:14] This way of working changes the way you learn, and also therefore the
way you teach. And so, as many of us have understood the relation between
learning and practice, we’ve all been somehow involved in education, many of
us are teaching in formal design or art education. [22:43] And it’s very clear
how those traditional schools are really not fit for the type of learning and
teaching that needs to happen around Libre Graphics. [22:57] So one of the
problems that we run into is the fact that art academies are traditionally
really organised on many levels – so that the validation systems are really
geared towards judging individuals. And our type of practice is always
multiple, it’s always about, let’s say, things that happen with many people.
[23:17] And it’s really difficult to inspire students to work that way, and at
the same time know that at the end of the day, they will be judged on their
own, what they produce as an individual. So that’s one part. [23:31] In
traditional education there’s always like a separation between teaching
technology and practice. So you have, in different ways, let’s say, you have
the studio practice and then you have the workshops. And it’s very difficult
to make conceptual connections between the two, so we end up trying to make
that happen but it’s clearly not made for that. [24:02] And then there is the
problematics of the hierarchies between tutors and students, that are hard to
break in formal education, just because the set up is – even when it’s a very
informal situation – that someone comes to teach and someone else comes to be
taught. [24:28] And there’s no way to truly break that hierarchy because
that’s the way the school works. So since a year we’ve been starting to think
about how to do… Well, no, for years we’ve been thinking about how to do
teaching differently, or how to do learning differently. [24:48] And so last
year for the first time we organised a summer school, just as a kind of
experiment to see if we could learn and teach differently. And the title, the
name of the school is Relearn, because the sort of relearning, for yourself
but also to others, through teaching-learning, has become really a good
methodology, it seems.

[25:15] Affiliations

[25:19] If I say “we”, that’s always a bit uncomfortable, because I like to be
clear about who that is, but when I’m speaking here there’s many “we” in my
mind. So there’s a group of designers called OSP (Open Source Publishing).
They started in 2006 with the simple decision to not use any proprietary
software anymore for their work. And from that this whole set of questions,
and practices and methods developed. [25:51] So right now that’s about twelve
people working in Brussels having a design practice. And I’m lucky to be an
honorary member of this group, and so I’m in close contact with them, but I’m
not actively working with the design group. [26:11] Another “we”, and
overlapping “we”, is Constant, an association for art and media active in
Brussels since 1996, 1997 maybe. Our interest is more in mixing copyleft
thinking, free software thinking and feminism. And in many ways that
intersects with OSP, but they might phrase it in a different way. [26:42]
Another “we” is the Libre Graphics community, which is even a more
uncomfortable “we” because it includes engineers that would like to conquer
the world, and small hyper-intelligent developers that creep out of their
corner to talk about the very strange world they are creating, or typographers
that care about universal typefaces. [27:16] I mean, there’s many different
people that are involved in that world. So I think, in this conversation the
“we” are Constant, OSP and the Libre Graphics community, whatever that is.

[27:29] Libre Graphics annual meeting, Leipzig 2014

[27:34] We worked on a Code of Conduct – which is something that seems to
appear in free software or tech conferences more and more, it comes a bit from
the U.S. context – where we have started to understand that the fact that free
software is free doesn’t mean that everyone feels welcome. [28:02] For long
there still are large problems with diversity in this community. The
excitement about freedom has led people to think that people that were not
there would probably not want to be there, and therefore had no role to be
there. [28:26] And so if you think, for example, the fact that there is very
little, that there’s not a lot of women active in free software, a lot less
than in proprietary software, which is quite painful if you think about it.
[28:41] That has to do with this sort of cyclical effects of: because women
are not there they would probably be not interested, and because they are not
interested they might not be capable, or feel capable of being active, and
they feel they might not belong. So that’s one part. [29:07] The other part is
that there’s a very brutal culture of harassment, of racist and sexist
language, of using imagery that is, let’s say, unacceptable. And that needs to
be dealt with. [29:26] Over the last two years, I think, the documents like
the Code of Conduct have started to come out from feminists active in this
world, like Geek Feminism or the Ada Initiative, as a way to deal with this.
And what it does is it describes, in a bit… let’s say, it’s slightly pompous
in the sense that you describe your values. [29:56] But it is a way to
acknowledge the fact that these communities have a problem with harassment,
first; that they explicitly say, we want diversity, which is important; that
it gives very clear and practical guidelines for what someone that feels
harassed can do, who he or she can speak to, and what will be the
consequences. [30:31] Meaning that it takes away the burden from, well, at
least as much as possible, from someone who is harassed to defend, actually,
the gravity of the case.

[30:43] Art as integrative concept

[30:47] For me, calling myself an artist is useful, it’s very useful. I’m not
so busy, let’s say, with the institutional art context – that doesn’t help me
at all. [31:03] But what does help me is the figure of the artist, the kinds
of intelligences that I sort of project on myself, and I use from others, from
my colleagues (before and contemporary), because it allows me to not have too
many… to be able to define my own context and concepts without forgetting
practice. [31:37] And I think art is one of the rare places that allows this.
Not only it allows it, but actually it rigorously asks for it. It’s really
wanting me to be explicit about my historical connections, my way of making,
my references, my choices, that are part of the situation I build. [32:11] So
the figure of the artist is a very useful toolbox in itself. And I think I use
it more than I would have thought, because it allows me to make these cross-
connections in a productive way.



 
