MFA Thesis Proposal Draft
Kavi Duvvoori, Fall 2018
A collection of short language games, mostly computationally mediated, each in conversation with a formal model of language, presented on a website and in a physical installation.
TITLE UNDECIDED will gather a collection of small digital language pieces: prose, poetry generators, and interactive textual interfaces, all referred to as “language games.” Each piece will explicitly reference a model of language, drawn, for example, from linguistics, NLP, or mathematical logic. The language game will then attempt to offer a literary text or artwork that implements part of the model, while also playing with its metaphors and exploring the “remainder-work” that can happen in relation to each model. The aim, in gathering pieces, is to develop and suggest a way of working, rather than to present a cohesive collection of poetry or short fiction.
The pieces will primarily be written for the web. For the exhibition, each piece on the website will be presented on a separate surface (for the digital pieces, old laptops, tablets, or PCs). Ideally these surfaces will be arranged in a quiet space where visitors are invited to sit down for a few minutes and read. The question of the relation between the installation and the digital server may be played with explicitly in a couple of games, such as a hand-printed piece and an Arduino- and LED-based text.
The collection tentatively includes:
Word-embedding models (GloVe, Word2Vec with its skip-gram training objective) are an increasingly ubiquitous approach to lexical semantics that treats the meaning of words geometrically, as points in a high-dimensional vector space, with geometric distance corresponding to similarity, and similarity predicted from the likelihood of words appearing in similar contexts (“the cat runs” / “the dog runs” -> “cat” and “dog” are similar). The papers’ key result is that the arithmetic of this space captures something semantic: the closest word to “London” – “England” + “France” (added as vectors) is allegedly “Paris,” and the closest to “King” – “Man” + “Woman” is “Queen.”
I use the GloVe model to create diagram-poems that gradually degenerate a sentence. A program takes a sentence through a series of disturbances, each time replacing each word with its nearest neighbor in the vector space. The sentences are then “graphed” by projecting down to two dimensions (an alternate possibility would be an interactive 3D display) and plotting the text. The fonts also vary across iterations, using the Metaflop tool built on Donald Knuth’s Metafont parametrized typography system. The source sentences follow an additional alphabetic constraint: each word must include the first letter not included in the previous words, until each letter is used once. I do not yet have a way to maintain this constraint under replacement.
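The replacement loop might be sketched like this, with a tiny hand-made vector space standing in for the trained GloVe vectors (all words and coordinates below are illustrative, not the piece’s actual data or code):

```python
# Iterative nearest-neighbor "degeneration" of a sentence in a toy
# word-vector space, using cosine similarity as the distance measure.
import math

TOY_VECTORS = {
    "cat":     (0.9, 0.1, 0.0),
    "dog":     (0.8, 0.2, 0.1),
    "wolf":    (0.7, 0.3, 0.2),
    "runs":    (0.1, 0.9, 0.0),
    "walks":   (0.2, 0.8, 0.1),
    "strolls": (0.3, 0.7, 0.2),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.dist(u, (0,) * len(u)) * math.dist(v, (0,) * len(v)))

def nearest_neighbor(word):
    """Closest other word in the space by cosine similarity."""
    vec = TOY_VECTORS[word]
    others = (w for w in TOY_VECTORS if w != word)
    return max(others, key=lambda w: cosine(vec, TOY_VECTORS[w]))

def degenerate(sentence, steps):
    """Replace every word with its nearest neighbor, `steps` times."""
    versions = [sentence]
    for _ in range(steps):
        sentence = [nearest_neighbor(w) for w in sentence]
        versions.append(sentence)
    return versions

for version in degenerate(["cat", "runs"], 3):
    print(" ".join(version))
```

In so small a space the sentence quickly falls into an oscillation between two near neighbors; with real trained vectors the drift is slower and stranger, which is where the noise described below comes in.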
The quirks and failures of such mathematical spaces, calculated from our collective language use on Wikipedia and similar tracts of the web, now act upon and evaluate us through systems for translation, sentiment analysis, and network surveillance, and through the corporate characters for synthetic conversation. Some of the GloVe model’s correspondences may be recognizable from a thesaurus, but others are not: there is an intriguing amount of noise in the trained models – obscure words and misspellings adjacent to common terms. I would like to investigate the “geometry” of this space, and one way to do this is to explore the surroundings of a few “points” in it. A related project would be to graph sets of synonyms or opposites, to draw out expected and unexpected proximities.
Draft readable at: http://kavid.xyz/embed/
Room With Montague:
One of the founding documents of formal semantics is Richard Montague’s 1970 twelve-page paper “English as a Formal Language.” In firmly Fregean fashion, it identifies the meanings of sentences with assertions of first-order logic about set-theoretic models, built “compositionally” using the lambda calculus. Modern approaches replace first-order logic with modal logic and aim to build “discourse structures” rather than single assertions, but work within the recognizable scaffolding of Montague’s (and Frege’s) approach remains dominant in formal semantics; it is a key source articulating (uncritically) the metaphor between computational and human “symbolic systems.”
A parser fiction that uses semantic composition to evaluate the truth of sentences the player builds up. The player learns about two rooms: my dorm room, and the room where Richard Montague was luridly murdered. Juxtaposed against the truth values are loosely written paragraphs of my own prose, also offering description. Both components of the text aim at (different forms of) pure description.
It is a way for me to work through some of the ambivalence and contradictions I feel when moving between linguistic and literary approaches to language and writing. This problem – and it doesn’t have to be in this piece – is one I want to continue to focus on. Richard Montague’s murder seems interesting because his theoretical contribution is so concise and pristine, while the events are so sensational and mysterious that they loosely inspired both a Samuel Delany novel and a random mystery novel, each working from very little public information (mostly from the biography of another logician). It’s a cliché, but the opposition between work that attempts to excise affect and pulpy biography is an easy, effective trope to use when writing about theoreticians. I can’t deny an interest in the centrality of tragic gay men to 20th-century formal logic (Turing, likely Wittgenstein, Montague).
Playable draft (coding basically complete, writing needs much editing and expansion) at http://rooms.kavid.xyz/
A loose reference to the idea that language is a cognitive faculty refined by genetic evolution; a close reference to evolutionary algorithms, which search for a solution to a problem by varying (“mutating”) a population of candidates and keeping the most successful, in loose metaphor with natural selection. Genetic algorithms, a subclass, generate new candidates by combining the features of two old ones.
Uses a “genetic” algorithm to seek out n-gram approximations of John Ashbery’s “Whatever It Is, Wherever You Are” in a population of 100 sentences. Every second, 100 new sentences are generated by splicing together samples from the most successful of the previous 100. Success is evaluated by the number of n-grams (up to n=5) that occur in both a sentence and the source text. An interesting direction to explore would be to vary the evaluation measure, or even to let the reader change the measure of success live. The source text is itself about evolution and language.
It is a way for me to explore this class of techniques for language generation. An interest of the piece is in watching the “population” shift, rather than waiting for a single interesting sentence. I hope to explore the dynamics possible in digital literature more, as variation-in-time is a key axis of many works I enjoy.
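The selection loop could be sketched like this, with a placeholder source sentence standing in for Ashbery’s poem and simple prefix/suffix splicing as the combination step (population sizes and generation counts are scaled down for illustration):

```python
# Genetic search for n-gram approximations of a source text: score each
# candidate by shared n-grams, keep the fittest, splice new candidates.
import random

SOURCE = "the night was full of words and the words were full of night".split()

def ngrams(words, max_n=5):
    """All n-grams of the sequence, for n = 1 .. max_n."""
    return {tuple(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)}

SOURCE_NGRAMS = ngrams(SOURCE)

def fitness(words):
    """Number of n-grams shared with the source text."""
    return len(ngrams(words) & SOURCE_NGRAMS)

def splice(a, b):
    """Combine a prefix of one parent with a suffix of the other."""
    cut = random.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:]

def evolve(population, generations=20, survivors=10):
    """Repeatedly keep the fittest candidates and splice replacements."""
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:survivors]
        population = [splice(random.choice(parents), random.choice(parents))
                      for _ in range(len(population))]
    return max(population, key=fitness)

vocab = list(set(SOURCE)) + ["stone", "river", "quiet"]
population = [[random.choice(vocab) for _ in range(8)] for _ in range(100)]
print(" ".join(evolve(population)))
```

Varying `fitness` live, as suggested above, would only require swapping the scoring function while the loop keeps running.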
Draft (currently down due to server configuration difficulties) http://thefaraway.kavid.xyz/
The primary reference point would be Douglas Hofstadter’s playful obsession with circularity, paradoxical self-reference, and “strange loops” as central phenomena in understanding language and consciousness, referencing Kurt Gödel’s incompleteness theorems. This preoccupation may be somewhat out of fashion, and has had no obvious algorithmic successes, but it had great cultural influence, including on me.
Two short programs that write themselves, or write about whether they write themselves. One outputs its own source. The other, encoding Russell’s paradox, attempts to output its source only if it does not output its source, leading to a stack overflow. A quote from “Borges and I / Borges y Yo” and the repetition of the phrase “In all of these words can there be found this author, who chooses to write these words?” attempt to relate this to the problem of authorship.
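The first program is a quine. A minimal Python sketch of the pattern (not the author’s actual code; comments are omitted from the program itself, since a quine must reproduce its text exactly):

```python
source = 'source = %r\nprint(source %% source)'
print(source % source)
```

The string is a template for the whole program; `%r` inserts the template’s own quoted representation, so printing `source % source` reproduces both lines verbatim.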
It took just a couple hours and was great fun.
Basically complete at: https://repl.it/@notkrd/do-i-write-myself, https://github.com/notkrd/doiwritemyself
Secret and Sensibility:
Word- or character-based RNNs are neural networks that try to predict a text naively, as a sequence of glyphs. This naive, general function fitting is claimed to be “unreasonably effective.” Parameters to tune include the length of the “window” considered at each step, the number of cells, the number of layers, the kind of cell in the network (common choices are LSTMs, with gates for inputting, outputting, and forgetting, and GRUs, with gates for inputting, updating, and resetting), the representation of characters or words (treated as atomic or themselves encoded as embeddings), and more sophisticated augmentations (“attention” mechanisms, changes to the loss function, switching to a Generative Adversarial Network-style architecture). I don’t know much about how to engineer all this, but the models function well enough as black boxes from various tutorials.
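A trained LSTM is beyond a short sketch, but the loop these models run (predict the next glyph from a window of previous ones, sample, append, repeat) can be shown with a much simpler stand-in, a character-level Markov chain; this is explicitly not an RNN, and the corpus line is a placeholder:

```python
# NOT an RNN: a character-level Markov chain standing in for the LSTM's
# predict-next-glyph loop. It counts which characters follow each
# k-character window, then samples the next character from those counts.
import random
from collections import Counter, defaultdict

def train(text, k=3):
    """Map each k-character window to counts of the following character."""
    model = defaultdict(Counter)
    for i in range(len(text) - k):
        model[text[i:i + k]][text[i + k]] += 1
    return model

def generate(model, seed, length, k=3):
    """Sample characters one at a time, conditioned on the last k."""
    out = seed
    for _ in range(length):
        counts = model.get(out[-k:])
        if not counts:          # window never seen in training: stop
            break
        chars, weights = zip(*counts.items())
        out += random.choices(chars, weights=weights)[0]
    return out

corpus = "it is a truth universally acknowledged that a single man " * 4
model = train(corpus)
print(generate(model, "it ", 60))
```

An LSTM replaces the count table with a learned function of the whole history, which is what lets it fluctuate between two corpora’s styles rather than merely copying windows.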
Produces text from an LSTM word model trained simultaneously on Gutenberg editions of Pride & Prejudice, Sense & Sensibility, and the erotic Victorian novel My Secret Life to suggest something about the figure-ground relationship between what a corpus includes and what it systematically excludes: in this case “absolute style” and smut. It will generate an arbitrarily long block of semi-readable text, fluctuating between these two modes.
What I like about My Secret Life is how extreme and direct this reversal is, though it may be very problematic in other ways. An alternative would be a colonial novel by an overseas subject of the British Empire near the turn of the 19th century – but does this novel exist? If there were some English novel, full text available, from around 1800, written by a brown or black person in the Caribbean or India (fortunes from a forgotten uncle in Jamaica, or a captain returning from a posting in Bombay of which he does not speak, being staples), I would like that even more, but I don’t know of one. The closest thing I can think of would be slave narratives, which I think it would be too heavy-handed or appropriative to use here. Is there a clear choice of countertext for Shakespeare instead? Also, I don’t mean the relationship to Austen to be purely parodic – I take that fiction quite seriously (I took one of my favorite classes on Austen & Eliot); it just seems to have a particularly tight, controlled relationship with what it excludes.
Tested, not viewable. Waiting on a new computer with GPU.
Nonspecifically, those theories that consider language to be simply interchangeable with its material substrate, and material to be understood through its sensory impressions. Hume, maybe?
A pun on the “concrete poem.” Uses a 3D Unity game-space to subject 3D-modeled words to Unity’s physics engine. They fall from the ceiling, and the player, using first-person DOOM-guy-style controls, may exert force to attract and repel the variously textured words. Maybe, for hype, this should be done in VR?
In addition to the problem about materiality (and the simulation / skeuomorphism of it) I’m interested in the prefab logic, even prefab ontologies, that come with working in a free(mium) game engine, with logic for physics, lighting, movement, included or adapted from someone else. One puts words into this simulation space in order to explore that space.
Draft with video documentation https://qfwfq.itch.io/language-kept-happening
The Great Game of aPlayername1870’s Amazing Gaming:
Narrative realist fiction. The corresponding formal theory might be that of narratology, or the computational modeling of narrative.
A novella about the rise and fall of a professional Starcraft 2 gamer, eventually defeated by a match fixing scandal. It is told in forum posts by a fan, researched by the fan spending many hours watching the gamer stream himself playing the game. Ideally to be self-published with a few copies printed (if that is affordable).
I like reinterpreting the act of writing fiction in this way, as a play with the patterns from conventional fiction-writing instruction (not a kind I received). I know the concept of a “language game” refers a little bit to chess, and not at all to Starcraft 2, but I am interested in exploring what performing that crude literalization anyway does. I follow this esport, and do find its cultural situation fascinating: tied up in contradictory metaphors, staging nationalism in ways similar to but different from traditional sports, teams establishing corporate identities.
Sitting on my Google Drive, waiting for heavy editing down and repurposing.
Contemporary syntax (the minimalist program, centered on Merge), while it no longer assumes the construction of preliminary and final trees, variously transformed and totally distinct from surface structure, still works with a very particular kind of evidence: syntax papers center on contrasts between grammatical and ungrammatical example sentences (the latter marked with an asterisk) that aim to be uncontroversial but, when possible, show a measure of wit. While I think most in the field no longer tell the same story as in the 1960s (universal grammar, the language acquisition device), some part of that model, in which a grammar machine parses sentences into (headed, binary) trees, seems to remain ubiquitous and fundamental, despite the empirical fact that most things people say simply do not parse into such trees without much munging.
A generator contrasts grammatical sentences (built with an implementation of a merge grammar) with parallel ungrammatical ones, created by reversing or breaking the constituency rules. The trick is that the ungrammatical versions aim for metaphoric or poetic resonance, while the grammatical ones aim for flatness.
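The contrast might be sketched like this, with a toy binary-branching grammar and lexicon standing in for the actual merge implementation (all rules and words here are illustrative placeholders):

```python
# Build a grammatical sentence from a toy binary grammar, then produce an
# "ungrammatical" parallel version by reversing every constituent.
import random

RULES = {
    "S":  [("NP", "VP")],
    "NP": [("Det", "N")],
    "VP": [("V", "NP")],
}
LEXICON = {
    "Det": ["the", "a"],
    "N":   ["report", "window", "argument"],
    "V":   ["describes", "opens"],
}

def merge(category):
    """Build a binary-branching tree top-down from the toy grammar."""
    if category in LEXICON:
        return random.choice(LEXICON[category])
    left, right = random.choice(RULES[category])
    return (merge(left), merge(right))

def linearize(tree):
    """Read the leaves of a tree off left to right."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return linearize(left) + linearize(right)

def break_constituency(tree):
    """Reverse every constituent: the 'ungrammatical' parallel version."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    return (break_constituency(right), break_constituency(left))

tree = merge("S")
print("grammatical:  ", " ".join(linearize(tree)))
print("ungrammatical:", "*" + " ".join(linearize(break_constituency(tree))))
```

Reversing every constituent fully mirrors the word order; the piece’s subtler breakings (violating single constituency rules) would swap or break individual branches instead.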
The construction of linguistic evidence strikes me as a constrained literary practice, and a fascinating one. Lecercle talks about a “rag-bag” of wordplay and odd language that we gather: I feel linguists are actually doing something parallel, if with totally different intent, by talking to one another about language in terms of “donkey sentences” and “pied-piping” and all these other very canonical, very fraught examples.
Other Models Being Considered
Distributed Systems
The metaphors in this discipline of computing seem very interesting: agents, messages, accommodating failure, elections. I’m thinking about what poetic structure could accommodate messages (perhaps single words, or suggested poetic moves) being sent between a variety of small units, which are somehow assembled into a morphing text. This could be a good place to do something with electronics: Elixir (a programming language for distributed systems) can run on embedded Linux boards via the Nerves project, so a device in the installation could somehow receive and display messages in LED, or send signals when a button is pushed, or something like that.
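One minimal sketch of such a structure, using threads and queues as stand-ins for actual distributed processes (Elixir nodes, networked devices); every word and unit below is a placeholder:

```python
# Small units pass single-word "messages" down a chain; a morphing text
# is assembled from whatever arrives at the end.
import queue
import threading

def unit(words, inbox, outbox):
    """For each word this unit holds, receive a message and pass it on
    with the unit's word appended."""
    for word in words:
        received = inbox.get()
        outbox.put(f"{received} {word}")

# Chain three units; each contributes one word per message passing through.
a_in, b_in, c_in, final = (queue.Queue() for _ in range(4))
threads = [
    threading.Thread(target=unit, args=(["quiet", "slow"], a_in, b_in)),
    threading.Thread(target=unit, args=(["signal", "light"], b_in, c_in)),
    threading.Thread(target=unit, args=(["arrives", "fades"], c_in, final)),
]
for t in threads:
    t.start()
a_in.put("a")    # two seed messages enter the chain
a_in.put("the")
for t in threads:
    t.join()
lines = []
while not final.empty():
    lines.append(final.get())
print("\n".join(lines))
```

A real version would let units fail, rejoin, and vote, which is where the discipline’s stranger metaphors (elections, gossip) could enter the poem’s structure.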
Speech Act & Discourse Theories
I want to generate a “novel” for NaNoGenMo (National Novel Generation Month, initiated by Darius Kazemi) that tracks a group of people making and breaking promises, apologies, assertions, and pronouncements to one another. It would encode a minimal speech act theory in the style of J.L. Austin, and simply simulate the extended interaction of agents acting in such a world.
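A first sketch of such a simulation might look like this; the agents, probabilities, and phrasings are all illustrative placeholders rather than a worked-out speech act theory:

```python
# Agents make promises, then keep or break them; breaking a promise
# triggers an apology. A narrator reports each speech act in order.
import random

AGENTS = ["Ana", "Bo", "Cy"]

def simulate(steps, seed=None):
    rng = random.Random(seed)
    promises = []   # outstanding (promiser, promisee) pairs
    story = []
    for _ in range(steps):
        if promises and rng.random() < 0.5:
            # Resolve a pending promise, one way or the other.
            who, whom = promises.pop(rng.randrange(len(promises)))
            verb = rng.choice(["keeps", "breaks"])
            story.append(f"{who} {verb} the promise to {whom}.")
            if verb == "breaks":
                story.append(f"{who} apologizes to {whom}.")
        else:
            # A new promise between two distinct agents.
            who, whom = rng.sample(AGENTS, 2)
            promises.append((who, whom))
            story.append(f"{who} promises something to {whom}.")
    return story

print("\n".join(simulate(8, seed=3)))
```

Extending the state (who trusts whom, what was promised) is where a fuller Austinian taxonomy of felicity conditions could be encoded.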
The Guide to Nonexistent Birds: An Ornithological Logic:
A constrained essay in comments to a Prolog model that generates texts resembling a birdwatching guide; the essay comments on the Oulipo, language, and code. A new, third comment-essay (alongside the lyrical comments in the first) will critique the original essay to unpack some of its assumptions about logic and poetics. One thing the third essay would also seriously grapple with, and which it will be an important challenge to address properly against the tone of the original piece, is the sexual violence of Pablo Neruda, whose lines epigraph the essay, not to mention that of Charles Bukowski, who has a joke-poem included in the original (or, in other early projects, the extensive references to John Searle’s theories, when it recently came out that UC Berkeley ignored, presumably because of the prestige of those theories, a series of sexual assault allegations against him from students). I don’t know whether this exercise would work in the context of the thesis (likely not!), or whether there is anything productive to say, but I may eventually find a more than naive way to write about the complicity of these received forms I work, at times exuberantly, near (military tech, leering lyricism).
OHELLOHI ILIEHIDE Variations:
Response to a prompt to create a booklet from a single 80×66-character page of Python code, which operates by applying various transformations to a grid of these six characters, including Conway’s Game of Life: http://kavid.xyz/OHELLOHIILIEHIDE.html. It aims to get at the way digital text can “flicker between being data and language.”
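One of the named transformations, a Game of Life step over a character grid, can be sketched as follows; here any non-space glyph counts as a live cell (a simplification of the piece’s six-character alphabet) and the grid wraps toroidally:

```python
# One step of Conway's Game of Life applied to a grid of characters,
# where spaces are dead cells and anything else is live.
def life_step(grid, live_char="O"):
    """Apply the B3/S23 Life rule to a rectangular grid of strings."""
    rows, cols = len(grid), len(grid[0])

    def neighbors(r, c):
        # Count live cells in the 8-cell neighborhood, wrapping at edges.
        return sum(grid[(r + dr) % rows][(c + dc) % cols] != " "
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0))

    new = []
    for r in range(rows):
        row = ""
        for c in range(cols):
            n = neighbors(r, c)
            alive = grid[r][c] != " "
            row += live_char if (alive and n in (2, 3)) or (not alive and n == 3) else " "
        new.append(row)
    return new

# A "blinker": three live cells oscillating between a row and a column.
grid = ["     ",
        "     ",
        " OOO ",
        "     ",
        "     "]
for row in life_step(grid):
    print(row)
```

The actual piece would map dead and live cells back into its six glyphs rather than collapsing everything to one character, which is part of how the grid flickers between data and language.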
Overall Contexts & Motivations
A first problem, in investigating the use of language, is the question of which second-order language to make use of in one’s investigations. It is this question of discourses about language that TITLE UNDECIDED remains focused on. A presupposition here is that we cannot presume one form, such as mathematical logic or the assertive impersonal essay, can describe all the relevant phenomena adequately, and that the moment where one second-order discourse attempts to represent another (as is happening here) reveals much about both discourses, as well as their limits. There may also be a specific contemporary problem with the speed of proliferation of formal yet incommensurable discourses about language: word-embedding models, categorial grammars, discourse theories, temporal logics, and so on have each gained prominence in a technical literature and demand substantial work to understand the effects of their use. It seems (and I will not defend this further) that a logicist (which may or may not be synonymous here with logocentric?) understanding of meaning is assumed across these forms of theory, but simultaneously the models themselves contradict one another, and are not even structurally similar in the forms of their logic. How is a digital writer, in this situation, to research the material they work in?
This project has precedent in the writings of movements that articulate their literary production as a form of investigation or intervention into language and the discourses commenting on or mediating it: Oulipo, L=A=N=G=U=A=G=E Poetry, Fluxus, net.art, and Interactive Fiction. I will develop my conceptualization of this intervention, and its relation to these movements (which are all, like language itself, diffuse clusters of activity and resemblance rather than discrete clubs), in the thesis writing.
There are many popular tropes to describe such investigations proceeding by accumulation, juxtaposition, correspondence, and fibration, rather than primarily synthesis, unification, and hierarchization. These distinct, contradictory, metaphors include Walter Benjamin’s constellations, Deleuze and Guattari’s alluring rhizomes, Feyerabend’s epistemic anarchy, Wittgenstein’s “sketches of a landscape” resisting singular direction and “language games,” Hito Steyerl’s “free falling” perspective, and simply bricolage; finding ways to articulate the epistemic function of such playfully non-systematic “artistic” investigations remains a deep challenge: is this “research”?
This is a fairly abstruse or formalist MFA thesis (this is in no way a prescriptive choice: I believe I have substantial ethical & political commitments but do not currently see a way to honestly involve this MFA project with them). TITLE UNDECIDED may still have a specific educational use, simply in drawing attention to some culturally unfamiliar algorithmic models that are nonetheless coming to operate dramatically in the world, through search & translation engines, sentiment analysis, voice assistants, and so on. One hopes that, for example, “playing” with word-embeddings for the first time will give a reader a better sense of how the use of word-embeddings for sentiment analysis can encode racial and gender biases that are already expressed in the way Wikipedia or news articles are written. The idea that algorithms in general reproduce cultural formations is spreading, through books like Algorithms of Oppression, and it is also necessary to investigate the ways specific models enable this; I hesitate to claim my intended interventions are “tactical,” but I do think the asymmetry between our cultural awareness and the scope & scale of these language technologies is still so great that mere non-advertising exposure, visualization or “textualization,” is more likely to help than harm.
(limiting here to one per author; will format & expand citations properly upon request; all partially engaged with, but of course still a rather aspirational list for 5-6 months)
Theoretical (Critical / Analytical)
Mary Burger (ed.) – Biting the Error: Writers Explore Narrative
John Cayley – Grammalepsy
Wendy Chun – Programmed Visions: Software and Memory
Jacques Derrida – Of Grammatology
Paul Feyerabend – Against Method
Vilém Flusser – Does Writing Have a Future
Noah Wardrip-Fruin – Expressive Processing
Alexander Galloway – The Interface Effect
Douglas Hofstadter – Metamagical Themas
David Jhave Johnston – Aesthetic Animism
Saul Kripke – Naming and Necessity
Jean-Jacques Lecercle – The Violence of Language
Jean-François Lyotard, Jean-Loup Thébaud – Just Gaming
Nick Montfort – Twisty Little Passages
Lisa Nakamura – Digitizing Race: Visual Cultures on the Internet
Safiya Noble – Algorithms of Oppression
Willard Van Orman Quine – Word and Object
Bonnie Ruberg, Adrienne Shaw (ed.) – Queer Game Studies
Warren Sack – The Software Arts
McKenzie Wark – Gamer Theory
Ludwig Wittgenstein – Philosophical Investigations
Technical (Scientific / Formal)
J.L. Austin – How to Do Things with Words
Christopher Bishop – Pattern Recognition and Machine Learning
Shan Carter, David Ha, Ian Johnson, Christopher Olah – Four Experiments in Handwriting with a Neural Network
Noam Chomsky – Syntactic Structures
Herbert Enderton – A Mathematical Introduction to Logic
Daniel Jurafsky, James H Martin – Speech and Language Processing
Hans Kamp – “Discourse Representation Theory”
Andrej Karpathy – “The Unreasonable Effectiveness of Recurrent Neural Networks”
Irene Heim, Angelika Kratzer – Semantics in Generative Grammar
Donald Knuth – Digital Typography
H.P. Grice – “Logic and Conversation”
Saunders Mac Lane – Categories for the Working Mathematician
Richard Montague – “English as a Formal Language”
Ernest Nagel, James R. Newman – Gödel’s Proof
Jeffrey Pennington, Richard Socher, Christopher Manning – “GloVe: Global Vectors for Word Representation”
Hal Abelson, Gerald Jay Sussman – Structure and Interpretation of Computer Programs
Literary & Artistic
César Aira – The Musical Brain
John Ashbery – “Whatever It Is, Wherever You Are”
Pippin Barr – works on https://www.pippinbarr.com/category/games/
Charles Bernstein – Girly Man
Jorge Luis Borges – Labyrinths
John Cage – “Lecture on Nothing”
Italo Calvino – Cosmicomics
Anne Carson – Float
Inger Christensen – It
Alfred Jarry – Exploits and Opinions of Dr. Faustroll, Pataphysician
Darius Kazemi – Teens Wander Around a House
John Keene – Counternarratives
Milton Läufer – works on https://www.miltonlaufer.com.ar/
Jackson Mac Low – Pronouns
Harry Mathews, Alastair Brotchie (ed.) – Oulipo Compendium
Alice Notley – Grave of Light
Allison Parrish – Articulations
Everest Pipkin – picking figs in the *.garden
Joan Retallack – Afterimages
Emily Short – The Annals of the Parrigues
Gertrude Stein – Tender Buttons
Rosmarie Waldrop – Gap Gardening
Christine Wertheim and Matias Viegener (ed.) – The /n/oulipian Analects