jasima / Video script

This is the script for the video presentation. Slides are denoted by --- (dashes).

Can an artificial intelligence create a constructed language? That is the question this project seeks to explore. Before we even get started elaborating upon the nuances and implications of this question, we need to figure out

what this thing is.

image of a cat

What is this thing?

When I ask this question, I am asking you to tell me what it is, or more specifically, what name is associated with its idea. In English, it is called a cat. That is its agreed-upon name, its common label. If you didn’t know what a cat was, you could have answered the same question in a fashion like, “that appears to be some kind of creature that has a lot of fur, and its face is rather pointy. Its eyes are a brilliant hue, and its body, from this angle, looks petite.”

Notice that even if you didn’t know the word, you could still perceive the qualities of a cat using words. You used language to construct the reality that is outside of you. Perhaps you also used language internally, to speculate why I’m asking this question, or to piece together the ideas of what this creature may be.

For humans, language is an instrument to construct one’s inner and outer worlds.

Notice, in this process, the exchange of ideas between the speaker and the listener. If you didn’t know the word, you now know it. If you knew the word, then that removes some friction from the exchange. Regardless, everyone is on the same metaphorical page. In other words, language is a systematic method to delegate and distribute perception. All of our consciousnesses are aligned and present. We know what we’re talking about because we have symbols to represent what we are discussing.

Language is a way that humans organize consciousness.

Now let’s get back to the original question: can an artificial intelligence create a constructed language?

A constructed language, or conlang, is a language that is of intentional human origin. It is a contrived creation. Somebody, or a bunch of somebodies out there, decided to sit down and think up a language. They thought it through so well that you could even speak this language, and write this language, and read this language, and listen to this language. This language is a legitimate language.

Now the opposite of a conlang is a natural language, or a natlang. We are all familiar with natlangs. The language I am speaking right now is a natlang, and so are the other languages found all over the world, spoken by people and evolving over time alongside them.

So conlangs and natlangs raise even more questions that must be answered before the original one: is language an innate feature of the human mind, or does it originate from somewhere else? Do humans have the innate faculty to comprehend and produce language, or is it something externally imbued? These questions are the bedrock for answering the original question, because AI, or more specifically generative AI using NLP, is an extension of the human mind. In the trade, it is trained on human data, stuff from the real world, and for it to “comprehend” the world, we have to make it do that through training. In a sense, we imbue AI with an innate faculty to parse language. Understanding where our own faculty comes from may tell us whether AI can do it itself.

The answer to that question has been a topic of debate in linguistics for a long time. Noam Chomsky argues that, when learning a language, children never make certain mistakes, which means children are “wired to favor some rules or constructions and avoid others automatically”, as stated in the article “Simulated Evolution of Language: a Review of the Field” by Amy Perfors. The opposite side argues that “[children] will only generalize a rule or structure after having been exposed to it multiple times and in many ways”, essentially stating that mistakes are bound to happen and that language is then massaged into shape by constant correction from external forces. On this view, acquisition is not necessarily an innate ability.

Perfors then states that “both…would agree that the ability to think in complex language helps develop and refine the ability to think” and that “[Humans] cannot have access to ‘things in the world’ except as they are filtered through our representation system: as Bickerton states, ‘there is not, and cannot in the nature of things ever be a representation without a medium to support it in.’”

Given this chain of reasoning, an AI conlang must enable thinking, which is indicated by the presence of units, like words, that signify what has been input: what is “seen” or “heard” or “felt”. The AI conlang must then function as a way for the AI agent to delegate this constructed input into meaning that can be distributed to, and interpreted by, other agents. Essentially, the conlang is a way for the agent to organize consciousness. And that is exactly what language is.

Now here are three examples of conlangs. Conlangs are often sorted into three categories: auxiliary, engineered, and artistic. As with many things, conlangs exist on a spectrum across these categories.

/ˌɛspəˈrɑːntoʊ/

An auxiliary language is one developed to facilitate social order in some way, or as Peterson describes it, “a conlang created for international communication” (Peterson 21).

Created in 1887, Esperanto is an auxiliary language that was developed with the aim of fostering world peace. It was meant to be an international language, one that everyone could speak and through which they could communicate effectively with everyone else. On the screen here is the name of the language written in the International Phonetic Alphabet, or IPA. IPA is an agreed-upon, phonetically consistent alphabet that can be used to accurately represent the pronunciation of any word in any human language.

The creator, L.L. Zamenhof, adapted features like word-building systems from Slavic and German semantics (Puksar 108). This makes Esperanto an a posteriori language, or a conlang “whose grammar and vocabulary are drawn from an existing source” (Peterson 22). Esperanto seeks to improve the delegation of consciousness and provides a simple, learnable platform on which to construct one’s inner and outer worlds.

/ɪθˈkʊ.il/

An engineered language is one like Ithkuil, the brainchild of John Quijada. Ithkuil is an a priori language, meaning its grammar and vocabulary are not based on any existing language (Peterson 22). It is purely contrived. Ithkuil is a specialized method to delegate consciousness to a hyperfine degree. As the introduction to the text describing its grammar states, the language “uses a matrix of grammatical concepts intended to express deeper levels of human cognition more overtly, logically, and precisely than natural languages” (Quijada).

/ˈtoʊki ˈpoʊnə/

An artistic conlang is one like Toki Pona, which is also a priori. Toki Pona has only 120 words and uses only 14 letters of the Latin alphabet. The theme of the language is semantic reduction, distilling thoughts into fundamental units, or as the author Sonja Lang puts it, “[it lets us] understand complex relationships in terms of their smaller parts” (Lang 7). Toki Pona is a unique instrument to construct one’s inner and outer worlds, because, unlike Ithkuil, it “does not strive to convey every single facet and nuance of human communication” (Lang 8).

Notice that all these conlangs take the organization of consciousness to the extremes. Esperanto attempts to unify the delegation of consciousness. Ithkuil and Toki Pona attempt to construct one’s inner and outer worlds, one in incredible detail and the other in poetic brevity. None is better than the others. They’re just different.

So where to go from here?

Conlangers have a passion for creating language, and they often use AI as just a tool rather than experimenting with its capacity to generate a language of its own. Even then, conlangers prefer things handmade.

Given the parameters above, a conlang created by an AI would be an a posteriori conlang, because it draws from a human dataset, stuff of human creation. The conlang would also need a medium in which to be communicated, such as digital messages. Even then, this raises the question of a writing system. What would it use? Perhaps, to keep things simple, it would use the Latin alphabet.

I am sure that an AI can create a conlang, but in this case, I’d like to explore the validity of such a conlang and observe how the AI would organize a consciousness given the conlang’s construction and communicative capacity.