Whos, Whats, and Hows

Academia has paid little attention to conlangs. In my research, I have found no experiments or case studies that evaluate or explore conlang evolution using simulation, especially with the involvement of AI. Conlangs are often relegated to a closed drawer, hidden away from the external world, or in the words of J.R.R. Tolkien, “a secret vice”. This marginalization is unfortunate, because conlangs are instruments that unlock unlimited opportunities for exploring ways of constructing thought. Even though existing research focuses on natlangs, there is much to learn from its technologies and experimentation techniques for studying the evolution of language in general, which can then be applied to the study of conlangs.

Structures of Approach

This section details structures of approach that others in the field have developed in order to experiment with language evolution. These are general concepts rather than implementations. They are the “what” and not the “how”.

The framework for conducting such an experiment is called a simulation. Rather than observing people use language over generations to verify a hypothesis and perhaps deduce a theory, computers are used to simulate such evolutionary processes to support a hypothesis and perhaps induce a theory. Cangelosi and Parisi state that “simulation models are implementations of theories, and as theories they aim at describing reality at some essential and necessarily simplified level because in science it is simplification that produces understanding” (13). Therefore, simulations of any kind of phenomenon are bottom-up rather than top-down experiments; scenarios are created and then observed.

Generational gaming

The common procedure for simulating the evolution of language is to create circumstances, like a problem or conflict, that incite and catalyze such a long-term process. This is done by creating something called a language game, which can be played by both humans and machines. Luc Steels describes a language game as a “routinized turn taking interaction”, and then describes the simple premise:

There is a shared cooperative goal… The speaker has a specific communicative goal, conceptualizes the world for language, and transforms this conceptualization into an utterance. The hearer must parse the utterance, reconstruct its meaning and map it into his own perceptual experience of the world. Games may fail in which case diagnostics and repair strategies are used by speaker and hearer to expand, adjust and align their language systems so that they may have more success in the future. (Steels 343)

This description has a few implications:

There is a shared cooperative goal…

  • All participants in a language game share the same goal. This means that to attain a goal, some kind of systematic method to delegate and distribute perception is needed. This is language.

The speaker has a specific communicative goal, conceptualizes the world for language, and transforms this conceptualization into an utterance.

  • The speaker, or transmitter, uses the instrument of language to bridge their inner world, the conceptualization of the communicative goal held in their mind, and their outer world, transmitting this conceptualization through a medium.

The hearer must parse the utterance, reconstruct its meaning and map it into his own perceptual experience of the world.

  • The hearer, or receiver, maps the utterances to their symbolic counterparts, effectively using language to construct a bridge between their outer and inner world.

Games may fail in which case diagnostics and repair strategies are used by speaker and hearer to expand, adjust and align their language systems so that they may have more success in the future.

  • Failure is determined by whether or not the aforementioned shared goal has been achieved.
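Steels’s routine can be sketched as a single game turn between two toy agents whose lexicons are plain dictionaries. All names here (`Agent`, `play_game`, and so on) are hypothetical illustrations, not Steels’s implementation; a real system would replace the dictionary lookups with learned models.

```python
import random

class Agent:
    """Toy agent with a lexicon mapping meanings to words."""
    def __init__(self):
        self.lexicon = {}  # meaning -> word

    def speak(self, meaning):
        # Conceptualize the goal and transform it into an utterance,
        # inventing a new word if none exists yet.
        if meaning not in self.lexicon:
            self.lexicon[meaning] = "w" + str(random.randint(0, 9999))
        return self.lexicon[meaning]

    def hear(self, word):
        # Parse the utterance back into a meaning, if known.
        for meaning, w in self.lexicon.items():
            if w == word:
                return meaning
        return None

    def repair(self, word, meaning):
        # Failure: align the lexicon so future games succeed.
        self.lexicon[meaning] = word

def play_game(speaker, hearer, meaning):
    """One routinized turn: speak, hear, and repair on failure."""
    word = speaker.speak(meaning)
    guess = hearer.hear(word)
    success = (guess == meaning)
    if not success:
        hearer.repair(word, meaning)  # diagnostics + repair strategy
    return success
```

A first game over a novel meaning fails (the hearer has no entry for the word), triggers repair, and the repeated game then succeeds, which is exactly the alignment dynamic Steels describes.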

Language games facilitate a type of learning called iterated learning. Iterated learning is “the process by which a behavior arises in one individual through induction on the basis of observations of behavior in another individual who acquired that behavior in the same way” (Kirby et al. 108). Iterated learning is the method through which an individual acquires their native language. So in a sense, the game of real life is the language game for humans.

It is worth noting that the book section Language Evolution with Deep Learning by Dupoux et al. uses the term communication games, defined as “a framework used to investigate how perceptual, interactive, or environmental pressures shape the emergence of structured communication protocols” (2). For the purposes of this project, I will use the term language games instead. The term “communication” defines too broad a discourse space: communication spans multiple domains beyond the linguistic, such as the gestures of the great apes, which indicate intention or attention (Tomasello 22).

Generational learning

In the setting of a computer simulation, in one generation a predecessor agent teaches a successor agent, much as a parent teaches their child a language. After this teaching, the generation concludes. A new generation starts, and the successor (the child) from the previous generation becomes the predecessor and teaches a new successor agent. This generational sequence continues an arbitrary number of times. A predecessor’s lifetime need not span a single generation: predecessors one to n generations removed can coexist with the present generation’s successors. Such variation in lifetime was explored in experiments in the paper Understanding Language Evolution in Overlapping Generations of Reinforcement Learning Agents by Lewis G. Brace and Seth Bullock.
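The generational hand-off can be sketched minimally, under the assumption that a teach step simply copies a predecessor’s lexicon with some chance of transmission error (the noise model and function names are illustrative):

```python
import random

def teach(predecessor, noise=0.1):
    """A successor induces the predecessor's lexicon imperfectly:
    each word has a small chance of mutating in transmission."""
    successor = {}
    for meaning, word in predecessor.items():
        if random.random() < noise:
            word = word + random.choice("aeiou")  # transmission error
        successor[meaning] = word
    return successor

def run_generations(seed_language, n_generations):
    """Iterated learning: each generation's learner becomes the
    next generation's teacher."""
    language = seed_language
    history = [language]
    for _ in range(n_generations):
        language = teach(language)
        history.append(language)
    return history

history = run_generations({"water": "aqa", "fire": "ipo"}, 100)
```

Even this toy chain shows the key property of iterated learning: small per-generation changes accumulate, so the final language can drift far from the seed while remaining a descendant of it.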

The advantage of generational learning is that a simulation can run several hundred or thousand generations of a single language in a short amount of real-life time. This is useful when conducting experiments that observe broad changes over long time spans. This project, however, is concerned with the creative aspects of a conlang as manifested by LLMs. There is no numerical measurement to be collected from the conlang’s change over time; what matters is the changes that manifest in the conlang itself: the final product. Therefore, multi-generational agent learning will not be a focal point of this project.

Agent interactions

The article Bias Amplification in Language Model Evolution: An Iterated Learning Perspective by Guo et al. explains an approach to language evolution using iterated learning. Learning can be done with just two agents, and consists of three phases (Guo et al. 3):

  1. Imitation: an ignorant agent learns something from a predecessor.
  2. Interaction: the same agent uses this new knowledge to execute a task (often conducted as a language game).
  3. Transmission: after completing the task, the agent generates useful data for the next generation of ignorant agents.
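The three phases can be sketched as one loop body with the phase boundaries made explicit. The `one_generation` function and its dictionary-based learner are placeholders of my own, not Guo et al.’s implementation:

```python
def one_generation(teacher_data, task):
    """One iterated-learning generation in three phases (after Guo et al.)."""
    agent = {}                                  # an ignorant agent

    # 1. Imitation: learn from the predecessor's transmitted data.
    for example_input, example_output in teacher_data:
        agent[example_input] = example_output

    # 2. Interaction: use the new knowledge to execute a task,
    #    here answering a batch of queries (a stand-in language game).
    answers = [agent.get(query) for query in task]

    # 3. Transmission: generate data for the next ignorant agent.
    next_data = [(q, a) for q, a in zip(task, answers) if a is not None]
    return answers, next_data

data = [("sun", "sol"), ("moon", "luna")]
answers, next_data = one_generation(data, ["sun", "moon", "star"])
```

Chaining `next_data` back into a fresh `one_generation` call yields the generational sequence described above, with anything the agent failed to answer silently dropped from the transmitted language.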

To recap, language evolution experiments are often conducted using computer simulations that represent simplified versions of realistic variables. Simulations have a beginning and end, and are divided into a certain number of epochs. In each epoch, one or more instances of a language game take place. The language game is the scenario created for agents in the simulation to exercise the theory being tested. Language games facilitate iterated learning. The process of iterated learning is what generates useful data for the next epoch to instantiate the next instance of the language game. An arbitrary number of epochs may take place before the simulation’s termination.

Implementation of Structures

Before significant advancements in portable processing power, genetic algorithms were used to simulate language evolution. Genetic algorithms do exactly what they are programmed to do and are inflexible when the signal-to-noise ratio is poor. Neural networks (NNs), on the other hand, are less terse and rigid, but lack the speed of purely algorithmic processes because they require more computing power. Since neural networks are based on models of the human brain, “they can focus on the influence of both cognitive and neural mechanisms on language development and evolution” (Cangelosi and Parisi 401). This trait makes NNs a favorable choice when modeling finer levels of linguistic detail, like the complex entwining of phonology and morphology, especially because they are significantly more resilient to noise in data processing: faults and errors in a simulation have less opportunity to propagate through generations. However, genetic algorithms can also be used to enhance the evolution of NNs; Cangelosi and Parisi used a genetic algorithm to evolve the connection weights of an agent’s neural network (404).
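The combination of the two approaches can be sketched as a genetic algorithm whose genome is the weight vector of a tiny one-layer network. The fitness function, mutation scheme, and all parameters below are illustrative, not Cangelosi and Parisi’s:

```python
import random

def fitness(weights, samples):
    """Score a one-layer linear network: negative squared error."""
    error = 0.0
    for inputs, target in samples:
        output = sum(w * x for w, x in zip(weights, inputs))
        error += (output - target) ** 2
    return -error

def evolve(samples, n_weights=2, pop_size=20, generations=300):
    """Evolve connection weights directly: rank, keep the top half,
    and refill the population with mutated copies of the survivors."""
    population = [[random.uniform(-1, 1) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda w: fitness(w, samples), reverse=True)
        survivors = population[:pop_size // 2]
        children = [[w + random.gauss(0, 0.1) for w in parent]
                    for parent in survivors]
        population = survivors + children
    return max(population, key=lambda w: fitness(w, samples))

# Evolve weights approximating the target function y = 2*x1 + 1*x2.
samples = [((1, 0), 2), ((0, 1), 1), ((1, 1), 3)]
best = evolve(samples)
```

Keeping the unmutated survivors each generation (elitism) is what prevents faults from propagating: a bad mutation can never displace the best weights found so far.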

Coming into being

The book section Language Evolution with Deep Learning also provides a significant case study into the design of linguistic deep learning models in a simulation, from high level overviews of deep learning concepts to basic implementations. One of those implementations is the communicative agent.

The entities that talk to each other in a language are agents. One type in particular is a communicative agent (CA), which is an agent that has faculty for communication behaviors and interpretation of meaning. CAs are composed of functional modules and an internal map of systematic representations that act like a switchboard for each module’s operations (Dupoux et al. 5):

  1. Perception module: maps observations of the environment to an internal representation.
  2. Action module: maps internal representations to actions.
  3. Generation module: maps internal representations to a message.
  4. Understanding module: maps a message to an internal representation.
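The four modules suggest an interface like the following. This is a structural sketch only: in Dupoux et al., each module is a learned deep network, whereas here each is a hand-written stub and the internal representation is a plain dictionary.

```python
class CommunicativeAgent:
    """Skeleton CA: four modules routed through an internal representation."""

    def perceive(self, observation):
        # Perception module: observation -> internal representation.
        return {"concept": observation}

    def act(self, representation):
        # Action module: internal representation -> action on the world.
        return ("point_at", representation["concept"])

    def generate(self, representation):
        # Generation module: internal representation -> message.
        return "msg:" + representation["concept"]

    def understand(self, message):
        # Understanding module: message -> internal representation.
        return {"concept": message.removeprefix("msg:")}

# Sender-receiver round trip through the internal representation.
sender, receiver = CommunicativeAgent(), CommunicativeAgent()
message = sender.generate(sender.perceive("red_circle"))
reconstruction = receiver.understand(message)
```

The internal representation acts as the switchboard described above: perception and understanding write into it, while action and generation read out of it.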

CAs fit into a sender-receiver framework. The aforementioned paper by Brace and Bullock explores this framework using reinforcement learning, in order to “offer insights into how lexical items become established within the context of the cultural evolution of human language in structured populations with overlapping generations” (1). CAs can send and receive lexical items, like words, within a particular circumstance, like a problem-solving situation, and within a milieu, like an environment where CAs can communicate with one another.

Iterated learning can be implemented with CAs, which can exist together in a simulated environment.

Putting the pieces together

One experiment by Guo et al. makes LLM agents (various GPT models, Claude, and Mixtral) “infer and generate [a] shared rule by summarizing several input-output pairs”. One generation is given a list of three input-output pairings, where each input is understood to map to its output. The agent must then induce the underlying rule well enough to predict the output for a new pairing of the same inputs in a different configuration. The simulated world, or another LLM, gives feedback to the agent, explaining whether the submitted rules are wrong and asking the agent to refine them (6). Similar to what I will be proposing, this paper uses already-trained LLMs as agents in a simulation.
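The propose-critique-refine loop can be sketched abstractly. Here `propose_rule` and `give_feedback` are hypothetical stand-ins for calls to an LLM agent and its critic, and the string protocol between them is my own invention, not Guo et al.’s:

```python
def refine_rule(pairs, propose_rule, give_feedback, max_rounds=5):
    """Iteratively refine a hypothesized input->output rule.

    `propose_rule(pairs, feedback)` and `give_feedback(rule, pairs)`
    stand in for LLM calls; any callables with these signatures work.
    """
    feedback = None
    rule = None
    for _ in range(max_rounds):
        rule = propose_rule(pairs, feedback)      # agent induces a rule
        feedback = give_feedback(rule, pairs)     # world/critic responds
        if feedback == "correct":
            break                                 # rule accepted
    return rule

# Toy stand-ins: the hidden rule doubles the input; the proposer's
# first guess is wrong and its second is right.
pairs = [(1, 2), (3, 6), (5, 10)]
proposals = iter([lambda x: x + 1, lambda x: x * 2])

def propose(pairs, feedback):
    return next(proposals)

def critique(rule, pairs):
    ok = all(rule(x) == y for x, y in pairs)
    return "correct" if ok else "wrong: outputs do not match the pairs"

best = refine_rule(pairs, propose, critique)
```

Swapping the toy callables for real model calls preserves the loop’s shape: the feedback channel, not the learner’s internals, is what drives refinement.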

Another experiment done by Dupoux et al. is a visual discrimination game using reinforcement learning. A sender sees an image and describes it to a receiver. The receiver then has to pick out the image the sender saw from among a set of other images. The original image is revealed to the receiver, and both the sender and receiver are rewarded according to their performance (16). This experiment does not use LLMs.
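Stripped of the deep networks, the reward structure of such a referential game looks like this. Images are reduced to string labels, and the lambda policies below are placeholders for learned sender and receiver models:

```python
import random

def discrimination_game(sender_encode, receiver_guess, images):
    """One round: the sender describes a target image; the receiver
    must pick it out of the candidate set; both share the reward."""
    target = random.choice(images)
    message = sender_encode(target)
    guess = receiver_guess(message, images)
    reward = 1.0 if guess == target else 0.0
    return reward  # in RL, this reward updates both agents' policies

# Perfect-channel stand-ins: the message is the image label itself,
# so the receiver can always recover the target.
images = ["cat", "dog", "tree"]
reward = discrimination_game(lambda img: img,
                             lambda msg, imgs: msg if msg in imgs else imgs[0],
                             images)
```

With learned policies the channel starts out uninformative, and the shared reward is the only pressure pushing sender and receiver toward a common code.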

Furthermore, an experiment by Brace and Bullock aggregates multiple agents from multiple generations in a single context. Agents are categorized as either mature or immature: mature agents have existed for more than one generation, while immature agents have existed only in the current one. Immature agents learn by observing the outcome of a language game in which two mature agents are the participants. After two generations, a mature agent is removed from the simulation. In each generation, agents in the immature group interact amongst themselves after observing the mature group interact. The experiment found that the success of evolution is proportional both to the number of participants and to how much those participants interact with each other (495). This experiment also does not use LLMs.
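The population turnover in that design can be sketched as a scheduler. The ages and the removal-after-two-mature-generations rule follow the description above; the agent records and everything else are illustrative, and the language games themselves are abstracted away:

```python
def step_generation(population, n_new=2):
    """Advance one generation of the overlapping-generations scheme:
    immature agents observe the mature agents, everyone ages, and
    agents retire after two generations as mature agents."""
    mature = [a for a in population if a["age"] >= 1]
    immature = [a for a in population if a["age"] == 0]

    # Immature agents can only observe if two mature agents can play.
    for agent in immature:
        agent["observed"] = len(mature) >= 2

    # Age everyone; an agent is immature at age 0, mature at ages 1-2,
    # and removed once it would enter a third mature generation.
    survivors = []
    for agent in population:
        agent["age"] += 1
        if agent["age"] <= 2:
            survivors.append(agent)

    # Fresh immature agents enter the next generation.
    survivors += [{"age": 0, "observed": False} for _ in range(n_new)]
    return survivors

pop = [{"age": 0, "observed": False} for _ in range(2)]
for _ in range(3):
    pop = step_generation(pop)
```

After a few steps the population reaches a steady state in which every generation contains overlapping cohorts of mature and immature agents, which is the structural condition the experiment studies.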

All the experiments presented above use English or an a posteriori morphology and syntax. The project I propose will use an a priori morphology and syntax. Any particular interaction in the proposed simulation will follow the interaction framework explained earlier.

Finally, one informal, ongoing experiment in the evolution of language among participants without a shared language is the Viossa project, an online, human community-driven conlang pidgin. International participants talk to each other in Viossa, which was not constructed with the intent of being an international language like Esperanto, but rather as a necessary way to communicate, like a typical linguistic pidgin such as Hawaiian Pidgin. There is no dictionary, translation, or grammar standardization. The bottom line is that English is not allowed and “if you are understood, you are speaking correctly”. This is the exact opposite of the experiments above, but it is still worth mentioning because it exemplifies the brute-force nature of language emergence amongst a population. It is the epitome of what a language simulation attempts to simulate.
