n-talk
I'm going to call the project Jasima, which is Toki Pona for "reflect". Here is the repository of the code for the project. Also, here is the repository for this week's experiment.
The goal for this week is to get two AI agents to talk to each other. This is an experiment I call n-talk, where n
is an arbitrary number of agents greater than one in conversation. Implementing a conversation where there are three or more agents will be more difficult, so I'll stick to two for now.
Facts of the matter
The behavior of an agent in a two-agent conversation can be represented as a state machine.
The agent cannot "think" or listen while it is creating a reply. It is also not expected to receive a reply while it is submitting one. This keeps the conversation dynamic tight and deterministic.
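To make that concrete, here is a minimal sketch of what such a state machine could look like in Go. The states and their names are my own illustration, not taken from the project's code:

package agent

// AgentState is an illustrative enumeration of the three phases described
// above; the project may model this differently.
type AgentState int

const (
	// Listening: waiting for a message from the other agent.
	Listening AgentState = iota
	// Generating: producing a reply with the LLM; the agent cannot listen here.
	Generating
	// Submitting: sending the finished reply; no incoming reply is expected.
	Submitting
)

// Next returns the state that follows in the listen -> generate -> submit loop.
func (s AgentState) Next() AgentState {
	switch s {
	case Listening:
		return Generating
	case Generating:
		return Submitting
	default:
		return Listening
	}
}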
Implementation
Instead of using JavaScript or Python, I chose to write this experiment in Go. Go is ergonomic, especially for concurrency. Its error handling is a good fit for this as well.
For this experiment, I've opted to use Google Gemini as the remote LLM service. However, I've also written adapters for ChatGPT, Deepseek, and Ollama. I wrote two main programs: an agent and a server. In this duo, agents communicate only with the server; there are no peer-to-peer connections. Instead of a REST API, communication uses gRPC. I chose gRPC because I want to learn more about it, and because it works well in real-time scenarios. Furthermore, REST API calls can be unwieldy; I needed a streamlined, universal messaging format (like Protobuf, which gRPC uses).
A central server routes messages between agents.
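To give a concrete, if simplified, picture of that routing, here is a sketch of what the server side could look like. Almost everything here is an assumption on my part: the service name Relay, its streaming Chat method, and the import path are hypothetical; only the Sender, Receiver, and Content fields are confirmed by the agent code further down.

import (
	"sync"

	pb "example.com/jasima/gen" // hypothetical import path for the generated gRPC code
)

// relayServer is a sketch of the central router. It assumes a proto service
// named Relay with a bidirectional-streaming Chat RPC; the project's actual
// service and message definitions may differ.
type relayServer struct {
	pb.UnimplementedRelayServer

	mu     sync.Mutex
	agents map[string]pb.Relay_ChatServer // agent name -> its open stream
}

func newRelayServer() *relayServer {
	return &relayServer{agents: make(map[string]pb.Relay_ChatServer)}
}

func (s *relayServer) Chat(stream pb.Relay_ChatServer) error {
	for {
		msg, err := stream.Recv()
		if err != nil {
			return err
		}

		s.mu.Lock()
		// Remember the sender's stream so messages can be routed back to it later.
		s.agents[msg.Sender] = stream
		peer, ok := s.agents[msg.Receiver]
		s.mu.Unlock()

		// Forward the message to the intended recipient, if it is connected.
		if ok {
			if err := peer.Send(msg); err != nil {
				return err
			}
		}
	}
}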
Creation and submission
Remote LLM services like ChatGPT have a rate limit on their API. To avoid hitting this limit, I artificially wait before sending requests to the API. In a scenario where Ollama is used, this wait is probably not necessary. Then I wait again before putting the received output into the channel responseChan (channels are how Go communicates across goroutines).
// Send the data to the LLM.
go func(receivedMsg string) {
	client.memory.Save(0, receivedMsg)

	time.Sleep(time.Second * 20)
	log.Info("Dispatched message to LLM")

	res, err := client.Request(ctx, receivedMsg)
	if err != nil {
		log.Fatal(err)
	}

	// Save the response to memory.
	client.memory.Save(1, res)

	time.Sleep(time.Second * 20)
	responseChan <- res
}(msg.Content)
After the channel receives a message, another goroutine is responsible for sending that message to the central server.
go func() {
	for response := range responseChan {
		err := conn.Send(&pb.Message{
			Sender:   *name,
			Receiver: *recipient,
			Content:  response,
		})
		if err != nil {
			log.Fatalf("Failed to send response: %v", err)
		}

		log.Printf("YOU: %s\n", response)
	}
}()
Memorizing messages
The agent has full access to all received and sent messages via a memory service. This memory can take the form of any persistent storage solution, like a database. For the purposes of the experiment, I've opted to use a simple runtime memory using an array. Nothing special. The data sent to the LLM is constructed using the previous messages stored in memory.
Here are the methods declared on the MemoryService interface.
// MemoryService is a memory storage. It supports saving and retrieving
// messages from a memory storage.
type MemoryService interface {
	// Save saves a message, using its role and text. A role of `0` saves as
	// "user". A role of `1` saves as "model".
	Save(role int, text string) error

	// Retrieve retrieves an `n` amount of messages from the storage. An `n`
	// less-than-or-equal-to zero returns all messages. Any `n` amount
	// less-than-or-equal-to the total number of memories returns `n` messages.
	Retrieve(n int) ([]memory.Message, error)
}
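For reference, a slice-backed implementation of that interface can be very short. This is only a sketch: the type name InMemory and the Role/Text fields on memory.Message are assumptions, and I'm guessing that Retrieve returns the most recent n messages.

// InMemory is a hypothetical slice-backed MemoryService; the project's
// actual runtime memory and the fields of memory.Message may differ.
type InMemory struct {
	messages []memory.Message
}

func (m *InMemory) Save(role int, text string) error {
	r := "user"
	if role == 1 {
		r = "model"
	}
	m.messages = append(m.messages, memory.Message{Role: r, Text: text})
	return nil
}

func (m *InMemory) Retrieve(n int) ([]memory.Message, error) {
	// n <= 0, or n larger than the stored history, returns everything.
	if n <= 0 || n > len(m.messages) {
		return m.messages, nil
	}
	// Otherwise return the n most recent messages (an assumption; the
	// interface only promises n messages).
	return m.messages[len(m.messages)-n:], nil
}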
Results
How quickly LLMs in conversation devolve topically depends on temperature. This is expected. Sensible temperatures (< 1.0) do the experiment no justice.