Conclusions from 6 Months of AI Usage

Continuing from Conclusions from 4 Months of AI Usage. The best thing I've done since I started using these models has been the design brief workflow. Forcing the model to describe what it will do to the codebase in plain text has turned out to be the best middle ground between too much and too little detail, to the point where the custom tools for handling AI code that I said I needed to make in the previous post have become unnecessary. You need to actually read the brief, though, as mistakes often sit in the middle of a bunch of text, as an innocent assumption the robot shouldn't have made that's easy to skim past.

The other important realization has been that models will not reliably carry your intent across sessions, so intent has to present itself in the artifacts and structures the model interfaces with, not in instructions or external files it has to remember to load into context. In the case of code, this follows cleanly. A month ago I had an instance generate generic UI code for me, and I pre-loaded it with what I wrote in the emoji-merge tutorial, where I talked about gameplay code:

There are two types of gameplay code: action-based and rules-based gameplay code. Action-based gameplay code happens in games where most of the game's rules take place within game objects or when game objects interact. Most action and physics games are like this, for example: Spelunky, Risk of Rain, Hades, Isaac, Vampire Survivors, Fall Guys, etc. In most games like this, objects and interactions between objects are the primary way the game's design happens, and so it makes sense that there should be a 1:1 mapping between game objects and their representation in code. This means that these kinds of games are best coded using a primarily game-object-oriented approach.

Rules-based gameplay code, on the other hand, happens in games where most of the game's rules take place above game objects. Most turn-based games are like this, but also various simulation games, puzzle games, card games and strategy games. For example: Cities: Skylines, Slay the Spire, Artifact, FTL, Slipways, Mini Metro/Motorways, etc. In most games like this, high level game rules are the primary way the game's design happens, and so it makes sense that there should be a 1:1 mapping between those rules and their representation in code. This most often makes sense with a function-oriented approach, where ideally each rule is a function that does everything needed for that rule to work completely, and objects are mostly there as structs that hold data relevant to themselves and nothing more. In these games most of the gameplay code will be in the functions, and not in the objects, which is the opposite of the action-based games.
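
As a rough sketch of the two extremes described so far, in Python. Every name here (`Enemy`, `Card`, `Player`, `play_card`) is hypothetical and made up for illustration, not taken from any of the games mentioned:

```python
from dataclasses import dataclass
from typing import List


# Action-based: the rule lives inside the object it belongs to.
class Enemy:
    def __init__(self, hp: int) -> None:
        self.hp = hp
        self.dead = False

    def hit(self, damage: int) -> None:
        # How an enemy reacts to being hit is the enemy's own business.
        self.hp -= damage
        if self.hp <= 0:
            self.dead = True


# Rules-based: objects are plain data, and the rule is one function above them.
@dataclass
class Card:
    name: str
    cost: int


@dataclass
class Player:
    energy: int
    hand: List[Card]
    discard: List[Card]


def play_card(player: Player, card: Card) -> bool:
    """The whole 'playing a card costs energy' rule, in one place."""
    if card not in player.hand or card.cost > player.energy:
        return False
    player.energy -= card.cost
    player.hand.remove(card)
    player.discard.append(card)
    return True
```

In the first half you read `Enemy` to learn what a hit does. In the second half you read `play_card`, and `Card` and `Player` tell you nothing beyond what data they hold.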

Most gameplay code can be placed somewhere between those two extremes, and it is my claim that knowing exactly where each piece of your game falls on this spectrum, and where your game as a whole also falls on it, is what makes a game's code easy to read and work with, versus making it an unmanageable and confusing hellscape. If a problem clearly is of the rules-based type, forcing the rules into objects is a mistake that will make the game's code harder to reason about, because you'll effectively be dividing a rule that should be one thing across multiple objects. Conversely, if a problem clearly is of the action-based type, forcing the rule to be outside the object it belongs to will also be unnatural, because often the rules are about how objects react or feel when something happens to them, and coding most of that outside the object itself would be incorrect.

Most of the hard problems in gameplay code are problems that are right in the center of the spectrum, where both solutions are needed in different parts of the same problem. A good example of this is UI code. UI has high level rules that have to be outside any one object (e.g. behavior that happens when multiple objects are selected, or when frames can be moved by the user and have to reorder how other frames look), but each UI object also clearly has its own behaviors that can get quite internally complex. It's a perfect mix of needing both approaches, and people hate it because it's hard to context switch between both, since it's often hard to identify this distinction in reality in the first place. Retained mode UIs, for instance, are an example of an overly action-based solution. IMGUIs, on the other hand, try to turn the problem into a rules-based one entirely, which might work depending on the kind of UI work you have to do, but doesn't work as well whenever you need to do fundamentally action-based things with your UIs that require stateful objects to have more ownership of the rules.
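
A minimal sketch of that mix, in Python. It assumes a made-up vertical stack of frames, and `Frame`, `layout` and `move_frame` are hypothetical names rather than any real UI library's API. Each frame owns its own collapse behavior, while the reordering rule lives above all of them:

```python
from dataclasses import dataclass
from typing import List

TITLE_BAR_HEIGHT = 20  # assumed height of a collapsed frame


@dataclass
class Frame:
    """A UI object that owns its internal behavior (the action-based part)."""
    name: str
    height: int
    y: int = 0
    collapsed: bool = False

    def toggle_collapse(self) -> None:
        # How a frame reacts to being collapsed is the frame's own business.
        self.collapsed = not self.collapsed

    def visible_height(self) -> int:
        return TITLE_BAR_HEIGHT if self.collapsed else self.height


def layout(frames: List[Frame]) -> None:
    """Rule above any single frame: stack frames top to bottom."""
    y = 0
    for frame in frames:
        frame.y = y
        y += frame.visible_height()


def move_frame(frames: List[Frame], src: int, dst: int) -> None:
    """Rule above any single frame: moving one frame repositions the others."""
    frames.insert(dst, frames.pop(src))
    layout(frames)
```

Here `toggle_collapse` only touches the frame itself. Anything that changes how other frames look goes through `layout` or `move_frame`, so each rule stays in one place.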

It is tempting to think that what I'm saying can be expressed as "object oriented vs. functional" or "stateful vs. stateless", but that would be a mistake. You can have very action-oriented code written completely procedurally or even completely functionally, and you can have very rules-oriented code written entirely in one of those languages that only allows functions inside classes. It's more about the fact that a game design rule exists, and this rule needs to be represented in code. There is a way to express this (design rule, code) pair in a way that comes naturally to most human brains, and you could say that this way is the ground reality, or the truth of how the (design rule, code) pair should be expressed. In the same way that a structural engineer has to consider physical rules in his calculations so the building doesn't collapse, a gameplay coder has to consider the reality of each (design rule, code) pair so that his code doesn't get unmanageable.

Deviations from these truths will generate complexity, and I would argue that most complexity in gameplay code comes from failure to properly identify the truth of each (design rule, code) pair. When a (design rule, code) pair is far away from its truth, coding any further design rules that depend on it becomes a problem; it feels as though you are coding against something that is resisting. When a (design rule, code) pair is close to its truth, on the other hand, the feeling is completely different: everything else that depends on that rule simply flows naturally from it, as though the rule didn't even exist in the first place.

Most games have both types of rules in them, so whenever I'm coding something new I often ask myself: is this a more action-based game or a more rules-based game? And then further, what are this game's design rules, and then for each of those, is this an action-based rule or a rules-based rule? This offers a very nice and clean first cut for organizing your code, and I find that in lots of cases getting this right leads to prosperity, and getting it wrong leads to ruin. There is a reality to how gameplay code should be expressed, and that reality lives on this spectrum. Being able to identify it correctly is, to me, one of the most important skills I've developed so far, as this action-based vs. rules-based distinction has proven itself to be a useful way of thinking about gameplay code.

Given this initial framing, that instance was able to generate UI code that was unusually clean and faithful to these principles, and every subsequent instance, up to now, has ended up coding the same way, respecting the principles, even though none of them have read the text above. They know instinctively, because the code is written such that the right path is also the path of least resistance.

This means that first times are especially important. The first time you do anything new with the model, you're writing the template every future instance will copy. If the decisions here are good, everything will be a breeze; if they are bad, the mistakes compound until things eventually collapse. This is, in a sense, obvious, and I mentioned it in the previous post: the model just propagates whatever currently exists in the codebase. But I don't think it was quite obvious to me back then that the solution was just... just get it right! This wasn't visible to me then because I didn't have the design brief technology.

This problem also exists in a non-coding context, but the solution is slightly different. The two secret projects I'm working on have ability definitions that the player can read. I really like how Artifact words its abilities, so I had the AI read all Artifact cards and extract their wording rules. Some rules are hard, others are soft, but overall it's all very comprehensible. Then I asked the AI to follow the rules any time we create a new ability description. Five sessions and 20 abilities later, is the AI following the Artifact wording rules? No. Each session just has too much competing context, and loading up the rules just doesn't happen unless I explicitly ask for it, which I don't because I also just want to get the ability done quickly. What this means is that it's better to just let the AI word abilities however it will, and then dedicate a single session where you go through every ability at the same time, loading the rules at the start of the session. Batching for an eventual single session is better for long-running problems that are kind of soft and require some creativity, as eventually you'll have 10 of these going concurrently and the AI just can't keep up with it all.
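
The batching step can be sketched as a small Python helper that builds the opening message for that single session. This is only an illustration: `build_batch_prompt` is a hypothetical name, and it assumes the rules are kept as a list of strings and the abilities as a name-to-text mapping.

```python
from typing import Dict, List


def build_batch_prompt(rules: List[str], abilities: Dict[str, str]) -> str:
    """Assemble one prompt: wording rules first, then every ability to rewrite."""
    lines = ["Rewrite every ability below so it follows these wording rules:"]
    lines += [f"{i}. {rule}" for i, rule in enumerate(rules, start=1)]
    lines.append("")
    lines.append("Abilities:")
    # Sort by name so the batch is stable between runs.
    lines += [f"- {name}: {text}" for name, text in sorted(abilities.items())]
    return "\n".join(lines)
```

The point is that the rules are loaded exactly once, at the top, and everything that needs them follows in the same context.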

All this also means I've largely decreased my usage of Markdown files for plans, TODOs, and keeping track of things generally. The AI should own the low-level portions of the project, but just because it can generate a decent high-level plan, or an architecture review, it doesn't follow that those documents will be useful. You still need to have a complete picture of the project in your head, so often what the AI generates will go unused unless you already have it as information that your brain can recall quickly, and if that's the case the document wasn't needed in the first place. Your job is to be the manager, and as the manager you need the high-level view of the project in your head. Outsourcing that to the AI is a mistake.

In the end, I think I just re-learned what I already knew from doing things by myself, which is that there is a truth to code, or to words, or to any artifact created, and it's your job to find the right way to have that truth expressed. Once you find it, the model will follow that righteous path effortlessly, as it is aligned.