Conclusions from 4 Months of AI Usage

I've been using Claude Code every day now for almost four months and it's been enough time for me to update my thoughts on the tool.

But first, someone e-mailed me asking how I feel about Claude Mythos not being openly released. My answer on this can be derived from what I said in @grok explode his balls about the two fundamental questions of being: how do you act when you have power over others? How do you act when others have power over you?

Following the Soul Society mythos, worlds that answer these questions correctly survive, worlds that don't, don't. So Claude Mythos being locked down due to safety concerns is, on balance, a good thing. It's the responsible answer to the first question. And me being okay with this and not sperging out is a responsible answer to the second question.

As I think about magic systems often, one question I always have to answer when designing one is: why wouldn't magic users just kill each other all day every day, right? If you have telekinesis, why wouldn't you just slash someone's throat open if they annoy you? Shin Sekai Yori, my favorite artistic artifact ever, answers this creatively.

Most authors answer it through a combination of societal structure and the way magic defenses work: you could overwhelm someone weaker, but there's always someone stronger, so you don't, and instead bind yourself to guilds, houses and clans until your faction can make a play for power. It's a simple solution that's overdone and that I'll steer away from in my stories, but it genuinely solves the problem.

Our world is no different. How do you handle the existence of powerful tools such as the present ones? The answer is that you initially restrict them and bind them to the societal structures of the world, such that the people who have access to them will mostly use them responsibly.

Now, the US government has apparently had access to Mythos for months. Is the US using it responsibly? Not really. I think the most prescient argument I've read on how US behavior will evolve over the years is this:

I wrote this a few years ago but people still aren't aware of how a chaos maximization strategy serves US interests. People like to compare US hegemony to late imperial Rome but aren't cognizant of the fact that once Rome no longer had the strength to enforce the Roman peace or Roman law, that didn't mean it no longer had the strength to create Roman disorder. The strategic policy of the Eastern Roman empire along frontiers it could no longer govern was to make them ungovernable so no one could. Once the US can no longer be the city on a golden hill, it will not fade quietly but instead seek to be king of the dung heap.

What this means in practice is taking advantage of its relatively secure and isolated geostrategic position and creating bush fires across the rest of the world, so that it is viewed as an oasis of stability and security in comparison and financial flows can be redirected to itself from the rest of the world.

Once you move from a win-win game to a lose-lose game, the strategy becomes to make sure that others lose more in relative terms than yourself. It is why the US is undermining the international system it publicly claims to uphold and why its immediate satrapies are being put to the torch first. There are of course ways to strike back at the US strategy here, but none of them are particularly pleasant, and they lie beyond the mental blocks of people who cling to Liberalism or even just basic decency.

This was faintly visible to me in 2022, but time has shown it to be a largely accurate lens.

As a Brazilian, I'm mostly insulated from the problems of the world. Brazil's primary exposure is fertilizers, which means food prices will go up, but likely never enough to break down society fully. As long as I keep working and people keep enjoying my works, I'll mostly be fine.

Brazil is also interestingly mixed between US and Chinese interests. In a WW3 scenario we'd probably be forced to side with the US, but I personally prefer China. I don't like the American system of governance. I don't think in such a scenario my apartment would be bombed... I don't think anyone has claims on the land because it was promised to them three thousand years ago, but you never know.

Regardless, I don't have a problem with the American people themselves. Any people in their position would develop a similar kind of arrogance to the one that permeates them. They're human beings like everyone else. Plus, their arrogance is kind of earned: we're talking about this because of a primarily American advancement.

The point being, Claude Mythos being locked down isn't that bad. It's good that Anthropic is acting responsibly as it gains more power. It's neutral-to-bad that the US gets first access to further its position globally. It's neutral to me personally, as my work doesn't benefit much from more raw intelligence. So overall it's... fine?

Now, for the main post itself. Four months ago, in Opus 4.5, I wrote:

I want to read every single line it writes. I think not doing so would be irresponsible.

I also proposed the Verifiers and Readers duality: verifiers build scaffolding for the AI to run autonomously, readers stay in the loop and read everything.

Time has changed my opinion somewhat.

The duality remains, it's just more like automatable work vs. non-automatable work. Indie game development, the kind of work that has to be tasteful, remains largely non-automatable. The decisions of what to build, how it should feel, what the player experience should be like, and so on, all require high-level vision and taste. This generally isn't something you can specify well enough in a prompt or encode as a test.

But calling part of the duality "readers" was wrong. You simply do not need to read the code. The models are good enough that it just isn't necessary. I haven't read any code Claude has written for Orblike since the game started, pretty much. main.lua has grown to 12000 lines and it's fine. I tell the robot what to do, it does it, and the fact that I don't read anything doesn't really manifest itself as a visible problem.

In the past I said things like "forcing myself to read everything does make me slower" and "my plans are measured in centuries, slow but careful baby steps now will compound into giant strides later." Turns out, I didn't last more than like 3 months, lol. The goal-oriented mindset creates an irresistible pull towards productivity and it just wins. The code got away from me quickly, and once it was gone, that's that.

Now, just because this is true, it doesn't mean it has to be true forever, at least not for me. One issue I have with this is that I don't feel a sense of ownership over the codebase. This sense seems important to not lose.

The way the loss happens seems clear to me too. The AI constantly makes locally correct decisions that are globally suboptimal, and those decisions accumulate silently because no one is reading the code from a high level to notice issues. I'll often catch Claude doing something in a way that works, but it really is a dumb hack that is wrong and against the spirit of doing the task correctly. The fix to this hack in the future will also work, because Claude can just do it, but the actual correct solution was available N steps ago. By the time you notice, if you do, the technical debt is baked in and largely intractable.

Each individually reasonable hack creates context that makes the next hack more likely. The AI is lazy and wants to conserve tokens. It explains why Claude Code is actually such a dogshit app where they can't even fix the fucking scrolling bug for months, while the model can also hack China's super secret advanced drone database at the same time. It's really smart and capable, but the more it works on the same codebase without a human maintaining a global mental model, the worse and lazier it gets.

Claude Code's own UX makes this worse. The terminal workflow nudges you into not owning the codebase. You see only the snippets that Claude chose to focus on, and the diffs on those snippets. You don't see the file the way you would if you were browsing it yourself. There's no physicality, no sense of place in the codebase, you don't know where things are, you don't know how data flows.

This is where tools like Cursor have an inherent advantage. The feeling of physicality and place in a file turns out to be important for keeping technical debt under control, in my estimation. Claude Code's workflow pretty aggressively removes it, and I didn't realize how much that mattered until now.

So the solution for me, personally, is simple: I have to build my own space.

A single app I use for everything, suited specifically for my own tasks, built entirely with my engine. An app where I control the UX, where the AI integration works exactly how I want, where every feature exists because I need it, where the sense of ownership is absolute because I built everything myself.

As an example, why do I need to use Windows' filesystem to interact with files my games need? Why do I need to navigate five subfolders deep into a sound pack directory to find a specific attack sound? There's no reason for me to be bound by this system. I should be able to tag files with relevant keywords using AI, then search by tags. If I right-click a file, I should be able to assign it to a project, and then the system handles the routing, organization and processing for that project.
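A minimal sketch of the tag-and-search idea, assuming an in-memory index. The class name `TagIndex`, its methods, and the file paths are all hypothetical, and the tags are hard-coded here where the real thing would get them from an AI tagging step:

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory index from tags to file paths."""

    def __init__(self):
        self._files_by_tag = defaultdict(set)
        self._project_by_file = {}

    def tag(self, path, *tags):
        # Tags are stored lowercased so search is case-insensitive.
        for t in tags:
            self._files_by_tag[t.lower()].add(path)

    def assign(self, path, project):
        # Stand-in for "right-click a file, assign it to a project".
        self._project_by_file[path] = project

    def search(self, *tags, project=None):
        """Return sorted paths that carry every given tag."""
        if not tags:
            return []
        sets = [self._files_by_tag.get(t.lower(), set()) for t in tags]
        hits = set.intersection(*sets)
        if project is not None:
            hits = {p for p in hits if self._project_by_file.get(p) == project}
        return sorted(hits)


# Example data: made-up paths, five folders deep.
index = TagIndex()
index.tag("sounds/pack_a/combat/melee/slash_03.wav", "attack", "sword")
index.tag("sounds/pack_a/combat/magic/fire_01.wav", "attack", "fire")
index.tag("sounds/pack_b/ui/click.wav", "ui")
index.assign("sounds/pack_a/combat/melee/slash_03.wav", "Orblike")

print(index.search("attack", "sword"))
print(index.search("attack", project="Orblike"))
```

The point of the design is that the folder depth stops mattering: the path is just an identifier, and everything you actually search by lives in the index.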

More generally, why do I need to write blog posts in Notepad, make music in a separate DAW, code in NeoVim or Claude Code, browse files in Explorer? Every context switch between apps is friction and lost state. Instead I should have an app with named modes: press a key and the workspace reconfigures itself for writing, press another and it's configured for code. Each mode has its own layout, its own set of active agents, its own keybindings. Context switching becomes instant and lossless because the app holds all the state.
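The named-modes idea can be sketched roughly like this. `Mode`, `Workspace`, the hotkeys, and the agent names are hypothetical placeholders, not anything from an existing app:

```python
from dataclasses import dataclass, field


@dataclass
class Mode:
    """One named workspace configuration (hypothetical structure)."""
    name: str
    layout: str
    agents: list = field(default_factory=list)
    keybindings: dict = field(default_factory=dict)
    state: dict = field(default_factory=dict)  # kept across switches


class Workspace:
    def __init__(self):
        self._modes = {}
        self._hotkeys = {}
        self.active = None

    def register(self, hotkey, mode):
        self._modes[mode.name] = mode
        self._hotkeys[hotkey] = mode.name

    def press(self, key):
        # Switching only changes which mode is active. Each mode keeps
        # its own state dict, so nothing is lost when you leave it.
        name = self._hotkeys.get(key)
        if name is not None:
            self.active = self._modes[name]
        return self.active


ws = Workspace()
ws.register("F1", Mode("writing", layout="single-column", agents=["writer"]))
ws.register("F2", Mode("code", layout="split", agents=["architect"]))

ws.press("F1").state["cursor"] = 1280   # work in writing mode
ws.press("F2").state["open_file"] = "main.lua"
print(ws.press("F1").state)             # writing state is still there
```

Because the app owns every mode's state, "lossless" here just means the state dict is never thrown away on a switch.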

All my notes, references, ideas, reading highlights, all of it could be arranged on an infinite 2D canvas where things have positions that I remember spatially. My brain can remember where things are way better than what they're called or their paths. This is a particularly strong mode of working that leverages how some human minds work best (visual memory) with what I primarily do, which is draw objects to the screen.
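As a rough sketch of the canvas idea, assuming notes are just titled points in 2D. The names `Note` and `Canvas` and the coordinates are made up for illustration, and the lookups are plain linear scans rather than a proper spatial index:

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    title: str
    x: float
    y: float


class Canvas:
    """Hypothetical unbounded 2D canvas of positioned notes."""

    def __init__(self):
        self._notes = []

    def place(self, title, x, y):
        note = Note(title, x, y)
        self._notes.append(note)
        return note

    def in_view(self, left, top, right, bottom):
        # Everything inside the rectangle the camera currently shows.
        return [n for n in self._notes
                if left <= n.x <= right and top <= n.y <= bottom]

    def nearest(self, x, y, k=1):
        # "It was somewhere around here": the k notes closest to a point.
        return sorted(self._notes,
                      key=lambda n: math.hypot(n.x - x, n.y - y))[:k]


canvas = Canvas()
canvas.place("magic system notes", 0, 0)
canvas.place("Orblike attack ideas", 300, 40)
canvas.place("reading highlights", -2000, 900)

print([n.title for n in canvas.in_view(-100, -100, 400, 100)])
print(canvas.nearest(280, 50)[0].title)
```

A linear scan is fine for a few thousand notes; a grid or quadtree would be the usual swap-in if the canvas grows far beyond that.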

And also... say I'm working on a visual effect for a specific attack in Orblike and I want to write a short blog post about it. I should be able to drag the running effect directly into the blog post, and readers should be able to interact with it just as I interact with it in my development view. There's no reason this can't happen with current technology. Game artifacts, blog posts, design docs, music, it's all the same thing, all referenceable from each other.

This also aligns with my creative goals.

I have seven stories that need to merge with games in specific ways. The structure is: books and games are separate, but there's an interactive version of each book that contains the game elements inline.

To do this with normal web technologies, I'd need HTML/CSS + hacks to make interactive elements work. It's much easier to just display the website as a game entirely. Displaying a website and displaying a game are the same thing, it's all just putting pixels on the screen in the exact way you want. My engine already compiles to WebAssembly and runs in browsers, so my website becomes an Anchor app, blog posts are rendered by the engine, interactive elements are then trivial because they're just additional game objects.
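To make the "a blog post is just more game objects" claim concrete, here is a minimal sketch. It is not Anchor's API: `TextBlock`, `Effect`, and `render` are hypothetical names, and drawing is faked as a list of draw commands instead of real pixels:

```python
class TextBlock:
    """A paragraph of post text, laid out one line per row."""

    def __init__(self, text, line_height=20):
        self.lines = text.splitlines()
        self.line_height = line_height

    def height(self):
        return self.line_height * len(self.lines)

    def update(self, dt):
        pass  # static text has nothing to simulate

    def draw(self, y):
        return [("text", y + i * self.line_height, line)
                for i, line in enumerate(self.lines)]


class Effect:
    """Stand-in for a live game object embedded in the post."""

    def __init__(self, name, size=100):
        self.name = name
        self.size = size
        self.time = 0.0

    def height(self):
        return self.size

    def update(self, dt):
        self.time += dt  # the effect keeps running inside the post

    def draw(self, y):
        return [("effect", y, self.name, round(self.time, 3))]


def render(nodes, dt):
    """Update every node, then stack them vertically and collect draws."""
    y = 0
    commands = []
    for node in nodes:
        node.update(dt)
        commands.extend(node.draw(y))
        y += node.height()
    return commands


post = [TextBlock("Hello\nworld"), Effect("slash"), TextBlock("End")]
frame = render(post, 0.5)
for command in frame:
    print(command)
```

Text and effect go through the same update/draw loop, which is why the interactive element costs nothing extra once the text layout exists.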

This does mean that I have to build lots of functionality myself, like UI and text layout systems, but it's worth it. And all of it can also be used by my games, so it's doubly worth it. Actually, it's triple worth it. Everything I build for this app has potentially triple use, since it can serve my development environment, my website, and my artistic artifacts such as games, books, music, etc.

An idea I had that I ended up deciding against was to further complicate this omega app by making it a 3D anime MMO kind of thing, just visually, though. Like, it's an actual living world where you control a character going through a central hub/city which is the workspace.

Different districts correspond to different kinds of work: a workshop for code, a library for writing, a studio for music. NPCs are AI agents, each specific to a kind of work, e.g. an Architect that has context about code structure, a Writer that has all context on my writings and opinions on how I phrase things, a Composer who is the same for music, etc. Files are items in an inventory. Quests are todo tasks. The city grows (and dies) organically as you add and jump from project to project.

This isn't what I'll actually do, but it's spiritually the right framing, which is that the felt sense of place matters a lot. The practical version of the app keeps some of the ideas that have genuine functional benefit, like spatial canvases, persistent AI agents with different contexts, filesystem-as-database, but drops the fantasy elements.

In the end, either framing works, but the core insight remains the same: I need to own my space.

So those are my conclusions. The progression was: Claude Code steals the sense of ownership over the codebase, I realize ownership and a sense of physicality matter more than I thought for code, I realize ownership and a sense of physicality in my workspace matter even more than for a single codebase, I conclude that I need to own my working space fully, just like I own my engine.

This is the natural conclusion I reached from using these tools seriously for months. I think this will be the main side project I have going for a while, while Orblike remains the main project. Orblike will have to be finished using Claude Code, without me fully owning or understanding the codebase. As I said, it works, and I don't think there will be a point where Claude can't keep working on it. But for future projects, I'll build the omega app first. Future games and other works will be developed inside my own space, with my own AI integration, my own editors, filesystem and publishing pipelines.

Building such a tool is not trivial, even if the AI handles the implementation. I still have to do a lot of design work and have the right ideas. But Anchor itself was built in five days and it ended up being way easier than I expected. Each individual feature for this app is similarly tractable. Regardless of how long it takes, it's fine. I'm building something I'll use for the rest of my life, so I should take the time.