Ability Scores, Skills and Items

Sun Jan 11 2026


For a long time, coding felt like a craft practiced by hand. Engineers pored over pull requests line by line, choosing one expression over another, shaping code the way a carpenter sands a joint or a swordsmith hammers steel until it holds together just right. Tools helped, but they never led, and the work stayed grounded in close attention to detail. Recently though, working with Claude Code feels less like bench work and more like an RPG, where I’m giving high-level commands, juggling abilities, and watching entire moves resolve without me touching every step.

One way to understand how this happened is to look at how the tools evolved. For a long time, autocomplete was essentially mechanical: early IntelliSense felt helpful but shallow, good at finishing identifiers or method names, and clearly out of its depth beyond that. Around 2021, GitHub Copilot changed that by making autocomplete feel semantic rather than mechanical, so completion stopped being about characters and started being about intent. In 2023, IDEs like Cursor expanded that idea to larger scopes, where the AI wasn’t just finishing lines but editing files and responding to higher-level instructions. By 2024, models like Claude 3.5 Sonnet made this kind of assistance feel usable much more often, not because the job became easier, but because the model stopped collapsing as soon as the context got messy. And by early 2025, terminal-native agentic tools like Claude Code made it plausible to delegate real chunks of work: editing files, running tests, even pushing changes, while you stayed in the loop as a reviewer rather than the typist. Somewhere along the way, the unit of effort quietly shifted from writing code to deciding what should happen and then checking whether it actually did.

Lately, a typical day involves running three Claude Code agents at the same time, each working on a different task. I’ll prompt one, move on to another, come back to review the first, nudge it slightly, then repeat the loop. Many of these tasks would have felt too small to be worth automating before, back when we seriously weighed the cost of automation against its benefit. Now they’re cheap enough to delegate that the calculation barely comes up. My workflow isn’t as extreme as those of some engineers on Twitter (the creator of Claude Code runs five parallel Claudes!), but it already feels miles away from how software was typically written in the 2010s.
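Mechanically, the loop is simple enough to sketch. Here’s a rough illustration in Python, assuming the claude CLI is installed and using its non-interactive print mode (claude -p); the task descriptions and worktree paths are made up for illustration, and in practice each agent gets its own pre-created git worktree so their edits don’t collide:

```python
import subprocess

# Three independent tasks, each small enough that the old
# cost-of-automation question barely comes up. Hypothetical examples.
tasks = [
    "Fix the flaky timeout in tests/test_retry.py and re-run the suite",
    "Rename UserStore to AccountStore across the service package",
    "Add pagination to the /orders endpoint, mirroring /invoices",
]

# One headless Claude Code run per task: `claude -p` prints the result
# and exits instead of opening an interactive session. Each agent works
# in its own git worktree (assumed to already exist).
procs = [
    subprocess.Popen(["claude", "-p", task], cwd=f"../worktree-{i}")
    for i, task in enumerate(tasks)
]

# The human part of the loop: wait, review the diffs, nudge, repeat.
for proc in procs:
    proc.wait()
```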

There’s a running joke I now have with a coworker where we call anything written by hand “artisanal.” If one of us ends up writing something directly instead of delegating it, we’ll point at it and say, “this is artisanal.” It’s funny partly because of how strange the phrase sounds. Code had always been written by hand, until suddenly, over the last three or four years, that stopped being the default.

The mental model that fits this experience best for me isn’t craftsmanship or automation, though; it feels more and more like playing an MMORPG. In the games I grew up playing, like The Great Merchant (巨商), Mabinogi (瑪奇), SGO (星夢), and Fairyland Online (童話), progress was never just about effort or repetition. It depended on how you built your character over time, what you invested in, what you equipped, and how you adapted when the game changed. Once you look at modern software engineering this way, the mapping becomes fairly natural.

Ability scores set the limits. In games, base stats grow slowly and quietly, and they rarely draw attention unless you neglect them. They don’t make things flashy, but they constrain everything else you do. In software, they show up in familiar ways: Intelligence looks like the ability to reason about systems when information is incomplete; Wisdom shows up as judgment, knowing where to look when something breaks and when to stop digging; and Dexterity shows up in the precision of reading unfamiliar code, debugging without knowing where the bug is, and making changes that fit cleanly into what’s already there. Even Strength has its analogue, showing up as the raw ability to push through work: untangling a messy refactor, finishing an unpleasant migration, or carrying a task across the finish line when it’s too tedious for cleverness. These abilities aren’t something you turn on; they’re always present, shaping every decision you make, whether you notice them or not.

AI hasn’t replaced these fundamentals, and in many cases it has made them more visible. It acts less like a substitute and more like a multiplier, increasing the reach of whatever abilities are already there. As AI becomes more capable, small differences in fundamentals tend to matter more, not less. The ceiling is still set by fundamentals, even as the pace of work increases.

Items are tools: LLMs, IDEs, and CLIs that make AI usable. Better models clearly matter, whether that shows up as larger context windows, stronger reasoning, or better tool access, but they’re also external, replaceable, and change quickly. On the SWE-bench Verified coding benchmark, GPT-4.1 scored 54.6%, while GPT-5.2 reached 80.0% roughly eight months later, a reminder of how quickly the effectiveness of tools can shift. Just as importantly, different tools excel at different roles. Some models are better at vision and multimodality, others at sustained coding and refactoring, others at general conversation and synthesis. A wizard, a warrior, and an archer don’t carry the same weapons; here, stronger tools simply give your decisions more reach, and part of the skill is choosing the ones that fit how you actually work.

Skills sit somewhere in between. In practice, they show up as knowing how to use the tools well: delegating a change and then using /commit and /create-pr to turn it into a clean pull request, asking an agent to act as a code-simplifier to refactor and reduce complexity before review, or giving a model a video or screenshot as context so it can reason about a UI bug before touching the code. They also show up in smaller decisions, like constraining an agent so it only edits one file, stopping it early when it starts drifting, or deciding that a particular function is faster to rewrite by hand. These skills create leverage, but they’re contextual and temporary. They sit on top of fundamentals rather than replacing them, and they only work when you have a clear sense of what “good” actually looks like.
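To make the shape of these skills concrete, here’s a minimal sketch of what a code-simplifier might look like as a Claude Code subagent, assuming the .claude/agents/ convention of a markdown file with YAML frontmatter; the name, tool list, and instructions here are my own illustration, not a standard definition:

```markdown
---
name: code-simplifier
description: Refactors code to reduce complexity without changing behavior. Invoke before review, after a first working draft.
tools: Read, Edit, Grep, Glob
---

Reduce nesting, remove dead code, and extract well-named helpers,
without changing observable behavior. Only edit the files you were
pointed at. If a change would alter a public interface, stop and
report instead of guessing.
```

The useful part isn’t the file format; it’s that the constraints encode exactly the judgment calls above, what the agent may touch and when it should stop.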

Modern software engineering is now about this kind of RPG holy trinity, and ignoring any part of it comes at a cost. Engineers can move very quickly with powerful models and elaborate workflows while lacking the fundamentals to explain what’s happening when things go wrong, producing confident-looking output that’s difficult to reason about or repair. Others have strong fundamentals but treat AI as something to be avoided, using it reluctantly or not at all, and gradually turning themselves into the bottleneck as the surrounding environment changes. In game terms, the problem isn’t using too much gear or too little of it, but relying on equipment without understanding the mechanics, or insisting on approaching new encounters as if the old rules still apply.

This framing also makes constant change easier to accept. In live games, patches rarely invalidate skill outright, but they do favor players who understand why their build works over those who memorize a single setup. Software feels similar now. Models improve, assumptions expire, and surface details shift, but the engineers who adapt best are the ones who understand the mechanics underneath their tools and workflows, and can adjust their approach when those mechanics change.

Once I started seeing the work this way, it became obvious what actually carries over between patches. Fundamentals still set the ceiling. AI workflows feel more like temporary buffs: powerful, sometimes decisive, but never permanent. Speed matters, but without judgment it just makes mistakes happen faster. What still surprises me is how quickly this stopped being theoretical. The MMORPG framing sounds playful, but it now describes an ordinary day job in software engineering. Managing agents, shaping context, and reviewing outcomes has quietly become normal work, not a thought experiment about the future.

You’re a wizard of the Arcane Tower now, standing just off the lane. Runes flare, sigils lock, and spells queue up: Context Bind, Agent Summon, Interrupt, Rollback, cast not for damage, but for control. You don’t trade blows, you manage cooldowns, watch mana, cancel miscasts, and let carefully prepared magic ripple across the map. It’s genuinely fun, and there’s rarely been a better time to be a software engineer.