The Robot Never Tires

Letting agents do agent things for a few weeks, sometimes with adequate oversight.

May 04, 2026

I was not a fan of LLM agents¹ for a while, mostly because I've seen a lot of crap pushed out into the wild by a subset of their supporters.

I enjoy the process of crafting code, so seeing all this agent discourse took a toll on me for a few weeks and made me re-evaluate my position in the software engineering field... Then I tried Claude Code.

It was awesome.

I tried open models, they were also pretty good, and cheaper! I could even run some of them on my old gaming desktop²!

Communities formed and people started the race to the bottom: who's gonna distill the next model to run on a Core2Duo? We'll see!

geesawra

the open-weight model distills scene exhibit the same aura as early days XDA-Developers

The largest threat to the software engineer who works with agents is not deskilling, it's burnout: the agent is always working, but you, lowly human, cannot.

I spent a couple of days prompting the robots for fixes on my Android app from my phone, then waiting for the CI to push out an update.

Did it feel amazing? Yes.

Would I do that again? Hell no. I worked hard to put boundaries between me and my day job, and the last thing I need is my side projects taking over the little free time that I can still get a hold of.

Conversely, I think the mobile agentic coding scenario has legitimized the existence of foldable Android phones: longest product-market fit cycle I've ever seen!

geesawra

Claude Code remote will make foldable phone sales skyrocket, mark my words

The unexpected upside of using agents to do coding work is that it helped me hone my writing quite a bit.

Typing "please fix" into a Claude Code session isn't effective - if anything, it may be the most destructive thing you could do to your codebase - so I moved to a "spec"-heavy workflow, in which... I just write what would've been a detailed ticket for a coworker.

See a bug? Know how to fix it? Write it down in the most detailed way possible, but instead of putting it in the backlog, give it to the robot and see what plan it comes up with.

I found this approach to be incredibly effective: paired with a good harness and LLM, it can produce very good results that more often than not require little modification.
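A minimal sketch of what such a "spec" could look like, written as a small Python helper. Everything in it is an assumption made for illustration: the field names (`summary`, `steps_to_reproduce`, `proposed_fix`, `out_of_scope`), the example bug, and the closing instruction are all made up, and none of it is part of Claude Code or any other harness. The only point is that the prompt carries the same detail a ticket for a coworker would.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """A ticket-style spec; every field name here is a hypothetical choice."""

    summary: str
    steps_to_reproduce: list[str]
    expected: str
    actual: str
    proposed_fix: str
    out_of_scope: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the spec as plain text to paste into an agent session."""
        lines = [f"Summary: {self.summary}", "", "Steps to reproduce:"]
        lines += [f"{i}. {step}" for i, step in enumerate(self.steps_to_reproduce, 1)]
        lines += [
            "",
            f"Expected: {self.expected}",
            f"Actual: {self.actual}",
            "",
            f"Proposed fix: {self.proposed_fix}",
        ]
        if self.out_of_scope:
            lines += ["", "Out of scope:"]
            lines += [f"- {item}" for item in self.out_of_scope]
        # Ask for a plan first, so it can be reviewed before any code changes.
        lines += ["", "Propose a plan before changing any code."]
        return "\n".join(lines)


# Made-up example bug, for illustration only.
spec = Spec(
    summary="Timeline scroll position resets after rotating the device",
    steps_to_reproduce=[
        "Open the timeline and scroll down a few screens",
        "Rotate the device to landscape",
    ],
    expected="The timeline stays at the same post",
    actual="The timeline jumps back to the top",
    proposed_fix="Persist the list scroll state across configuration changes",
    out_of_scope=["Refactoring the timeline screen", "Changing dependencies"],
)
prompt = spec.to_prompt()
print(prompt)
```

The "Out of scope" section is there for the same reason it would be in a ticket: it tells whoever picks up the work what not to touch.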

I worry for open-source maintainers. We've already seen projects move to a source-available model because of the unbearable weight that LLM-generated pull requests put on them.

On one hand, agentic coding opened the floodgates to anybody with a credit card to git clone a repository and do their edits: that's cool.

On the other hand, 9 times out of 10 the changes aren't worth upstreaming, or even reviewing: it's crap that may work for a single user, but doesn't reach the quality threshold for everyone else.

I found myself in this situation in the past two weeks: I received my Xteink X3³, and roughly two hours after flashing the Crosspoint firmware I was already prompting Claude to build an Instapaper integration and fix out-of-memory errors due to large books.

It's wonderful and it works great, but since I'm not in a position to review several hundred lines of Arduino C++ with good judgement, I decided not to open PRs for any of those changes and to keep a personal fork instead.

This may come across as selfish, but as someone who has reviewed - and reverted! - several slop PRs at my day job, I'd rather not inflict this pain on anybody else.

Maybe I'll find some time to go through the code with a fine-tooth comb in the future, or maybe the Crosspoint folks will pick up my slack and write a better version themselves.

I'm fine with either.

Anyone can grab it, but I'm not gonna pretend it's fit for anyone else. If my slop breaks, I'm the one dealing with it.

Malleable software is powerful, but should be used responsibly.

geesawra

see, this is the kind of LLM-driven workload that i would not feel confident pushing upstream
i know _a few things_ about how elf loading works, but since I can't reliably verify that what claude wrote isn't crap I won't push it
last thing i want is to contribute to maintainer burnout

My position about agentic coding after a month of intensive use:

Claude Code is good, but it's not the only agent harness out there: alternatives like Pi are better suited to the conscious software engineer who's looking for coding assistance, not intellectual offloading.
Starting a codebase from scratch by only using agents is a recipe for failure. Agents need context, and most importantly they don't know anything about your style preferences. Design and implement the basics, document thoroughly, then let the agent loose.
Just because someone's writing code for you doesn't mean you shouldn't have a say in what they write. Reviews are a must, especially if you're working in a team.
Have fun! Vibe-coding is a fantastic tool... that should be wielded consciously. Don't push your slop on the rest of us just because.
Don't underestimate open/local models. The gap between e.g. Kimi K2.6 and Claude Opus 4.6 is much, much smaller than you think if used with the right harness.

Agents are an awesome tool to work on that side-project that you were never gonna finish. They reignited my passion for a project that I started a few months back but never had the willpower to finish, or at least bring to a passable MVP status.

Tales of a native Bluesky Android client - Gee's Sawras

It took a while but the code looks much better now Kotlin is cool but it still smells too much like java It has Result<> though!

https://geesawra.industries/3m2k7hgb4hs25

But they are also a tool of unfathomable pain when used wrong.

¹ Also called harnesses: tools that "wrap" an LLM into an interactive tool. Harnesses are often developed as CLI tools or IDE extensions.

² Quality varies based on how much GPU memory you have, but nowadays you can run good models in 16GB of VRAM.

³ The coolest e-ink reader out there.

My favorite Computers are e-ink

llms

agents

feelings

Gee's Sawras

This is not my actual name btw