Software 3.0 Is Coming. Long Live the AI Manager.
Why the most valuable skill in the age of AI isn't coding, but judgment.
TL;DR
I wanted to see what the future of software development really looks like, so I built a macOS app from the ground up using Claude Code. This hands-on test gave me a clear picture of what AI for software development is like today. Here’s the gist:
The Good Part: The AI is fantastic for getting a project off the ground. It handled all the boring boilerplate, helped me learn a framework I hadn't used in years, and made trying out new ideas incredibly fast.
The Hard Part: As soon as things got complicated, the AI required a lot of supervision. It behaved like a junior developer who is extremely fast but has no memory, getting stuck on the same bugs, “helpfully” breaking things that already worked, and constantly needing reminders about the project’s goals.
The Real Skill: My main takeaway? English isn’t the new programming language; good judgment is. The job is shifting from just writing code to being an “AI Manager.” Your role becomes setting the direction, defining what “done” looks like with things like tests, and being the ultimate quality check.
What This Means for Us: Instead of making developers obsolete, AI actually makes your experience and ability to see the big picture more valuable. The most important skill isn't how fast you can type, but how well you can think.

I’m building a startup in the AI space, so it’s safe to say I’m optimistic about the technology. But let's be honest, the hype has gotten a little ahead of reality. A popular idea, borrowed from Andrej Karpathy's "Software 3.0" vision, has been simplified into a catchy but incomplete slogan: "English is the new programming language."
While I agree we're in a new software paradigm, my experience shows it's far more nuanced. You can certainly use English to get things done, but the real skill isn't just talking to a machine; it's a new discipline I'd call agent engineering. This got me wondering: to what extent can this new craft actually replace the need for deep engineering expertise?
To find out, I set out to build a native macOS app using only Claude Code and ChatGPT, completely avoiding a traditional IDE. It had been a decade since I last built a desktop app, back in the days of XPCOM for Firefox extensions. My goal was an "Agent Workspace" to manage and compare agents from different model providers, orchestrate their workflows, and track their runs.
The Honeymoon Phase: Rapid Prototyping with Claude Code
The first few days were fantastic. For the first time in a long while, I experienced that pure joy of just willing software into existence. The usual friction of development, such as setting up boilerplate, fighting with syntax, or hunting through obscure documentation, simply wasn't there. Claude Code instantly filled my knowledge gaps in SwiftUI.
This wasn't just about moving faster; it was about maintaining creative momentum. I could stay focused on the what instead of getting bogged down in the how. Claude Code handled the rote work, which brought back a sense of fun that can sometimes get lost in the daily grind of professional coding.
Most importantly, iteration became incredibly cheap. I could pivot and explore different user experiences without that sinking feeling of wasted time. This ability to rapidly prototype is a powerful advantage, letting you explore a problem space at a pace that was previously out of reach.
The Reality Check: When Claude Code Gets Confused
Eventually, as the app's complexity grew, the honeymoon ended. Claude Code began to feel like an infinitely fast but naive junior developer who required constant supervision. The initial feeling of flow was often interrupted by the tedious reality of debugging and course-correcting.
Debugging AI-Generated Code: The Markdown Loop of Despair
My app needed to render Markdown, which should have been straightforward. The Markdown view was hidden, so I asked Claude Code to debug it and apply a fix. Instead, Claude Code got stuck in a loop, cycling through plausible but incorrect solutions such as native views, WebViews, and third-party libraries. This is a classic junior developer move: trying solutions from the internet without first understanding the root cause of the problem.
The debugging process itself was the most revealing part. Claude Code suggested diagnostic steps, added debug statements, and then asked me for the logs and screenshots. With that feedback, it correctly identified a view-sizing issue and understood why it was failing. But when I asked it to remove the debug code and apply the fix, it seemed to have forgotten the root cause, suggesting the same flawed approaches instead of making the one necessary change: setting a minimum height.
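To give a sense of scale, the entire fix was a one-line sizing constraint. A minimal SwiftUI sketch of the kind of change involved (the view and property names here are illustrative, not the app's actual code):

```swift
import SwiftUI

// Hypothetical Markdown container; names are illustrative.
struct MarkdownSection: View {
    let content: AttributedString

    var body: some View {
        ScrollView {
            Text(content)
                .textSelection(.enabled)
        }
        // The actual fix: without an explicit minimum height, the view
        // collapsed to zero inside its parent stack and appeared "hidden".
        .frame(minHeight: 120)
    }
}
```

Cycling through WebViews and third-party renderers was solving the wrong layer of the problem; the layout constraint was the root cause all along.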
The Context Chore: Keeping the Memory Updated
In Claude Code, you can provide a Claude.md file for project context, which is a useful idea in theory. I set one up at the start, detailing the app's goals, architecture, and my coding style preferences. While this seemed to help initially, I found it hard to know when to update it as the app evolved.
As someone who isn't the best at documentation, keeping this file current felt like a chore I kept putting off. Claude Code already had the necessary context from our ongoing conversations, but none of it was saved for long-term use. This felt like a missed opportunity, as it led to repeated mistakes, such as designing an API in a way I had previously instructed against.
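For readers who haven't used it, the file is plain Markdown that Claude Code reads at the start of a session. A sketch of the shape such a file takes for a project like this (the goals are real; the architecture and convention lines are illustrative assumptions, not the actual file):

```markdown
# Project: Agent Workspace (macOS, SwiftUI)

## Goals
- Manage and compare agents from different model providers
- Orchestrate agent workflows and track their runs

## Architecture (illustrative)
- SwiftUI views with an MVVM structure
- Provider SDKs wrapped behind a single API layer

## Conventions (illustrative)
- Prefer async/await over completion handlers
- Never expose provider SDK types outside the API layer
```

The catch described above is that nothing updates this file automatically: decisions made in conversation are lost unless you copy them back in by hand.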
Taming the Eager AI: From Unplanned Changes to "Plan Mode"
Claude Code also had a habit of making unsolicited changes, much like an eager developer who "helpfully" refactors code outside the scope of their task and breaks something else. Features that were working fine would suddenly stop after a seemingly unrelated change. This brittleness put the burden of correctness and QA right back on my shoulders.
Eventually, I learned to ask it to plan its changes before implementation, which helped. A helpful tip for this is enabling plan mode with Shift + Tab, so you don't have to ask every single time.

The Best-Case Scenario Fallacy
I should be clear that this experiment was a best-case scenario. It was a self-contained, greenfield project with no legacy code or complex integrations. Most real-world products don't have that luxury.
The issues I faced with context management and brittleness would be an order of magnitude greater in sprawling, 15-year-old monoliths. If this level of supervision is needed in a clean environment, it would be a significant obstacle in the messy reality of most systems. My experience represents the floor, not the ceiling, for the level of human judgment required.
The Real "Software 3.0": Rise of the AI Manager
This experiment helped me form a new mental model. The future isn't about replacing developers; it’s about elevating them into a new role: the AI Manager. This person provides the architectural vision, contextual judgment, and quality assurance that the AI, as an execution engine, lacks.
The "naive junior developer" metaphor is a good start, but it's not entirely accurate. A human junior has a world model and learns from their mistakes; an LLM does not. It's a statistical engine that performs pattern matching, and it lacks true causal reasoning and persistent memory between sessions. This makes the manager's role even more critical.
AI's Impact on Developer Workflow: Flipping the Ratio
In the past, a senior developer might spend 60% of their time on execution (coding) and 40% on architecture and strategy. My experience suggests AI coding assistants like Claude Code are flipping that ratio. They handle the first 90% of code generation, but in return, they demand that we dedicate 90% of our cognitive effort to supervision, strategy, and problem-solving. Your value shifts from typing speed to the quality of your judgment.
A New Kind of Pair Programming
The best mental model wasn't giving orders to a subordinate, but pairing with a brilliant, lightning-fast partner who has zero common sense or memory. This partner can write the code, but you have to provide the direction, context, and judgment. My workflow became asynchronous: I'd delegate a task to Claude Code, work on something else, and then come back to review the completed code at my convenience.
The Power of Verifiable Contracts
My role shifted from writing implementation code to defining the proof of correctness. Vague English instructions led to frustrating loops, but precise, machine-testable specifications created a clear target. Test-Driven Development (TDD) proved to be an especially effective tool for this.
By writing a failing test first, I gave Claude Code a verifiable contract. Its job was no longer the ambiguous "build this feature" but the concrete "make this test pass." This gave Claude Code a mechanism for self-correction, though it's not a perfect solution; TDD didn't help with my Markdown issue. Still, it transformed the workflow into an improved loop of automated execution and verification.
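A verifiable contract in this sense is just an ordinary failing test handed to the agent before any implementation exists. A minimal XCTest sketch, with hypothetical type and property names standing in for a real feature:

```swift
import XCTest

// Hypothetical contract: a run tracker must report elapsed duration.
// The test is written first, fails, and becomes the agent's target.
final class AgentRunTests: XCTestCase {
    func testRunDurationIsComputedFromTimestamps() {
        let start = Date(timeIntervalSince1970: 0)
        let end = Date(timeIntervalSince1970: 90)
        let run = AgentRun(startedAt: start, finishedAt: end)

        // "Make this test pass" is a far more precise instruction
        // than "build run tracking".
        XCTAssertEqual(run.duration, 90, accuracy: 0.001)
    }
}
```

The value is less in the test itself than in the conversation it replaces: the spec is now executable, so "done" is no longer a matter of opinion.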
Conclusion: Human Judgment Remains at the Core
It’s easy to write these issues off as temporary limitations of today's models. While the tools will certainly improve, assuming this will automate away the need for a manager misunderstands what software development is: translating ambiguous human intent into precise machine logic.
No matter how capable models become, a human with domain expertise and architectural wisdom will always be needed to resolve ambiguity and set the direction. The nature of that supervision will evolve from fixing syntax to defining strategy, but the necessity for it won't disappear.
This means our professional growth will be defined less by our coding speed and more by the quality of our judgment. As AI handles more boilerplate and scaffolding, every developer will become significantly more productive. This doesn't mean fewer jobs, but rather that more people will be empowered to build, and the total amount of software will grow.
The real shift will be in how we organize, with teams becoming smaller and more leveraged. In this future, the developer's primary role is to provide the strategic direction and creative oversight that machines can't. The most valuable developers of tomorrow will be the best AI Managers of today.
What’s your experience with AI coding assistants?


