Coding From Your Phone: Why AI Agents Went Asynchronous

When Cursor put a native iOS app into public beta this summer, available to everyone on a paid plan, my first instinct was to roll my eyes. Phone apps for developer tools are usually where good intentions go to die. You get a notification viewer, maybe a read-only dashboard, and a marketing bullet point. I have seen this movie a dozen times.

This is not that movie. The Cursor app lets you launch AI coding agents in the cloud from your phone, and, more interestingly, remotely control agents that are already running on your local machine. You can kick off a refactor from the train, check on a bug fix from the kitchen, or redirect an agent that wandered off course while you are standing in line for coffee. And once I actually used it that way for a week, I stopped thinking of it as an app and started thinking of it as a confession. The industry is quietly admitting that coding has become asynchronous, and the phone is simply the most honest interface for asynchronous work.

Think about what a phone is good at. It is terrible for writing code. Nobody is typing a binary search on a glass keyboard. But it is excellent at exactly three things: sending a short, clear message, reviewing something someone else produced, and getting interrupted at the right moment. Those happen to be the three core activities of managing a coding agent. You write the brief, you review the diff, you intervene when something smells wrong. The phone is not a compromised version of the desktop workflow. For this workflow, it is arguably the native device.

The deeper shift is that the agent no longer needs you in the room. You kick it off and walk away, the way an earlier generation walked away from a long compile, except the thing you left running is not mechanically executing your instructions. It is making decisions. That makes the mental model less “build server” and more “coworker you delegated to,” and I would encourage you to take that framing literally, because everything about how you should behave follows from it. You would not hover over a coworker watching every keystroke. You also would not hand a coworker a one-line ticket that says “fix the auth stuff” and act surprised when the result is wrong.

The skill that matters now is the brief, not the keystrokes

Here is my actual opinion, and it is the whole point of this piece: typing speed is now close to worthless as a differentiator, and the people who will pull ahead over the next two years are the ones who can write a good brief. A good brief states the goal, the constraints, the things the agent must not touch, and how you will judge success. It anticipates ambiguity instead of discovering it three hours later in a 900-line diff. When my agents produce garbage, it is almost always because my instructions were garbage. The agent did not fail. My delegation did.
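
To make those four parts concrete, here is a minimal sketch of a brief as a small Python structure. Everything in it is hypothetical: the Brief class, its field names, and the example task are my own illustration, not a format that Cursor or any other agent tool expects.

```python
from dataclasses import dataclass, field


@dataclass
class Brief:
    """Hypothetical container for the four parts of a brief named above."""
    goal: str
    constraints: list[str] = field(default_factory=list)
    do_not_touch: list[str] = field(default_factory=list)
    success_criteria: list[str] = field(default_factory=list)

    def missing_parts(self) -> list[str]:
        """Return the names of the parts that are still empty."""
        parts = {
            "goal": self.goal.strip(),
            "constraints": self.constraints,
            "do_not_touch": self.do_not_touch,
            "success_criteria": self.success_criteria,
        }
        return [name for name, value in parts.items() if not value]

    def render(self) -> str:
        """Format the brief as plain text to paste into an agent prompt."""
        lines = ["Goal: " + self.goal.strip()]
        sections = [
            ("Constraints", self.constraints),
            ("Do not touch", self.do_not_touch),
            ("Success criteria", self.success_criteria),
        ]
        for title, items in sections:
            lines.append(title + ":")
            lines.extend("- " + item for item in items)
        return "\n".join(lines)


# The one-line ticket versus a scoped brief (both tasks are invented examples).
vague = Brief(goal="fix the auth stuff")
scoped = Brief(
    goal="Reject expired session tokens in the login handler",
    constraints=["Keep the public API unchanged"],
    do_not_touch=["Database migrations"],
    success_criteria=["Existing auth tests pass", "A new test covers an expired token"],
)

print(vague.missing_parts())
print(scoped.render())
```

The point of missing_parts is only to catch the empty sections before the agent starts. It cannot tell you whether the sections you did fill in are any good.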

The second skill is review, and I mean real review, not the polite skim we all learned to give human pull requests. This is where asynchronous work gets genuinely uncomfortable, so let me be honest about it. When you watch code get written, you absorb context for free. You know why that guard clause exists because you saw the failing test that prompted it. When an agent works unattended for forty minutes and hands you the result, all of that context is gone. You are reviewing a stranger’s code with your own name on the commit. Approving it from a phone screen, with your attention split, is exactly how a subtle data-handling bug ships to production. My rule is simple: the phone is for steering and for rejecting, and anything I am going to merge gets reviewed on a real screen with real attention. If you take one practical habit from this article, take that one.

The third skill is knowing when to intervene, and this one is underrated because it cuts both ways. Interrupt too often and you have gained nothing; you are just pair programming through a smaller window. Interrupt too rarely and the agent burns an afternoon confidently building the wrong thing. Getting the checkpoint rhythm right, roughly the way a good manager checks in on a new hire, is a genuine skill, and almost nobody teaches it. In a companion piece on this site this week, I wrote about Claude Sonnet 5 becoming good enough to drive agents on its own, and that development makes this calibration more important, not less. The more capable the worker, the more damage a bad brief does before anyone notices.

The management problem nobody planned for

Now zoom out from your own workflow to a team of thirty, and the shape of the problem changes. If every developer is running two or three unattended agents across Claude Code, Cursor, and Copilot, who actually knows what is being built right now? Which agent wrote which code? Where is the review bottleneck? I am not surprised that a platform like Journi just launched DevOS specifically to help organizations measure and manage AI-assisted development across these tools. Agent use is spreading faster than anyone can track it, and when a whole product category appears to solve a problem, the problem is real. My contrarian take is that most teams do not need a dashboard yet. They need three boring agreements: agents get scoped briefs, agent code gets reviewed harder than human code, and someone owns every running agent by name. Tooling helps. Discipline helps more.
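
Those three agreements are simple enough to check without a dashboard. The sketch below is one hypothetical way to do it in Python: the RunningAgent record, its fields, and the idea of measuring "reviewed harder" by reviewer count are all assumptions of mine, and a team would substitute whatever it actually tracks.

```python
from dataclasses import dataclass


@dataclass
class RunningAgent:
    """One unattended agent as a team might record it (all fields hypothetical)."""
    task: str
    owner: str              # a named person, not a team alias
    has_scoped_brief: bool
    human_reviewers: int    # reviewers assigned to this agent's output


# Assumption: the number of reviewers the team already requires for human-written code.
HUMAN_CODE_REVIEWERS = 1


def broken_agreements(agent: RunningAgent) -> list[str]:
    """List which of the three agreements this agent currently violates."""
    problems = []
    if not agent.has_scoped_brief:
        problems.append("no scoped brief")
    # Crude proxy for "reviewed harder": strictly more reviewers than human code gets.
    if agent.human_reviewers <= HUMAN_CODE_REVIEWERS:
        problems.append("not reviewed harder than human code")
    if not agent.owner.strip():
        problems.append("no named owner")
    return problems


# Two invented examples: an orphaned agent and a well-managed one.
orphan = RunningAgent("migrate logging", owner="", has_scoped_brief=False, human_reviewers=1)
managed = RunningAgent("add retry to uploader", owner="Dana", has_scoped_brief=True, human_reviewers=2)

for agent in (orphan, managed):
    print(agent.task, "->", broken_agreements(agent) or "ok")
```

Reviewer count is a weak stand-in for review quality, which is exactly why the agreement matters more than the tooling.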

The platform vendors clearly believe this shift is permanent. At Microsoft Build 2026, Windows Agent Framework 1.0 hit general availability, which means agents are becoming an operating system primitive rather than an app feature. Microsoft also confirmed that Project Polaris, its own coding model, will replace the older model inside GitHub Copilot by August 2026. Read those two announcements together and the message is unambiguous: the largest developer platform on earth is rebuilding itself around software that works while you are not looking at it. Cursor’s iOS app is the consumer-visible tip of that same iceberg.

This is part of our ongoing Using AI Like a Pro series, and the pro lesson here is blunt. The craft is migrating up a level. It used to live in your fingers, in the speed and precision with which you turned intent into syntax. It now lives in your judgment, in how clearly you specify intent, how ruthlessly you review the output, and how well you sense the moment to step in. Plenty of developers will resist this because keystrokes feel like work and delegation feels like cheating. I understand the feeling and I think it is a trap. The engineers I respect most were never valuable because they typed fast. They were valuable because they knew what to build, what to reject, and what to leave alone. The phone in your pocket just became a legitimate tool for all three. Treat it that way: brief carefully, review harder than feels necessary, and let the agent do the typing. That part was never the job anyway.