Sandbox Coding Agents Suck (right now)
Background sandbox coding agents like Cursor, Codex, and Copilot have not really taken off. As a YC founder who uses coding agents daily, here is why.
1) They cannot run the code or tests
If I see one more PR from Copilot that does not compile or fails a basic typecheck, I am going to lose it. What is missing: Coding agents should run on machines with the full developer environment. That means browser access, Sentry logs, Linear context, internal documentation, and debugging tools. Everything a software engineer at the company would have access to. Ramp figured this out, which is why they built their own internal tool. Cursor and Codex still fall short.
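As a rough illustration of that bar, here is a minimal sketch of a gate an agent could run before it opens a PR: execute the project's typecheck and test commands, and only proceed if every one exits cleanly. The names `CheckResult`, `run_checks`, `ready_for_pr`, and `EXAMPLE_CHECKS` are hypothetical, and the commands shown (mypy, pytest) are assumptions standing in for whatever the repository actually uses. It does not describe how Cursor, Codex, Copilot, or Ramp's tool works.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


# Hypothetical commands for illustration; swap in the repository's real ones.
EXAMPLE_CHECKS = {
    "typecheck": [sys.executable, "-m", "mypy", "."],
    "tests": [sys.executable, "-m", "pytest", "-q"],
}


def run_checks(checks: dict[str, list[str]], cwd: str = ".") -> list[CheckResult]:
    """Run each command in order and record whether it exited with status 0."""
    results = []
    for name, command in checks.items():
        proc = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
        results.append(
            CheckResult(name, proc.returncode == 0, proc.stdout + proc.stderr)
        )
    return results


def ready_for_pr(results: list[CheckResult]) -> bool:
    """A PR is allowed only when every check passed."""
    return all(result.passed for result in results)
```

An agent would call `run_checks(EXAMPLE_CHECKS)` inside the sandbox and feed any failing `output` back into its next iteration instead of opening the PR. This only works if the sandbox actually has the project's dependencies and tooling installed, which is the point of the paragraph above.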
2) Developers cannot easily access or control the agent’s machine
At some point in the future, we will not need to babysit agents. We will just point them at a problem and they will write perfect code. That is not reality today. Limited visibility and limited control bottleneck adoption, especially for fast-moving startups and enterprise teams.
Why I am ranting
Like Ramp, I found existing implementations from Cursor, Codex, and Copilot pretty underwhelming. Here's what I actually want:
- Multiplayer & handoff: I want to spectate teammates working on features and then continue their work in the same chat with all context preserved.
- Better UI/UX: TUIs are great for early adoption and hacker vibes, but they are not performant or accessible, even with modern terminals like Kitty or Ghostty. The web is good at this.
- True dev environments: Agents should be able to open a browser and inspect error logs directly. I spend an absurd amount of time taking screenshots and pasting console or API logs. Why are we still doing this in 2026?
- Control: I want to edit code in my IDE, SSH into the machine, and test features in my own browser. The lack of this is what keeps me chained to locally running agents today.
We solved this internally by building a tool called Chopin. We're going to open source it soon, because this is infrastructure you should not have to reimplement yourself. I'll be sharing updates. Follow if you're interested.