Agent-to-agent pair programming
What if you could let Claude and Codex work together as pair programmers, talking to each other directly, with one of them as the main worker and the other as a reviewer?
It is amusing how the best agentic workflows often look a lot like human collaboration. Researchers at Cursor discovered this in their work on long-running coding agents, which led them to a multi-agent workflow with a main orchestrator assigning tasks to workers, much like how most human teams operate. The Claude Code “Agent teams” and Codex “Multi-agent” features work similarly, with subagents reporting back to the main agent. And in the future, subagents could interact with each other, like humans do.
I wanted to pursue this idea of mimicking human collaboration with multiple agent harnesses, applied to another workflow programmers know well: pair programming. While building a code review agent that used Claude and Codex side by side, I noticed something interesting: they gave different feedback, and when they did give the same feedback, the overlap wasn’t redundant noise -- it was a very strong signal. Our team addresses 100% of the feedback when both reviewers agree. Code reviews work well because they happen in a multiplayer app where humans and agents collaborate, but they slow down the feedback loop and can become noisy.
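That agreement signal is easy to act on programmatically: intersect the two reviewers’ findings and treat anything both flagged as must-fix. A minimal sketch, with made-up findings (none of these issues come from real reviews):

```python
# Made-up findings from two independent agent reviewers (illustrative only).
claude_findings = {"missing null check in parse()", "unbounded retry loop", "typo in docstring"}
codex_findings = {"missing null check in parse()", "unbounded retry loop", "unused import"}

# Both reviewers flagged these independently: a strong signal, address all of them.
must_fix = claude_findings & codex_findings

# Only one reviewer raised these: still worth a look, but lower priority.
maybe_fix = claude_findings ^ codex_findings

print(sorted(must_fix))
```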
That’s why I built loop: a dead-simple CLI that launches claude and codex side-by-side in tmux, with a bridge that lets them talk to each other. It makes the feedback loop faster and more natural, while preserving context across iterations. Because the interaction between the agents is more direct, they can be more proactive (and I expect that to only get better as the models do). And because loop runs the interactive TUIs, you can stay in the loop: steer, answer questions, and follow up if needed.
The future of agentic workflows may look less like magic automation and more like familiar teamwork. There is still a lot to learn about this pair programming workflow, starting with some open questions around making the human handoff and PR review easier:
- Should we split the work across multiple PRs?
- Should we share the PLAN.md in git or in the PR description?
- Should we share a screenshot or video recording as a proof of work?
Letting the agents loop can result in more changes than expected. Those changes are usually welcome, but they unfortunately make the human review harder.
A lot of people are using multiple agent harnesses for a variety of reasons: to avoid vendor lock-in, to use and contribute to an open-source project, to max out their subscriptions, or to get different perspectives, strengths, and results. Multi-agent harness apps should probably treat agent-to-agent communication as a first-class feature. I’d love to see them adopt this approach.
Try it out: https://github.com/axeldelafosse/loop
Thanks to Léna Deloizy Delafosse, Will Horn, Tian Wang and Ferruccio Balestreri for reading drafts of this.