Practice: Multi-agent integration patterns
Self-check questions
Answer each in your own words first, then open the answer to check.
Q1. Describe the three integration patterns in one sentence each.
Show answer
- Shared origin: all agents push to one remote, lead integrates there.
- Per-agent fork: each agent on its own remote, lead pulls from each fork.
- Shared worktrees: all local, nothing pushed until integration final.
Q2. Which pattern is the default starting point for most multi-agent workflows, and why?
Show answer
Shared origin. It is the simplest setup and uses standard tooling without special configuration. Branches are visible for other-human review. It matches the workflow most teams already know.
Q3. Why is per-agent fork the right choice when working with untrusted or external agents?
Show answer
Quarantines the agent’s work on a separate remote. The team’s main remote is never polluted with unvetted code. The lead can vet the work freely (including discarding it entirely) without leaving traces. Maps to the open-source forking workflow where external contributors push to their own forks.
Q4. What’s the disadvantage of shared-worktrees vs the other two patterns?
Show answer
No backup, no remote visibility during the work. If the lead’s machine dies, the work is gone. Other team members can’t see what agents are doing until the lead pushes. Doesn’t scale to teams with multiple concurrent fleets.
Q5. List the seven responsibilities of the lead orchestrator role.
Show answer
Plan, spawn, watch, integrate, catch semantic conflicts, push gate, cleanup.
Q6. What’s a semantic conflict, and why can git not detect it?
Show answer
Code merges cleanly (git produces no line-level conflict) but behaves wrong because two agents had inconsistent mental models. Examples: renamings, duplicate utilities, mismatched field names, mismatched return types. Git can’t detect them because they cross file boundaries without overlapping at the line level.
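The rename case can be made concrete with a small static check. The sketch below is an illustration, not a substitute for running the tests: it treats a dictionary of file names to source text as the merged tree and reports plain-name calls that no file defines. The file names `stats.py` and `report.py` and the function `dangling_calls` are hypothetical.

```python
import ast
import builtins


def dangling_calls(sources):
    """Return names that are called somewhere but defined nowhere.

    `sources` maps a file name to its text, standing in for the merged tree.
    Only plain-name calls such as f(x) are considered. Attribute calls are
    ignored, and names imported from outside the tree would show up as
    false positives.
    """
    defined = set(dir(builtins))
    called = set()
    for text in sources.values():
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                defined.add(node.name)
            elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                called.add(node.func.id)
    return called - defined


# Hypothetical merged tree: agent A renamed the function,
# agent B still calls the old name. The two edits touch different files,
# so git sees no overlapping lines and merges them without complaint.
merged = {
    "stats.py": "def compute_mean(xs):\n    return sum(xs) / len(xs)\n",
    "report.py": (
        "from stats import compute_average\n\n"
        "def summary(xs):\n    return compute_average(xs)\n"
    ),
}

print(dangling_calls(merged))  # {'compute_average'}
```

A check like this only covers the rename case. Mismatched field names or return types need tests or a human reading the diff.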
Q7. List the three layers of defense against semantic conflicts.
Show answer
- Tight agent scope: reduce overlap to reduce conflict surface area
- Integration tests at each merge step in the integration sequence
- Lead reads the integrated diff before pushing: judgment work that no test framework substitutes for
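The second layer, testing after every merge rather than once at the end, can be sketched as a loop. This is a minimal illustration under stated assumptions: `merge` and `run_tests` are hypothetical callables supplied by the lead (for example thin wrappers around `git merge` and the project's test command), and the fake versions below exist only to make the example runnable.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def integrate_in_sequence(
    branches: Sequence[str],
    merge: Callable[[str], bool],
    run_tests: Callable[[], bool],
) -> Tuple[List[str], Optional[str]]:
    """Merge branches one at a time, running the tests after every merge.

    Returns (branches integrated cleanly, first branch that failed or None).
    Stopping at the first failure pins the problem to one merge step.
    """
    integrated: List[str] = []
    for branch in branches:
        if not merge(branch):        # git-level conflict
            return integrated, branch
        if not run_tests():          # merged cleanly but behaves wrong
            return integrated, branch
        integrated.append(branch)
    return integrated, None


# Fake collaborators for illustration: every merge succeeds,
# but the test suite breaks as soon as agent-2/y is in the tree.
merged_so_far = []


def fake_merge(branch):
    merged_so_far.append(branch)
    return True


def fake_tests():
    return "agent-2/y" not in merged_so_far


done, failed = integrate_in_sequence(
    ["agent-1/x", "agent-2/y", "agent-3/z"], fake_merge, fake_tests
)
print(done, failed)  # ['agent-1/x'] agent-2/y
```

Because the loop stops at the first failing step, the lead knows which boundary to inspect before reading the diff.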
Q8. What did the Clawless 2026-06-04 sprint teach about per-agent tests vs the lead-stage build?
Show answer
Per-agent tests are NECESSARY but NOT SUFFICIENT. The lead-stage build (full-system, after all merges) caught ~300 latent breakers that per-agent tests had missed. The build gate at integration is the most important guardrail.
Q9. When should you abort and re-run an agent vs repair its branch by hand?
Show answer
Abort and re-run when:
- The agent misunderstood its scope and a refined prompt would clearly help
- The bad work is more than a few quick fixes
- The work is small enough that re-running is cheap
Repair the branch when:
- The work is mostly good with targeted issues
- The fixes are small and obvious
- Re-running would lose work that’s worth keeping
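The two checklists above can be written down as a small decision helper. This is one possible ordering of the criteria, offered as an assumption rather than a rule from the source: the field names and the precedence between them are hypothetical, and the inputs remain the lead's judgment calls.

```python
from dataclasses import dataclass


@dataclass
class BranchAssessment:
    """The lead's judgment about one agent branch (hypothetical fields)."""
    misunderstood_scope: bool      # a refined prompt would clearly help
    fixes_small_and_obvious: bool  # only targeted issues remain
    rerun_is_cheap: bool           # the task is small enough to redo
    has_work_worth_keeping: bool   # re-running would discard good work


def abort_or_repair(a: BranchAssessment) -> str:
    """Return "repair" or "abort-and-rerun" (assumed precedence, see lead-in)."""
    if not a.fixes_small_and_obvious:
        # The bad work is more than a few quick fixes.
        return "abort-and-rerun"
    if a.has_work_worth_keeping:
        return "repair"
    if a.misunderstood_scope and a.rerun_is_cheap:
        return "abort-and-rerun"
    return "repair"


print(abort_or_repair(BranchAssessment(True, False, True, False)))   # abort-and-rerun
print(abort_or_repair(BranchAssessment(False, True, False, True)))   # repair
```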
Q10. Why is the build gate at integration time “non-negotiable”?
Show answer
It catches what nothing else can. Per-agent tests catch individual-agent issues; integration tests catch boundary issues at the merge level; the build gate catches the cumulative interactions of all agents’ work as a complete system. Skipping the build gate has historically been the source of multi-agent integration bugs that shipped.
Pattern selection drill
For each situation, identify the right integration pattern and justify the choice:
- A solo developer running 4 agents on their laptop for a 2-hour focused sprint
- A 10-person engineering team where one engineer at a time runs a multi-agent fleet and the team wants to peer-review agent work before integration
- A team piloting agents from an external AI services provider whose code quality is not yet trusted
- A startup where the founder runs nightly multi-agent fleets and other team members occasionally inspect the agent branches in the morning before final integration
- A maintainer of a large open-source project receiving AI-assisted PRs from external contributors
Semantic-conflict spotting drill
For each scenario, identify the type of semantic conflict and propose how the lead might catch it:
- Agent A renames compute_average to compute_mean. Agent B writes a new function that calls compute_average.
- Agent A adds a new field created_at to the User model. Agent B adds a new endpoint that returns User with field creation_time.
- Agent A adds a database migration adding column X. Agent B adds a database migration dropping column Y. The migrations must run in a specific order; the agents’ code assumes their own migration runs first.
- Agent A adds a utility format_currency(amount, code). Agent B adds a different utility format_money(value, currency) doing the same thing.
- Agent A changes the return type of get_user() from User to Optional<User>. Agent B writes new code calling get_user() and using the result directly without null-checking.
Mock fleet integration drill
Simulate a 3-agent fleet locally:
- Create a fresh repo. Make an initial commit on main.
- Create 3 worktrees with new branches:
    git worktree add -b agent-1/x ../wt-1 main
    git worktree add -b agent-2/y ../wt-2 main
    git worktree add -b agent-3/z ../wt-3 main
- In each worktree, deliberately introduce changes:
  - wt-1: Add a file lib.py defining function add(a, b) returning a + b.
  - wt-2: Add a file app.py that imports from lib and uses add. (Note: lib.py exists only on wt-1’s branch right now; wt-2’s worktree won’t see it until merged.)
  - wt-3: Add a file tests.py that tests the integration.
- Commit in each worktree.
- From main repo, integrate:
    git checkout -b integration main
    git merge agent-1/x
    git merge agent-2/y # should succeed
    git merge agent-3/z # should succeed
- Run the tests. Did everything work?
- Now make a SEMANTIC conflict: edit one of the agent branches’ files to use a slightly-different name (e.g. wt-1 calls it add but wt-2 calls it sum_two). Re-do the integration. Tests fail. Catch it.
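To preview what the last two steps should show, the drill's three files can be simulated without git. The sketch below writes lib.py, app.py, and tests.py into a scratch directory and runs tests.py in a subprocess. The file contents are assumptions for illustration: the drill does not specify what app.py exposes, so the `total` function and the single assertion in tests.py are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical file contents for the three agent branches.
LIB_ADD = "def add(a, b):\n    return a + b\n"
APP_USES_ADD = "from lib import add\n\ndef total(a, b):\n    return add(a, b)\n"
APP_USES_SUM_TWO = (
    "from lib import sum_two\n\ndef total(a, b):\n    return sum_two(a, b)\n"
)
TESTS = "import app\n\nassert app.total(2, 3) == 5\n"


def integration_passes(lib_src, app_src, tests_src=TESTS):
    """Write the three files side by side and report whether tests.py exits 0."""
    with tempfile.TemporaryDirectory() as tmp:
        files = (("lib.py", lib_src), ("app.py", app_src), ("tests.py", tests_src))
        for name, text in files:
            with open(os.path.join(tmp, name), "w", encoding="utf-8") as fh:
                fh.write(text)
        result = subprocess.run(
            [sys.executable, "tests.py"],
            cwd=tmp,
            capture_output=True,
            text=True,
        )
        return result.returncode == 0


print(integration_passes(LIB_ADD, APP_USES_ADD))      # True: names agree
print(integration_passes(LIB_ADD, APP_USES_SUM_TWO))  # False: app.py imports a name lib.py never defined
```

In both runs the files combine without any textual conflict, which is the point of the drill: only executing the integrated result exposes the mismatch.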
Scenario reflections
Scenario A: Your team is starting multi-agent workflows for the first time. What’s your recommended starting pattern, and what process discipline do you put in place from day one?
Show answer
Start with shared-origin. Process discipline:
- Tight agent scope, non-overlapping when possible
- Integration tests covering boundaries between agent scopes
- Lead-stage build at the end of each integration sequence
- Lead reads the integrated diff before pushing
- Cleanup after each sprint (remove worktrees, delete branches)
Document the workflow. Train the lead role explicitly. Track metrics: integration time, semantic-conflict frequency, agent rework rate.
Scenario B: You’re integrating 6 agent branches. Three integrate cleanly with passing tests. The fourth merges cleanly but fails 12 tests. Walk through your decision process.
Show answer
Decision tree:
- Read the test failures. Are they git-level conflicts (the merge had problems)? Or semantic (merge clean, behavior wrong)?
- If semantic, are the failures concentrated in one boundary (agent-3 + agent-4)? If so, that boundary is the suspect.
- Investigate the diff for agent-4 vs the integration of agents 1-3. Look for renamings, mismatched assumptions.
- If the fix is small and obvious: repair agent-4’s branch and continue.
- If the fix would be substantial: ABORT agent-4, refine its prompt to address the misunderstanding, re-run.
Don’t sink hours into rescuing a confused agent. 20 minutes of re-running often beats 2 hours of hand-repair.
Scenario C: Your team has been doing multi-agent fleets for 3 months. You’re seeing a recurring pattern of duplicate utility functions across agent branches. How do you address this at the process level?
Show answer
Process-level fixes:
- Prompt agents with an explicit “before adding a utility function, check if one already exists” instruction
- Have one agent (or the lead) own shared utilities; other agents must use what already exists or coordinate
- Add lint rules that detect duplicate-logic patterns at integration time
- Lead’s diff review includes an explicit “check for duplicate functions” step
The pattern fix is upstream of the symptom. Don’t keep catching duplicates in integration; prevent them at the prompt or scope-design step.
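As a backstop for the lint-rule idea, a duplicate-logic check can be sketched with Python's `ast` module. The sketch below is a rough heuristic, not an existing lint rule: it fingerprints each function body after renaming parameters to positional placeholders, so two helpers that differ only in their names and parameter names collide. The file names and function bodies are hypothetical; only the two signatures come from the drill above.

```python
import ast
import copy
from collections import defaultdict


class _RenameParams(ast.NodeTransformer):
    """Replace parameter names with positional placeholders (p0, p1, ...)."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        if node.id in self.mapping:
            return ast.copy_location(
                ast.Name(id=self.mapping[node.id], ctx=node.ctx), node
            )
        return node


def fingerprint(func):
    """Dump the function body with parameter names normalized away."""
    func = copy.deepcopy(func)  # leave the caller's tree untouched
    mapping = {a.arg: "p%d" % i for i, a in enumerate(func.args.args)}
    body = func.body
    if ast.get_docstring(func) is not None:
        body = body[1:]  # docstrings should not affect the comparison
    renamer = _RenameParams(mapping)
    return "|".join(ast.dump(renamer.visit(stmt)) for stmt in body)


def find_duplicates(sources):
    """Group functions across files whose normalized bodies are identical."""
    groups = defaultdict(list)
    for filename, text in sources.items():
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, ast.FunctionDef):
                key = fingerprint(node)
                if key:  # skip functions that contain only a docstring
                    groups[key].append("%s:%s" % (filename, node.name))
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical integrated tree with two helpers doing the same thing.
sources = {
    "billing.py": "def format_currency(amount, code):\n    return f'{amount:.2f} {code}'\n",
    "invoice.py": "def format_money(value, currency):\n    return f'{value:.2f} {currency}'\n",
    "util.py": "def double(x):\n    return x * 2\n",
}

print(find_duplicates(sources))
# [['billing.py:format_currency', 'invoice.py:format_money']]
```

Exact structural matching misses duplicates that are written differently, so it complements the prompt and ownership fixes rather than replacing them.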
Scenario D: A non-technical stakeholder asks why the integration step takes so long when “the agents do the work in minutes.” How do you explain the lead’s role?
Show answer
The agents produce drafts in parallel. Each draft is independently correct, but combining them into a working whole requires:
- Resolving conflicts where drafts disagree
- Catching subtle mismatches where drafts make different assumptions
- Running the full system to verify the combination actually works
- Reading the combined result for judgment-level issues no test can catch
A useful analogy: imagine six authors writing different chapters of a book in parallel. Each chapter is internally well-written. But the editor still needs to make sure the character names match, the plot timeline is consistent, the tone is unified, and the book reads as one coherent work. The lead’s integration role is similar.
The work the lead does is genuinely high-leverage. It’s not pure overhead; it’s the step that converts parallel drafts into shippable software.
Flashcards
Q. Pattern 1: shared origin
All agents push to one remote. Lead pulls from there to integrate. The default for most teams.
Q. Pattern 2: per-agent fork
Each agent pushes to its own remote (fork). Lead adds each fork as a named remote. Strongest isolation, used for untrusted agents.
Q. Pattern 3: shared worktrees
All worktrees on lead’s machine. No remote push until integration final. Fastest iteration, smallest blast radius.
Q. When to use shared origin
Default. Other-human review needed in-flight. Standard team workflow.
Q. When to use per-agent fork
Untrusted agents. External contributors. Hard isolation requirements.
Q. When to use shared worktrees
Fast solo iteration. Lead controls whole sprint. Maximum control with minimum cleanup.
Q. Lead orchestrator role (7 parts)
Plan, spawn, watch, integrate, catch semantic conflicts, push gate, cleanup.
Q. Semantic conflict
Code merges cleanly (git happy) but behaves wrong because two agents had inconsistent models. NOT detected by git.
Q. Three defenses against semantic conflicts
1. Tight agent scope (reduce overlap). 2. Integration tests at each merge step. 3. Lead reads integrated diff before pushing.
Q. Lead-stage build
Full-system build at integration step. Most important guardrail in multi-agent work. Catches what per-agent tests miss.
Q. Clawless 2026-06-04 lesson
Per-agent tests passed; lead-stage build caught ~300 latent breakers. Per-agent tests necessary but not sufficient.
Q. When to abort agent vs repair
Abort if a refined prompt would clearly help and re-running is cheap. Repair if the agent’s work is mostly good and needs only targeted fixes.
Q. Push gate
The lead is the only one who pushes integrated work to main. Last line of human review.
Q. Semantic conflict example: renames
Agent A renames function X to Y. Agent B’s code calls X. Merges cleanly, behavior breaks.
Q. Semantic conflict example: duplicate utilities
Two agents add similar helper functions in different files with different names. Merges cleanly, code is wasteful.