Practice: Multi-agent integration patterns

Self-check questions

Answer each in your own words first, then open the answer to check.

Q1. Describe the three integration patterns in one sentence each.

Show answer

Shared origin: all agents push to one remote, and the lead integrates there.
Per-agent fork: each agent pushes to its own remote, and the lead pulls from each fork.
Shared worktrees: everything stays local, and nothing is pushed until integration is final.

Q2. Which pattern is the default starting point for most multi-agent workflows, and why?

Show answer

Shared origin. It is the simplest setup and uses standard tooling without special configuration. Branches are visible for review by other people. It matches the workflow most teams already know.

Q3. Why is per-agent fork the right choice when working with untrusted or external agents?

Show answer

Quarantines the agent’s work on a separate remote. The team’s main remote is never polluted with unvetted code. The lead can vet the work freely (including discarding it entirely) without leaving traces. Maps to the open-source forking workflow where external contributors push to their own forks.

Q4. What’s the disadvantage of shared-worktrees vs the other two patterns?

Show answer

No backup, no remote visibility during the work. If the lead’s machine dies, the work is gone. Other team members can’t see what agents are doing until the lead pushes. Doesn’t scale to teams with multiple concurrent fleets.

Q5. List the seven responsibilities of the lead orchestrator role.

Show answer

Plan, spawn, watch, integrate, catch semantic conflicts, push gate, cleanup.

Q6. What’s a semantic conflict, and why can git not detect it?

Show answer

Code merges cleanly (git produces no line-level conflict) but behaves wrong because two agents had inconsistent mental models. Examples: renames, duplicate utilities, mismatched field names, mismatched return types. Git compares text line by line and has no model of what the code means, so inconsistencies that span files without touching the same lines go unnoticed.
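A minimal sketch of the rename case, assuming two hypothetical files written by two agents. The file contents and the run_merged helper are illustrative only; they stand in for modules that end up in the same codebase after a conflict-free merge.

```python
# Hypothetical contents of two files produced by two different agents.
# The files do not overlap, so a merge reports no conflict.
AGENT_A_STATS = '''
def compute_mean(values):  # Agent A renamed this from compute_average
    return sum(values) / len(values)
'''

AGENT_B_REPORT = '''
def summarize(values):
    # Agent B wrote this against the old name
    return "avg=" + str(compute_average(values))
'''


def run_merged(sources):
    """Execute all sources in one namespace, like modules in one codebase."""
    namespace = {}
    for source in sources:
        exec(source, namespace)
    return namespace


merged = run_merged([AGENT_A_STATS, AGENT_B_REPORT])

try:
    merged["summarize"]([1, 2, 3])
    outcome = "ok"
except NameError as err:
    # The break only shows up when the combined code actually runs.
    outcome = f"semantic conflict: {err}"

print(outcome)
```

Both definitions load without complaint; the failure appears only when summarize is called, which is why a test that exercises the boundary is needed to surface it.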

Q7. List the three layers of defense against semantic conflicts.

Show answer

Tight agent scope: less overlap between agents means a smaller conflict surface
Integration tests at each merge step in the integration sequence
Lead reads the integrated diff before pushing: judgment work that no test framework substitutes for

Q8. What did the Clawless 2026-06-04 sprint teach about per-agent tests vs the lead-stage build?

Show answer

Per-agent tests are NECESSARY but NOT SUFFICIENT. The lead-stage build (full-system, after all merges) caught ~300 latent breakers that per-agent tests had missed. The build gate at integration is the most important guardrail.

Q9. When should you abort and re-run an agent vs repair its branch by hand?

Show answer

Abort and re-run when:

The agent misunderstood its scope and a refined prompt would clearly help
The bad work is more than a few quick fixes
The work is small enough that re-running is cheap

Repair the branch when:

The work is mostly good with targeted issues
The fixes are small and obvious
Re-running would lose work that’s worth keeping

Q10. Why is the build gate at integration time “non-negotiable”?

Show answer

It catches what nothing else can. Per-agent tests catch individual-agent issues; integration tests catch boundary issues at the merge level; the build gate catches the cumulative interactions of all agents’ work as a complete system. Skipping the build gate has historically been the source of multi-agent integration bugs that shipped.
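The layering described above can be sketched as a small gate function. This is only an illustration: integrate and its three callables are hypothetical names, and in a real setup the callables would wrap the team's own merge, test, and build commands.

```python
from typing import Callable, List


def integrate(
    branches: List[str],
    merge: Callable[[str], bool],
    run_integration_tests: Callable[[], bool],
    run_full_build: Callable[[], bool],
) -> str:
    """Merge branches one at a time, test after each, then gate on a full build."""
    for branch in branches:
        if not merge(branch):
            return f"stopped: merge conflict in {branch}"
        if not run_integration_tests():
            return f"stopped: integration tests failed after {branch}"
    # The build gate runs once, after every branch is in.
    if not run_full_build():
        return "stopped: full build failed, do not push"
    return "ok to push"


# Stub callables for illustration: every merge and test passes,
# but the full-system build fails.
merged = []


def fake_merge(branch: str) -> bool:
    merged.append(branch)
    return True


result = integrate(
    ["agent-1/x", "agent-2/y", "agent-3/z"],
    merge=fake_merge,
    run_integration_tests=lambda: True,
    run_full_build=lambda: False,
)
print(result)
```

In this stubbed run every per-step check passes and only the final build stops the push, which is the situation the build gate exists for.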

Pattern selection drill

For each situation, identify the right integration pattern and justify the choice:

A solo developer running 4 agents on their laptop for a 2-hour focused sprint
A 10-person engineering team where one engineer at a time runs a multi-agent fleet and the team wants to peer-review agent work before integration
A team piloting agents from an external AI services provider whose code quality is not yet trusted
A startup where the founder runs nightly multi-agent fleets and other team members occasionally inspect the agent branches in the morning before final integration
A maintainer of a large open-source project receiving AI-assisted PRs from external contributors

Semantic-conflict spotting drill

For each scenario, identify the type of semantic conflict and propose how the lead might catch it:

Agent A renames compute_average to compute_mean. Agent B writes a new function that calls compute_average.
Agent A adds a new field created_at to the User model. Agent B adds a new endpoint that returns User with field creation_time.
Agent A adds a database migration adding column X. Agent B adds a database migration dropping column Y. The migrations must run in a specific order; the agents’ code assumes their own migration runs first.
Agent A adds a utility format_currency(amount, code). Agent B adds a different utility format_money(value, currency) doing the same thing.
Agent A changes the return type of get_user() from User to Optional<User>. Agent B writes new code calling get_user() and using the result directly without null-checking.

Mock fleet integration drill

Simulate a 3-agent fleet locally:

Create a fresh repo. Make an initial commit on main.
Create 3 worktrees with new branches:

git worktree add -b agent-1/x ../wt-1 main
git worktree add -b agent-2/y ../wt-2 main
git worktree add -b agent-3/z ../wt-3 main

In each worktree, deliberately introduce changes:
- wt-1: Add a file lib.py defining function add(a, b) returning a + b.
- wt-2: Add a file app.py that imports from lib and uses add. (Note: lib.py exists only on wt-1’s branch right now; wt-2’s worktree won’t see it until merged.)
- wt-3: Add a file tests.py that tests the integration.
Commit in each worktree.
From the main repo, integrate:

git checkout -b integration main
git merge agent-1/x
git merge agent-2/y       # should succeed
git merge agent-3/z       # should succeed

Run the tests. Did everything work?
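Now make a SEMANTIC conflict: edit one agent branch so the two sides disagree on a name (e.g. lib.py on agent-1/x still defines add, but app.py on agent-2/y imports sum_two). Redo the integration. The merges still succeed, but the tests should now fail. Trace the failure back to the mismatched name.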
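If you want to preview the outcome without git, here is a rough simulation of the integrated tree. The file contents are assumptions (the drill leaves them up to you), and total is a hypothetical function name; the script writes the three files side by side, as they would sit after the merges, and runs tests.py.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Assumed file contents; the drill does not prescribe them.
LIB = "def add(a, b):\n    return a + b\n"
APP_OK = "from lib import add\n\ndef total(a, b):\n    return add(a, b)\n"
APP_BROKEN = "from lib import sum_two\n\ndef total(a, b):\n    return sum_two(a, b)\n"
TESTS = "from app import total\n\nassert total(2, 3) == 5\nprint('tests passed')\n"


def run_integrated(app_source: str) -> bool:
    """Write the three files into one directory and run tests.py; True if it exits 0."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "lib.py").write_text(LIB)
        (root / "app.py").write_text(app_source)
        (root / "tests.py").write_text(TESTS)
        proc = subprocess.run(
            [sys.executable, "tests.py"], cwd=root, capture_output=True, text=True
        )
        return proc.returncode == 0


clean = run_integrated(APP_OK)          # names agree
conflicted = run_integrated(APP_BROKEN)  # app.py expects sum_two, lib.py defines add

print("clean integration passes:", clean)
print("conflicted integration passes:", conflicted)
```

The three files never overlap, so git would merge both variants without complaint; only running the tests distinguishes them.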

Scenario reflections

Scenario A: Your team is starting multi-agent workflows for the first time. What’s your recommended starting pattern, and what process discipline do you put in place from day one?

Show answer

Start with shared-origin. Process discipline:

Tight agent scope, non-overlapping when possible
Integration tests covering boundaries between agent scopes
Lead-stage build at the end of each integration sequence
Lead reads the integrated diff before pushing
Cleanup after each sprint (remove worktrees, delete branches)

Document the workflow. Train the lead role explicitly. Track metrics: integration time, semantic-conflict frequency, agent rework rate.

Scenario B: You’re integrating 6 agent branches. Three integrate cleanly with passing tests. The fourth merges cleanly but fails 12 tests. Walk through your decision process.

Show answer

Decision tree:

Read the test failures. Are they git-level conflicts (the merge had problems)? Or semantic (merge clean, behavior wrong)?
If semantic, are the failures concentrated in one boundary (agent-3 + agent-4)? If so, that boundary is the suspect.
Diff agent-4’s branch against the integration of agents 1-3. Look for renames and mismatched assumptions.
If the fix is small and obvious: repair agent-4’s branch and continue.
If the fix would be substantial: ABORT agent-4, refine its prompt to address the misunderstanding, re-run.

Don’t sink hours into rescuing a confused agent. 20 minutes of re-running often beats 2 hours of hand-repair.

Scenario C: Your team has been doing multi-agent fleets for 3 months. You’re seeing a recurring pattern of duplicate utility functions across agent branches. How do you address this at the process level?

Show answer

Process-level fixes:

Prompt agents with an explicit “before adding a utility function, check if one already exists” instruction
Have one agent (or the lead) own shared utilities; other agents must use what already exists or coordinate
Add lint rules that detect duplicate-logic patterns at integration time
Lead’s diff review includes an explicit “check for duplicate functions” step

The pattern fix is upstream of the symptom. Don’t keep catching duplicates in integration; prevent them at the prompt or scope-design step.
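As one possible shape for the lint-rule idea, the sketch below flags top-level functions whose bodies are identical once parameter names are normalized. It is an assumption-laden illustration, not an existing lint rule: fingerprint and find_duplicates are hypothetical helpers, and the check only catches structurally identical bodies, so near-duplicates with different logic would slip past it.

```python
import ast
from collections import defaultdict


class _Normalize(ast.NodeTransformer):
    """Rename parameters to arg0, arg1, ... so naming differences disappear."""

    def __init__(self, params):
        self.mapping = {name: f"arg{i}" for i, name in enumerate(params)}

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node


def fingerprint(func: ast.FunctionDef) -> str:
    """Build a string that is equal for functions with the same normalized body."""
    params = [a.arg for a in func.args.args]
    normalizer = _Normalize(params)
    body = [normalizer.visit(stmt) for stmt in func.body]
    return f"{len(params)}:" + "|".join(ast.dump(stmt) for stmt in body)


def find_duplicates(sources: dict) -> list:
    """Take {filename: source}; return groups of (file, function) with equal fingerprints."""
    groups = defaultdict(list)
    for filename, source in sources.items():
        for node in ast.parse(source).body:  # top-level functions only
            if isinstance(node, ast.FunctionDef):
                groups[fingerprint(node)].append((filename, node.name))
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical files from two agents that each added a currency helper.
sources = {
    "billing.py": "def format_currency(amount, code):\n    return f'{amount:.2f} {code}'\n",
    "reports.py": "def format_money(value, currency):\n    return f'{value:.2f} {currency}'\n",
}
duplicates = find_duplicates(sources)
print(duplicates)
```

A check like this belongs at integration time as a backstop; the prompt and scope changes listed above remain the primary fix.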

Scenario D: A non-technical stakeholder asks why the integration step takes so long when “the agents do the work in minutes.” How do you explain the lead’s role?

Show answer

The agents produce drafts in parallel. Each draft can be correct on its own, but combining them into a working whole requires:

Resolving conflicts where drafts disagree
Catching subtle mismatches where drafts make different assumptions
Running the full system to verify the combination actually works
Reading the combined result for judgment-level issues no test can catch

A useful analogy: imagine six authors writing different chapters of a book in parallel. Each chapter is internally well-written. But the editor still needs to make sure the character names match, the plot timeline is consistent, the tone is unified, and the book reads as one coherent work. The lead’s integration role is similar.

The work the lead does is genuinely high-leverage. It’s not pure overhead; it’s the step that converts parallel drafts into shippable software.

Flashcards

Q. Pattern 1: shared origin

All agents push to one remote. Lead pulls from there to integrate. The default for most teams.

Q. Pattern 2: per-agent fork

Each agent pushes to its own remote (fork). Lead adds each fork as a named remote. Strongest isolation, used for untrusted agents.

Q. Pattern 3: shared worktrees

All worktrees on lead’s machine. Nothing is pushed to a remote until integration is final. Fastest iteration, smallest blast radius.

Q. When to use shared origin

Default. Other people need to review agent work in flight. Standard team workflow.

Q. When to use per-agent fork

Untrusted agents. External contributors. Hard isolation requirements.

Q. When to use shared worktrees

Fast solo iteration. Lead controls whole sprint. Maximum control with minimum cleanup.

Q. Lead orchestrator role (7 parts)

Plan, spawn, watch, integrate, catch semantic conflicts, push gate, cleanup.

Q. Semantic conflict

Code merges cleanly (git happy) but behaves wrong because two agents had inconsistent models. NOT detected by git.

Q. Three defenses against semantic conflicts

1. Tight agent scope (reduce overlap). 2. Integration tests at each merge step. 3. Lead reads integrated diff before pushing.

Q. Lead-stage build

Full-system build at integration step. Most important guardrail in multi-agent work. Catches what per-agent tests miss.

Q. Clawless 2026-06-04 lesson

Per-agent tests passed; lead-stage build caught ~300 latent breakers. Per-agent tests necessary but not sufficient.

Q. When to abort agent vs repair

Abort if a refined prompt would clearly help and re-running is cheap. Repair if the agent’s work is mostly good and needs targeted fixes.

Q. Push gate

The lead is the only one who pushes integrated work to main. Last line of human review.

Q. Semantic conflict example: renames

Agent A renames function X to Y. Agent B’s code calls X. Merges cleanly, behavior breaks.

Q. Semantic conflict example: duplicate utilities

Two agents add similar helper functions in different files with different names. Merges cleanly, but leaves redundant code to maintain.