LangGraph Multi-Agent — Supervisor Pattern Guide (2026)
This LangGraph multi-agent guide walks through the supervisor pattern — where a central agent routes tasks to specialized sub-agents for research, code generation, and review. You will build a complete working system with state management, message routing, error handling, and production deployment patterns.
1. Why LangGraph Multi-Agent Matters
Multi-agent systems decompose complex tasks across specialized sub-agents so each one focuses on what it does best — producing higher quality results than a single LLM call attempting everything at once.
Why Multi-Agent Systems Exist
A single LLM call handles simple tasks well. Ask it to summarize a document, generate a function, or answer a factual question — one prompt, one response, done.
Real-world problems are rarely that simple. Consider a user request like “Research the latest pricing for AWS Bedrock, write a cost comparison script, and review it for accuracy.” This requires three distinct capabilities: web research, code generation, and code review. A single prompt attempting all three produces mediocre results across the board.
Multi-agent systems solve this by decomposing complex tasks into specialized sub-tasks. Each agent focuses on what it does best, with a coordination layer managing the workflow. The result: higher quality outputs, clearer reasoning traces, and the ability to assign different tools and models to different tasks.
For background on agent architectures, see AI Agents and Agentic Design Patterns.
Why LangGraph for Multi-Agent
LangGraph provides the graph-based execution model that multi-agent systems need. Unlike simple chain-based frameworks, LangGraph gives you:
- Explicit state management — shared state flows between nodes with type safety
- Conditional routing — the supervisor decides which agent runs next based on the current state
- Cycles and loops — agents can re-route tasks, retry on failure, or request human approval
- Built-in persistence — checkpoint state for long-running workflows and recovery
If you are new to LangGraph, start with LangGraph Tutorial before continuing. For a comparison with LangChain’s simpler chain model, see LangChain vs LangGraph.
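As a quick taste of the first two bullets before the full supervisor build in section 5, here is a minimal, self-contained sketch of a typed state schema plus a conditional edge. The node names and the routing condition are illustrative only, not part of the supervisor example later in this guide.

```python
import operator
from typing import Annotated, Literal
from typing_extensions import TypedDict

from langgraph.graph import StateGraph, START, END

# Minimal shared state: a list field that accumulates across nodes
class State(TypedDict):
    notes: Annotated[list[str], operator.add]

def draft(state: State) -> dict:
    return {"notes": ["draft written"]}

def revise(state: State) -> dict:
    return {"notes": ["draft revised"]}

# Conditional routing: inspect the state and return the name of the next node
def route(state: State) -> Literal["revise", "__end__"]:
    return "revise" if len(state["notes"]) < 2 else END

graph = StateGraph(State)
graph.add_node("draft", draft)
graph.add_node("revise", revise)
graph.add_edge(START, "draft")
graph.add_conditional_edges("draft", route)
graph.add_edge("revise", END)

app = graph.compile()
print(app.invoke({"notes": []}))
```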
2. What’s New in 2026
| Development | Impact |
|---|---|
| LangGraph 0.3+ Command API | The Command object replaces manual edge routing. Agents return Command(goto="next_node") instead of relying on conditional edge functions |
| Prebuilt create_react_agent | Sub-agents can be created with a single function call — model, tools, and system prompt in one line |
| Functional API | @task and @entrypoint decorators enable Python-native graph definitions without subclassing |
| LangGraph Platform | Managed deployment with built-in persistence, streaming, and cron jobs |
| Supervisor library | langgraph-supervisor package provides a prebuilt supervisor pattern out of the box |
| Swarm library | langgraph-swarm enables peer-to-peer agent handoffs without a central coordinator |
3. How LangGraph Multi-Agent Works
The supervisor pattern is the most common multi-agent topology: one coordinator routes tasks to specialized sub-agents, then collects and merges their results.
The Supervisor Pattern Architecture
📊 Visual Explanation
Diagram: LangGraph Multi-Agent — Supervisor Pattern. The supervisor receives requests, routes to specialized sub-agents, and merges results.
The supervisor pattern uses a hub-and-spoke topology. The supervisor node sits at the center and decides which sub-agent handles each part of a request. Sub-agents execute independently and return results to the supervisor, which can then route to another agent or deliver the final response.
This differs from a sequential chain (A then B then C) because the supervisor can skip agents, re-invoke agents, or fan out to multiple agents based on the task requirements.
4. The Supervisor Pattern Explained
The supervisor uses structured output to classify each incoming task and route it to exactly one sub-agent, preventing ambiguous or overlapping execution.
Routing Logic
The supervisor agent needs to make one critical decision: which sub-agent should handle the current task? This is implemented as structured output from the supervisor LLM call using a Pydantic model:
```python
class RouteDecision(BaseModel):
    next_agent: Literal["research", "code", "review", "FINISH"]
    reason: str              # Why this agent was selected
    task_description: str    # What the agent should do
```

The supervisor inspects the conversation history, identifies what needs to happen next, and returns a RouteDecision. The supervisor calls llm.with_structured_output(RouteDecision) to get a validated RouteDecision back from the LLM instead of free-form text, and the graph routes execution to the correct sub-agent node based on the next_agent field.
When to Use the Supervisor Pattern
| Scenario | Supervisor Pattern? | Why |
|---|---|---|
| Tasks requiring 2-4 distinct capabilities | Yes | Each capability maps to a sub-agent |
| Sequential pipeline (always A then B then C) | No | Use a simple chain instead |
| Peer agents with equal authority | No | Use the swarm pattern instead |
| Dynamic routing based on user intent | Yes | Supervisor excels at intent classification |
| Human-in-the-loop approval workflows | Yes | Supervisor can pause for human review |
5. Building a Multi-Agent System
The implementation below uses LangGraph 0.3+ with create_react_agent for each sub-agent and the Command API for type-safe routing between nodes.
Complete Python Code — Supervisor + 3 Sub-Agents
This implementation uses LangGraph 0.3+ with the Command API and create_react_agent for sub-agents.
```python
import operator
from typing import Annotated, Literal

from typing_extensions import TypedDict
from pydantic import BaseModel

from langchain_core.messages import HumanMessage, BaseMessage
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import create_react_agent
from langgraph.types import Command

# ── Tools for each agent ──────────────────────────
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.tools import tool

search_tool = TavilySearchResults(max_results=3)

@tool
def execute_python(code: str) -> str:
    """Execute Python code and return the output."""
    import subprocess
    result = subprocess.run(
        ["python", "-c", code],
        capture_output=True, text=True, timeout=30,
    )
    if result.returncode != 0:
        return f"Error:\n{result.stderr}"
    return result.stdout or "Code executed successfully (no output)."

@tool
def review_code(code: str) -> str:
    """Review code for bugs, security issues, and best practices."""
    # In production, this would run static analysis tools
    return f"Reviewed {len(code.splitlines())} lines of code."

# ── Routing schema ────────────────────────────────
class RouteDecision(BaseModel):
    """Supervisor's routing decision."""
    next_agent: Literal["research", "code", "review", "FINISH"]
    reason: str
    task_description: str

# ── Shared state ──────────────────────────────────
class AgentState(TypedDict):
    messages: Annotated[list[BaseMessage], operator.add]
    next_agent: str

# ── Sub-agents ────────────────────────────────────
llm = ChatOpenAI(model="gpt-4o", temperature=0)

research_agent = create_react_agent(
    llm,
    tools=[search_tool],
    prompt="You are a research specialist. Search for accurate, "
           "up-to-date information. Always cite your sources. "
           "Return findings in a structured format.",
)

code_agent = create_react_agent(
    llm,
    tools=[execute_python],
    prompt="You are a senior Python developer. Write clean, "
           "well-documented code. Include error handling and "
           "type hints. Test your code before returning it.",
)

review_agent = create_react_agent(
    llm,
    tools=[review_code],
    prompt="You are a code reviewer and fact-checker. Verify "
           "accuracy of research findings and review code for "
           "bugs, security issues, and best practices. Be specific "
           "about any issues found.",
)

# ── Sub-agent node wrappers ───────────────────────
def research_node(state: AgentState) -> Command[Literal["supervisor"]]:
    result = research_agent.invoke({"messages": state["messages"]})
    # Append only the messages the sub-agent added; returning the full
    # result list would duplicate history under the operator.add reducer.
    return Command(
        update={"messages": result["messages"][len(state["messages"]):]},
        goto="supervisor",
    )

def code_node(state: AgentState) -> Command[Literal["supervisor"]]:
    result = code_agent.invoke({"messages": state["messages"]})
    return Command(
        update={"messages": result["messages"][len(state["messages"]):]},
        goto="supervisor",
    )

def review_node(state: AgentState) -> Command[Literal["supervisor"]]:
    result = review_agent.invoke({"messages": state["messages"]})
    return Command(
        update={"messages": result["messages"][len(state["messages"]):]},
        goto="supervisor",
    )

# ── Supervisor ────────────────────────────────────
SUPERVISOR_PROMPT = """You are a supervisor managing three agents:
- research: searches the web for information
- code: writes and executes Python code
- review: reviews code and verifies research accuracy

Given the conversation so far, decide which agent should act
next, or respond FINISH if the task is complete.

Respond with the agent name: research, code, review, or FINISH."""

def supervisor_node(state: AgentState) -> Command[
    Literal["research", "code", "review", "__end__"]
]:
    messages = [{"role": "system", "content": SUPERVISOR_PROMPT}] + state["messages"]
    response = llm.with_structured_output(RouteDecision).invoke(messages)

    if response.next_agent == "FINISH":
        return Command(goto=END, update={"next_agent": "FINISH"})

    return Command(
        goto=response.next_agent,
        update={
            "messages": [HumanMessage(
                content=f"[Supervisor → {response.next_agent}]: {response.task_description}"
            )],
            "next_agent": response.next_agent,
        },
    )

# ── Wire the graph ────────────────────────────────
graph = StateGraph(AgentState)
graph.add_node("supervisor", supervisor_node)
graph.add_node("research", research_node)
graph.add_node("code", code_node)
graph.add_node("review", review_node)

graph.add_edge(START, "supervisor")
app = graph.compile()

# ── Run ───────────────────────────────────────────
result = app.invoke({
    "messages": [HumanMessage(
        content="Research the current pricing for OpenAI's GPT-4o API, "
                "write a Python cost calculator function, and review "
                "the code for correctness."
    )]
})

for msg in result["messages"]:
    print(f"{msg.type}: {msg.content[:200]}")
```

What This Code Does
- Three sub-agents are created with create_react_agent, each with its own tools and system prompt
- Node wrappers invoke the sub-agent and return a Command that routes back to the supervisor
- The supervisor uses structured output to decide which agent acts next or whether to finish
- The graph starts at the supervisor, which fans out to sub-agents and collects results
6. State Management and Message Routing
LangGraph’s typed state schema is the shared memory that flows between all nodes, with reducer functions controlling how each field accumulates across routing steps.
The State Schema
LangGraph state is the backbone of multi-agent communication. Every node reads from and writes to the shared state.
```python
from typing import Annotated
from typing_extensions import TypedDict
from langchain_core.messages import BaseMessage
import operator

class AgentState(TypedDict):
    # Messages accumulate using operator.add (append-only)
    messages: Annotated[list[BaseMessage], operator.add]
    # Track which agent was last active
    next_agent: str
    # Optional: track iteration count for loop limits
    iteration_count: int
```

The Annotated[list[BaseMessage], operator.add] pattern is critical. It tells LangGraph to append new messages rather than replace the entire list. Without this, each node would overwrite the conversation history.
Message Routing Patterns
Pattern 1: Fan-out / Fan-in — Supervisor sends tasks to multiple agents, then merges results.

```python
def supervisor_fanout(state: AgentState) -> list[Command]:
    """Route to multiple agents in parallel."""
    return [
        Command(goto="research", update={"messages": [
            HumanMessage(content="Research the pricing data.")
        ]}),
        Command(goto="code", update={"messages": [
            HumanMessage(content="Write the calculator function.")
        ]}),
    ]
```

Pattern 2: Sequential with gates — Supervisor enforces ordering constraints.

```python
def supervisor_sequential(state: AgentState) -> Command:
    """Enforce: research first, then code, then review."""
    last = state.get("next_agent", "")
    if last == "":
        return Command(goto="research")
    elif last == "research":
        return Command(goto="code")
    elif last == "code":
        return Command(goto="review")
    return Command(goto=END)
```

Pattern 3: Loop with exit condition — Supervisor re-routes until quality threshold is met.

```python
def supervisor_with_retry(state: AgentState) -> Command:
    """Re-route to code agent if review finds issues."""
    if state.get("iteration_count", 0) > 3:
        return Command(goto=END)  # Safety limit

    last_message = state["messages"][-1].content
    if "APPROVED" in last_message:
        return Command(goto=END)
    return Command(goto="code")
```

Preventing Infinite Loops
Multi-agent systems can loop forever if the supervisor keeps routing between agents without converging. Always implement at least one of these safeguards:
- Iteration counter — hard limit on total routing decisions (e.g., max 10)
- Recursion limit — LangGraph’s built-in recursion_limit setting, passed in the config when you invoke the compiled graph
- Timeout — wall-clock time limit on the entire graph execution
- Convergence check — detect repeated routing patterns and force termination
```python
app = graph.compile()

# Built-in recursion limit prevents infinite loops
result = app.invoke(
    {"messages": [HumanMessage(content="...")]},
    config={"recursion_limit": 25},
)
```

7. Alternative Patterns
The supervisor pattern is one of several multi-agent architectures. Choose based on your coordination needs.
Hierarchical Multi-Agent
A supervisor delegates to mid-level supervisors, which manage their own sub-agents. Useful when the problem has natural sub-domains (e.g., a “backend team” supervisor and a “frontend team” supervisor, each with specialized agents).
Top Supervisor
  → Backend Supervisor → [DB Agent, API Agent]
  → Frontend Supervisor → [UI Agent, Test Agent]
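A minimal sketch of this shape using plain StateGraph composition: each team is compiled as its own subgraph and mounted as a single node in the top-level graph. The team and member names are illustrative, the member nodes are stand-ins for real sub-agents, and the top level uses fixed edges here for brevity where a real system would put an LLM supervisor (like the one in section 5) at each level.

```python
import operator
from typing import Annotated
from typing_extensions import TypedDict

from langgraph.graph import StateGraph, START, END

class TeamState(TypedDict):
    # Shared key so the top-level graph and team subgraphs can exchange state
    messages: Annotated[list[str], operator.add]

def build_team(members: list[str]):
    """Build a 'team' subgraph; member nodes are stand-ins for real sub-agents."""
    team = StateGraph(TeamState)
    previous = START
    for member in members:
        team.add_node(member, lambda state, m=member: {"messages": [f"{m} done"]})
        team.add_edge(previous, member)
        previous = member
    team.add_edge(previous, END)
    return team.compile()

top = StateGraph(TeamState)
# Each compiled team graph is mounted as a single node in the top-level graph
top.add_node("backend_team", build_team(["db_agent", "api_agent"]))
top.add_node("frontend_team", build_team(["ui_agent", "test_agent"]))
top.add_edge(START, "backend_team")
top.add_edge("backend_team", "frontend_team")
top.add_edge("frontend_team", END)

print(top.compile().invoke({"messages": []}))
```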
Peer-to-Peer (Swarm Handoffs)
Agents hand off directly to each other without a central coordinator. Each agent decides who should act next. LangGraph’s langgraph-swarm package implements this pattern.
```python
from langgraph_swarm import create_handoff_tool, create_swarm

research = create_react_agent(
    llm, [search_tool, create_handoff_tool(agent_name="code")], name="research"
)
code = create_react_agent(
    llm, [execute_python, create_handoff_tool(agent_name="research")], name="code"
)

swarm = create_swarm([research, code], default_active_agent="research").compile()
```

Best for: systems where agents are peers with equal authority and routing decisions depend on local context rather than global coordination.
Network Topology
Every agent can communicate with every other agent. No fixed routing structure — the graph dynamically determines paths. This is the most flexible but hardest to debug.
Pattern Comparison
| Pattern | Coordination | Complexity | Best For |
|---|---|---|---|
| Supervisor | Centralized | Medium | 2-5 agents with clear role separation |
| Hierarchical | Multi-level | High | Large teams with sub-domains |
| Swarm | Decentralized | Low-Medium | Peer agents with handoff logic |
| Network | Dynamic | Very High | Research, highly adaptive systems |
For a broader comparison of agent framework options, see Agentic Frameworks Comparison.
8. Multi-Agent Trade-offs and Pitfalls
Multi-agent systems introduce compounding failure surfaces — routing confusion, token budget explosion, and infinite loops — each requiring explicit mitigation at design time.
Common Failure Modes
Routing confusion: The supervisor sends tasks to the wrong agent. This happens when agent role descriptions overlap or the supervisor prompt is ambiguous. Fix: make role boundaries explicit and non-overlapping in the supervisor prompt.
Message pollution: Sub-agents receive irrelevant messages from other agents’ conversations. The shared message list grows with every agent interaction, and later agents process all previous messages — including tool calls they do not need. Fix: filter messages per agent or use separate message channels.
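A sketch of the "filter messages per agent" fix, reusing AgentState, code_agent, and the [Supervisor → agent] handoff prefix from the section 5 listing. The filter keeps only the original user request and supervisor instructions addressed to this agent; whether that is enough context for your agents is an assumption to validate.

```python
from typing import Literal
from langchain_core.messages import BaseMessage
from langgraph.types import Command

def messages_for(agent: str, state: AgentState) -> list[BaseMessage]:
    """Keep the original request and supervisor instructions addressed to
    this agent; drop other agents' tool-call chatter."""
    original_request = state["messages"][0]
    marker = f"[Supervisor → {agent}]"
    relevant = [
        m for m in state["messages"][1:]
        if isinstance(m.content, str) and marker in m.content
    ]
    return [original_request] + relevant

def code_node(state: AgentState) -> Command[Literal["supervisor"]]:
    filtered = messages_for("code", state)
    result = code_agent.invoke({"messages": filtered})
    # Append only what the agent added on top of its filtered input
    return Command(
        update={"messages": result["messages"][len(filtered):]},
        goto="supervisor",
    )
```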
Token budget explosion: Each sub-agent call includes the full conversation history. With 3 agents and 10 routing steps, the supervisor processes 10x the original message volume. Fix: summarize intermediate results, trim tool call messages, or use shorter context windows for sub-agents.
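One way to implement the "summarize intermediate results" fix: collapse each sub-agent's tool-call trace into a single compact message before handing control back to the supervisor. This sketch builds on the section 5 listing (llm, research_agent, AgentState) and trades one extra, cheap summarization call per handoff for a much smaller shared history.

```python
from typing import Literal
from langchain_core.messages import AIMessage
from langgraph.types import Command

def summarize_trace(agent_name: str, new_messages: list) -> list[AIMessage]:
    """Collapse a sub-agent's full trace into one compact summary message."""
    final_answer = new_messages[-1].content if new_messages else ""
    summary = llm.invoke(
        "Summarize the key findings below in under 150 words, "
        "keeping all numbers and source citations:\n\n" + str(final_answer)
    )
    return [AIMessage(content=f"[{agent_name} summary] {summary.content}")]

def research_node(state: AgentState) -> Command[Literal["supervisor"]]:
    result = research_agent.invoke({"messages": state["messages"]})
    new_messages = result["messages"][len(state["messages"]):]
    # Hand back one summary message instead of the full tool-call trace
    return Command(
        update={"messages": summarize_trace("research", new_messages)},
        goto="supervisor",
    )
```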
Cascading failures: If the research agent returns bad data, the code agent writes incorrect code, and the review agent may not catch it because it trusts the research. Fix: implement independent verification — the review agent should cross-check facts, not just review code syntax.
Infinite retry loops: The supervisor routes to the code agent, the review agent rejects the code, the supervisor routes back to the code agent, and the cycle repeats. The code agent may produce the same output each time. Fix: include the rejection reason in the re-routing message and enforce a retry limit.
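A sketch of that fix, extending the Pattern 3 supervisor from section 6: count attempts in the iteration_count state field and attach the reviewer's concrete feedback when re-routing. The APPROVED marker and the retry cap of 3 are assumptions carried over from that example.

```python
from typing import Literal
from langchain_core.messages import HumanMessage
from langgraph.graph import END
from langgraph.types import Command

def supervisor_with_feedback(state: AgentState) -> Command[Literal["code", "__end__"]]:
    """Re-route rejected work with the reviewer's reason attached."""
    attempts = state.get("iteration_count", 0)
    last_review = str(state["messages"][-1].content)

    if "APPROVED" in last_review or attempts >= 3:
        return Command(goto=END)

    return Command(
        goto="code",
        update={
            "iteration_count": attempts + 1,
            # Pass the concrete rejection reason, not just "try again"
            "messages": [HumanMessage(
                content=f"[Supervisor → code] Revise the code. Reviewer feedback:\n{last_review}"
            )],
        },
    )
```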
Cost and Latency Implications
Each routing step involves at least one LLM call (the supervisor decision) plus the sub-agent’s LLM calls. A 3-agent system with one routing cycle per agent requires a minimum of 7 LLM calls (1 initial supervisor + 3 sub-agents + 3 supervisor re-evaluations). At GPT-4o pricing, a complex multi-agent interaction can cost $0.10-$0.50 per request.
Latency compounds similarly. If each LLM call takes 2 seconds, a 7-call workflow takes 14+ seconds sequentially. Parallel fan-out helps but adds complexity.
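A back-of-the-envelope version of that math. Every number below is an assumption for illustration (token counts, per-call latency, and prices per million tokens), so substitute current figures before budgeting.

```python
# All figures are illustrative assumptions, not current pricing.
calls = 1 + 3 + 3                    # initial routing + 3 sub-agents + 3 re-evaluations
avg_input_tokens = 4_000             # grows as the shared history accumulates
avg_output_tokens = 500
price_in, price_out = 2.50, 10.00    # assumed $ per 1M input / output tokens
seconds_per_call = 2.0

cost = calls * (avg_input_tokens * price_in + avg_output_tokens * price_out) / 1_000_000
latency = calls * seconds_per_call   # sequential; parallel fan-out would shorten this

print(f"~{calls} LLM calls, ~${cost:.2f} per request, ~{latency:.0f}s end to end")
```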
9. LangGraph Multi-Agent Interview Questions
Multi-agent questions test architectural thinking — interviewers want to see decomposition into agents, clear role boundaries, and anticipation of failure modes.
What Interviewers Expect
Multi-agent questions test architectural thinking. Interviewers want to see that you can decompose a problem into agents, define clear interfaces, and anticipate failure modes.
Strong vs Weak Answer Patterns
Q: “Design a multi-agent system for automated code review.”
Weak: “I would use multiple LLM calls — one for checking bugs, one for style, one for security.”
Strong: “I would use the supervisor pattern with three specialized agents. The supervisor receives the PR diff and routes to a security agent (checks for hardcoded secrets, SQL injection, dependency vulnerabilities), a logic agent (checks for bugs, edge cases, race conditions), and a style agent (checks naming conventions, documentation, complexity metrics). Each agent has specific tools — the security agent runs Bandit and Snyk, the logic agent can execute test cases, the style agent runs Ruff. The supervisor merges the three reviews into a single report with severity rankings. I would set a recursion limit of 5 to prevent the supervisor from endlessly re-routing, and I would filter messages so each agent only sees the PR diff and its own previous outputs, not the other agents’ reviews.”
Common Interview Questions
- Compare the supervisor pattern vs swarm pattern — when would you use each?
- How do you prevent infinite loops in a multi-agent graph?
- Design a multi-agent system for customer support with escalation
- What is the cost and latency overhead of multi-agent vs single-agent?
- How does LangGraph’s state management differ from passing data between functions?
- How would you test a multi-agent system? What failure modes would you check for?
10. Multi-Agent Systems in Production
Production multi-agent deployments require choosing between managed infrastructure (LangGraph Platform) and self-hosted persistence, plus streaming and observability from day one.
Production Deployment Patterns
Pattern 1: LangGraph Platform (Managed)
Client → LangGraph Cloud API → Supervisor Graph → Sub-agents → Response

Best for: teams that want managed infrastructure with built-in persistence, streaming, and monitoring. LangGraph Platform handles checkpointing, retries, and horizontal scaling.
Pattern 2: Self-Hosted with Persistence
Client → FastAPI → LangGraph (SQLite/Postgres checkpointer) → Sub-agents → Response

Best for: teams with specific infrastructure requirements or data residency constraints.
```python
from langgraph.checkpoint.postgres import PostgresSaver

# from_conn_string is a context manager; call setup() once to create the tables
with PostgresSaver.from_conn_string("postgresql://...") as checkpointer:
    checkpointer.setup()
    app = graph.compile(checkpointer=checkpointer)

    # Resume from a previous checkpoint
    config = {"configurable": {"thread_id": "user-123"}}
    result = app.invoke({"messages": [HumanMessage(content="Continue...")]}, config=config)
```

Pattern 3: Async with Streaming

```python
async for event in app.astream_events(
    {"messages": [HumanMessage(content="...")]},
    version="v2",
):
    if event["event"] == "on_chat_model_stream":
        print(event["data"]["chunk"].content, end="", flush=True)
```

Observability
Multi-agent systems are hard to debug without proper tracing. Every routing decision, sub-agent invocation, and tool call should be logged.
- LangSmith — native integration with LangGraph for trace visualization
- Langfuse — open-source alternative with session-level tracing
- Custom logging — add metadata to each node for structured logging (see the sketch below)
For a comparison of observability tools, see LangSmith vs Langfuse.
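A sketch of the custom-logging option: a small decorator that emits one structured JSON record per node invocation, including the routing target and timing. It assumes the supervisor_node, graph, and Command-returning nodes from section 5; registering the wrapped function when wiring the graph replaces the bare add_node call shown there.

```python
import json
import logging
import time
from functools import wraps

logger = logging.getLogger("multi_agent")

def traced(node_name: str):
    """Wrap a node so every invocation emits a structured log record."""
    def decorator(node_fn):
        @wraps(node_fn)
        def wrapper(state):
            start = time.perf_counter()
            command = node_fn(state)
            logger.info(json.dumps({
                "node": node_name,
                "goto": str(getattr(command, "goto", None)),
                "messages_in_state": len(state["messages"]),
                "duration_ms": round((time.perf_counter() - start) * 1000),
            }))
            return command
        return wrapper
    return decorator

# Register the wrapped node instead of the bare function when wiring the graph
graph.add_node("supervisor", traced("supervisor")(supervisor_node))
```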
Monitoring Checklist
- Routing accuracy: track how often the supervisor picks the correct agent (use human evaluation samples)
- Sub-agent success rate: percentage of sub-agent calls that produce usable outputs
- End-to-end latency: p50 and p95 across the full graph execution
- Token consumption: total tokens per request — compare against single-agent baseline (a measurement sketch follows this list)
- Error rate: graph failures (timeouts, infinite loops, tool errors) per 1000 requests
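To populate the token-consumption and cost numbers, one option is the get_openai_callback helper from langchain_community, assuming an OpenAI-backed graph; verify in your version that usage from calls made inside the compiled graph is captured end to end.

```python
from langchain_community.callbacks import get_openai_callback
from langchain_core.messages import HumanMessage

with get_openai_callback() as cb:
    app.invoke(
        {"messages": [HumanMessage(content="Research GPT-4o pricing and write a cost calculator.")]},
        config={"recursion_limit": 25},
    )

print(f"requests={cb.successful_requests} tokens={cb.total_tokens} cost=${cb.total_cost:.4f}")
```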
11. Summary and Key Takeaways
Use these reference tables to quickly recall the key decisions: when to use multi-agent, which topology, how many agents, and what framework.
The Decision in 30 Seconds
| Question | Answer |
|---|---|
| When to use multi-agent? | When a task requires 2+ distinct capabilities that a single agent handles poorly |
| Which pattern? | Supervisor for most cases. Swarm for peer handoffs. Hierarchical for large agent teams |
| How many agents? | Start with 2-3. Each additional agent adds latency and cost |
| Biggest risk? | Token budget explosion and infinite routing loops |
| Framework? | LangGraph 0.3+ with Command API and create_react_agent |
Official Documentation
- LangGraph Docs — core framework documentation
- LangGraph Multi-Agent Tutorial — official supervisor example
- langgraph-supervisor Package — prebuilt supervisor pattern
- langgraph-swarm Package — prebuilt swarm pattern
- LangGraph Platform — managed deployment
Related
- LangGraph Tutorial — foundational LangGraph concepts and single-agent graphs
- LangChain vs LangGraph — when to use chains vs graphs
- Agentic Design Patterns — reflection, planning, tool use, and multi-agent patterns
- AI Agents — what agents are and how they work
- Agentic Frameworks Comparison — LangGraph vs CrewAI vs AutoGen vs Pydantic AI
- LLM Evaluation — measuring agent quality systematically
Last updated: March 2026. LangGraph’s API evolves quickly; verify current patterns against the official documentation.
Frequently Asked Questions
What is the supervisor pattern in multi-agent systems?
The supervisor pattern uses a central agent that routes tasks to specialized sub-agents for research, code generation, review, and other capabilities. The supervisor decides which agent runs next based on the current state, manages message routing between agents, and handles error recovery. This creates higher quality outputs because each agent focuses on what it does best.
Why use multi-agent systems instead of a single agent?
A single LLM call attempting multiple capabilities (research, coding, review) produces mediocre results across the board. Multi-agent systems decompose complex tasks into specialized sub-tasks where each agent focuses on one capability. This yields higher quality outputs, clearer reasoning traces, and the ability to assign different tools and models to different tasks.
How does LangGraph handle multi-agent state management?
LangGraph provides explicit state management where shared state flows between nodes with type safety. The supervisor node reads the current state to decide which agent runs next. Each sub-agent receives relevant state, performs its work, and returns state updates. Conditional routing enables re-routing on failure, and built-in persistence via checkpointers enables recovery for long-running workflows.
When should I use a multi-agent system vs a single agent?
Use multi-agent systems when the task requires multiple distinct capabilities (research plus coding plus review), when different sub-tasks benefit from different tools or models, or when you need clear separation of concerns for debugging and testing. A single agent is sufficient for focused tasks that require only one capability.
What is the Command API in LangGraph 0.3+?
The Command API replaces manual edge routing in LangGraph 0.3+. Agents return Command(goto='next_node') instead of relying on conditional edge functions, providing type-safe routing between nodes. Sub-agent node wrappers use Command to route back to the supervisor after execution.
How do you prevent infinite loops in a multi-agent graph?
Implement at least one safeguard: an iteration counter with a hard limit on total routing decisions, LangGraph's built-in recursion_limit parameter in graph.compile(), a wall-clock timeout on the entire graph execution, or a convergence check that detects repeated routing patterns and forces termination.
What tools does each sub-agent need in a supervisor system?
Each sub-agent is assigned tools matching its specialized role. A research agent uses search tools like TavilySearchResults for web research. A code agent uses code execution tools. A review agent uses code review and static analysis tools. Tools are bound to agents via create_react_agent, ensuring each agent only accesses capabilities relevant to its role.
What is the difference between the supervisor pattern and swarm pattern?
The supervisor pattern uses a central coordinator that routes tasks to specialized sub-agents based on intent classification. The swarm pattern uses peer-to-peer agent handoffs without a central coordinator — each agent decides who should act next. Use supervisor for 2-5 agents with clear role separation; use swarm for peer agents with equal authority.
How does fan-out and fan-in routing work in LangGraph?
Fan-out routing sends tasks to multiple agents in parallel by returning multiple Command objects from the supervisor node. Fan-in collects results from all parallel agents back at the supervisor, which merges them and decides whether additional routing is needed or the task is complete. This pattern reduces latency for tasks with independent sub-components.
What are the cost and latency implications of multi-agent systems?
Each routing step involves at least one LLM call for the supervisor decision plus the sub-agent's LLM calls. A 3-agent system with one routing cycle per agent requires a minimum of 7 LLM calls. At GPT-4o pricing, a complex multi-agent interaction can cost $0.10-$0.50 per request. Latency compounds similarly — a 7-call workflow takes 14+ seconds sequentially.