CrewAI Tutorial — Build Multi-Agent Systems in Python (2026)
CrewAI lets you build multi-agent AI systems where specialized agents collaborate on complex tasks — using plain Python. Instead of one monolithic prompt doing everything, you define agents with distinct roles, assign them tasks, and let CrewAI orchestrate the execution. This tutorial takes you from zero to a working multi-agent crew in under 20 minutes.
Who this is for:
- Junior engineers: You want to build your first multi-agent system beyond a single LLM call
- Senior engineers: You need a fast way to prototype role-based agent workflows before deciding on a production framework
Why CrewAI Matters
Single-agent systems break down when the task requires multiple areas of expertise. Cramming research, analysis, and writing into one prompt produces mediocre results because the LLM cannot specialize.
CrewAI solves this with role-based multi-agent coordination:
| Challenge | Single Agent | With CrewAI |
|---|---|---|
| Research + analysis + writing in one prompt | Context overload, unfocused output | Three specialized agents, each with a clear role |
| Agent needs 10+ tools | Reasoning degrades with tool sprawl | Each agent gets only the tools it needs |
| Output quality varies unpredictably | No review step — one-shot generation | Reviewer agent checks quality before final output |
| Changing one part of the workflow | Rewrite the entire prompt | Swap one agent or task definition |
CrewAI’s core insight: agent specialization through role assignment produces better results than general-purpose prompts. The framework handles orchestration, context passing, memory, and tool dispatch — you focus on defining who does what.
When to Use CrewAI
CrewAI fits workflows that map naturally to a team of specialists working together. If you can describe your workflow as “Agent A does X, passes the result to Agent B who does Y,” CrewAI is the right tool.
Strong use cases:
- Research crews — A researcher gathers sources, an analyst extracts key findings, a writer produces the final report
- Content pipelines — An SEO analyst identifies keywords, a writer drafts the article, an editor reviews tone and accuracy
- Data analysis teams — A data collector pulls from APIs, a statistician runs analysis, a reporter summarizes findings
- Customer support triage — A classifier categorizes tickets, a domain expert drafts responses, a QA agent reviews accuracy
When to reach for something else:
- Workflows with loops and retries — LangGraph handles cycles natively. CrewAI’s sequential model does not loop.
- Human-in-the-loop approval gates — CrewAI has a basic `human_input` flag on tasks, but LangGraph's `interrupt()` is more robust for production approval workflows.
- Simple single-agent tasks — If one LLM call with a good prompt solves your problem, CrewAI's overhead is unnecessary.
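For context on that trade-off, this is roughly what CrewAI's human approval gate looks like. A minimal sketch, assuming the standard `Agent`/`Task` constructors; the role text and task wording are illustrative:

```python
from crewai import Agent, Task

# Sketch of a human approval gate. With human_input=True, kickoff()
# pauses after this task and prompts the operator in the terminal to
# approve or give feedback before the crew continues.
writer = Agent(
    role="Outreach Writer",
    goal="Draft concise, accurate outreach emails",
    backstory="Careful writer who never overstates claims.",
)

draft_task = Task(
    description="Draft the outreach email for the enterprise prospect.",
    expected_output="A ready-to-send email under 200 words.",
    agent=writer,
    human_input=True,  # pause for operator review before the next task
)
```

This works for occasional terminal-based review; for programmatic approval gates in a service, LangGraph's checkpoint-based interrupts are the safer bet.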
How CrewAI Works — Architecture
CrewAI follows a straightforward execution model: you define agents, assign them tasks, organize tasks into a crew, and kick off execution.
CrewAI Execution Pipeline
The Four Building Blocks
- Agents — LLM-powered workers defined by a `role`, `goal`, and `backstory`. The backstory shapes the agent's personality and approach. Each agent can have its own tools and LLM configuration.
- Tasks — Units of work assigned to agents. Each task has a `description`, `expected_output`, and an assigned `agent`. Tasks can reference other tasks via `context` to receive their outputs.
- Crews — Collections of agents and tasks with an execution strategy. `Process.sequential` runs tasks in order. `Process.hierarchical` adds a manager agent that delegates and reviews.
- Tools — Python functions that give agents external capabilities like web search, file access, API calls, or code execution. Assigned per-agent to enforce role boundaries.
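Custom tools are ordinary Python functions. A minimal sketch, assuming the `tool` decorator is importable from `crewai.tools` (the tool name and function body are illustrative):

```python
from crewai.tools import tool

@tool("Word Counter")
def count_words(text: str) -> int:
    """Count the number of whitespace-separated words in the given text.

    The docstring matters: CrewAI passes it to the LLM so the agent
    knows when and how to call the tool.
    """
    return len(text.split())

# Assign per-agent to keep role boundaries tight, e.g.:
# editor = Agent(role="Editor", ..., tools=[count_words])
```

CrewAI also ships prebuilt tools (web search, scraping) in `crewai-tools`, shown in the examples below.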
CrewAI Tutorial — Build Your First Crew
You will build a research crew with three agents: a researcher who gathers information, a writer who drafts a report, and a reviewer who checks quality.
Step 1: Install CrewAI
```bash
pip install crewai crewai-tools
```

CrewAI requires Python 3.10+ and works with OpenAI, Anthropic, and Google models. Set your API key:

```bash
export OPENAI_API_KEY="your-api-key-here"
```

Step 2: Define Your Agents
Each agent needs a role (job title), goal (what success looks like), and backstory (personality and expertise).
```python
from crewai import Agent

researcher = Agent(
    role="Senior Research Analyst",
    goal="Find comprehensive, accurate information on the given topic",
    backstory=(
        "You are an experienced research analyst who excels at "
        "finding reliable sources, cross-referencing data points, "
        "and identifying the most important trends. You always "
        "cite your sources and flag uncertainty."
    ),
    verbose=True,
    allow_delegation=False,
)

writer = Agent(
    role="Technical Content Writer",
    goal="Transform research findings into clear, engaging content",
    backstory=(
        "You are a skilled technical writer who turns complex "
        "research into accessible, well-structured articles. "
        "You prioritize clarity over jargon and always include "
        "concrete examples."
    ),
    verbose=True,
    allow_delegation=False,
)

reviewer = Agent(
    role="Quality Assurance Editor",
    goal="Ensure content is accurate, complete, and well-structured",
    backstory=(
        "You are a meticulous editor who catches factual errors, "
        "unclear explanations, and structural problems. You provide "
        "specific, actionable feedback rather than vague suggestions."
    ),
    verbose=True,
    allow_delegation=False,
)
```

Step 3: Define Your Tasks
Each task describes what needs to be done, what the output should look like, and which agent owns it.
```python
from crewai import Task

research_task = Task(
    description=(
        "Research the current state of AI agents in enterprise "
        "software. Cover: key frameworks, adoption trends, "
        "common use cases, and challenges. Focus on data from "
        "2025-2026."
    ),
    expected_output=(
        "A structured research brief with 5 key findings, "
        "each supported by specific data points or examples. "
        "Include source references."
    ),
    agent=researcher,
)

writing_task = Task(
    description=(
        "Write a 500-word article based on the research findings. "
        "Use clear headings, concrete examples, and a professional "
        "but accessible tone."
    ),
    expected_output=(
        "A polished article in markdown format with an introduction, "
        "3-4 sections with headings, and a conclusion."
    ),
    agent=writer,
    context=[research_task],  # Writer receives researcher's output
)

review_task = Task(
    description=(
        "Review the article for factual accuracy, clarity, and "
        "completeness. Check that all research findings are "
        "accurately represented and the writing is engaging."
    ),
    expected_output=(
        "The final article with any corrections applied, plus "
        "a brief editorial note listing changes made."
    ),
    agent=reviewer,
    context=[research_task, writing_task],  # Reviewer sees both
)
```

Step 4: Create and Run the Crew
```python
from crewai import Crew, Process

crew = Crew(
    agents=[researcher, writer, reviewer],
    tasks=[research_task, writing_task, review_task],
    process=Process.sequential,
    verbose=True,
)

result = crew.kickoff()
print(result.raw)
```

What happens when you run this:
- The researcher executes first, producing a structured research brief
- The writer receives the research brief via `context` and drafts the article
- The reviewer receives both the research and the article, then produces the final output
- `crew.kickoff()` returns a `CrewOutput` with `.raw` (string), `.tasks_output` (per-task results), and optional structured data
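A sketch of inspecting the result object after a run. Attribute names follow the `CrewOutput` description above; the per-task fields are assumed from the same API:

```python
# Assumes `crew` is the Crew built in Step 4.
result = crew.kickoff()

print(result.raw)  # the final task's output as a string

# Per-task results, in execution order (researcher, writer, reviewer):
for task_output in result.tasks_output:
    print(task_output.raw[:200])  # preview each task's raw output
```

Logging `tasks_output` is also the easiest way to see where quality degraded when the final output is bad.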
CrewAI Architecture Deep Dive
Understanding the full stack helps you debug issues and optimize performance.
CrewAI Architecture Stack
Key Architecture Details
Memory system: Three layers — short-term (current run), long-term (across runs), and entity memory (tracks people, companies, concepts). Enable with memory=True on the Crew.
LLM flexibility: Each agent can use a different LLM. Set llm="gpt-4o" or llm="anthropic/claude-sonnet-4-20250514" on individual agents to mix models based on task requirements.
Tool isolation: Tools are assigned per-agent, not globally. This prevents a writer agent from calling a database deletion tool that only the admin agent should access.
Delegation: When allow_delegation=True, an agent can ask another agent in the crew for help. The manager agent in hierarchical mode uses this to route subtasks dynamically.
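Putting those knobs together, a configuration sketch (not executed here; the model identifiers, roles, and backstories are examples only):

```python
from crewai import Agent

# Mix models by task difficulty: a cheap model for mechanical work,
# a stronger model for reasoning. Identifiers are illustrative.
formatter = Agent(
    role="Formatter",
    goal="Normalize report formatting and fix markdown issues",
    backstory="Detail-oriented copy editor.",
    llm="gpt-4o-mini",       # cheap model for mechanical work
    allow_delegation=False,  # no access to other agents
)

analyst = Agent(
    role="Analyst",
    goal="Draw defensible conclusions from the research data",
    backstory="Senior analyst who quantifies uncertainty.",
    llm="gpt-4o",            # stronger model for reasoning
    allow_delegation=True,   # may hand subtasks to the formatter
)

# On the Crew itself, memory=True switches on all three memory layers:
# crew = Crew(agents=[formatter, analyst], tasks=[...], memory=True)
```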
CrewAI Advanced Examples
Section titled “CrewAI Advanced Examples”Example 1: Research Crew with Web Search
```python
from crewai import Agent, Task, Crew, Process
from crewai_tools import SerperDevTool, ScrapeWebsiteTool

search_tool = SerperDevTool()
scrape_tool = ScrapeWebsiteTool()

researcher = Agent(
    role="Web Research Specialist",
    goal="Find the most relevant and recent information on the topic",
    backstory="Expert at web research who verifies facts across sources",
    tools=[search_tool, scrape_tool],
    verbose=True,
)

analyst = Agent(
    role="Data Analyst",
    goal="Extract actionable insights from raw research data",
    backstory="Analytical thinker who spots patterns and trends in data",
    verbose=True,
)

research_task = Task(
    description="Search for the latest developments in {topic}. Find at least 5 sources.",
    expected_output="A list of 5 key findings with source URLs",
    agent=researcher,
)

analysis_task = Task(
    description="Analyze the research findings and produce 3 actionable recommendations",
    expected_output="3 prioritized recommendations with supporting evidence",
    agent=analyst,
    context=[research_task],
)

crew = Crew(
    agents=[researcher, analyst],
    tasks=[research_task, analysis_task],
    process=Process.sequential,
    verbose=True,
)

result = crew.kickoff(inputs={"topic": "AI agent frameworks in 2026"})
```

Example 2: Code Review Crew with Structured Output
```python
from crewai import Agent, Task, Crew, Process
from pydantic import BaseModel

class ReviewResult(BaseModel):
    issues_found: list[str]
    suggestions: list[str]
    approval_status: str

security_reviewer = Agent(
    role="Security Auditor",
    goal="Identify security vulnerabilities and unsafe patterns",
    backstory="Cybersecurity expert focused on code-level vulnerabilities",
    verbose=True,
)

style_reviewer = Agent(
    role="Code Style Reviewer",
    goal="Ensure code follows best practices and is maintainable",
    backstory="Senior engineer who maintains coding standards for the team",
    verbose=True,
)

security_task = Task(
    description="Review this code for security issues: {code_snippet}",
    expected_output="List of security vulnerabilities with severity ratings",
    agent=security_reviewer,
)

summary_task = Task(
    description="Combine security review with style analysis into a final verdict",
    expected_output="Structured review result with approval status",
    agent=style_reviewer,
    context=[security_task],
    output_pydantic=ReviewResult,  # Enforces structured output
)

review_crew = Crew(
    agents=[security_reviewer, style_reviewer],
    tasks=[security_task, summary_task],
    process=Process.sequential,
    verbose=True,
)

result = review_crew.kickoff(inputs={
    "code_snippet": "def process_user_input(data): return eval(data)"
})
print(result.pydantic)  # ReviewResult instance
```

Example 3: Hierarchical Data Analysis Pipeline
```python
from crewai import Agent, Task, Crew, Process

collector = Agent(
    role="Data Collection Specialist",
    goal="Gather and clean raw data from multiple sources",
    backstory="Data engineer skilled at ETL and data quality validation",
    verbose=True,
)

statistician = Agent(
    role="Statistical Analyst",
    goal="Apply statistical methods to extract meaningful patterns",
    backstory="Statistician who translates numbers into business insights",
    verbose=True,
)

collect_task = Task(
    description="Collect Q4 2025 sales data. Identify missing values and outliers.",
    expected_output="Clean dataset summary with data quality report",
    agent=collector,
)

analyze_task = Task(
    description="Perform trend analysis and write an executive summary",
    expected_output="One-page summary with 3 key trends and recommendations",
    agent=statistician,
    context=[collect_task],
    output_file="executive_summary.md",
)

pipeline = Crew(
    agents=[collector, statistician],
    tasks=[collect_task, analyze_task],
    process=Process.hierarchical,  # Manager agent coordinates
    manager_llm="gpt-4o",
    memory=True,
    verbose=True,
)

result = pipeline.kickoff()
```

CrewAI vs LangGraph Agents
The two most popular multi-agent frameworks in 2026 take fundamentally different approaches to agent coordination.
CrewAI vs LangGraph for Multi-Agent Systems
CrewAI strengths:
- Intuitive role/goal/backstory agent design
- Working crew in under 50 lines of Python
- Built-in memory (short-term, long-term, entity)
- Structured output via Pydantic on tasks

CrewAI limitations:
- No native cycles or retry loops
- No checkpoint-based state persistence
- Basic human-in-the-loop (human_input flag)

LangGraph strengths:
- Explicit execution graphs with conditional routing
- Built-in checkpointing (SQLite, PostgreSQL, Redis)
- First-class human-in-the-loop via interrupt()
- Cycles enable retry loops and iterative refinement

LangGraph limitations:
- More boilerplate — state schema + edge definitions
- Steeper learning curve for simple tasks
- No role-based abstractions — agents are just nodes
Decision framework: If your workflow looks like an org chart (clear roles, sequential handoffs), start with CrewAI. If it looks like a flowchart (branches, loops, conditional routing), start with LangGraph. For a detailed breakdown, see LangGraph vs CrewAI.
Interview Questions
These four questions cover the multi-agent architecture concepts that come up in GenAI engineering interviews when discussing CrewAI and role-based agent coordination.
Q1: “What is CrewAI and how does it differ from single-agent systems?”
What they are testing: Do you understand why multi-agent architectures exist?
Strong answer: “CrewAI is a role-based multi-agent framework where you define agents with distinct roles, goals, and backstories, then assign them tasks in a crew. Unlike single-agent systems where one LLM handles everything, CrewAI splits work across specialized agents — a researcher, a writer, a reviewer — each focused on what it does best. This produces higher quality output because each agent’s context window is focused on its specific task rather than overloaded with the entire workflow.”
Weak answer: “CrewAI lets you run multiple LLMs at once.” (Misses the specialization and orchestration point)
Q2: “When would you choose CrewAI over LangGraph?”
What they are testing: Framework selection judgment — can you match the tool to the problem?
Strong answer: “I choose CrewAI when the workflow maps to a team of specialists with clear handoffs — like a content pipeline where a researcher, writer, and editor work sequentially. I choose LangGraph when the workflow has cycles, needs durable checkpointing, or requires human-in-the-loop approval gates. CrewAI gets me a working prototype faster; LangGraph gives me more control over execution flow.”
Q3: “How does context flow between tasks in CrewAI?”
What they are testing: Implementation-level understanding of the framework.
Strong answer: “Each task can reference other tasks via the context parameter. When a writing task sets context=[research_task], the writer agent receives the researcher’s output as additional context in its prompt. CrewAI also supports crew-level memory — short-term for the current run, long-term across runs, and entity memory for tracking specific subjects. The combination of explicit context passing and implicit memory gives agents both structured and ambient awareness.”
Q4: “What are the risks of multi-agent systems?”
What they are testing: Production maturity — do you think beyond the happy path?
Strong answer: “Three main risks: cost multiplication (each agent makes its own LLM calls, so a 3-agent crew costs roughly 3x a single agent), error propagation (a bad output from the first agent poisons all downstream tasks), and non-determinism (agent outputs vary between runs). I mitigate these with structured outputs via Pydantic to enforce consistency, verbose=True for debugging, cost monitoring per agent, and guardrails on critical tasks.”
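The cost-multiplication point is easy to sanity-check with back-of-envelope arithmetic. A small sketch; the per-token prices below are placeholders, not any provider's actual rates:

```python
# Rough cost model for a sequential crew: each agent makes its own LLM
# calls, so spend scales roughly linearly with the number of agents.
# Prices are PLACEHOLDERS for illustration; check your provider's rates.
PRICE_PER_1K_INPUT = 0.0025   # assumed $ per 1K input tokens
PRICE_PER_1K_OUTPUT = 0.0100  # assumed $ per 1K output tokens

def estimate_run_cost(num_agents: int, input_tokens: int, output_tokens: int) -> float:
    """Estimate dollars per crew run, assuming similar token usage per agent."""
    per_agent = (
        input_tokens / 1000 * PRICE_PER_1K_INPUT
        + output_tokens / 1000 * PRICE_PER_1K_OUTPUT
    )
    return num_agents * per_agent
```

Under these assumed prices, a 3-agent crew at 4,000 input and 1,500 output tokens per agent comes to about $0.075 per run, which is why routing cheap subtasks to smaller models matters at scale.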
CrewAI in Production
Moving from prototype to production requires attention to cost, reliability, and observability.
Cost management: A 3-agent crew with GPT-4o can cost $0.10-0.50 per run depending on task complexity. Use cheaper models (GPT-4o-mini, Claude Haiku) for formatting or classification tasks, and reserve expensive models for complex reasoning.
Structured outputs: Always use output_pydantic or output_json on tasks that feed into downstream code. Intermediate task outputs should be structured so the next agent can parse them reliably.
Error handling: Set max_retry_limit on the Crew for automatic retries. Wrap tool functions in try/except blocks and return descriptive error messages — the agent can adapt its approach when it gets a clear error instead of a stack trace.
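The tool-level half of that advice looks like this in plain Python. A sketch of a defensively written fetch function (function name is illustrative; in a real crew you would register it as a CrewAI tool):

```python
import urllib.request
import urllib.error

def fetch_page(url: str) -> str:
    """Fetch a URL and return its text, or a descriptive error message.

    Returning a readable error string (instead of raising) lets the
    agent see what went wrong and adapt, e.g. try a different URL.
    """
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")[:2000]
    except (urllib.error.URLError, ValueError) as exc:
        return f"Error fetching {url}: {exc}. Try a different URL or check the address."
```

A malformed URL yields a message like `Error fetching not-a-url: unknown url type ...`, which the agent can reason about, rather than a traceback that kills the task.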
Observability: Enable verbose=True during development. For production, CrewAI integrates with LangSmith. Log result.tasks_output to track per-task execution times and output quality.
Scaling: Use crew.kickoff_async() for concurrent execution or crew.kickoff_for_each(inputs=[...]) for batch processing. Pin crewai and crewai-tools versions in requirements.txt.
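A batch-processing sketch using `kickoff_for_each` as named above (not executed here; assumes `crew` is built as in the tutorial, and each input dict's keys match the `{placeholders}` in your task descriptions):

```python
# Run the same crew once per input dict; each run gets its own
# interpolated {topic}. Topics below are examples only.
batch_inputs = [
    {"topic": "AI agent frameworks in 2026"},
    {"topic": "LLM evaluation tooling"},
    {"topic": "Retrieval-augmented generation"},
]

results = crew.kickoff_for_each(inputs=batch_inputs)
for res in results:
    print(res.raw[:200])  # preview each run's final output
```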
Summary and Key Takeaways
- CrewAI models multi-agent workflows as teams — agents with roles, goals, and backstories collaborate on tasks
- Four building blocks: Agents (who), Tasks (what), Crews (how), and Tools (with what)
- Sequential process runs tasks in order; hierarchical process adds a manager agent for delegation and quality control
- Context passing via the `context` parameter chains task outputs — the writer receives the researcher's findings automatically
- Structured outputs with `output_pydantic` enforce predictable data formats for production reliability
- Start small — 2-3 agents with clear role boundaries. Expand only when you identify a genuine need for additional specialization
- Know the limits — CrewAI does not support cycles or checkpoint-based persistence. For those, use LangGraph
Related
- LangGraph vs CrewAI — Detailed comparison of graph-based vs role-based orchestration
- AI Agents — Agent architectures and when to use multi-agent systems
- Agentic Frameworks Compared — CrewAI vs LangGraph vs AutoGen
- Agentic Design Patterns — ReAct, Plan-and-Execute, and delegation patterns
- Agent Debugging — Techniques for debugging multi-agent workflows
Frequently Asked Questions
What is CrewAI and what is it used for?
CrewAI is a Python framework for building multi-agent AI systems using role-based orchestration. You define agents with specific roles, goals, and backstories, assign them tasks, and organize them into crews that execute sequentially or hierarchically. CrewAI is used for research automation, content pipelines, data analysis teams, and any workflow where multiple specialized AI agents need to collaborate.
How do I install CrewAI?
Install CrewAI with pip: pip install crewai crewai-tools. This installs the core framework and the official tool library. CrewAI requires Python 3.10 or higher and works with OpenAI, Anthropic, Google, and other LLM providers out of the box.
What is the difference between sequential and hierarchical process in CrewAI?
Sequential process executes tasks in the order you define them — Task 1 completes, its output feeds into Task 2, and so on. Hierarchical process adds a manager agent that coordinates task delegation, reviews outputs, and decides when work meets quality standards. Use sequential for simple linear workflows and hierarchical when tasks require quality gates or dynamic delegation.
How do I add custom tools to a CrewAI agent?
Define tools using the @tool decorator (imported from crewai.tools), then pass them to the Agent constructor via the tools parameter. Each tool is a Python function with a descriptive name and docstring. CrewAI also integrates with LangChain tools and provides built-in tools like SerperDevTool for web search and ScrapeWebsiteTool for web scraping.
How does CrewAI compare to LangGraph?
CrewAI uses role-based orchestration where agents have roles, goals, and backstories — ideal for workflows that map to team structures. LangGraph uses graph-based state machines with explicit nodes, edges, and conditional routing — ideal for complex workflows with cycles, retries, and human-in-the-loop. CrewAI is faster to prototype; LangGraph gives more fine-grained execution control.
Can CrewAI agents share context between tasks?
Yes. Use the context parameter on a Task to pass outputs from previous tasks. When you set context=[research_task] on a writing task, the writer agent receives the researcher's output as additional context. CrewAI also supports crew-level memory (short-term, long-term, and entity memory) that persists information across all tasks in the crew run.
How do I get structured output from CrewAI?
Use the output_pydantic or output_json parameter on a Task. Define a Pydantic model with the fields you need, then set output_pydantic=YourModel on the task. CrewAI instructs the agent to return data matching that schema, which is essential for production pipelines where downstream code needs predictable data structures.
What are common mistakes when building CrewAI crews?
Common mistakes include writing vague agent backstories that do not constrain behavior, omitting expected_output on tasks so agents produce inconsistent formats, not using the context parameter to chain task outputs, giving agents too many tools which degrades reasoning quality, and skipping verbose=True during development which makes debugging nearly impossible.
Is CrewAI free and open source?
Yes. CrewAI is MIT-licensed and free to use. The core framework and tools library are open source on GitHub. CrewAI also offers CrewAI+ as a paid enterprise tier with managed deployment and monitoring dashboards, but the open-source version is fully functional for production use.
How do I handle errors and retries in CrewAI?
Set max_retry_limit on the Crew to allow automatic retries when tasks fail. For tool-level errors, wrap tool functions in try/except blocks and return descriptive error messages so the agent can adapt its approach. Use verbose=True to monitor agent reasoning during execution, and implement task callbacks to log outcomes and trigger alerts on failures.
Last updated: March 2026 | CrewAI 0.100+ / Python 3.10+