LangChain LCEL Tutorial — Build Chains the Modern Way (2026)
LangChain LCEL is the modern way to build chains in LangChain v0.3+. If you have used LangChain before 2024, you probably wrote LLMChain or SequentialChain code that no longer works. LCEL (LangChain Expression Language) replaced all of that with a single concept: pipe Runnables together using the | operator. This tutorial walks you through 6 runnable Python examples — from basic chains to parallel execution and fallbacks.
For the broader LangChain ecosystem (RAG, agents, tool calling), see the full LangChain tutorial. This page focuses specifically on LCEL composition patterns.
1. Why LangChain LCEL Replaced Legacy Chains
LangChain LCEL exists because the old chain API was a dead end. Here is what happened and why it matters.
The Problem with LLMChain and SequentialChain
Before LCEL, LangChain had dedicated chain classes for every pattern:
| Legacy Class | What It Did | Why It Was Removed |
|---|---|---|
| `LLMChain` | Prompt + LLM in one object | Could not stream, could not compose with other chains easily |
| `SequentialChain` | Multiple chains in sequence | Required explicit `input_keys` / `output_keys` mapping — brittle |
| `TransformChain` | Custom transformation step | Separate API from `LLMChain` — not interchangeable |
| `RouterChain` | Conditional routing | Complex setup for a simple if/else |
Every chain type had its own API. Streaming required special methods. Combining chains meant learning each class’s unique interface. The cognitive overhead was enormous.
What LCEL Changed
LCEL unified everything under one interface: `Runnable`. Every component — prompts, models, parsers, retrievers, custom functions — implements the same three methods: `.invoke()`, `.stream()`, `.batch()`. You compose them with the `|` operator:
```python
# Old way (deprecated)
chain = LLMChain(llm=llm, prompt=prompt)

# New way (LCEL)
chain = prompt | llm | parser
```

The `|` operator calls Python’s `__or__` method, creating a `RunnableSequence` that passes output from one step as input to the next. Streaming works automatically through every step in the chain.
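To make the `__or__` mechanics concrete, here is a toy sketch in plain Python. The names `ToyRunnable`, `ToySequence`, `AddExclamation`, and `Upper` are invented for illustration; this is not LangChain's actual implementation, just the composition idea in miniature:

```python
# Python calls __or__ on the left operand of `|`, which returns a new
# object that invokes each step in order.
class ToyRunnable:
    def invoke(self, value):
        raise NotImplementedError

    def __or__(self, other):
        return ToySequence(self, other)

class ToySequence(ToyRunnable):
    def __init__(self, first, second):
        self.first, self.second = first, second

    def invoke(self, value):
        # Output of the first step becomes input to the second
        return self.second.invoke(self.first.invoke(value))

class AddExclamation(ToyRunnable):
    def invoke(self, value):
        return value + "!"

class Upper(ToyRunnable):
    def invoke(self, value):
        return value.upper()

chain = AddExclamation() | Upper()   # __or__ builds a ToySequence
print(chain.invoke("lcel"))          # LCEL!
```

LangChain's real `RunnableSequence` works the same way in spirit, with streaming, batching, and async support layered on top.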
2. When LCEL Matters — Real-World Patterns
LCEL shines when you need to compose multiple steps. Here are the patterns you will actually use in production.
| Pattern | LCEL Expression | Use Case |
|---|---|---|
| Simple generation | `prompt \| llm \| parser` | Chatbots, text generation, summarization |
| RAG retrieval chain | `{"context": retriever, "question": passthrough} \| prompt \| llm \| parser` | Q&A over documents, knowledge bases |
| Multi-step reasoning | `chain_1 \| chain_2 \| chain_3` | Extract → analyze → summarize pipelines |
| Parallel tool calls | `RunnableParallel(a=chain_a, b=chain_b)` | Run sentiment + keywords + summary simultaneously |
| Streaming responses | `chain.stream(input)` | Real-time UX in web applications |
| Fallback chains | `chain.with_fallbacks([backup])` | GPT-4o primary, Claude fallback on failure |
If your use case is a single LLM API call with no composition, skip LCEL and use the raw SDK directly. LCEL adds value when you chain 2+ components together.
3. How LCEL Works — The Pipe Operator
The pipe operator is the core of LCEL. Understanding how data flows through a chain unlocks every pattern in this tutorial.
The Runnable Protocol
Every LCEL component implements the Runnable interface:
```python
class Runnable:
    def invoke(self, input): ...          # Single input → single output
    def stream(self, input): ...          # Single input → streamed output chunks
    def batch(self, inputs): ...          # Multiple inputs → multiple outputs (concurrent)
    async def ainvoke(self, input): ...   # Async version of invoke
    async def astream(self, input): ...   # Async version of stream
```

When you write `prompt | llm | parser`, Python creates a `RunnableSequence`. Calling `.invoke()` on the sequence passes data through each component in order. Calling `.stream()` streams tokens through the entire pipeline.
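The end-to-end streaming claim can be illustrated with a plain-Python generator pipeline. The functions below (`fake_llm`, `upper_parser`, `stream_chain`) are invented stand-ins, not LangChain code: each stage consumes the previous stage's chunks lazily, so tokens flow through without waiting for the full output.

```python
def fake_llm(prompt):
    # Stands in for a model streaming tokens one at a time
    for token in ["LCEL ", "streams ", "chunks."]:
        yield token

def upper_parser(chunks):
    # Stands in for a parser transforming each chunk as it arrives
    for chunk in chunks:
        yield chunk.upper()

def stream_chain(prompt):
    # Composing generators: the parser pulls from the model lazily
    return upper_parser(fake_llm(prompt))

print("".join(stream_chain("explain LCEL")))  # LCEL STREAMS CHUNKS.
```

This is why a parser in an LCEL chain can yield partial results before the model has finished generating.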
Input/Output Schema
Each Runnable declares its input and output schemas. The pipe operator connects them:
- PromptTemplate — Input: `dict` with template variables. Output: `ChatPromptValue` (formatted messages).
- ChatModel — Input: messages. Output: `AIMessage` with content and metadata.
- OutputParser — Input: `AIMessage`. Output: `str`, `dict`, or Pydantic object.
If the output of step N does not match the input step N+1 expects, the chain fails with a clear error identifying the mismatched step. In practice this surfaces on the first `.invoke()` call rather than at construction time, so exercise new chains early with a test input.
LCEL Pipe Flow
*Diagram: LCEL Pipe Operator — Data Flow. Each component transforms input and passes output to the next.*
4. LangChain LCEL Step by Step
Follow these 6 steps to go from zero to a working LCEL chain with streaming.
Step 1: Install LangChain
```bash
pip install langchain langchain-openai langchain-core
```

Pin versions in production: `langchain==0.3.x` and `langchain-core==0.3.x`. LangChain releases weekly and occasionally introduces breaking changes.
Step 2: Create a Prompt Template
Section titled “Step 2: Create a Prompt Template”from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([ ("system", "You are a senior engineer who explains concepts clearly."), ("user", "Explain {topic} in 3 bullet points. Be specific.")])The template declares {topic} as an input variable. When invoked, it produces formatted chat messages.
Step 3: Pipe to an LLM
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0)
chain = prompt | llm
```

This two-step chain renders the prompt, then sends it to GPT-4o. The output is an `AIMessage` object.
Step 4: Add an Output Parser
```python
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()
chain = prompt | llm | parser
```

Now the chain returns a plain string instead of an `AIMessage`. For structured output, use `JsonOutputParser` or `PydanticOutputParser`.
Step 5: Chain with a Retriever (RAG)
```python
from langchain_core.runnables import RunnablePassthrough

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | parser
)
```

`RunnablePassthrough()` forwards the user question unchanged. The retriever fetches relevant documents concurrently. Both feed into the prompt template. See the RAG architecture guide for the full pattern.
Step 6: Add Streaming
```python
for chunk in chain.stream({"topic": "LCEL in LangChain"}):
    print(chunk, end="", flush=True)
```

Every component in the chain supports streaming. The prompt renders instantly, the LLM streams tokens, and the parser yields chunks as they arrive. No extra configuration needed.
5. LCEL Architecture — Runnable Composition
LCEL’s power comes from composing Runnables in two patterns: sequential chains and parallel branches.
Sequential vs Parallel Composition
*Diagram: LCEL Composition Patterns — sequential chains vs parallel branches; both are Runnables.*
How RunnableParallel Works
`RunnableParallel` takes a dictionary of named chains and runs them all concurrently:
```python
from langchain_core.runnables import RunnableParallel

parallel = RunnableParallel(
    summary=prompt_summary | llm | parser,
    keywords=prompt_keywords | llm | parser,
)

# Both chains run at the same time
results = parallel.invoke({"text": "Your input here..."})
# results = {"summary": "...", "keywords": "..."}
```

The total latency equals the slowest branch, not the sum of all branches. For 3 chains that each take 2 seconds, sequential execution takes 6 seconds while parallel execution takes 2 seconds.
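The latency math is easy to verify with a plain-Python sketch. Nothing here is LangChain; `slow_chain` just simulates an LLM call with `time.sleep`, and a thread pool plays the role of `RunnableParallel`:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_chain(name):
    time.sleep(0.2)          # stands in for one 0.2 s LLM call
    return f"{name} done"

start = time.perf_counter()
with ThreadPoolExecutor() as pool:
    # Submit all three "chains" at once, then collect results by name
    futures = {k: pool.submit(slow_chain, k)
               for k in ["summary", "keywords", "sentiment"]}
    results = {k: f.result() for k, f in futures.items()}
elapsed = time.perf_counter() - start

print(results["summary"])    # summary done
print(f"{elapsed:.1f}s")     # roughly 0.2 s, not 0.6 s
```

Total wall time is close to one branch's duration because the three sleeps overlap, which is exactly the benefit `RunnableParallel` gives you for independent LLM calls.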
Dictionary Shorthand
You do not always need to import `RunnableParallel` explicitly. A plain dictionary in an LCEL chain creates one automatically:
```python
# This is equivalent to RunnableParallel
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | parser
)
```

The dictionary `{"context": ..., "question": ...}` runs both branches concurrently and passes the combined result to the prompt.
6. LangChain LCEL Code Examples in Python
Six runnable examples covering the patterns you will use most. Each example is self-contained.
Example 1: Basic LCEL Chain
The simplest possible chain — prompt, model, parser:
```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful coding assistant."),
    ("user", "Write a Python function that {task}")
])

chain = prompt | ChatOpenAI(model="gpt-4o") | StrOutputParser()
result = chain.invoke({"task": "reverses a linked list"})
print(result)
```

Example 2: RAG Chain with Retriever
A retrieval-augmented generation chain that answers questions from your documents:
```python
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(embedding_function=embeddings)
vectorstore.add_texts([
    "LCEL uses the pipe operator for composition.",
    "RunnableParallel runs chains concurrently.",
    "Every LCEL component implements invoke, stream, and batch.",
])
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

def format_docs(docs):
    return "\n".join(doc.page_content for doc in docs)

prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer based only on this context:\n{context}"),
    ("user", "{question}")
])

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4o")
    | StrOutputParser()
)

answer = rag_chain.invoke("How does LCEL composition work?")
print(answer)
```

Example 3: RunnablePassthrough for Data Forwarding
`RunnablePassthrough` forwards input unchanged. Use it when one branch processes data while another passes it through:
```python
from langchain_core.runnables import RunnablePassthrough

# RunnablePassthrough.assign() adds new keys alongside existing ones
chain = RunnablePassthrough.assign(
    word_count=lambda x: len(x["text"].split())
)

result = chain.invoke({"text": "LCEL makes LangChain chains composable"})
# result = {"text": "LCEL makes LangChain chains composable", "word_count": 5}
```

Example 4: RunnableParallel for Concurrent Execution
Run 3 analysis chains simultaneously on the same input:
```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel

llm = ChatOpenAI(model="gpt-4o")
parser = StrOutputParser()

summary_prompt = ChatPromptTemplate.from_template("Summarize in 1 sentence: {text}")
keywords_prompt = ChatPromptTemplate.from_template("Extract 5 keywords from: {text}")
sentiment_prompt = ChatPromptTemplate.from_template("Is this positive, negative, or neutral? {text}")

analysis = RunnableParallel(
    summary=summary_prompt | llm | parser,
    keywords=keywords_prompt | llm | parser,
    sentiment=sentiment_prompt | llm | parser,
)

results = analysis.invoke({"text": "LCEL is a game-changer for LangChain development."})
print(results["summary"])
print(results["keywords"])
print(results["sentiment"])
```

Example 5: Streaming Chain
Stream tokens to the user in real time:
```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_messages([
    ("system", "You explain technical concepts for senior engineers."),
    ("user", "Explain {topic} with a concrete code example.")
])

chain = prompt | ChatOpenAI(model="gpt-4o") | StrOutputParser()

# Synchronous streaming
for chunk in chain.stream({"topic": "LCEL Runnables"}):
    print(chunk, end="", flush=True)

# Async streaming (for FastAPI, Django, etc.)
# async for chunk in chain.astream({"topic": "LCEL Runnables"}):
#     yield chunk
```

Example 6: Chain with Fallbacks
Use `.with_fallbacks()` to add resilience when the primary model fails:
```python
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_template("Explain {topic} in 3 sentences.")
parser = StrOutputParser()

primary = prompt | ChatOpenAI(model="gpt-4o") | parser
fallback = prompt | ChatAnthropic(model="claude-sonnet-4-20250514") | parser

# If GPT-4o fails (rate limit, outage), automatically try Claude
resilient_chain = primary.with_fallbacks([fallback])

result = resilient_chain.invoke({"topic": "LangChain LCEL"})
print(result)
```

7. LCEL Trade-offs — When Not to Use It
LCEL is not always the right tool. Here are the cases where you should reach for something else.
When LCEL Is Overkill
- Single API call — If you just need `openai.chat.completions.create()`, the raw SDK is simpler. LCEL adds value only when you compose 2+ steps.
- Simple function calls — A Python function that calls an API and returns results does not need to be a Runnable. Regular Python is fine.
- Prototyping — When you are exploring an idea, write plain Python first. Convert to LCEL only when you need streaming, batching, or composition.
When to Use LangGraph Instead
LCEL handles linear pipelines where data flows in one direction. LangGraph handles everything else:
| Pattern | LCEL | LangGraph |
|---|---|---|
| Linear chain (A → B → C) | Yes | Overkill |
| Conditional routing (if X, do A; else do B) | Possible but awkward | Built-in |
| Loops (retry until success) | No | Yes |
| State persistence across requests | No | Yes |
| Human-in-the-loop approval | No | Yes |
Most production agent systems use LangGraph for orchestration with LCEL chains inside individual graph nodes. See the LangChain vs LangGraph comparison for a detailed breakdown.
Readability Concerns
The pipe operator syntax is elegant for short chains but degrades quickly:
```python
# Readable (3 pipes)
chain = prompt | llm | parser

# Still OK (with named dictionary)
chain = {"context": retriever | format_docs, "question": RunnablePassthrough()} | prompt | llm | parser

# Getting hard to follow (nested parallels)
chain = RunnableParallel(a=p1 | llm | pa, b=p2 | llm | pb) | merge_prompt | llm | parser
```

If you catch yourself writing deeply nested LCEL, refactor into named sub-chains:

```python
extract_chain = prompt_extract | llm | parser_extract
analyze_chain = prompt_analyze | llm | parser_analyze
final_chain = extract_chain | analyze_chain  # Much clearer
```

8. LangChain LCEL Interview Questions
These questions test whether you understand LCEL’s mechanics or just memorized the syntax.
Q1: “What is LCEL and why did LangChain adopt it?”
What they test: Do you understand the architectural motivation, not just the syntax?
Strong answer: “LCEL is LangChain’s declarative composition syntax built on the Runnable interface. Every component implements invoke, stream, and batch. The pipe operator connects them: prompt | llm | parser. LangChain adopted it because the legacy chain classes (LLMChain, SequentialChain) each had separate APIs, did not support streaming natively, and could not be easily composed. LCEL unified everything under one interface.”
Q2: “When would you use RunnableParallel vs a sequential chain?”
Strong answer: “Sequential for dependent steps where step N needs step N-1’s output. Parallel for independent steps that can run simultaneously — like extracting keywords, generating a summary, and scoring sentiment from the same input. Parallel reduces total latency from the sum of all steps to the duration of the slowest step.”
Q3: “Your LCEL chain is 8 pipes deep. How do you refactor?”
Strong answer: “Break it into named sub-chains of 2-3 pipes each, then compose those. If the chain has conditional logic or loops, migrate to LangGraph where each node is a short LCEL chain. Name each sub-chain descriptively so the pipeline reads like a recipe.”
Q4: “How does LCEL handle errors in a chain?”
Strong answer: “By default, an exception in any step propagates and stops the chain. You can add .with_fallbacks([backup_chain]) to try alternatives on failure, and .with_retry(stop_after_attempt=3) for transient errors like rate limits. For fine-grained error handling, wrap individual steps in try/except inside a RunnableLambda.”
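The fallback mechanics from this answer can be sketched in plain Python. The `with_fallbacks` helper below is an invented stand-in for illustration, not LangChain's implementation: try each chain in order, moving on only when the previous one raises.

```python
def with_fallbacks(primary, fallbacks):
    def run(value):
        last_error = None
        for chain in [primary, *fallbacks]:
            try:
                return chain(value)       # first chain that succeeds wins
            except Exception as err:
                last_error = err          # remember the failure, try the next
        raise last_error                  # every chain failed
    return run

def flaky_primary(value):
    raise RuntimeError("rate limited")    # simulates a provider outage

def backup(value):
    return f"backup answered: {value}"

resilient = with_fallbacks(flaky_primary, [backup])
print(resilient("What is LCEL?"))  # backup answered: What is LCEL?
```

LangChain's real `.with_fallbacks()` adds configuration such as which exception types trigger the fallback, but the control flow is the same try-in-order loop.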
9. LCEL in Production — Streaming and Tracing
Running LCEL chains in production requires observability, async execution, and cost awareness.
LangSmith Tracing
Enable LangSmith to trace every chain execution:
```bash
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=your-key
export LANGCHAIN_PROJECT=my-project
```

With tracing enabled, every `.invoke()` and `.stream()` call creates a trace span. You see the exact input and output of each step, latency per component, and token usage. Without this, debugging a 4-step chain that returns wrong answers is guesswork.
Async Execution in Web Frameworks
In FastAPI or Django, always use async methods to avoid blocking the event loop:
```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.post("/chat")
async def chat(question: str):
    async def generate():
        async for chunk in chain.astream({"question": question}):
            yield chunk
    return StreamingResponse(generate(), media_type="text/plain")
```

Using `chain.invoke()` in an async web server blocks the entire event loop for the duration of the LLM call. A single slow request blocks all other requests. Always use `chain.ainvoke()` or `chain.astream()`.
Retry and Fallback Patterns
Production chains need resilience against transient failures:
```python
# Retry on rate limits (up to 3 attempts with exponential backoff)
resilient = chain.with_retry(stop_after_attempt=3)

# Fallback to a different model on any error
from langchain_anthropic import ChatAnthropic
backup = prompt | ChatAnthropic(model="claude-sonnet-4-20250514") | parser
resilient = chain.with_fallbacks([backup])
```

Cost Tracking
Log token usage per request to catch cost spikes early. A RAG chain that retrieves too many chunks can cost 10-50x more than an optimized one:
```python
from langchain_core.callbacks import BaseCallbackHandler

class CostTracker(BaseCallbackHandler):
    def on_llm_end(self, response, **kwargs):
        # llm_output can be None for some providers, so guard before .get()
        usage = (response.llm_output or {}).get("token_usage", {})
        print(f"Tokens: {usage.get('total_tokens', 0)}")
```

For prompt engineering best practices that reduce token costs, see the dedicated guide.
10. Summary and Key Takeaways
- LCEL is the current LangChain API — if you see `LLMChain` or `SequentialChain` in a tutorial, it is outdated
- The pipe operator composes Runnables — `prompt | llm | parser` creates a streamable, batchable chain
- RunnablePassthrough forwards input unchanged — essential for RAG chains where the question must reach the prompt
- RunnableParallel runs branches concurrently — total latency equals the slowest branch, not the sum
- Streaming works automatically — call `.stream()` instead of `.invoke()` on any chain
- Use `.with_fallbacks()` for resilience — primary model fails, backup model takes over
- Know when to stop — if your chain exceeds 4 pipes or needs loops, switch to LangGraph
Related
- LangChain Tutorial — Full LangChain guide covering RAG, agents, and tool calling
- LangChain vs LangGraph — When linear chains are not enough
- LangGraph Tutorial — Build stateful agents with cycles and persistence
- LangSmith vs Langfuse — Observability for your LCEL chains
- RAG Architecture — Deep dive into retrieval-augmented generation
- Pydantic AI Tutorial — Type-safe alternative to LangChain
Frequently Asked Questions
What is LCEL in LangChain?
LCEL (LangChain Expression Language) is the declarative composition syntax in LangChain v0.3+. You chain components together using the pipe operator (|), connecting prompts, models, output parsers, and retrievers into a runnable pipeline. LCEL replaced the legacy LLMChain and SequentialChain APIs and handles streaming, batching, and async execution automatically.
How does the pipe operator work in LangChain LCEL?
The pipe operator (|) connects Runnable components in sequence. Each component takes the output of the previous one as input. For example, prompt | llm | parser creates a chain where a prompt template renders variables, the LLM generates a response, and the parser extracts structured output. Under the hood, the | operator calls the __or__ method on the Runnable interface.
What is a Runnable in LangChain?
A Runnable is LangChain's universal interface. Every component — prompts, models, parsers, retrievers, and custom functions — implements the Runnable protocol with three methods: invoke() for single inputs, stream() for token-by-token output, and batch() for processing multiple inputs concurrently. This uniform interface is what makes the pipe operator composition possible.
What is RunnablePassthrough in LangChain?
RunnablePassthrough passes its input through unchanged to the next step in the chain. It is most commonly used in RAG chains to forward the user question alongside retrieved context. For example, you use RunnablePassthrough() as the question key in a dictionary so the original input flows through while the retriever processes the context key separately.
What is RunnableParallel in LangChain?
RunnableParallel runs multiple chains simultaneously and collects their results into a dictionary. You pass a dictionary of named chains, and LangChain executes them all concurrently. This is useful when you need to generate a summary, extract keywords, and analyze sentiment from the same input in parallel rather than running each chain sequentially.
How do I stream responses with LCEL?
Call .stream() instead of .invoke() on any LCEL chain. Every component in the chain supports streaming natively. The prompt renders instantly, the LLM streams tokens as they are generated, and the parser yields chunks as they arrive. Use async streaming with async for chunk in chain.astream(input) in web frameworks like FastAPI.
Should I use LCEL or LangGraph?
Use LCEL for linear pipelines where data flows in one direction — prompt to model to parser. Use LangGraph when you need cycles, conditional routing, state persistence, or human-in-the-loop workflows. Most production agents use LangGraph for orchestration while using LCEL chains inside individual graph nodes for the actual LLM calls.
How does LCEL replace LLMChain?
The legacy LLMChain combined a prompt and model into a single object. LCEL replaces this with prompt | llm | parser, which is more composable and supports streaming natively. LLMChain required special methods for streaming and could not be easily extended with new components. LCEL chains are just Runnables, so any component can be added or swapped without changing the chain structure.
Can I add error handling to LCEL chains?
Yes. LCEL provides .with_fallbacks() to specify backup chains that run when the primary chain fails. You can also use .with_retry() to automatically retry on transient errors like rate limits or timeouts. For example, chain.with_fallbacks([fallback_chain]) tries the main chain first and falls back to the alternative if it raises an exception.
What are the best practices for LCEL in production?
Pin your langchain and langchain-core versions explicitly. Use async methods (ainvoke, astream) in web frameworks to avoid blocking the event loop. Enable LangSmith tracing with LANGCHAIN_TRACING_V2=true for observability. Add .with_fallbacks() for resilience and .with_retry() for transient failures. Keep chains short and composable rather than building deeply nested pipelines.
Last updated: March 2026 | LangChain v0.3+ / Python 3.10+