
LangChain LCEL Tutorial — Build Chains the Modern Way (2026)

LangChain LCEL is the modern way to build chains in LangChain v0.3+. If you have used LangChain before 2024, you probably wrote LLMChain or SequentialChain code that no longer works. LCEL (LangChain Expression Language) replaced all of that with a single concept: pipe Runnables together using the | operator. This tutorial walks you through 6 runnable Python examples — from basic chains to parallel execution and fallbacks.

For the broader LangChain ecosystem (RAG, agents, tool calling), see the full LangChain tutorial. This page focuses specifically on LCEL composition patterns.


1. Why LangChain LCEL Replaced Legacy Chains


LangChain LCEL exists because the old chain API was a dead end. Here is what happened and why it matters.

The Problem with LLMChain and SequentialChain


Before LCEL, LangChain had dedicated chain classes for every pattern:

  • LLMChain — prompt + LLM in one object. Removed because it could not stream and could not easily compose with other chains.
  • SequentialChain — multiple chains in sequence. Removed because it required explicit input_keys / output_keys mapping, which was brittle.
  • TransformChain — custom transformation step. Removed because its API was separate from LLMChain's, so the two were not interchangeable.
  • RouterChain — conditional routing. Removed because it needed a complex setup for a simple if/else.

Every chain type had its own API. Streaming required special methods. Combining chains meant learning each class’s unique interface. The cognitive overhead was enormous.

LCEL unified everything under one interface: Runnable. Every component — prompts, models, parsers, retrievers, custom functions — implements the same three methods: .invoke(), .stream(), .batch(). You compose them with the | operator:

# Old way (deprecated)
chain = LLMChain(llm=llm, prompt=prompt)
# New way (LCEL)
chain = prompt | llm | parser

The | operator calls Python’s __or__ method, creating a RunnableSequence that passes output from one step as input to the next. Streaming works automatically through every step in the chain.


2. When LCEL Matters — Real-World Patterns


LCEL shines when you need to compose multiple steps. Here are the patterns you will actually use in production.

  • Simple generation — prompt | llm | parser. Chatbots, text generation, summarization.
  • RAG retrieval chain — {"context": retriever, "question": passthrough} | prompt | llm | parser. Q&A over documents and knowledge bases.
  • Multi-step reasoning — chain_1 | chain_2 | chain_3. Extract → analyze → summarize pipelines.
  • Parallel tool calls — RunnableParallel(a=chain_a, b=chain_b). Run sentiment, keyword, and summary analysis simultaneously.
  • Streaming responses — chain.stream(input). Real-time UX in web applications.
  • Fallback chains — chain.with_fallbacks([backup]). GPT-4o primary, Claude fallback on failure.

If your use case is a single LLM API call with no composition, skip LCEL and use the raw SDK directly. LCEL adds value when you chain 2+ components together.


3. How the LCEL Pipe Operator Works

The pipe operator is the core of LCEL. Understanding how data flows through a chain unlocks every pattern in this tutorial.

Every LCEL component implements the Runnable interface:

class Runnable:
    def invoke(self, input): ...    # Single input → single output
    def stream(self, input): ...    # Single input → streamed output chunks
    def batch(self, inputs): ...    # Multiple inputs → multiple outputs (concurrent)
    async def ainvoke(self, input): ...  # Async version of invoke
    async def astream(self, input): ...  # Async version of stream

When you write prompt | llm | parser, Python creates a RunnableSequence. Calling .invoke() on the sequence passes data through each component in order. Calling .stream() streams tokens through the entire pipeline.

Each Runnable declares its input and output schemas. The pipe operator connects them:

  1. PromptTemplate — Input: dict with template variables. Output: ChatPromptValue (formatted messages).
  2. ChatModel — Input: messages. Output: AIMessage with content and metadata.
  3. OutputParser — Input: AIMessage. Output: str, dict, or Pydantic object.

If the output of step N does not match what step N+1 expects, the chain raises a clear error naming the mismatched step as soon as it runs, rather than failing silently deep inside your application.

[Diagram: LCEL pipe operator data flow. An input dict such as {topic: 'RAG'} is passed to the PromptTemplate, which renders messages; the ChatModel generates a response; the OutputParser extracts the result as a string, dict, or Pydantic object ready for your application.]

4. Getting Started with LCEL in 6 Steps

Follow these 6 steps to go from zero to a working LCEL chain with streaming.

pip install langchain langchain-openai langchain-core

Pin versions in production: langchain==0.3.x and langchain-core==0.3.x. LangChain releases weekly and occasionally introduces breaking changes.

from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a senior engineer who explains concepts clearly."),
    ("user", "Explain {topic} in 3 bullet points. Be specific.")
])

The template declares {topic} as an input variable. When invoked, it produces formatted chat messages.

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o", temperature=0)
chain = prompt | llm

This two-step chain renders the prompt, then sends it to GPT-4o. The output is an AIMessage object.

from langchain_core.output_parsers import StrOutputParser
parser = StrOutputParser()
chain = prompt | llm | parser

Now the chain returns a plain string instead of an AIMessage. For structured output, use JsonOutputParser or PydanticOutputParser.

from langchain_core.runnables import RunnablePassthrough
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | parser
)

RunnablePassthrough() forwards the user question unchanged. The retriever fetches relevant documents in parallel. Both feed into the prompt template. See the RAG architecture guide for the full pattern.

for chunk in chain.stream({"topic": "LCEL in LangChain"}):
    print(chunk, end="", flush=True)

Every component in the chain supports streaming. The prompt renders instantly, the LLM streams tokens, and the parser yields chunks as they arrive. No extra configuration needed.


5. LCEL Architecture — Runnable Composition

Section titled “5. LCEL Architecture — Runnable Composition”

LCEL’s power comes from composing Runnables in two patterns: sequential chains and parallel branches.

[Diagram: LCEL composition patterns. A sequential chain (prompt | llm | parser) flows PromptTemplate → ChatModel → OutputParser and produces a single output. A parallel chain (RunnableParallel(a=..., b=...)) splits the input into branches, such as a summary chain and a keywords chain, and produces {summary: ..., keywords: ...}.]

RunnableParallel takes a dictionary of named chains and runs them all concurrently:

from langchain_core.runnables import RunnableParallel
parallel = RunnableParallel(
    summary=prompt_summary | llm | parser,
    keywords=prompt_keywords | llm | parser,
)
# Both chains run at the same time
results = parallel.invoke({"text": "Your input here..."})
# results = {"summary": "...", "keywords": "..."}

The total latency equals the slowest branch, not the sum of all branches. For 3 chains that each take 2 seconds, sequential execution takes 6 seconds while parallel execution takes 2 seconds.

You do not always need to import RunnableParallel explicitly. A plain dictionary in an LCEL chain creates one automatically:

# This is equivalent to RunnableParallel
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | parser
)

The dictionary {"context": ..., "question": ...} runs both branches concurrently and passes the combined result to the prompt.


6. LCEL Code Examples

Six runnable examples covering the patterns you will use most. Each example is self-contained.

The simplest possible chain — prompt, model, parser:

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful coding assistant."),
    ("user", "Write a Python function that {task}")
])
chain = prompt | ChatOpenAI(model="gpt-4o") | StrOutputParser()
result = chain.invoke({"task": "reverses a linked list"})
print(result)

A retrieval-augmented generation chain that answers questions from your documents:

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(embedding_function=embeddings)
vectorstore.add_texts([
    "LCEL uses the pipe operator for composition.",
    "RunnableParallel runs chains concurrently.",
    "Every LCEL component implements invoke, stream, and batch.",
])
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

def format_docs(docs):
    return "\n".join(doc.page_content for doc in docs)

prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer based only on this context:\n{context}"),
    ("user", "{question}")
])
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4o")
    | StrOutputParser()
)
answer = rag_chain.invoke("How does LCEL composition work?")
print(answer)

Example 3: RunnablePassthrough for Data Forwarding


RunnablePassthrough forwards input unchanged. Use it when one branch processes data while another passes it through:

from langchain_core.runnables import RunnablePassthrough
# RunnablePassthrough.assign() adds new keys alongside existing ones
chain = RunnablePassthrough.assign(
    word_count=lambda x: len(x["text"].split())
)
result = chain.invoke({"text": "LCEL makes LangChain chains composable"})
# result = {"text": "LCEL makes LangChain chains composable", "word_count": 6}

Example 4: RunnableParallel for Concurrent Execution


Run 3 analysis chains simultaneously on the same input:

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel
llm = ChatOpenAI(model="gpt-4o")
parser = StrOutputParser()
summary_prompt = ChatPromptTemplate.from_template("Summarize in 1 sentence: {text}")
keywords_prompt = ChatPromptTemplate.from_template("Extract 5 keywords from: {text}")
sentiment_prompt = ChatPromptTemplate.from_template("Is this positive, negative, or neutral? {text}")
analysis = RunnableParallel(
    summary=summary_prompt | llm | parser,
    keywords=keywords_prompt | llm | parser,
    sentiment=sentiment_prompt | llm | parser,
)
results = analysis.invoke({"text": "LCEL is a game-changer for LangChain development."})
print(results["summary"])
print(results["keywords"])
print(results["sentiment"])

Stream tokens to the user in real time:

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
prompt = ChatPromptTemplate.from_messages([
    ("system", "You explain technical concepts for senior engineers."),
    ("user", "Explain {topic} with a concrete code example.")
])
chain = prompt | ChatOpenAI(model="gpt-4o") | StrOutputParser()
# Synchronous streaming
for chunk in chain.stream({"topic": "LCEL Runnables"}):
    print(chunk, end="", flush=True)

# Async streaming (for FastAPI, Django, etc.)
# async for chunk in chain.astream({"topic": "LCEL Runnables"}):
#     yield chunk

Use .with_fallbacks() to add resilience when the primary model fails:

from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
prompt = ChatPromptTemplate.from_template("Explain {topic} in 3 sentences.")
parser = StrOutputParser()
primary = prompt | ChatOpenAI(model="gpt-4o") | parser
fallback = prompt | ChatAnthropic(model="claude-sonnet-4-20250514") | parser
# If GPT-4o fails (rate limit, outage), automatically try Claude
resilient_chain = primary.with_fallbacks([fallback])
result = resilient_chain.invoke({"topic": "LangChain LCEL"})
print(result)

7. When Not to Use LCEL

LCEL is not always the right tool. Here are the cases where you should reach for something else.

  • Single API call — If you just need openai.chat.completions.create(), the raw SDK is simpler. LCEL adds value only when you compose 2+ steps.
  • Simple function calls — A Python function that calls an API and returns results does not need to be a Runnable. Regular Python is fine.
  • Prototyping — When you are exploring an idea, write plain Python first. Convert to LCEL only when you need streaming, batching, or composition.

LCEL handles linear pipelines where data flows in one direction. LangGraph handles everything else:

  • Linear chain (A → B → C) — LCEL: yes. LangGraph: overkill.
  • Conditional routing (if X, do A; else do B) — LCEL: possible but awkward. LangGraph: built-in.
  • Loops (retry until success) — LCEL: no. LangGraph: yes.
  • State persistence across requests — LCEL: no. LangGraph: yes.
  • Human-in-the-loop approval — LCEL: no. LangGraph: yes.

Most production agent systems use LangGraph for orchestration with LCEL chains inside individual graph nodes. See the LangChain vs LangGraph comparison for a detailed breakdown.

The pipe operator syntax is elegant for short chains but degrades quickly:

# Readable (3 pipes)
chain = prompt | llm | parser
# Still OK (with named dictionary)
chain = {"context": retriever | format_docs, "question": RunnablePassthrough()} | prompt | llm | parser
# Getting hard to follow (nested parallels)
chain = RunnableParallel(a=p1 | llm | pa, b=p2 | llm | pb) | merge_prompt | llm | parser

If you catch yourself writing deeply nested LCEL, refactor into named sub-chains:

extract_chain = prompt_extract | llm | parser_extract
analyze_chain = prompt_analyze | llm | parser_analyze
final_chain = extract_chain | analyze_chain # Much clearer

8. LCEL Interview Questions

These questions test whether you understand LCEL’s mechanics or just memorized the syntax.

Q1: “What is LCEL and why did LangChain adopt it?”


What they test: Do you understand the architectural motivation, not just the syntax?

Strong answer: “LCEL is LangChain’s declarative composition syntax built on the Runnable interface. Every component implements invoke, stream, and batch. The pipe operator connects them: prompt | llm | parser. LangChain adopted it because the legacy chain classes (LLMChain, SequentialChain) each had separate APIs, did not support streaming natively, and could not be easily composed. LCEL unified everything under one interface.”

Q2: “When would you use RunnableParallel vs a sequential chain?”


Strong answer: “Sequential for dependent steps where step N needs step N-1’s output. Parallel for independent steps that can run simultaneously — like extracting keywords, generating a summary, and scoring sentiment from the same input. Parallel reduces total latency from the sum of all steps to the duration of the slowest step.”

Q3: “Your LCEL chain is 8 pipes deep. How do you refactor?”


Strong answer: “Break it into named sub-chains of 2-3 pipes each, then compose those. If the chain has conditional logic or loops, migrate to LangGraph where each node is a short LCEL chain. Name each sub-chain descriptively so the pipeline reads like a recipe.”

Q4: “How does LCEL handle errors in a chain?”


Strong answer: “By default, an exception in any step propagates and stops the chain. You can add .with_fallbacks([backup_chain]) to try alternatives on failure, and .with_retry(stop_after_attempt=3) for transient errors like rate limits. For fine-grained error handling, wrap individual steps in try/except inside a RunnableLambda.”


9. LCEL in Production — Streaming and Tracing


Running LCEL chains in production requires observability, async execution, and cost awareness.

Enable LangSmith to trace every chain execution:

export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=your-key
export LANGCHAIN_PROJECT=my-project

With tracing enabled, every .invoke() and .stream() call creates a trace span. You see the exact input and output of each step, latency per component, and token usage. Without this, debugging a 4-step chain that returns wrong answers is guesswork.

In FastAPI or Django, always use async methods to avoid blocking the event loop:

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
app = FastAPI()
@app.post("/chat")
async def chat(question: str):
    async def generate():
        async for chunk in chain.astream({"question": question}):
            yield chunk
    return StreamingResponse(generate(), media_type="text/plain")

Using chain.invoke() in an async web server blocks the entire event loop for the duration of the LLM call. A single slow request blocks all other requests. Always use chain.ainvoke() or chain.astream().

Production chains need resilience against transient failures:

# Retry on rate limits (up to 3 attempts with exponential backoff)
resilient = chain.with_retry(stop_after_attempt=3)
# Fallback to a different model on any error
from langchain_anthropic import ChatAnthropic
backup = prompt | ChatAnthropic(model="claude-sonnet-4-20250514") | parser
resilient = chain.with_fallbacks([backup])

Log token usage per request to catch cost spikes early. A RAG chain that retrieves too many chunks can cost 10-50x more than an optimized one:

from langchain_core.callbacks import BaseCallbackHandler
class CostTracker(BaseCallbackHandler):
    def on_llm_end(self, response, **kwargs):
        # llm_output can be None for some providers, so guard before .get()
        usage = (response.llm_output or {}).get("token_usage", {})
        print(f"Tokens: {usage.get('total_tokens', 0)}")

For prompt engineering best practices that reduce token costs, see the dedicated guide.


  • LCEL is the current LangChain API — if you see LLMChain or SequentialChain in a tutorial, it is outdated
  • The pipe operator composes Runnables — prompt | llm | parser creates a streamable, batchable chain
  • RunnablePassthrough forwards input unchanged — essential for RAG chains where the question must reach the prompt
  • RunnableParallel runs branches concurrently — total latency equals the slowest branch, not the sum
  • Streaming works automatically — call .stream() instead of .invoke() on any chain
  • Use .with_fallbacks() for resilience — primary model fails, backup model takes over
  • Know when to stop — if your chain exceeds 4 pipes or needs loops, switch to LangGraph

Frequently Asked Questions

What is LCEL in LangChain?

LCEL (LangChain Expression Language) is the declarative composition syntax in LangChain v0.3+. You chain components together using the pipe operator (|), connecting prompts, models, output parsers, and retrievers into a runnable pipeline. LCEL replaced the legacy LLMChain and SequentialChain APIs and handles streaming, batching, and async execution automatically.

How does the pipe operator work in LangChain LCEL?

The pipe operator (|) connects Runnable components in sequence. Each component takes the output of the previous one as input. For example, prompt | llm | parser creates a chain where a prompt template renders variables, the LLM generates a response, and the parser extracts structured output. Under the hood, the | operator calls the __or__ method on the Runnable interface.

What is a Runnable in LangChain?

A Runnable is LangChain's universal interface. Every component — prompts, models, parsers, retrievers, and custom functions — implements the Runnable protocol with three methods: invoke() for single inputs, stream() for token-by-token output, and batch() for processing multiple inputs concurrently. This uniform interface is what makes the pipe operator composition possible.

What is RunnablePassthrough in LangChain?

RunnablePassthrough passes its input through unchanged to the next step in the chain. It is most commonly used in RAG chains to forward the user question alongside retrieved context. For example, you use RunnablePassthrough() as the question key in a dictionary so the original input flows through while the retriever processes the context key separately.

What is RunnableParallel in LangChain?

RunnableParallel runs multiple chains simultaneously and collects their results into a dictionary. You pass a dictionary of named chains, and LangChain executes them all concurrently. This is useful when you need to generate a summary, extract keywords, and analyze sentiment from the same input in parallel rather than running each chain sequentially.

How do I stream responses with LCEL?

Call .stream() instead of .invoke() on any LCEL chain. Every component in the chain supports streaming natively. The prompt renders instantly, the LLM streams tokens as they are generated, and the parser yields chunks as they arrive. Use async streaming with async for chunk in chain.astream(input) in web frameworks like FastAPI.

Should I use LCEL or LangGraph?

Use LCEL for linear pipelines where data flows in one direction — prompt to model to parser. Use LangGraph when you need cycles, conditional routing, state persistence, or human-in-the-loop workflows. Most production agents use LangGraph for orchestration while using LCEL chains inside individual graph nodes for the actual LLM calls.

How does LCEL replace LLMChain?

The legacy LLMChain combined a prompt and model into a single object. LCEL replaces this with prompt | llm | parser, which is more composable and supports streaming natively. LLMChain required special methods for streaming and could not be easily extended with new components. LCEL chains are just Runnables, so any component can be added or swapped without changing the chain structure.

Can I add error handling to LCEL chains?

Yes. LCEL provides .with_fallbacks() to specify backup chains that run when the primary chain fails. You can also use .with_retry() to automatically retry on transient errors like rate limits or timeouts. For example, chain.with_fallbacks([fallback_chain]) tries the main chain first and falls back to the alternative if it raises an exception.

What are the best practices for LCEL in production?

Pin your langchain and langchain-core versions explicitly. Use async methods (ainvoke, astream) in web frameworks to avoid blocking the event loop. Enable LangSmith tracing with LANGCHAIN_TRACING_V2=true for observability. Add .with_fallbacks() for resilience and .with_retry() for transient failures. Keep chains short and composable rather than building deeply nested pipelines.

Last updated: March 2026 | LangChain v0.3+ / Python 3.10+