Structured output lets agents return data in a specific, predictable format (see docs.langchain.com). Instead of parsing natural-language responses, you get structured data that your application can use directly. This guide walks through four approaches to implementing structured output in LangChain:
TypedDict,
Annotated TypedDict,
Pydantic, and
JsonSchema
Real-World Use Case: Customer Support Ticket Analyzer
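Let's build a practical application that analyzes customer support tickets and extracts structured information, including each ticket's category, priority, sentiment, customer name, a short summary, and suggested actions.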
Prerequisites
pip install langchain langchain-openai pydantic typing-extensions
Approach 1: TypedDict
TypedDict is a lightweight way to define structured output using standard Python typing. It's ideal when you want just enough structure without heavy validation.
from typing import Literal, List
from typing_extensions import TypedDict
from langchain_openai import ChatOpenAI
import os
# Set your API key
os.environ["OPENAI_API_KEY"] = "your-api-key-here"
# Define the schema using TypedDict
class SupportTicket(TypedDict):
    """Extracted information from customer support ticket"""
    category: Literal["billing", "technical", "feature_request", "bug_report"]
    priority: Literal["low", "medium", "high", "critical"]
    sentiment: Literal["positive", "neutral", "negative", "angry"]
    customer_name: str
    product_mentioned: List[str]
    urgency_indicators: List[str]
    summary: str
    suggested_actions: List[str]
# Initialize the LLM
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
# Wrap with structured output
structured_llm = llm.with_structured_output(SupportTicket)
# Test with a real ticket
ticket_text = """
Subject: URGENT - Payment Processing Failed!!!
Hi, my name is Sarah Johnson and I'm extremely frustrated.
I've been trying to process a payment for our Enterprise plan
for the last 3 hours and keep getting error code 502.
This is blocking our entire team from accessing the analytics
dashboard. We have a board meeting tomorrow and need this
fixed IMMEDIATELY! The API endpoint /v2/payments keeps timing out.
"""
# Extract structured information
result = structured_llm.invoke(ticket_text)
print("=== TypedDict Result ===")
print(f"Category: {result['category']}")
print(f"Priority: {result['priority']}")
print(f"Sentiment: {result['sentiment']}")
print(f"Customer: {result['customer_name']}")
print(f"Products: {result['product_mentioned']}")
print(f"Summary: {result['summary']}")
print(f"Actions: {result['suggested_actions']}")
Output:
=== TypedDict Result ===
Category: technical
Priority: critical
Sentiment: angry
Customer: Sarah Johnson
Products: ['Enterprise plan', 'analytics dashboard', 'API']
Summary: Customer experiencing payment processing failures with error 502, blocking team access
Actions: ['Investigate /v2/payments endpoint', 'Check server status', 'Escalate to engineering', 'Contact customer within 1 hour']
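The "no heavy validation" trade-off is easy to see in isolation: a TypedDict is an ordinary dict at runtime, so an out-of-range value is accepted silently. The sketch below uses a trimmed-down, hypothetical MiniTicket schema and a hand-written helper, find_literal_violations (neither is part of LangChain), to check Literal fields manually. It is only an illustration of the idea, not a full validator.

```python
from typing import Literal, TypedDict, get_args, get_origin, get_type_hints


class MiniTicket(TypedDict):
    """Hypothetical, trimmed-down version of the SupportTicket schema."""
    category: Literal["billing", "technical", "feature_request", "bug_report"]
    priority: Literal["low", "medium", "high", "critical"]
    summary: str


def find_literal_violations(data: dict, schema: type) -> list:
    """Return the names of Literal fields whose value is missing or not allowed."""
    violations = []
    for name, hint in get_type_hints(schema).items():
        # Only Literal[...] hints carry a fixed set of allowed values.
        if get_origin(hint) is Literal and data.get(name) not in get_args(hint):
            violations.append(name)
    return violations


# "urgent" is not an allowed priority, yet Python accepts the dict without complaint.
bad: MiniTicket = {"category": "technical", "priority": "urgent", "summary": "Payment fails"}

print(type(bad) is dict)                         # True: no special runtime type
print(find_literal_violations(bad, MiniTicket))  # ['priority']
```

A static type checker would flag the bad literal, but only for values written in your own code; values produced by a model at runtime need an explicit check like this one, or one of the stricter approaches below.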
Approach 2: Annotated TypedDict
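Annotated TypedDict attaches descriptions and other metadata to each field using Python's Annotated type, which gives the model more guidance about what every field should contain. Keep in mind that a TypedDict is still a plain dict at runtime, so constraints such as ge, le, or max_length are not enforced by Python itself. Also note that LangChain's documented form for annotated fields is Annotated[type, default, "description"]; how pydantic Field metadata is interpreted may depend on your LangChain version, so check the generated schema before relying on it.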
from typing import Literal, List
from typing_extensions import Annotated, TypedDict
from langchain_openai import ChatOpenAI
from pydantic import Field
# Define schema with annotations for better control
class SupportTicketAnnotated(TypedDict):
    """Support ticket with annotated fields for better control"""
    category: Annotated[
        Literal["billing", "technical", "feature_request", "bug_report"],
        Field(description="Primary category of the support ticket")
    ]
    priority: Annotated[
        Literal["low", "medium", "high", "critical"],
        Field(description="Priority level based on urgency and impact")
    ]
    sentiment: Annotated[
        Literal["positive", "neutral", "negative", "angry"],
        Field(description="Customer's emotional tone")
    ]
    confidence_score: Annotated[
        float,
        Field(description="Confidence in extraction (0.0 to 1.0)", ge=0.0, le=1.0)
    ]
    customer_name: Annotated[
        str,
        Field(description="Name of the customer", min_length=1)
    ]
    product_mentioned: List[Annotated[str, Field(description="Product or feature name")]]
    urgency_indicators: List[Annotated[str, Field(description="Words/phrases indicating urgency")]]
    summary: Annotated[
        str,
        Field(description="Brief summary in 2-3 sentences", max_length=200)
    ]
    suggested_actions: List[Annotated[str, Field(description="Recommended action items")]]
    estimated_resolution_time: Annotated[
        str,
        Field(description="Estimated time to resolve (e.g., '2 hours', '1 day')")
    ]
# Initialize LLM
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
structured_llm_annotated = llm.with_structured_output(SupportTicketAnnotated)
# Test the annotated version
result_annotated = structured_llm_annotated.invoke(ticket_text)
print("\n=== Annotated TypedDict Result ===")
print(f"Category: {result_annotated['category']}")
print(f"Priority: {result_annotated['priority']}")
print(f"Confidence: {result_annotated['confidence_score']}")
print(f"Estimated Resolution: {result_annotated['estimated_resolution_time']}")
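Because a TypedDict result is just a dict, the constraints in the annotations are not checked for you. One way to enforce them after the fact, sketched here under the assumption that pydantic v2 is installed, is pydantic's TypeAdapter, which can build a validator for a TypedDict. MiniAnnotatedTicket and validate_ticket are hypothetical names used only for this illustration.

```python
from typing import Literal

from pydantic import Field, TypeAdapter, ValidationError
from typing_extensions import Annotated, TypedDict


class MiniAnnotatedTicket(TypedDict):
    """Hypothetical, trimmed-down version of SupportTicketAnnotated."""
    priority: Annotated[
        Literal["low", "medium", "high", "critical"],
        Field(description="Priority level based on urgency and impact"),
    ]
    confidence_score: Annotated[
        float,
        Field(description="Confidence in extraction (0.0 to 1.0)", ge=0.0, le=1.0),
    ]


# TypeAdapter builds a validator for types that are not BaseModel subclasses.
adapter = TypeAdapter(MiniAnnotatedTicket)


def validate_ticket(data: dict):
    """Return (validated_dict, None) on success or (None, error_messages) on failure."""
    try:
        return adapter.validate_python(data), None
    except ValidationError as exc:
        messages = [f"{'.'.join(map(str, e['loc']))}: {e['msg']}" for e in exc.errors()]
        return None, messages


# A well-formed result passes through unchanged.
print(validate_ticket({"priority": "high", "confidence_score": 0.9}))

# A confidence score above 1.0 violates the `le=1.0` constraint and is reported.
print(validate_ticket({"priority": "high", "confidence_score": 1.7}))
```

In a pipeline you could pass result_annotated to such a helper right after invoke and decide whether to retry or fall back when errors come back.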
Approach 3: Pydantic (Recommended)
Pydantic provides the most robust solution with built-in validation, type checking, and data serialization. This is the recommended approach for production applications.
from pydantic import BaseModel, Field, field_validator
from typing import Literal, List
from datetime import datetime
import re
class SupportTicketPydantic(BaseModel):
    """Pydantic model for support ticket analysis with validation"""
    category: Literal["billing", "technical", "feature_request", "bug_report"] = Field(
        ...,
        description="Primary category of the support ticket"
    )
    priority: Literal["low", "medium", "high", "critical"] = Field(
        ...,
        description="Priority level based on urgency and impact"
    )
    sentiment: Literal["positive", "neutral", "negative", "angry"] = Field(
        ...,
        description="Customer's emotional tone"
    )
    confidence_score: float = Field(
        ...,
        description="Confidence in extraction (0.0 to 1.0)",
        ge=0.0,
        le=1.0
    )
    customer_name: str = Field(
        ...,
        description="Name of the customer",
        min_length=1
    )
    customer_email: str | None = Field(
        None,
        description="Customer email if mentioned"
    )
    product_mentioned: List[str] = Field(
        default_factory=list,
        description="Products or features mentioned"
    )
    urgency_indicators: List[str] = Field(
        default_factory=list,
        description="Words or phrases indicating urgency"
    )
    error_codes: List[str] = Field(
        default_factory=list,
        description="Any error codes mentioned"
    )
    summary: str = Field(
        ...,
        description="Brief summary in 2-3 sentences",
        max_length=200
    )
    suggested_actions: List[str] = Field(
        ...,
        description="Recommended action items"
    )
    estimated_resolution_time: str = Field(
        ...,
        description="Estimated time to resolve"
    )
    requires_escalation: bool = Field(
        ...,
        description="Whether this ticket requires escalation"
    )
    ticket_metadata: dict = Field(
        default_factory=dict,
        description="Additional metadata extracted from the ticket"
    )

    @field_validator('customer_email')
    @classmethod
    def validate_email(cls, v):
        # Allow a missing email, but reject strings that do not look like one
        if v and not re.match(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$', v):
            raise ValueError('Invalid email format')
        return v

    @field_validator('suggested_actions')
    @classmethod
    def validate_actions(cls, v):
        # Every ticket must come with at least one action item
        if len(v) == 0:
            raise ValueError('At least one suggested action is required')
        return v
# Initialize LLM with Pydantic schema
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
structured_llm_pydantic = llm.with_structured_output(SupportTicketPydantic)
# Test with multiple tickets
tickets = [
"""
Subject: URGENT - Payment Processing Failed!!!
Hi, my name is Sarah Johnson and I'm extremely frustrated.
I've been trying to process a payment for our Enterprise plan
for the last 3 hours and keep getting error code 502.
This is blocking our entire team from accessing the analytics
dashboard. We have a board meeting tomorrow and need this
fixed IMMEDIATELY! The API endpoint /v2/payments keeps timing out.
""",
"""
Subject: Feature Request - Dark Mode
Hello! I love your product. My name is Mike Chen ([email protected]).
Would it be possible to add a dark mode to the dashboard?
Many of our team members work late hours and would appreciate
this feature. No rush, just a suggestion for future updates.
""",
"""
Subject: Billing Question
Hi, this is Lisa Park. I noticed I was charged twice for my
subscription this month. Can you help me understand why?
My account number is ACC-12345. Thanks!
"""
]
print("\n=== Pydantic Results ===")
for i, ticket in enumerate(tickets, 1):
    print(f"\n--- Ticket {i} ---")
    result = structured_llm_pydantic.invoke(ticket)
    # Access as Pydantic model
    print(f"Category: {result.category}")
    print(f"Priority: {result.priority}")
    print(f"Sentiment: {result.sentiment}")
    print(f"Customer: {result.customer_name}")
    print(f"Email: {result.customer_email}")
    print(f"Confidence: {result.confidence_score}")
    print(f"Error Codes: {result.error_codes}")
    print(f"Summary: {result.summary}")
    print(f"Actions: {', '.join(result.suggested_actions)}")
    print(f"Escalation Needed: {result.requires_escalation}")
    print(f"Resolution Time: {result.estimated_resolution_time}")
    # Convert to dict if needed
    # ticket_dict = result.model_dump()
    # Validate and access fields with dot notation
    assert result.confidence_score >= 0.0
    assert len(result.suggested_actions) > 0
Output:
=== Pydantic Results ===
--- Ticket 1 ---
Category: technical
Priority: critical
Sentiment: angry
Customer: Sarah Johnson
Email: None
Confidence: 0.95
Error Codes: ['502']
Summary: Customer experiencing critical payment processing failures blocking team access
Actions: Investigate /v2/payments endpoint, Check server status, Escalate to engineering
Escalation Needed: True
Resolution Time: 2 hours
--- Ticket 2 ---
Category: feature_request
Priority: low
Sentiment: positive
Customer: Mike Chen
Email: [email protected]
Confidence: 0.92
Summary: Customer requesting dark mode feature for dashboard
Actions: Log feature request, Add to product backlog, Notify customer when implemented
Escalation Needed: False
Resolution Time: 2-3 sprints
--- Ticket 3 ---
Category: billing
Priority: medium
Sentiment: neutral
Customer: Lisa Park
Confidence: 0.88
Summary: Customer reporting duplicate subscription charge
Actions: Investigate billing records, Process refund if confirmed, Update billing system
Escalation Needed: False
Resolution Time: 24 hours
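The examples above only show the happy path, so it may help to see what the validators do when data is wrong. The following sketch, assuming pydantic v2, builds a hypothetical MiniTicket with the same two validators as SupportTicketPydantic and a small helper, failed_fields (also hypothetical), that reports which fields were rejected. It validates hand-written dicts rather than model output, so it runs without an API key.

```python
import re
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError, field_validator

EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


class MiniTicket(BaseModel):
    """Hypothetical, trimmed-down version of SupportTicketPydantic."""
    customer_email: Optional[str] = None
    confidence_score: float = Field(ge=0.0, le=1.0)
    suggested_actions: List[str]

    @field_validator("customer_email")
    @classmethod
    def validate_email(cls, v):
        if v and not EMAIL_RE.match(v):
            raise ValueError("Invalid email format")
        return v

    @field_validator("suggested_actions")
    @classmethod
    def validate_actions(cls, v):
        if len(v) == 0:
            raise ValueError("At least one suggested action is required")
        return v


def failed_fields(payload: dict) -> list:
    """Validate a payload and return the sorted names of the fields that failed."""
    try:
        MiniTicket.model_validate(payload)
        return []
    except ValidationError as exc:
        # Each error carries a `loc` tuple whose first element is the field name.
        return sorted({str(err["loc"][0]) for err in exc.errors()})


valid = {
    "customer_email": "mike@example.com",  # placeholder address for illustration
    "confidence_score": 0.92,
    "suggested_actions": ["Log feature request"],
}
invalid = {"customer_email": "not-an-email", "confidence_score": 1.4, "suggested_actions": []}

print(failed_fields(valid))    # []
print(failed_fields(invalid))  # ['confidence_score', 'customer_email', 'suggested_actions']
```

Pydantic collects all failures in a single ValidationError, which makes it practical to log every problem at once instead of fixing them one by one.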
Approach 4: JsonSchema
JsonSchema provides maximum flexibility by defining the schema directly as JSON Schema, which is particularly useful when working with multiple LLM providers or when you need fine-grained control.
from langchain_openai import ChatOpenAI
import json
# Define JSON Schema directly
support_ticket_schema = {
    "title": "SupportTicket",
    "description": "Extracted information from customer support ticket",
    "type": "object",
    "properties": {
        "category": {
            "type": "string",
            "enum": ["billing", "technical", "feature_request", "bug_report"],
            "description": "Primary category of the support ticket"
        },
        "priority": {
            "type": "string",
            "enum": ["low", "medium", "high", "critical"],
            "description": "Priority level based on urgency and impact"
        },
        "sentiment": {
            "type": "string",
            "enum": ["positive", "neutral", "negative", "angry"],
            "description": "Customer's emotional tone"
        },
        "confidence_score": {
            "type": "number",
            "minimum": 0.0,
            "maximum": 1.0,
            "description": "Confidence in extraction (0.0 to 1.0)"
        },
        "customer_name": {
            "type": "string",
            "minLength": 1,
            "description": "Name of the customer"
        },
        "customer_email": {
            "type": "string",
            "format": "email",
            "description": "Customer email if mentioned"
        },
        "product_mentioned": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Products or features mentioned"
        },
        "urgency_indicators": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Words or phrases indicating urgency"
        },
        "error_codes": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Any error codes mentioned"
        },
        "summary": {
            "type": "string",
            "maxLength": 200,
            "description": "Brief summary in 2-3 sentences"
        },
        "suggested_actions": {
            "type": "array",
            "items": {"type": "string"},
            "minItems": 1,
            "description": "Recommended action items"
        },
        "estimated_resolution_time": {
            "type": "string",
            "description": "Estimated time to resolve"
        },
        "requires_escalation": {
            "type": "boolean",
            "description": "Whether this ticket requires escalation"
        },
        "sla_deadline": {
            "type": "string",
            "format": "date-time",
            "description": "SLA deadline for resolution"
        }
    },
    "required": [
        "category",
        "priority",
        "sentiment",
        "confidence_score",
        "customer_name",
        "summary",
        "suggested_actions",
        "estimated_resolution_time",
        "requires_escalation"
    ],
    "additionalProperties": False
}
# Initialize LLM with JSON Schema
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
structured_llm_json = llm.with_structured_output(support_ticket_schema)
# Test the JSON Schema approach
result_json = structured_llm_json.invoke(ticket_text)
print("\n=== JsonSchema Result ===")
print(json.dumps(result_json, indent=2))
# The result is already a dict, ready for JSON serialization
print(f"\nCategory: {result_json['category']}")
print(f"Priority: {result_json['priority']}")
print(f"Requires Escalation: {result_json['requires_escalation']}")
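Since the JSON Schema approach hands back a plain dict, it can be worth checking the result defensively before passing it downstream. The sketch below is a deliberately small, standard-library-only check that covers just the required and enum keywords. check_against_schema and mini_schema are hypothetical names for this illustration; for full JSON Schema validation a dedicated validator such as the jsonschema package would be a better fit.

```python
def check_against_schema(data: dict, schema: dict) -> list:
    """Return human-readable problems; only `required` and `enum` are checked."""
    problems = []
    # Every key listed under "required" must be present in the data.
    for key in schema.get("required", []):
        if key not in data:
            problems.append(f"missing required field: {key}")
    # Any present field with an "enum" must hold one of the allowed values.
    for key, spec in schema.get("properties", {}).items():
        if key in data and "enum" in spec and data[key] not in spec["enum"]:
            problems.append(f"{key}: {data[key]!r} not in {spec['enum']}")
    return problems


# Trimmed-down, hypothetical schema in the same shape as support_ticket_schema.
mini_schema = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["billing", "technical"]},
        "requires_escalation": {"type": "boolean"},
    },
    "required": ["category", "requires_escalation"],
}

print(check_against_schema({"category": "technical", "requires_escalation": True}, mini_schema))
print(check_against_schema({"category": "sales"}, mini_schema))
```

The first call prints an empty list; the second reports both the missing requires_escalation field and the unexpected category value.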
Complete Production-Ready Application
Now let's build a complete end-to-end application that processes a support ticket, calculates its SLA deadline, routes it to a department, and drafts a reply to the customer:
from pydantic import BaseModel, Field
from typing import Literal, List
from langchain_openai import ChatOpenAI
from datetime import datetime, timedelta
import json
class SupportTicketPydantic(BaseModel):
    """Production-ready support ticket model"""
    category: Literal["billing", "technical", "feature_request", "bug_report"]
    priority: Literal["low", "medium", "high", "critical"]
    sentiment: Literal["positive", "neutral", "negative", "angry"]
    confidence_score: float = Field(ge=0.0, le=1.0)
    customer_name: str
    customer_email: str | None = None
    product_mentioned: List[str] = []
    urgency_indicators: List[str] = []
    error_codes: List[str] = []
    summary: str = Field(max_length=200)
    suggested_actions: List[str]
    estimated_resolution_time: str
    requires_escalation: bool
    assigned_department: Literal["engineering", "billing", "product", "support"]
class TicketProcessor:
    """Process and route support tickets"""

    def __init__(self):
        self.llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
        self.structured_llm = self.llm.with_structured_output(SupportTicketPydantic)

    def calculate_sla(self, priority: str) -> datetime:
        """Calculate SLA deadline based on priority"""
        sla_hours = {
            "critical": 2,
            "high": 8,
            "medium": 24,
            "low": 72
        }
        return datetime.now() + timedelta(hours=sla_hours.get(priority, 24))

    def assign_department(self, category: str) -> str:
        """Assign department based on category"""
        mapping = {
            "technical": "engineering",
            "bug_report": "engineering",
            "billing": "billing",
            "feature_request": "product"
        }
        return mapping.get(category, "support")

    def process_ticket(self, ticket_text: str) -> dict:
        """Process a support ticket end-to-end"""
        # Extract structured data
        ticket = self.structured_llm.invoke(ticket_text)
        # Add metadata. The computed keys come after the model dump so that the
        # rule-based department is not overwritten by the model's own value.
        processed_ticket = {
            **ticket.model_dump(),
            "ticket_id": f"TKT-{datetime.now().strftime('%Y%m%d%H%M%S')}",
            "created_at": datetime.now().isoformat(),
            "sla_deadline": self.calculate_sla(ticket.priority).isoformat(),
            "assigned_department": self.assign_department(ticket.category),
            "status": "open",
        }
        return processed_ticket

    def generate_response(self, ticket: dict) -> str:
        """Generate automated response to customer"""
        response_template = f"""
Dear {ticket['customer_name']},
Thank you for contacting support. We've received your {ticket['category']} ticket
(Ticket ID: {ticket['ticket_id']}).
Summary: {ticket['summary']}
Priority: {ticket['priority'].upper()}
Assigned to: {ticket['assigned_department'].title()} Team
Expected Resolution: {ticket['estimated_resolution_time']}
Our team will review your ticket and get back to you shortly.
Best regards,
Support Team
"""
        return response_template.strip()
# Usage Example
if __name__ == "__main__":
    processor = TicketProcessor()
    # Process the urgent ticket
    ticket_text = """
Subject: URGENT - Payment Processing Failed!!!
Hi, my name is Sarah Johnson and I'm extremely frustrated.
I've been trying to process a payment for our Enterprise plan
for the last 3 hours and keep getting error code 502.
This is blocking our entire team from accessing the analytics
dashboard. We have a board meeting tomorrow and need this
fixed IMMEDIATELY!
"""
    # Process ticket
    processed = processor.process_ticket(ticket_text)
    print("=== Processed Ticket ===")
    print(json.dumps(processed, indent=2, default=str))
    # Generate response
    response = processor.generate_response(processed)
    print("\n=== Auto-Generated Response ===")
    print(response)
    # Save to database (pseudo-code)
    # db.tickets.insert_one(processed)
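The SLA and routing rules do not depend on the LLM at all, so they can be pulled out and tested on their own. The sketch below restates them as standalone functions using the same hour values and department mapping as TicketProcessor. Two things are assumptions of this sketch rather than part of the application above: the optional now parameter, added so that results are reproducible in tests, and the use of timezone-aware UTC timestamps instead of naive local time.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Same values as in TicketProcessor.calculate_sla and assign_department.
SLA_HOURS = {"critical": 2, "high": 8, "medium": 24, "low": 72}
DEPARTMENTS = {
    "technical": "engineering",
    "bug_report": "engineering",
    "billing": "billing",
    "feature_request": "product",
}


def calculate_sla(priority: str, now: Optional[datetime] = None) -> datetime:
    """Return the SLA deadline; unknown priorities fall back to 24 hours."""
    # `now` is injectable (an assumption of this sketch) so tests can pin the clock.
    now = now or datetime.now(timezone.utc)
    return now + timedelta(hours=SLA_HOURS.get(priority, 24))


def assign_department(category: str) -> str:
    """Route a category to a department; unknown categories go to support."""
    return DEPARTMENTS.get(category, "support")


fixed_now = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
print(calculate_sla("critical", fixed_now).isoformat())  # 2025-01-01T11:00:00+00:00
print(calculate_sla("low", fixed_now).isoformat())       # 2025-01-04T09:00:00+00:00
print(assign_department("bug_report"))                   # engineering
print(assign_department("unknown"))                      # support
```

Keeping these rules as pure functions means a change to the SLA table can be covered by a unit test without calling the model.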
Comparison Table
| Approach | Pros | Cons | Best For |
|---|---|---|---|
| TypedDict | Lightweight, no dependencies, simple | No validation, limited features | Quick prototypes, simple schemas |
| Annotated TypedDict | Adds field descriptions and metadata to TypedDict | More verbose, constraints are not enforced at runtime | Medium complexity where the model needs per-field guidance |
| Pydantic | Full validation, type checking, serialization | Requires pydantic dependency | Production applications (Recommended) |
| JsonSchema | Maximum flexibility, provider-agnostic | Verbose, no native Python types | Multi-provider setups, complex schemas |
Structured output in LangChain transforms unpredictable LLM responses into reliable, validated data structures. While all four approaches have their place, Pydantic is recommended for production applications due to its robust validation and excellent developer experience.
The customer support ticket analyzer demonstrates how structured output can automate real-world workflows, from ticket classification to SLA calculation and department routing. By choosing the right approach for your use case, you can build reliable AI-powered applications that integrate seamlessly with existing systems.