If the first OpenAI Agents SDK article explained what the SDK is, this continuation explains how the runtime behaves after the first successful demo. The technical center of a real agent system is not the prompt. It is the run contract: what comes back, what continues to the next turn, what gets stored, and when a streamed run is truly finished.
![openai-agents-sdk-results-state]()
This article avoids repeating the basics. It focuses only on result surfaces, continuation state, typed outputs, and streaming behavior in the OpenAI Agents SDK.
Abstract / Overview
The most important runtime fact in the official docs is simple: one SDK run is one application-level turn. That means your application should treat the run result as a workflow object, not just as display text. In practice, that means reading finalOutput, continuation surfaces like history, lastAgent, and lastResponseId, and resumable state when a run is incomplete.
OpenAI’s runtime guide also makes a second point that matters in production: pick one conversation-state strategy per conversation. Mixing local replay with server-managed state can duplicate context unless you are doing it very deliberately.
Conceptual Background
The run is the real boundary
The OpenAI Agents SDK overview says the SDK is for applications where your system owns orchestration, tool execution, approvals, and state. That means the unit you ship is not a single prompt. It is a full turn of workflow execution.
The results guide makes this more concrete. A run can return a final answer, but it can also return history for replay, the last active specialist, a server-managed response ID, or resumable state when the workflow paused. That is why production apps should model the result object as part of their architecture.
Typed output changes the reliability of the whole workflow
The agent definitions guide says outputType should be used when downstream code needs typed data rather than free-form prose. This is a small setting with a big effect. If the next step is routing, storage, validation, or tool execution, typed output is not a convenience. It is the contract
Runtime map
![openai-agents-sdk-results-state-streaming]()
Conceptual Background
Keep runtime-only data out of model-visible history
The official guide says conversation history is what the model sees, while run context is what your code sees. Authenticated user IDs, database handles, loggers, and helper services belong in the runtime context, not in the prompt transcript. That keeps prompts smaller and keeps sensitive runtime data out of model-visible text.
Step-by-Step Walkthrough
Build the agent around a typed result
The official docs recommend using structured output when downstream code needs a known shape. This is the cleanest starting point for extraction, triage, and classification workflows.
import { Agent, run } from "@openai/agents";
import { z } from "zod";
const Ticket = z.object({
category: z.enum(["billing", "refund", "technical"]),
severity: z.enum(["low", "medium", "high"]),
summary: z.string(),
});
const triageAgent = new Agent({
name: "Triage agent",
instructions: "Classify the request and return structured output.",
outputType: Ticket,
});
const result = await run(triageAgent, "Refund my annual invoice.");
console.log(result.finalOutput);
This pattern is more reliable than parsing free text later. It keeps the model contract close to the agent definition and reduces fragile post-processing code.
Read the whole result object
The results guide says most apps need more than the final answer. The key result surfaces are:
finalOutput for the user-facing answer
history or to_input_list() when your app owns local replay
lastAgent when a specialist should keep control on the next turn
lastResponseId for server-managed continuation
interruptions and state when a run is paused and must later resume
This changes how you persist agent runs. If you store only the rendered answer, you lose the continuation model.
Pick one continuation strategy
The running-agents guide lists four common continuation paths: local replay, sessions, conversationId, and previousResponseId. OpenAI also says sessions are the best default when you want durable memory, resumable approval flows, or storage for your application controls.
A clean Python session example looks like this:
import asyncio
from agents import Agent, Runner, SQLiteSession
agent = Agent(
name="Tour guide",
instructions="Answer with compact travel facts.",
)
session = SQLiteSession("conversation_123")
async def main():
first = await Runner.run(agent, "What city is the Golden Gate Bridge in?", session=session)
print(first.final_output)
second = await Runner.run(agent, "What state is it in?", session=session)
print(second.final_output)
asyncio.run(main())
Use sessions when your app needs a durable state under its own control. Use conversationId when several systems need to share one named conversation. Use previousResponseId when you want a lightweight server-managed continuation between turns.
Do not mix state models by accident
The runtime guide warns that mixing local replay with server-managed state can duplicate context. That bug is easy to create during migrations because the app keeps replaying history while also reusing response IDs or conversation IDs.
A good rule is simple. For one conversation, choose one primary continuation model and make it explicit in code review.
Stream output, but settle the run before you commit the workflow state
The running-agents guide says streaming uses the same agent loop and the same state strategies. The only difference is that your app consumes events while the run is happening. That means a streamed run can still reach tool calls, handoffs, or interruptions before it is actually done.
import asyncio
from openai.types.responses import ResponseTextDeltaEvent
from agents import Agent, Runner
agent = Agent(
name="Planet guide",
instructions="Answer with short facts.",
)
async def main():
stream = Runner.run_streamed(agent, "Give me three short facts about Saturn.")
async for event in stream.stream_events():
if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
print(event.data.delta, end="", flush=True)
print("\nFinal:", stream.final_output)
asyncio.run(main())
The safe engineering rule is not to write business state, audit state, or downstream actions until the stream has settled and the workflow is complete.
Use Cases / Scenarios
Typed ticket triage
A support intake workflow is a strong fit for structured output. The agent can classify the request, assign severity, and return a schema-safe object that the next service can trust without brittle text parsing.
Durable multi-turn assistants
If the product needs memory across turns, the official guidance points to sessions as the best default. This is useful for support, tutoring, onboarding, and internal copilots where the app must own the state store.
Server-managed continuation across services
If multiple workers or services need to continue one conversation, conversationId is the cleaner boundary? If the app only needs cheap response-to-response continuation, previousResponseId is lighter.
Live streaming interfaces
Streaming is useful for chat UIs and dashboards, but the app should still treat completion as a workflow event, not just a UI event. The final state matters more than the first tokens on screen.
Fixes
Fix: stop storing only the final answer
Store the result surface your continuation model needs, not just the visible text. That may mean history, lastAgent, lastResponseId, or resumable state.
Fix: stop mixing replay and server-managed state casually
Choose one main state strategy per conversation. OpenAI explicitly warns that mixing them can duplicate context.
Fix: stop parsing plain text when the next step is code
Use outputType when downstream systems need typed data. This is one of the clearest recommendations in the agent definitions guide.
Fix: stop treating streamed tokens as completion
A stream is only the delivery path. The workflow is complete only after the run settles.
Fix: stop putting runtime-only values into the prompt
Keep runtime data in the local context unless the model truly needs to see it.
FAQs
1. What is the main point of this part?
This part explains the runtime contract of the SDK: result handling, typed outputs, state carry-forward, and streaming settlement
2. Which continuation strategy should most teams start with?
The official runtime guide says sessions are the best default when you want durable memory, resumable approval flows, or storage for your application controls.
3. When should I use lastAgent?
Use it after handoffs when that specialist should normally stay in control for the next turn.
4. What should I persist in a session-based app?
Persist the session store, plus any audit or business metadata your system needs. Do not rely on the final answer alone.
5. Is streaming a different state model?
No. The docs say streaming uses the same agent loop and the same continuation strategies.
References
Conclusion
The first production lesson of the OpenAI Agents SDK is that the answer is not the whole result. Real systems need typed outputs, one clear continuation strategy, correct result persistence, and a strict rule for when a streamed run is actually done.
A good next step is to harden your runtime contract before you add more agents. Define one schema, pick one state model, and make stream settlement explicit in your code path. That will remove more production bugs than adding another specialist ever will.