Streaming Responses from OpenAI
In this lesson, you'll learn how to receive responses in real time using OpenAI's streaming API, allowing your app to react immediately as the model generates output.
Prerequisites
Make sure to:
- Use a model that supports streaming (like gpt-4.1)
- Install the OpenAI SDK: pip install openai
The Code
from openai import OpenAI
client = OpenAI()
stream = client.responses.create(
    model="gpt-4.1",
    input=[{"role": "user", "content": "Say 'double bubble bath' ten times fast."}],
    stream=True,
)

# Only print the text deltas as they arrive
for event in stream:
    if hasattr(event, "delta"):
        print(event.delta, end="", flush=True)
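In practice you often want the full text at the end as well as the live deltas. One way is to filter on the event's type (text deltas arrive as "response.output_text.delta" events in the Responses API) and accumulate the chunks as you print them. The sketch below uses SimpleNamespace stand-ins for stream events so it runs without an API key; the collect_text helper name is our own, not part of the SDK:

```python
from types import SimpleNamespace

# Hypothetical helper: print each text delta as it arrives and also
# accumulate the chunks so the caller gets the complete text back.
def collect_text(events):
    chunks = []
    for event in events:
        # Text deltas carry the "response.output_text.delta" event type.
        if getattr(event, "type", "") == "response.output_text.delta":
            print(event.delta, end="", flush=True)
            chunks.append(event.delta)
    return "".join(chunks)

# Simulated events standing in for a live stream (illustrative only);
# a real run would iterate the stream returned by client.responses.create.
fake_stream = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="double "),
    SimpleNamespace(type="response.output_text.delta", delta="bubble "),
    SimpleNamespace(type="response.output_text.delta", delta="bath"),
    SimpleNamespace(type="response.completed"),
]
text = collect_text(fake_stream)
```

Filtering on the event type is a bit stricter than the hasattr check above, since other event kinds may also expose a delta attribute.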
Explanation
- stream=True: Enables token-by-token streaming of the response.
- event.delta: Contains the partial text output received from the model.
- flush=True: Ensures each chunk prints immediately without buffering.
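Besides the deltas, the stream ends with a "response.completed" event that carries the fully assembled response object, which is handy when you need the final text after streaming finishes. A minimal sketch, again using simplified stand-in events rather than live SDK objects (the run_stream helper is illustrative, not part of the SDK):

```python
from types import SimpleNamespace

# Sketch: print deltas as they arrive, then return the final response
# object delivered by the terminal "response.completed" event.
def run_stream(events):
    final = None
    for event in events:
        etype = getattr(event, "type", "")
        if etype == "response.output_text.delta":
            print(event.delta, end="", flush=True)
        elif etype == "response.completed":
            final = event.response  # the complete response object
    return final

# Stand-in events for illustration; a live stream yields real SDK objects.
events = [
    SimpleNamespace(type="response.output_text.delta", delta="hi"),
    SimpleNamespace(type="response.completed",
                    response=SimpleNamespace(output_text="hi")),
]
result = run_stream(events)
```

This way the streaming loop doubles as your source of the final answer, with no second (non-streaming) request needed.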
Use Case
Perfect for:
- Live chat experiences
- Streaming assistants or CLI bots
- Apps needing instant model feedback