OpenAI (Realtime)
Use OpenAI's Realtime API for native audio-to-audio conversations with ultra-low latency.
Setup
- Set the `OPENAI_API_KEY` environment variable:

```bash
# .env
OPENAI_API_KEY=...
```
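If you keep the key in a `.env` file as shown above, you can load it with a library such as python-dotenv, or with a minimal hand-rolled loader like this sketch (the `load_env` helper is illustrative, not part of siphon):

```python
import os

def load_env(path=".env"):
    # Parse simple KEY=VALUE lines, skipping comments and blanks.
    # setdefault means a key already set in the environment wins.
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

A real project should prefer python-dotenv, which also handles quoting and multiline values.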
Example
```python
from siphon.agent import Agent
from siphon.plugins import openai

agent = Agent(
    agent_name="RealtimeAssistant",
    llm=openai.Realtime(
        model="gpt-realtime",
        voice="alloy",
        temperature=0.3,
    ),
    system_instructions="You are a helpful voice assistant.",
)

if __name__ == "__main__":
    agent.dev()
```
Common options
- `model` (default: `gpt-realtime`)
- `voice` (default: `alloy`). Available voices: `alloy`, `echo`, `shimmer`
- `temperature` (default: `0.3`)
- `api_key` (default: `None`; reads from the `OPENAI_API_KEY` env var)
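The `api_key` fallback described above can be sketched in plain Python (the `resolve_api_key` helper is hypothetical, shown only to illustrate the precedence):

```python
import os

def resolve_api_key(api_key=None):
    # An explicitly passed key wins; otherwise fall back to the
    # OPENAI_API_KEY environment variable, matching the default above.
    return api_key if api_key is not None else os.getenv("OPENAI_API_KEY")
```

In practice you rarely pass `api_key` directly; setting the environment variable keeps secrets out of source code.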
Notes
- OpenAI Realtime API provides native audio input/output
- Handles the complete voice pipeline in a single model
- Optimized for conversational latency
- Supports natural interruptions and turn-taking
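Conceptually (an illustrative sketch, not siphon internals), the single-model design in the notes above collapses three sequential hops into one:

```python
# Traditional pipeline: three stages per turn, each adding latency.
def pipeline_turn(audio_in, stt, llm, tts):
    text_in = stt(audio_in)    # speech -> text
    text_out = llm(text_in)    # text -> text
    return tts(text_out)       # text -> speech

# Realtime model: a single audio-to-audio hop per turn.
def realtime_turn(audio_in, realtime_model):
    return realtime_model(audio_in)
```

Fewer hops means fewer serialization points, which is where the latency advantage comes from.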
When to Use
Use OpenAI Realtime when:
- You need ultra-low latency voice conversations
- You want simplified architecture (one component instead of LLM + STT + TTS)
- Your use case benefits from OpenAI's conversational AI optimizations
Alternative: Traditional Pipeline
If you need more flexibility or want to mix providers, use the traditional approach:
```python
from siphon.agent import Agent
from siphon.plugins import openai, deepgram, cartesia

agent = Agent(
    agent_name="Assistant",
    llm=openai.LLM(),
    stt=deepgram.STT(),
    tts=cartesia.TTS(),
)
```