Realtime Models Overview

Realtime models provide native audio-to-audio conversation capabilities with end-to-end latency optimizations. These models handle the complete voice pipeline (speech-to-text, reasoning, and text-to-speech) in a single integrated system.

What are Realtime Models?

Realtime models are designed specifically for voice conversations and offer:

  • Native Audio Processing: Direct audio input and output without separate STT/TTS components
  • Ultra-Low Latency: Optimized for real-time conversations with minimal delay
  • Integrated Pipeline: Single model handles listening, reasoning, and speaking
  • Natural Interruptions: Better handling of conversational dynamics

When to Use Realtime Models

Use realtime models when:

  • You need the absolute lowest latency for voice conversations
  • You want simplified architecture (one model instead of LLM + STT + TTS)
  • Your use case benefits from native audio understanding
  • You're building interactive voice experiences

Available Providers

SIPHON supports the following realtime model providers:

  • OpenAI — via the openai plugin
  • Google Gemini — via the gemini plugin

Usage Pattern

from siphon.agent import Agent
from siphon.plugins import openai  # or gemini

# Use realtime model instead of separate LLM/STT/TTS
agent = Agent(
    agent_name="RealtimeAssistant",
    llm=openai.Realtime(
        model="gpt-realtime",
        voice="alloy",
        temperature=0.3
    ),
    system_instructions="You are a helpful voice assistant.",
)

if __name__ == "__main__":
    agent.dev()
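The same pattern applies to Gemini. A minimal sketch, assuming the gemini plugin exposes a parallel Realtime class; the model and voice values here are illustrative assumptions, not confirmed API:

from siphon.agent import Agent
from siphon.plugins import gemini

# Hypothetical Gemini realtime configuration. Parameter names mirror the
# OpenAI example above and are assumptions for illustration.
agent = Agent(
    agent_name="RealtimeAssistant",
    llm=gemini.Realtime(
        model="gemini-live",  # illustrative model name
        voice="Puck",         # illustrative voice name
    ),
    system_instructions="You are a helpful voice assistant.",
)

if __name__ == "__main__":
    agent.dev()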

Key Differences from Traditional Pipeline

  Traditional (LLM + STT + TTS)  | Realtime Model
  -------------------------------|-----------------------------
  Three separate components      | Single integrated component
  Higher overall latency         | Ultra-low latency
  More configuration options     | Simplified configuration
  Flexible provider mixing       | Provider-specific
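For contrast, a traditional pipeline wires the three components explicitly. A sketch assuming Agent accepts separate llm, stt, and tts parameters; the component classes and model names shown are assumptions for illustration, not confirmed siphon API:

from siphon.agent import Agent
from siphon.plugins import openai

# Hypothetical traditional pipeline: three separately configured components
# instead of one realtime model. Names are illustrative assumptions.
agent = Agent(
    agent_name="PipelineAssistant",
    llm=openai.LLM(model="gpt-4o-mini"),   # reasoning
    stt=openai.STT(model="whisper-1"),     # speech-to-text
    tts=openai.TTS(voice="alloy"),         # text-to-speech
    system_instructions="You are a helpful voice assistant.",
)

This is the "flexible provider mixing" row above: each slot could come from a different plugin, at the cost of extra configuration and added latency between stages.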

Notes

  • Realtime models typically require specific API access or preview features
  • Configuration options may vary by provider
  • Some providers may have usage limits or pricing differences for realtime APIs