Horizontal Scaling

One of Siphon's most powerful features is its ability to scale horizontally with zero configuration. Because Siphon uses a worker-based architecture, you can handle higher call volumes simply by running more copies of your agent code.

How it works

When you run agent.start(), your script acts as a Worker. It connects to the Siphon/LiveKit infrastructure and waits for jobs (calls).
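
This "connect and wait for jobs" pattern can be sketched with a toy model in plain Python. The names here (`worker`, `jobs`) are illustrative, not Siphon's API; a real worker receives calls from the Siphon/LiveKit infrastructure rather than a local queue:

```python
import queue
import threading

def worker(jobs, results, name):
    """Toy worker: blocks waiting for jobs, like a script after agent.start()."""
    while True:
        call = jobs.get()
        if call is None:          # shutdown sentinel
            break
        results.append((name, call))  # "handle" the call
        jobs.task_done()

jobs = queue.Queue()
results = []
t = threading.Thread(target=worker, args=(jobs, results, "worker-1"))
t.start()

# The infrastructure dispatches incoming calls; here we enqueue them by hand.
for call in ["call-1", "call-2"]:
    jobs.put(call)
jobs.join()      # wait until all calls are handled
jobs.put(None)   # signal shutdown
t.join()

print(results)  # [('worker-1', 'call-1'), ('worker-1', 'call-2')]
```

The key point is that the worker is idle until a job arrives, so starting more copies of the same script adds capacity without any coordination code on your side.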

  • One Worker: Can handle N concurrent calls, where N depends on available CPU and memory.
  • Multiple Workers: Multiply your capacity by simply starting the same script on additional machines.

All workers join a shared pool. When a new call comes in, it is automatically routed to an available worker with capacity.
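
The pool-routing behavior can be illustrated with a small simulation. This is a first-fit sketch under assumed names (`Worker`, `route_call`); the real scheduler may use a different placement strategy, such as least-loaded:

```python
class Worker:
    """Toy stand-in for one running copy of agent.py."""
    def __init__(self, name, max_calls):
        self.name = name
        self.max_calls = max_calls  # capacity N for this machine
        self.active = 0

    @property
    def has_capacity(self):
        return self.active < self.max_calls

def route_call(pool):
    """Route an incoming call to any worker with spare capacity."""
    for worker in pool:
        if worker.has_capacity:
            worker.active += 1
            return worker.name
    return None  # pool saturated: the call must queue or be rejected

# Two workers with capacity 2 each: the pool absorbs 4 concurrent calls.
pool = [Worker("server-a", 2), Worker("server-b", 2)]
assignments = [route_call(pool) for _ in range(5)]
print(assignments)  # ['server-a', 'server-a', 'server-b', 'server-b', None]
```

Adding a third `Worker` to the pool would absorb the fifth call, which is exactly what starting another copy of your agent script does in production.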

Scaling Out

To scale out, you do not need to change any code.

  1. Server A: Run python agent.py
  2. Server B: Run python agent.py
  3. Server C: Run python agent.py

That's it. You now have 3x the capacity. Siphon handles the load balancing and distribution of calls across these servers automatically.

Deployment Best Practices

For production environments, we recommend:

  • Containerize your Agent: Wrap your agent.py in a Docker container.
  • Orchestration: Use Kubernetes or Docker Swarm to manage the number of replicas.
  • Auto-scaling: Configure your orchestrator to scale the number of pods based on CPU usage or custom metrics (e.g., active calls).
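
The containerization step could look like the following minimal Dockerfile. The file names (`requirements.txt`, `agent.py`) and base image are assumptions; adapt them to your project:

```dockerfile
# Illustrative Dockerfile for a Siphon agent worker (names are assumptions)
FROM python:3.11-slim
WORKDIR /app

# Install dependencies first so this layer is cached between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY agent.py .

# Each container runs exactly one worker; credentials are injected
# via environment variables (e.g. from a Kubernetes Secret)
CMD ["python", "agent.py"]
```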
# Example K8s deployment snippet
apiVersion: apps/v1
kind: Deployment
metadata:
  name: siphon-worker
spec:
  replicas: 3 # Start with 3 workers
  selector:
    matchLabels:
      app: siphon-worker
  template:
    metadata:
      labels:
        app: siphon-worker
    spec:
      containers:
      - name: agent
        image: my-agent-image:latest
        envFrom:
        - secretRef:
            name: siphon-secrets
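
For the auto-scaling recommendation, a standard Kubernetes HorizontalPodAutoscaler can manage the replica count of the Deployment above. The replica bounds and CPU target here are illustrative; scaling on a custom metric such as active calls requires a metrics adapter:

```yaml
# Example HPA: scale workers between 3 and 10 replicas on CPU usage
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: siphon-worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: siphon-worker
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```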