AgentPhone

Roadmap & Position in Agent Telephony

Gives AI agents real phone numbers for calls and texts.

Company Overview

AgentPhone is a developer tool that gives AI agents phone numbers for SMS, calls, transcripts, and webhooks. Buyers span agent builders, AI receptionist teams, and sales ops groups running lead follow-up.

What They're Building

The company's public product roadmap and what it has committed to building.

Agent Phone Numbers

US and Canadian numbers for agents that need a durable identity for calls, SMS, and inbound handling.

Unified Webhook

A single event layer that routes messages, calls, transcripts, and status changes into a customer backend.
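
To make the single-event-layer idea concrete, here is a minimal sketch of how a customer backend might fan one incoming event stream out to separate handlers. The event kinds, the `type` and `from` fields, and the `WebhookRouter` class are all hypothetical names chosen for illustration. They are not taken from AgentPhone's documentation, and the real payload shape may differ.

```python
from typing import Callable, Dict

# Hypothetical event kinds, named after the four categories in the text.
# These are NOT AgentPhone's documented event names.
EVENT_KINDS = ("message", "call", "transcript", "status")

Handler = Callable[[dict], str]


class WebhookRouter:
    """Routes events from one webhook endpoint to per-kind handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def on(self, kind: str, handler: Handler) -> None:
        """Register a handler for one event kind."""
        if kind not in EVENT_KINDS:
            raise ValueError(f"unknown event kind: {kind}")
        self._handlers[kind] = handler

    def dispatch(self, event: dict) -> str:
        """Look up the handler by the (assumed) 'type' field and call it."""
        handler = self._handlers.get(event.get("type"))
        if handler is None:
            return "ignored"  # no handler registered for this kind
        return handler(event)


router = WebhookRouter()
router.on("message", lambda e: f"sms from {e['from']}")
router.on("transcript", lambda e: f"transcript with {len(e['text'].split())} words")
```

In a real deployment the `dispatch` call would sit behind an HTTP endpoint that verifies the request before routing it; that part is omitted here because the verification scheme is not described in the source.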

Hosted Voice Agents

Built-in STT, LLM prompting, and TTS let teams run voice agents without stitching the stack together themselves.

MCP Server

MCP support lets Claude Code, Cursor, Windsurf, and other agent clients place calls and send texts directly from tool calls.

SDKs and API

Python, TypeScript, and REST surfaces, designed to make the product feel like agent infrastructure rather than a telecom console.

Latest Intelligence

Zeitgeist tracks private signals to determine where the company is heading strategically.

Competitors

Twilio:

Telecom API incumbent with massive carrier reach, but AgentPhone competes by packaging phone workflows for AI agents.

Vapi:

Voice agent infrastructure player focused on real-time calls, while AgentPhone widens the surface to numbers, SMS, webhooks, and MCP.

Retell AI:

Voice AI platform for phone agents. It is stronger on call automation, while AgentPhone pitches a lower-level agent phone layer.

AgentPhone's Moat:

AgentPhone has not yet built a defensible position. The likely path is workflow switching costs as teams wire numbers, webhooks, transcripts, and MCP tools deep into their agent stacks and accumulate operational state inside AgentPhone.

How They're Leveraging AI

AI Use Overview:

AgentPhone appears to orchestrate streaming STT, hosted LLM prompts, TTS, webhooks, and MCP tool calls rather than training proprietary models, which makes the product an integration and developer-experience play, not a model play.
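
The orchestration pattern described above can be sketched as a cascaded pipeline in which each stage is a pluggable callable. This is a generic illustration of the STT, LLM, TTS chain, not AgentPhone's implementation. The function `run_turn` and the three stub stages are hypothetical, and a production system would stream audio rather than process one complete turn at a time.

```python
from typing import Callable

# Each stage is a plain callable so any vendor could be swapped in.
SpeechToText = Callable[[bytes], str]
LanguageModel = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


def run_turn(audio: bytes, stt: SpeechToText, llm: LanguageModel,
             tts: TextToSpeech) -> bytes:
    """Handle one caller turn: transcribe, generate a reply, synthesize it."""
    transcript = stt(audio)
    reply = llm(transcript)
    return tts(reply)


# Stub stages standing in for real services (assumptions for illustration).
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


reply_audio = run_turn(b"hello", fake_stt, fake_llm, fake_tts)
```

Because no stage depends on a proprietary model, the value in such a design sits in the glue code and developer experience, which matches the integration-play reading above.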

More Similar Companies

ElevenLabs

Building human-like AI voices that speak, clone, dub, and converse in 70+ languages.

Having established defensible voice quality and market share through its API, ElevenLabs is now becoming a multimodal generation platform with an enterprise go-to-market engine.

Deepgram

Voice AI infrastructure for real-time speech-to-text, text-to-speech, and voice agents.

Deepgram controls the full vertical stack from bare-metal training hardware to a Rust inference runtime, a cost and latency moat that API competitors riding hyperscaler infrastructure cannot replicate without years of capex.

Cartesia

Real-time multimodal voice AI built on State Space Model foundation architecture.

Cartesia owns the SSM architecture its founders invented, a primitive with linear scaling and constant-time inference that compounds in advantage as latency budgets tighten.

AssemblyAI

Speech-to-text and audio intelligence APIs for developers building voice-powered applications.

Voice is the next API primitive after text, and AssemblyAI has an accuracy and developer-experience lead over cloud incumbents with better margins than full-stack voice agent startups carry.