PolyAI

Roadmap & Position in Voice AI

Voice AI platform building conversational agents for customer service call centers

Company Overview

PolyAI is a voice AI company that builds customer-led conversational agents to handle inbound call center traffic. Customers include Marriott, Hilton, Landry's (hospitality), Metro Bank, PKO Bank Polski (banking), Golden Nugget (gaming), and FedEx (logistics).

What They're Building

The company's public product roadmap & what they're committed to building.

PolyAI Agent Platform

Production voice agents for inbound customer service with an analytics and integration layer.

Agent Studio

No-code tooling for enterprises to design, test, and deploy voice agents.

Multilingual Voice Agents

Deployments across English, Spanish, French, German, and other languages for global enterprise rollouts.

Real-Time Analytics Suite

A conversation intelligence layer surfacing call drivers, containment, and CX metrics.

Latest Intelligence

Zeitgeist tracks private signals to determine where the company is heading strategically.

Real-Time Voice Is The Actual Moat

May 11, 2026

Confidence: High

New Intel: PolyAI has a dedicated Runtime Engineering team owning the end-to-end media stack with hard latency SLAs. In voice AI, that infrastructure layer is the moat, not the model, and it also governs inference cost per minute.

Distribution Is Being Rebuilt Around Partners And Investors

May 11, 2026

Confidence: High

New Intel: PolyAI is building a PE and VC partner program with a quota tied to portfolio company adoption, alongside a TSD channel motion. This is a deliberate bet that indirect distribution scales faster than enterprise AE hiring.

Healthcare Is The Next Regulated Wedge

May 11, 2026

Confidence: High

New Intel: PolyAI is building a healthcare vertical with a dedicated AE, Epic and ModMed integration requirements, and HIPAA plus PCI readiness on the security roadmap. The real competitors here are Hippocratic AI and Infinitus, not contact center incumbents.

Founder and Key Execs

Nikola Mrkšić

CEO and Co-founder (PhD, Cambridge dialogue systems; ex-Apple Siri, ex-Facebook M)

Tsung-Hsien Wen

CTO and Co-founder (PhD, Cambridge; neural dialogue generation pioneer)

Pei-Hao Su

CSO and Co-founder (PhD, Cambridge; reinforcement learning for dialogue)

Founder Force Multiplier

All three co-founders did their PhDs under Steve Young at Cambridge, one of the most influential spoken dialogue systems labs in the world. Nikola Mrkšić shipped conversational tech at Apple Siri and Facebook M, Tsung-Hsien Wen pioneered neural dialogue generation, and Pei-Hao Su worked on reinforcement learning for dialogue. That founding research depth is the durable asset.

Notable Open Roles

Engineering Manager - Runtime Team

Engineering role whose description ties it to the company's agentic strategy.

Developer Relations Manager

Engineering role tied to platformization (developer relations and API).

Senior Platform Engineer

Engineering role tied to platformization (platform engineering).

Competitors

Cresta:

Focused on agent assist and coaching for human reps rather than full voice automation.

Parloa:

German voice AI competitor with strong DACH enterprise footprint and a 2024 Series B.

Sierra:

Bret Taylor's conversational AI startup targeting a broader omnichannel agent use case with higher valuation and hype.

PolyAI's Moat:

Proprietary dialogue models trained on years of enterprise call data, deep integrations into contact center stacks (Genesys, NICE, Amazon Connect), and a bench of Cambridge dialogue systems researchers that competitors would need years to match.

How They're Leveraging AI

Enterprise Conversational AI

Enterprise voice agents handle customer service calls with intent recognition, dialogue management, integrations, and real-time analytics.

AI Use Overview:

PolyAI combines dialogue models trained on years of enterprise call data with multilingual voice agents, intent detection, deep contact-center integrations (Genesys, NICE, and Amazon Connect), and conversation analytics, all tuned for inbound customer service rather than outbound demos.

More Similar Companies

ElevenLabs

Building human-like AI voices that speak, clone, dub, and converse in 70+ languages

Having established defensible voice quality and market share through its API, ElevenLabs is now becoming a multimodal generation platform with an enterprise go-to-market engine.

Deepgram

Voice AI infrastructure for real-time speech-to-text, text-to-speech, and voice agents.

Deepgram controls the full vertical stack from bare-metal training hardware to a Rust inference runtime, a cost and latency moat that API competitors riding hyperscaler infrastructure cannot replicate without years of capex.

Cartesia

Real-time multimodal voice AI built on State Space Model foundation architecture.

Cartesia owns the SSM architecture its founders invented, a primitive with linear scaling and constant-time inference that compounds in advantage as latency budgets tighten.

AssemblyAI

Speech-to-text and audio intelligence APIs for developers building voice-powered applications.

Voice is the next API primitive after text, and AssemblyAI has an accuracy and developer-experience lead over cloud incumbents with better margins than full-stack voice agent startups carry.
