turbopuffer

Roadmap & Position in Search Infrastructure

Search infrastructure for vector and full-text retrieval.

Company Overview

turbopuffer is a managed search database that runs vector and full-text search on object storage. It serves AI-native software teams including Cursor, Notion, Linear, Superhuman, and Anthropic.

What They're Building

The company's public product roadmap and what it has committed to building.

Hybrid Search Expansion

The public roadmap points to richer full-text ranking, fuzzy search, highlighting, native search-as-you-type, and rank by attribute or distance.

Indexing and Query Performance

Recent roadmap and blog work centers on ANN v3, faster AND queries, BM25 optimization, FTS v2, and inverted index design on object storage.

Namespace Operations

turbopuffer has shipped namespace branching, pinning, cache warming, cross-cloud namespace copies, and cross-region backup capabilities.

Enterprise Controls

Recent releases include audit logs with SIEM integration, permissions guidance, private networking, customer managed encryption keys, and BYOC options.

Developer Surface

The company maintains API clients, benchmarking tools, an MCP server beta, and agent skills for Claude Code and Cursor.

Latest Intelligence

Zeitgeist tracks private signals to determine where the company is heading strategically.

turbopuffer is doing $100M ARR on $1M Raised

May 27, 2026

Confidence: High

$100M revenue on $1M raised, ~20 employees. Winning Anthropic, Notion, and Cursor with 1/100th the capital and 100 fewer employees than competitors.

Object Storage Search Reframes Vector DB Economics

May 25, 2026

Confidence: Medium

Architecture and resource allocation are centering on object-storage-native search, hybrid retrieval, namespace branching, cache controls, and high-touch production deployment for large multi-tenant AI workloads.

Founder and Key Execs

Simon Hørup Eskildsen

CEO and Co-Founder, formerly Shopify infrastructure leader scaling database and compute systems.

Justine Li

CTO and Co-Founder, formerly Shopify infrastructure engineer working with Eskildsen on large-scale systems.

Nikhil Benesch

CTO, formerly CTO at Materialize.

Nathan VanBenschoten

Chief Architect, formerly principal engineer at CockroachDB.

Founder Force Multiplier

The founders spent years scaling Shopify infrastructure together, which gives turbopuffer practical judgment on database cost, reliability, and operational pain at high request volume. That background fits the product wedge unusually well.

Funding History

2024 | turbopuffer publicly launched and drew early technical attention for object-storage-native vector search.
2025 | The company reported 10x sales growth and 5x headcount growth, with fresh undisclosed financing from Lachy Groom and Thrive Capital.
2026 | Public roadmap and blog activity expanded around hybrid search, ANN v3, BM25, FTS v2, namespace operations, and enterprise controls.

Notable Open Roles

Database Engineer

This is the highest-signal role because query latency, indexing, and object storage design define the company's cost advantage.

Lead Security Engineer

Security hiring points to enterprise expansion where compliance, private networking, and auditability become purchase blockers.

Customer Engineer

Customer engineering suggests demand is shifting into complex deployments where retrieval design and migration support drive adoption.

Competitors

Pinecone

Managed vector database incumbent. turbopuffer competes with it on object storage economics and hybrid retrieval scale.

Weaviate

Open source and managed vector search platform with a broader ecosystem but a different storage and operations model.

Qdrant

Vector search database with strong developer adoption, competing for AI retrieval workloads.

Milvus and Zilliz

Open source and managed vector database stack for large-scale similarity search.

Elasticsearch and OpenSearch

Traditional full-text search systems that turbopuffer is replacing in some hybrid search workloads.

turbopuffer's Moat:

Technical infrastructure is the moat candidate: object storage economics, cache locality, and namespace scale create switching costs once embedded in customer retrieval paths.

How They're Leveraging AI

AI Retrieval Infrastructure

Customers use turbopuffer to index vectors, full-text fields, and metadata for RAG and agent workflows across code, documents, email, and workspace data.

Hybrid Search for Productivity Software

Linear, Superhuman, Notion, and Cursor use vector search plus full-text search to improve relevance at larger scale and lower cost than prior search stacks.
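As an illustration of how vector and full-text results can be combined into one ranking, here is a minimal sketch using reciprocal rank fusion (RRF), a common fusion technique. This is an assumption chosen for illustration: the source does not say which fusion method these customers or turbopuffer use, and the function name, the constant k=60, and the document IDs are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Illustrative only: RRF is a generic technique, not a documented
    turbopuffer API. Each document scores 1 / (k + rank) per list it
    appears in, and the scores are summed across lists.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a vector query and a full-text (BM25) query.
vector_hits = ["d1", "d2", "d3"]
text_hits = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([vector_hits, text_hits])
```

Documents that rank well in both lists rise to the top, which is the basic reason hybrid retrieval can beat either signal alone.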

Object Storage Native Search

The product moves cold data to object storage and caches active namespaces on NVMe and memory, which matters for products with huge long-tail retrieval workloads.
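The tiering described above can be sketched as a toy read path: check memory, then NVMe, then fall back to object storage and promote the namespace on the way back. This is a simplified model under stated assumptions, not turbopuffer's implementation. The class name, the LRU eviction policy, the slot counts, and the use of plain dicts to stand in for each tier are all hypothetical.

```python
from collections import OrderedDict


class TieredNamespaceCache:
    """Toy read path: memory -> NVMe -> object storage (tiers simulated in-process)."""

    def __init__(self, object_store, memory_slots=2, nvme_slots=4):
        self.object_store = object_store  # stands in for S3/GCS-style storage
        self.memory = OrderedDict()       # hottest namespaces, in LRU order
        self.nvme = OrderedDict()         # warm namespaces, in LRU order
        self.memory_slots = memory_slots
        self.nvme_slots = nvme_slots

    @staticmethod
    def _put(tier, slots, key, value):
        """Insert into a tier; return the evicted (key, value) pair, if any."""
        tier[key] = value
        tier.move_to_end(key)
        if len(tier) > slots:
            return tier.popitem(last=False)  # evict least recently used
        return None

    def read(self, namespace):
        """Return (data, tier_that_served_the_read)."""
        if namespace in self.memory:
            self.memory.move_to_end(namespace)
            return self.memory[namespace], "memory"
        if namespace in self.nvme:
            data, source = self.nvme.pop(namespace), "nvme"
        else:
            data, source = self.object_store[namespace], "object_storage"
        # Promote to memory; demote whatever memory evicts down to NVMe.
        evicted = self._put(self.memory, self.memory_slots, namespace, data)
        if evicted is not None:
            self._put(self.nvme, self.nvme_slots, *evicted)
        return data, source


store = {"a": 1, "b": 2, "c": 3}
cache = TieredNamespaceCache(store, memory_slots=1, nvme_slots=1)
first = cache.read("a")   # cold read from object storage
second = cache.read("a")  # now served from memory
```

Namespaces that fall out of both cache tiers are simply re-read from object storage, so the long tail costs only object storage capacity while active namespaces stay fast.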

AI Use Overview:

turbopuffer provides the retrieval layer for AI products, storing vectors, text, and metadata so agents and assistants can search large customer corpora cheaply and quickly.

More Similar Companies

Byteport

Makes massive file transfers 10x faster so teams stop deleting data they can't afford to move.

Robotics teams delete 96% of their sensor data because they cannot move it fast enough. Byteport's DART protocol achieves 1500x faster transfer than TCP for large files, which turns a data bottleneck into a data asset for any team that generates more than it can ship.

Captain

Delivers 95%+ accurate knowledge search across unstructured enterprise data, beating standard RAG.

RAG accuracy plateaus around 80% for most implementations. Captain claims 95%+ by running parallel LLM queries across document chunks and aggregating results, which is a brute-force approach that works if the orchestration is fast enough. SOC 2 certified.

EigenPal

Automates enterprise document workflows with 93% straight-through processing from just 3-5 samples.

Most document AI requires hundreds of labeled examples. EigenPal reaches 93% straight-through automation from 3-5 samples, which means regulated enterprises (banks, insurers) can deploy on new document types in hours instead of months.

Human Archive

Captures 8,000 hours/day of multimodal human activity data to train the next generation of robots.

Robotics foundation models are data-starved. Human Archive has 50,000+ contributors wearing custom sensor rigs across homes, restaurants, hotels, and construction sites, capturing 8,000 hours/day of synchronized video, depth, and tactile data. Scale AI for embodied AI.
