A privacy-first wearable AI that listens only when pressed, with no always-on mic or cloud required. It combines on-device speech-to-text with under-500ms latency, intent classification across a voice-app ecosystem, and federated speaker adaptation that learns without uploading audio.

Technology | Consumer Hardware | YC W26

Last Updated: March 19, 2026

Button Computer builds a privacy-first wearable AI button that clips to clothing and delivers instant, voice-activated access to AI, listening only when physically pressed. The company is focused on making conversational computing accessible to everyone.
Button Computer has announced pre-orders, with U.S. shipping planned for the end of 2026. The product emphasizes privacy (listening only when pressed) as a key differentiator from always-on devices, targeting mainstream consumers who want AI access without privacy concerns.
Both founders spent years at Apple working on Vision Pro, giving them deep hardware and software integration expertise. The privacy-first approach differentiates the product from competitors like the Humane AI Pin and Rabbit R1. A seed round is expected post-Demo Day.
On-device, press-to-talk speech recognition that converts spoken queries into text with sub-second latency on a power-constrained wearable.
The button instantly understands what you say without ever sending your voice to the cloud.
Button Computer's core ML challenge is running high-accuracy automatic speech recognition (ASR) entirely on a tiny, battery-powered wearable. The device likely employs a heavily quantized transformer or Conformer-based ASR model, compressed via pruning and INT8/INT4 quantization to fit within the memory and compute constraints of an embedded neural processing unit (NPU). By performing inference on-device, Button eliminates round-trip cloud latency and ensures that raw audio never leaves the hardware: a critical privacy guarantee. The dual-microphone array feeds a beamforming pipeline that isolates the user's voice from ambient noise before passing clean audio frames to the ASR model. This architecture enables the sub-second response times the company advertises while preserving battery life by duty-cycling the NPU, which powers up only while the physical button is pressed. The press-to-activate design also eliminates the need for an always-on wake-word detection model, further reducing power draw and false activations.
It's like having a translator in your pocket who only wakes up when you tap their shoulder—lightning fast, never eavesdropping.
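Button Computer has not published its model formats, but the kind of post-training INT8 quantization speculated above can be illustrated in a few lines. The sketch below (toy weights, symmetric per-tensor scaling, all values illustrative) shows how FP32 weights are mapped to one-byte integers plus a single scale factor, cutting memory 4x while bounding the reconstruction error by half a quantization step:

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization: w ≈ scale * q, q in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate FP32 weights from the integer codes."""
    return [x * scale for x in q]

weights = [0.42, -1.27, 0.03, 0.9, -0.5]   # toy layer weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each weight now occupies one byte instead of four; the worst-case
# round-off error is scale / 2 per weight.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

A real deployment would apply this per-channel and pair it with pruning and calibration data, but the memory/accuracy trade-off works the same way at NPU scale.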
Voice App intent classification and routing engine that understands user requests and dispatches them to the correct first- or third-party voice application in real time.
The button figures out what you're asking for and instantly sends you to the right mini-app—like a smart receptionist for your voice.
Button Computer's "Voice Apps" platform requires a lightweight but highly accurate natural language understanding (NLU) pipeline that classifies user intent and extracts entities from transcribed speech in real time. The system likely uses a compact transformer-based classifier (e.g., DistilBERT or a custom small language model) fine-tuned on a growing taxonomy of Voice App categories—timers, translations, search, note-taking, smart home control, and third-party skills. When a user presses the button and speaks, the ASR output is passed to the NLU model, which predicts the top-k intents with confidence scores. High-confidence matches are routed directly to the appropriate Voice App; ambiguous queries trigger a brief clarification prompt. As the Voice App ecosystem grows, the intent taxonomy will expand, requiring continual few-shot or zero-shot learning techniques so new apps can be integrated without retraining the full model. Entity extraction (dates, names, locations, amounts) is handled by a lightweight NER module running in tandem. This architecture is what transforms Button from a simple voice assistant into a platform—mirroring how smartphone OS intent systems route actions to installed apps.
It's like a hotel concierge who instantly knows whether you need the restaurant, the spa, or a taxi—just from the first few words out of your mouth.
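The confidence-thresholded routing described above can be sketched in miniature. This is a hypothetical illustration, not Button's implementation: a production system would use a compact transformer classifier, whereas this toy scores intents by keyword overlap. The app names, keyword sets, and threshold are all invented for the example:

```python
# Toy Voice App taxonomy: intent -> keywords (hypothetical; a real system
# would learn these representations rather than hand-code them).
VOICE_APPS = {
    "timer":     {"set", "timer", "minutes", "remind"},
    "translate": {"translate", "say", "spanish", "french"},
    "search":    {"who", "what", "when", "search", "find"},
}
CONFIDENCE_THRESHOLD = 0.5  # below this, ask the user to clarify

def classify(transcript, k=2):
    """Return the top-k (intent, confidence) pairs for a transcript."""
    tokens = set(transcript.lower().split())
    scores = {app: len(tokens & kw) / len(tokens) for app, kw in VOICE_APPS.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

def route(transcript):
    """Dispatch high-confidence intents directly; otherwise clarify."""
    (top_app, top_score), *_ = classify(transcript)
    if top_score >= CONFIDENCE_THRESHOLD:
        return top_app        # hand off to the matching Voice App
    return "clarify"          # ambiguous: trigger a follow-up prompt
```

The clarification fallback matters as the taxonomy grows: adding a new Voice App only requires registering its intent signature, mirroring how smartphone OSes register app intents without retraining anything global.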
Personalized on-device speaker adaptation that continuously improves speech recognition accuracy for each individual user's voice, accent, and vocabulary without uploading audio data.
The more you use your button, the better it understands you—and it learns everything about your voice without ever sharing it.
One of Button Computer's most novel ML applications is likely an on-device speaker adaptation system that personalizes the ASR model to each user's unique voice characteristics, accent, speaking pace, and domain-specific vocabulary. Using federated or fully local learning techniques, the device fine-tunes a small adapter layer (e.g., LoRA-style low-rank adaptation) on top of the base ASR model using only the user's own speech data stored ephemerally on-device. Over time, the model learns to better recognize the user's pronunciation patterns, frequently used proper nouns (contact names, brand names, jargon), and preferred phrasing. Because all adaptation happens on-device, no raw audio or personal speech data is ever transmitted—aligning with Button's core privacy promise. Aggregated, differentially private model updates could optionally be sent back to improve the global base model without compromising individual privacy (classic federated learning). This creates a powerful flywheel: the more a user talks to their Button, the more accurate and personalized it becomes, increasing engagement and reducing churn—a critical retention lever for the $7.99/month subscription model.
It's like a barista who memorizes your complicated coffee order after just a few visits—except this one never gossips about it.
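The LoRA-style adaptation speculated above keeps the base ASR weights frozen and trains only a low-rank correction, which is why it fits on-device. A minimal sketch, assuming a rank-1 adapter on a 2x2 layer (all shapes and values illustrative; real layers are far larger):

```python
def matmul(X, Y):
    """Plain nested-list matrix multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    """Elementwise matrix addition."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base ASR weight (2x2), never updated
A = [[0.1], [0.2]]             # trainable down-projection (2x1), rank r = 1
B = [[0.5, -0.5]]              # trainable up-projection (1x2)

# Effective weight used at inference: W_eff = W + A @ B.
# Only A and B are fine-tuned on the user's speech, stored ephemerally
# on-device; for a 1024x1024 layer with r = 8, the adapter is ~16k
# parameters versus ~1M frozen ones.
W_eff = add(W, matmul(A, B))
```

For the optional federated step the profile mentions, only the small A/B updates (noised for differential privacy) would ever leave the device, never audio or transcripts.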
Both founders worked on Apple Vision Pro, combining deep experience in spatial computing, hardware-software integration, and consumer product design. Chris's venture-partner background adds an investor network. The physical press-to-talk mechanism provides a genuine privacy guarantee that software-only solutions cannot match.