Human Archive

Product & Competitive Intelligence

Captures 8,000 hours/day of multimodal human activity data to train the next generation of robots.

Company Overview

Human Archive builds and operates large-scale, multimodal data-collection infrastructure that captures synchronized video, depth, tactile, and motion data from real-world human activities, providing high-fidelity training datasets for robotics and embodied-AI companies.

Competitive Advantage & Moat

Product Roadmap & Public Announcements

Human Archive has publicly announced that it is scaling multimodal data-capture operations across diverse real-world environments (homes, restaurants, hotels, retail, construction, horticulture, industrial settings) in India, with capacity for up to 8,000 hours of annotated data per day. The sensor suite includes egocentric RGB, stereo depth, tactile, IMU, and wrist-mounted cameras. The stated goal is to build the largest multimodal manual-labor dataset for robotics foundation models.
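A capture pipeline like the one described implies a synchronized, multi-stream record format. As a minimal sketch only: the schema, stream names, sample rates, and annotation fields below are illustrative assumptions, not a format Human Archive has published.

```python
from dataclasses import dataclass, field

# Hypothetical schema for one annotated capture segment.
# Modalities mirror the announced sensor suite (egocentric RGB,
# stereo depth, tactile, IMU, wrist camera); rates are guesses.

@dataclass
class StreamMeta:
    modality: str        # "rgb", "stereo_depth", "tactile", "imu", "wrist_cam"
    rate_hz: float       # nominal sample rate
    frame_count: int     # samples captured in this segment

@dataclass
class CaptureSegment:
    segment_id: str
    environment: str             # e.g. "restaurant", "construction"
    task_label: str              # annotated activity, e.g. "plating food"
    duration_s: float
    streams: list[StreamMeta] = field(default_factory=list)

    def hours(self) -> float:
        return self.duration_s / 3600.0

# Example: a 90-second egocentric segment carrying all five streams.
seg = CaptureSegment(
    segment_id="seg-0001",
    environment="restaurant",
    task_label="plating food",
    duration_s=90.0,
    streams=[
        StreamMeta("rgb", 30.0, 2700),
        StreamMeta("stereo_depth", 30.0, 2700),
        StreamMeta("tactile", 200.0, 18000),
        StreamMeta("imu", 400.0, 36000),
        StreamMeta("wrist_cam", 30.0, 2700),
    ],
)
print(f"{len(seg.streams)} streams, {seg.hours():.3f} h")  # → 5 streams, 0.025 h
```

At 8,000 hours/day, roughly 320,000 such 90-second segments would be ingested daily, which is why synchronization metadata matters as much as the raw streams.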

Signals & Private Analysis

Job postings for Hardware Engineer Associates in India signal continued investment in custom capture hardware. A 50,000+ contributor network and a 30-person operations team in India point to rapid operational scaling. The absence of senior ML hires suggests positioning as a data supplier rather than a model builder, potentially making the company an acquisition target for larger robotics or AI labs.

Product Roadmap Priorities

Multimodal Robotics Data Curation
Improving Product Differentiation (Engineering)

Providing large-scale, multimodal training datasets to robotics companies building foundation models for dexterous manipulation and physical interaction.

In Plain English

They film thousands of people doing everyday physical tasks with special sensor rigs so robot brains can learn how humans actually move and touch things.

Analogy

It's like hiring 50,000 people to wear GoPros and smart gloves while doing chores so robots can binge-watch humanity's how-to playlist.

World Model Training Data
Improving Decision Quality (Data)

Supplying annotated, environment-diverse datasets to AI labs developing world models and scene understanding for embodied agents.

In Plain English

They capture what dozens of different real-world spaces actually look and feel like so AI agents can understand kitchens, warehouses, and gardens without ever visiting one.

Analogy

It's like giving an AI a passport full of stamps instead of just a postcard from one lab—suddenly it knows what a construction site smells like, not just a kitchen.

Robotics Model Benchmarking
Improving Operational Efficiency (Product)

Enabling robotics startups to benchmark and evaluate model performance across diverse manipulation tasks and environments using standardized, real-world datasets.

In Plain English

They give robot makers a standardized test using real-world footage so everyone can fairly compare whose robot is actually smarter.

Analogy

It's the SAT for robots—finally, everyone takes the same test instead of grading themselves on their own homework.
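Conceptually, a standardized benchmark of this kind reduces to scoring every model on the same grid of tasks and environments. The sketch below illustrates one plausible aggregation; the tasks, environments, outcomes, and equal-weight scoring rule are assumptions for illustration, not Human Archive's methodology.

```python
from collections import defaultdict

# Hypothetical benchmark aggregation: mean success rate per model
# over a fixed grid of (task, environment) cells, each cell holding
# per-episode outcomes (True = success). All data is illustrative.
results = {
    ("model_a", "pick_place", "kitchen"):   [True, True, False, True],
    ("model_a", "pick_place", "warehouse"): [True, False, False, True],
    ("model_b", "pick_place", "kitchen"):   [True, True, True, True],
    ("model_b", "pick_place", "warehouse"): [False, True, True, False],
}

def aggregate(results):
    """Mean success rate per model, weighting each (task, env) cell
    equally regardless of how many episodes it contains."""
    per_model = defaultdict(list)
    for (model, task, env), episodes in results.items():
        per_model[model].append(sum(episodes) / len(episodes))
    return {m: sum(cells) / len(cells) for m, cells in per_model.items()}

scores = aggregate(results)
for model, score in sorted(scores.items()):
    print(f"{model}: {score:.3f}")
# → model_a: 0.625
# → model_b: 0.750
```

Weighting each cell equally keeps an environment with many recorded episodes from dominating the aggregate, which is the point of a standardized grid.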

Tactile Sensing Data
Improving Product Differentiation (Engineering)

Providing tactile and force-feedback datasets to companies developing dexterous robotic hands and haptic systems.

In Plain English

They record what it actually feels like to pick up, squeeze, and handle hundreds of different objects so robotic hands can finally learn to not crush the tomato.

Analogy

It's like teaching a robot the difference between shaking hands and crushing a grape—by letting it study thousands of hours of humans doing both.

Key Team Members

  • Rushil Agarwal, Co-Founder
  • Samay Maini, Co-Founder
  • Shloke Patel, Co-Founder
  • Raj Patel, Co-Founder

Human Archive's operationalized network of 50,000+ real-world data contributors, paired with custom multi-sensor hardware rigs deployed across dozens of environment types, creates a proprietary, hard-to-replicate dataset moat for embodied AI. No current competitor matches it in scale, diversity, or annotation depth.

Funding History

  • 2025 | Founded by Rushil Agarwal, Samay Maini, Shloke Patel, and Raj Patel.
  • 2025 | $500K Pre-Seed from Y Combinator (W25 batch).
  • 2025-2026 | Scaled to 50,000+ contributor network and 8,000 hours/day capture capacity.

Competitors

  • Data Labeling: Scale AI, Labelbox (general data labeling/annotation at scale).
  • Robotics Data: Open X-Embodiment (open-source consortium), Covariant (proprietary pipelines), Bridge Data / RT-2 datasets (academic).
  • Foundation Models: Skild AI, Physical Intelligence (π) (robotics foundation models with proprietary data).
  • Internal: Everyday Robots / Google DeepMind (internal robotics data collection).