
Technology | AI Research | YC W26 | Valuation: Undisclosed (non-profit)

Last Updated: March 24, 2026

A non-profit foundation that maintains the ARC-AGI benchmark, a test designed to measure genuine machine intelligence by evaluating fluid reasoning and abstraction. Runs annual competitions with significant cash prizes and has become the industry-standard benchmark for frontier AI labs evaluating progress toward AGI.
ARC-AGI-2 was released in 2025, designed to be harder and more resistant to brute-force approaches. The ARC Prize 2025 competition drew 1,455 teams and 15,154 entries; the top Kaggle score reached 24% on the ARC-AGI-2 private evaluation set at $0.20/task. The competition received 90 paper submissions (up from 47 in 2024) and awarded over $125,000 in prizes. ARC-AGI-3 has been publicly released with interactive reasoning challenges requiring exploration, planning, memory, goal acquisition, and alignment. The foundation is building an academic network and a coalition of frontier AI labs.
Chollet previewed ARC-AGI-3 at YC Startup School. The ARC Prize 2025 Technical Report was published on arXiv (Jan 2026); its central theme for 2025 progress was refinement loops. All four frontier labs now report ARC-AGI scores on public model cards, and OpenAI CEO Sam Altman signaled intent to partner on future benchmarks (Dec 2024). The Grand Prize remains unclaimed.
Maintains and evolves the ARC-AGI benchmark to evaluate whether AI systems can perform genuine abstract reasoning on novel tasks they've never seen before.
It's an IQ test for AI that checks if machines can actually think creatively instead of just memorizing answers.
It's like giving a genius parrot a Rubik's Cube — sure, it can repeat everything you've ever said, but can it actually solve a new puzzle it's never seen before?
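Concretely, ARC-AGI tasks are small grids of color codes distributed as JSON: a few "train" demonstration pairs that establish a transformation rule, plus held-out "test" inputs the solver must answer. A minimal sketch in Python (the train/test JSON layout and integer color codes follow the public ARC-AGI data format; the specific task and the `mirror` rule here are invented for illustration):

```python
# A toy ARC-style task in the public ARC-AGI JSON layout:
# grids are lists of rows of integers 0-9 (color codes).
# The transformation rule (mirror each row) is made up for this example.
task = {
    "train": [
        {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
        {"input": [[3, 0, 0]], "output": [[0, 0, 3]]},
    ],
    "test": [
        {"input": [[5, 0], [0, 7]]},
    ],
}

def mirror(grid):
    """Candidate program: reverse every row left-to-right."""
    return [row[::-1] for row in grid]

# A candidate rule must reproduce every demonstration pair...
assert all(mirror(p["input"]) == p["output"] for p in task["train"])

# ...and is then scored on the held-out test input.
print(mirror(task["test"][0]["input"]))  # [[0, 5], [7, 0]]
```

The point of the format is that each task's rule is novel, so a solver cannot rely on memorized answers; it has to induce the rule from two or three examples, which is exactly the fluid abstraction the benchmark measures.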
Runs large-scale open competitions with $1M+ prize pools to crowdsource novel algorithmic approaches to general intelligence that go beyond current deep learning paradigms.
They're offering a million-dollar bounty to anyone who can build an AI that's actually smart, not just well-read.
It's like DARPA's Grand Challenge but for building a brain — throw enough prize money at the world's smartest nerds and eventually someone figures out how to make a car drive itself, or in this case, how to make AI actually reason.
Develops progressively harder benchmark versions (ARC-AGI-2 and beyond) that adapt as AI capabilities improve, ensuring the benchmark remains a meaningful measure of intelligence rather than becoming saturated like previous AI tests.
They keep making the test harder so AI companies can't just claim their chatbot is a genius because it aced last year's easy exam.
It's like how the SAT keeps getting redesigned because prep companies crack the old version — except here the stakes are whether we actually achieve artificial general intelligence or just build really convincing fakers.
François Chollet created Keras (adopted by 2.5M+ developers) and is one of the most-cited AI researchers (his Xception paper has 18,000+ citations). He left Google in Nov 2024 after 9+ years, published "On the Measure of Intelligence" (2019), which introduced the ARC benchmark, and also co-founded Ndea (YC W26). Mike Knoop is a co-founder of Zapier, the largest AI automation company. Greg Kamradt is President of the foundation. The benchmark is already adopted by OpenAI, Anthropic, Google DeepMind, and xAI.