Replaces manual QA with AI that reads your code, understands changes, and writes tests per PR.
It combines intent-aware test synthesis from code diffs, semantic code-graph modeling across repositories, and natural-language test generation so any developer can build test suites quickly.

Technology | Developer Tools | YC W26

Last Updated: March 20, 2026

Builds AI-powered QA that deeply understands source code. Connects to the codebase, reads diffs, understands intent behind changes, then generates and executes end-to-end tests on every pull request. Aimed at replacing manual QA engineering with AI that knows the code.
Per a Hacker News Launch HN post (March 2026), Canary connects to codebases and understands app architecture (routes, controllers, validation logic). On PR push, it reads diffs, understands intent, and generates/executes targeted tests. Currently in closed beta/design partner phase.
The ex-Windsurf and ex-Google team brings deep developer tooling expertise. Code-aware QA is a growing category as AI agents make code changes faster than humans can test. Likely expanding from web app testing to broader coverage.
<p>On every pull request, Canary reads the code diff, infers the developer's intent, and autonomously generates targeted end-to-end tests that cover affected user flows.</p>
Instead of a developer manually writing tests for every code change, Canary reads what you changed and writes the tests for you—like having a QA teammate who actually reads the code before testing.
Canary's most novel ML capability is its diff-aware autonomous test generation pipeline. When a developer opens a pull request, Canary ingests the code diff and uses large language models fine-tuned for code understanding to parse the semantic meaning of the changes, identifying affected routes, modified validation logic, altered API contracts, and downstream user flows. Rather than generating generic smoke tests, the system infers the developer's intent (e.g., "this change adds email validation to the signup form") and synthesizes targeted end-to-end test scenarios that exercise the specific functionality impacted. These tests run in real browsers against preview environments, and the results, including pass/fail status and video session replays, are posted directly to the PR as comments. Closing the feedback loop inside the developer's existing workflow removes the traditional bottleneck of waiting for a manual QA cycle. The approach differs fundamentally from record-and-playback tools because it operates at the code-semantics layer, which makes the generated tests less brittle and more closely aligned with actual application behavior.
It's like having a code reviewer who not only reads your pull request but also immediately runs every scenario a user might hit because of your change—before you even ask.
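The diff-to-tests pipeline can be sketched roughly as follows. This is a hypothetical illustration, not Canary's actual API: the function names (`parse_diff`, `infer_intent`, `generate_scenarios`) are invented, and simple rule-based stand-ins replace the LLM calls.

```python
import re
from dataclasses import dataclass, field

@dataclass
class DiffHunk:
    path: str
    added: list = field(default_factory=list)  # lines added in this file

def parse_diff(diff_text: str) -> list:
    """Extract changed files and their added lines from a unified diff."""
    hunks = []
    for line in diff_text.splitlines():
        if line.startswith("+++ b/"):
            hunks.append(DiffHunk(path=line[6:]))
        elif line.startswith("+") and not line.startswith("+++") and hunks:
            hunks[-1].added.append(line[1:])
    return hunks

def infer_intent(hunks: list) -> str:
    """Stand-in for an LLM call that summarizes what the change does."""
    text = " ".join(l for h in hunks for l in h.added)
    if "validate_email" in text:
        return "adds email validation to the signup form"
    return "modifies application logic"

def generate_scenarios(intent: str) -> list:
    """Stand-in for LLM test synthesis: map inferred intent to e2e scenarios."""
    if "email validation" in intent:
        return [
            "submit signup form with invalid email -> expect inline error",
            "submit signup form with valid email -> expect account created",
        ]
    return ["smoke-test the affected flow"]

diff = """\
+++ b/app/signup.py
+def validate_email(value):
+    return "@" in value
"""
hunks = parse_diff(diff)
intent = infer_intent(hunks)
scenarios = generate_scenarios(intent)
```

In a real system the two stand-ins would be model calls conditioned on the diff and the code graph, and the scenarios would be compiled into browser-executable tests whose results are posted back to the PR.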
<p>Canary continuously parses and semantically maps an entire codebase—routes, controllers, validation logic, and API schemas—to maintain a living model of application behavior that drives intelligent test coverage decisions.</p>
Canary builds a mental map of your entire app's code so it knows which parts matter most and tests them first—like a new engineer who instantly memorizes your whole codebase on day one.
Unlike traditional QA tools that operate at the UI or DOM layer with no awareness of the underlying application architecture, Canary constructs and continuously maintains a semantic graph of the entire connected codebase. Using a combination of static analysis (AST parsing, control-flow analysis, dependency resolution) and LLM-powered semantic understanding, the system maps routes to controllers, traces data-validation logic, catalogs API schemas and their consumers, and identifies critical user-facing flows. This living code model drives all downstream testing intelligence: it determines which tests to generate, which flows are highest-risk, where coverage gaps exist, and how a single code change propagates through the application. When new code is committed, the semantic graph is updated incrementally, so test generation always reflects the current state of the application. This deep structural understanding is what lets Canary generate tests that are contextually meaningful rather than superficially broad, and it is a significant technical moat: competitors would need to replicate both the code-parsing infrastructure and the LLM fine-tuning to achieve comparable depth.
It's like the difference between a GPS that only knows street names and one that understands traffic patterns, construction zones, and your daily commute—Canary doesn't just see your code, it understands how it all connects.
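A toy version of this route-to-logic mapping can be built with Python's `ast` module on Flask-style handlers. The decorator convention, sample source, and graph shape here are assumptions for illustration, not Canary's implementation:

```python
import ast
from collections import defaultdict

# Hypothetical application source: Flask-style @app.route handlers.
SOURCE = '''
@app.route("/signup")
def signup():
    validate_email(form.email)
    create_user(form)

@app.route("/login")
def login():
    check_password(form)
'''

def build_code_graph(source: str) -> dict:
    """Return {route: [helper functions called]} via static AST analysis."""
    graph = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        route = None
        for dec in node.decorator_list:
            # Match decorators of the form @<obj>.route("<path>")
            if (isinstance(dec, ast.Call)
                    and isinstance(dec.func, ast.Attribute)
                    and dec.func.attr == "route"
                    and dec.args
                    and isinstance(dec.args[0], ast.Constant)):
                route = dec.args[0].value
        if route is None:
            continue
        for inner in ast.walk(node):
            if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                graph[route].append(inner.func.id)
    return dict(graph)

graph = build_code_graph(SOURCE)
# A change touching validate_email can now be traced back to the /signup route,
# telling the test generator which user flow to exercise.
impacted = [route for route, calls in graph.items() if "validate_email" in calls]
```

A production system would add control-flow and dependency analysis, cover many frameworks, and update the graph incrementally per commit, but the core idea (static structure plus change-impact queries) is the same.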
<p>Developers describe desired test scenarios in plain English, and Canary's generative AI translates those descriptions into comprehensive, executable end-to-end test suites grounded in the actual codebase.</p>
You tell Canary what to test in plain English—like "make sure users can't check out with an expired credit card"—and it writes and runs the full test for you automatically.
Canary's natural-language test authoring changes how engineering teams interact with QA tooling. Instead of requiring developers to learn testing frameworks, write boilerplate setup/teardown code, and manually script browser interactions, Canary accepts plain-English descriptions of desired test scenarios (e.g., "Verify that a user with an expired subscription cannot access premium content and is redirected to the upgrade page"). The system uses large language models to parse the natural-language intent, then grounds the generated test against its semantic understanding of the actual codebase, mapping the described user flow to real routes, components, and validation logic. This grounding step is critical: it prevents the common failure mode of LLM-generated tests that look syntactically correct but do not reflect actual application behavior. The generated tests are fully executable in real browsers and can be triggered via PR comments, making the workflow conversational and iterative; developers can refine tests with follow-up natural-language instructions. This capability dramatically lowers the barrier to comprehensive test coverage, letting frontend engineers, backend engineers, and even product managers contribute to QA without specialized testing expertise, effectively democratizing quality assurance across the engineering organization.
It's like dictating a recipe to a chef who already knows your kitchen, your pantry, and your dietary restrictions—you say what you want, and they handle every detail.
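The grounding step can be illustrated with a minimal sketch. Here `translate_to_steps` is a rule-based stand-in for an LLM call, and the route set and emitted test shape are invented for the example; the point is that generated steps are checked against the code model before any test is produced:

```python
# Routes assumed to come from the semantic code graph of the application.
KNOWN_ROUTES = {"/signup", "/login", "/checkout"}

def translate_to_steps(description: str) -> list:
    """LLM stand-in: map an English scenario to (route, action, expectation)."""
    if "expired credit card" in description:
        return [("/checkout", "pay with card expiring 01/2020", "error shown")]
    return []

def ground(steps: list) -> list:
    """Keep only steps whose routes actually exist in the code model,
    rejecting hallucinated flows before a test is ever generated."""
    return [s for s in steps if s[0] in KNOWN_ROUTES]

def to_test_code(steps: list):
    """Emit an executable test skeleton from the grounded steps."""
    if not steps:
        return None
    lines = ["def test_scenario(page):"]
    for route, action, expectation in steps:
        lines.append(f"    page.goto('{route}')  # {action}")
        lines.append(f"    assert page.shows('{expectation}')")
    return "\n".join(lines)

steps = ground(translate_to_steps(
    "make sure users can't check out with an expired credit card"))
test_code = to_test_code(steps)
```

In Canary's described workflow the translation is done by an LLM and the grounding draws on the full semantic graph (routes, components, validation logic), but the same filter-before-generate structure is what keeps plain-English tests anchored to real application behavior.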
Viswesh brings experience from Windsurf (AI coding tool) and Google, providing deep understanding of both developer workflows and AI code comprehension. Building QA that understands code intent (not just syntax) is a meaningful technical differentiation.
2025: Viswesh founds Canary
2026: Y Combinator W26 batch