Automates COBOL modernization with formally verified AI that guarantees translated code is correct.
It combines LLM-driven code translation with TLA+ formal verification, reinforcement learning for code model optimization, and automated documentation generation.

Code Intelligence | YC W26

Last Updated: March 19, 2026

Builds an AI-powered mainframe modernization platform using RL, LLMs, and formal verification (TLA+) to automate documentation, translation, and verification of legacy codebases like COBOL for regulated enterprises.
Rosetta AI suite for automated legacy code documentation and COBOL translation. Open-sourced Specula (TLA+ spec synthesis), Mobol (mainframe transactions), OR-bench-sample. 'Verifiable operational superintelligence.'
Formal verification tooling and operations research benchmarks signal investment in correctness infrastructure. RL environments for proprietary training. Enterprise financial services pilots imminent.
<p>AI-powered automated translation of legacy COBOL codebases into modern programming languages with formal correctness guarantees.</p>
It automatically rewrites ancient bank software into modern code and mathematically proves the new version does exactly the same thing.
Haladir Rosetta uses large language models fine-tuned on legacy and modern code pairs to automatically translate COBOL programs into modern languages such as Java, Python, or C#. Unlike generic AI code assistants, Rosetta integrates formal verification via TLA+ specifications synthesized by Specula to mathematically prove that the translated code preserves the original business logic, control flow, and data integrity. This is critical for regulated industries—banks, insurers, and government agencies—where a single mistranslation in a batch processing routine could cause millions in financial discrepancies or compliance violations. The system ingests entire COBOL codebases, builds dependency graphs, generates human-readable documentation, and produces verified modern equivalents, dramatically reducing the multi-year timelines and billion-dollar budgets typically associated with mainframe migrations. Reinforcement learning is used to iteratively improve translation quality by rewarding outputs that pass formal verification checks and penalizing those that introduce semantic drift.
It's like hiring a translator who not only converts an ancient legal contract from Latin to English but also has a notary stamp proving every clause means exactly the same thing.
<p>Building reinforcement learning environments with formally verified reward signals to train AI coding models that produce provably correct outputs.</p>
They built a training gym for AI where the AI only gets a gold star if a math proof confirms its code is actually correct.
Haladir constructs custom reinforcement learning environments where AI coding agents are trained to generate, transform, and optimize code under formal verification constraints. Traditional RL for code relies on unit tests or heuristic reward signals, which can miss subtle semantic errors. Haladir's approach uses TLA+ specifications as ground-truth oracles: the RL agent proposes a code transformation, Specula synthesizes a formal spec from both the original and transformed code, and a verification engine checks equivalence. The agent receives reward only when formal equivalence is confirmed, and is penalized for semantic drift, dead code introduction, or specification violations. This creates a training loop where models learn not just to produce syntactically correct code, but to preserve deep semantic properties—a requirement for safety-critical and financially sensitive systems. The verified datasets and RL environments are also offered as infrastructure for external AI research teams building their own code-generation models, positioning Haladir as a picks-and-shovels provider for the broader AI coding ecosystem.
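The reward scheme described above, where formal equivalence is the only source of positive reward, can be sketched like this. The names and the toy oracle are assumptions for illustration only; in the described system the oracle would be a TLA+ model check over Specula-synthesized specs, not a string comparison.

```python
def formally_equivalent(original: str, transformed: str) -> bool:
    """Toy oracle standing in for a TLA+ equivalence check between the
    specs synthesized from the original and transformed programs."""
    return sorted(original.split()) == sorted(transformed.split())


def verification_reward(original: str, transformed: str) -> float:
    # Reward comes only from a passed proof; anything else is penalized.
    # Unlike unit-test rewards, there is no partial credit that a subtly
    # wrong transformation could exploit.
    return 1.0 if formally_equivalent(original, transformed) else -1.0


def training_step(policy, program: str) -> float:
    """One step of the loop: the agent proposes a transformation and the
    verifier, not a heuristic, decides the reward."""
    candidate = policy(program)
    reward = verification_reward(program, candidate)
    # A real trainer would apply a policy-gradient update with this reward.
    return reward
```

Because the reward is binary and proof-backed, the agent cannot learn to game incomplete test suites; it either preserves the specified semantics or it is penalized.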
It's like training a self-driving car in a simulator where the laws of physics are mathematically perfect, so when it hits the real road, it already knows exactly how to behave.
<p>AI-driven automated documentation and business logic extraction from undocumented legacy mainframe codebases to preserve institutional knowledge.</p>
It reads millions of lines of ancient, uncommented bank code and writes a plain-English manual explaining what every part does and why.
Many enterprises running mainframe systems face a critical knowledge crisis: the original COBOL developers are retiring or have already left, and decades of business logic are embedded in millions of lines of undocumented, poorly structured code. Haladir Rosetta addresses this by using large language models to ingest entire legacy codebases, parse control flow and data dependencies, identify business rules, and generate structured, human-readable documentation at multiple levels of abstraction—from high-level system architecture summaries to line-by-line annotations. The system cross-references code patterns against known mainframe idioms (batch processing, CICS transactions, JCL job flows) to produce contextually accurate explanations. Formal methods ensure that the documented business logic is consistent with actual code behavior, not just an LLM's best guess. This documentation becomes the foundation for modernization planning, compliance audits, and onboarding new engineering talent, transforming opaque legacy systems into transparent, manageable assets. The output also feeds back into Haladir's training pipeline, creating a virtuous cycle where better documentation improves model understanding for future translation and verification tasks.
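The dependency-graph step described above can be illustrated with a toy extractor that maps each COBOL paragraph to the paragraphs it `PERFORM`s. This is a deliberately naive sketch (a real system needs a full COBOL parser and would also track `CALL`, `GO TO`, CICS, and JCL relationships); the function name and regexes are illustrative assumptions.

```python
import re
from collections import defaultdict


def call_graph(cobol: str) -> dict[str, set[str]]:
    """Toy dependency extraction: map each COBOL paragraph to the
    paragraphs it PERFORMs. A production tool needs a real parser."""
    graph: dict[str, set[str]] = defaultdict(set)
    current = None
    for line in cobol.splitlines():
        stripped = line.strip()
        # A line like "MAIN-LOGIC." opens a new paragraph.
        header = re.match(r"^([A-Z0-9-]+)\.\s*$", stripped)
        if header:
            current = header.group(1)
            graph.setdefault(current, set())
            continue
        if current:
            for target in re.findall(r"PERFORM\s+([A-Z0-9-]+)", stripped):
                graph[current].add(target)
    return dict(graph)
```

Each node in the resulting graph then becomes a natural unit for LLM summarization, and traversal order (callees before callers) lets low-level annotations feed the higher-level architecture summaries.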
It's like an archaeologist who can read every hieroglyph in a pyramid, then write a guidebook so clear that a tourist could rebuild it from scratch.
Uniquely combines formal verification (TLA+), RL, and operations research to guarantee correctness in AI code transformation. Regulated industries need mathematical proof that translated code preserves business logic.