Compresr

Product & Competitive Intelligence

Cuts LLM input costs by up to 76% while actually improving accuracy through context compression.

Company Overview

Builds an LLM-native context compression API gateway that uses ML-driven relevance filtering and token-level distillation to reduce LLM input costs by up to 76% while improving accuracy.

Competitive Advantage & Moat

Product Roadmap & Public Announcements

  • Context-Gateway API (latte_v1) with coarse- and fine-grained compression.
  • 10x compression on SEC filings via FinanceBench.
  • Open-source Context Gateway proxy for Claude Code, Cursor, and OpenClaw.
  • Python SDK (pip install compresr).
  • Web dashboard for session monitoring, spend caps, and Slack notifications.

Signals & Private Analysis

  • Zero-latency background compaction, pre-computed summaries, and logging/history compaction.
  • Show HN positioning and agentic proxy architecture suggest expansion into persistent memory management for autonomous agents.
  • Competitive risk from LLMLingua (Microsoft Research) and native model provider compression.

Product Roadmap Priorities

Token-level context distillation (Improving | Cost Reduction | Data)

Compresses massive SEC filings (230K+ tokens) down to ~10.5K tokens for LLM analysis, improving accuracy while cutting costs by 76%.

In Plain English

It's like hiring a brilliant paralegal who reads a 500-page SEC filing and hands the analyst only the 25 pages that actually matter—faster, cheaper, and somehow more accurate.

Analogy

It's like Marie Kondo organizing your filing cabinet before your accountant arrives—less clutter, better answers, and your accountant charges by the hour.

Agentic memory management (Improving | Operational Efficiency | Engineering)

Provides real-time, zero-latency background compaction of agent conversation histories and tool outputs to keep autonomous agents within context limits during long-running tasks.

In Plain English

It's like giving an AI agent a photographic memory that automatically forgets the boring parts so it can keep working on hard problems without losing its train of thought.

Analogy

It's like a road trip GPS that remembers every important turn you need but forgets all the straight highway stretches—so it never runs out of memory mid-journey.
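The compaction behavior described above can be illustrated with a short sketch (assumed approach — Compresr's actual strategy is not public): once an agent's history exceeds its token limit, collapse the oldest turns into a compact digest while keeping recent turns verbatim. The truncation-based summarizer and all names here are illustrative stand-ins; a production system would summarize with a model, off the request path, to keep added latency at zero.

```python
# Sketch of background history compaction for a long-running agent.
# A first-clause digest stands in for a real model-generated summary.

def token_count(text: str) -> int:
    return len(text.split())  # crude stand-in for a real tokenizer

def compact(history: list[str], limit: int, keep_recent: int = 2) -> list[str]:
    """Collapse the oldest turns into one digest line once `limit` tokens are exceeded."""
    if sum(token_count(t) for t in history) <= limit:
        return history  # under budget: nothing to do
    old, recent = history[:-keep_recent], history[-keep_recent:]
    digest = "; ".join(t.split(".")[0] for t in old)  # keep each old turn's first clause
    return [f"[summary of {len(old)} earlier turns: {digest}]"] + recent

history = [
    "User asked to refactor the billing module. Agent listed 14 files.",
    "Agent ran the test suite. Three tests failed in invoice_test.py.",
    "Agent fixed the rounding bug. All tests now pass.",
    "User asked for a changelog entry.",
]
print(compact(history, limit=25))
```

Because the check runs on every turn, the agent's working context stays bounded no matter how long the task runs.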

Codebase context compression (Improving | Product Differentiation | Product)

Compresses large codebases and repository context fed to LLM-powered coding assistants in IDEs, enabling more accurate code generation and review within token budgets.

In Plain English

It's like giving your AI coding assistant the ability to skim an entire codebase and pull up only the exact files and functions it needs to write your feature—instead of trying to read the whole repo at once and getting confused.

Analogy

It's like a librarian who knows exactly which three books on the shelf answer your question, instead of dumping the entire library on your desk and saying "good luck."
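One plausible mechanism behind this — sketched here as an assumption, since Compresr's repo-compression internals are not public — is to index which file defines each symbol, then hand the assistant only the files whose symbols the task actually mentions. The regex-based indexer and all names below are hypothetical simplifications.

```python
# Sketch of codebase context selection for an LLM coding assistant.
# Indexes top-level Python function definitions, then keeps only the
# files whose defined symbols appear in the task description.
import re

def index_defs(repo: dict[str, str]) -> dict[str, str]:
    """Map each top-level `def name` to the file that defines it."""
    defs = {}
    for path, src in repo.items():
        for name in re.findall(r"^def (\w+)", src, flags=re.MULTILINE):
            defs[name] = path
    return defs

def select_context(repo: dict[str, str], task: str) -> dict[str, str]:
    """Return only the files defining symbols mentioned in the task."""
    defs = index_defs(repo)
    needed = {path for name, path in defs.items() if name in task}
    return {path: src for path, src in repo.items() if path in needed}

repo = {
    "billing.py": "def compute_invoice(items):\n    return sum(items)\n",
    "auth.py": "def login(user):\n    ...\n",
    "report.py": "def render_report(data):\n    ...\n",
}
task = "Fix the rounding bug in compute_invoice and update render_report."
print(list(select_context(repo, task)))
```

A real system would rank by learned relevance rather than exact symbol match, but the effect is the same: the assistant sees three files, not the whole repo.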


Key Team Members

  • Ivan Zakazov, Co-Founder & CEO
  • Berke Argin, Co-Founder & CAIO
  • Kamel Charaf, Co-Founder & COO
  • Oussama Gabouj, Co-Founder & CTO

Ivan Zakazov (CEO) researched LLM context compression during his PhD at EPFL, previously worked at Microsoft and Philips Research, and has published papers directly on prompt compression at EMNLP 2025 and NeurIPS 2024. Oussama Gabouj (CTO) researched efficient ML systems and prompt compression at EPFL's DLab and AXA, with a paper accepted at EMNLP 2025. Berke Argin (CAIO) holds an EPFL CS degree and previously worked at UBS. Kamel Charaf (COO) holds an EPFL Data Science Master's and previously worked at Bell Labs. With all four founders coming from EPFL, the team has deep technical credibility.

Funding History & Milestones

  • 2024-2025 | Founded by four EPFL graduates/researchers.
  • 2025 | Context-Gateway repo and latte_v1 API launched; FinanceBench results published.
  • 2026 | Accepted into Y Combinator W26 batch.

Competitors

  • Prompt Compression: LLMLingua (Microsoft Research), Selective Context.
  • Context Management: LangChain, LlamaIndex.
  • LLM Cost Optimization: Martian, Portkey, Helicone.
  • RAG: Cohere Rerank, Jina AI.