Context Compression Through MCP

Compressify strips your AI context down to what actually matters - cutting token waste without losing signal.

What is Compressify?

Compressify is an MCP server that sits between your AI assistant and its context window. It removes redundancy, collapses repetition, and surfaces only the information that actually drives better answers.
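As a rough illustration of what "collapsing repetition" can mean, here is a minimal sketch of exact-match deduplication over the lines of a context. It is not Compressify's algorithm, which this page does not describe. The function name `collapse_repeats` and the sample text are hypothetical.

```python
def collapse_repeats(context: str) -> str:
    """Drop repeated lines, keeping the first occurrence of each.

    Naive illustration only: lines are compared after normalizing
    whitespace and case, and blank lines are discarded.
    """
    seen = set()
    kept = []
    for line in context.splitlines():
        key = " ".join(line.split()).lower()  # normalized comparison key
        if not key or key in seen:
            continue  # skip blank lines and lines already emitted
        seen.add(key)
        kept.append(line.strip())
    return "\n".join(kept)


sample = """Tool result: status OK
Tool result: status OK
User asked for the Q3 report
tool result:  status ok
"""
print(collapse_repeats(sample))
```

A production compressor would need to weigh relevance and meaning, not just exact repeats; this sketch only shows the simplest form of redundancy removal.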

The problem

The main cost driver in modern AI is the ever-growing context exchanged across agent workflows, tool calls, and retrieval. Worse, bloated context doesn't just cost more - it degrades agent performance. Models fed irrelevant context hallucinate more and lose focus on the task.

As teams move from simple prompts to RAG pipelines and fully agentic systems, token consumption multiplies - not by 2x, but by orders of magnitude.
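One way to see why consumption multiplies is a simplified model, stated here as an assumption rather than a measurement: an agent that resends its full history on every step. If each step adds a fixed number of tokens, the input tokens over a whole task grow quadratically with the number of steps. The function name and the figures below are hypothetical.

```python
def total_input_tokens(steps: int, tokens_per_step: int) -> int:
    """Total input tokens when every step resends the whole history.

    Illustrative model only: step i sends i * tokens_per_step tokens,
    so the total is tokens_per_step * steps * (steps + 1) / 2.
    """
    return tokens_per_step * steps * (steps + 1) // 2


# A single prompt versus a 5-step agent run, 1,000 new tokens per step.
single = total_input_tokens(1, 1_000)
agent = total_input_tokens(5, 1_000)
print(single, agent, agent // single)  # 1000 15000 15
```

Real workloads differ (caching, truncation, and variable step sizes all change the curve), but the model shows how a handful of steps is enough to multiply token usage.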

Why this persists

LLM providers earn per token, so they have zero incentive to reduce consumption. Larger context windows increase their revenue rather than reduce it.

Token consumption per task (chart): traditional prompt, RAG + retrieval, and agent workflow, with a 10-50x increase marked.

The solution

Compressify sits between your application and the LLM provider. It analyzes, prioritizes, and compresses context - removing noise that inflates cost and weakens model focus.

Pure algorithm, no ML model. 55% token reduction, 11x faster than comparable compression models. Runs on 1 CPU / 2 GB RAM. Zero architecture changes needed on your end.

Preserve semantic fidelity

Drop-in MCP middleware - no code changes

Measurable, transparent savings per request

Token flow

Application (100K tokens / day) -> Compressify (55% compression) -> AI agent (45K tokens / day, 55% lower usage)
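The figures in the token flow above follow from simple arithmetic: removing 55% of 100K tokens leaves 45K. A minimal sketch of that calculation, with a hypothetical helper name and integer percentages to avoid floating-point rounding:

```python
def tokens_after_compression(tokens: int, reduction_pct: int) -> int:
    """Tokens remaining after removing reduction_pct percent of them."""
    return tokens * (100 - reduction_pct) // 100


daily_in = 100_000
daily_out = tokens_after_compression(daily_in, 55)
print(daily_out)             # 45000
print(daily_in - daily_out)  # 55000 tokens saved per day
```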

Vision

AI agents are becoming core enterprise infrastructure. Every reasoning step, tool call, and memory update scales context - and cost - by orders of magnitude. SOMA is building the compression layer between every application and every LLM.

Our roadmap extends from today's CPU-native algorithms to GPU-based compression models trained on domain-specific data, with dedicated support for Chain-of-Thought preservation where logical structure matters most.

For regulated industries, SOMA will run on-premise inside Trusted Execution Environments, ensuring that uncompressed data never leaves the customer's security boundary.

The team

We build tools that make AI more efficient. We have shipped context-handling infrastructure at scale and believe token cost is an engineering problem.

Piotr Barbachowski

Founder & Managing Partner

Founded Dendrite in 2022. Built it to 50 people and $40.5M in revenue in under 3 years. Now applying the same playbook to SOMA.

Aleksander Muszynski

CEO

Previously led product development of an AI coding agent at Dendrite, with hands-on experience managing AI agent projects end-to-end. Combines deep technical understanding with the product intuition needed to build and ship fast - experience forged across Intel and Dendrite.

Mateusz Zadrozny

CTO

Deep expertise in NLP, compression algorithms, and production ML systems. Extensive experience designing and deploying AI agent architectures at scale. Core architect of SOMA.

Artur Walczak

Lead Algorithm Engineer & Math Expert

A laureate of the Polish Mathematical Olympiad and an algorithm specialist. Designs the validation core that powers SOMA.

Get in touch

Interested in early access or a technical walkthrough? Drop us a message and we will get back to you within one business day.