Which AI coding agents does Headroom support out of the box?

Headroom's compatibility matrix lists Claude Code, Codex, Cursor, Aider, GitHub Copilot CLI and OpenClaw as agents you can wrap with one command. Any OpenAI-compatible client also works through headroom proxy.

What token savings can I expect from Headroom?

Headroom's published benchmarks show 92% savings on code search and SRE incident debugging, 73% on GitHub issue triage and 47% on codebase exploration — measured on real agent traces. Accuracy is preserved on GSM8K (0.870 vs 0.870), TruthfulQA (+0.030) and standard QA / tool benchmarks. Numbers will vary with your content; reproduce with: python -m headroom.evals suite --tier 1.

Is Headroom local-only? Does my data leave my machine?

Headroom itself runs entirely locally — the compressor, the proxy, the MCP server and the CCR store all live on your hardware, and originals never leave your machine until your agent asks the LLM. Note that the compressed prompt still goes to whichever upstream provider you configure (Anthropic, OpenAI, Bedrock, Copilot, etc.), so the LLM provider sees your (compressed) request just as before.

Headroom Setup Guide: pip / npm / Docker Install, Wrap Claude Code / Codex / Cursor / Copilot (2026)

chopratejas/headroom is a local context compression layer for AI agents: it sits between your agent and the LLM, compresses everything the agent reads — tool outputs, logs, RAG chunks, files, conversation history — and forwards a much smaller prompt to the provider. The project's published benchmarks report 47–92% token savings on real agent traces, with reversible compression so the LLM can retrieve originals on demand. This guide walks you through installation and the four supported integration modes — library, proxy, agent wrap, and MCP — all based on Headroom's official README and docs.

Prerequisites
Install Headroom (pip / npm / Docker)
Pick a mode: wrap / proxy / library / MCP
GitHub Copilot CLI subscription mode
Verify token savings with headroom perf
Handy CLI commands & extras
FAQ

Step 1Prerequisites

Python 3.10+ with pip — if you install via pip; or
Node.js 18+ with npm — if you want the TypeScript/Node package; or
Docker — if you prefer a container.
An AI coding agent (optional but recommended for the wrap mode): Claude Code, Codex, Cursor, Aider, or GitHub Copilot CLI.

Step 2Install Headroom

Option A — pip (Python, recommended)

pip install "headroom-ai[all]"

The [all] extra pulls in every component. If you'd rather opt in à la carte, granular extras are available: [proxy], [mcp], [ml] (the Kompress-base model), [code], [memory], [relevance], [image], [agno], [langchain], [evals].

If you use pipx, pin an interpreter explicitly:

pipx install --python python3.13 "headroom-ai[all]"

Option B — npm (TypeScript / Node)

npm install headroom-ai

Option C — Docker

Pull the image and start the proxy container (port 8787 mapped to host):

docker pull ghcr.io/chopratejas/headroom:latest
docker run -p 8787:8787 ghcr.io/chopratejas/headroom:latest

Python version: The PyPI package requires Python 3.10+. If pip install fails with a version error, check python --version first.

Step 3Pick a mode: wrap / proxy / library / MCP

Headroom supports four integration shapes. Pick the one that matches how you already use your tools — you don't have to commit to just one.

Mode A — Agent wrap (one command, easiest)

Wrap a supported coding agent in a single command; Headroom handles the proxy lifecycle and the agent's launch arguments for you:

headroom wrap claude     # Claude Code (supports --memory, --code-graph)
headroom wrap codex      # Codex (shares memory with Claude)
headroom wrap cursor     # Cursor (prints config — paste it once)
headroom wrap aider      # Aider (starts proxy + launches)
headroom wrap copilot    # GitHub Copilot CLI (starts proxy + launches)

The compatibility matrix in Headroom's README also lists OpenClaw (installed as a ContextEngine plugin). Any OpenAI-compatible client can be used through the proxy mode below.

Mode B — Proxy (zero code changes, any language)

If your tool isn't on the wrap list, run Headroom as a local proxy and point your tool at it:

headroom proxy --port 8787

Anything that speaks the OpenAI-compatible API can be redirected to http://localhost:8787. The proxy applies the same compression pipeline used by headroom wrap.

Mode C — Library (inline in your code)

For programmatic use inside your own app, call compress(messages) directly. Python:

from headroom import compress
compressed = compress(messages, model="gpt-4o")

TypeScript / Node — the JS SDK delegates compression to a local Headroom proxy, so start the proxy first, then point the client at it:

headroom proxy --port 8787

import { compress } from "headroom-ai";
const compressed = await compress(messages, {
  model: "gpt-4o",
  baseUrl: "http://localhost:8787",
});

The README's integration table also lists drop-in shims for the Anthropic / OpenAI SDKs, the Vercel AI SDK, LiteLLM callbacks, LangChain, Agno, Strands and ASGI apps.

Mode D — MCP server (for Claude Desktop and other MCP clients)

headroom mcp install

This exposes Headroom's tools — headroom_compress, headroom_retrieve, headroom_stats — to any MCP-native client.

Step 4GitHub Copilot CLI subscription mode

Headroom can route GitHub Copilot CLI subscription traffic (not just bring-your-own-key) through the local proxy:

headroom wrap copilot --subscription -- --model gpt-4o

The wrapper resolves the account-specific Copilot API endpoint and prints it as COPILOT_PROVIDER_API_URL=... during launch, then routes the OpenAI-compatible Copilot CLI requests through Headroom before forwarding to GitHub Copilot's hosted API.

Auth note: macOS Keychain auth reuse has been smoke-tested. Windows Credential Manager, Linux Secret Service / secret-tool, and Docker/CI token-injection paths are implemented or planned as auth-discovery paths, but still need real OS validation before they should be considered fully vetted. On Docker or CI, prefer passing GITHUB_COPILOT_TOKEN or GITHUB_COPILOT_GITHUB_TOKEN explicitly rather than relying on host keychain access.

Step 5Verify token savings

Check Headroom is actually doing something useful for your workload:

headroom perf

This prints before/after token counts on representative agent traces. Headroom's published benchmarks (run with python -m headroom.evals suite --tier 1) show:

92% savings on code search and SRE incident debugging
73% savings on GitHub issue triage
47% savings on codebase exploration
Accuracy preserved on GSM8K (0.870 vs 0.870), TruthfulQA (+0.030), SQuAD v2 and BFCL

Your numbers will vary with content type and the upstream LLM — these are the project's published averages, not a guarantee for every workload.

Handy CLI commands & extras

headroom wrap <agent> — one-shot wrap for Claude Code / Codex / Cursor / Aider / Copilot / OpenClaw
headroom proxy --port 8787 — start the local OpenAI-compatible proxy
headroom perf — measure compression ratio on a representative workload
headroom mcp install — register the MCP server with MCP-native clients
headroom learn — dry-run: mine failed sessions and print proposed corrections. Add --apply to actually write them to CLAUDE.md / AGENTS.md / GEMINI.md
python -m headroom.evals suite --tier 1 — reproduce the published benchmarks locally

Cross-agent memory: Wrapping multiple agents (e.g. claude and codex) lets them share a deduplicated memory store, so context one agent learned is available to the others. Originals are kept locally via Headroom's CCR (reversible compression) — the LLM calls headroom_retrieve when it needs the full value.

FAQ

How do I install Headroom?

Pick one: pip install "headroom-ai[all]" for Python (3.10+ required), npm install headroom-ai for TypeScript / Node (Node 18+), or pull and run the container with docker pull ghcr.io/chopratejas/headroom:latest && docker run -p 8787:8787 ghcr.io/chopratejas/headroom:latest.

Which agents does `headroom wrap` support?

Claude Code, Codex, Cursor, Aider, GitHub Copilot CLI and OpenClaw, per the README's compatibility matrix. Anything OpenAI-compatible also works via headroom proxy.

How does it work with GitHub Copilot?

headroom wrap copilot --subscription -- --model gpt-4o intercepts Copilot CLI's OpenAI-compatible requests and routes them through the local proxy before forwarding to GitHub's hosted Copilot API.

How much will I save?

Headroom's published benchmarks show 47–92% token reduction depending on workload, with accuracy preserved on GSM8K, TruthfulQA, SQuAD v2 and BFCL. Run headroom perf to measure on your own traces.

Is it local-only?

The compressor, proxy, MCP server and the CCR original-store all run on your machine. The compressed prompt still goes to whichever upstream LLM provider you configure, so the provider sees the compressed request just as it would any other.

Where are the full docs?

This guide covers install and the four integration modes. For architecture, CCR internals, the Kompress-base model card and provider-specific notes, see the official sources: github.com/chopratejas/headroom and headroom-docs.vercel.app/docs.

This guide is based on Headroom's public materials (GitHub: chopratejas/headroom and its README, plus headroom-docs.vercel.app). Commands, package names, ports, savings figures and the agent compatibility matrix follow the project's official documentation; Headroom is an active project, so defer to the latest official docs if anything differs. Apache-2.0 licensed. Written by NGJOO AI Lab, updated 2026-06-08.

Headroom Setup Guide: pip / npm / Docker Install & Wrap Your Coding Agents

Contents

Step 1Prerequisites

Step 2Install Headroom

Option A — pip (Python, recommended)

Option B — npm (TypeScript / Node)

Option C — Docker

Step 3Pick a mode: wrap / proxy / library / MCP

Mode A — Agent wrap (one command, easiest)

Mode B — Proxy (zero code changes, any language)

Mode C — Library (inline in your code)

Mode D — MCP server (for Claude Desktop and other MCP clients)

Step 4GitHub Copilot CLI subscription mode

Step 5Verify token savings

Handy CLI commands & extras

FAQ

How do I install Headroom?

Which agents does `headroom wrap` support?

How does it work with GitHub Copilot?

How much will I save?

Is it local-only?

Where are the full docs?

Headroom Setup Guide: pip / npm / Docker Install & Wrap Your Coding Agents

Contents

Step 1Prerequisites

Step 2Install Headroom

Option A — pip (Python, recommended)

Option B — npm (TypeScript / Node)

Option C — Docker

Step 3Pick a mode: wrap / proxy / library / MCP

Mode A — Agent wrap (one command, easiest)

Mode B — Proxy (zero code changes, any language)

Mode C — Library (inline in your code)

Mode D — MCP server (for Claude Desktop and other MCP clients)

Step 4GitHub Copilot CLI subscription mode

Step 5Verify token savings

Handy CLI commands & extras

FAQ

How do I install Headroom?

Which agents does headroom wrap support?

How does it work with GitHub Copilot?

How much will I save?

Is it local-only?

Where are the full docs?

Related

Headroom Analysis: Architecture, Use Cases & Setup

OmniRoute Setup Guide

AI Trending: Curated GitHub Repos with Engineering Breakdowns

Which agents does `headroom wrap` support?