Context Engine & Compaction in OpenClaw: How Long-Running Agents Stay Useful Without Drowning in Their Own History

Most people treat context like a vague AI fog. That is how you end up debugging by superstition.

A better mental model is a workbench. Every turn, OpenClaw chooses what gets laid on that bench for the model to see. If the bench is clean, the model looks focused. If it is buried under old tool logs, giant files, and stale back-and-forth, the model starts missing the point.

The official docs for concepts/context, concepts/context-engine, and concepts/compaction all revolve around the same practical truth: context is the live working set, and OpenClaw has explicit machinery for keeping that working set usable.

What context actually is

In OpenClaw, context is everything sent to the model for a run, bounded by the model's context window.

the system prompt OpenClaw builds for that run
conversation history for the session
tool calls and tool results
attachments, transcripts, and other injected content
compaction summaries and related prompt artifacts

That list matters because it kills a common beginner mistake. People assume only the visible chat text counts. It does not. Tool schemas, long file injections, command output, and old session baggage all compete for the same window.

How the context engine fits in

The context engine is the part that controls how that working set is assembled.

By default, OpenClaw uses the built-in legacy engine. The docs are clear that most users should leave it alone. But under the hood, the engine still owns a real lifecycle.

ingest: handle new messages entering the session
assemble: build the ordered messages that fit the token budget
compact: reduce older history when the window gets too full
after turn: persist or clean up state after the run finishes

That is the useful distinction. The context engine is not just a formatter. It is the policy layer for what the model sees and how session history gets reshaped over time.

Why OpenClaw exposes this at all

If you only ever run short chats, the default path feels invisible. That is fine.

But OpenClaw is built for long-lived sessions, tool-heavy work, background tasks, and subagents. In that world, context assembly is not a minor implementation detail. It is operational hygiene. The docs even allow plugin engines with custom assembly, custom compaction, and optional subagent lifecycle hooks because some setups need more than a plain rolling transcript.

Think of it like a newsroom editor. Reporters produce a lot of raw material. The editor decides what goes on page one, what gets condensed, and what gets archived without being forgotten. The model does better when someone is doing that editing on purpose.

What compaction changes

Compaction is OpenClaw's way of shrinking old conversation without pretending it never happened.

When a session nears the context limit, OpenClaw summarizes older turns into a compact entry, keeps recent messages intact, and stores the summary in the transcript. The full history still lives on disk. What changes is the version of that history the model sees on the next run.

That distinction is worth underlining. Compaction is not the same as deleting. It is closer to replacing a stack of meeting notes with a decent executive summary so the next meeting can still move.

Auto-compaction vs manual compaction

OpenClaw enables auto-compaction by default. It can trigger when the session approaches the context limit or when a provider returns a context-overflow error. In other words, it is both preventative and reactive.

You can also force it yourself when you know the conversation is getting bloated.

/status
/context list
/context detail
/compact
/compact Focus on the API design decisions

Those commands are underrated because they replace guesswork with visibility. If a session feels off, check the bench before blaming the carpenter.

Memory is not context

This is where people mix up three different things: live context, stored memory, and session transcript history.

Context is what the model sees right now. Memory is durable information stored somewhere else, usually so it can be retrieved later. The transcript is the recorded history of the session on disk.

context = current working set
memory = saved facts or notes that may be recalled later
transcript = full recorded history of what happened

You can have important memory that is not in the live prompt. You can have transcript history that is preserved on disk but summarized away from the current turn. And you can have a crowded context window even when memory is configured well.

That is why the docs remind the agent to save important notes before compaction. Memory carries forward what should stay durable. Context carries what is useful now. Same ecosystem, different jobs.

Compaction is not pruning either

Another easy mix-up: compaction and session pruning solve different problems.

Compaction summarizes older conversation and writes that summary into the transcript. Session pruning trims old tool results in memory before a model call, without rewriting the transcript. One is summary-based history reduction. The other is lightweight tool-output cleanup.

If a session is bloated mainly because of giant exec output, file reads, or search results, pruning may buy you more headroom before full compaction even becomes necessary.

How to reduce context bloat in practice

You do not need a custom plugin engine to get better behavior. Most of the time, you need better housekeeping.

1. Inspect before you guess

Use /status, /context list, and /context detail. The official docs show that large injected workspace files, tool schemas, and old tool output can eat far more room than people expect.

2. Keep workspace files sane

OpenClaw injects bootstrap files such as AGENTS.md, SOUL.md, TOOLS.md, and others into Project Context. Large files can be truncated, but they still carry cost. If a file is turning into a junk drawer, your agent pays for it every turn.

3. Compact on purpose

Do not wait until the model is already stumbling. If a session has reached the point where old debate matters less than current execution, run /compact and give it a focus if needed.

4. Move durable notes into memory

If something should survive beyond the current working window, store it properly instead of hoping the live transcript keeps carrying it forever. That is especially true for preferences, decisions, and stable project facts.

5. Use pruning for tool-heavy sessions

Tool output is often the quiet killer. Session pruning can trim old results without flattening the actual conversation.

6. Change the engine only when the default stops fitting

The plugin slot exists for teams who need custom assembly, special recall behavior, or engine-owned compaction logic. That is a power feature, not a beginner checkbox. If you cannot explain what problem your new engine solves, keep the default one.

When a custom context engine makes sense

Rare does not mean never. A custom engine can be justified when you need cross-session recall behavior, a different compaction strategy, or stricter control over what enters the prompt for certain runtimes.

The official concepts/context-engine docs also call out host requirements, plugin failure isolation, and the ownsCompaction flag. That is OpenClaw quietly telling you this feature is serious. If your engine fails, prompt assembly can degrade or fail closed depending on the situation. So treat custom engines like infrastructure, not decoration.

A practical mental model

If you want the short version, use this:

context is the workbench
the context engine decides what lands on it
compaction clears old clutter by replacing it with a summary
memory keeps durable notes somewhere safer than the bench
pruning sweeps away old tool debris before it takes over the room

That model is simple, but it is close enough to the real system to make better decisions.

The operator payoff

Once you understand context engine and compaction, long-session weirdness becomes easier to diagnose.

If replies drift, ask whether the current prompt is bloated or stale. If costs spike, check whether tool output and injected files are inflating every turn. If an agent forgets something important, ask whether it should have been stored as memory instead of left to survive inside raw conversation history.

This is the real gain. You stop treating long-running AI sessions like magic and start treating them like systems with budgets, bottlenecks, and cleanup rules.

Need help from people who already use this stuff?

Want long-running OpenClaw sessions that stay sharp instead of turning into expensive soup?

Join My AI Agent Profit Lab if you want help cleaning up context, memory, and session design before your agents start drifting.

Join My AI Agent Profit Lab See the community page

FAQ

What does the context engine do in OpenClaw?

It decides how model context is assembled for each run. That includes which messages are included, how they fit inside the token budget, and what happens when the system needs to summarize or hand off context across session boundaries.

What does compaction actually change?

Compaction summarizes older conversation turns into a shorter entry, keeps recent messages intact, and stores that summary in the session transcript. The full history stays on disk, but the next model turn sees the compacted version instead of the original long tail.

Is memory the same thing as context?

No. Context is what the model sees right now inside its current window. Memory is information stored elsewhere that can be reloaded or retrieved later. You can have memory on disk that is not currently in context at all.

When should you change the context engine?

Rarely. The built-in legacy engine is the default for a reason. Switch only when you need custom assembly, compaction behavior, or cross-session recall patterns that the default path does not give you.

What is the fastest way to reduce context bloat?

Start by inspecting context with /status and /context, compact when needed, trim old tool output with session pruning, keep giant workspace files under control, and move durable notes into memory files instead of making the live prompt carry everything forever.

Context Engine & Compaction