Skip to content

Delegate context-window management to deepagents SummarizationMiddleware#264

Merged
alexkroman merged 1 commit into
mainfrom
claude/eager-euler-n7ndze
Jun 23, 2026
Merged

Delegate context-window management to deepagents SummarizationMiddleware#264
alexkroman merged 1 commit into
mainfrom
claude/eager-euler-n7ndze

Conversation

@alexkroman

Copy link
Copy Markdown
Collaborator

Summary

Removes client-side conversation history trimming from the cascade engine and delegates context-window management to the deepagents brain's built-in SummarizationMiddleware. The engine now keeps the full running conversation history and lets the graph handle compaction (summarizing old turns, offloading evicted history to a file).

Key Changes

  • Removed trim_history() function from aai_cli/agent_cascade/text.py — this utility is no longer needed since the brain handles windowing server-side
  • Removed history trimming calls from aai_cli/agent_cascade/engine.py:
    • Deleted trim_history() call in on_turn() after appending user messages
    • Deleted trim_history() call in _record_spoken() after appending assistant messages
  • Updated module docstrings to document the architectural shift:
    • text.py: Clarified that conversation-history trimming moved to the brain's SummarizationMiddleware
    • engine.py: Added docstring to _record_spoken() explaining the engine now keeps full history
    • brain.py: Added note that context-window management is delegated to deepagents
    • config.py: Clarified that DEFAULT_MAX_HISTORY only applies to standalone (--show-code / init template) paths, not the live brain
  • Updated tests to reflect the new behavior:
    • Removed all trim_history() unit tests from tests/test_agent_cascade_text.py
    • Updated test_generate_reply_trims_history_window()test_generate_reply_keeps_full_untrimmed_history() to verify the engine no longer trims
    • Updated test_on_turn_trims_history_window()test_on_turn_keeps_full_untrimmed_history() to verify full history is retained
  • Refactored split_sentences() to use a new _ends_sentence() helper function for clarity (distinguishes end-of-text boundaries from mid-stream boundaries used in streaming)

Implementation Details

The _ends_sentence() helper clarifies the distinction between sentence boundaries in complete text (where end-of-text is a real boundary) versus partial streamed chunks (where it might be mid-token). This improves code readability without changing behavior.

The max_history config parameter remains for backward compatibility with standalone code paths (templates, --show-code generator) but is now inert in the live brain, which relies on deepagents' SummarizationMiddleware for context management.

https://claude.ai/code/session_01RgG91Q7U3j2pbJvyfTJa3X

…zation

The live cascade brain is a deepagents graph, and create_deep_agent already
wires its own SummarizationMiddleware into the stack (summarize old turns,
offload the evicted history to a file) — real context-window management. The
engine's client-side sliding window (text.trim_history + config.max_history) was
redundant in front of it, so this removes it: the engine now feeds the full
untrimmed running history each turn and lets the graph compact it.

max_history stays on CascadeConfig but only drives the hand-rolled
--show-code / `assembly init` cascade, which talks to the gateway directly and
has no middleware. text.trim_history is gone; split_sentences' inline boundary
predicate is extracted into _ends_sentence (mirroring _is_boundary) to keep the
module's average complexity at rank A after the helper's removal.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RgG91Q7U3j2pbJvyfTJa3X
@alexkroman alexkroman enabled auto-merge June 23, 2026 15:44
@alexkroman alexkroman added this pull request to the merge queue Jun 23, 2026
Merged via the queue into main with commit 371055d Jun 23, 2026
20 checks passed
@alexkroman alexkroman deleted the claude/eager-euler-n7ndze branch June 23, 2026 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants