Delegate context-window management to deepagents SummarizationMiddleware by alexkroman · Pull Request #264 · AssemblyAI/cli

alexkroman · 2026-06-23T15:43:03Z

Summary

Removes client-side conversation history trimming from the cascade engine and delegates context-window management to the deepagents brain's built-in SummarizationMiddleware. The engine now keeps the full running conversation history and lets the graph handle compaction (summarizing old turns, offloading evicted history to a file).

Key Changes

Removed trim_history() function from aai_cli/agent_cascade/text.py — this utility is no longer needed since the brain handles windowing server-side
Removed history trimming calls from aai_cli/agent_cascade/engine.py:
- Deleted trim_history() call in on_turn() after appending user messages
- Deleted trim_history() call in _record_spoken() after appending assistant messages
Updated module docstrings to document the architectural shift:
- text.py: Clarified that conversation-history trimming moved to the brain's SummarizationMiddleware
- engine.py: Added docstring to _record_spoken() explaining the engine now keeps full history
- brain.py: Added note that context-window management is delegated to deepagents
- config.py: Clarified that DEFAULT_MAX_HISTORY only applies to standalone (--show-code / init template) paths, not the live brain
Updated tests to reflect the new behavior:
- Removed all trim_history() unit tests from tests/test_agent_cascade_text.py
- Updated test_generate_reply_trims_history_window() → test_generate_reply_keeps_full_untrimmed_history() to verify the engine no longer trims
- Updated test_on_turn_trims_history_window() → test_on_turn_keeps_full_untrimmed_history() to verify full history is retained
Refactored split_sentences() to use a new _ends_sentence() helper function for clarity (distinguishes end-of-text boundaries from mid-stream boundaries used in streaming)

Implementation Details

The _ends_sentence() helper clarifies the distinction between sentence boundaries in complete text (where end-of-text is a real boundary) versus partial streamed chunks (where it might be mid-token). This improves code readability without changing behavior.

The max_history config parameter remains for backward compatibility with standalone code paths (templates, --show-code generator) but is now inert in the live brain, which relies on deepagents' SummarizationMiddleware for context management.

https://claude.ai/code/session_01RgG91Q7U3j2pbJvyfTJa3X

…zation The live cascade brain is a deepagents graph, and create_deep_agent already wires its own SummarizationMiddleware into the stack (summarize old turns, offload the evicted history to a file) — real context-window management. The engine's client-side sliding window (text.trim_history + config.max_history) was redundant in front of it, so this removes it: the engine now feeds the full untrimmed running history each turn and lets the graph compact it. max_history stays on CascadeConfig but only drives the hand-rolled --show-code / `assembly init` cascade, which talks to the gateway directly and has no middleware. text.trim_history is gone; split_sentences' inline boundary predicate is extracted into _ends_sentence (mirroring _is_boundary) to keep the module's average complexity at rank A after the helper's removal. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01RgG91Q7U3j2pbJvyfTJa3X

alexkroman enabled auto-merge June 23, 2026 15:44

alexkroman added this pull request to the merge queue Jun 23, 2026

Merged via the queue into main with commit 371055d Jun 23, 2026
20 checks passed

alexkroman deleted the claude/eager-euler-n7ndze branch June 23, 2026 15:52

alexkroman mentioned this pull request Jun 23, 2026

Client-orchestrated voice agent: streaming TTS, --files sandbox, spoken approval #268

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Delegate context-window management to deepagents SummarizationMiddleware#264

Delegate context-window management to deepagents SummarizationMiddleware#264
alexkroman merged 1 commit into
mainfrom
claude/eager-euler-n7ndze

alexkroman commented Jun 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

alexkroman commented Jun 23, 2026

Summary

Key Changes

Implementation Details

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants