Skip to content

Add assembly agent-framework: a terminal client-orchestrated voice cascade#171

Merged
alexkroman merged 2 commits into
mainfrom
claude/amazing-pascal-i5wo34
Jun 16, 2026
Merged

Add assembly agent-framework: a terminal client-orchestrated voice cascade#171
alexkroman merged 2 commits into
mainfrom
claude/amazing-pascal-i5wo34

Conversation

@alexkroman

Copy link
Copy Markdown
Collaborator

assembly agent-framework holds the same kind of live voice conversation as
assembly agent, but instead of talking to AssemblyAI's Voice Agent endpoint it
wires the three primitives together itself — Streaming STT -> the LLM Gateway ->
streaming TTS — exactly like the agent-framework init template does
server-side. Sandbox-only (streaming TTS has no production host).

New feature slice aai_cli/agent_framework/:

  • engine.py: the cascade orchestrator (greeting, per-sentence TTS, barge-in,
    sliding-window history). The three network legs are injected through
    CascadeDeps (the tts/session.py seam), so it's unit-tested against fakes
    with no sockets, mic, or speaker.
  • config.py / voices.py / text.py: per-run config + the streaming-TTS voice
    catalog presentation + sentence-splitting/history-trimming helpers.

New command package aai_cli/commands/agent_framework/ follows the
options/run-split convention and reuses DuplexAudio/AgentRenderer,
client.stream_audio, llm.complete, and tts.session.synthesize.

Also clears pre-existing CodeQL py/ineffectual-statement findings in the
agent-framework / audio-transcription init templates and their tests (await-of
-name statements and a ... Protocol stub) so the full local gate is green.

…cascade

`assembly agent-framework` holds the same kind of live voice conversation as
`assembly agent`, but instead of talking to AssemblyAI's Voice Agent endpoint it
wires the three primitives together itself — Streaming STT -> the LLM Gateway ->
streaming TTS — exactly like the `agent-framework` init template does
server-side. Sandbox-only (streaming TTS has no production host).

New feature slice `aai_cli/agent_framework/`:
- engine.py: the cascade orchestrator (greeting, per-sentence TTS, barge-in,
  sliding-window history). The three network legs are injected through
  `CascadeDeps` (the `tts/session.py` seam), so it's unit-tested against fakes
  with no sockets, mic, or speaker.
- config.py / voices.py / text.py: per-run config + the streaming-TTS voice
  catalog presentation + sentence-splitting/history-trimming helpers.

New command package `aai_cli/commands/agent_framework/` follows the
options/run-split convention and reuses DuplexAudio/AgentRenderer,
client.stream_audio, llm.complete, and tts.session.synthesize.

Also clears pre-existing CodeQL `py/ineffectual-statement` findings in the
agent-framework / audio-transcription init templates and their tests (await-of
-name statements and a `...` Protocol stub) so the full local gate is green.
@alexkroman alexkroman enabled auto-merge June 16, 2026 05:21
CI runs pytest with FORCE_COLOR, so the Rich usage-error panel carries ANSI
codes and width-based ellipsis — the long "Invalid value for
'--system-prompt-file'" substring no longer matched on the Windows/Linux
runners. Assert the behavioral distinction instead: a nonexistent
--system-prompt-file makes Typer reject before the body, so the sandbox guard
(the other exit-2 path) never runs and "sandbox" stays out of the output. This
still kills the exists=True mutant and is immune to ANSI/width.
@alexkroman alexkroman added this pull request to the merge queue Jun 16, 2026
Merged via the queue into main with commit 805324d Jun 16, 2026
19 checks passed
@alexkroman alexkroman deleted the claude/amazing-pascal-i5wo34 branch June 16, 2026 11:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants