Skip to content

feat: aai init — interactive project scaffolder (transcribe, stream, agent)#7

Merged
alexkroman merged 32 commits into
mainfrom
aai-init
Jun 5, 2026
Merged

feat: aai init — interactive project scaffolder (transcribe, stream, agent)#7
alexkroman merged 32 commits into
mainfrom
aai-init

Conversation

@alexkroman

Copy link
Copy Markdown
Collaborator

Summary

Adds aai init — a Vercel-style scaffolder that picks a template, copies a real, runnable, Vercel-deployable Python web project, writes a git-ignored .env, installs deps, launches a local server, and opens the browser.

$ aai init
AssemblyAI CLI 0.1.0
? Pick a template ›
  ❯ Transcribe a pre-recorded file (+ chat with it via LLM Gateway)
    Live captions (mic → browser)
    Talk to a voice agent

Templates (shipped)

  • transcribe — drop a file or URL (defaults to the wildfires sample) → tabbed explorer (transcript with per-speaker colors, chapters, sentiment, entities, highlights) plus a built-in "ask the transcript" chat via the LLM Gateway.
  • stream — live captions: browser mic → AssemblyAI Streaming v3 (u3-rt-pro) directly, authorized by a one-time token minted by the backend.
  • agent — voice agent: browser ↔ Voice Agent API directly (Bearer one-time token); mic capture + base64 PCM playback with barge-in.

Each template is a single, dependency-free api/index.py + index.html, structured in Vercel's conventional layout so the same files run locally (uvicorn) and deploy to Vercel untouched. We build no deploy tooling of our own.

Key design points

  • Browser never sees the real key. Uploads/LLM go through the thin backend; streaming/voice use one-time temp tokens (raw-key auth for streaming, Bearer for the agent).
  • Environment-aware. aai init writes the active environment's hosts (API base, LLM Gateway, streaming, agents) into .env, so a sandbox key targets sandbox endpoints; templates default to production when unset.
  • Approach A: templates are committed, runnable projects copied verbatim — what we test is exactly what users get.
  • Reuses aai claude's step-status rendering (extracted to a shared aai_cli/steps.py).

Testing

  • 625 passed / 6 skipped; ruff + format + mypy clean.
  • A parametrized contract test runs over every template (required files, no committed key, front/back route agreement, requirements cover imports, no blocking SDK call) and auto-covers future templates.
  • Per-template tests (token endpoints incl. raw-key vs Bearer auth, transcribe submit/poll + 502 error paths, LLM chat, reply.audio data-field guard).
  • Browser-verified the transcribe flow end-to-end; wheel packaging verified (templates ship, no __pycache__).

Notes

  • Spec + plan under specs/.
  • Streaming/voice-agent require that capability on the account/plan; without it /api/token returns a clean 502 the UI surfaces.

🤖 Generated with Claude Code

alexkroman-assembly and others added 30 commits June 4, 2026 14:58
Vercel-style interactive scaffolder: pick from 4 templates (transcribe,
stream, agent, llm), copy a real Vercel-deployable Python web project,
install deps, launch a local server, and open the browser. stream and
agent both connect the browser directly to AssemblyAI via temp tokens;
transcribe and llm use a thin upload/poll backend. Generated files kept
deliberately flat/simple for human + Claude Code iteration.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Plan 1 of a series: full aai init machinery (picker, scaffold, runner,
step output, packaging) plus the transcribe template as a complete
vertical slice. stream/agent/llm templates are sibling follow-on plans.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ct tests

Review found aai.Transcript.get_by_id() blocks (wait_for_completion), which
breaks the submit+poll design on serverless. Switch the status endpoint to
the SDK's non-blocking api.get_transcript, strengthen the transcribe tests,
and add a parametrized cross-template contract test (Task 10).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…PE_CHECKING Traversable

templates/ has no __init__.py so resources.files() must navigate from the
aai_cli.init package. Keep requires-python>=3.10 (TYPE_CHECKING import keeps
it clean). Verified runtime 3.11 + mypy --python-version 3.10.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ranscribe)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…egister

Adds `aai init [TEMPLATE] [DIRECTORY]` Typer command that scaffolds a project
from a template, optionally installs deps, launches the dev server, and opens
the browser; bare `aai init` with no TTY errors helpfully listing templates.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… re-exports

The __init__ re-exports were unused and shadowed the scaffold submodule
(forcing a scaffold_fn alias). Import the scaffold module like the other
init submodules and keep __init__ minimal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…init__, smoke)

Fix stale docstring; sync plan to the accepted deviations (CLIError exit_code=1
matching the samples convention, minimal init/__init__.py, test_smoke update).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per Daisy's request: discover every template dir and assert required files,
no committed key, front/back route agreement, requirements cover imports,
and no blocking SDK call. Auto-covers future stream/agent/llm templates.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final-review fixes:
- Registry lists only the transcribe template that actually ships; picking an
  unshipped id now gives a clean error instead of FileNotFoundError. A test
  enforces registry == on-disk template dirs (auto-extends as templates land).
- scaffold._template_root guards a missing dir defensively.
- _pick_template gives a clean message if questionary is missing (stale install)
  instead of a raw ModuleNotFoundError traceback.
- find_free_port raises instead of returning an occupied port when exhausted.
- init propagates the dev-server exit code; rejects DIRECTORY + --here together.
- theme: 'created' renders as success (green).
- Tests for --here, --json, unshipped-template error.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…name

- Rename to 'Transcribe a pre-recorded file' (drop '& explore') in the picker
  title, page, backend docstring, and README.
- Add POST /api/transcribe-url: AssemblyAI transcribes a public URL directly
  (no upload), defaulting to the wildfires sample so the demo works in one click.
- Redesign index.html: URL field pre-filled with the sample + primary Transcribe
  button, file upload as the secondary path; tidier card layout and transcript
  rendering. Still a single, dependency-free HTML file.
- Tests for the URL endpoint (given URL + default-to-sample); contract test
  enforces the new route stays wired to the frontend.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A sandbox key (minted by 'aai login' against a non-prod env) was rejected by the
template's production default. Now 'aai init' writes ASSEMBLYAI_BASE_URL (the active
environment's api_base) into .env, and the transcribe template applies it via
aai.settings.base_url (defaulting to production when unset). Tests cover scaffold
writing the base URL, the template honoring it, and the command emitting it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ights + per-speaker colors

- Results area becomes a tabbed explorer (Transcript · Chapters · Sentiment ·
  Entities · Highlights), showing only tabs that have data. Chapters show
  headline/summary + timestamps; sentiment color-codes POSITIVE/NEGATIVE/NEUTRAL;
  entities group by type; highlights rank key phrases with counts.
- Speakers are differentiated with a per-speaker color (name + left rule) from a
  small palette, in both Transcript and Sentiment views.
- Status messages get state-styled pills (working / done / error).
- Still a single dependency-free HTML file; verified end-to-end in a browser
  against the sandbox sample. Guard test keeps the feature views + speaker
  coloring wired.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds an Ask panel that questions the transcript with AssemblyAI's LLM Gateway
(OpenAI-compatible; transcript injected server-side via extra_body transcript_id),
shown alongside the transcript explorer so both stay visible.

- New POST /api/ask endpoint (model claude-haiku-4-5), surfaces gateway errors
  (e.g. plan access) as a clean 502 the UI shows.
- aai init now also writes ASSEMBLYAI_LLM_GATEWAY_URL (active env's gateway) to
  .env; the template honors it, defaulting to production. Adds openai to the
  template requirements.
- Tests: ask endpoint (openai stubbed, asserts transcript_id passed via extra_body),
  scaffold/command write the gateway URL, guard test keeps the Ask panel wired.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tness)

Correctness:
- Template transcribe/transcribe-url/status endpoints now catch SDK exceptions and
  return a clean 502 instead of a 500 traceback (esp. placeholder/invalid key).
- index.html: poll() failures are caught (no more stuck 'Transcribing…' button);
  speaker colors reset per transcription.
- aai init: when logged out, install runs but a 'launch: skipped — run aai login'
  step is reported instead of exiting silently.
- Template uploads are deleted after submit (no /tmp leak); assemblyai pinned
  (>=0.64,<1) to bound the internal-API surface it uses.

Cleanup / robustness:
- Extract shared aai_cli/steps.py (Step + render_steps(heading=...)); init and
  claude both use it (removes the duplicated TypedDict + render loop).
- init reuses the single has_uv() result for install and launch.
- Template registry guard now enforces both directions (registry <-> shipped dirs).
- Wheel excludes __pycache__/*.pyc so stale bytecode can't ship (verified).

Tests added for the 502 paths, the logged-out launch hint, and the shipped-but-
unregistered guard. 608 passed, ruff + format + mypy clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- stream: browser captures mic and connects directly to AssemblyAI Streaming v3
  (u3-rt-pro) with a one-time token from /api/token; live partial/final captions.
- agent: browser connects directly to the Voice Agent API (speech->LLM->speech) with
  a Bearer-auth one-time token; mic capture + base64 PCM playback with barge-in.
  Fix: reply.audio carries base64 in the data field (not audio) so TTS now plays;
  resume() the audio contexts in case they start suspended after getUserMedia.
- Generalize scaffold to write the active environment hosts (base/llm/streaming/
  agents) into .env via an env_vars dict; templates honor them, default to production.
- Registry ships transcribe (with built-in LLM chat), stream, agent (no separate
  llm template; that lives in transcribe).
- Tests: token endpoints for stream/agent incl. raw-key vs Bearer auth, reply.audio
  data-field guard; contract suite auto-covers all three templates. 624 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
alexkroman-assembly and others added 2 commits June 4, 2026 17:43
Printed at the top of an interactive aai init run (human output only; suppressed
in --json/agentic mode). Guard test included.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexkroman alexkroman merged commit d909423 into main Jun 5, 2026
5 checks passed
@alexkroman alexkroman deleted the aai-init branch June 5, 2026 03:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants