Robustness, environment-aware code gen, and config caching#54
Conversation
…footguns - stream --llm: a gateway failure (e.g. 401, network) used to raise on the SDK reader thread, dumping a thread traceback while the stream kept going. The session now latches the CLIError, warns once on stderr, stops re-running the failing chain, and re-raises from the final main-thread flush so the command exits with the usual clean error. - code_gen: generated --show-code scripts hardcoded production hosts (streaming, agents WS URL, LLM gateway). They now embed the active environment's hosts, so sandbox users get scripts their key actually authenticates to; the transcribe script also pins aai.settings.base_url when the env differs from the SDK default. agent/session.ws_url() is now the single place the agents socket URL is built. llm.GATEWAY_BASE_URL (production-only) is gone. - aai share: the cloudflared tunnel subprocess no longer inherits ASSEMBLYAI_API_KEY — it only proxies a port; the dev server keeps the key. - agent audio: capture_frames() polls with a timeout and raises a clean audio_input_error when the PortAudio stream dies without close() (device unplugged) instead of blocking the capture thread forever. - transcribe rendering: chapter timestamps roll over to H:MM:SS past an hour (no more "60:00"); sentiment percentages use largest-remainder rounding so they sum to exactly 100. - config: _load() caches the parsed TOML keyed by path+mtime+size (a typical invocation parsed the same file 3-4 times), returning deep copies so callers' mutations and persist_login's rollback snapshot can't alias the cache; _dump() invalidates explicitly. - DRY: shared options.json_option() replaces 26 copy-pasted --json definitions (the help text had already drifted); the duplicate _objects() shims in client.py/transcribe_render.py are gone in favor of jsonshape.object_list. https://claude.ai/code/session_015RcqK7HpNbSMQgHUEouKaz
… with check.sh - The PostToolUse hook ran `ruff check --fix`, whose F401 autofix deletes an import that is momentarily unused — i.e. whenever an agent adds the import in one edit and its first usage in the next, the import silently vanishes in between. Scope it out with `--unfixable F401` (still reported, never auto-removed) and document the behavior. - AGENTS.md's check.sh stage list had drifted: it was missing the max-file-length, actionlint/zizmor, gitleaks, and (most importantly) the diff-scoped mutation gate, which requires changed lines to have assertions that fail when the line breaks — not just coverage. List them, with the `# pragma: no mutate` escape. https://claude.ai/code/session_015RcqK7HpNbSMQgHUEouKaz
| return config_dir() / "config.toml" | ||
|
|
||
|
|
||
| # Parsed-config cache: path -> (mtime_ns, size, parsed). The several _load() |
There was a problem hiding this comment.
_load_cache is a module-level mutable dict storing parsed configs; this global cache persists for the process lifetime and may unintentionally leak or reuse request-specific config state. Consider scoping the cache, documenting its shared nature, or limiting its lifetime.
Details
✨ AI Reasoning
A module-level mutable dictionary was introduced to hold parsed config state keyed by filesystem attributes. Module globals persist for the lifetime of the process and can unintentionally hold request- or user-specific data across unrelated operations, causing leakage or stale data exposure. The added cache is mutable and shared across all callers in the process, so it fits the global-caching risk pattern.
🔧 How do I fix it?
Avoid storing request-specific data in module-level variables. Use request-scoped variables or explicitly mark shared caches as intentional.
Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info
There was a problem hiding this comment.
@AikidoSec ignore: intentional process-scoped cache in a short-lived single-user CLI (one command invocation, no cross-request state). The shared nature is documented in the comment block above the dict; the cache is keyed by path+mtime+size, explicitly invalidated on every _dump() write, and _load() only hands out deep copies so caller mutations can't poison it (pinned by test_load_returns_independent_copies / test_write_invalidates_load_cache).
Generated by Claude Code
There was a problem hiding this comment.
✅ Based on your feedback, we ignored this issue because of the following reason:
intentional process-scoped cache in a short-lived single-user CLI (one command invocation, no cross-request state). The shared nature is documented in the comment block above the dict; the cache is keyed by path+mtime+size, explicitly invalidated on every
_dump()write, and_load()only hands out deep copies so caller mutations can't poison it (pinned bytest_load_returns_independent_copies/test_write_invalidates_load_cache).
Generated by Claude Code
Summary
This PR improves robustness of audio capture and LLM streaming, makes generated code environment-aware, adds config file caching, and centralizes repeated Typer option definitions.
Key Changes
Audio & Streaming Resilience
DuplexAudio.capture_frames()now polls with a timeout instead of blocking forever, detecting when a device dies mid-session without callingclose()and surfacing a clean error instead of hanging the capture threadStreamSessionlatches LLM chain errors raised on the SDK reader thread (where raising would dump a thread traceback), warns once on stderr, and re-raises from the final main-thread flush for clean command failurepoll_timeoutparameter toDuplexAudio(defaults to 1.0s, injectable for fast tests)Environment-Aware Code Generation
code_gen/stream.py,code_gen/agent.py, andcode_gen/transcribe.pynow embed the active environment's hosts (streaming, agents, LLM gateway) in generated scripts, so sandbox runs produce scripts targeting the sandbox, not productioncode_gen/transcribe.pyonly emitsbase_urlsetting when targeting a non-default environment (production SDK default is already correct)agent/session.pyexportsws_url()publicly for code_gen to useConfig File Caching
config._load()now caches parsed configs by mtime/size, avoiding redundant TOML parsing across multiple reads in a single CLI invocation_dump()) and returns deep copies so caller mutations don't poison the cacheTimestamp Formatting
_fmt_ms()now rolls over toH:MM:SSformat for timestamps ≥1 hour, preventing ambiguous60:00displaysSentiment Percentages
_percentages()implements largest-remainder rounding so integer percentages always sum to exactly 100 (e.g., three equal counts → 34/33/33, not 33/33/33=99)Code Organization
aai_cli/options.pycentralizes repeated Typer option factories (json_option()) to keep flag definitions uniform across ~26 command signatures_objects()helper in favor of directjsonshape.object_list()callsoptionsmoduleTesting
activeattribute are tolerated (file-driven streams)Notable Implementation Details
getattr(self._stream, "active", True)to gracefully handle streams that don't expose theactiveflag_llm_errorand only re-raised on the final flush (main thread), preventing thread tracebacks while still failing the command cleanly(mtime_ns, size)to handle coarse-mtime filesystems where same-size rewrites might not update mtimehttps://claude.ai/code/session_015RcqK7HpNbSMQgHUEouKaz