test: kill surviving mutants in errors, timeparse, follow #31

Merged
alexkroman merged 15 commits into main from claude/mutant-testing-improvements-mGkhF
Jun 7, 2026

Conversation

@alexkroman

Collaborator

Strengthen assertions so the mutation gate's structural mutants die:

  • errors: cover a structural HTTP 401 (not just 403) and pin CLIError defaults
  • timeparse: reject a truthy non-string so the guard's or/and can't be swapped
  • follow: capture Live's screen/auto_refresh kwargs and per-update refresh flag

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC

claude added 15 commits June 6, 2026 22:52
Strengthen assertions so the mutation gate's structural mutants die:
- errors: cover a structural HTTP 401 (not just 403) and pin CLIError defaults
- timeparse: reject a truthy non-string so the guard's or/and can't be swapped
- follow: capture Live's screen/auto_refresh kwargs and per-update refresh flag

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
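
To illustrate the timeparse point above, here is a minimal sketch of why a truthy non-string is the input that separates the guard from its `or`/`and` mutant. The `guard` and `guard_mutant` functions are hypothetical stand-ins; the real guard in timeparse may be shaped differently.

```python
def guard(value: object) -> bool:
    """Hypothetical guard: True means 'reject this input'."""
    return not isinstance(value, str) or not value


def guard_mutant(value: object) -> bool:
    """The same guard with `or` swapped for `and`, as a mutation tool might emit."""
    return not isinstance(value, str) and not value


def test_truthy_non_string_is_rejected() -> None:
    # None is a falsy non-string: both versions reject it, so it cannot kill the mutant.
    assert guard(None) and guard_mutant(None)
    # 5 is a truthy non-string: only the original rejects it.
    assert guard(5)
    assert not guard_mutant(5)
```

A test that only feeds `None` to the parser would pass against both versions, which is how a mutant like this can survive.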
The streaming render tests drove real Rich, so the Live construction kwargs
and per-update refresh/flush flags weren't asserted and survived mutation.
Inject a fake Live to pin screen/auto_refresh/transient/redirect_* and the
forced refresh, and a flush-recording stream to pin status-notice flushing.

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
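
A rough sketch of the fake-Live injection described above. `FakeLive` mimics only the parts of `rich.live.Live` the test needs (keyword construction, context-manager use, `update(renderable, refresh=...)`); `follow` and its `live_factory` parameter are hypothetical stand-ins for the code under test, not the project's actual API.

```python
class FakeLive:
    """Stand-in for rich.live.Live that records construction kwargs and updates."""

    def __init__(self, **kwargs: object) -> None:
        self.kwargs = kwargs
        self.updates: list = []

    def __enter__(self) -> "FakeLive":
        return self

    def __exit__(self, *exc_info: object) -> None:
        return None

    def update(self, renderable: object, *, refresh: bool = False) -> None:
        self.updates.append((renderable, refresh))


def follow(lines, live_factory):
    """Hypothetical code under test: streams lines through a Live display."""
    with live_factory(screen=False, auto_refresh=False, transient=False) as live:
        for line in lines:
            live.update(line, refresh=True)
    return live


def test_follow_pins_live_kwargs_and_refresh() -> None:
    live = follow(["a", "b"], FakeLive)
    # Flipping any of these literals in follow() now fails the test.
    assert live.kwargs == {"screen": False, "auto_refresh": False, "transient": False}
    assert live.updates == [("a", True), ("b", True)]
```

Driving real Rich would render correctly under most of these mutations, which is why the kwargs have to be captured rather than inferred from output.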
- output: pin data_table pad_edge / detail_table padding, and assert emit_ndjson
  writes one flushed newline-terminated record
- context: assert resolve_session raises when only one of session/account_id is
  present (pins the `or` guard) and that a non-rejection NotAuthenticated still
  auto-logs-in with an env key set (pins the `and` in _should_auto_login)
- transcribe_render: assert exact sentiment percentages, mm:ss formatting across
  a 60s boundary, and most-relevant-first ordering for topics/content-safety

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
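
As an illustration of the 60s-boundary assertion above, a minimal sketch. `format_mmss` is a hypothetical helper that assumes millisecond input and zero-padded output; the real transcribe_render formatter may differ.

```python
def format_mmss(milliseconds: int) -> str:
    """Hypothetical formatter: whole seconds rendered as zero-padded mm:ss."""
    minutes, seconds = divmod(milliseconds // 1000, 60)
    return f"{minutes:02d}:{seconds:02d}"


def test_mmss_across_minute_boundary() -> None:
    # Below 60s the minutes field is always 0, so mutants in the minute
    # arithmetic only become visible once a timestamp crosses the boundary.
    assert format_mmss(59_000) == "00:59"
    assert format_mmss(60_000) == "01:00"
    assert format_mmss(61_500) == "01:01"
```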
The transcribe --show-code tests parsed/ran the output but never asserted the
4-space indent of the rendered config kwargs, so the indent literal survived
mutation. Assert the exact indented config block.

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
_survives wrote each mutated module via ast.unparse and ran the covering tests
in a subprocess. Consecutive mutants unparse to files differing by a single
token, so they're typically the same byte length and can be written within one
mtime-second. CPython validates a cached .pyc by exact (mtime, size) match, so
the subprocess could load the previous mutant's (or the original's) bytecode and
execute unmutated code — reporting a false "survivor" and failing the gate
spuriously (and flakily). Drop the module's cached .pyc after writing the source
so the subprocess always recompiles the mutant under test.

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
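
A minimal sketch of the fix described above, assuming the default timestamp-based `.pyc` files. `write_mutant` is a hypothetical name; `importlib.util.cache_from_source` is the standard-library call that maps a source path to its cached bytecode path.

```python
import importlib.util
from pathlib import Path


def write_mutant(source_path: Path, mutated_source: str) -> None:
    """Write the mutated source, then drop its cached bytecode.

    Two mutants of equal byte length written within the same mtime-second
    look identical to the (mtime, size) check, so the stale .pyc is removed
    to force a recompile in the test subprocess.
    """
    source_path.write_text(mutated_source, encoding="utf-8")
    cached = Path(importlib.util.cache_from_source(str(source_path)))
    cached.unlink(missing_ok=True)  # no error if nothing was cached yet
```

Running the subprocess with `PYTHONDONTWRITEBYTECODE=1` would stop new caches from being written, but it would not invalidate one that already exists, so deleting the file is the more direct option here.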
- sources: pin _is_streamable_wav's full mono/16-bit/16k `and` chain, the
  file-not-found/ffmpeg-missing/empty-audio exit codes, that ffmpeg isn't
  terminated on a clean EOF, and the exit-code fallback when stderr is empty
- microphone: pin RawInputStream channels/dtype, the ~100ms blocksize floor,
  the `rate > 0` boundary, and that resample treats audio as 16-bit mono PCM
- client: pin the single-row validation probe limit, the verbatim-vs-fallback
  transcribe error message, and that a provided on_begin is wired to Begin

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
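
To show what pinning the full `and` chain can look like, here is a sketch built on the standard-library `wave` module. `is_streamable_wav` and `make_wav` are hypothetical reconstructions, not the project's `_is_streamable_wav`; the test violates exactly one condition at a time.

```python
import io
import wave


def is_streamable_wav(data: bytes) -> bool:
    """Hypothetical check: mono, 16-bit, 16 kHz PCM WAV."""
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            return (
                wav.getnchannels() == 1
                and wav.getsampwidth() == 2
                and wav.getframerate() == 16000
            )
    except (wave.Error, EOFError):
        return False


def make_wav(channels: int, sample_width: int, rate: int) -> bytes:
    """Build a short silent WAV with the given parameters."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * channels * sample_width * 160)
    return buffer.getvalue()


def test_each_condition_is_required() -> None:
    assert is_streamable_wav(make_wav(1, 2, 16000))
    assert not is_streamable_wav(make_wav(2, 2, 16000))  # stereo
    assert not is_streamable_wav(make_wav(1, 1, 16000))  # 8-bit
    assert not is_streamable_wav(make_wav(1, 2, 8000))   # 8 kHz
```

If any single `and` became `or`, at least one of the three negative cases would start returning True.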
- config_builder: pin _derive_kind's Optional unwrap + bare-scalar/dict origin
  classification, and that KEY=VALUE / NAME:VALUE split only on the first
  separator (values may contain '=' / ':')
- youtube: assert yt-dlp is driven quietly and actually downloads, and pin the
  no-file-produced exit code
- ams: assert the error "detail" field is extracted cleanly rather than leaking
  the raw JSON body (the fallback happened to contain the same substring)

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
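
The first-separator rule from the config_builder item above can be sketched with `str.partition`. `split_pair` is a hypothetical helper; the project's actual parsing code is not shown in this PR.

```python
def split_pair(text: str, separator: str) -> tuple:
    """Split on the first separator only, so the value may contain it."""
    key, found, value = text.partition(separator)
    if not found or not key:
        raise ValueError(f"expected KEY{separator}VALUE, got {text!r}")
    return key, value


def test_value_may_contain_separator() -> None:
    # A value containing the separator is what distinguishes a first-separator
    # split from an unbounded split or a split on the last separator.
    assert split_pair("webhook=https://example.com/?a=b", "=") == (
        "webhook",
        "https://example.com/?a=b",
    )
    assert split_pair("speaker:00:15", ":") == ("speaker", "00:15")
```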
- agent/audio: pin the device-rate blocksize, exact pending() sample count, the
  callback's exact zero-fill remainder, the audio-open exit code, and that
  close() lets the stream reopen on a later start()
- agent/session: assert the server error message wins over code/fallback, that a
  transcript without "interrupted" defaults to False, and that a player which
  failed to open is never closed
- auth/loopback: assert the callback answers 200 and unknown paths answer 404

Remaining survivors here are equivalent/threading-internal (daemon flags,
join/wait timeout values, the sub-10Hz blocksize floor) and aren't observable.

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
The human-table test asserted only the id/model columns, so blanking a present
created_at / audio_duration_sec value (the `value or ""` -> `and`) survived.
Assert those values appear in the rendered table.

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
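
A small sketch of why this mutant survived. `cell` and `cell_mutant` are hypothetical one-liners standing in for the table code.

```python
def cell(value: object) -> object:
    """Original: show the value, or an empty string when it is missing."""
    return value or ""


def cell_mutant(value: object) -> object:
    """Mutant: `or` swapped for `and`, which blanks every present value."""
    return value and ""


def test_present_values_are_rendered() -> None:
    assert cell("2026-06-07") == "2026-06-07"
    assert cell_mutant("2026-06-07") == ""  # the difference the old test never saw
```

Only an assertion on a column that actually passes through this expression, with a non-empty value, can tell the two apart.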
- account: assert the one-missing-bound window label, the one-day-window collapse,
  and that the default usage range spans exactly the last 30 days
- llm: assert `-o json` forces JSON output even for a non-agentic human
- login: assert the authenticated/logged_out success flags in --json output
- samples: assert the stream sample requests format_turns and that human-mode
  `samples list` renders its bullet list (pins the string concatenation)

(samples mkdir parents=True is equivalent here — the dir is one level under cwd.)

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
- assert the "Listening…" notice latches to exactly one emission
- assert begin/turn/termination are all forwarded to the renderer in non-follow
  mode (pins the follow-vs-None handler wiring)
- assert a turn event with no end_of_turn flag is treated as non-final in --llm
  follow mode (pins the getattr default)
- assert the renderer is closed even when streaming raises mid-run

Remaining survivors (worker-thread daemon flag, 0.1s join poll interval) are
equivalent/threading-internal and not observable.

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
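
For the close-on-failure assertion above, a minimal sketch using a recording fake. `RecordingRenderer`, `run_stream`, and `failing_events` are hypothetical names; the real streaming command wires more handlers than this.

```python
class RecordingRenderer:
    """Fake renderer that records forwarded turns and whether it was closed."""

    def __init__(self) -> None:
        self.turns: list = []
        self.closed = False

    def turn(self, event: str) -> None:
        self.turns.append(event)

    def close(self) -> None:
        self.closed = True


def run_stream(events, renderer) -> None:
    """Hypothetical code under test: forward events, always close the renderer."""
    try:
        for event in events:
            renderer.turn(event)
    finally:
        renderer.close()


def failing_events():
    yield "hello"
    raise RuntimeError("connection dropped")


def test_renderer_closed_when_stream_raises() -> None:
    renderer = RecordingRenderer()
    try:
        run_stream(failing_events(), renderer)
    except RuntimeError:
        pass
    assert renderer.turns == ["hello"]
    assert renderer.closed
```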
The new 200/404 status assertions need the response code. Use http.client
instead of urllib.request.urlopen so a 404 is a normal response status (not a
raised HTTPError), the status types as int (no mypy no-any-return), and no
urllib audit (S310) suppression — hence no new noqa escape hatch — is needed.

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
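
A self-contained sketch of the `http.client` approach described above. `CallbackHandler` is a hypothetical loopback handler that answers 200 on `/callback` and 404 elsewhere; the status comes back as a plain `int` on the response object, with no exception to catch for the 404.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CallbackHandler(BaseHTTPRequestHandler):
    """Hypothetical loopback handler: 200 for /callback, 404 otherwise."""

    def do_GET(self) -> None:
        status = 200 if self.path.startswith("/callback") else 404
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format: str, *args: object) -> None:
        pass  # keep test output quiet


def get_status(port: int, path: str) -> int:
    connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        connection.request("GET", path)
        return connection.getresponse().status
    finally:
        connection.close()


def probe() -> tuple:
    server = HTTPServer(("127.0.0.1", 0), CallbackHandler)  # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return get_status(port, "/callback?code=abc"), get_status(port, "/nope")
    finally:
        server.shutdown()
        server.server_close()
```

With `urllib.request.urlopen`, the second request would raise `urllib.error.HTTPError`, and the test would need a `try`/`except` just to read the code.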
The blocksize floor (max(1, rate//10)) and the rate>0 guard only differ for
sub-10Hz / 1Hz rates no real device reports — near-equivalent mutants, like the
daemon/timeout ones left elsewhere. Their fakes needed a `fake_sd: Any` module
(pyright can't assign attributes to a bare ModuleType), which tripped the
"no net-new Any" gate. The valuable resample-params and channels/dtype
assertions remain and add no Any.

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
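
To make the near-equivalence concrete, a short sketch. `blocksize` mirrors the `max(1, rate//10)` expression quoted above; `blocksize_mutant` is one hypothetical mutation of the floor constant, chosen for illustration.

```python
def blocksize(rate: int) -> int:
    """About 100 ms of frames, never fewer than one."""
    return max(1, rate // 10)


def blocksize_mutant(rate: int) -> int:
    """Hypothetical mutant: the floor constant changed from 1 to 0."""
    return max(0, rate // 10)


def test_floor_only_matters_below_10hz() -> None:
    # Every realistic device rate gives the same answer with or without the floor.
    for rate in (8000, 16000, 44100, 48000):
        assert blocksize(rate) == blocksize_mutant(rate)
    # The two only diverge for a rate no real device reports.
    assert blocksize(5) == 1
    assert blocksize_mutant(5) == 0
```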
DEFAULT_ENV is already sandbox000, so the existing env tests couldn't tell the
`sandbox and env is None` override apart from the default. Bind a profile to
production and assert that a bare invocation keeps production while --sandbox
forces sandbox000 — killing the and/or and is/is-not mutants on that line.

(The two err=True echoes route to stderr; CliRunner mixes streams in this
version, so that routing isn't separately assertable here.)

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
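
A sketch of why the profile binding is needed to kill these mutants. `resolve_env` and its parameters are hypothetical; only the `sandbox and env is None` condition and the `sandbox000` default come from the commit message above.

```python
from typing import Optional

DEFAULT_ENV = "sandbox000"


def resolve_env(sandbox: bool, env: Optional[str], profile_env: Optional[str]) -> str:
    """Hypothetical resolution order: --sandbox override, explicit env, profile, default."""
    if sandbox and env is None:
        return "sandbox000"
    return env or profile_env or DEFAULT_ENV


def test_sandbox_override_is_distinguishable() -> None:
    # With no profile, both paths yield sandbox000, so the override is invisible.
    assert resolve_env(False, None, None) == resolve_env(True, None, None)
    # Bound to production, a bare invocation keeps production (kills `and` -> `or`)...
    assert resolve_env(False, None, "production") == "production"
    # ...while --sandbox forces sandbox000 (kills `is None` -> `is not None`).
    assert resolve_env(True, None, "production") == "sandbox000"
```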
- init: assert a resolved key omits the skipped-key row, and that the
  launch-skipped row (with the manual uvicorn command) appears only when deps
  were installed AND no key is present — pinning both guards
- setup: a direct _proc_detail test pinning stderr-then-stdout preference

Remaining survivors in these modules (subprocess flag kwargs, timeout/exit-code
constants, the no-TTY picker guard) are low-value infra/near-equivalent.

https://claude.ai/code/session_014MYaBvEeWEJXJt36CCSHtC
@alexkroman alexkroman merged commit 35d497c into main Jun 7, 2026
10 checks passed
@alexkroman alexkroman deleted the claude/mutant-testing-improvements-mGkhF branch June 7, 2026 01:23