Skip to content

Default to production environment; drop login state nonce#44

Merged
alexkroman merged 3 commits into
mainfrom
switch-default-env-to-production
Jun 8, 2026
Merged

Default to production environment; drop login state nonce#44
alexkroman merged 3 commits into
mainfrom
switch-default-env-to-production

Conversation

@alexkroman

@alexkroman alexkroman commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Flip DEFAULT_ENV from sandbox000 to production now that the prod Stytch redirect is live, and wire in the real production public token. Profiles with no stored env now resolve to production; --sandbox (or --env sandbox000 / AAI_ENV) still selects the sandbox. Profiles with a stored env (e.g. an existing default pinned to sandbox000) are unaffected until they re-login with --env production.
  • Remove the loopback CSRF state nonce. aai login now sends a bare http://127.0.0.1:8585/callback redirect with no query parameters. Stytch validates the redirect's query string as well as its path, and the production project only has the bare …/callback registered, so the nonce produced query_params_do_not_match (400). Dropping it makes prod login work with no Stytch dashboard change.
  • Lift the Stytch public tokens to module constants so the stytch_public_token= kwarg takes a name rather than a string literal — this removes both # noqa: S106 comments (S106 only fires on inline literals).
  • Make test_shell_completion_is_available deterministic: Typer detects the shell via shellingham, which needs a controlling TTY that CliRunner/CI lack, so the test now pins the detected shell.
  • Raise a clean error on keyring write failure. set_api_key / set_session called keyring.set_password directly, so a locked or ACL-denied OS keychain (e.g. macOS errSecInvalidOwnerEdit, -25244) escaped as a raw PasswordSetError traceback during aai login. Both writes now go through _keyring_set, which converts keyring.errors.KeyringError into a CLIError with a fixable suggestion.
  • Make browser-login credential writes atomic. The key/env/session were three separate keyring+config writes, so a mid-sequence failure left a half-written profile (a key with no session — looks signed-in but can't reach AMS). New config.persist_login runs the three writes and rolls back to the pre-login snapshot on any failure (config.toml rewritten verbatim in one atomic dump; keyring entries restored best-effort).

⚠️ Security tradeoff (deliberate)

Removing the state nonce drops the login-CSRF / account-confusion protection on the loopback callback (an automated review flagged this HIGH). This was an explicit decision to avoid a backend/dashboard change. The protection-preserving alternative is to register http://127.0.0.1:8585/callback?state=* in each Stytch project and restore the nonce binding (build_start_url(state) + capture_callback(expected_state) with secrets.compare_digest). Revisit if/when the wildcard redirect is registered.

Test plan

  • ./scripts/check.sh passes end-to-end (lint, mypy/pyright, vulture/deptry/import-linter/xenon, 90% branch + 100% patch coverage, mutation gate, escape-hatch gate, build + twine).
  • Production browser login verified manually: the OAuth flow completes against api.stytch.com with the bare redirect (no query_params_do_not_match).
  • New unit tests cover the keyring-failure path (set_api_key/set_session) and the atomic-rollback path (persist_login rolls back to empty and to prior credentials when the session write fails).

🤖 Generated with Claude Code

Flip DEFAULT_ENV from sandbox000 to production now that the prod Stytch
redirect is live, and wire in the real production public token. --sandbox
(or --env sandbox000 / AAI_ENV) still selects the sandbox.

Remove the loopback CSRF state nonce so aai login sends a bare
/callback redirect with no query parameters. Stytch validates the
redirect's query string too, and the prod project only registers the
bare path, so the nonce produced query_params_do_not_match. Dropping it
avoids any Stytch dashboard change at the cost of the login-CSRF /
account-confusion protection (deliberate, see PR notes).

Also:
- Lift the Stytch public tokens to module constants so the constructor
  kwarg takes a name, not a literal, dropping both # noqa: S106.
- Make the shell-completion smoke test deterministic by pinning the
  detected shell (shellingham needs a TTY that CliRunner/CI lack).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
**Environment binding.** The backend environment is selected by `--env`
(or `AAI_ENV`, or the profile's stored env). `--sandbox` is shorthand for
`--env sandbox000`. The default environment is currently `sandbox000`.
`--env sandbox000`. The default environment is `production`.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the documented default environment to 'production'. This can cause users to unknowingly run commands against production; avoid defaulting to production or clearly warn and require explicit opt-in.

Details

✨ AI Reasoning
​The change makes the CLI's documented default environment 'production' instead of 'sandbox000'. This increases the chance that users (especially new users or CI without explicit --env) will run commands against production, potentially performing destructive or sensitive operations without explicit confirmation. The modification narrows the safety margin provided by a sandbox default and therefore meaningfully worsens the security posture relative to the prior state.

🔧 How do I fix it?
Ensure skill actions match the description. Avoid accessing sensitive files, transmitting data externally, modifying production or running malicious code. Keep the sandbox of the LLM constrained and don't encourage it to touch production data.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

alexkroman-assembly and others added 2 commits June 8, 2026 11:38
set_api_key / set_session called keyring.set_password directly, so a
locked or ACL-denied OS keychain (e.g. macOS errSecInvalidOwnerEdit,
-25244) escaped as a raw PasswordSetError traceback during `aai login`.
Wrap both writes in _keyring_set, which converts keyring.errors.KeyringError
into a CLIError with a fixable suggestion — matching the project rule that
commands never print tracebacks for expected failures.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
persist_browser_login wrote the API key, env, and session as three
separate keyring/config writes, so a mid-sequence failure (e.g. a locked
keychain after the key was already stored) left a half-written profile —
an API key with no session, which looks signed-in but can't reach AMS.

Add config.persist_login, which runs the three writes and, on any
failure, restores the pre-login snapshot: config.toml is rewritten
verbatim in one atomic dump and the two keyring entries are restored
best-effort (try/finally + done flag, so no blind except). Rewire
persist_browser_login onto it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexkroman alexkroman merged commit 74fefde into main Jun 8, 2026
11 checks passed
@alexkroman alexkroman deleted the switch-default-env-to-production branch June 8, 2026 18:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants