Add Neo4j graph output (--emit neo4j)#4

Merged
rahlk merged 5 commits into main from feat/neo4j-migration on Jun 19, 2026

rahlk (Contributor) commented on Jun 19, 2026

Summary

Adds a Neo4j graph as a first-class output of the analyzer, alongside the existing analysis.json. The analysis engine is unchanged — this is a new output path plus the schema, tests, docs, and packaging around it.

Output targets (--emit)

  • json (default, unchanged) — the canonical analysis.json the Python SDK parses.
  • neo4j — projects the in-memory TSApplication into a labeled property graph:
    • without --neo4j-uri: writes a self-contained graph.cypher snapshot (constraints + indexes, a scoped wipe, then batched MERGE).
    • with --neo4j-uri: pushes to a live Neo4j over Bolt incrementally — only modules whose content hash changed are rewritten, and modules whose source file vanished are pruned on a full run.
  • schema — writes the machine-readable Neo4j schema contract (schema.json); needs no project.
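The incremental push described for --neo4j-uri can be pictured as a diff over per-module content hashes. What follows is a minimal sketch under stated assumptions, not the Bolt writer from this PR: `planIncrementalPush`, `PushPlan`, `ModuleHashes`, and the path-to-hash record shape are hypothetical names introduced only for illustration.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not taken from the PR.
// Keys are module file paths; values are content hashes.
type ModuleHashes = Record<string, string>;

interface PushPlan {
  rewrite: string[]; // modules that are new or whose hash changed
  prune: string[]; // modules recorded in the graph whose source file vanished
}

function has(map: ModuleHashes, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(map, key);
}

function planIncrementalPush(
  stored: ModuleHashes, // hashes already recorded in the graph (assumed shape)
  current: ModuleHashes, // hashes computed from the current analysis run
  fullRun: boolean, // pruning only applies on a full run
): PushPlan {
  const rewrite = Object.keys(current)
    .filter((path) => !has(stored, path) || stored[path] !== current[path])
    .sort();
  const prune: string[] = fullRun
    ? Object.keys(stored).filter((path) => !has(current, path)).sort()
    : [];
  return { rewrite, prune };
}
```

Unchanged modules appear in neither list, which is what keeps a repeated push idempotent in this sketch. Pruning is skipped on partial runs because a missing path then only means the module was not analyzed, not that its file is gone.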

Graph model

  • Signature-keyed declarations share a :Symbol label (one uniqueness constraint, one endpoint index); specific labels (:Class, :Callable, :Interface, …) are layered on top.
  • Call sites, decorators, class attributes, and variables are first-class nodes; entrypoints are a marker label.
  • Relationships: DECLARES, HAS_METHOD, HAS_ATTRIBUTE, DECLARES_VAR, HAS_CALLSITE, EXTENDS, IMPLEMENTS, CALLS, RESOLVES_TO, IMPORTS, RE_EXPORTS, MEMBER_OF, DECORATED_BY.
  • Properties are flattened to Neo4j primitives/arrays (nested data exploded into named props, parallel arrays, or JSON blobs).
  • schema_version is stamped on the :Application node for runtime producer/consumer drift detection.
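To make the property flattening concrete, here is a rough sketch of one possible rule set. It is an assumption, not the projection code in this PR: `flattenProps` is a hypothetical helper, the underscore-joined naming is invented for the example, and it only covers two of the three strategies mentioned (named props and JSON blobs), leaving out parallel arrays.

```typescript
// Hypothetical sketch: flatten a nested record into values Neo4j can store
// as properties (primitives or homogeneous arrays of primitives).
type Primitive = string | number | boolean;
type Neo4jValue = Primitive | Primitive[];

function isPrimitive(v: unknown): v is Primitive {
  return typeof v === "string" || typeof v === "number" || typeof v === "boolean";
}

// Neo4j list properties must hold elements of a single type.
function isHomogeneousPrimitiveArray(arr: unknown[]): boolean {
  return arr.every((v) => isPrimitive(v) && typeof v === typeof arr[0]);
}

function flattenProps(
  input: Record<string, unknown>,
  prefix = "",
  out: Record<string, Neo4jValue> = {},
): Record<string, Neo4jValue> {
  for (const key of Object.keys(input)) {
    const value = input[key];
    const name = prefix ? `${prefix}_${key}` : key; // assumed naming scheme
    if (value === null || value === undefined) continue; // Neo4j does not store null props
    if (isPrimitive(value)) {
      out[name] = value;
    } else if (Array.isArray(value)) {
      // Primitive arrays map directly; anything else is kept as a JSON blob.
      out[name] = isHomogeneousPrimitiveArray(value)
        ? (value as Primitive[])
        : JSON.stringify(value);
    } else {
      // Nested object: explode into named props.
      flattenProps(value as Record<string, unknown>, name, out);
    }
  }
  return out;
}
```

For example, `{ span: { start: 1, end: 9 } }` would become `span_start` and `span_end` under this naming assumption, while an array of objects would be serialized into a single string property.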

Schema contract + anti-drift

  • src/build/neo4j/catalog.ts is the single source of truth (labels, relationships, typed properties, constraints, indexes, SCHEMA_VERSION).
  • --emit schema serializes it; the checked-in schema.neo4j.json is regenerated with bun gen:schema.
  • A conformance test asserts the emitter never produces a label/relationship/property the catalog doesn't declare, and that the checked-in schema is current.
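The conformance idea can be sketched as a pure function that compares emitted graph elements against a catalog. This is an illustration under assumptions, not the test in this PR: the `Catalog`, `EmittedNode`, and `EmittedRel` shapes and the `findUndeclared` helper are hypothetical, and the real structure of src/build/neo4j/catalog.ts is not shown here.

```typescript
// Hypothetical sketch of an anti-drift check against a declarative catalog.
interface Catalog {
  labels: Record<string, string[]>; // label -> property names declared for it
  relationships: string[]; // declared relationship types
}

interface EmittedNode {
  labels: string[];
  props: Record<string, unknown>;
}

interface EmittedRel {
  type: string;
}

function findUndeclared(
  catalog: Catalog,
  nodes: EmittedNode[],
  rels: EmittedRel[],
): string[] {
  const violations: string[] = [];
  for (const node of nodes) {
    // A property is allowed if any of the node's labels declares it.
    const allowed: string[] = [];
    for (const label of node.labels) {
      if (!Object.prototype.hasOwnProperty.call(catalog.labels, label)) {
        violations.push(`label :${label}`);
        continue;
      }
      for (const prop of catalog.labels[label]) allowed.push(prop);
    }
    for (const prop of Object.keys(node.props)) {
      if (allowed.indexOf(prop) === -1) {
        violations.push(`property ${prop} on :${node.labels.join(":")}`);
      }
    }
  }
  for (const rel of rels) {
    if (catalog.relationships.indexOf(rel.type) === -1) {
      violations.push(`relationship ${rel.type}`);
    }
  }
  return violations;
}
```

A test built on this would project a fixture application, call the function, and assert that the returned list is empty, so any label, relationship, or property added to the emitter without a catalog entry fails the build.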

Install + distribution

  • One-line shell installer (packaging/install/cants-installer.sh) published as a release asset: curl … /releases/latest/download/cants-installer.sh | sh.
  • schema.json is bundled in each wheel and published as a release asset; codeanalyzer_typescript.schema_path() exposes it.

Docs

  • README rewritten (badges, table of contents, install, usage, output targets); the cants --help block is kept in sync with the CLI via bun gen:readme and regenerated at release time.
  • The PyPI long description now renders the repo root README (copied into the wheel at build time, single source of truth).

Tests

  • Schema conformance test (no container).
  • @testcontainers/neo4j integration test for the bolt writer (full push, idempotency, prune) — opt-in via RUN_CONTAINER_TESTS / bun run test:container, so the release gate stays fast and flake-free.
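The opt-in gate can be as small as one predicate over the environment. A minimal sketch, assuming that any non-empty value other than "0" or "false" enables the suite; the PR only names the RUN_CONTAINER_TESTS variable, so the accepted values and the `containerTestsEnabled` helper are hypothetical.

```typescript
// Hypothetical sketch: decide whether the container-backed suite should run.
// The environment is passed in so the predicate is easy to test in isolation.
function containerTestsEnabled(env: Record<string, string | undefined>): boolean {
  const raw = env["RUN_CONTAINER_TESTS"];
  if (raw === undefined) return false; // unset: skip by default
  const value = raw.trim().toLowerCase();
  return value !== "" && value !== "0" && value !== "false";
}
```

In a Bun test file, a predicate like this could feed a conditional skip such as `test.skipIf(...)`, so a plain `bun test` never pulls a Docker image.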

Breaking

  • Removes the msgpack output format (-f/--format); --emit replaces it.

Closes #3.

Projects the in-memory TSApplication to a labeled property graph and
writes it either as a self-contained graph.cypher snapshot or, with
--neo4j-uri, pushes it incrementally to a live Neo4j over Bolt.

JSON output is unchanged and stays the default. Drops the msgpack
output format and adds neo4j-driver.

All new code lives under src/build/neo4j/ (projection + snapshot writer +
bolt writer); emit() gains a third branch and is now async.

Closes #3.
rahlk added the enhancement (New feature or request) label on Jun 19, 2026
rahlk added 3 commits on June 19, 2026 at 17:44
Adds a declarative schema catalog (single source of truth) with a
SCHEMA_VERSION stamped onto the :Application node, a `cants --emit schema`
target that writes schema.json, and a checked-in schema.neo4j.json
(regenerate with `bun gen:schema`).

A conformance test asserts the emitter never produces a label,
relationship, or property the catalog doesn't declare, and that the
checked-in schema.json is current. Adds a @testcontainers/neo4j
integration test for the bolt writer (full push, idempotency, prune).

Packaging: build_wheels.sh emits schema.json next to the binary so it
ships in each wheel; codeanalyzer_typescript.schema_path() exposes it;
the release workflow also publishes schema.json as a GitHub Release asset.

Rewrites the README in the standard layout: centered logo + badges, a
table of contents, Features, Installation, Usage, and Output targets
(analysis.json / Neo4j / schema) — and drops the removed msgpack docs.

Adds a one-line shell installer (packaging/install/cants-installer.sh),
published as a release asset, so users can:
  curl -LsSf .../releases/latest/download/cants-installer.sh | sh

Keeps generated docs from drifting: `bun gen:readme` injects the live
`cants --help` between markers (via an extracted buildProgram()), and the
release workflow regenerates the README help block + schema.json and
commits them before publishing.

PyPI now renders the repo root README: build_wheels.sh copies it into the
wheel package at build time (gitignored, single source of truth), so the
stale packaging README is removed.

Gates the container-backed bolt suite behind RUN_CONTAINER_TESTS so the
release gate's `bun test` skips it — no Docker image pull, and no flake
that could trip the "delete tag on failure" step. Run it locally with
`bun run test:container`. The no-container schema conformance test still
runs by default.
rahlk marked this pull request as ready for review on June 19, 2026 at 22:27
Removed width attribute from logo image in README.
rahlk merged commit b7276db into main on Jun 19, 2026

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

Add a Neo4j graph output option

1 participant