feat(neo4j): J-namespaced, lossless Neo4j graph output (#154) #155
Merged
Port the codeanalyzer-typescript 0.4.0 Neo4j feature to Java with the same
arg entrypoints:
--emit json|neo4j|schema (default json)
--app-name, --neo4j-uri, --neo4j-user, --neo4j-password, --neo4j-database
New com.ibm.cldk.neo4j package:
- GraphProjector: pure projection of the symbol table (+ level-2 call graph)
to graph rows. Type/Callable share a :Symbol identity; call sites, fields,
parameters, variables, enum constants, record components are first-class
nodes; annotations/packages are shared; entrypoints are a marker label;
every unit-owned node carries a _unit provenance prop.
- CypherWriter: self-contained graph.cypher snapshot (constraints, scoped
wipe, batched UNWIND/MERGE).
- BoltWriter: live incremental push over Bolt — diffs each compilation unit's
content_hash, replaces only changed units (idempotent MERGE), prunes
vanished units on a full run. Uses neo4j-java-driver 4.4.x (JDK 11/native).
- SchemaCatalog + Schema: the in-repo graph contract (labels, relationships,
typed properties, DDL); --emit schema serializes it to schema.json.
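The incremental push described for BoltWriter (replace only units whose content_hash changed, prune units that vanished) can be sketched as a pure diff over two hash maps. This is a minimal illustration, not the PR's code: `UnitDiff`, its method names, and the map-based representation of stored and current hashes are all assumptions.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: plans an incremental push from per-unit content hashes. */
final class UnitDiff {

    /** Units that are new or whose hash differs from the stored one. */
    static Set<String> changed(Map<String, String> stored, Map<String, String> current) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, String> e : current.entrySet()) {
            // stored.get(...) is null for a unit the database has never seen.
            if (!e.getValue().equals(stored.get(e.getKey()))) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Units the database knows about that are absent from this run. */
    static Set<String> vanished(Map<String, String> stored, Map<String, String> current) {
        Set<String> result = new TreeSet<>(stored.keySet());
        result.removeAll(current.keySet());
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> stored = Map.of("A.java", "h1", "B.java", "h2", "C.java", "h3");
        Map<String, String> current = Map.of("A.java", "h1", "B.java", "h2x", "D.java", "h4");
        // prints: changed=[B.java, D.java] vanished=[C.java]
        System.out.println("changed=" + changed(stored, current)
                + " vanished=" + vanished(stored, current));
    }
}
```

Pruning the vanished set is only safe when the run covered every unit, which is presumably why the commit limits it to a full run.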
Tests:
- Neo4jSchemaConformanceTest (no container): anti-drift guard asserting the
projector never emits a label/rel/property the catalog doesn't declare, and
that schema.neo4j.json is current.
- Neo4jBoltWriterTest (opt-in, Testcontainers Neo4j): full push, idempotent
re-push, and orphan pruning against a real database. Runs only when
RUN_CONTAINER_TESTS is set.
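The anti-drift guard boils down to a set difference: anything the projector emits that the catalog does not declare is drift. A rough sketch of that check, assuming the emitted and declared names are available as plain string sets; `DriftCheck` and the label names in `main` are illustrative and not taken from the actual test.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of a schema-conformance (anti-drift) check. */
final class DriftCheck {

    /** Returns emitted names the catalog does not declare; empty means no drift. */
    static Set<String> undeclared(Set<String> emitted, Set<String> declared) {
        Set<String> extra = new TreeSet<>(emitted);
        extra.removeAll(declared);
        return extra;
    }

    public static void main(String[] args) {
        // Illustrative names only; the real catalog lives in SchemaCatalog.
        Set<String> declared = Set.of("Symbol", "CallSite", "Field");
        Set<String> emitted = Set.of("Symbol", "CallSite", "Widget");
        // prints: undeclared=[Widget]
        System.out.println("undeclared=" + undeclared(emitted, declared));
    }
}
```

A test built this way would run the same check three times (labels, relationship types, property keys) and fail if any result is non-empty.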
Docs/release/packaging:
- README: install one-liner + Neo4j graph output section + refreshed --help.
- release.yml: publish codeanalyzer.jar, schema.json and the installer as
release assets, with cargo-dist-style release notes.
- packaging/install/codeanalyzer-installer.sh: curl/wget installer that fetches
the jar and drops a `codeanalyzer` launcher on PATH.
- neo4j-schema.drawio: diagram of the emitted property-graph schema.
- schema.neo4j.json: checked-in graph contract. Bump version to 2.4.0.
…e image can analyze (#153)

The GraalVM native image crashed on every project with
`java.lang.NoSuchFieldError: variables`. JavaParser's metamodel
(PropertyMetaModel.getValue) reflects over AST node fields, which aren't
registered under native-image's closed world, so any parse died before
reaching any emit target. Only `--emit schema` (no analysis) worked.

Fix: register all 266 `com.github.javaparser.ast.**` classes with
allDeclaredFields/Methods/Constructors, plus the tracing-agent-captured
reflect/jni/resource entries for the parse path. Per-fixture tracing-agent
capture alone did not generalize: unseen apps hit fresh NoSuchFieldErrors
like `pairs`/`elements`.

Verified: native `--emit json -a 1` and `--emit neo4j` now succeed on every
test fixture, including the large real apps (daytrader8, plantsbywebsphere)
that previously crashed.

Note: a residual, native-inherent limitation remains. Symbol resolution via
JavaParser's ReflectionTypeSolver is ~20% degraded on JDK/dependency types
not registered for reflection (+ version reports "unknown"). The cldk SDK
sidesteps this by running the jar on a bundled HotSpot JVM (full fidelity),
so the native gap affects only direct use of the standalone native binary.
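For readers unfamiliar with native-image reflection metadata, the blanket registration amounts to one `reflect-config.json` entry per class with the three `allDeclared*` flags set. The sketch below renders such entries from a list of class names. How the PR actually produced its 266 entries is not stated, so `ReflectConfigSketch` and the two-class input are assumptions.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical generator for blanket reflect-config.json entries. */
final class ReflectConfigSketch {

    /** One entry that opens all declared fields, methods and constructors. */
    static String entry(String className) {
        return "  {\"name\": \"" + className + "\", "
                + "\"allDeclaredFields\": true, "
                + "\"allDeclaredMethods\": true, "
                + "\"allDeclaredConstructors\": true}";
    }

    /** Renders the entries as a JSON array, one entry per line. */
    static String render(List<String> classNames) {
        return classNames.stream()
                .map(ReflectConfigSketch::entry)
                .collect(Collectors.joining(",\n", "[\n", "\n]"));
    }

    public static void main(String[] args) {
        // Two of the JavaParser AST classes, as a small sample.
        System.out.println(render(List.of(
                "com.github.javaparser.ast.CompilationUnit",
                "com.github.javaparser.ast.body.MethodDeclaration")));
    }
}
```

Registering every AST class trades a somewhat larger image for not depending on which node types the tracing agent happened to see.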
… Bolt seam (#154)

Brings the Java Neo4j backend to parity with the Python/TypeScript siblings
and makes it a lossless projection of the IR.

Namespacing (so a Java graph can share a Neo4j DB with Py*/TS* graphs):
- all node labels J-prefixed, all relationship types J_-prefixed,
constraint/index names j_-prefixed.
- provenance property renamed _unit -> _module (matches the siblings).
- --emit schema now writes schema.neo4j.json (matches the checked-in
contract); release asset + README updated.
- NEO4J_URI/USERNAME/PASSWORD/DATABASE env-var fallback (flag > env > default).

Lossless projection (every Lombok entity field is represented):
- new first-class nodes :JInitializationBlock, :JCrudOperation, :JCrudQuery,
:JComment, with J_HAS_INIT_BLOCK / J_HAS_CRUD_OPERATION / J_HAS_CRUD_QUERY /
J_HAS_COMMENT and J_HAS_CALLSITE/J_DECLARES_VAR extended to init blocks.
- added scalar props: is_modified, file_path (callable),
variable_initializers_json, default_value, argument_expr, is_unspecified,
start/end_column on params & vars, docstrings, and
source_kind/destination_kind on J_CALLS.
- CypherWriter.DESCENDANTS extended so the scoped wipe / orphan prune reach
the new containment edges.

Packaging seam (lets the GraalVM native image prune the driver):
- BoltConfig + BoltSink extracted as driver-free core types; BoltWriter is the
only class importing org.neo4j.driver.* and is loaded reflectively by
Neo4jEmitter, so the fat jar bundles the driver (live Bolt push works) while
native-image, which never statically references it, falls back to writing
graph.cypher.

Schema contract regenerated (16 node labels, 20 relationship types, 15
constraints); conformance test updated; neo4j-schema.drawio refreshed to the
J-prefixed schema with the call graph (J_CALLS / J_RESOLVES_TO) drawn
explicitly.
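The "flag > env > default" precedence for connection settings can be expressed as a small resolver. The sketch below is an assumption about shape, not the PR's code: `SettingResolver` is hypothetical, the default values in `main` are placeholders (the commit does not state the real defaults), and treating an empty string as unset is a choice made here.

```java
import java.util.Map;

/** Hypothetical resolver implementing flag > env > default. */
final class SettingResolver {

    /**
     * Returns the flag value if given, else the environment value, else the default.
     * Empty strings are treated as "not set" (an assumption of this sketch).
     */
    static String resolve(String flagValue, Map<String, String> env,
                          String envKey, String defaultValue) {
        if (flagValue != null && !flagValue.isEmpty()) {
            return flagValue;
        }
        String fromEnv = env.get(envKey);
        if (fromEnv != null && !fromEnv.isEmpty()) {
            return fromEnv;
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        // A fixed map stands in for System.getenv() so the example is deterministic.
        Map<String, String> env = Map.of("NEO4J_URI", "bolt://db.example:7687");
        String fallbackUri = "bolt://localhost:7687"; // placeholder default

        // Flag wins over env: prints bolt://flag.example:7687
        System.out.println(resolve("bolt://flag.example:7687", env, "NEO4J_URI", fallbackUri));
        // No flag, env set: prints bolt://db.example:7687
        System.out.println(resolve(null, env, "NEO4J_URI", fallbackUri));
        // Neither flag nor env: prints neo4j
        System.out.println(resolve(null, env, "NEO4J_DATABASE", "neo4j"));
    }
}
```

Passing the environment in as a map, rather than calling `System.getenv()` inside the resolver, keeps the precedence logic testable without touching the process environment.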
Closes #154. Also carries the #153 native-image fix (already merged-as-closed).
What this does
Brings the Java Neo4j backend (`--emit neo4j`) to parity with the Python/TS siblings and makes it a lossless projection of the analysis IR.

Namespacing (shared-DB safe)
- Node labels `J`-prefixed, rel types `J_`-prefixed, constraint/index names `j_`-prefixed.
- `_unit` → `_module` (matches siblings).
- `--emit schema` → `schema.neo4j.json`; `NEO4J_URI/USERNAME/PASSWORD/DATABASE` env fallback (flag > env > default).

Lossless projection
- New first-class nodes `:JInitializationBlock`, `:JCrudOperation`, `:JCrudQuery`, `:JComment` (+ their rels); `J_HAS_CALLSITE`/`J_DECLARES_VAR` extended to init blocks.
- Added scalar props (`is_modified`, `file_path`, `variable_initializers_json`, `default_value`, `argument_expr`, `is_unspecified`, parameter/variable columns, docstrings, `J_CALLS` `source_kind`/`destination_kind`).
- `CypherWriter.DESCENDANTS` extended so the scoped wipe / orphan-prune reach the new containment edges.

Driver-free Bolt seam
`BoltConfig` + `BoltSink` are driver-free core types; `BoltWriter` is the only class importing `org.neo4j.driver.*` and is loaded reflectively. The fat jar bundles the driver (live Bolt push works); the GraalVM native image never statically references it and falls back to writing `graph.cypher`.

Native (#153)
- Carries the #153 fix for the native-image crash (`NoSuchFieldError` on every parse).

Verification
- `schema.neo4j.json` matches the catalog.
- `--emit json` / `--emit neo4j` verified on real apps (daytrader8, plantsbywebsphere).
- `neo4j-schema.drawio` refreshed to the J-prefixed schema, with the call graph (`J_CALLS`/`J_RESOLVES_TO`) drawn explicitly.

Not included
PyPI/JDK distribution scaffolding was intentionally left out of this PR.
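For reviewers who want a picture of the driver-free Bolt seam described above, here is a minimal sketch of the pattern: core code depends only on an interface, the driver-backed implementation is looked up by name at runtime, and a file-based sink is used when that lookup fails. All names here (`ReflectiveSinkLoader`, `Sink`, `FileSink`, `LiveSink`) are hypothetical stand-ins; the real types are `BoltSink`, `BoltWriter` and `Neo4jEmitter`, whose signatures are not shown in this PR description.

```java
/** Hypothetical sketch of reflective loading with a driver-free fallback. */
final class ReflectiveSinkLoader {

    /** Stand-in for the driver-free sink interface. */
    interface Sink {
        String describe();
    }

    /** Fallback that needs no database driver on the classpath. */
    static final class FileSink implements Sink {
        @Override
        public String describe() {
            return "graph.cypher";
        }
    }

    /** Stand-in for the driver-backed implementation. */
    static final class LiveSink implements Sink {
        @Override
        public String describe() {
            return "live Bolt push";
        }
    }

    /** Loads the named sink reflectively; falls back if it cannot be loaded. */
    static Sink load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return (Sink) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | LinkageError | ClassCastException e) {
            // Class (or one of its dependencies) is missing or unusable.
            return new FileSink();
        }
    }

    public static void main(String[] args) {
        // prints: live Bolt push
        System.out.println(load(LiveSink.class.getName()).describe());
        // prints: graph.cypher
        System.out.println(load("org.example.MissingDriverBackedWriter").describe());
    }
}
```

Because nothing references the driver-backed class statically, a closed-world build has no reason to include it, which matches the fallback behaviour the description reports for the native image.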