codeanalyzer-python
Point canpy at a project and it builds one analysis in memory — a typed model of every module, class, method, and call edge, plus the framework entrypoints that reach them — then emits it the way you need it. It’s the Python backend behind CLDK, usable standalone as a CLI or a library.
One analysis, three output targets via --emit:
- analysis.json (default): the self-contained PyApplication artifact, loaded whole into memory by the consumer.
- Neo4j property graph (--emit neo4j): project the same model into a labeled property graph, either as a graph.cypher snapshot or as an incremental live push to Neo4j over Bolt. Every node label is Py-prefixed and every relationship type PY_-prefixed (:PyClass, PY_CALLS), so Java, TypeScript, and Python analyzers can share one database without label collisions. The graph is a queryable, persistent system of record that holds many applications at once, so cross-service questions become a Cypher traversal instead of parsing giant JSON blobs.
- Schema contract (--emit schema): the machine-readable, version-stamped Neo4j schema (schema_version 1.1.0), no project required.
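The three targets differ only in the flags passed to canpy, so a script that drives the CLI can assemble the command from a few optional arguments. The sketch below is a hypothetical helper, not part of codeanalyzer-python: it uses only the flags named on this page (--input, --emit, --app-name) and assumes that omitting --emit selects the default analysis.json output.

```python
import shlex
from typing import List, Optional


def build_canpy_command(
    project: Optional[str] = None,
    emit: Optional[str] = None,
    app_name: Optional[str] = None,
) -> List[str]:
    """Assemble a canpy argv list (hypothetical helper, flags taken from this page)."""
    cmd = ["canpy"]
    if project is not None:
        cmd += ["--input", project]
    if emit is not None:
        # Assumption: leaving --emit out yields the default analysis.json target.
        cmd += ["--emit", emit]
    if app_name is not None:
        cmd += ["--app-name", app_name]
    return cmd


# One command per output target described above.
json_cmd = build_canpy_command(project="./my-service")
graph_cmd = build_canpy_command(
    project="./my-service", emit="neo4j", app_name="my-service"
)
schema_cmd = build_canpy_command(emit="schema")  # no project required

for cmd in (json_cmd, graph_cmd, schema_cmd):
    print(shlex.join(cmd))
```

With canpy installed, each list can be handed to subprocess.run(cmd, check=True); keeping the command as a list avoids shell quoting problems with project paths.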
Start building
Emit to a Neo4j property graph
Build the analysis once and project it into a graph. Without --neo4j-uri, canpy writes a self-contained graph.cypher (constraints and indexes, a scoped wipe of this app’s prior subgraph, then batched MERGEs) that you load with cypher-shell:

    canpy --input ./my-service --emit neo4j --app-name my-service
    cypher-shell < graph.cypher

With --neo4j-uri, it pushes to a live Neo4j over Bolt incrementally: only modules whose content hash changed are rewritten, and on a full run modules whose source file vanished are pruned. The push is scoped to the :PyApplication anchor named by --app-name, so writing one application never clobbers another’s modules in a shared database:

    export NEO4J_URI=bolt://localhost:7687
    export NEO4J_PASSWORD=…   # prefer the env var so it stays out of shell history
    canpy --input ./my-service --emit neo4j --app-name my-service

The live push needs the neo4j driver extra (pip install 'codeanalyzer-python[neo4j]'); the snapshot and schema modes need nothing extra.
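The incremental behaviour described above (rewrite modules whose content hash changed, prune modules whose file vanished) can be pictured as a diff between stored and current hashes. The following is a minimal sketch of that idea only; the SHA-256 choice, the dictionary shapes, and the function names are assumptions for illustration and do not describe canpy's actual implementation.

```python
import hashlib
from typing import Dict, List, Tuple


def content_hash(source: str) -> str:
    """Hash a module's source text (SHA-256 is an assumption, not canpy's documented choice)."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def plan_incremental_push(
    stored_hashes: Dict[str, str],
    current_sources: Dict[str, str],
) -> Tuple[List[str], List[str]]:
    """Return (modules to rewrite, modules to prune) for one application.

    stored_hashes: module path -> hash recorded by the previous push.
    current_sources: module path -> source text found in this run.
    """
    current_hashes = {
        path: content_hash(src) for path, src in current_sources.items()
    }
    # New or changed modules: no stored hash, or a stored hash that differs.
    to_rewrite = sorted(
        path for path, h in current_hashes.items() if stored_hashes.get(path) != h
    )
    # Modules recorded earlier whose source file no longer exists.
    to_prune = sorted(path for path in stored_hashes if path not in current_hashes)
    return to_rewrite, to_prune


# Hypothetical module paths and contents, for illustration only.
stored = {
    "app/a.py": content_hash("x = 1\n"),
    "app/b.py": content_hash("y = 2\n"),
    "app/old.py": content_hash("z = 3\n"),
}
current = {
    "app/a.py": "x = 1\n",   # unchanged
    "app/b.py": "y = 20\n",  # edited
    "app/new.py": "w = 4\n",  # added
}
rewrite, prune = plan_incremental_push(stored, current)
print(rewrite, prune)
```

Unchanged modules appear in neither list, which is what keeps a repeated push cheap when only a few files were edited.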
Read the graph back with CLDK
A separate job populates the graph out of band; consumers just read it. The CLDK Python SDK has a read-only Neo4j backend: point it at the Bolt URI and it reconstructs the same typed PyClass/PyCallable objects and the same networkx call graph as the in-process analyzer, with no JDK, no native binary, and no project source on the consumer. It only needs the graph and read-only credentials.
    from cldk import CLDK
    from cldk.analysis.commons.backend_config import Neo4jConnectionConfig

    analysis = CLDK.python(
        backend=Neo4jConnectionConfig(
            uri="bolt://localhost:7687",
            username="neo4j",
            password="neo4j",
            application_name="my-service",  # matches canpy --app-name
        ),
    )
    classes = analysis.get_classes()  # Dict[str, PyClass]
    cg = analysis.get_call_graph()    # networkx.DiGraph keyed by callable signatures

application_name matches the --app-name the graph was loaded with, scoping every query to that one application. The neo4j driver is an optional extra here too: pip install cldk[neo4j].
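Because the call graph comes back as a networkx.DiGraph, ordinary networkx traversals apply to it. The sketch below uses a small stand-in graph so it runs without a database; the signature strings are hypothetical, and it assumes edges point from caller to callee, which this page does not state.

```python
import networkx as nx


def transitive_callers(call_graph: nx.DiGraph, signature: str) -> set:
    """Every callable that can reach `signature`, directly or indirectly."""
    # Assumption: edges run caller -> callee, so callers are graph ancestors.
    return nx.ancestors(call_graph, signature)


def direct_callees(call_graph: nx.DiGraph, signature: str) -> list:
    """Callables invoked directly by `signature`."""
    return sorted(call_graph.successors(signature))


# Stand-in for analysis.get_call_graph(); node names are made up.
demo_cg = nx.DiGraph()
demo_cg.add_edge("app.api.handle_request", "app.service.process")
demo_cg.add_edge("app.cli.main", "app.service.process")
demo_cg.add_edge("app.service.process", "app.db.save")

callers = transitive_callers(demo_cg, "app.db.save")
callees = direct_callees(demo_cg, "app.service.process")
print(sorted(callers))
print(callees)
```

Swapping demo_cg for the graph returned by get_call_graph() should work the same way, provided the node keys match the signatures you query.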