Call graph schema
At analysis level 2, codeanalyzer-java runs WALA over the compiled program and adds a call_graph array to the output. Each element is one caller→callee edge.
    {
      "call_graph": [
        {
          "source": {
            "file_path": "...",
            "type_declaration": "...",
            "signature": "...",
            "callable_declaration": "..."
          },
          "target": {
            "file_path": "...",
            "type_declaration": "...",
            "signature": "...",
            "callable_declaration": "..."
          },
          "type": "CALL",
          "weight": "1"
        }
      ]
    }

Edge shape
    {
      source: CallableVertex  // The caller
      target: CallableVertex  // The callee
      type: string            // Edge kind, e.g. "CALL"
      weight: string          // Call multiplicity (usually "1")
    }

Vertex shape (CallableVertex)
Both source and target identify a method or constructor:
    {
      file_path: string             // File the callable lives in
      type_declaration: string      // Declaring type
      signature: string             // "methodName(Type1, Type2)"
      callable_declaration: string  // Full declaration text
    }

The signature matches the keys used in the symbol table’s callable_declarations, so you can join an edge endpoint back to its full callable record (body, complexity, annotations, …).
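The join described above can be sketched in a few lines of Python. This is a minimal illustration, not the SDK's own code: it assumes the symbol table is nested as file path → type_declarations → callable_declarations, and the helper name `lookup_callable`, the sample paths, and the `cyclomatic_complexity` field are hypothetical.

```python
from typing import Any, Optional


def lookup_callable(symbol_table: dict[str, Any], vertex: dict[str, str]) -> Optional[dict[str, Any]]:
    """Return the symbol-table record for one call-graph vertex, or None if absent.

    Assumed layout (not confirmed by this page):
    symbol_table[file_path]["type_declarations"][type]["callable_declarations"][signature]
    """
    unit = symbol_table.get(vertex["file_path"], {})
    type_decl = unit.get("type_declarations", {}).get(vertex["type_declaration"], {})
    return type_decl.get("callable_declarations", {}).get(vertex["signature"])


# Hand-made sample data; the record fields are placeholders for illustration.
symbol_table = {
    "src/Foo.java": {
        "type_declarations": {
            "com.example.Foo": {
                "callable_declarations": {
                    "bar(int)": {"signature": "bar(int)", "cyclomatic_complexity": 2},
                }
            }
        }
    }
}
vertex = {
    "file_path": "src/Foo.java",
    "type_declaration": "com.example.Foo",
    "signature": "bar(int)",
    "callable_declaration": "public void bar(int x)",
}

record = lookup_callable(symbol_table, vertex)
```

Returning None instead of raising keeps the lookup usable on endpoints that have no symbol-table entry.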
Working with the edges
Because edges are flat, the natural move is to load them into a graph library. The CLDK Python SDK does exactly this, exposing the call graph as a networkx.DiGraph:
    from cldk import CLDK
    from cldk.analysis import AnalysisLevel
    import networkx as nx

    analysis = CLDK.java(
        project_path="commons-cli",
        analysis_level=AnalysisLevel.call_graph,  # -> runs with -a 2
    )

    cg = analysis.get_call_graph()             # networkx.DiGraph
    # source_node and sink_node are two nodes taken from cg
    nx.has_path(cg, source_node, sink_node)    # reachability as a graph query

If you consume the JSON directly, the same idea applies: build an adjacency map from source → target and run your traversal of choice.
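For the JSON route, a minimal standard-library sketch might look like the following. It assumes a vertex can be identified by the pair (type_declaration, signature); that choice, the helper names, and the sample edges are illustrative rather than part of the schema.

```python
from collections import defaultdict, deque


def vertex_key(vertex):
    # Assumption: (declaring type, signature) is unique enough to identify a callable.
    return (vertex["type_declaration"], vertex["signature"])


def build_adjacency(call_graph):
    """Map each caller key to the set of callee keys it invokes."""
    adjacency = defaultdict(set)
    for edge in call_graph:
        adjacency[vertex_key(edge["source"])].add(vertex_key(edge["target"]))
    return adjacency


def has_path(adjacency, source, sink):
    """Breadth-first search: True if sink is reachable from source."""
    if source == sink:
        return True
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for callee in adjacency.get(node, ()):
            if callee == sink:
                return True
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return False


def _edge(src_type, src_sig, dst_type, dst_sig):
    # Builds a sample edge carrying only the fields this sketch reads.
    return {
        "source": {"type_declaration": src_type, "signature": src_sig},
        "target": {"type_declaration": dst_type, "signature": dst_sig},
        "type": "CALL",
        "weight": "1",
    }


# With a real run, call_graph would be the "call_graph" array of the loaded JSON.
call_graph = [
    _edge("A", "main(String[])", "B", "run()"),
    _edge("B", "run()", "C", "save()"),
]
adjacency = build_adjacency(call_graph)
reachable = has_path(adjacency, ("A", "main(String[])"), ("C", "save()"))
```

The `seen` set makes the search terminate on recursive or mutually recursive call cycles.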
The same edges in the Neo4j graph
When you emit to Neo4j (--emit neo4j) instead of JSON, these edges are projected as a first-class relationship rather than a flat array. Each caller→callee pair becomes a typed J_CALLS edge between the two :JCallable nodes:
    (:JCallable)-[:J_CALLS {type, weight, source_kind, destination_kind}]->(:JCallable)

The edge properties carry the same type and weight you see in JSON, plus source_kind and destination_kind describing the endpoints. The endpoints are the same :JCallable nodes the symbol-table projection already created, so a call edge and the method bodies it connects live in one graph and can be queried together. See the Neo4j graph schema for the full node-and-relationship reference.
Two projection rules are worth stating plainly, because they shape what you can and can’t query:
- J_CALLS exists only at -a 2. Level 1 emits the lossless symbol-table subgraph with types, methods, and fields but no call edges, exactly mirroring a level-1 analysis.json. Combining -t/--target-files with -a 2 downgrades the run to level 1, so a targeted incremental push refreshes structure without recomputing J_CALLS.
- J_CALLS is gated to resolved application callables. A call edge is kept only when both endpoints were emitted as :JCallable nodes. Calls into the JDK or third-party jars therefore do not appear as J_CALLS edges, which is the same boundary as the in-memory call graph. The projector keys vertices off callable_declaration (the raw declaration signature) rather than the display signature, which is what lets constructor edges resolve to their target nodes instead of dangling (fix #158).
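The gating rule can be pictured as a simple filter over the flat edge list. The sketch below is an illustration of the rule only, not the projector's actual code: the function name, the sample declarations, and the decision to key each vertex by (type_declaration, callable_declaration) are assumptions made for the example.

```python
def gate_edges(call_graph, emitted):
    """Keep only edges whose source and target are both in the emitted set.

    `emitted` is assumed to hold one (type_declaration, callable_declaration)
    key per callable that was emitted as a node.
    """
    kept = []
    for edge in call_graph:
        src = (edge["source"]["type_declaration"], edge["source"]["callable_declaration"])
        dst = (edge["target"]["type_declaration"], edge["target"]["callable_declaration"])
        if src in emitted and dst in emitted:
            kept.append(edge)
    return kept


# Hypothetical application callables that were emitted as nodes.
emitted = {
    ("com.example.Cart", "public void add(Item item)"),
    ("com.example.Item", "public Item(String name)"),
}

call_graph = [
    {   # application -> application constructor: both endpoints emitted, kept
        "source": {"type_declaration": "com.example.Cart", "callable_declaration": "public void add(Item item)"},
        "target": {"type_declaration": "com.example.Item", "callable_declaration": "public Item(String name)"},
        "type": "CALL", "weight": "1",
    },
    {   # application -> JDK: target was never emitted, dropped
        "source": {"type_declaration": "com.example.Cart", "callable_declaration": "public void add(Item item)"},
        "target": {"type_declaration": "java.util.List", "callable_declaration": "boolean add(E e)"},
        "type": "CALL", "weight": "1",
    },
]

kept = gate_edges(call_graph, emitted)
```

A practical consequence is that a reachability query can only traverse application code; a path that passes through a library callback will not be found.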
Reachability as a Cypher traversal
The networkx has_path check above has a direct graph-database analogue. Scope to one application by its :JApplication anchor (the --app-name the graph was loaded with) and ask Cypher for a path along J_CALLS:
    MATCH (app:JApplication {name: $appName})
    MATCH (app)-[:J_HAS_UNIT]->(:JCompilationUnit)-[:J_DECLARES_TYPE]->(:JType)
          -[:J_HAS_CALLABLE]->(src:JCallable {signature: $sourceSig})
    MATCH (dst:JCallable {signature: $sinkSig})
    RETURN exists((src)-[:J_CALLS*1..]->(dst)) AS reachable

Because the graph is persistent and multi-tenant (many applications, each anchored at its own :JApplication, in one database), this traversal runs without re-analyzing anything. The CLDK SDK reads the same edges back as a networkx.DiGraph over a read-only Neo4j connection, so the Python example above is unchanged except for the backend:
    from cldk import CLDK
    from cldk.analysis import AnalysisLevel
    from cldk.analysis.commons.backend_config import Neo4jConnectionConfig
    import networkx as nx

    analysis = CLDK.java(
        analysis_level=AnalysisLevel.call_graph,
        backend=Neo4jConnectionConfig(
            uri="bolt://localhost:7687",
            username="neo4j",
            password="neo4j",
            application_name="daytrader8",  # == the --app-name the graph was loaded with
        ),
    )

    cg = analysis.get_call_graph()             # networkx.DiGraph, rebuilt from J_CALLS
    # source_node and sink_node are two nodes taken from cg
    nx.has_path(cg, source_node, sink_node)    # identical query, no JDK or project source

The analysis is produced once, out of band, by a job running codeanalyzer -a 2 --emit neo4j, and read cheaply everywhere after that. Read-only credentials are sufficient on the consumer.
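If you would rather issue the Cypher reachability query yourself than go through the SDK, a thin wrapper is enough. The sketch below assumes `session` is a session from the official neo4j Python driver (anything with a compatible `run(query, **parameters)` method works); the function name and the way multiple matches are combined are choices made for this example.

```python
# Same query as in the Cypher section above, with its three parameters.
REACHABILITY_QUERY = """
MATCH (app:JApplication {name: $appName})
MATCH (app)-[:J_HAS_UNIT]->(:JCompilationUnit)-[:J_DECLARES_TYPE]->(:JType)
      -[:J_HAS_CALLABLE]->(src:JCallable {signature: $sourceSig})
MATCH (dst:JCallable {signature: $sinkSig})
RETURN exists((src)-[:J_CALLS*1..]->(dst)) AS reachable
"""


def is_reachable(session, app_name, source_sig, sink_sig):
    """Return True if any matched source callable reaches any matched sink.

    The query yields one row per (src, dst) pair, so a signature shared by
    several types produces several rows; any() folds them into one answer.
    No rows at all (unknown signature) is treated as "not reachable".
    """
    result = session.run(
        REACHABILITY_QUERY,
        appName=app_name,
        sourceSig=source_sig,
        sinkSig=sink_sig,
    )
    return any(record["reachable"] for record in result)


# Usage with the official driver (requires a running database, so left as comments):
#   from neo4j import GraphDatabase
#   with GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "neo4j")) as driver:
#       with driver.session() as session:
#           print(is_reachable(session, "daytrader8", source_sig, sink_sig))
```

Passing the signatures as query parameters, rather than formatting them into the query string, avoids quoting problems with signatures that contain commas or brackets.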
Why a build is required
WALA analyzes compiled classes and needs an entry point to anchor traversal. That’s why level 2 builds the project by default (see Build integration) and why a project with no main and no recognized framework entry points can yield an empty call_graph. See Troubleshooting if that happens.