Call graph schema

At analysis level 2, codeanalyzer-java runs WALA over the compiled program and adds a call_graph array to the output. Each element is one caller→callee edge.

{
  "call_graph": [
    {
      "source": { "file_path": "...", "type_declaration": "...", "signature": "...", "callable_declaration": "..." },
      "target": { "file_path": "...", "type_declaration": "...", "signature": "...", "callable_declaration": "..." },
      "type": "CALL",
      "weight": "1"
    }
  ]
}

Edge shape

{
  source: CallableVertex                      // The caller
  target: CallableVertex                      // The callee
  type: string                                // Edge kind, e.g. "CALL"
  weight: string                              // Call multiplicity (usually "1")
}

Vertex shape (`CallableVertex`)

Both source and target identify a method or constructor:

{
  file_path: string                           // File the callable lives in
  type_declaration: string                    // Declaring type
  signature: string                           // "methodName(Type1, Type2)"
  callable_declaration: string                // Full declaration text
}

The signature matches the keys used in the symbol table’s callable_declarations, so you can join an edge endpoint back to its full callable record (body, complexity, annotations, …).
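
The join described above can be sketched as a nested dictionary lookup. This is a minimal sketch, assuming the symbol table is nested as file path, then type_declarations, then callable_declarations; that nesting, the helper name resolve_endpoint, the complexity field, and the sample data are all illustrative assumptions, so check them against your own analysis.json before relying on them.

```python
from typing import Any, Optional


def resolve_endpoint(symbol_table: dict[str, Any], vertex: dict[str, str]) -> Optional[dict[str, Any]]:
    """Return the callable record for one edge endpoint, or None if absent.

    Assumed nesting (verify against your own output):
    symbol_table[file_path]["type_declarations"][type]["callable_declarations"][signature]
    """
    unit = symbol_table.get(vertex["file_path"], {})
    type_decl = unit.get("type_declarations", {}).get(vertex["type_declaration"], {})
    return type_decl.get("callable_declarations", {}).get(vertex["signature"])


# Hypothetical miniature data, shaped like the assumed nesting above.
symbol_table = {
    "src/Demo.java": {
        "type_declarations": {
            "demo.Demo": {
                "callable_declarations": {
                    # "complexity" is a made-up field name for illustration.
                    "run(String)": {"signature": "run(String)", "complexity": 3}
                }
            }
        }
    }
}
edge = {
    "source": {"file_path": "src/Demo.java", "type_declaration": "demo.Demo",
               "signature": "main(String[])", "callable_declaration": "main(String[])"},
    "target": {"file_path": "src/Demo.java", "type_declaration": "demo.Demo",
               "signature": "run(String)", "callable_declaration": "run(String)"},
    "type": "CALL",
    "weight": "1",
}

callee_record = resolve_endpoint(symbol_table, edge["target"])  # found in the sample table
caller_record = resolve_endpoint(symbol_table, edge["source"])  # absent from the sample table
```

Returning None for a missing endpoint, rather than raising, keeps the lookup usable on edges whose endpoints are not in the symbol table you loaded.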

Working with the edges

Because edges are flat, the natural move is to load them into a graph library. The CLDK Python SDK does exactly this, exposing the call graph as a networkx.DiGraph:

from cldk import CLDK
from cldk.analysis import AnalysisLevel
import networkx as nx

analysis = CLDK.java(
    project_path="commons-cli",
    analysis_level=AnalysisLevel.call_graph,    # -> runs with -a 2
)

cg = analysis.get_call_graph()              # networkx.DiGraph
# source_node and sink_node are placeholders: pick two nodes from cg.nodes
nx.has_path(cg, source_node, sink_node)     # reachability as a graph query

If you consume the JSON directly, the same idea applies — build adjacency from source → target and run your traversal of choice.
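
For direct JSON consumers, the adjacency-plus-traversal idea can be sketched with the standard library alone. This is a minimal sketch, not the SDK's implementation: the choice to identify a vertex by the pair (type_declaration, signature), the helper names, and the inline sample document are assumptions made for illustration.

```python
import json
from collections import defaultdict, deque


def vertex_key(vertex):
    # Assumption: (declaring type, signature) is enough to identify a callable.
    return (vertex["type_declaration"], vertex["signature"])


def build_adjacency(edges):
    """Map each caller key to the set of callee keys it calls directly."""
    adjacency = defaultdict(set)
    for edge in edges:
        adjacency[vertex_key(edge["source"])].add(vertex_key(edge["target"]))
    return adjacency


def has_path(adjacency, source, sink):
    """Breadth-first reachability; a node trivially reaches itself."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            return True
        for callee in adjacency.get(node, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return False


def _vertex(type_decl, signature):
    # Builds a hypothetical endpoint; file_path is left as a placeholder.
    return {"file_path": "...", "type_declaration": type_decl,
            "signature": signature, "callable_declaration": signature}


# Hypothetical document with the chain A.a() -> B.b() -> C.c().
document = json.loads(json.dumps({
    "call_graph": [
        {"source": _vertex("A", "a()"), "target": _vertex("B", "b()"), "type": "CALL", "weight": "1"},
        {"source": _vertex("B", "b()"), "target": _vertex("C", "c()"), "type": "CALL", "weight": "1"},
    ]
}))

adjacency = build_adjacency(document["call_graph"])
forward = has_path(adjacency, ("A", "a()"), ("C", "c()"))
backward = has_path(adjacency, ("C", "c()"), ("A", "a()"))
```

In practice you would replace the inline document with json.load on your own analysis.json. Note that weight arrives as a string, so convert it with int() if you want weighted traversals.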

The same edges in the Neo4j graph

When you emit to Neo4j (--emit neo4j) instead of JSON, these edges are projected as a first-class relationship rather than a flat array. Each caller→callee pair becomes a typed J_CALLS edge between the two :JCallable nodes:

(:JCallable)-[:J_CALLS {type, weight, source_kind, destination_kind}]->(:JCallable)

The edge properties carry the same type and weight you see in JSON, plus source_kind and destination_kind describing the endpoints. The endpoints are the same :JCallable nodes the symbol-table projection already created — so a call edge and the method bodies it connects live in one graph, queryable together. See the Neo4j graph schema for the full node-and-relationship reference.

Two projection rules are worth stating plainly, because they shape what you can and can’t query:

J_CALLS exists only at -a 2. Level 1 emits the lossless symbol-table subgraph with types, methods, and fields but no call edges — exactly mirroring a level-1 analysis.json. Combining -t/--target-files with -a 2 downgrades the run to level 1, so a targeted incremental push refreshes structure without recomputing J_CALLS.
J_CALLS is gated to resolved application callables. A call edge is kept only when both endpoints were emitted as :JCallable nodes. Calls into the JDK or third-party jars therefore do not appear as J_CALLS edges — the same boundary as the in-memory call graph. The projector keys vertices off callable_declaration (the raw declaration signature) rather than the display signature, which is what lets constructor edges resolve to their target nodes instead of dangling (fix #158).
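
The second rule can be pictured with a toy model. This is only an illustration of the gating idea, not the projector's code: the function name gate_edges, the use of (type_declaration, callable_declaration) as the vertex key, and the sample types are all hypothetical.

```python
def gate_edges(edges, emitted):
    """Keep only edges whose source and target were both emitted as nodes.

    `emitted` is a set of (type_declaration, callable_declaration) pairs,
    standing in for the :JCallable nodes created from the symbol table.
    """
    def key(vertex):
        return (vertex["type_declaration"], vertex["callable_declaration"])

    return [
        edge for edge in edges
        if key(edge["source"]) in emitted and key(edge["target"]) in emitted
    ]


def _vertex(type_decl, declaration):
    # Hypothetical endpoint; file_path is left as a placeholder.
    return {"file_path": "...", "type_declaration": type_decl,
            "signature": declaration, "callable_declaration": declaration}


# Two application callables were emitted; java.util.List.add was not.
emitted = {
    ("shop.OrderService", "placeOrder(Order)"),
    ("shop.OrderRepo", "save(Order)"),
}
edges = [
    {"source": _vertex("shop.OrderService", "placeOrder(Order)"),
     "target": _vertex("shop.OrderRepo", "save(Order)"), "type": "CALL", "weight": "1"},
    {"source": _vertex("shop.OrderService", "placeOrder(Order)"),
     "target": _vertex("java.util.List", "add(Object)"), "type": "CALL", "weight": "1"},
]

kept = gate_edges(edges, emitted)  # only the application-to-application edge survives
```

The practical consequence is that a reachability query over J_CALLS can only pass through application callables; a path that would route through library code is not represented.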

Reachability as a Cypher traversal

The networkx has_path check above has a direct graph-database analogue. Scope to one application by its :JApplication anchor — the --app-name the graph was loaded with — and ask Cypher for a path along J_CALLS:

MATCH (app:JApplication {name: $appName})
MATCH (app)-[:J_HAS_UNIT]->(:JCompilationUnit)-[:J_DECLARES_TYPE]->(:JType)
          -[:J_HAS_CALLABLE]->(src:JCallable {signature: $sourceSig})
MATCH (dst:JCallable {signature: $sinkSig})
RETURN exists((src)-[:J_CALLS*1..]->(dst)) AS reachable
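
If you want to run this query from Python rather than a Cypher shell, one option is the official neo4j driver. The sketch below assumes you already hold an open driver session and passes it in, so the helper itself has no hard dependency; the function name is_reachable and the example signatures in the usage comment are hypothetical. Because dst is matched by signature alone, the query can return more than one row when several callables share a signature, so the helper reports True if any row is reachable.

```python
REACHABILITY_QUERY = """
MATCH (app:JApplication {name: $appName})
MATCH (app)-[:J_HAS_UNIT]->(:JCompilationUnit)-[:J_DECLARES_TYPE]->(:JType)
          -[:J_HAS_CALLABLE]->(src:JCallable {signature: $sourceSig})
MATCH (dst:JCallable {signature: $sinkSig})
RETURN exists((src)-[:J_CALLS*1..]->(dst)) AS reachable
"""


def is_reachable(session, app_name, source_sig, sink_sig):
    """Run the reachability query and collapse its rows to one boolean.

    `session` is expected to behave like a neo4j driver session: run()
    accepts the query plus keyword parameters and yields records that
    support record["reachable"]. Zero rows means no match, hence False.
    """
    result = session.run(
        REACHABILITY_QUERY,
        appName=app_name,
        sourceSig=source_sig,
        sinkSig=sink_sig,
    )
    return any(record["reachable"] for record in result)


# Usage sketch (requires the neo4j package and a running database):
#
# from neo4j import GraphDatabase
# with GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "neo4j")) as driver:
#     with driver.session() as session:
#         print(is_reachable(session, "daytrader8", "main(String[])", "run()"))
```

Passing parameters instead of formatting them into the query string keeps the query plan cacheable and avoids injection through signature text.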

Because the graph is persistent and multi-tenant — many applications anchored at their own :JApplication in one database — this traversal runs without re-analyzing anything. The CLDK SDK reads the same edges back as a networkx.DiGraph over a read-only Neo4j connection, so the Python example above is unchanged except for the backend:

from cldk import CLDK
from cldk.analysis import AnalysisLevel
from cldk.analysis.commons.backend_config import Neo4jConnectionConfig
import networkx as nx

analysis = CLDK.java(
    analysis_level=AnalysisLevel.call_graph,
    backend=Neo4jConnectionConfig(
        uri="bolt://localhost:7687",
        username="neo4j",
        password="neo4j",
        application_name="daytrader8",          # == the --app-name the graph was loaded with
    ),
)

cg = analysis.get_call_graph()              # networkx.DiGraph, rebuilt from J_CALLS
# source_node and sink_node are placeholders: pick two nodes from cg.nodes
nx.has_path(cg, source_node, sink_node)     # identical query, no JDK or project source

The analysis is produced once — out of band, by a job running codeanalyzer -a 2 --emit neo4j — and read cheaply everywhere after that. Read-only credentials are sufficient on the consumer.

Why a build is required

WALA analyzes compiled classes, not source, and needs at least one entry point from which to start traversal. That’s why level 2 builds the project by default (see Build integration) and why a project with no main method and no recognized framework entry points can yield an empty call_graph. See Troubleshooting if that happens.