feat(consensus): Record Consolidate Task Duration Histogram #2040

@refcell

Description

Summary

ConsolidateTask::consolidate already computes total_duration via global_start.elapsed() on both success paths: the transient (no-FCU) path at line 204 and the full FCU path at line 239. Both paths log the value in their info! calls, but neither records it to Prometheus. An engine_consolidate_duration_seconds histogram should be recorded at both callsites, with the metric defined in crates/consensus/engine/src/metrics/mod.rs. The existing engine_method_request_duration histogram is not a substitute: it covers raw EL RPC calls only, not the full task path, which also includes block fetch and L2BlockInfo construction.

Consolidation is the critical path for safe head advancement: every derived block flows through it before the safe head moves. A rising p99 on this histogram is an early signal that the EL is under load, visible before downstream metrics (safe head lag, derivation step failures) react. Keeping the measurement separate from the FCU histogram also lets operators distinguish whether slowness is in the block fetch phase or the forkchoice update phase.

// In metrics/mod.rs
#[describe("Total duration of the consolidate task in seconds")]
engine_consolidate_duration_seconds: histogram,

// In both success branches of ConsolidateTask::consolidate, after total_duration is computed:
Metrics::engine_consolidate_duration_seconds().record(total_duration.as_secs_f64());
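
For illustration, here is a minimal, std-only sketch of the pattern proposed above: one timer started at the top of the task, and the same histogram recorded on both success paths. Everything in it is hypothetical. `DurationHistogram`, `ConsolidateOutcome`, the `needs_fcu` flag, and the bucket bounds are stand-ins invented for the example, not the real types or the metrics API used in crates/consensus/engine, and the block fetch and FCU steps are left as comments.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a Prometheus histogram (not the project's real metrics type).
#[derive(Debug, Default)]
pub struct DurationHistogram {
    /// Upper bound of each bucket in seconds, sorted ascending.
    bounds: Vec<f64>,
    /// Non-cumulative count per bucket; the last slot is the overflow (+Inf) bucket.
    counts: Vec<u64>,
    sum: f64,
    count: u64,
}

impl DurationHistogram {
    pub fn new(bounds: &[f64]) -> Self {
        Self { bounds: bounds.to_vec(), counts: vec![0; bounds.len() + 1], sum: 0.0, count: 0 }
    }

    /// Record one observation, placing it in the first bucket whose bound is >= the value.
    pub fn record(&mut self, seconds: f64) {
        let idx = self
            .bounds
            .iter()
            .position(|bound| seconds <= *bound)
            .unwrap_or(self.bounds.len());
        self.counts[idx] += 1;
        self.sum += seconds;
        self.count += 1;
    }

    pub fn count(&self) -> u64 {
        self.count
    }

    pub fn sum(&self) -> f64 {
        self.sum
    }

    pub fn bucket_counts(&self) -> &[u64] {
        &self.counts
    }
}

/// Hypothetical outcome type naming the two success paths described in the issue.
#[derive(Debug, PartialEq, Eq)]
pub enum ConsolidateOutcome {
    Transient,
    ForkchoiceUpdated,
}

/// Simplified model of the task: `needs_fcu` is an assumed flag selecting the path.
pub fn consolidate(needs_fcu: bool, hist: &mut DurationHistogram) -> ConsolidateOutcome {
    let global_start = Instant::now();

    // (block fetch and L2BlockInfo construction would happen here)

    if !needs_fcu {
        // Transient (no-FCU) success path: record the duration before returning.
        let total_duration: Duration = global_start.elapsed();
        hist.record(total_duration.as_secs_f64());
        return ConsolidateOutcome::Transient;
    }

    // (the forkchoice update would happen here)

    // Full FCU success path: record the same metric so both paths are covered.
    let total_duration: Duration = global_start.elapsed();
    hist.record(total_duration.as_secs_f64());
    ConsolidateOutcome::ForkchoiceUpdated
}
```

Calling `consolidate(false, &mut hist)` and then `consolidate(true, &mut hist)` on a histogram built with `DurationHistogram::new(&[0.005, 0.05, 0.5, 5.0])` leaves two observations in it, one per path. Recording at both callsites matters because a metric recorded only on the FCU path would silently omit every transient consolidation.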
