Instant::into_mach_absolute_time_ceil calls mach_timebase_info on every invocation #156914

@mysma-9403

Summary

Instant::into_mach_absolute_time_ceil (library/std/src/sys/time/unix.rs:102) invokes mach_timebase_info on every call, without caching. The returned numer/denom values are fixed at boot and never change during the lifetime of a process, so repeating the lookup (Mach trap / commpage read) on every invocation is wasted work.

This function is on the hot path of std::thread::sleep_until on every Apple target (#[cfg(target_vendor = "apple")]).

A cache for exactly this value used to exist (added in #77727, 2020). It was removed in bc300102d4, when the clock_gettime migration dropped the mach_timebase_info call entirely. The cache was not restored when mach_timebase_info was reintroduced for sleep_until.

Reproduction

Code-level evidence:

grep -rn 'mach_timebase_info' library/std/src/sys/
# library/std/src/sys/time/unix.rs:104,110,115,116 -- only call site, no cache scaffolding

git log --all --oneline -S 'mach_timebase_info' -- library/std/
# f30cc74fb41 -- cache introduced  (#77727, 2020)
# bc300102d43 -- function removed entirely (Sep 2023, clock_gettime migration)
# 959be82effe -- function re-introduced WITHOUT cache (#151004, 2026)

Micro-benchmark on stable rustc (no nightly, no custom std required) comparing the per-call cost of the uncached call with a cached read:

use std::hint::black_box;
use std::sync::OnceLock;
use std::time::Instant;

#[repr(C)]
struct MachTimebaseInfo { numer: u32, denom: u32 }

unsafe extern "C" {
    fn mach_timebase_info(info: *mut MachTimebaseInfo) -> i32;
}

#[inline(never)]
fn uncached() -> (u32, u32) {
    let mut t = MachTimebaseInfo { numer: 0, denom: 0 };
    let kr = unsafe { mach_timebase_info(&mut t) };
    assert_eq!(kr, 0);
    (t.numer, t.denom)
}

#[inline(never)]
fn cached() -> (u32, u32) {
    static T: OnceLock<(u32, u32)> = OnceLock::new();
    *T.get_or_init(uncached)
}

fn main() {
    let _ = cached(); // warm-up: populate the OnceLock before timing
    const N: u32 = 10_000_000;

    // Time N calls that go through mach_timebase_info every time.
    let s = Instant::now();
    let mut acc = 0u64;
    for _ in 0..N {
        let (a, b) = uncached();
        acc = acc.wrapping_add(a as u64).wrapping_add(b as u64);
    }
    let d1 = s.elapsed();
    black_box(acc); // keep the accumulated result observable

    // Time N calls that read the memoized pair.
    let s = Instant::now();
    let mut acc = 0u64;
    for _ in 0..N {
        let (a, b) = cached();
        acc = acc.wrapping_add(a as u64).wrapping_add(b as u64);
    }
    let d2 = s.elapsed();
    black_box(acc);

    println!("uncached: {:?}/call", d1 / N);
    println!("cached:   {:?}/call", d2 / N);
}

Result on x86_64-apple-darwin, macOS 15.7.7, rustc -O:

uncached: 3ns/call    (10M iters in 34.5ms)
cached:   1ns/call    (10M iters in 10.9ms)

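The trap-or-commpage cost is ~3x (34.5 ms / 10.9 ms ≈ 3.2) the cost of the cached read, which is a single atomic load on the OnceLock fast path. Every sleep_until call on Apple pays that difference, roughly 2.4 ns per call in this measurement, unnecessarily.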

Suggested fix

Extract a private helper in the same module that memoizes the pair via crate::sync::OnceLock<(u32, u32)>; into_mach_absolute_time_ceil then reads the cached pair and does only the arithmetic. The change is purely additive: no API change, no behavior change, and it stays inside the existing #[cfg(target_vendor = "apple")] gate. It mirrors the precedent set by #77727.
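
A rough sketch of the shape this could take is below. It is not the actual patch: the names cached_timebase and nanos_to_ticks_ceil are hypothetical, the query closure stands in for the real mach_timebase_info FFI call so the snippet compiles off Apple too, and the conversion assumes nanos = ticks * numer / denom, which may not match the exact arithmetic in std.

```rust
use std::sync::OnceLock;

/// Returns the memoized (numer, denom) pair, running `query` only on the
/// first call. `query` is a hypothetical stand-in for the real
/// `mach_timebase_info` FFI call.
fn cached_timebase(query: impl FnOnce() -> (u32, u32)) -> (u32, u32) {
    static TIMEBASE: OnceLock<(u32, u32)> = OnceLock::new();
    *TIMEBASE.get_or_init(query)
}

/// Converts nanoseconds to timebase ticks, rounding up.
/// Assumes nanos = ticks * numer / denom, so ticks = ceil(nanos * denom / numer).
fn nanos_to_ticks_ceil(nanos: u64, (numer, denom): (u32, u32)) -> u64 {
    assert!(numer != 0, "timebase numerator must be non-zero");
    let numer = u128::from(numer);
    // Widen to u128 so the multiplication cannot overflow.
    let scaled = u128::from(nanos) * u128::from(denom);
    let ticks = (scaled + numer - 1) / numer;
    // Saturate rather than wrap if the result does not fit in u64.
    ticks.min(u128::from(u64::MAX)) as u64
}
```

With an illustrative timebase of 125/3, nanos_to_ticks_ceil(1_000, (125, 3)) gives 24 ticks and 1_001 ns rounds up to 25. After the first call to cached_timebase, later calls return the stored pair without running their closure.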

I have a working patch and can open a PR if useful.

Meta

rustc 1.93.1 (01f6ddf75 2026-02-11)
host: x86_64-apple-darwin
macOS 15.7.7 (Sequoia)

Affects every Apple target: x86_64-apple-darwin, aarch64-apple-darwin, iOS, tvOS, watchOS, visionOS - the regressed code path is gated #[cfg(target_vendor = "apple")].

Labels

I-slow (Issue: Problems and improvements with respect to performance of generated code.)
O-macos (Operating system: macOS)
needs-triage (This issue may need triage. Remove it if it has been sufficiently triaged.)
