Comparing changes

Two benchmarks to help with performance tuning:

- The ListMerger is the algorithm we use to merge batches in background worker threads. We benchmark it by merging N randomly generated indexed Z-sets, on disk or in memory.
- The input map benchmark simulates ingestion into a table with a primary key using a circuit with a single input_map operator.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
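As a rough conceptual model of what "ingestion into a table with a primary key" means, the sketch below turns upserts into weighted changes: writing a new row under an existing key retracts the old row (weight -1) and inserts the new one (weight +1). The `Table` type, the `Delta` alias, and the `upsert` method are hypothetical names used only for illustration; this is not the input_map operator's actual API or implementation.

```rust
use std::collections::HashMap;

/// One weighted change: (key, value, weight); +1 inserts a row, -1 retracts it.
/// Hypothetical type for illustration only.
pub type Delta = (u64, String, i64);

/// Hypothetical in-memory table with a primary key.
#[derive(Default)]
pub struct Table {
    rows: HashMap<u64, String>,
}

impl Table {
    /// Writes `value` under `key` and returns the changes this causes.
    pub fn upsert(&mut self, key: u64, value: &str) -> Vec<Delta> {
        let mut deltas = Vec::new();
        match self.rows.insert(key, value.to_string()) {
            // Same row written again: nothing changes.
            Some(old) if old.as_str() == value => return deltas,
            // Key already present with a different row: retract the old row.
            Some(old) => deltas.push((key, old, -1)),
            // New key: nothing to retract.
            None => {}
        }
        deltas.push((key, value.to_string(), 1));
        deltas
    }
}
```

Calling `upsert(1, "a")` on an empty table yields a single insertion; a following `upsert(1, "b")` yields a retraction of `"a"` and an insertion of `"b"`.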

This commit applies the same optimization to ListMerger that we previously implemented for CursorList: replace the linear scan for min values at each step of the cursor with a binary heap that keeps the cursors partially sorted and does only O(m log n) work at each step, where n is the number of cursors and m is the number of cursors that point to the current min key or value.

Benchmark results on the ListMerger benchmarks. The `key range: 100` column represents the workload with few distinct keys and many values per key; the `key range: 100000000` column represents the workload with many unique keys in each batch and a small number of values per key. The key in these benchmarks is u64; the value is Tup10<u64,...,u64>.

This implementation doesn't yet contain some of the lower-level optimizations we implemented for CursorList: replacing array indexing with `get_unchecked` and storing raw pointers to keys and values instead of reading them on each access.

BEFORE

Memory-backed batches
┌─────────────┬────────────────────┬──────────────────────────┐
│ # Batches   │ key range: 100     │ key range: 100000000     │
├─────────────┼────────────────────┼──────────────────────────┤
│ 1           │ 7.2                │ 3.0                      │
│ 8           │ 5.0                │ 2.6                      │
│ 32          │ 3.2                │ 2.0                      │
│ 64          │ 2.2                │ 1.6                      │
└─────────────┴────────────────────┴──────────────────────────┘

File-backed batches
┌─────────────┬────────────────────┬──────────────────────────┐
│ # Batches   │ key range: 100     │ key range: 100000000     │
├─────────────┼────────────────────┼──────────────────────────┤
│ 1           │ 5.3                │ 2.2                      │
│ 8           │ 4.3                │ 1.9                      │
│ 32          │ 3.1                │ 1.7                      │
│ 64          │ 2.4                │ 1.4                      │
└─────────────┴────────────────────┴──────────────────────────┘

AFTER

Memory-backed batches
┌─────────────┬────────────────────┬──────────────────────────┐
│ # Batches   │ key range: 100     │ key range: 100000000     │
├─────────────┼────────────────────┼──────────────────────────┤
│ 1           │ 7.4                │ 3.1                      │
│ 8           │ 5.6                │ 2.6                      │
│ 32          │ 4.9                │ 2.3                      │
│ 64          │ 4.4                │ 2.1                      │
└─────────────┴────────────────────┴──────────────────────────┘

File-backed batches
┌─────────────┬────────────────────┬──────────────────────────┐
│ # Batches   │ key range: 100     │ key range: 100000000     │
├─────────────┼────────────────────┼──────────────────────────┤
│ 1           │ 5.3                │ 2.2                      │
│ 8           │ 4.4                │ 1.9                      │
│ 32          │ 4.0                │ 1.8                      │
│ 64          │ 3.8                │ 1.7                      │
└─────────────┴────────────────────┴──────────────────────────┘

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
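The heap-based merge step described in this commit message can be sketched roughly as follows. This is a simplified illustration, not the ListMerger code: it assumes each batch is a `Vec<(u64, i64)>` of (key, weight) pairs sorted by key, ignores values entirely, and uses the hypothetical function name `merge_batches`. The point is that each step pops only the m cursors sitting on the current minimum key, at O(log n) per pop, instead of scanning all n cursors.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merges batches of (key, weight) pairs, each sorted by key, summing the
/// weights of equal keys and dropping keys whose weights cancel to zero.
/// Hypothetical stand-in for illustration; not the real ListMerger.
pub fn merge_batches(batches: &[Vec<(u64, i64)>]) -> Vec<(u64, i64)> {
    // positions[i] is the cursor into batches[i].
    let mut positions = vec![0usize; batches.len()];
    // Min-heap of (current key, batch index); `Reverse` flips the
    // standard max-heap ordering.
    let mut heap: BinaryHeap<Reverse<(u64, usize)>> = batches
        .iter()
        .enumerate()
        .filter_map(|(i, b)| b.first().map(|&(k, _)| Reverse((k, i))))
        .collect();

    let mut out = Vec::new();
    while let Some(&Reverse((min_key, _))) = heap.peek() {
        let mut weight = 0i64;
        // Pop only the cursors that point at the current minimum key.
        while let Some(&Reverse((key, i))) = heap.peek() {
            if key != min_key {
                break;
            }
            heap.pop();
            weight += batches[i][positions[i]].1;
            positions[i] += 1;
            // Re-insert the cursor if its batch has more entries.
            if let Some(&(next_key, _)) = batches[i].get(positions[i]) {
                heap.push(Reverse((next_key, i)));
            }
        }
        if weight != 0 {
            out.push((min_key, weight));
        }
    }
    out
}
```

The bounds-checked indexing `batches[i][positions[i]]` is the kind of access that the lower-level optimizations mentioned above would replace with `get_unchecked`; the sketch keeps the safe form.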

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>

Commits on Mar 3, 2026
