Skip to content

feat(rivetkit): sqlite vfs v2#4673

Draft
NathanFlurry wants to merge 19 commits into04-15-chore_rename_sandbox_-_kitchen_sinkfrom
feat/sqlite-vfs-v2
Draft

feat(rivetkit): sqlite vfs v2#4673
NathanFlurry wants to merge 19 commits into04-15-chore_rename_sandbox_-_kitchen_sinkfrom
feat/sqlite-vfs-v2

Conversation

@NathanFlurry
Copy link
Copy Markdown
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@railway-app
Copy link
Copy Markdown

railway-app bot commented Apr 16, 2026

🚅 Deployed to the rivet-pr-4673 environment in rivet-frontend

Service Status Web Updated (UTC)
website 😴 Sleeping (View Logs) Web Apr 18, 2026 at 4:21 pm
frontend-cloud 😴 Sleeping (View Logs) Web Apr 18, 2026 at 5:44 am
kitchen-sink ❌ Build Failed (View Logs) Web Apr 18, 2026 at 3:47 am
frontend-inspector 😴 Sleeping (View Logs) Web Apr 17, 2026 at 12:35 am
mcp-hub ✅ Success (View Logs) Web Apr 16, 2026 at 4:08 pm
ladle ❌ Build Failed (View Logs) Web Apr 16, 2026 at 9:18 am

Copy link
Copy Markdown
Member Author

NathanFlurry commented Apr 16, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 16, 2026

Preview packages published to npm

Install with:

npm install rivetkit@pr-4673

All packages published as 0.0.0-pr.4673.bd7bc7b with tag pr-4673.

Engine binary is shipped via @rivetkit/engine-cli on linux-x64-musl, linux-arm64-musl, darwin-x64, and darwin-arm64. Windows users should use the release installer or set RIVET_ENGINE_BINARY.

Docker images:

docker pull rivetdev/engine:slim-bd7bc7b
docker pull rivetdev/engine:full-bd7bc7b
Individual packages
npm install rivetkit@pr-4673
npm install @rivetkit/react@pr-4673
npm install @rivetkit/rivetkit-native@pr-4673
npm install @rivetkit/workflow-engine@pr-4673

@NathanFlurry NathanFlurry changed the title feat: [US-001] - [Create v1 SQLite VFS baseline test suite] feat(rivetkit): sqlite vfs v2 Apr 16, 2026
xWrite in the v2 VFS was calling resolve_pages for every dirty page,
even when the write was a full-page aligned overwrite. For page numbers
beyond db_size_pages (newly allocated pages), this meant fetching from
the engine to get data that doesn't exist.

A 256-row INSERT transaction (~1MB of dirty data) was making 288 round
trips to the engine: one get_pages RTT per new page allocation. At
128ms staging RTT that's ~37s of theoretical network time vs the ~130ms
needed for just the final commit.

Now:
- Full-page aligned writes (offset % page_size == 0 && amt % page_size == 0)
  skip resolve_pages entirely and use a zero-filled page as the base.
- Partial writes filter out pages > db_size_pages before calling
  resolve_pages, synthesizing zero pages locally for new allocations.

Direct engine profiling tests verify 0 engine fetches for 1MB, 5MB, and
100-hot-row-updates workloads.

Adds new stress-test workloads to the kitchen-sink bench:
churnInsertDelete, mixedOltpLarge, growingAggregation,
indexCreationOnLargeTable, bulkUpdate1000Rows, truncateAndRegrow,
manySmallTables.
The commit, compaction, and takeover paths all read meta_key and then
write meta_key within the same transaction. Meta reads were using
Snapshot isolation which does NOT register a read conflict range in
UniversalDB (and FoundationDB semantics).

Without a read conflict range, two concurrent transactions could both
read meta at the same snapshot, both plan mutations based on that stale
read, and both commit with last-write-wins semantics. For commit vs
compaction this caused head_txid to rewind: commit advances from N to
N+1, compaction then writes its own meta with head=N preserved (from
its snapshot), overwriting the commit's head=N+1. The VFS sees its own
successful commit response but the engine state is behind, and the next
commit fence-mismatches with 'commit head_txid X+1 did not match current
head_txid X'.

Fix: introduce tx_get_value_serializable in udb and use it for meta reads
in the fast-path commit, the compaction shard write, and the takeover
TOCTOU re-check. Serializable reads register the key in the read conflict
range so concurrent writes by other transactions trigger a retry.

Adds regression tests in the v2 VFS Direct-engine harness:
- autocommit_inserts_maintain_head_txid_consistency
- autocommit_survives_close_reopen
- concurrent_multi_actor_autocommits
- stress_concurrent_multi_actor_autocommits

Local bench against the engine now passes Insert single x100/x1000/x10000
(autocommit workloads) that previously timed out or fence-mismatched.

The bench still has occasional fence errors under heavy concurrent
multi-actor load. Those do not reproduce in the Direct engine stress
test (5 actors x 200 interleaved commits), so the remaining issue is
not in the storage layer and needs separate investigation.
Both the RocksDB and Postgres drivers called
`substitute_versionstamp_if_incomplete` inside `Operation::Set` paths,
which reads the last 4 bytes of the value as an "offset" and
unconditionally truncates them before checking whether the value is
actually a versionstamped payload. The helper then swallows the
validation error, so a value whose trailing 4 bytes happen to form a
plausible u32 pointing at a 0x33 byte gets silently corrupted:
4 bytes lost, no error surfaced.

This showed up in the v2 SQLite slow path: large transactions (~50 MiB)
split into 7-8 MiB staged chunks, UDB chunks the serde_bare blob into
10 KiB physical chunks, and a small fraction of those chunks have
trailing bytes that look like a versionstamp offset. One corrupted
chunk is enough to break serde_bare decoding at commit_finalize time
with "unexpected end of input", aborting the commit.

Versionstamp substitution only belongs on SetVersionstampedKey and
SetVersionstampedValue mutation types, which go through AtomicOp. Plain
`Set` must preserve the bytes verbatim, so this change removes the
substitution from both driver Set paths and drops the now-unused import.

Adds bench-parity Direct-engine tests in
rivetkit-typescript/packages/sqlite-native/src/v2/vfs.rs so kitchen-sink
bench workloads (large TX insert 500KB/10MB/50MB, churn insert/delete,
mixed OLTP, bulk update 1000 rows, truncate + regrow, many small
tables, index creation on 10k rows, growing aggregation, insert TX
x10000) run against the real engine without the full stack. The 50MB
test is the minimal reproducer for this bug and failed reliably before
the fix.
@claude
Copy link
Copy Markdown

claude bot commented Apr 17, 2026

SQLite VFS v2 Review (Updated 2026-04-18)

This updates my previous review, reflecting commits through US-062 and the latest chore commits.


Improvements Since Last Review

  • Fence rewind fix: Serializable reads on META prevent stale-fence reads during concurrent takeover.
  • Versionstamp substitution: Stopped applying it on every UniversalDB Set -- avoids silent corruption on non-versionstamp keys.
  • Per-txid chunked DELTA layout (US-048): Cleaner than single-blob; simplifies orphan accounting.
  • VFS-owned fields protected in update_meta (US-046): Correct separation of client-writable vs engine-owned fields.
  • Phase-level instrumentation (US-059): Prometheus metrics on commit phases are valuable for production debugging.
  • Expanded test coverage (US-045).

Issues

Blocker: Temporary publish.yaml Override Must Be Reverted

The CI change forces release builds for all rivetkit-native targets unconditionally, affecting all branches globally. Must be reverted before merge.

Error Convention Mismatch

error.rs uses thiserror directly instead of the project's #[derive(RivetError)] macro from rivet-error. CLAUDE.md mandates struct-based errors with #[derive(RivetError)]. Either convert, or document why thiserror is acceptable here (e.g., sqlite-storage is a library crate isolated from engine error infrastructure).

lazy_static Should Be Replaced

Cargo.toml adds lazy_static.workspace = true, but the workspace already declares once_cell = "1.20.2" and this codebase prefers it.

Loop in encode_db_head_with_usage Needs a Termination Comment

The convergence loop in quota.rs is correct (serde_bare varint encoding stabilizes quickly), but add a comment explaining why the loop terminates -- the varint-boundary argument is non-obvious to a future reader.


Previously Flagged Items -- Still Need Confirmation

  • Untyped bail! for quota exceeded -- should use a typed SqliteStorageError variant.
  • No server-side cap on commit_stage chunk byte_length -- a malicious client could send oversized chunks before the full-commit size check runs.
  • PendingStage insert return value silently dropped -- scc::HashMap::insert returns the old value; a stage collision would be silently overwritten.
  • Orphaned DELTA chunks -- per-txid layout (US-048) helps, but confirm incomplete staged commits clean up their DELTA chunks on actor disconnect.

Nits

  • async-trait dependency -- Rust 1.75+ has stable async traits; verify it is still needed.
  • parking_lot is new -- ensure it is not used for Mutex<HashMap> patterns (CLAUDE.md forbids those).
  • FenceMismatch error message starts uppercase; other variants are lowercase prose.

PR is still a draft. The publish.yaml override is the only hard blocker; the rest are code-quality items to address before marking ready.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant