feat(cli): add mosaic inspector CLI (schema/meta/cat/pages) #66

Open
jianguotian wants to merge 3 commits into apache:main from jianguotian:feat/mosaic-cli

jianguotian commented Jun 16, 2026

Contributor

What

Mosaic shipped no viewer tooling, so inspecting a file meant writing Rust against the library API. This adds a `mosaic` binary (a new `cli` workspace crate) mirroring parquet-cli:

  • schema — column names, Arrow types, nullability, bucket assignment
  • meta — row groups, rows, per-column stats (null_count/min/max)
  • cat — first N rows as a table, with -n and --columns projection
  • pages — per-column encoding (plain/const/dict/all_null) + slot size

All four support --json. The reader is driven over a new file-backed InputFile (pread).
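
For readers unfamiliar with the pread approach, here is a minimal sketch of a file-backed positional reader. It is not the `InputFile` implementation from this PR: the `FileInput` type, its `read_at` method, and the bounds check are assumptions made for illustration, and the example is Unix-only because it relies on `std::os::unix::fs::FileExt`.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt; // Unix-only: read_exact_at maps to pread(2)
use std::path::Path;

/// Hypothetical file-backed input (illustrative, not the PR's type).
pub struct FileInput {
    file: File,
    len: u64,
}

impl FileInput {
    /// Open the file and cache its length for bounds checks.
    pub fn open(path: &Path) -> io::Result<Self> {
        let file = File::open(path)?;
        let len = file.metadata()?.len();
        Ok(Self { file, len })
    }

    pub fn len(&self) -> u64 {
        self.len
    }

    /// Read exactly `len` bytes at `offset` without touching the file cursor,
    /// so concurrent readers sharing `&self` do not interfere with each other.
    pub fn read_at(&self, offset: u64, len: usize) -> io::Result<Vec<u8>> {
        let end = offset
            .checked_add(len as u64)
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "offset overflow"))?;
        if end > self.len {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "read past end of file",
            ));
        }
        let mut buf = vec![0u8; len];
        self.file.read_exact_at(&mut buf, offset)?;
        Ok(buf)
    }
}
```

Positional reads suit a columnar reader because footer, bucket, and page ranges can be fetched in any order through a shared handle, with no seek state to manage.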

Core changes (read-only, additive)

Three small accessors used by pages: BucketReader::encodings(), ColumnPageReader::encoding(), MosaicReader::page_infos(). No format or behavior change.
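
To illustrate what "read-only, additive" accessors look like, the sketch below models the idea with stand-in types. Only the four encoding names come from the description above; `ColumnPage`, `Bucket`, the `slot_size` field, and every signature here are assumptions and may not match the real `BucketReader`, `ColumnPageReader`, or `MosaicReader` APIs.

```rust
/// The four encodings listed for the `pages` command.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Encoding {
    Plain,
    Const,
    Dict,
    AllNull,
}

impl Encoding {
    /// Label as it might be printed by `pages` (assumed spelling).
    pub fn label(self) -> &'static str {
        match self {
            Encoding::Plain => "plain",
            Encoding::Const => "const",
            Encoding::Dict => "dict",
            Encoding::AllNull => "all_null",
        }
    }
}

/// Hypothetical stand-in for a per-column page reader.
pub struct ColumnPage {
    encoding: Encoding,
    slot_size: usize,
}

impl ColumnPage {
    /// Read-only accessor: exposes state that already exists, changes nothing.
    pub fn encoding(&self) -> Encoding {
        self.encoding
    }
}

/// Hypothetical stand-in for a bucket holding one page per column.
pub struct Bucket {
    columns: Vec<ColumnPage>,
}

impl Bucket {
    pub fn new(columns: Vec<(Encoding, usize)>) -> Self {
        let columns = columns
            .into_iter()
            .map(|(encoding, slot_size)| ColumnPage { encoding, slot_size })
            .collect();
        Self { columns }
    }

    /// Encoding of every column, in column order.
    pub fn encodings(&self) -> Vec<Encoding> {
        self.columns.iter().map(ColumnPage::encoding).collect()
    }

    /// (encoding, slot size) pairs, the kind of data `pages` would display.
    pub fn page_infos(&self) -> Vec<(Encoding, usize)> {
        self.columns.iter().map(|c| (c.encoding, c.slot_size)).collect()
    }
}
```

Because each accessor takes `&self` and only returns copies of existing fields, adding them cannot alter the file format or reader behavior.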

Verification

199 core tests pass; built schema/meta/cat/pages in both text and JSON; encodings detected correctly (const/plain/dict) on a multi-bucket file.

mingfeng and others added 3 commits June 16, 2026 04:22
Mosaic previously shipped no viewer tooling — inspecting a file meant
writing Rust against the library API. Add a `mosaic` binary (a new `cli`
workspace crate) mirroring parquet-cli:

- schema: column names, Arrow types, nullability, bucket assignment
- meta:   row groups, rows, per-column stats (null_count/min/max)
- cat:    first N rows as a table, with -n and --columns projection
- pages:  per-column encoding (plain/const/dict/all_null) + slot size

All commands support --json. The reader is driven over a new file-backed
InputFile (pread). Core gains three small read-only accessors used by
`pages`: BucketReader::encodings(), ColumnPageReader::encoding(), and
MosaicReader::page_infos(). No format/behavior change; 199 core tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Add a core regression test for MosaicReader::page_infos asserting
plain/dict/const detection on a paged-bucket file, and CLI unit tests
for the fmt helpers (json escaping, value/encoding rendering, ndjson
null handling, table truncation).

Drive the mosaic binary against a fixture file (via CARGO_BIN_EXE) and
assert stdout for schema/meta/pages/cat, --json output, projection,
row truncation and missing-file failure. No external dev-deps.
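
The dependency-free integration test pattern described in the last commit can be sketched roughly as follows. This is a sketch under assumptions, not the PR's test code: `run_cli` and `CliOutput` are hypothetical names, and the binary path is passed as a parameter so the helper compiles outside Cargo. Inside a Cargo integration test the path would typically come from `env!("CARGO_BIN_EXE_mosaic")`.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Captured result of one CLI invocation (hypothetical helper type).
pub struct CliOutput {
    pub success: bool,
    pub stdout: String,
    pub stderr: String,
}

/// Run the binary at `bin` with `args` and capture its exit status and output.
/// In a Cargo integration test, `bin` would be
/// `Path::new(env!("CARGO_BIN_EXE_mosaic"))`.
pub fn run_cli(bin: &Path, args: &[&str]) -> io::Result<CliOutput> {
    let out = Command::new(bin).args(args).output()?;
    Ok(CliOutput {
        success: out.status.success(),
        stdout: String::from_utf8_lossy(&out.stdout).into_owned(),
        stderr: String::from_utf8_lossy(&out.stderr).into_owned(),
    })
}
```

A test would then call something like `run_cli(bin, &["schema", "--json", fixture])`, where `fixture` is the path to a test file, and assert on `stdout`. The missing-file case would instead assert that `success` is false. Using only `std::process::Command` keeps the test free of external dev-dependencies.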