Skip to content

feat: refreshTableSchemas improvements#346

Open
ilyabo wants to merge 1 commit intomainfrom
feat--refreshTableSchemas-improvements
Open

feat: refreshTableSchemas improvements#346
ilyabo wants to merge 1 commit intomainfrom
feat--refreshTableSchemas-improvements

Conversation

@ilyabo
Copy link
Collaborator

@ilyabo ilyabo commented Feb 9, 2026

Summary by CodeRabbit

  • Changes

    • Removed S3 data source integration from the data upload dialog. Upload methods now limited to Local Upload and TileSet options.
  • Improvements

    • Enhanced table schema loading with improved error handling for retrieving column information.
    • Optimized concurrent schema refresh requests to reduce duplicate processing.

Signed-off-by: Ilya Boyandin <ilyabo@gmail.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 9, 2026

📝 Walkthrough

Walkthrough

This pull request refactors DuckDbSlice's table schema loading with deduplication and enhanced metadata handling, deprecates S3 integration in KeplerAddDataDialog, adds a config guard to RoomShellSlice, and relocates an import in ColumnTreeNode.

Changes

Cohort / File(s) Summary
DuckDB Schema Loading Refactor
packages/duckdb/src/DuckDbSlice.ts
Introduced refreshPromise state for deduplication of concurrent refresh calls. Refactored loadTableSchemas to fetch tables and views separately, aggregate results, and gather per-entry column metadata via PRAGMA table_info with per-entry error handling. Enhanced table representation with computed columns, SQL, comments, isView flag, and derived rowCount. Implemented caching behavior where in-progress refreshes return current tables, with failed PRAGMA data captured for non-views.
Kepler S3 Integration Removal
packages/kepler/src/components/KeplerAddDataDialog.tsx
Removed S3-related props from KeplerAddDataDialogProps type and disabled S3 browser functionality via commented code. Removed S3 props destructuring and UI tab content while preserving Local Upload and TileSet methods. Reordered imports for better organization.
Room Shell Config Guard
packages/room-shell/src/RoomShellSlice.ts
Added early exit guard in updateReadyDataSources when config is unavailable. Introduced debug logging for config state.
Schema Tree Import Cleanup
packages/schema-tree/src/nodes/ColumnTreeNode.tsx
Relocated cn import from @sqlrooms/ui to module top level without functional changes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 A schema's tale, now cleaner told,
With promises that dupes withhold,
Tables dance in queries bright,
While S3 fades from view's sight—
The rabbit hops through code so tight! 🌟

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title 'feat: refreshTableSchemas improvements' directly reflects the primary change—a significant refactor and enhancement of the refreshTableSchemas function in DuckDbSlice.ts, which is the largest and most impactful change in the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat--refreshTableSchemas-improvements

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@packages/duckdb/src/DuckDbSlice.ts`:
- Around line 618-665: refreshTableSchemas currently returns the stale
get().db.tables when refreshPromise exists and also does not await the newly
created IIFE, so callers get pre-refresh data; change the dedup pattern so that
when refreshPromise is already set the function returns that promise, and when
you create refreshPromise = (async () => { ... })() you should return/await that
promise instead of immediately returning get().db.tables; update the function
refreshTableSchemas and its use of refreshPromise so all callers receive the
same Promise<DataTable[]> (resolve value should be the newTables returned from
get().db.loadTableSchemas) while preserving the existing error handling and
final cleanup logic.

In `@packages/room-shell/src/RoomShellSlice.ts`:
- Around line 503-506: Remove the debug console.log in updateReadyDataSources
that prints the full config object; locate the function updateReadyDataSources
(and the store subscription that calls it) and delete the
console.log('updateReadyDataSources', {config}) line so the function simply
guards on if (!config) return and proceeds without logging sensitive or noisy
config details.
🧹 Nitpick comments (4)
packages/kepler/src/components/KeplerAddDataDialog.tsx (1)

68-86: Consider removing the commented-out S3 code and the unused S3 enum member.

AddDataMethods.S3 (line 71) is now dead code since the S3 option is commented out everywhere. If S3 support is being deprecated, removing the commented blocks (lines 19, 82–85, 94, 100–105, 117–123, 188–200) and the unused enum member in favor of version control history would reduce noise. If the intent is to re-enable it soon, a tracking issue or TODO comment would make that clearer.

packages/duckdb/src/DuckDbSlice.ts (3)

486-535: Variable shadowing and N+1 query pattern in the inner loop.

  1. Shadowing: database (line 488), schema (line 489), and table (line 490) shadow the identically named destructured variables from the outer filter parameter (line 438). This is technically correct (block-scoped) but can mislead readers. Consider renaming to e.g. rowDatabase, rowSchema, rowTable — or entryDatabase, etc.

  2. N+1 queries: Each table/view row issues a separate PRAGMA table_info(...) call (line 503). For schemas with many tables, this will be noticeably slow. DuckDB's information_schema.columns or duckdb_columns() can return all column metadata in a single query, which could then be joined in-memory.


254-254: refreshPromise is a module-level mutable variable shared across all slice instances.

refreshPromise is declared outside the returned slice creator (line 254), which means it lives at module scope. If createDuckDbSlice is ever called more than once (e.g., multiple rooms or tests), all instances share the same deduplication lock. Moving it inside the closure returned by createDuckDbSlice would scope it per instance.

Proposed fix — move inside the closure
 export function createDuckDbSlice({
   connector = createWasmDuckDbConnector(),
 }: CreateDuckDbSliceProps = {}): StateCreator<DuckDbSliceState> {
-  let refreshPromise: Promise<DataTable[]> | null = null;
   return createSlice<DuckDbSliceState, BaseRoomStoreState & DuckDbSliceState>(
     (set, get) => {
+      let refreshPromise: Promise<DataTable[]> | null = null;
       return {
         db: {

443-480: Duplicated SQL filter logic between table and view queries.

The WHERE clause construction is copy-pasted between tableSql (lines 451–461) and viewSql (lines 470–480). Consider extracting a helper like buildWhereClause(defaultDatabaseFilter, schema, database, table) to reduce duplication and the risk of the two diverging.

Comment on lines 618 to 665
async refreshTableSchemas(): Promise<DataTable[]> {
if (refreshPromise) {
return get().db.tables;
}
set((state) =>
produce(state, (draft) => {
draft.db.isRefreshingTableSchemas = true;
}),
);
try {
const connector = await get().db.getConnector();
const result = await connector.query(
`SELECT current_schema() AS schema, current_database() AS database`,
);
set((state) =>
produce(state, (draft) => {
draft.db.currentSchema = result.getChild('schema')?.get(0);
draft.db.currentDatabase = result
.getChild('database')
?.get(0);
}),
);
const newTables = await get().db.loadTableSchemas();
// Only update if there's an actual change in the schemas
if (!deepEquals(newTables, get().db.tables)) {
refreshPromise = (async () => {
try {
const connector = await get().db.getConnector();
const result = await connector.query(
`SELECT current_schema() AS schema, current_database() AS database`,
);
set((state) =>
produce(state, (draft) => {
draft.db.tables = newTables;
draft.db.schemaTrees = createDbSchemaTrees(newTables);
draft.db.currentSchema = result.getChild('schema')?.get(0);
draft.db.currentDatabase = result
.getChild('database')
?.get(0);
}),
);
const newTables = await get().db.loadTableSchemas();
// Only update if there's an actual change in the schemas
if (!deepEquals(newTables, get().db.tables)) {
set((state) =>
produce(state, (draft) => {
draft.db.tables = newTables;
draft.db.schemaTrees = createDbSchemaTrees(newTables);
}),
);
}
return newTables;
} catch (err) {
get().room.captureException(err);
return [];
} finally {
refreshPromise = null;
set((state) =>
produce(state, (draft) => {
draft.db.isRefreshingTableSchemas = false;
}),
);
}
return newTables;
} catch (err) {
get().room.captureException(err);
return [];
} finally {
set((state) =>
produce(state, (draft) => {
draft.db.isRefreshingTableSchemas = false;
}),
);
}
})();
return get().db.tables;
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

refreshTableSchemas deduplication silently breaks the awaiting contract — callers receive stale tables.

When refreshPromise is already in-flight, the function returns get().db.tables (line 620) — the current stale snapshot. And even for the first caller, the IIFE is assigned to refreshPromise but not awaited (line 664 returns get().db.tables immediately, before the refresh completes).

This means every await get().db.refreshTableSchemas() in RoomShellSlice (lines 369, 409, 447, 658, 699) will resolve with pre-refresh data. For example, addRoomFile (line 409) does:

await get().db.refreshTableSchemas();
// ...
return get().db.findTableByName(tableName); // may return undefined

The standard deduplication pattern returns the same promise to all callers:

Proposed fix
          async refreshTableSchemas(): Promise<DataTable[]> {
            if (refreshPromise) {
-             return get().db.tables;
+             return refreshPromise;
            }
            set((state) =>
              produce(state, (draft) => {
                draft.db.isRefreshingTableSchemas = true;
              }),
            );
            refreshPromise = (async () => {
              try {
                const connector = await get().db.getConnector();
                const result = await connector.query(
                  `SELECT current_schema() AS schema, current_database() AS database`,
                );
                set((state) =>
                  produce(state, (draft) => {
                    draft.db.currentSchema = result.getChild('schema')?.get(0);
                    draft.db.currentDatabase = result
                      .getChild('database')
                      ?.get(0);
                  }),
                );
                const newTables = await get().db.loadTableSchemas();
                // Only update if there's an actual change in the schemas
                if (!deepEquals(newTables, get().db.tables)) {
                  set((state) =>
                    produce(state, (draft) => {
                      draft.db.tables = newTables;
                      draft.db.schemaTrees = createDbSchemaTrees(newTables);
                    }),
                  );
                }
                return newTables;
              } catch (err) {
                get().room.captureException(err);
                return [];
              } finally {
                refreshPromise = null;
                set((state) =>
                  produce(state, (draft) => {
                    draft.db.isRefreshingTableSchemas = false;
                  }),
                );
              }
            })();
-           return get().db.tables;
+           return refreshPromise;
          },
🤖 Prompt for AI Agents
In `@packages/duckdb/src/DuckDbSlice.ts` around lines 618 - 665,
refreshTableSchemas currently returns the stale get().db.tables when
refreshPromise exists and also does not await the newly created IIFE, so callers
get pre-refresh data; change the dedup pattern so that when refreshPromise is
already set the function returns that promise, and when you create
refreshPromise = (async () => { ... })() you should return/await that promise
instead of immediately returning get().db.tables; update the function
refreshTableSchemas and its use of refreshPromise so all callers receive the
same Promise<DataTable[]> (resolve value should be the newTables returned from
get().db.loadTableSchemas) while preserving the existing error handling and
final cleanup logic.

Comment on lines +503 to +506
console.log('updateReadyDataSources', {config});
if (!config) {
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove debug console.log before merging.

This looks like a leftover debug statement. It logs the full config object on every updateReadyDataSources call (including from the store subscription at line 491), which will be noisy in production and may expose config details in the console.

Proposed fix
     const {config} = get().room;
-    console.log('updateReadyDataSources', {config});
     if (!config) {
       return;
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
console.log('updateReadyDataSources', {config});
if (!config) {
return;
}
if (!config) {
return;
}
🤖 Prompt for AI Agents
In `@packages/room-shell/src/RoomShellSlice.ts` around lines 503 - 506, Remove the
debug console.log in updateReadyDataSources that prints the full config object;
locate the function updateReadyDataSources (and the store subscription that
calls it) and delete the console.log('updateReadyDataSources', {config}) line so
the function simply guards on if (!config) return and proceeds without logging
sensitive or noisy config details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments