arrow ipc sdk by monochromatti · Pull Request #5814 · feldera/feldera

monochromatti · 2026-03-13T06:51:09Z

Summary

This PR adds Arrow IPC query support to the Python SDK so query results can be fetched directly as pyarrow.Table.

What changed

Added a new client API:
- FelderaClient.query_as_arrow_ipc(...) -> pyarrow.Table
Added a pipeline convenience method:
- Pipeline.query_arrow(...) -> pyarrow.Table
Added optional arrow extra for the Python package:
- pip install "feldera[arrow]"
Updated Python README with installation guidance for Arrow support.
Added unit tests covering:
- non-empty results
- empty results with schema preservation
- request parameter wiring
- missing-pyarrow error path
- Pipeline.query_arrow delegation
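
The optional `arrow` extra and the missing-pyarrow error path listed above follow a common pattern: import the dependency lazily and, if that fails, raise an error that names the extra to install. The following is a minimal sketch of that pattern, not the SDK's actual code. The helper name `require_optional` and the exact error wording are hypothetical.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or raise an error naming the extra to install.

    Hypothetical helper for illustration only; the SDK may implement this differently.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Chain the original error so the underlying cause stays visible.
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f'Install it with: pip install "feldera[{extra}]"'
        ) from exc
```

A caller would invoke `require_optional("pyarrow", "arrow")` at the top of an Arrow-returning method, so users without the extra only see the error when they actually request Arrow results.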

Notes

Arrow IPC responses are currently fully buffered in memory before deserialization.
Error messaging for missing pyarrow is explicit and points users to feldera[arrow].

mythical-fred

LGTM — but see inline: there is an existing open PR covering the same feature.

mythical-fred · 2026-03-13T08:41:34Z

python/feldera/rest/feldera_client.py

@@ -1217,6 +1232,51 @@ def query_as_parquet(self, pipeline_name: str, query: str, path: str):
                file.write(chunk)


Heads up: PR #4226 ("py: support arrow_ipc format for adhoc queries" by @abhizer) is still open and touches the same files with similar intent. It has been open since June 2025 waiting for @gz to review. You may want to coordinate — either close one in favour of the other, or check whether #4226 has superseded functionality that should be absorbed here.

gz · 2026-03-13T16:39:29Z

Hi @monochromatti, this looks good, thanks a lot for your contribution. @abhizer, can you review this?

monochromatti · 2026-03-13T16:43:58Z

I'd like input on whether to return Generator[pyarrow.RecordBatch, ...] or a pyarrow.Table directly. The latter is the current state of the PR, but on reflection, yielding batches seems more consistent with similar existing functionality and better suited to large payloads.

abhizer

Thank you!

As a heads-up, the reason we didn't merge the prior PR is that the server intermittently sent bad data and we were unable to figure out why.

abhizer · 2026-03-13T17:07:47Z

> I'd like input on whether to return Generator[pyarrow.RecordBatch, ...] or a pyarrow.Table directly

We normally return a generator, and it might be a good idea to keep this behavior consistent.

mihaibudiu · 2026-03-18T18:33:50Z

@monochromatti please re-request a review from @abhizer when this is ready again

Signed-off-by: Mattias Matthiesen <mattias.matthiesen@eviny.no>

monochromatti force-pushed the arrow-ipc-sdk branch 2 times, most recently from 4065f37 to edcaa7e on March 13, 2026 06:54

mythical-fred approved these changes Mar 13, 2026

monochromatti mentioned this pull request Mar 13, 2026

py: support arrow_ipc format for adhoc queries #4226

Open

gz requested a review from abhizer March 13, 2026 16:39

abhizer approved these changes Mar 13, 2026

monochromatti force-pushed the arrow-ipc-sdk branch from edcaa7e to dd5c74e on March 18, 2026 13:06

monochromatti added 3 commits March 23, 2026 09:23

[python] Add optional arrow dependency and installation docs

f000697

Signed-off-by: Mattias Matthiesen <mattias.matthiesen@eviny.no>

[python] Add Arrow IPC query API to client and pipeline

c9d89bd

Signed-off-by: Mattias Matthiesen <mattias.matthiesen@eviny.no>

[python] Add tests for Arrow IPC query results

379bfe8

Signed-off-by: Mattias Matthiesen <mattias.matthiesen@eviny.no>

monochromatti force-pushed the arrow-ipc-sdk branch from dd5c74e to 379bfe8 on March 23, 2026 14:11

monochromatti wants to merge 3 commits into feldera:main from monochromatti:arrow-ipc-sdk

5 participants
