4065f37 to edcaa7e
mythical-fred
left a comment
LGTM — but see inline: there is an existing open PR covering the same feature.
@@ -1217,6 +1232,51 @@ def query_as_parquet(self, pipeline_name: str, query: str, path: str):
file.write(chunk)
Heads up: PR #4226 ("py: support arrow_ipc format for adhoc queries" by @abhizer) is still open and touches the same files with similar intent. It has been open since June 2025 waiting for @gz to review. You may want to coordinate — either close one in favour of the other, or check whether #4226 has superseded functionality that should be absorbed here.
hi @monochromatti this looks good thanks a lot for your contribution. @abhizer can you review this
I'd like input on whether to return Generator[pyarrow.RecordBatch, ...] or a pyarrow.Table directly. The latter is the current state of the PR, but after some thinking it feels like generating batches is more in style with similar existing functionality and better suited for big payloads.
abhizer
left a comment
Thank you!
As a heads up, the reason we didn't merge the prior PR is that the server intermittently sent bad data and we were unable to figure out why.
We normally return a generator, and it might be a good idea to keep this behavior consistent.
edcaa7e to dd5c74e
@monochromatti please re-request a review from @abhizer when this is ready again
Signed-off-by: Mattias Matthiesen <mattias.matthiesen@eviny.no>
Signed-off-by: Mattias Matthiesen <mattias.matthiesen@eviny.no>
Signed-off-by: Mattias Matthiesen <mattias.matthiesen@eviny.no>
dd5c74e to 379bfe8
Summary
This PR adds Arrow IPC query support to the Python SDK so query results can be fetched directly as pyarrow.Table.
What changed
Notes