BigQuery: Add support for BigQuery Storage API Arrow format in to_dataframe and to_arrow.#8551

Merged
tswast merged 20 commits into googleapis:master from TheNeuralBit:bq-storage-arrow on Jul 12, 2019
@TheNeuralBit
Contributor

  • Makes _StreamParser abstract and splits it into two implementations: one for Arrow and one for Avro. The implementation is selected based on the schema set in the ReadSession.
  • Modifies the BigQuery client library to use the Arrow format for calls to client.list_rows(table).to_dataframe(bq_storage_client).
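
For readers who want a picture of the first bullet, here is a minimal sketch of the pattern it describes: an abstract parser with one subclass per wire format, chosen from the schema the session carries. This is not the code from this PR. The class names, the dict standing in for a ReadSession, the "avro_schema" / "arrow_schema" keys, and the fake decoding logic are all assumptions made for illustration; the real parsers decode Avro and Arrow bytes.

```python
import abc


class StreamParser(abc.ABC):
    """Hypothetical abstract parser: turns one message into a list of row dicts."""

    @abc.abstractmethod
    def to_rows(self, message):
        """Decode a single message (a plain dict in this sketch) into rows."""

    @staticmethod
    def from_read_session(read_session):
        # Assumption: the session is a dict that carries exactly one schema key.
        if "avro_schema" in read_session:
            return AvroStreamParser(read_session["avro_schema"])
        if "arrow_schema" in read_session:
            return ArrowStreamParser(read_session["arrow_schema"])
        raise TypeError("Unsupported schema: expected 'avro_schema' or 'arrow_schema'.")


class AvroStreamParser(StreamParser):
    """Stand-in for row-oriented decoding."""

    def __init__(self, schema):
        self._columns = list(schema)

    def to_rows(self, message):
        # Each record is already a tuple of values in column order.
        return [dict(zip(self._columns, record)) for record in message["records"]]


class ArrowStreamParser(StreamParser):
    """Stand-in for columnar decoding."""

    def __init__(self, schema):
        self._columns = list(schema)

    def to_rows(self, message):
        # Values arrive per column, so transpose them into rows.
        columns = [message["columns"][name] for name in self._columns]
        return [dict(zip(self._columns, values)) for values in zip(*columns)]
```

With this shape, the reading loop only ever calls StreamParser.from_read_session(session) and then to_rows(message), so it does not need to know which format the server returned.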

@TheNeuralBit TheNeuralBit requested review from a team and tswast July 1, 2019 19:34
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jul 1, 2019
@tswast tswast added api: bigquery Issues related to the BigQuery API. api: bigquerystorage Issues related to the BigQuery Storage API. do not merge Indicates a pull request not ready for merge, due to either quality or timing. labels Jul 1, 2019
@tswast
Contributor

tswast commented Jul 1, 2019

Added the do not merge label. We'll need to wait for the Arrow changes to hit prod before merging.

@tswast
Contributor

tswast commented Jul 3, 2019

Once #8609 goes in, I'll aim to rebase this on top of that change, providing a fast version of to_arrow, too.

@tswast
Contributor

tswast commented Jul 8, 2019

to_arrow is added in #8609 using tabledata.list. Once that's in, we can update this PR to use the BQ Storage API for that method as well.

@tswast tswast changed the title from "BigQuery Storage: Add support for arrow format in BQ Read API" to "BigQuery: Add support for BigQuery Storage API Arrow format in to_dataframe and to_arrow." Jul 11, 2019
@tswast
Contributor

tswast commented Jul 11, 2019

Let's wait for #8644 to be merged and released to PyPI before merging this one.

@tswast tswast removed api: bigquerystorage Issues related to the BigQuery Storage API. do not merge Indicates a pull request not ready for merge, due to either quality or timing. labels Jul 11, 2019
@tswast tswast requested a review from shollyman July 11, 2019 21:22
@tswast tswast merged commit 8852687 into googleapis:master Jul 12, 2019
parthea pushed a commit that referenced this pull request Aug 21, 2025
…n using BigQuery Storage API. (#8551)

* Use Arrow format in client.list_rows(..).to_dataframe(..) with BQ Storage client

* Add system test for arrow wire format.

* Add system test for to_arrow.

* Exclude bad pyarrow release.
parthea pushed a commit that referenced this pull request Sep 16, 2025
…n using BigQuery Storage API. (#8551)

Labels

api: bigquery Issues related to the BigQuery API. cla: yes This human has signed the Contributor License Agreement.

3 participants