
Use temporary file in load_table_from_dataframe#7545

Merged
tswast merged 3 commits into googleapis:master from tswast:issue7543-load-table-dataframe on Mar 25, 2019

Conversation

@tswast
Contributor

@tswast tswast commented Mar 22, 2019

This fixes a bug where `load_table_from_dataframe` could not be used
with the `fastparquet` library. It should also use less memory when
uploading large dataframes.

Fixes #7543
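
The approach named in the title can be sketched roughly as follows. This is a minimal illustration of the temporary-file pattern, not the code merged in this PR: `load_dataframe_via_tempfile` and `upload_file` are hypothetical names, `dataframe` is assumed to expose a pandas-style `to_parquet(path)` method, and `upload_file` stands in for whatever sends an open binary file to BigQuery (for example, a thin wrapper around `Client.load_table_from_file`).

```python
import os
import tempfile


def load_dataframe_via_tempfile(dataframe, upload_file):
    """Write `dataframe` to a temporary parquet file, then upload that file.

    Both parameter names are hypothetical; see the assumptions above.
    """
    # mkstemp returns an open OS-level descriptor plus the path. Close the
    # descriptor right away so the parquet engine can open the path itself.
    fd, tmp_path = tempfile.mkstemp(suffix=".parquet")
    os.close(fd)
    try:
        dataframe.to_parquet(tmp_path)
        with open(tmp_path, "rb") as parquet_file:
            return upload_file(parquet_file)
    finally:
        # Clean up whether the write and upload succeeded or raised.
        os.remove(tmp_path)
```

Handing the engine a filesystem path, rather than an in-memory buffer, is assumed here to be what lets either parquet engine do the write. Reading the upload from disk also avoids keeping a fully serialized copy of the dataframe in memory.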

@tswast tswast requested review from alixhami and tseaver March 22, 2019 16:36
@tswast tswast requested a review from crwilcox as a code owner March 22, 2019 16:36
@googlebot googlebot added the cla: yes label Mar 22, 2019
@alixhami
Contributor

It might be a good idea to have tests that explicitly set the parquet engine to pyarrow and fastparquet individually, since it sounds like users regularly use both.
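
One way such tests could look is sketched below, with the same check run once per engine and the engine always passed explicitly. This is not the test code added to the PR: `write_parquet` is a hypothetical helper, and a `mock.Mock` stands in for the dataframe so the sketch runs without pandas or either parquet library installed. The `engine` keyword is the one accepted by pandas' `DataFrame.to_parquet`.

```python
import unittest
from unittest import mock

# The two engines named in the review comment above.
PARQUET_ENGINES = ("pyarrow", "fastparquet")


def write_parquet(dataframe, path, engine):
    """Hypothetical helper: serialize with an explicitly chosen engine."""
    dataframe.to_parquet(path, engine=engine)


class ParquetEngineTest(unittest.TestCase):
    def test_engine_is_passed_explicitly(self):
        for engine in PARQUET_ENGINES:
            # subTest reports each engine separately if one of them fails.
            with self.subTest(engine=engine):
                dataframe = mock.Mock(spec=["to_parquet"])
                write_parquet(dataframe, "table.parquet", engine)
                dataframe.to_parquet.assert_called_once_with(
                    "table.parquet", engine=engine
                )
```

Against a real dataframe, the mock would be replaced by a small `pandas.DataFrame`, and each case would be skipped when its engine is not installed.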

@tswast
Contributor Author

tswast commented Mar 22, 2019

> It might be a good idea to have tests that explicitly set the parquet engine to pyarrow and fastparquet individually, since it sounds like users regularly use both.

@alixhami Great idea. Done in 4bca63f.

@alixhami
Contributor

It looks like there's an error when building snappy.

@tswast
Contributor Author

tswast commented Mar 22, 2019

I think I need to add an OS-level package (`libsnappy-dev`, according to StackOverflow) to the trampoline image used for the google-cloud-python BigQuery tests. I see from the Kokoro configs that the image is `gcr.io/cloud-devrel-kokoro-resources/python-multi`. I've reached out to the Yoshi folks to see how to add the package to that image or use the image as a base for a BigQuery-specific one.

@tswast tswast added the kokoro:force-run label Mar 23, 2019
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run label Mar 23, 2019
@tswast
Contributor Author

tswast commented Mar 23, 2019

Trying a rebuild now that we have snappy in the Kokoro Dockerfile (googleapis/testing-infra-docker#7).

@tswast
Contributor Author

tswast commented Mar 23, 2019

The api_core test failure (pytype session) appears unrelated to this change.

@tswast tswast requested a review from shollyman March 25, 2019 15:53
@tswast
Contributor Author

tswast commented Mar 25, 2019

The BigQuery tests do pass now.

@tswast tswast merged commit 675dbd8 into googleapis:master Mar 25, 2019
@tswast tswast deleted the issue7543-load-table-dataframe branch March 25, 2019 16:07

Linked issue: BigQuery: load_table_from_dataframe should use a temporary file (#7543)