BigQuery: add --max_results option to magic#9169
Conversation
|
I just merged #9167. We might have to rebase. |
|
All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the ℹ️ Googlers: Go here for more info. |
23fb140 to
37b4d39
Compare
|
CLAs look good, thanks! ℹ️ Googlers: Go here for more info. |
tswast
left a comment
There was a problem hiding this comment.
I'd actually love to see a system / integration test for this feature, but I know we don't have any examples for you to work off of for that. (I'll see what I can do about that, since I know this will come up again in the future.)
Since we don't have a system for notebook integration tests in this repo, I'd like to see the results of manually testing this feature in a notebook.
| print("\rQuery executing: {:0.2f}s".format(time.time() - start_time), end="") | ||
| try: | ||
| query_job.result(timeout=0.5) | ||
| query_job.result(timeout=0.5, max_results=max_results) |
There was a problem hiding this comment.
Since we aren't actually returning the results in this line (just waiting for the query to finish), we don't need to pass max_results here. I believe that means we can remove the max_results parameter from _run_query as well.
| "Defaults to returning all rows." | ||
| ), | ||
| ) | ||
| @magic_arguments.argument( |
There was a problem hiding this comment.
Looks like these arguments got doubled up. (Maybe the two commits thing, we noticed when rebasing?)
There was a problem hiding this comment.
That's probably what happened (moral of the story, always run tests before pushing). Removed the duplicate in 147e43d
| ip.run_cell_magic("bigquery", "--use_legacy_sql", "SELECT 17 AS num") | ||
|
|
||
| job_config_used = run_query_mock.call_args_list[0][0][-1] | ||
| job_config_used = run_query_mock.call_args_list[0][1]["job_config"] |
There was a problem hiding this comment.
Was this change intentional? Maybe a bad rebase?
There was a problem hiding this comment.
This was intentional. It was to fix some test failures I was getting because the argument that was being retrieved was the wrong type (a result of changing the method signature for _run_query). This way the arg being accessed will always be job_config, even if other named parameters are added. Since I reverted the changes to _run_query, I can probably change this back too. I think the tests will pass either way
| try: | ||
| query_job = _run_query(client, query, job_config) | ||
| query_job = _run_query( | ||
| client, query, job_config=job_config, max_results=max_results |
There was a problem hiding this comment.
max_results is not actually needed here, it's not until the call to to_dataframe that we need max_results.
Note: to_dataframe probably doesn't have a max_results argument, and I'm actually not certain that we'd want to add one. Instead, we can call query_job.results with a max_results argument and then call to_dataframe on the resulting RowIterator.
There was a problem hiding this comment.
I updated to call to_dataframe on query_job.result, passing max_results as an argument if max_results is present in f88ffe8
* added max_results magic option and fixed broken tests * added tests for --max_results magic option * added max_results magic option and fixed broken tests * added tests for --max_results magic option * Removed duplicate `--max_results` magic argument * removed max_results param from run_query, updated tests
* added max_results magic option and fixed broken tests * added tests for --max_results magic option * added max_results magic option and fixed broken tests * added tests for --max_results magic option * Removed duplicate `--max_results` magic argument * removed max_results param from run_query, updated tests

Second of 3 PRs towards resolving #9105 as described in review for #9147