Warn when using user credentials from the Cloud SDK #266

Merged
merged 3 commits into from
May 31, 2018


theacodes
Contributor

No description provided.

@theacodes theacodes requested a review from crwilcox May 31, 2018 20:06
_HELP_MESSAGE = """\
Could not automatically determine credentials. Please set {env} or \
explicitly create credential and re-run the application. For more \
information, please see


@theacodes theacodes merged commit a8d9348 into master May 31, 2018
@theacodes theacodes deleted the warn-on-cloud-sdk-credentials branch May 31, 2018 21:52
@max-sixty

Hi @theacodes & @crwilcox,

Each time we run a BigQuery request we now get this warning. Does this warn every time a user makes a request with their credentials? If so, is that balancing information and noise well? My understanding is that there are lots of legitimate uses of user accounts.

I'm also a committer of https://github.com/pydata/pandas-gbq under @tswast, where we intentionally fully support user credentials - potentially conflicting with this warning.

Thanks

@tswast
Contributor

tswast commented Jun 8, 2018

One reason I made googleapis/python-bigquery-pandas#161 is to make it easier to explicitly use user credentials when those are desired.

In pandas-gbq, you can avoid this warning by disabling default credentials: set the GOOGLE_APPLICATION_CREDENTIALS environment variable to an empty string.

@theacodes
Contributor Author

This warning shows up every time you call google.auth.default() and the resolved credentials are associated with the Cloud SDK's client ID. You can cache the return value or filter the warning using warnings.filterwarnings.
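
For anyone landing here later, a minimal sketch of both options (caching and filtering). To keep it runnable without credentials, `fake_default` is a hypothetical stand-in for `google.auth.default()`. The warning prefix and the `UserWarning` category are assumptions about how the warning is emitted, so check them against your installed google-auth before relying on the filter.

```python
import functools
import warnings

# Assumed start of the warning text; verify against your google-auth version.
CLOUD_SDK_WARNING_PREFIX = (
    "Your application has authenticated using end user credentials"
)


def fake_default():
    """Hypothetical stand-in for google.auth.default(); warns on every call."""
    warnings.warn(CLOUD_SDK_WARNING_PREFIX + " from Google Cloud SDK.")
    return ("credentials", "project-id")


@functools.lru_cache(maxsize=None)
def cached_default():
    """Option 1: resolve once, so the warning is raised at most once."""
    return fake_default()  # in real code: google.auth.default()


def quiet_default():
    """Option 2: suppress only this warning while resolving credentials."""
    with warnings.catch_warnings():
        # `message` is a regex matched against the start of the warning text.
        warnings.filterwarnings(
            "ignore",
            message=CLOUD_SDK_WARNING_PREFIX,
            category=UserWarning,
        )
        return fake_default()  # in real code: google.auth.default()
```

Caching keeps the warning visible once per process, which is probably the better default. The filter hides it entirely, so it is only appropriate when user credentials are a deliberate choice.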

@tswast
Contributor

tswast commented Jun 8, 2018

Yeah, it's also on my list to cache the credentials in pandas-gbq. :-)

@max-sixty

> This warning shows up every time you call google.auth.default() and the credentials resolved are associated with the Cloud SDK's client ID

Can you help me understand why this warns? We use this code path for all our research. (then when we deploy to GKE it resolves to the service account)

I can understand that running servers on user auth is bad, but presumably lots of non-server applications are going to hit this code path?

@theacodes
Contributor Author

I summarized over at #271 (comment)

The trouble with the Cloud SDK is that all credentials that are issued from it share the same client ID, and thus, share quota. This doesn't work well for a significant portion of our APIs and often leads to users being confused and frustrated when things don't work.

Also, requiring users to install a 300 MB SDK that they may not need just to authenticate a script is massive overkill.

@maxim-lian are you using pandas-gbq? If so, it's already equipped to do the right thing™ here and obtain authorization itself.

@max-sixty

That comment makes sense on the intention.

Forgive me if these questions seem pedestrian (and @tswast has already spent lots of time hand-holding me through some auth questions): what would you recommend for us? We run a set of processes that interact with Google services extensively, e.g. reading and writing to BQ and GCS. We run these in local containers when we're doing research, and then on GKE in production.

User accounts are extremely convenient (much more so than having everyone create a service account, and then managing permissions for those in parallel to user permissions).

Is this a ClientID issue? Should we be using a different one?

Feel free to point me towards documentation if it's out there.

Thanks

@tswast
Contributor

tswast commented Jun 11, 2018

With local containers, my recommendation would be to let pandas-gbq do the user auth flow. Revoke the default credentials with

gcloud auth application-default revoke

to make pandas-gbq and other tools stop picking up the user auth from gcloud.

@max-sixty

Thanks Tim

Would this generalize to using GCS, for example?

@tswast
Contributor

tswast commented Jun 11, 2018

No, it doesn't generalize to GCS, unless we create a pandas-gbq-like wrapper for GCS, too.

The pandas_gbq.get_user_credentials function I propose in googleapis/python-bigquery-pandas#161 would work for creating a GCS client, but then your code would be explicitly using user credentials, unlike the "default" auth method.
