Skip to content

Conversation

@skvark
Copy link
Contributor

@skvark skvark commented Apr 9, 2024

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you installed pre-commit with pip3 install pre-commit and set up hooks with pre-commit install?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully ran tests with your changes locally?

What

This pull request adds a custom Bearer authentication token provider function for the HTTP clients. Additionally, Bearer token authentication is enabled for the GRPC clients as well.

Why

Currently, it is impossible to use the clients properly with token-based authentication, since the the only way of passing the token is via the static metadata parameter. When the token expires, the whole client needs to be recreated.

Changes

The clients have a new parameter, auth_token_provider. This can be sync or async function, that returns str Bearer token.

For the HTTP clients, this PR adds a new custom BearerAuth class that is based on httpx.Auth class (https://www.python-httpx.org/advanced/authentication/#custom-authentication-schemes). This is passed down in the auth parameter to the httpx library.

The GRPC code in connection.py has been refactored to support the new feature and the code in get_channel and get_async_channel should be simpler now.

Example

There are simple examples in the tests, but here is a proper example with azure-identity library:

import qdrant_client
from azure.identity.aio import DefaultAzureCredential

credential = DefaultAzureCredential(exclude_interactive_browser_credential=False)

async def get_token():
    result = await credential.get_token("your_qdrant_app_scope")
    return result.token

client = qdrant_client.AsyncQdrantClient(
    url=f"your.qdrant.app.address",
    https=True,
    port=443,
    auth_token_provider=get_token,
)

await client.get_collections()

@netlify
Copy link

netlify bot commented Apr 9, 2024

Deploy Preview for poetic-froyo-8baba7 ready!

Name Link
🔨 Latest commit 574d88a
🔍 Latest deploy log https://app.netlify.com/sites/poetic-froyo-8baba7/deploys/661d53a4117a930008da35bc
😎 Deploy Preview https://deploy-preview-591--poetic-froyo-8baba7.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link
Member

@joein joein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skvark thank you for the contribution!

if self._timeout is not None:
self._rest_args["timeout"] = self._timeout
if self._auth_token_provider is not None:
if self.__class__.__name__ == "QdrantRemote" and asyncio.iscoroutinefunction(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to avoid magic methods checks and avoid adding it to AsyncQdrantClient as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I didn't like this myself either. It's better now after your changes.

@@ -0,0 +1,44 @@
import asyncio
import threading
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to clean these after removing some stuff from the auth class.

self, continuation: Any, client_call_details: Any, request: Any
) -> Any:
new_details, new_request_iterator, postprocess = self._fn(
new_details, new_request_iterator, postprocess = await self._fn(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch 🙈


def header_adder_interceptor(new_metadata: List[Tuple[str, str]]) -> _GenericClientInterceptor:
def header_adder_interceptor(
new_metadata: List[Tuple[str, str]] = [],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lists should not be default values due to their mutability

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True 👍


ids = [
GrpcToRest.convert_point_id(idx) if isinstance(idx, grpc.PointId) else idx
(GrpcToRest.convert_point_id(idx) if isinstance(idx, grpc.PointId) else idx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like you did not use pre-commit or used a different python version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did use pre-commit but it seems I had too new Python version.

@joein
Copy link
Member

joein commented Apr 10, 2024

I've left some comments, I'll address them and merge

@skvark
Copy link
Contributor Author

skvark commented Apr 10, 2024

I've left some comments, I'll address them and merge

Thanks for fixing the issues!

@joein
Copy link
Member

joein commented Apr 11, 2024

Hey @skvark
I've given this PR another thought

I am somewhat reluctant on bringing new parameters to init (and I think that auth_token_provider actually affects only a small subset of users)

Afaik it's not a common practice to propagate token reissue logic into a database object, and it is usually handled on the client side, isn't it?

@skvark
Copy link
Contributor Author

skvark commented Apr 12, 2024

Hey @skvark I've given this PR another thought

I am somewhat reluctant on bringing new parameters to init (and I think that auth_token_provider actually affects only a small subset of users)

Afaik it's not a common practice to propagate token reissue logic into a database object, and it is usually handled on the client side, isn't it?

Instead of init, we could introduce some new method to the client that registers the authentication token provider.

Here's one example from SQLAlchemy. It uses an event-based approach, but the renewal mechanism is the same: https://docs.sqlalchemy.org/en/20/core/engines.html#generating-dynamic-authentication-tokens

Weaviate implements the OIDC auth flows themselves (but their DB also validates the tokens so it's a bit different): Weaviate

All databases should sit behind a proper authentication (and authorization) layer in addition to firewalls. Oauth 2.0 bearer tokens are the standard way of implementing that layer. We can use the current QDrant client, but not having a proper way to update the authorization headers on the existing Qdrant database client object leads to unnecessarily complicated code on our side.

@joein
Copy link
Member

joein commented Apr 15, 2024

@skvark I think I'll try to make it in a separate method

@joein
Copy link
Member

joein commented Apr 16, 2024

I'm going to merge it now, thank you for the contribution @skvark

@joein joein merged commit d957ca7 into qdrant:dev Apr 16, 2024
@skvark
Copy link
Contributor Author

skvark commented Apr 16, 2024

I'm going to merge it now, thank you for the contribution @skvark

Thanks!

joein added a commit that referenced this pull request Apr 22, 2024
* bearer token authentication provider support

* add tests and checks, move auth file to separate dir

* fix error message

* remove locks

* rename var

* refactoring: refactor exceptions, fix mypy

* fix: regen async

* tests: extend token tests to check token updates

* new: add warning when auth token provider is used with an insecure connection

* fix: propagate auth token to rest client even with prefer_grpc set

---------

Co-authored-by: George Panchuk <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants