Could you provide an example of an HTTP Scan Source (not a lookup source)? #41

Open
ChenShuai1981 opened this issue Dec 17, 2022 · 3 comments
Labels
question Further information is requested

Comments

@ChenShuai1981

ChenShuai1981 commented Dec 17, 2022

Could you provide an example of an HTTP periodic scan source (not a lookup source)? Does it support renewing the access token after it expires?

@kristoffSC
Collaborator

kristoffSC commented Dec 18, 2022

Hi @ChenShuai1981
A scan source is currently not supported by this connector, so there is no example available :) For now we only have a lookup source. It would be a great feature, though. Would you like to contribute? :)

The proper Flink interfaces would have to be implemented.

It would be a nice feature, although it would be very "client specific".

@davidradl
Contributor

@ChenShuai1981 the lookup support that currently exists ends up issuing GET, PUT, or POST requests for single records. For a scan to work, I suspect we would need to issue searches and handle paging of the results. This could really hurt scan performance, as we could end up effectively doing full table scans unless we could do predicate pushdown.
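
To illustrate the paging and pushdown concern, here is a minimal Python sketch. It is not part of this connector and does not use the Flink API. The page shape (`items`, `next`), the `scan` helper, and the in-memory `fake_fetch_page` endpoint are all hypothetical stand-ins for a real search API.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical page shape: {"items": [...], "next": <cursor string or None>}
Page = Dict[str, Any]
FetchPage = Callable[[Optional[str], Dict[str, str]], Page]


def scan(fetch_page: FetchPage,
         filters: Optional[Dict[str, str]] = None) -> Iterator[dict]:
    """Yield every record, following cursors until the server reports no next page.

    `filters` is sent with each request, so the server (not the client)
    discards non-matching rows. That is the "predicate pushdown" part.
    """
    cursor: Optional[str] = None
    params = dict(filters or {})
    while True:
        page = fetch_page(cursor, params)
        yield from page["items"]
        cursor = page.get("next")
        if cursor is None:
            break


# In-memory stand-in for a remote search endpoint (assumption, for illustration).
ROWS = [{"id": i, "status": "open" if i % 2 else "closed"} for i in range(1, 8)]
CALLS = []  # records one entry per simulated HTTP request


def fake_fetch_page(cursor: Optional[str], params: Dict[str, str],
                    page_size: int = 2) -> Page:
    CALLS.append(cursor)
    # Server-side filtering: keep only rows matching every pushed-down filter.
    rows = [r for r in ROWS if all(str(r[k]) == v for k, v in params.items())]
    start = int(cursor or 0)
    end = start + page_size
    return {"items": rows[start:end], "next": str(end) if end < len(rows) else None}
```

Calling `list(scan(fake_fetch_page))` walks every page of the data set, while `list(scan(fake_fetch_page, {"status": "open"}))` needs fewer requests because the filter travels with each one. Without pushdown, every scan pays the cost of the first form.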

@ChenShuai1981
Author

> @ChenShuai1981 the lookup support that exists currently ends up issuing gets, puts or posts on single records. For the scan to work, I suspect we would need to issue searches, and get involved with paging the results. This could really impact performance of a scan, as we could end up effective doing table scans, unless we could do predicate pushdown.

Yes, you are right. Since the content provider updates its information irregularly, we have to periodically send GET/POST requests to fetch it and sync it into our database. Typical scenarios are web crawlers and system integration. Generally speaking, if the result set is too large, the provider returns a streaming response.
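
The periodic fetch with token renewal described here could look roughly like the Python sketch below. It is not something the connector provides. `TokenCache`, `poll`, and the callables passed into them are hypothetical names, and the clock and sleep functions are injected so the loop can be exercised without real waiting.

```python
import time
from typing import Callable, List, Optional, Tuple


class TokenCache:
    """Caches an access token and fetches a new one shortly before it expires."""

    def __init__(self, fetch_token: Callable[[], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic,
                 margin: float = 30.0):
        self._fetch_token = fetch_token  # returns (token, lifetime in seconds)
        self._clock = clock
        self._margin = margin            # renew this many seconds before expiry
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._fetch_token()
            self._token = token
            self._expires_at = now + lifetime
        return self._token


def poll(tokens: TokenCache,
         fetch: Callable[[str], List[dict]],
         sink: Callable[[dict], None],
         interval: float,
         rounds: int,
         sleep: Callable[[float], None] = time.sleep) -> None:
    """Run `rounds` polling cycles: get a valid token, fetch records, pass them to the sink."""
    for i in range(rounds):
        for record in fetch(tokens.get()):
            sink(record)
        if i < rounds - 1:
            sleep(interval)  # wait before the next cycle
```

In a real job, `fetch` would issue the GET/POST request with the token in a header and `sink` would write to the database. A streaming response would additionally need incremental parsing, which this sketch leaves out.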
