Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ingest): add support for Looker view built from SQL-based derived tables #2478

Merged

Conversation

remisalmon
Copy link
Contributor

This PR adds support for ingesting Looker views built from SQL-based derived tables (https://docs.looker.com/data-modeling/learning-lookml/derived-tables).

Tested with success on our Looker instance, with the lineage of Looker views built from SQL-based derived tables showing up in DataHub.

Also updates the lkml dependency to better parse .lkml files.

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable)

Parse the SQL query in the derived_table field of Looker views to extract upstream tables.
Dicard temporary tables created with the SQL WITH statement.

Update the lkml dependency to parse .lkml files without throwing errors when field values contain special characters.
Copy link
Collaborator

@hsheth2 hsheth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks @remisalmon!

@frsann is this OK to merge before you port the Looker integration into the main ingestion framework?

@frsann
Copy link
Contributor

frsann commented Apr 30, 2021

Yes, go ahead! I haven't started yet! Awesome stuff here!

@frsann
Copy link
Contributor

frsann commented Apr 30, 2021

A quick question, from a Looker newbie: are the sql queries in the derived tables expressed in the dialect of the "backend" database? I was curious about this sql-metadata package and tried it on some of our more complicated Snowflake queries, and it failed to parse correctly.

Copy link
Contributor

@shirshanka shirshanka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for this contribution :)

@remisalmon
Copy link
Contributor Author

@frsann the answer to your question seems to be yes: https://docs.looker.com/reference/view-params/derived_table#supported_database_dialects_for_derived_tables

I also found an issue with sql-metadata that I reported upstream: macbre/sql-metadata#128

@frsann
Copy link
Contributor

frsann commented May 1, 2021

Considering that the sql parsing might not be water tight, should we put this behind feature flag?

@jjoyce0510
Copy link
Collaborator

Considering that the sql parsing might not be water tight, should we put this behind feature flag?

Agree here -- seems like the sql-metadata libs only supports a few dialects, whereas Looker appears to support many more?

Have we considered whether something like sqlparse is suitable?

@shirshanka
Copy link
Contributor

I'm going to merge this in first. We can improve sql parsing in a follow up PR. Want to first prioritize the moving of this to the ingestion framework.

@shirshanka shirshanka merged commit 7948226 into datahub-project:master May 3, 2021
@hsheth2
Copy link
Collaborator

hsheth2 commented May 3, 2021

@jjoyce0510 sql-metadata actually uses sqlparse under the hood. Agree that we should have put this behind a feature flag since it's not yet fully baked.

@remisalmon
Copy link
Contributor Author

@hsheth2 @jjoyce0510 I am trying to get some sql-metadata bugs fixed upstream. I'll open another PR here once a new version of sql-metadata is out.

@remisalmon remisalmon deleted the feat-looker-derived-tables branch May 19, 2021 04:01
@remisalmon remisalmon restored the feat-looker-derived-tables branch May 19, 2021 04:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants