feat(ingest/snowflake): allow option for incremental properties #12080

Merged


mayurinehate
Collaborator

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions github-actions bot added the ingestion PR or Issue related to the ingestion of metadata label Dec 10, 2024
@datahub-cyborg datahub-cyborg bot added the needs-review Label for PRs that need review from a maintainer. label Dec 10, 2024
return MetadataWorkUnit(id=MetadataWorkUnit.generate_workunit_id(mcp), mcp_raw=mcp)


def auto_incremental_properties(
Collaborator

This seems to convert the entire aspect to a PATCH. For now, can we just patch the custom properties, to limit surface area?

Collaborator Author

I don't think it would work as expected if we emit the rest of the aspect as a whole and only the custom properties as a patch. Emitting the whole aspect will overwrite any parallel edits made to datasetProperties.
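
To make the concern concrete, here is a toy model using plain dicts. It is not DataHub's API: `full_upsert`, `apply_patch`, and the sample property names are hypothetical stand-ins, and the only assumption is that a whole-aspect write replaces the stored value while a patch touches just the named paths.

```python
from copy import deepcopy
from typing import Any, Dict, List, Tuple

Aspect = Dict[str, Any]
# (path, value) pairs; path is a tuple of keys into the aspect dict.
PatchOp = Tuple[Tuple[str, ...], Any]


def full_upsert(stored: Aspect, emitted: Aspect) -> Aspect:
    """Whole-aspect write: the emitted value replaces the stored one."""
    return deepcopy(emitted)


def apply_patch(stored: Aspect, ops: List[PatchOp]) -> Aspect:
    """Field-level write: only the paths named in ops are touched."""
    result = deepcopy(stored)
    for path, value in ops:
        target = result
        for key in path[:-1]:
            target = target.setdefault(key, {})
        target[path[-1]] = value
    return result


# Stored state: another writer added a custom property in parallel.
stored = {
    "name": "orders",
    "customProperties": {"owner_team": "finance"},
}

# What ingestion extracted from the source system (hypothetical values).
ingested = {
    "name": "orders",
    "description": "Orders table",
    "customProperties": {"CLUSTERING_KEY": "LINEAR(id)"},
}

# Mixed approach: upsert the rest of the aspect, then patch custom properties.
without_custom = {k: v for k, v in ingested.items() if k != "customProperties"}
after_mixed = apply_patch(
    full_upsert(stored, without_custom),
    [(("customProperties", "CLUSTERING_KEY"), "LINEAR(id)")],
)

# Patch-only approach: every field goes through a patch operation.
after_patch = apply_patch(
    stored,
    [
        (("description",), "Orders table"),
        (("customProperties", "CLUSTERING_KEY"), "LINEAR(id)"),
    ],
)

print(after_mixed["customProperties"])
print(after_patch["customProperties"])
```

In this model the mixed approach drops `owner_team`, because the upsert has already replaced the stored aspect before the patch runs; the patch-only approach keeps it.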

@datahub-cyborg datahub-cyborg bot added pending-submitter-response Issue/request has been reviewed but requires a response from the submitter and removed needs-review Label for PRs that need review from a maintainer. labels Dec 11, 2024
Contributor

Is there any plan or intention to make this generic to other entities beyond Dataset?

The current implementation is specific to the DatasetProperties aspect; however, the names (incremental_properties_helper.py, IncrementalPropertiesConfigMixin, ...) suggest this could apply to the properties of any entity (Container, ...).

Collaborator Author

That's a very good question, and I did consider it myself.
I was hoping we can reuse this for the rest of the entity types as well in the future.

Contributor

If keeping it generic is the plan, we could make the flag description generic too, by mentioning entities instead of datasets.

Also, we could add some conditional logic when fetching the DatasetPropertiesClass instead of assuming it's a dataset entity; for the moment, we can skip processing other entity types. WDYT?

            properties_aspect = wu.get_aspect_of_type(DatasetPropertiesClass)

Collaborator Author

DatasetPropertiesClass is only present for the dataset entity. For other entities it would be absent, so the call would return None. So it's safe.

I'll update the description once this config and workunit processor are adopted for other entities. I'm also planning a quick follow-up PR to address the TODOs in this one, and will make the changes there.
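
A minimal sketch of the None-check behaviour described above. The classes here are simplified stand-ins written for illustration (they are not the real `MetadataWorkUnit` or `DatasetPropertiesClass`), and `incremental_properties_sketch` is a hypothetical name; the only thing taken from the PR is that `get_aspect_of_type` is called with the dataset properties class and yields None when the work unit carries some other aspect.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, Optional, Type, TypeVar

T = TypeVar("T")


@dataclass
class DatasetProperties:
    """Stand-in for DatasetPropertiesClass."""
    custom_properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class ContainerProperties:
    """Stand-in for a properties aspect of a non-dataset entity."""
    name: str = ""


@dataclass
class WorkUnit:
    """Stand-in for MetadataWorkUnit; is_patch marks a converted unit."""
    urn: str
    aspect: object
    is_patch: bool = False

    def get_aspect_of_type(self, aspect_cls: Type[T]) -> Optional[T]:
        # Returns the aspect only when it matches the requested class.
        return self.aspect if isinstance(self.aspect, aspect_cls) else None


def incremental_properties_sketch(stream: Iterable[WorkUnit]) -> Iterator[WorkUnit]:
    for wu in stream:
        props = wu.get_aspect_of_type(DatasetProperties)
        if props is None:
            # Not a dataset properties aspect: pass through unchanged.
            yield wu
            continue
        # Dataset properties: re-emit flagged as a patch (conversion elided).
        yield WorkUnit(urn=wu.urn, aspect=props, is_patch=True)


units = [
    WorkUnit("urn:li:dataset:example", DatasetProperties({"k": "v"})),
    WorkUnit("urn:li:container:example", ContainerProperties("db")),
]
processed = list(incremental_properties_sketch(units))
patch_flags = [wu.is_patch for wu in processed]
print(patch_flags)
```

Under these assumptions only the dataset work unit is converted; the container work unit passes through untouched, which is the "skip other entity types" behaviour asked about.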

@datahub-cyborg datahub-cyborg bot added pending-submitter-merge and removed pending-submitter-response Issue/request has been reviewed but requires a response from the submitter labels Dec 11, 2024
@mayurinehate mayurinehate merged commit 355a7e6 into datahub-project:master Dec 11, 2024
123 of 156 checks passed
Labels
ingestion PR or Issue related to the ingestion of metadata pending-submitter-merge
3 participants