feat(cli) Changes rollback behaviour to apply soft deletes by default #4358

Merged
merged 16 commits into from
Mar 15, 2022


Collaborator

@pedro93 pedro93 commented Mar 9, 2022

Summary:
Addresses the feature request "Flag in delete command to only delete aspects touched by an ingestion run; add flag to nuke everything" by modifying the default behaviour of a rollback operation.

By default, rollback no longer deletes an entity when its keyAspect is being rolled back.

Instead, the key aspect is kept and a StatusAspect is upserted with removed=true, effectively implementing a soft delete.
To make rollback consistent and symmetric, this PR also adds a StatusAspect with removed=false whenever an entity is ingested without a status.
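The soft-delete semantics described above can be sketched with a toy in-memory model. This is only an illustration, not the code in this PR: `MetadataStore`, `AspectRow`, the `datasetKey` aspect name and the `hard_delete` parameter are hypothetical names, and the model simply drops rolled-back aspects instead of handling aspect versions.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

KEY_ASPECT = "datasetKey"  # hypothetical key aspect name, for illustration only
STATUS_ASPECT = "status"


@dataclass
class AspectRow:
    value: dict
    run_id: str


@dataclass
class MetadataStore:
    # (urn, aspect name) -> row; a toy stand-in for the real aspect storage
    rows: Dict[Tuple[str, str], AspectRow] = field(default_factory=dict)

    def ingest(self, urn: str, aspects: Dict[str, dict], run_id: str) -> None:
        for name, value in aspects.items():
            self.rows[(urn, name)] = AspectRow(value, run_id)
        # Symmetry with rollback: add status removed=False if none is present.
        if (urn, STATUS_ASPECT) not in self.rows:
            self.rows[(urn, STATUS_ASPECT)] = AspectRow({"removed": False}, run_id)

    def rollback(self, run_id: str, hard_delete: bool = False) -> None:
        touched = [key for key, row in self.rows.items() if row.run_id == run_id]
        key_rolled_back = {urn for urn, name in touched if name == KEY_ASPECT}
        for urn, name in touched:
            if urn not in key_rolled_back:
                # Entity existed before this run: only drop what the run touched.
                self.rows.pop((urn, name), None)
        for urn in key_rolled_back:
            if hard_delete:
                # Old behaviour: remove the entity entirely.
                for key in [k for k in self.rows if k[0] == urn]:
                    del self.rows[key]
            else:
                # Soft delete: keep the key aspect, mark status removed=True.
                for _, name in [k for k in touched if k[0] == urn]:
                    if name not in (KEY_ASPECT, STATUS_ASPECT):
                        del self.rows[(urn, name)]
                self.rows[(urn, STATUS_ASPECT)] = AspectRow({"removed": True}, run_id)
```

In this sketch, a default rollback leaves the key aspect in place so that a later garbage-collection pass can still find the entity, while `hard_delete=True` removes every row for the entity.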

Another PR will follow to perform garbage collection on these soft deleted entities.

To keep the old behaviour, a new parameter was added to the CLI ingest rollback command: --hard-delete.
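As a rough illustration of how the two modes might be invoked, the helper below builds the command line for each. Only `--hard-delete` is named in this PR; the `--run-id` option and the `rollback_command` helper are assumptions made for the example.

```python
import shlex
from typing import List


def rollback_command(run_id: str, hard_delete: bool = False) -> List[str]:
    """Build argv for a rollback; `--run-id` is assumed, `--hard-delete` is from this PR."""
    argv = ["datahub", "ingest", "rollback", "--run-id", run_id]
    if hard_delete:
        # Opt back in to the previous behaviour (entities are fully deleted).
        argv.append("--hard-delete")
    return argv


# Default: soft delete.
print(shlex.join(rollback_command("my-run-id")))
# Previous behaviour: hard delete.
print(shlex.join(rollback_command("my-run-id", hard_delete=True)))
```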

To keep CLI commands like list-runs consistent, this PR also modifies the MCL processor's UpdateIndices hook to upsert Elasticsearch documents with a new property, removed.
This property indicates whether a given Elasticsearch document has been soft deleted, so that soft deleted aspects can be excluded from the ingestion run history.
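A minimal sketch of how a run-history lookup could exclude soft deleted documents using the removed property. The `runId` field name and both helper functions are assumptions for illustration; the actual index mapping and query used by DataHub may differ.

```python
from typing import Any, Dict, List


def run_history_query(run_id: str, include_soft_deleted: bool = False) -> Dict[str, Any]:
    """Build an Elasticsearch bool query body for the documents of one ingestion run."""
    bool_query: Dict[str, Any] = {"filter": [{"term": {"runId": run_id}}]}
    if not include_soft_deleted:
        # Exclude documents explicitly marked as soft deleted.
        bool_query["must_not"] = [{"term": {"removed": True}}]
    return {"query": {"bool": bool_query}}


def visible_documents(
    docs: List[Dict[str, Any]], run_id: str, include_soft_deleted: bool = False
) -> List[Dict[str, Any]]:
    """Apply the same filter to in-memory documents, to show the intended effect."""
    return [
        doc
        for doc in docs
        if doc.get("runId") == run_id
        and (include_soft_deleted or not doc.get("removed", False))
    ]
```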

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable)


github-actions bot commented Mar 9, 2022

Unit Test Results (build & test)

 92 files ±0   92 suites ±0   13m 30s ⏱️ - 4m 34s
673 tests +1   614 ✔️ +3   59 💤 ±0   0 ❌ - 2

Results for commit c1ffae3. ± Comparison against base commit ecd263b.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 9, 2022

Unit Test Results (metadata ingestion)

    5 files ±0       5 suites ±0   43m 19s ⏱️ - 1m 57s
  362 tests ±0     362 ✔️ ±0    0 💤 ±0   0 ❌ ±0
1 657 runs  ±0   1 619 ✔️ ±0   38 💤 ±0   0 ❌ ±0

Results for commit c1ffae3. ± Comparison against base commit ecd263b.

♻️ This comment has been updated with latest results.

Contributor

@dexter-mh-lee dexter-mh-lee left a comment

LGTM!

pedro93 added 14 commits March 14, 2022 18:52
Summary:
Addresses the feature request "Flag in delete command to only delete aspects touched by an ingestion run; add flag to nuke everything" by modifying the default behaviour of a rollback operation, which by default no longer deletes an entity if a keyAspect is being rolled back.

Instead, the key aspect is kept and a StatusAspect is upserted with removed=true, effectively making a soft delete.
Another PR will follow to perform garbage collection on these soft deleted entities.

To keep the old behaviour, a new parameter was added to the CLI ingest rollback command: --hard-delete.
@pedro93 pedro93 mentioned this pull request Mar 14, 2022
4 tasks
Contributor

@shirshanka shirshanka left a comment

LGTM

@shirshanka shirshanka merged commit e8f6c4c into datahub-project:master Mar 15, 2022
maggiehays pushed a commit to maggiehays/datahub that referenced this pull request Aug 1, 2022
…datahub-project#4358)

* Changes rollback behaviour to apply soft deletes by default

* Adds restli specs

* Fixes deleteAspect endpoint & adds support for nested transactions

* Enable regression test & fix docker-compose for local development

* Add generated quickstart

* Fix quickstart generation script

* Adds missing var env to docker-compose-without-neo4j

* Sets status removed=true when ingesting resources

* Adds soft deletes for ElasticSearch + soft delete flags across ingestion sub-commands

* Makes elastic search consistent

* Update tests with new behaviour

* apply review comments

* apply review comment

* Forces Elastic search to add documents with status removed false when ingesting

* Reset gradle properties to default

* Fix tests