Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test(metadata-io): Improve speed of ElasticSearch tests #3160

Merged

Conversation

EnricoMi
Copy link
Contributor

@EnricoMi EnricoMi commented Aug 25, 2021

This improves syncAfterWrite for ElasticSearch GraphService tests, which reduces test time to the possible minimum (0,08s per test, not waiting 5s per test any more). This fixes #3124.

The logic of syncAfterWrite is used by all ElasticSearch tests in metadata-io now, speeding up those tests as well.

@EnricoMi EnricoMi force-pushed the branch-fix-es-graph-test-sync branch from 38a5691 to be330fb Compare August 25, 2021 13:16
@EnricoMi EnricoMi force-pushed the branch-fix-es-graph-test-sync branch from be330fb to 3541a43 Compare August 25, 2021 13:36
@EnricoMi EnricoMi changed the title fix: Improve syncAfterWrite for ElasticSearch GraphService tests test(metadata-io): Improve speed of ElasticSearch tests Aug 25, 2021
@EnricoMi EnricoMi force-pushed the branch-fix-es-graph-test-sync branch 2 times, most recently from 7d6a9b6 to 4a88273 Compare August 25, 2021 22:10
@gabe-lyons
Copy link
Contributor

Hey @EnricoMi - I'm really eager to get this one in! Which test are failing with this improvement? Anything I can do to help?

@EnricoMi
Copy link
Contributor Author

EnricoMi commented Sep 2, 2021

@gabe-lyons the first parametrized run of ElasticSearchGraphServiceTest.testAddEdge fails on GitHub but not for me locally. Looks like the first run has timing issues and subsequent runs are fine. I will add some waiting to @BeforeTest to mitigate this.

And test ElasticSearchServiceTest.testElasticSearchService still requires an extra sleep for 1s: https://github.com/linkedin/datahub/pull/3160/files#diff-92a451a02ae56ca5e1b057e46524b2dd863c4e7a79feea05971de0fb7275aa41R104-R105

@EnricoMi EnricoMi force-pushed the branch-fix-es-graph-test-sync branch from 7dd9813 to d8bc9cb Compare September 2, 2021 15:35
@EnricoMi EnricoMi force-pushed the branch-fix-es-graph-test-sync branch 7 times, most recently from dd6094a to 625bf60 Compare September 4, 2021 11:58
@EnricoMi
Copy link
Contributor Author

EnricoMi commented Sep 6, 2021

@gabe-lyons looks like there is still a race condition depending on the machine the tests run on. On my local machine I never see tests failing. On GitHub, tests occasionally fail (added data not visible, removed data still there).

Maybe the syncAfterWrite needs to do an extra trick:

  • write a specific piece of information to ElasticSearch, like a sync flag
  • busy wait for it to appear on read (sleep 10 or 100 ms inside the loop)
  • remove the flag again
  • busy wait for it to disappear

Given the flag has been written after changes before syncAfterWrite has been called, we should be safe to assume those changes has also made it to the read side.

@gabe-lyons
Copy link
Contributor

@EnricoMi that approach makes sense to me. Hopefully with that we would still be able to capture most of the performance optimizations you've gained here!

@EnricoMi EnricoMi force-pushed the branch-fix-es-graph-test-sync branch from 625bf60 to 86d9c61 Compare September 8, 2021 18:04
@EnricoMi EnricoMi force-pushed the branch-fix-es-graph-test-sync branch from 86d9c61 to 7fbf8e0 Compare September 8, 2021 20:10
@gabe-lyons
Copy link
Contributor

gabe-lyons commented Sep 9, 2021

Hey @EnricoMi - this looks almost ready to merge! Can you rebase on master first? There was a small change to ES tests since this PR was opened:

https://github.com/linkedin/datahub/pull/3208/files

@EnricoMi
Copy link
Contributor Author

EnricoMi commented Sep 9, 2021

Copy link
Contributor

@shirshanka shirshanka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@shirshanka shirshanka merged commit 66c862f into datahub-project:master Sep 13, 2021
@EnricoMi EnricoMi deleted the branch-fix-es-graph-test-sync branch September 18, 2021 12:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve ElasticSearchGraphServiceTest syncAfterWrite
3 participants