Skip to content

Conversation

@jmacryl
Copy link
Collaborator

@jmacryl jmacryl commented May 9, 2025

closes https://linear.app/acryl-data/issue/PFP-1277/use-best-practices-when-reindexing-to-speed-up-upgrades

Main changes are:

  • set number_of_replicas to 0 while reindexing, we increase when done
  • skip 0 doc indices
  • use tweaked index.translog.flush_threshold_size value when possible
  • don't fail on deletion/alias op, we retry several times instead. Lots of failures happen due to snapshots being taken, no reason to fail for this
  • use slices, with optimas value
  • increase batch size to 5000
  • also removed rogue dependency on new OS client: 'org.opensearch.client:opensearch-java:2.6.0' we are not still using it

Note:

  • the fact we set replicas at the end, means that for a few minutes the last index will have no replicas until they are done replicatin (previous indices would be done by this point). I don't think this is an issue at all, but we could wait till all are replicated if we wanted

@github-actions github-actions bot added the product PR or Issue related to the DataHub UI/UX label May 9, 2025
@jmacryl jmacryl marked this pull request as draft May 9, 2025 19:26
@alwaysmeticulous
Copy link

alwaysmeticulous bot commented May 9, 2025

🤖 Meticulous failed to evaluate the 126 user flows to check for differences - if this is the first attempt, the job will be auto-retried in the next few minutes.

If the job is still failing after a few minutes, consider pushing up a new commit to re-trigger Meticulous or retrying the workflow.

Last updated for commit cec8c54. This comment will update as new commits are pushed.

@codecov
Copy link

codecov bot commented May 9, 2025

Codecov Report

Attention: Patch coverage is 78.91037% with 120 lines in your changes missing coverage. Please review.

❌ Unsupported file format

Upload processing failed due to unsupported file format. Please review the parser error message:

Error parsing JUnit XML in /home/runner/work/datahub/datahub/metadata-io/build/test-results/test/TEST-com.linkedin.metadata.graph.search.elasticsearch.SearchGraphServiceElasticSearchTest.xml at 117:1058

Caused by:
    RuntimeError: Error converting computed name to ValidatedString
    
    Caused by:
        string is too long

For more help, visit our troubleshooting guide.

Files with missing lines Patch % Lines
...rch/elasticsearch/indexbuilder/ESIndexBuilder.java 60.63% 70 Missing and 4 partials ⚠️
...arch/elasticsearch/indexbuilder/ReindexConfig.java 35.00% 19 Missing and 7 partials ⚠️
.../elasticsearch/indexbuilder/OpenSearchJvmInfo.java 94.05% 1 Missing and 10 partials ⚠️
...upgrade/system/elasticsearch/ReindexDebugArgs.java 0.00% 4 Missing ⚠️
.../java/com/linkedin/datahub/upgrade/UpgradeCli.java 0.00% 2 Missing ⚠️
...hub/upgrade/system/elasticsearch/ReindexDebug.java 88.23% 2 Missing ⚠️
.../com/linkedin/metadata/search/utils/SizeUtils.java 98.66% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Collaborator

@RyanHolstien RyanHolstien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this is still in progress. Also PR name should follow commit guidelines

jmacryl added 2 commits May 19, 2025 15:07
# Conflicts:
#	metadata-io/src/test/java/com/linkedin/metadata/search/indexbuilder/IndexBuilderTestBase.java
@datahub-cyborg datahub-cyborg bot added needs-review Label for PRs that need review from a maintainer. and removed pending-submitter-response Issue/request has been reviewed but requires a response from the submitter labels Jun 2, 2025
@datahub-cyborg datahub-cyborg bot added pending-submitter-response Issue/request has been reviewed but requires a response from the submitter and removed needs-review Label for PRs that need review from a maintainer. labels Jun 2, 2025
@datahub-cyborg datahub-cyborg bot added needs-review Label for PRs that need review from a maintainer. and removed pending-submitter-response Issue/request has been reviewed but requires a response from the submitter labels Jun 2, 2025
@jmacryl
Copy link
Collaborator Author

jmacryl commented Jun 2, 2025

both OpenSearchJvmInfo and SizeUtils are claude generated, and have a bunch of tests along with them, do we really want to change them(asking cause will need to adjust tests), or just leave them for now.

There are other conversations open but I replied to most I think, waiting to see if more changes are needed.

@jmacryl jmacryl force-pushed the PFP-1277/indexing-perf branch from cbe7705 to c7bafae Compare June 3, 2025 10:16
@datahub-cyborg datahub-cyborg bot added pending-submitter-merge and removed needs-review Label for PRs that need review from a maintainer. labels Jun 4, 2025
@jmacryl jmacryl merged commit 235a773 into master Jun 4, 2025
40 checks passed
@jmacryl jmacryl deleted the PFP-1277/indexing-perf branch June 4, 2025 17:56
@jmacryl
Copy link
Collaborator Author

jmacryl commented Jun 9, 2025

We have tested this on dh-usw2-saas-01-staging-11610a6850-dev03 vs current version:

  • old version ES upgrade: 63m
  • this version: 32m

so exactly double the speed. This instance is not big, largest index is just 4M, vs >1B in big indices in our largest installations, so we might get larger speedups in those.

kartikey-visa pushed a commit to kartikey-visa/datahub that referenced this pull request Jul 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pending-submitter-merge product PR or Issue related to the DataHub UI/UX

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants