-
Notifications
You must be signed in to change notification settings - Fork 25.8k
Fix some bugs in reshard bulk request splitting #139394
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing) |
| } | ||
| }); | ||
| // Item responses should match the order of the original item requests | ||
| BulkItemResponse[] bulkItemResponses = Arrays.stream(originalRequest.items()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Streams allocate and given that we don't win much in readability here (IMO of course) i would go a for loop.
| * @param <E> the type of anticipated exception | ||
| */ | ||
| @FunctionalInterface | ||
| public interface CheckedTriConsumer<S, T, U, E extends Exception> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if i like this. IMO when we are at this number of parameters we could benefit from names and introduce a specific interface where it is needed (so in this case SplitRequestExecutor or something). It can still be a @FunctionalInterface.
This PR addresses two issues:
When combining bulk shard responses, item ids were used the index to an array. This only works when there's one shard, as a
BulkOperationstarts with sequential item ids [0, 1, 2...] but gets split into multipleBulkShardRequests.For instance, in a scenario with 2 shards, a
BulkOperationwith 10 item requests could get split into twoBulkShardRequests with item ids [0, 1, 3, 7, 9] and [2, 4, 5, 6, 8] respectively. After splitting the firstBulkShardRequestfurther for resharding and receiving responses, the current implementation to combine the responses would create an array of length 5 and run into an index out of bounds exception when indexing using item id 7.The proposed fix is to combine the bulk responses into a map of item id -> item response. Then we can use the original request, mapping each item request to its corresponding item response to create an item response list.
The second issue is that the source shard executes the original bulk shard request instead of the split one. I added a parameter to
doPrimaryRequestfor the request to execute.