Contributor

@kkrik-es kkrik-es commented Dec 11, 2025

The current rate logic extrapolates within each tbucket and ignores the counter values before and after it. As a result, it can miss counter resets that happen across tbucket boundaries, it produces more null buckets because it requires at least two data points in each bucket, and it yields inaccurate rates because the extrapolation is somewhat arbitrary.
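
To make the limitation concrete, here is a minimal sketch of a rate computed from the samples of a single bucket only. It is not the code in this PR: the class and method names are hypothetical, extrapolation to the bucket edges is left out, and timestamps are assumed to be strictly increasing.

```java
// Hypothetical illustration, not the Elasticsearch implementation.
final class PerBucketRate {
    /**
     * Per-second rate from the samples of one bucket only.
     * Returns NaN (a "null bucket") when fewer than two samples are available.
     * Assumes timestamps are strictly increasing.
     */
    static double rate(long[] timestampsMillis, double[] values) {
        int n = timestampsMillis.length;
        if (n < 2) {
            return Double.NaN;
        }
        double increase = 0;
        for (int i = 1; i < n; i++) {
            double delta = values[i] - values[i - 1];
            // A drop is treated as a reset: assume the counter restarted from zero.
            increase += delta >= 0 ? delta : values[i];
        }
        double seconds = (timestampsMillis[n - 1] - timestampsMillis[0]) / 1000.0;
        return increase / seconds;
    }
}
```

Because each bucket only sees its own samples, a reset between the last sample of one bucket and the first sample of the next never shows up as a drop, and a bucket holding a single sample yields no rate at all.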

This can be improved by using the last data point of a time bucket and the first data point of the next one to interpolate the counter value at the boundary. This produces more accurate results, catches resets across tbuckets, and reduces null buckets, since only one data point per tbucket is needed to produce a rate.
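
As a rough sketch of the idea (not the code in this PR), the boundary value can be obtained by linear interpolation between the two samples that straddle the boundary. The class name, method names, and the reset handling (treating a drop as a restart from zero) are assumptions made for illustration.

```java
// Hypothetical illustration, not the Elasticsearch implementation.
final class BoundaryInterpolation {
    /**
     * Linearly interpolates the counter value at {@code boundary}, given the last
     * sample (t0, v0) before it and the first sample (t1, v1) after it.
     */
    static double valueAtBoundary(long t0, double v0, long t1, double v1, long boundary) {
        if (t1 == t0) {
            return v1;
        }
        // A drop signals a reset across the boundary; assume the counter restarted
        // from zero, so the increase between the two samples is v1 itself.
        double increase = v1 >= v0 ? v1 - v0 : v1;
        double fraction = (double) (boundary - t0) / (t1 - t0);
        return v0 + increase * fraction;
    }

    /** Per-second rate over a bucket, from the interpolated values at its two boundaries. */
    static double ratePerSecond(double valueAtStart, double valueAtEnd, long bucketMillis) {
        return (valueAtEnd - valueAtStart) / (bucketMillis / 1000.0);
    }
}
```

For example, with samples (50s, 100) and (70s, 120) around the 60s boundary, the interpolated value is 110; with samples (110s, 160) and (130s, 200) around the 120s boundary, it is 180. The bucket [60s, 120s) then gets a rate of 70 / 60 per second, even though it may contain only one sample of its own.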

@kkrik-es kkrik-es self-assigned this Dec 11, 2025
@kkrik-es kkrik-es added >non-issue Team:StorageEngine :StorageEngine/ES|QL Timeseries / metrics / logsdb capabilities in ES|QL labels Dec 11, 2025
@github-actions
Contributor

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

BlockFactory blockFactory = driverContext.blockFactory();
int positionCount = selected.getPositionCount();
try (var rates = blockFactory.newDoubleBlockBuilder(positionCount)) {
Map<Integer, ReducedState> flushedStates = new HashMap<>(positionCount);
Member

  • What do we get in the selected int vector? Does it include all the ts buckets in this shard for a query?
  • If so, couldn't this Map grow significantly? Should we use something else, such as LongObjectPagedHashMap?

Contributor Author

I think this is the map for the final results, across all shards. I was also worried about this; LongObjectPagedHashMap looks like a good option. I'll give it a try.

} else if (evalContext instanceof TimeSeriesGroupingAggregatorEvaluationContext tsContext) {
rate = computeRate(flushedStates, group, tsContext, isRateOverTime, dateFactor);
} else {
rate = computeRateWithoutExtrapolate(state, isRateOverTime);
Member

In what cases would computeRateWithoutExtrapolate(...) execute?

Contributor Author

@kkrik-es kkrik-es Dec 15, 2025

I see some tests where the context is passed as a simple grouping context (e.g. RateTests); I'm not sure why. I want to check with Nhat on this. Either way, this is existing logic.

Contributor

I believe this is for aggregations that are not time-series aggregations, e.g. rate over logs (?), but we can also check with Nhat.

* Computes the rate for a given group by interpolating boundary values with adjacent groups,
* or extrapolating values at the time bucket boundaries.
*/
private static double computeRate(
Member

+1 I think this rate is much more useful than the current rate.

Labels

>non-issue :StorageEngine/ES|QL Timeseries / metrics / logsdb capabilities in ES|QL Team:StorageEngine v9.3.0

4 participants