Add collector to target latency metrics #50

Open: wants to merge 3 commits into base: develop


@oguzhanunlu oguzhanunlu commented Oct 17, 2024

Jira ref: PDP-1495

This PR adds two metrics:

  • latency_collector_to_target_millis: the minimum latency across all batches in the current metric period, where a batch's latency is computed as {load time - max collector_tstamp of the batch}
  • latency_collector_to_target_pessimistic_millis: the maximum latency across all batches in the current metric period, where a batch's latency is computed as {report time - min collector_tstamp of the batch}
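
For reviewers, the two definitions above can be sketched roughly as follows. This is an illustrative Python sketch, not the loader's actual Scala code; the `Batch` class, its field names, and the function signatures are hypothetical, and all timestamps are assumed to be epoch milliseconds.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Batch:
    # Hypothetical shape of a batch; field names are assumptions for illustration.
    min_collector_tstamp_ms: int  # earliest collector_tstamp in the batch
    max_collector_tstamp_ms: int  # latest collector_tstamp in the batch
    load_time_ms: int             # when the batch was loaded into the target


def latency_collector_to_target_millis(batches: Iterable[Batch]) -> Optional[int]:
    """Minimum over batches of (load time - max collector_tstamp)."""
    latencies = [b.load_time_ms - b.max_collector_tstamp_ms for b in batches]
    return min(latencies) if latencies else None


def latency_collector_to_target_pessimistic_millis(
    batches: Iterable[Batch], report_time_ms: int
) -> Optional[int]:
    """Maximum over batches of (report time - min collector_tstamp)."""
    latencies = [report_time_ms - b.min_collector_tstamp_ms for b in batches]
    return max(latencies) if latencies else None


batches = [Batch(1000, 2000, 2500), Batch(1500, 3000, 3200)]
optimistic = latency_collector_to_target_millis(batches)
pessimistic = latency_collector_to_target_pessimistic_millis(batches, report_time_ms=4000)
```

The first metric takes the best case (the freshest event of the fastest batch), while the pessimistic one takes the worst case (the oldest event still accounted for at report time). Both return `None` in this sketch when the period has no batches; how the real loader handles an empty period is not stated here.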

In common-streams 0.8.x we shifted alerting / retrying / webhook out of
the applications and into the common library. The library also adds new
features, like heartbeat webhooks that start when the loader first becomes healthy.

This commit also makes the webhook alert messages more human-friendly.
@oguzhanunlu oguzhanunlu self-assigned this Oct 17, 2024
@oguzhanunlu oguzhanunlu force-pushed the latency-sla branch 4 times, most recently from 8aa692a to 68bcdc0 Compare October 18, 2024 13:23
@oguzhanunlu oguzhanunlu marked this pull request as ready for review October 18, 2024 13:28

@colmsnowplow colmsnowplow left a comment


After the clarifying questions are answered, to the best of my judgment this implementation does indeed meet the specification. Thanks Oguzhan!

istreeter and others added 2 commits October 22, 2024 13:55
Compared to common-streams 0.8.0-M2, this version adds:

- Re-implemented Kinesis source without fs2-kinesis
- Pubsub source opens more transport channels when necessary
- Changes default webhook heartbeat period to 5 minutes
- Http4s Client with configuration appropriate for common-streams apps

Other changes to common-streams are not relevant to the Snowflake loader,
so they are not mentioned here.

This version holds onto a map of batches that are in memory but not yet
loaded. This means we can report latency of in-memory batches under all
of the following circumstances:

- Loader is doing long backoff and retry due to a Snowflake setup issue
  (e.g. expired password)
- Loader is stuck trying to open a connection to Snowflake due to a
  transient network issue
- Loader is trying to shut down due to any other runtime exception
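
The idea of holding onto a map of in-memory batches can be sketched like this. It is a minimal Python illustration under stated assumptions, not the actual implementation: the `InMemoryBatches` class, the integer tokens, and the method names are all hypothetical, and timestamps are assumed to be epoch milliseconds.

```python
from typing import Dict, Optional


class InMemoryBatches:
    """Tracks batches that have been read but not yet loaded (hypothetical sketch)."""

    def __init__(self) -> None:
        # token -> min collector_tstamp (epoch millis) of the pending batch
        self._pending: Dict[int, int] = {}
        self._next_token = 0

    def register(self, min_collector_tstamp_ms: int) -> int:
        """Record a batch as soon as it is in memory; returns a token for later removal."""
        token = self._next_token
        self._next_token += 1
        self._pending[token] = min_collector_tstamp_ms
        return token

    def mark_loaded(self, token: int) -> None:
        """Forget a batch once it has been loaded into the target."""
        self._pending.pop(token, None)

    def pessimistic_latency_ms(self, report_time_ms: int) -> Optional[int]:
        """Report time minus the oldest collector_tstamp still waiting in memory."""
        if not self._pending:
            return None
        return report_time_ms - min(self._pending.values())


tracker = InMemoryBatches()
first = tracker.register(1000)
second = tracker.register(1500)
while_stuck = tracker.pessimistic_latency_ms(4000)
tracker.mark_loaded(first)
after_first_load = tracker.pessimistic_latency_ms(4000)
```

Because the map is consulted at report time rather than at load time, the latency keeps growing while the loader is retrying, blocked on a connection, or shutting down, which matches the circumstances listed above.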

3 participants