latency_distribution work in practice? #833

Hi, I have a simple yaml config as follows:

    probe:
      - name: "google"
        type: DNS
        targets:
          host_names: "8.8.8.8"
        dns_probe:
          queryType: A
          resolved_domain: "google.com"
        latency_distribution:
          explicit_buckets: "0.001,0.005,0.01,0.02,0.03,0.05,0.075,0.1,0.2,0.5,0.75,1,1.5,2,2.5,3,4,5"
        interval: 1s
        timeout: 1s
        latency_unit: s

The issue is that only the last probe is recorded for the histogram metric. The /metrics endpoint always returns a response like this (count doesn't exceed 1):

    latency_count{probe="ext-google",dst="8.8.8.8"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="0.001"} 0 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="0.005"} 0 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="0.01"} 0 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="0.02"} 0 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="0.03"} 0 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="0.05"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="0.075"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="0.1"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="0.2"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="0.5"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="0.75"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="1"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="1.5"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="2"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="2.5"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="3"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="4"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="5"} 1 1724571103875
    latency_bucket{probe="ext-google",dst="8.8.8.8",le="+Inf"} 1 1724571103875

My question: Is this behavior intentional or can it be considered a bug?

This line is likely causing the behavior. It should be put in the Start method or should be changed like this:

    if result.latency == nil {
        result.latency = p.opts.LatencyDist.CloneDist()
    }
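To make the suspected cause concrete, here is a minimal Go sketch of the difference between re-cloning the distribution on every probe run and cloning it only once behind a nil check. It is not cloudprober's actual code: `dist`, `result`, `runOnce`, and `simulate` are hypothetical stand-ins, and the bucket list is shortened for readability.

```go
package main

import "fmt"

// dist is a hypothetical stand-in for a latency distribution with explicit
// bucket upper bounds. It is NOT cloudprober's real type.
type dist struct {
	bounds []float64 // bucket upper bounds, ascending
	counts []int64   // per-bucket counts; the last slot is the +Inf bucket
	count  int64     // total number of samples recorded
}

func newDist(bounds []float64) *dist {
	return &dist{bounds: bounds, counts: make([]int64, len(bounds)+1)}
}

// clone returns an empty distribution with the same bucket layout.
func (d *dist) clone() *dist { return newDist(d.bounds) }

// addSample increments the first bucket whose upper bound is >= v.
func (d *dist) addSample(v float64) {
	i := 0
	for i < len(d.bounds) && v > d.bounds[i] {
		i++
	}
	d.counts[i]++
	d.count++
}

// result mimics a per-target result that lives across probe runs.
type result struct{ latency *dist }

// runOnce records one latency sample. With resetEveryRun set, the
// distribution is re-cloned on each run, which discards earlier samples.
// Otherwise it is cloned only when still nil, so samples accumulate.
func runOnce(r *result, tmpl *dist, sample float64, resetEveryRun bool) {
	if resetEveryRun || r.latency == nil {
		r.latency = tmpl.clone()
	}
	r.latency.addSample(sample)
}

// simulate runs one probe per sample and returns the final sample count.
func simulate(samples []float64, resetEveryRun bool) int64 {
	tmpl := newDist([]float64{0.001, 0.005, 0.01, 0.05, 0.1})
	r := &result{}
	for _, s := range samples {
		runOnce(r, tmpl, s, resetEveryRun)
	}
	if r.latency == nil {
		return 0
	}
	return r.latency.count
}

func main() {
	samples := []float64{0.004, 0.04, 0.2}
	fmt.Println("reset every run:", simulate(samples, true))
	fmt.Println("nil-guarded:", simulate(samples, false))
}
```

With three samples, the reset-every-run variant ends with a count of 1, matching the /metrics output above, while the nil-guarded variant ends with a count of 3.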
@therealak12, which version are you running?

I think you've found a bug in the unreleased version (which is great), but the latest release should be okay.

Context: I recently moved the DNS probe to the common scheduler, but that expects metrics to be accumulated. Earlier, the DNS probe was using statskeeper (now retired), which took care of accumulating the metrics. So if you're using a released version, it should be working OK.
@manugarg, I'm running the main branch by