Implement API usage metrics for gce storage #40338
Jenkins GCI GKE smoke e2e failed for commit 6a88f38. cc @gnufied
Jenkins GKE smoke e2e failed for commit 6a88f38. cc @gnufied
@@ -2930,3 +2981,8 @@ func lastComponent(s string) string {
	}
	return s
}

// Gets the time since the specified start in microseconds.
func SinceInMicroseconds(start time.Time) float64 {
why public?
microsecondsSince
@@ -2493,7 +2533,9 @@ func (gce *GCECloud) doDeleteDisk(diskToDelete string) error {
		return err
	}

	deleteOp, err := gce.service.Disks.Delete(gce.projectID, disk.Zone, disk.Name).Do()
	dc := contextWithNamespace(diskToDelete, "gce_disk_delete")
	deleteOp, err := gce.service.Disks.
IMO ugly indentation
@@ -113,6 +116,12 @@ type Config struct {

type DiskType string

// ApiWithNamespace stores api and namespace in context
type ApiWithNamespace struct {
why public?
apiNamespace, ok := t.(ApiWithNamespace)

if ok {
	apiResponseReceived := func(resp *http.Response) {
if !ok {
return nil
}
apiNamespace, ok := t.(ApiWithNamespace)

if ok {
	apiResponseReceived := func(resp *http.Response) {
return func(....)
if ok {
	apiResponseReceived := func(resp *http.Response) {
		timeTaken := SinceInMicroseconds(requestTime)
		gceMetricMap[apiNamespace.apiCall].
are we sure that gceMetricMap[apiNamespace.apiCall] is never nil?
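One way to address the nil concern is a comma-ok lookup before observing. The sketch below uses a stand-in recorder type so that it runs with the standard library only; `latencyRecorder`, `metricMap`, and `observeLatency` are hypothetical names, not the PR's code.

```go
package main

import "fmt"

// latencyRecorder is a stand-in for *prometheus.HistogramVec, used here only
// so the sketch has no external dependencies.
type latencyRecorder struct{ samples []float64 }

func (r *latencyRecorder) observe(v float64) { r.samples = append(r.samples, v) }

// metricMap plays the role of gceMetricMap in the diff.
var metricMap = map[string]*latencyRecorder{
	"gce_disk_delete": {},
}

// observeLatency records v for apiCall and reports whether a recorder was
// registered. An unknown or nil entry is skipped instead of dereferenced.
func observeLatency(apiCall string, v float64) bool {
	rec, ok := metricMap[apiCall]
	if !ok || rec == nil {
		return false
	}
	rec.observe(v)
	return true
}

func main() {
	fmt.Println(observeLatency("gce_disk_delete", 0.25))  // registered call
	fmt.Println(observeLatency("gce_unknown_call", 0.25)) // unregistered call, skipped
}
```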
@@ -30,6 +30,8 @@ import (

	"gopkg.in/gcfg.v1"

	"golang.org/x/net/context"
no need for empty line above? And why not merge the non-kube blocks?
Thank you for reviewing this. I pushed this as a WIP and used it as a starting point for the CloudProvider metrics proposal. I will clean it up and add nil checks and similar safeguards.
I will ping you again when it is ready for review.
Ok.
6a88f38 to 2ec3618
2ec3618 to 0e5e8a1
cc @bowei
0e5e8a1 to 7e6da70
7e6da70 to 76392ee
/approve
@@ -0,0 +1,76 @@
/*
Copyright 2014 The Kubernetes Authors.
2017
fixed
var gceMetricMap = map[string]*prometheus.HistogramVec{
	"gce_instance_list": prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name: "gce_instance_list",
Note the naming conventions for histograms: https://prometheus.io/docs/practices/histograms/
gce_instance_list_duration_seconds would be applicable here. Note that metrics are also meant to be in the base unit, generally following the instrumentation guide.
The same applies to all other metrics introduced here.
Do you mean these metrics should only be collected in seconds, not in ms as done here?
I couldn't find anything about that in the linked documentation.
Sorry, pointed at the wrong resources there, this is what I meant: https://prometheus.io/docs/practices/naming/#metric-names
			Help:    "Latency of instance listing calls in ms",
			Buckets: prometheus.ExponentialBuckets(100, 2, 10),
		},
		[]string{"kube_namespace"},
According to the instrumentation guide, this should be just namespace.
fixed.
76392ee to 59c0773
59c0773 to 04c7ed9
@brancz fixed all review comments. ptal
This PR implements tracking of GCE API usage via Prometheus metrics.
04c7ed9 to c4aaf47
lgtm 👍
}

apiResponseReceived := func(resp *http.Response) {
	timeTaken := time.Since(requestTime).Seconds()
Are seconds too coarse a unit for latency?
According to https://prometheus.io/docs/practices/naming/#metric-names, metrics should use base units. I guess we should be okay, since we are reporting seconds as floats anyway.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: brancz, gnufied, mikedanese
Automatic merge from submit-queue
What this PR does / why we need it:
This PR implements support for emitting metrics from GCE about storage operations.
Which issue this PR fixes
Fixes kubernetes/enhancements#182
Release note: