Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement API usage metrics for gce storage #40338

Merged

Conversation

gnufied
Copy link
Member

@gnufied gnufied commented Jan 24, 2017

What this PR does / why we need it:

This PR implements support for emitting metrics from GCE about storage operations.

Which issue this PR fixes

Fixes kubernetes/enhancements#182

Release note:

Add support for emitting metrics from GCE cloudprovider about storage operations.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 24, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@k8s-ci-robot
Copy link
Contributor

Jenkins GCI GKE smoke e2e failed for commit 6a88f38. Full PR test history.

cc @gnufied, your PR dashboard

The magic incantation to run this job again is @k8s-bot gci gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-label-needed labels Jan 24, 2017
@k8s-ci-robot
Copy link
Contributor

Jenkins GKE smoke e2e failed for commit 6a88f38. Full PR test history.

cc @gnufied, your PR dashboard

The magic incantation to run this job again is @k8s-bot cvm gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@@ -2930,3 +2981,8 @@ func lastComponent(s string) string {
}
return s
}

// Gets the time since the specified start in microseconds.
func SinceInMicroseconds(start time.Time) float64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why public?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

microsecondsSince

@@ -2493,7 +2533,9 @@ func (gce *GCECloud) doDeleteDisk(diskToDelete string) error {
return err
}

deleteOp, err := gce.service.Disks.Delete(gce.projectID, disk.Zone, disk.Name).Do()
dc := contextWithNamespace(diskToDelete, "gce_disk_delete")
deleteOp, err := gce.service.Disks.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO ugly indention

@@ -113,6 +116,12 @@ type Config struct {

type DiskType string

// ApiWithNamespace stores api and namespace in context
type ApiWithNamespace struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why public?

apiNamespace, ok := t.(ApiWithNamespace)

if ok {
apiResponseReceived := func(resp *http.Response) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if !ok {
  return nil
}

apiNamespace, ok := t.(ApiWithNamespace)

if ok {
apiResponseReceived := func(resp *http.Response) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return func(....)

if ok {
apiResponseReceived := func(resp *http.Response) {
timeTaken := SinceInMicroseconds(requestTime)
gceMetricMap[apiNamespace.apiCall].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we sure that gceMetricMap[apiNamespace.apiCall] is never nil?

@@ -30,6 +30,8 @@ import (

"gopkg.in/gcfg.v1"

"golang.org/x/net/context"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for empty line above? And why not merge the non-kube blocks?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for reviewing this. I pushed this as a WIP and used it as a starting point for CloudProvider metrics proposal. I will clean it up and add more checks for nil and stuff.

I will give you another shout when it is ready for review.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

@k8s-github-robot k8s-github-robot assigned mikedanese and unassigned sttts Jan 30, 2017
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 17, 2017
@gnufied gnufied changed the title [WIP] Implement API usage metrics for gce storage Implement API usage metrics for gce storage Mar 20, 2017
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 20, 2017
@gnufied
Copy link
Member Author

gnufied commented Mar 20, 2017

@sttts I have fixed the review comments and this PR is ready for review. Also adding @piosz @jingxu97

@gnufied
Copy link
Member Author

gnufied commented Mar 20, 2017

/assign @jingxu97 @piosz

@jingxu97
Copy link
Contributor

cc @bowei

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 20, 2017
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 21, 2017
@sttts sttts removed their assignment Mar 21, 2017
@mikedanese mikedanese removed their assignment Mar 21, 2017
@mikedanese
Copy link
Member

/approve

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 21, 2017
@piosz
Copy link
Member

piosz commented Mar 22, 2017

LGTM
cc @fabxc @brancz

@@ -0,0 +1,76 @@
/*
Copyright 2014 The Kubernetes Authors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2017

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

var gceMetricMap = map[string]*prometheus.HistogramVec{
"gce_instance_list": prometheus.NewHistogramVec(
prometheus.HistogramOpts{
Name: "gce_instance_list",
Copy link
Member

@brancz brancz Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note the naming conventions for histograms: https://prometheus.io/docs/practices/histograms/

gce_instance_list_duration_seconds would be applicable here. Note that metrics are also meant to be in the base unit. Generally following the instrumentation guide.

Same applies to all other metrics introduced here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean - these metrics should only be collected in seconds? not in ms as done here?

I couldn't find any info. about that in linked documentation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, pointed at the wrong resources there, this is what I meant: https://prometheus.io/docs/practices/naming/#metric-names

Help: "Latency of instance listing calls in ms",
Buckets: prometheus.ExponentialBuckets(100, 2, 10),
},
[]string{"kube_namespace"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the instrumentation guide, this should be just namespace.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 24, 2017
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 28, 2017
@gnufied
Copy link
Member Author

gnufied commented Mar 28, 2017

@brancz fixed all review comments. ptal

This PR implements tracking of GCE API usage via prometheus metrics.
@brancz
Copy link
Member

brancz commented Mar 29, 2017

lgtm 👍

@gnufied
Copy link
Member Author

gnufied commented Mar 29, 2017

@brancz @piosz someone needs to "/lgtm" and "/approve" for this PR to get merged.

cc @saad-ali

}

apiResponseReceived := func(resp *http.Response) {
timeTaken := time.Since(requestTime).Seconds()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are seconds too high for latency?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to https://prometheus.io/docs/practices/naming/#metric-names metrics should use base units. I guess we should be okay, since we are reporting seconds as floats anyways.

@brancz
Copy link
Member

brancz commented Mar 30, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 30, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: brancz, gnufied, mikedanese

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Mar 30, 2017
@childsb childsb removed the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label Mar 30, 2017
@k8s-github-robot
Copy link

Automatic merge from submit-queue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cloudprovider metrics for storage