
Remove requirement to run the Portworx volume driver on master node #45518

Merged
merged 5 commits into kubernetes:master from harsh-px:px-remote
May 25, 2017

Conversation

harsh-px
Copy link
Contributor

@harsh-px harsh-px commented May 9, 2017

What this PR does / why we need it:
This change removes the requirement to run the Portworx volume driver on the Kubernetes master node.

Special notes for your reviewer:
Before this pull request, in order to use a Portworx volume, users had to run the Portworx container on the master node. Since scheduling any pods on the master node is not ideal (and is impossible on GKE), this PR removes that requirement.

Portworx volume driver no longer has to run on the master.
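
For context, a minimal sketch of a PersistentVolume backed by a Portworx volume, written against the client-go API types; the name, volume ID, and capacity are placeholder values, not values from this PR.

```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// examplePV sketches a PersistentVolume that uses the Portworx volume source.
// The name, volume ID, and capacity are placeholders, not taken from this PR.
func examplePV() *corev1.PersistentVolume {
	return &corev1.PersistentVolume{
		ObjectMeta: metav1.ObjectMeta{Name: "pv-portworx-example"},
		Spec: corev1.PersistentVolumeSpec{
			Capacity: corev1.ResourceList{
				corev1.ResourceStorage: resource.MustParse("2Gi"),
			},
			AccessModes: []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
			PersistentVolumeSource: corev1.PersistentVolumeSource{
				PortworxVolume: &corev1.PortworxVolumeSource{
					VolumeID: "pv-portworx-example", // placeholder ID
					FSType:   "ext4",
				},
			},
		},
	}
}
```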

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 9, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 9, 2017
@k8s-ci-robot
Copy link
Contributor

Hi @harsh-px. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with @k8s-bot ok to test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-label-needed labels May 9, 2017
@harsh-px
Copy link
Contributor Author

harsh-px commented May 9, 2017

/assign @jsafrane

@jsafrane
Copy link
Member

jsafrane commented May 9, 2017

@k8s-bot ok to test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 9, 2017

if driverClient != nil {
	util.portworxClient = driverClient
	break OUTER
Copy link
Member

I really don't like this break, can't you just return here?

Copy link
Contributor Author

Good suggestion. Fixed in the latest incremental (4164374)
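
For illustration, a minimal sketch of the return-based shape the reviewer suggested, assuming a hypothetical findDriverClient helper with placeholder client and health-check types (not the actual code from commit 4164374):

```go
package example

import (
	"errors"
	"fmt"
)

// osdClient is a stand-in for the Portworx driver client type used by the
// plugin; it is a placeholder, not the real openstorage client.
type osdClient struct{ host string }

// newDriverClient and healthy are hypothetical helpers standing in for the
// real client constructor and status check discussed in this thread.
func newDriverClient(host string) (*osdClient, error) {
	if host == "" {
		return nil, errors.New("empty host")
	}
	return &osdClient{host: host}, nil
}

func healthy(c *osdClient) bool { return c != nil }

// findDriverClient shows the return-based shape suggested above: return the
// first working client instead of assigning util.portworxClient and using a
// labeled "break OUTER".
func findDriverClient(hosts []string) (*osdClient, error) {
	for _, host := range hosts {
		c, err := newDriverClient(host)
		if err != nil {
			continue // try the next host
		}
		if healthy(c) {
			return c, nil // early return replaces the labeled break
		}
	}
	return nil, fmt.Errorf("no usable Portworx endpoint found among %d hosts", len(hosts))
}
```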

e = err
kubeClient := volumeHost.GetKubeClient()
if kubeClient != nil {
	nodes, err := kubeClient.CoreV1().Nodes().List(metav1.ListOptions{})
Copy link
Member

Please add a label selector to ListOptions to save the label check below and transfer less data.
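
For reference, a hedged sketch of server-side filtering with ListOptions.LabelSelector, using the same client-go List call as the snippet above; the label key and value are hypothetical, and the PR ultimately dropped label-based filtering:

```go
package example

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// listLabeledNodes filters nodes server-side via ListOptions.LabelSelector
// instead of listing every node and checking labels client-side. The label
// key and value below are hypothetical.
func listLabeledNodes(kubeClient kubernetes.Interface) error {
	nodes, err := kubeClient.CoreV1().Nodes().List(metav1.ListOptions{
		LabelSelector: "example.portworx.io/driver=true", // hypothetical label
	})
	if err != nil {
		return err
	}
	fmt.Printf("found %d candidate nodes\n", len(nodes.Items))
	return nil
}
```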

Copy link
Contributor Author

Removed filtering logic in latest incremental. Please see reasoning in comment below.

pxdDriverName       = "pxd"
pwxSockName         = "pwx"
pvcClaimLabel       = "pvc"
labelNodeRoleMaster = "node-role.kubernetes.io/master"
Copy link
Member

This is a Portworx-specific label and thus should have a portworx prefix. In addition, masters and nodes can be separate machines, so /master is not a good name here... Perhaps portworx.kubernetes.io/driver? Feel free to suggest a better name.

Copy link
Contributor Author

In the latest incremental (4164374), I decided to remove filtering based on a specific label.

Previously the goal was to filter out master nodes, but that required labeling master nodes during deployment. We weren't gaining enough from that added step, so I decided to remove the filtering altogether.

@jsafrane
Copy link
Member

With the latest commit, you scan all nodes on every mount and unmount, which is IMO very inefficient. There can be thousands of nodes, and only a few of them may be running something on port 9001, yet you try all of them. That's a no-go for me.

I don't know Portworx internals and I am confused by the terminology here. You are looking for a "master" node. What is this "master"? Is it Kubernetes API server(s)? That's not listed in nodes in most cases. Is it a set of nodes that runs some sort of Portworx server pods? Use a service (with well-known name in a well-known namespace) to get to it.

In addition, can I, as a malicious user, run a dummy Portworx service on port 9001 on a random node and steal your data or even credentials? I hope you at least validate server SSL certificates...

@harsh-px
Copy link
Contributor Author

@jsafrane There is some confusion here. Let me try to clear it up.

  1. The code scans the nodes only on the first mount. After it finds a validated node, it remembers that and there won't be subsequent scans (because of the if util.portworxClient == nil check).
  2. We are not looking for the master node. By master node, I mean the node running the k8s API server. The Portworx API server will not be deployed on the master/api-server node.
  3. All nodes in the k8s cluster will run the Portworx container since we are deployed as a DaemonSet. So the scan loop referenced in point 1 will in most cases find Portworx running on the first node in the list.
  4. We don't save any user credentials or data as of today. The Portworx API server at 9001 has endpoints for volume operations.

Let me know if that clears up your concerns.

@saad-ali saad-ali added this to the v1.6 milestone May 11, 2017
@jsafrane
Copy link
Member

jsafrane commented May 12, 2017

The code scans the nodes only on the first mount. After it finds a validated node, it remembers that and there won't be subsequent scans (because of the if util.portworxClient == nil check)

No. Kubelet creates a new mounter for each mount, and the mounter gets an empty PortworxVolumeUtil. https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/portworx/portworx.go#L92
So for each mount you download the list of all nodes. That's IMHO bad. Can you cache the client in portworxVolumePlugin instead?

And if you are going to cache the node, you should recover somehow when the node is deleted and you should create a new client, pointing to a different node.

All nodes in the k8s cluster will run the Portworx container since we get deployed as a DaemonSet. So the scan loop with reference with point 1 will in most cases find Portworx running on the first node in the list.

Why do you need to list all nodes then? If the container is on all nodes, kubelet can easily try localhost:9001 (or hostname:9001) or get its own node addresses; you don't need to list all of them.
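
A rough sketch of the two suggestions combined — cache one client per plugin instance and probe the kubelet's own node on port 9001 — with a placeholder client type and an illustrative TCP probe standing in for the real validation:

```go
package example

import (
	"fmt"
	"net"
	"sync"
	"time"
)

const osdMgmtPort = "9001" // Portworx management port mentioned in this thread

// osdClient is a placeholder for the real Portworx driver client.
type osdClient struct{ endpoint string }

// portworxVolumePlugin sketches the caching idea: keep one client per plugin
// instance instead of per mounter, and probe the local node first since the
// DaemonSet runs Portworx on every node.
type portworxVolumePlugin struct {
	mu     sync.Mutex
	client *osdClient
}

// getClient returns the cached client, or probes localhost:9001 once. The
// dial-based probe is illustrative only, not the real validation logic.
func (p *portworxVolumePlugin) getClient() (*osdClient, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.client != nil {
		return p.client, nil // reused across mounts instead of re-listing nodes
	}
	addr := net.JoinHostPort("127.0.0.1", osdMgmtPort)
	conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
	if err != nil {
		return nil, fmt.Errorf("portworx not reachable on %s: %v", addr, err)
	}
	conn.Close()
	p.client = &osdClient{endpoint: addr}
	return p.client, nil
}
```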

@thockin
Copy link
Member

thockin commented May 12, 2017

Are you scanning every node in the cluster, looking for something that tastes like your management server? With no credentials? That any pod in the cluster can trivially fake?

This is a really bad idea, and really needs to be re-thought.

Why not publish a Service? It still has no credentials, but at least you're not just hunting.

@thockin thockin assigned saad-ali and unassigned thockin May 12, 2017
@thockin
Copy link
Member

thockin commented May 12, 2017

I'm out, assigning to saad

@harsh-px
Copy link
Contributor Author

harsh-px commented May 18, 2017

@thockin and @jsafrane: Good inputs, and yes, agreed! We are working on a service-based design.

A quick question: what do you think about running a NodePort-type service, with the Portworx volume plugin always sending requests to localhost:<node port>? The NodePort service will take care of routing the request to one of the pods backing the service.

Our primary use case here is GKE, where our pod won't be running on the master but the volume create request is sent to the master. So if we have a NodePort service running, we can update our plugin driver running on the master to send requests to localhost:<node port>.

Alternatively, I can use the default ClusterIP service and talk to the service using its cluster IP.
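
For comparison, a hedged sketch of the ClusterIP option: look up a well-known Service through the API server and build the endpoint from its cluster IP and port. The service name and namespace below are assumptions, not something defined by this PR.

```go
package example

import (
	"fmt"
	"net"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// lookupPortworxEndpoint sketches the ClusterIP alternative: resolve a
// well-known Service instead of scanning nodes. The service name and
// namespace are assumptions.
func lookupPortworxEndpoint(kubeClient kubernetes.Interface) (string, error) {
	svc, err := kubeClient.CoreV1().Services("kube-system").Get("portworx-service", metav1.GetOptions{})
	if err != nil {
		return "", fmt.Errorf("failed to get portworx service: %v", err)
	}
	if len(svc.Spec.Ports) == 0 {
		return "", fmt.Errorf("portworx service has no ports defined")
	}
	port := fmt.Sprintf("%d", svc.Spec.Ports[0].Port)
	return net.JoinHostPort(svc.Spec.ClusterIP, port), nil
}
```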

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 22, 2017
@jsafrane
Copy link
Member

Now it looks much better (and shorter). The only thing I don't like is some copying of code, which IMO can easily be refactored into a function; the rest is OK.
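
Purely as an illustration of the kind of refactor being requested (hypothetical names, not the actual duplicated code), the shared setup could be pulled into one helper that each volume operation calls:

```go
package example

// osdClient is a placeholder for the real Portworx driver client.
type osdClient struct{}

// getClient stands in for whatever shared setup is currently duplicated.
func getClient() (*osdClient, error) { return &osdClient{}, nil }

// withClient collects the repeated "build a client, then run an operation"
// steps into one helper, so each volume operation can call
// withClient(func(c *osdClient) error { ... }) instead of copying the setup.
func withClient(f func(*osdClient) error) error {
	c, err := getClient()
	if err != nil {
		return err
	}
	return f(c)
}
```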

@jsafrane
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 25, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: harsh-px, jsafrane

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. labels May 25, 2017
@jsafrane
Copy link
Member

/release-note-none

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels May 25, 2017
@jsafrane jsafrane removed the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label May 25, 2017
@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-ci-robot
Copy link
Contributor

@harsh-px: The following test(s) failed:

Test name | Commit | Details | Rerun command
--- | --- | --- | ---
Jenkins kops AWS e2e | 4164374 | link | @k8s-bot kops aws e2e test this
Jenkins verification | 4164374 | link | @k8s-bot verify test this
pull-kubernetes-unit | ad4f21f | link | @k8s-bot pull-kubernetes-unit test this
pull-kubernetes-e2e-gce-etcd3 | ad4f21f | link | @k8s-bot pull-kubernetes-e2e-gce-etcd3 test this

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 45518, 46127, 46146, 45932, 45003)

@k8s-github-robot k8s-github-robot merged commit b017a7a into kubernetes:master May 25, 2017
@enisoc enisoc added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels May 26, 2017
k8s-github-robot pushed a commit that referenced this pull request May 27, 2017
…18-upstream-release-1.6

Automatic merge from submit-queue

Automated cherry pick of #45518 upstream release 1.6

Cherrypick of #45518

**What this PR does / why we need it**:
This change removes the requirement to run the Portworx volume driver on the Kubernetes master node.

**Special notes for your reviewer**:
**Special notes for your reviewer**:
Before this pull request, in order to use a Portworx volume, users had to run the Portworx container on the master node. Since scheduling any pods on the master node is not ideal (and is impossible on GKE), this PR removes that requirement.
@k8s-cherrypick-bot
Copy link

Commit found in the "release-1.6" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

@harsh-px
Copy link
Contributor Author

@saad-ali How do I ensure this makes it into the next 1.7 release?

For the 1.6 release, I created a cherry-pick pull request (#46528) that has been merged successfully, but I don't see a branch for the next 1.7 release.

@harsh-px harsh-px deleted the px-remote branch May 30, 2017 18:36
@jsafrane
Copy link
Member

@harsh-px, master is the 1.7 branch, so this PR will be in 1.7

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.