GKE to use nodePool instead of nodeConfig #323
base: cloud-foundation
Conversation
…gement of pool working
```diff
 masterAuth:
-  username: <FIXME:user_name>
-  password: <FIXME:password>
+  username: <FIXME:username>
```
Should we really be using basic auth here? It's considered a security weakness: https://cloud.google.com/kubernetes-engine/docs/concepts/security-overview
Also, the trailing whitespace might cause confusion in YAML.
This was provided as an example and is by no means best practice. Also, basic auth is still the default behavior when creating a cluster via the GUI.
At the moment, the GUI cautions against using basic auth with the following message: "Basic authentication is not recommended because it provides no confidentiality protection for transmitted credentials".
Starting with version 1.12, clusters will have basic authentication and client certificate issuance disabled by default, even when created via the GUI.
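For illustration, a hedged sketch of how the example could disable basic auth instead (field names follow the GKE v1 clusters API; whether gke.py passes them through as-is would need verifying against gke.py.schema):

```yaml
# Sketch only: disable basic auth and client certificate issuance.
masterAuth:
  username: ""          # empty username/password disables basic auth
  password: ""
  clientCertificateConfig:
    issueClientCertificate: false
```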
```diff
-  username: <FIXME:user_name>
-  password: <FIXME:password>
+  username: <FIXME:username>
+  password: <FIXME:password>
```
Should we really be using basic auth here? It's considered a security weakness: https://cloud.google.com/kubernetes-engine/docs/concepts/security-overview
Also, the trailing whitespace might cause confusion in YAML.
```yaml
    maxNodeCount: 2
  management:
    autoUpgrade: True
    autoRepair: True
```
It's best practice to add a newline at the end of the file.
```diff
@@ -34,4 +35,4 @@ resources:
     clusterIpv4Cidr: ${CLUSTERIPV4_CIDR}
     ipAllocationPolicy:
       useIpAliases: True
-      servicesIpv4CidrBlock: ${SERVICESIPV4_CIDRBLOCK}
+      servicesIpv4CidrBlock: ${SERVICESIPV4_CIDRBLOCK}
```
It's best practice to add a newline at the end of the file.
```yaml
      - https://www.googleapis.com/auth/devstorage.read_only
      - https://www.googleapis.com/auth/logging.write
      - https://www.googleapis.com/auth/monitoring
      nodePools:
```
Malformed YAML, extra spaces: `nodePools` should align with `subnetwork`.
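For illustration, the corrected alignment might look like this (the `subnetwork` value and pool name are assumptions, not taken from the PR):

```yaml
subnetwork: ${SUBNETWORK}
nodePools:
  - name: my-pool    # sibling keys of the cluster spec share one indent level
```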
```yaml
      - https://www.googleapis.com/auth/devstorage.read_only
      - https://www.googleapis.com/auth/logging.write
      - https://www.googleapis.com/auth/monitoring
      nodePools:
```
There's no `initialClusterVersion` value like in the other templates.
This was because the cluster version changes frequently; as an example, this will just default to the default version. The other templates could probably be changed, though.
It might be beneficial to have some consistency among the examples. If a user starts with the gke_regional.yaml example file and does not know they need to add `initialClusterVersion` themselves (since it's not in the example for gke_regional.yaml), they will end up with a broken default version from gke.py.schema and the following error message:
```
Waiting for create [operation-1542046313989-57a7ba2c5c389-96c2ff4c-4b16d716]...failed.
ERROR: (gcloud.deployment-manager.deployments.create) Error in Operation [operation-1542046313989-57a7ba2c5c389-96c2ff4c-4b16d716]: errors:
- code: RESOURCE_ERROR
  location: /deployments/my-gke-regional/resources/myk8sregional
  message: '{"ResourceType":"gcp-types/container-v1beta1:projects.locations.clusters","ResourceErrorCode":"400","ResourceErrorMessage":{"code":400,"message":"Master
    version \"1.9.7-gke.6\" is unsupported.","status":"INVALID_ARGUMENT","statusMessage":"Bad
    Request","requestPath":"https://container.googleapis.com/v1beta1/projects/cft-test-220920/locations/us-east1/clusters","httpMethod":"POST"}}'
```
They will then have to figure out for themselves which versions are working at the moment and add those to the example, or go ahead and start modifying the default in gke.py.schema, which is the less elegant but more intuitive thing to do. Neither of those seems ideal.
Additionally, I'm curious why we are specifying `initialClusterVersion` in gke.py as a required parameter (and not an optional one) to begin with?
> Additionally, I'm curious why are we specifying initialClusterVersion in gke.py as a required parameter (and not an optional one) to begin with?
We shouldn't make `initialClusterVersion` a required parameter... I see this in the .py file:

```python
'properties': {
    'cluster': {
        'name': name + '-cluster',
        'initialNodeCount': propc.get('initialNodeCount'),
        'initialClusterVersion': propc.get('initialClusterVersion')
    }
}
```

Completely missed it on the first pass, but always emitting the key essentially makes this field required.
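One way to make it truly optional would be to only include the key when the user supplies a value; a minimal sketch (the helper name `build_cluster_properties` is hypothetical, not from gke.py):

```python
# Hypothetical sketch: build the cluster properties dict so that
# initialClusterVersion is only included when the user supplies it,
# instead of always emitting the key with a possibly-stale default.
def build_cluster_properties(name, propc):
    cluster = {
        'name': name + '-cluster',
        'initialNodeCount': propc.get('initialNodeCount'),
    }
    # Only set initialClusterVersion when provided; otherwise GKE
    # picks its own current default version at creation time.
    if 'initialClusterVersion' in propc:
        cluster['initialClusterVersion'] = propc['initialClusterVersion']
    return {'properties': {'cluster': cluster}}
```

With this shape, omitting the field in the example YAML would no longer send a hardcoded (and eventually unsupported) version to the API.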
```yaml
  autoscaling:
    enabled: True
    minNodeCount: 1
    maxNodeCount: 3
```
This configuration results in unschedulable pods for the K8s management pods:

```
Problem
Your cluster has 7 unschedulable pods

Details
Unschedulable pods:
event-exporter-v0.1.9-5c8fb98cdb-8ggrr
heapster-v1.5.2-6d67555d76-sqhn9
kube-dns-autoscaler-69c5cbdcdd-pbzfh
kube-dns-db496df86-j9k9b
kubernetes-dashboard-7dbf5b9c66-6zmvw
and other 2 unschedulable pods.

Possible Actions
Increase maximum size limit for autoscaling in one or more node pools that have autoscaling enabled.
```
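As that message suggests, one mitigation is to raise the pool's autoscaling ceiling so the kube-system pods fit; a sketch (the value 5 is an illustrative assumption, not from the PR):

```yaml
autoscaling:
  enabled: True
  minNodeCount: 1
  maxNodeCount: 5   # assumed value: leave headroom for kube-system pods
```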
```yaml
      - https://www.googleapis.com/auth/devstorage.read_only
      - https://www.googleapis.com/auth/logging.write
      - https://www.googleapis.com/auth/monitoring
    autoscaling:
```
There are no `taints` specified for the zonal cluster, but we specify them for global and global private. We should consider being consistent.
```yaml
  taints:
    - key: mykey1
      value: value1
      effect: NO_SCHEDULE
```
Specifying `NO_SCHEDULE` for a pool with `minNodeCount: 1` and `maxNodeCount: 3` and the specified node type results in unscheduled system pods for k8s. To address this without increasing the autoscaling limits, use `PREFER_NO_SCHEDULE` instead of `NO_SCHEDULE`:
https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
https://cloud.google.com/kubernetes-engine/docs/reference/rest/v1beta1/NodeConfig#NodeTaint
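Applied to the example above, the suggested change would be a sketch like this (`PREFER_NO_SCHEDULE` is a soft taint: the scheduler avoids the nodes but can still place system pods there):

```yaml
  taints:
    - key: mykey1
      value: value1
      effect: PREFER_NO_SCHEDULE   # soft taint; system pods can still schedule
```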
@sourced-vince - can you please check this?
Use of nodePools to provide flexibility in node-level autoscaling and management, as well as multiple node pools in one cluster.