Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Centos provider: generate SSL certificates for etcd cluster. #42994

Merged

Conversation

Shawyeok
Copy link
Contributor

What this PR does / why we need it:
Support secure etcd cluster for centos provider, generate SSL certificates for etcd in default. Running it w/o SSL is exposing cluster data to everyone and is not recommended. #39462

/cc @jszczepkowski @zmerlynn

Release note:

Support secure etcd cluster for centos provider.

@k8s-ci-robot
Copy link
Contributor

Hi @Shawyeok. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with @k8s-bot ok to test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 13, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Mar 13, 2017
sudo mkdir -p /opt/kubernetes/bin; \
sudo mkdir -p /opt/kubernetes/cfg"
}

function ensure-etcd-cert() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a lot like https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/util.sh#L740

I wonder if @jszczepkowski would be interest in you moving that stuff up higher in the tree and making it reusable by more providers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a good suggestion. Download the cfssl binaries is duplicate at least https://github.com/kubernetes/kubernetes/blob/master/cluster/common.sh#L908.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sound reasonable to move this up.

@@ -18,10 +18,16 @@
ETCD_SERVERS=${1:-"http://8.8.8.18:2379"}
FLANNEL_NET=${2:-"172.16.0.0/16"}

CA_FILE="/srv/kubernetes/etcd/ca.pem"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need it on node? Etcd peers are only on masters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's not a peer certfile for etcd(A peer certfile just like peer-*.pem). Because flannel needs to access the etcd cluster, so...

echo "Generate CA certificates ..."
./cfssl gencert -initca ca-csr.json | ./cfssljson -bare ca -

echo "Generate client certificates..."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need client & server certs? Aren't peer certs enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppose a pod running in a k8s cluster(peer certs only), the pod still can access the etcd service, so I think it's necessary.

@jszczepkowski jszczepkowski added this to the v1.7 milestone Mar 14, 2017
@Shawyeok Shawyeok force-pushed the features/full-tls-etcd-cluster branch 2 times, most recently from 74045d4 to 2375a1d Compare March 14, 2017 19:08
# $2 (the ip of etcd member)
# $3 (the type of etcd certificates, must be one of client, server, peer)
# $4 (the prefix of the certificate filename, default is $3)
function generate-etcd-cert() {
Copy link
Contributor Author

@Shawyeok Shawyeok Mar 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've extracted generate-etcd-cert method to the cluster/common.sh, feel free to review, thanks a lot. @jszczepkowski

"size": 256
},
"names": [
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this section needed? If so, shouldn't we fill it with some reasonable values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is. cfssl docs
I've changed the O (Organization) to kubernetes.io and deleted another unnecessary fields.

@Shawyeok Shawyeok force-pushed the features/full-tls-etcd-cluster branch 2 times, most recently from a01523e to 13e4a45 Compare March 15, 2017 16:56
@@ -117,7 +120,7 @@ export FLANNEL_NET=${FLANNEL_NET:-"172.16.0.0/16"}

# Admission Controllers to invoke prior to persisting objects in cluster
# If we included ResourceQuota, we should keep it at the end of the list to prevent incrementing quota usage prematurely.
export ADMISSION_CONTROL=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,DefaultTolerationSeconds
export ADMISSION_CONTROL=${ADMISSION_CONTROL:-"NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,DefaultTolerationSeconds"}
Copy link
Contributor Author

@Shawyeok Shawyeok Mar 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support defined ADMISSION_CONTROL outside by user, DefaultTolerationSeconds is not exists before cd427fa4. /cc @kevin-wangzefeng

@@ -742,44 +742,14 @@ function create-etcd-certs {
local ca_cert=${2:-}
local ca_key=${3:-}

download-cfssl
GEN_ETCD_CA_CERT="${ca_cert}" GEN_ETCD_CA_KEY="${ca_key}" \
generate-etcd-cert "${KUBE_TEMP}/cfssl" "${host}" "peer" "peer"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who creates ${KUBE_TEMP}/cfssl directory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The directory ${KUBE_TEMP}/cfssl will create in download-cfssl function, but it's not exists before call download in generate-etcd-cert function. It maybe cause error: the file or directory not exists. I'll fix it.

local cert_dir="${ROOT}/etcd-cert"

mkdir -p "${cert_dir}"
pushd "${cert_dir}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be unnecessary: function generate-etcd-cert does pushd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I need to check client.pem or client-key.pem exists in the ${cert_dir} directory. Use generate-etcd-cert "${cert_dir}" instead of generate-etcd-cert . seem could be better.

local master_ip="$2"
local cert_dir="${ROOT}/etcd-cert"

mkdir -p "${cert_dir}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't better to create ${cert_dir} inside generate-etcd-cert function (if the dir does not exist). Currently, generate-etcd-cert function has undocumented assumption that the directory already exists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is.

@Shawyeok Shawyeok force-pushed the features/full-tls-etcd-cluster branch from 13e4a45 to 92997ee Compare March 16, 2017 10:59
local master_ip="$2"
local cert_dir="${ROOT}/etcd-cert"

if [[ ! -r "${cert_dir}/client.pem" || ! -r "${cert_dir}/client-key.pem" ]]; then
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the client.pem and client-key.pem in the directory exists without cd.

local GEN_ETCD_CA_CERT=${GEN_ETCD_CA_CERT:-}
local GEN_ETCD_CA_KEY=${GEN_ETCD_CA_KEY:-}

mkdir -p "${cert_dir}"
Copy link
Contributor Author

@Shawyeok Shawyeok Mar 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create ${cert_dir} directory inside generate-etcd-cert function.

#
function download-cfssl {
mkdir -p "${KUBE_TEMP}/cfssl"
pushd "${KUBE_TEMP}/cfssl"
mkdir -p "$1"
Copy link
Contributor

@jszczepkowski jszczepkowski Mar 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add a local variable with a meaningful name for "$1" (like here)?

@jszczepkowski
Copy link
Contributor

@k8s-bot ok to test

@Shawyeok Shawyeok force-pushed the features/full-tls-etcd-cluster branch from 92997ee to ca4ea4e Compare March 16, 2017 15:21
local cfssl_dir="${1}"

mkdir -p "${cfssl_dir}"
pushd "${cfssl_dir}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ${cfssl_dir} instead of $1.

if [[ -f $service_file ]]; then \
sudo systemctl stop $service_name; \
sudo systemctl disable $service_name; \
sudo rm -f $service_file; \
fi"
done
kube-ssh "${1}" "sudo rm -rf /opt/kubernetes"
kube-ssh "${1}" "sudo rm -rf ${KUBE_TEMP}"
kube-ssh "${1}" "sudo rm -rf /var/lib/etcd"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ${master} instead of $1.

if [[ -f $service_file ]]; then \
sudo systemctl stop $service_name; \
sudo systemctl disable $service_name; \
sudo rm -f $service_file; \
fi"
done
kube-ssh "$1" "sudo rm -rf /run/flannel"
kube-ssh "$1" "sudo rm -rf /opt/kubernetes"
kube-ssh "$1" "sudo rm -rf ${KUBE_TEMP}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ${node} instead of $1.

@jszczepkowski
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 16, 2017
@jszczepkowski
Copy link
Contributor

FYI: I did manual tests of this patch for GCE and it is passing.

@jszczepkowski
Copy link
Contributor

@zmerlynn
Zach, can you approve it?

@eparis
Copy link
Contributor

eparis commented Mar 16, 2017

/approve

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 16, 2017
@Shawyeok Shawyeok force-pushed the features/full-tls-etcd-cluster branch from ca4ea4e to 92997ee Compare March 16, 2017 18:01
@Shawyeok
Copy link
Contributor Author

@k8s-bot cvm gce e2e test this

@Shawyeok
Copy link
Contributor Author

Shawyeok commented Mar 16, 2017

It's strange. Why tests pull-kubernetes-e2e-gce, pull-kubernetes-e2e-gce-etcd3, pull-kubernetes-e2e-gce-gci failed after I force push ca4ea4e8 to replace 92997ee5?

The diff between two commits are:

$ git diff 92997ee5616670533bd188f02a262e6d256b5ea9 ca4ea4e88eea97982a9176f474bc70803dcb0228

diff --git a/cluster/centos/util.sh b/cluster/centos/util.sh
index 88302a3..f817759 100755
--- a/cluster/centos/util.sh
+++ b/cluster/centos/util.sh
@@ -197,38 +197,40 @@ function troubleshoot-node() {

 # Clean up on master
 function tear-down-master() {
-echo "[INFO] tear-down-master on $1"
+  local master="${1}"
+  echo "[INFO] tear-down-master on ${master}"
   for service_name in etcd kube-apiserver kube-controller-manager kube-scheduler ; do
       service_file="/usr/lib/systemd/system/${service_name}.service"
-      kube-ssh "$1" " \
+      kube-ssh "${master}" " \
         if [[ -f $service_file ]]; then \
           sudo systemctl stop $service_name; \
           sudo systemctl disable $service_name; \
           sudo rm -f $service_file; \
         fi"
   done
-  kube-ssh "${1}" "sudo rm -rf /opt/kubernetes"
-  kube-ssh "${1}" "sudo rm -rf /srv/kubernetes"
-  kube-ssh "${1}" "sudo rm -rf ${KUBE_TEMP}"
-  kube-ssh "${1}" "sudo rm -rf /var/lib/etcd"
+  kube-ssh "${master}" "sudo rm -rf /opt/kubernetes"
+  kube-ssh "${master}" "sudo rm -rf /srv/kubernetes"
+  kube-ssh "${master}" "sudo rm -rf ${KUBE_TEMP}"
+  kube-ssh "${master}" "sudo rm -rf /var/lib/etcd"
 }

 # Clean up on node
 function tear-down-node() {
-echo "[INFO] tear-down-node on $1"
+  local node="${1}"
+  echo "[INFO] tear-down-node on ${node}"
   for service_name in kube-proxy kubelet docker flannel ; do
       service_file="/usr/lib/systemd/system/${service_name}.service"
-      kube-ssh "$1" " \
+      kube-ssh "${node}" " \
         if [[ -f $service_file ]]; then \
           sudo systemctl stop $service_name; \
           sudo systemctl disable $service_name; \
           sudo rm -f $service_file; \
         fi"
   done
-  kube-ssh "$1" "sudo rm -rf /run/flannel"
-  kube-ssh "$1" "sudo rm -rf /opt/kubernetes"
-  kube-ssh "$1" "sudo rm -rf /srv/kubernetes"
-  kube-ssh "$1" "sudo rm -rf ${KUBE_TEMP}"
+  kube-ssh "${node}" "sudo rm -rf /run/flannel"
+  kube-ssh "${node}" "sudo rm -rf /opt/kubernetes"
+  kube-ssh "${node}" "sudo rm -rf /srv/kubernetes"
+  kube-ssh "${node}" "sudo rm -rf ${KUBE_TEMP}"
 }

 # Generate the CA certificates for k8s components
diff --git a/cluster/common.sh b/cluster/common.sh
index 656bfd1..e9e7fd1 100755
--- a/cluster/common.sh
+++ b/cluster/common.sh
@@ -903,11 +903,13 @@ function sha1sum-file() {
 # Downloads cfssl into $1 directory
 #
 # Assumed vars:
-#   $1 (cfssl directory)
+#   $1 (the directory that cfssl to save)
 #
 function download-cfssl {
-  mkdir -p "$1"
-  pushd "$1"
+  local cfssl_dir="${1}"
+
+  mkdir -p "${cfssl_dir}"
+  pushd "${cfssl_dir}"

   kernel=$(uname -s)
   case "${kernel}" in

Full PR test history: #42994
There are some layout problems, all failed test are under commit ca4ea4e8.

/cc @k8s-oncall @kubernetes/test-infra-maintainers

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 23, 2017
Making download-cfssl reusable.

Extract generate-etcd-cert method up to common.sh.
@Shawyeok Shawyeok force-pushed the features/full-tls-etcd-cluster branch from 92997ee to c692b55 Compare March 24, 2017 01:18
@k8s-github-robot k8s-github-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Mar 24, 2017
@Shawyeok
Copy link
Contributor Author

This PR needs LGTM again after rebase, please. @jszczepkowski @zmerlynn

@jszczepkowski
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 28, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Shawyeok, eparis, jszczepkowski

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit be4452c into kubernetes:master Mar 28, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants