Description
What happened?
On a new install with Calico as the CNI, logging into a k8s node and running calicoctl.sh ipam check fails with a permissions error:
root@k8s-worker-1:/home/ansible# calicoctl.sh ipam check
Checking IPAM for inconsistencies...
Loading all IPAM blocks...
Found 5 IPAM blocks.
IPAM block 10.233.110.128/26 affinity=host:k8s-worker-1:
IPAM block 10.233.113.0/26 affinity=host:k8s-control-1:
IPAM block 10.233.66.0/26 affinity=host:k8s-worker-4:
IPAM block 10.233.85.64/26 affinity=host:k8s-control-2:
IPAM block 10.233.93.192/26 affinity=host:k8s-control-3:
IPAM blocks record 5 allocations.
Loading all IPAM pools...
10.233.64.0/18
Found 1 active IP pools.
Loading all nodes.
failed to list nodes: connection is unauthorized: nodes is forbidden: User "system:serviceaccount:kube-system:calico-cni-plugin" cannot list resource "nodes" in API group "" at the cluster scope
The fix was to apply:
diff --git a/roles/network_plugin/calico/templates/calico-cr.yml.j2 b/roles/network_plugin/calico/templates/calico-cr.yml.j2
index 7ddec1698..5e6651761 100644
--- a/roles/network_plugin/calico/templates/calico-cr.yml.j2
+++ b/roles/network_plugin/calico/templates/calico-cr.yml.j2
@@ -11,6 +11,7 @@ rules:
- namespaces
verbs:
- get
+ - list
- apiGroups: [""]
resources:
- pods/status
Note that I added list for pods, nodes, and namespaces, because they all needed the permission. (I first separated nodes into its own rule with get and list, and then namespaces errored on the next run.)
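To confirm which permissions are missing without running the full IPAM check, something like the following could be used. It is only a sketch: the service account name is taken from the error message above, the function name check_cni_rbac is made up for illustration, and it assumes kubectl is on the PATH with a kubeconfig that is allowed to impersonate service accounts (an admin kubeconfig normally is).

```shell
# Service account name copied from the "forbidden" error in this report.
SA="system:serviceaccount:kube-system:calico-cni-plugin"

# Hypothetical helper: asks the API server whether the service account may
# get and list each resource touched by the change in this report.
check_cni_rbac() {
  local missing=0 verb resource answer
  for resource in pods nodes namespaces; do
    for verb in get list; do
      # "kubectl auth can-i" prints yes or no and exits non-zero on no,
      # so the exit status is ignored and the printed answer is inspected.
      answer=$(kubectl auth can-i "$verb" "$resource" --as="$SA" -A 2>/dev/null) || true
      printf '%-4s %-10s %s\n' "$verb" "$resource" "${answer:-unknown}"
      [ "$answer" = "yes" ] || missing=1
    done
  done
  # Returns 0 only when every verb/resource pair is allowed.
  return "$missing"
}
```

Calling check_cni_rbac prints one line per verb and resource pair. A no on any list line would match the failure above, and after the template change every line should read yes.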
What did you expect to happen?
I expected the command to complete without errors and print a result similar to the following:
root@k8s-worker-1:/home/ansible# calicoctl.sh ipam check
Checking IPAM for inconsistencies...
Loading all IPAM blocks...
Found 5 IPAM blocks.
IPAM block 10.233.110.128/26 affinity=host:k8s-worker-1:
IPAM block 10.233.113.0/26 affinity=host:k8s-control-1:
IPAM block 10.233.66.0/26 affinity=host:k8s-worker-4:
IPAM block 10.233.85.64/26 affinity=host:k8s-control-2:
IPAM block 10.233.93.192/26 affinity=host:k8s-control-3:
IPAM blocks record 5 allocations.
Loading all IPAM pools...
10.233.64.0/18
Found 1 active IP pools.
Loading all nodes.
Found 0 node tunnel IPs.
Loading all workload endpoints.
Found 5 workload IPs.
Workloads and nodes are using 5 IPs.
Loading all handles
Looking for top (up to 20) nodes by allocations...
k8s-worker-4 has 1 allocations
k8s-control-2 has 1 allocations
k8s-control-3 has 1 allocations
k8s-worker-1 has 1 allocations
k8s-control-1 has 1 allocations
Node with most allocations has 1; median is 1
Scanning for IPs that are allocated but not actually in use...
Found 0 IPs that are allocated in IPAM but not actually in use.
Scanning for IPs that are in use by a workload or node but not allocated in IPAM...
Found 0 in-use IPs that are not in active IP pools.
Found 0 in-use IPs that are in active IP pools but have no corresponding IPAM allocation.
Scanning for IPAM handles with no matching IPs...
Found 0 handles with no matching IPs (and 5 handles with matches).
Scanning for IPs with missing handle...
Found 0 handles mentioned in blocks with no matching handle resource.
Check complete; found 0 problems.
How can we reproduce it (as minimally and precisely as possible)?
Install a cluster with Calico as the network plugin (my Calico vars are included below). Log into a k8s node and run calicoctl.sh ipam check; it fails with the error shown above. Apply the diff from above, run the same command again, and it works.
My group_vars/k8s_cluster/k8s-net-calico.yml is as follows; you should not need all of the BGP settings to reproduce this.
---
calico_cni_name: k8s-pod-network
peer_with_router: true
nat_outgoing: true
nat_outgoing_ipv6: false
calico_pool_name: "default-pool"
calico_pool_blocksize: 26
calico_pool_cidr: 10.233.64.0/18
calico_cni_pool: true
global_as_num: "64513"
calico_mtu: 1500
calico_veth_mtu: 1500
calico_advertise_cluster_ips: true
calico_datastore: "kdd"
typha_enabled: true
calico_network_backend: 'bird'
calico_ipip_mode: 'Never'
calico_vxlan_mode: 'Never'
calico_apiserver_enabled: true
peers:
  - router_id: "10.10.130.1"
    as: "64512"
OS
Linux 6.8.0-48-generic x86_64
PRETTY_NAME="Ubuntu 24.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.1 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo
Version of Ansible
ansible [core 2.16.12]
config file = None
configured module search path = ['/home/bvierra/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /home/bvierra/p/homelab/.direnv/python-3.12.3/lib/python3.12/site-packages/ansible
ansible collection location = /home/bvierra/.ansible/collections:/usr/share/ansible/collections
executable location = /home/bvierra/p/homelab/.direnv/python-3.12.3/bin/ansible
python version = 3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0] (/home/bvierra/p/homelab/.direnv/python-3.12.3/bin/python)
jinja version = 3.1.4
libyaml = True
Version of Python
Python 3.12.3
Version of Kubespray (commit)
Network plugin used
calico
Full inventory with variables
I can add it if needed, but it is large and contains some things I would have to redact :)
Command used to invoke ansible
ansible-playbook -i inventory/mycluster/hosts.yaml cluster.yml
Output of ansible run
as above
Anything else we need to know
No response