Closed
Labels
kind/bug: Categorizes issue or PR as related to a bug.
sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.
Description
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
A Pod got stuck in the Pending state indefinitely after the scheduler failed to bind it to a host.
What you expected to happen:
The Pod should have been rescheduled onto another host.
How to reproduce it (as minimally and precisely as possible):
Not always reproducible. I simply created an RC with replicas=2 on a 3-host setup, and one of the Pods got stuck in the Pending state. The following error messages were found in the scheduler logs:
7/20/2017 8:30:40 AM E0720 15:30:40.435703 1 scheduler.go:282] Internal error binding pod: (scheduler cache ForgetPod failed: pod test-677550-rc-edit-namespace/nginx-jvn09 state was assumed on a different node)
7/20/2017 8:31:11 AM W0720 15:31:11.119885 1 cache.go:371] Pod test-677550-rc-edit-namespace/nginx-jvn09 expired
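To make the first log line easier to reason about, here is a rough sketch of the kind of check that could produce a "state was assumed on a different node" error. This is an illustration only, not the actual scheduler code: `toyCache`, `assume`, `forget`, and the node names `node-a` and `node-b` are all hypothetical. The assumption is that the scheduler optimistically records a placement before binding and tries to roll it back when the bind fails.

```go
package main

import "fmt"

// toyCache is a hypothetical, heavily simplified stand-in for a scheduler
// cache. It only tracks which node each pod is assumed to be placed on.
type toyCache struct {
	assumed map[string]string // pod key -> node name
}

func newToyCache() *toyCache {
	return &toyCache{assumed: make(map[string]string)}
}

// assume records an optimistic placement before the bind call is made.
func (c *toyCache) assume(pod, node string) {
	c.assumed[pod] = node
}

// forget undoes an assumption after a failed bind. It refuses to do so when
// the caller's idea of the node disagrees with the cached one.
func (c *toyCache) forget(pod, node string) error {
	cached, ok := c.assumed[pod]
	if !ok {
		return fmt.Errorf("pod %s was not assumed", pod)
	}
	if cached != node {
		return fmt.Errorf("pod %s state was assumed on a different node", pod)
	}
	delete(c.assumed, pod)
	return nil
}

func main() {
	c := newToyCache()
	pod := "test-677550-rc-edit-namespace/nginx-jvn09"

	// Optimistic placement on a (hypothetical) node.
	c.assume(pod, "node-a")

	// The bind fails, and the rollback is attempted with a mismatched node.
	if err := c.forget(pod, "node-b"); err != nil {
		fmt.Println("forget failed:", err)
	}

	// The stale assumption is still in the cache after the failed rollback.
	_, still := c.assumed[pod]
	fmt.Println("still assumed:", still)
}
```

In this toy model the failed rollback leaves the stale assumption in place until something else clears it, which would be consistent with the later "expired" warning. Whether the Pod is put back into the scheduling queue after that point is the open question in this report.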
Describe pod output:
nginx-jvn09 0/1 Pending 0 2h
> kubectl describe pod nginx-jvn09 --namespace=test-677550-rc-edit-namespace
Name: nginx-jvn09
Namespace: test-677550-rc-edit-namespace
Node: /
Labels: name=nginx
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"test-677550-rc-edit-namespace","name":"nginx","uid":"6313354d-6d60-11e...
Status: Pending
IP:
Controllers: ReplicationController/nginx
Containers:
nginx:
Image: nginx
Port: 80/TCP
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-vz8x6 (ro)
Volumes:
default-token-vz8x6:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-vz8x6
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
Events: <none>
RC yml:
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    name: nginx
  template:
    metadata:
      labels:
        name: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
Anything else we need to know?:
Environment:
- Kubernetes version (use kubectl version): 1.7.1
- Cloud provider or hardware configuration:
- OS (e.g. from /etc/os-release): RHEL7, Docker native 1.12.6
- Kernel (e.g. uname -a):
- Install tools:
- Others: