Automatic merge from submit-queue
Kubelet: Fail kubelet if cadvisor is not started.
Fixes https://github.com/kubernetes/kubernetes/issues/28997.
We started cadvisor in `sync.Once.Do()`, which runs only once regardless of whether cadvisor starts successfully.
Once it fails, kubelet is stuck in a bad state: it can never start the sync loop because of the internal error, and it never retries starting cadvisor.
This PR simply fails kubelet when cadvisor fails to start, and relies on the babysitter to restart kubelet.
In the future, we may want to add backoff logic in the babysitter to protect the system.
Separately, https://github.com/kubernetes/kubernetes/pull/29492 will fix the cadvisor side to prevent cadvisor from failing on this kind of transient error.
Mark P1 to match the original issue.
@dchen1107 @vishh
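The failure mode described above can be reproduced with a minimal sketch. The `starter` type and `failingStart` function are hypothetical and are not the kubelet code; the sketch only shows that a function wrapped in `sync.Once` is never retried after it fails.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// starter is a hypothetical stand-in for a component that starts a
// dependency through sync.Once.
type starter struct {
	once  sync.Once
	err   error
	calls int
}

// Start runs start at most once and remembers the error it returned.
func (s *starter) Start(start func() error) error {
	s.once.Do(func() {
		s.calls++
		s.err = start()
	})
	return s.err
}

// failingStart simulates a start that hits a transient error.
func failingStart() error {
	return errors.New("transient start failure")
}

func main() {
	s := &starter{}
	for i := 0; i < 3; i++ {
		// Every call returns the stored error; start is never invoked again.
		err := s.Start(failingStart)
		fmt.Println(err, "calls:", s.calls)
	}
	// Exiting the process on a non-nil error, as this PR does for kubelet,
	// lets an external supervisor restart it and retry from scratch.
}
```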
Automatic merge from submit-queue
Replica location planner for Federated ReplicaSet Controller
Requires #29385 to be merged.
cc: @quinton-hoole @wojtek-t
Automatic merge from submit-queue
network/cni: Unconditionally bring up `lo` interface
This is already done in kubenet. This specifically fixes an issue where a kubelet-managed network for the rkt runtime does not have an "UP" lo interface.
Fixes #28561
If this fix doesn't seem right, it could also be implemented by rkt effectively managing two "cni" network plugins, one for the user requested network, one for lo.
Follow-up CRs can improve unit testing further and then possibly remove the vendor directory logic (which seems like dead code).
cc @kubernetes/sig-rktnetes @kubernetes/sig-network @dcbw
Automatic merge from submit-queue
Kubelet: Pod level Resource Management
This proposal outlines our plan for improving resource management in Kubernetes by having a Cgroup hierarchy with QoS and Pod level Cgroups.
This is the initial proposal, which broadly covers our goals and how we plan to achieve them. At this point we would really appreciate feedback from the community.
This is tied to the upstream issue #5671, so I would request
@vishh @dchen1107 @bgrant0607 @jdef PTAL.
Automatic merge from submit-queue
TestLoadBalancer() should test v1, not v2
TestLoadBalancer() should test v1 and TestLoadBalancerV2() should test v2, but TestLoadBalancer() contains this code:
cfg.LoadBalancer.LBVersion = "v2"
Automatic merge from submit-queue
Extract kubelet node status into separate file
Extract kubelet node status management into a separate file as a continuation of the kubelet code simplification effort.
Automatic merge from submit-queue
Give the complete and correct path to client/kubectl/apiserver related
For client/kubectl, the paths "client/unversioned", "kubectl/describe.go", and "kubectl/stop.go" are hard to locate; it is better to prefix them with "pkg/".
For apiserver, "registry/daemon" should likewise be prefixed with "pkg/"; in addition, "daemon" does not exist and should be "daemonset".
Automatic merge from submit-queue
Remove duplicate prometheus metrics
This was a relic from before Kubernetes set Docker labels properly. Cadvisor now properly exposes the Docker labels (e.g. `io.kubernetes.pod.name` as `io_kubernetes_pod_name`, etc) so this is no longer required & actually results in unnecessary duplicate Prometheus labels.
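As an illustration of the label mapping mentioned above, here is a minimal sketch of turning a Docker label key into a Prometheus-compatible label name. `sanitizeLabelName` is a hypothetical helper written for this note, not cadvisor's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeLabelName replaces every character outside [a-zA-Z0-9_] with an
// underscore. This sketch does not handle names that begin with a digit.
func sanitizeLabelName(name string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9', r == '_':
			return r
		default:
			return '_'
		}
	}, name)
}

func main() {
	fmt.Println(sanitizeLabelName("io.kubernetes.pod.name")) // prints "io_kubernetes_pod_name"
}
```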
Automatic merge from submit-queue
cleanup wrong naming: limitrange -> hpa
The code is in `horizontalpodautoscaler/strategy.go`, but the parameter is named "limitrange". This is a legacy copy-paste issue.
Automatic merge from submit-queue
Syncing image pulling backoff logic
- Syncing the backoff logic in the parallel image puller and the sequential image puller to prepare for merging the two pullers into one.
- Moving image error definitions under kubelet/images
Automatic merge from submit-queue
make additional group RESTStorage registration easier
Starts factoring out `RESTStorage` creation to eventually allow for decoupled API group `RESTStorage` configuration.
Right now you can't add additional groups without modifying the main API group registration in master.go. This change allows the `master.Config` to hold a function that can build a `RESTStorage` based on the `Master` struct.
@lavalamp @caesarxuchao @kubernetes/sig-api-machinery
@liggitt @smarterclayton
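The idea of letting the config carry storage-building functions can be sketched roughly as follows. All types here (`Storage`, `Master`, `Config`, `RESTStorageProvider`) are simplified, hypothetical stand-ins and do not match the real Kubernetes signatures.

```go
package main

import "fmt"

// Storage is a hypothetical stand-in for a REST storage object.
type Storage interface{ Kind() string }

type simpleStorage struct{ kind string }

func (s simpleStorage) Kind() string { return s.kind }

// Master is a hypothetical, heavily reduced master struct.
type Master struct{ storagePrefix string }

// RESTStorageProvider builds the storage map for one API group from the Master.
type RESTStorageProvider func(m *Master) (group string, storage map[string]Storage)

// Config holds the providers, so new groups can be added without editing
// the central registration code.
type Config struct {
	StoragePrefix        string
	RESTStorageProviders []RESTStorageProvider
}

// New builds a Master and asks each provider for its group's storage.
func New(c Config) (*Master, map[string]map[string]Storage) {
	m := &Master{storagePrefix: c.StoragePrefix}
	groups := map[string]map[string]Storage{}
	for _, build := range c.RESTStorageProviders {
		group, storage := build(m)
		groups[group] = storage
	}
	return m, groups
}

// exampleGroup is a provider for a made-up API group.
func exampleGroup(m *Master) (string, map[string]Storage) {
	return "example.io", map[string]Storage{
		"widgets": simpleStorage{kind: m.storagePrefix + "/widgets"},
	}
}

func main() {
	_, groups := New(Config{
		StoragePrefix:        "/registry",
		RESTStorageProviders: []RESTStorageProvider{exampleGroup},
	})
	fmt.Println(groups["example.io"]["widgets"].Kind())
}
```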
Automatic merge from submit-queue
Validation logic applied to edited file
The file that is submitted via ``edit`` is now subject to the same
validation logic as any other file. The validation flags were added to
the ``edit`` command.
Fixes: #17542
Automatic merge from submit-queue
rkt: Fix /etc/hosts /etc/resolv.conf permissions
#29024 introduced copying /etc/hosts and /etc/resolv.conf before mounting them into rkt containers. However, the new files' permissions are set to 0640, which makes these files unusable by any user other than root in the container, as shown below. This small patch changes the permissions to 0644, as is typical.
```
# host rabbitmq
rabbitmq.default.svc.cluster.local has address 10.3.0.211
# ls -la /etc/resolv.conf
-rw-r-----. 1 root root 102 Jul 23 13:20 /etc/resolv.conf
# sudo -E -u foo bash
$ cat /etc/resolv.conf
cat: /etc/resolv.conf: Permission denied
$ host rabbitmq
;; connection timed out; no servers could be reached
# exit
# chmod 0644 /etc/resolv.conf /etc/hosts
# sudo -E -u foo host rabbitmq
rabbitmq.default.svc.cluster.local has address 10.3.0.211
```
cc @kubernetes/sig-rktnetes @yifan-gu @euank
Automatic merge from submit-queue
Break the loop when the object is found in removeOrphanFinalizer()
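A generic sketch of the early-exit pattern is shown below. `indexOf` is a hypothetical helper and not the body of removeOrphanFinalizer(); it only shows that breaking on the first match avoids scanning the remaining items.

```go
package main

import "fmt"

// indexOf returns the index of target in items (or -1) together with the
// number of items it had to inspect before stopping.
func indexOf(items []string, target string) (idx int, visited int) {
	idx = -1
	for i, item := range items {
		visited++
		if item == target {
			idx = i
			break // stop as soon as the object is found
		}
	}
	return idx, visited
}

func main() {
	idx, visited := indexOf([]string{"a", "orphan", "b", "c"}, "orphan")
	fmt.Println(idx, visited) // prints "1 2"
}
```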
Automatic merge from submit-queue
Eviction manager needs to start as runtime dependent module
To support disk eviction, the eviction manager needs to know if there is a dedicated device for the imagefs. To obtain that information, we need to start the eviction manager after cadvisor. This refactors where the eviction manager is started.
/cc @kubernetes/sig-node @kubernetes/rh-cluster-infra @vishh @ronnielai
Automatic merge from submit-queue
export KUBE_USER to salt (support custom usernames) for vagrant, vsph…
GCE/GKE were handled in #29164, AWS was handled in #29428. This should cover the rest of the configurations that use ABAC.
Automatic merge from submit-queue
Allow PVs to specify supplemental GIDs
Retry of https://github.com/kubernetes/kubernetes/pull/28691. Adds a Kubelet helper function for getting extra supplemental groups.
Automatic merge from submit-queue
Add parsing code in kubelet for eviction-minimum-reclaim
The kubelet parses the eviction-minimum-reclaim flag and validates it for correctness.
The first two commits are from https://github.com/kubernetes/kubernetes/pull/29329 which has already achieved LGTM.
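A rough sketch of this kind of flag parsing and validation follows. It assumes the flag value is a comma-separated list of `signal=quantity` pairs; that format, the example values, and `parseMinimumReclaim` itself are assumptions for illustration, and quantities are kept as plain strings rather than parsed resource quantities.

```go
package main

import (
	"fmt"
	"strings"
)

// parseMinimumReclaim parses an expression such as
// "memory.available=0Mi,imagefs.available=2Gi" (assumed format) into a map
// from signal name to quantity string. It rejects malformed and duplicate
// entries.
func parseMinimumReclaim(expr string) (map[string]string, error) {
	result := map[string]string{}
	if expr == "" {
		return result, nil
	}
	for _, pair := range strings.Split(expr, ",") {
		parts := strings.SplitN(pair, "=", 2)
		if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
			return nil, fmt.Errorf("invalid entry %q, expected signal=quantity", pair)
		}
		if _, dup := result[parts[0]]; dup {
			return nil, fmt.Errorf("signal %q specified more than once", parts[0])
		}
		result[parts[0]] = parts[1]
	}
	return result, nil
}

func main() {
	parsed, err := parseMinimumReclaim("memory.available=0Mi,imagefs.available=2Gi")
	fmt.Println(parsed, err)
}
```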
Automatic merge from submit-queue
kube-up: increase download timeout for kubernetes.tar.gz
Particularly on smaller instances on AWS, we were hitting the 80 second
timeout now that our image is well over the 1GB mark.
Increase the timeout from 80 seconds to 300 seconds.
Fix #29418
Automatic merge from submit-queue
Silence curl output
Removes the following from script output:
curl: (7) Failed to connect to 127.0.0.1 port 8080: Connection refused
The requirement that ExternalID returns InstanceNotFound when the
instance is not found was incorrectly documented on InstanceID and
InstanceType. This requirement arises from the node controller, which
is the only place that checks for the InstanceNotFound error.
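The contract described above can be sketched as follows. `ErrInstanceNotFound`, `Instances`, `fakeCloud`, and `nodeExists` are hypothetical names for this illustration and are not the cloud provider package's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrInstanceNotFound is a hypothetical sentinel standing in for InstanceNotFound.
var ErrInstanceNotFound = errors.New("instance not found")

// Instances is a reduced, hypothetical version of the cloud instances interface.
type Instances interface {
	// ExternalID must return ErrInstanceNotFound when the instance does not exist.
	ExternalID(name string) (string, error)
}

type fakeCloud struct{ ids map[string]string }

func (c fakeCloud) ExternalID(name string) (string, error) {
	id, ok := c.ids[name]
	if !ok {
		return "", ErrInstanceNotFound
	}
	return id, nil
}

// nodeExists mirrors how a caller such as a node controller might tell a
// missing instance apart from any other failure.
func nodeExists(cloud Instances, name string) (bool, error) {
	_, err := cloud.ExternalID(name)
	if errors.Is(err, ErrInstanceNotFound) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	cloud := fakeCloud{ids: map[string]string{"node-a": "i-123"}}
	fmt.Println(nodeExists(cloud, "node-a"))
	fmt.Println(nodeExists(cloud, "node-b"))
}
```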