Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
[Part 1] Remove docker dep in kubelet startup
**What this PR does / why we need it**:
Remove dependency of docker during kubelet start up.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Part 1 of #54090
**Special notes for your reviewer**:
Changes include:
1. Move docker client initialization into dockershim pkg.
2. Pass a docker `ClientConfig` from kubelet to dockershim
3. Pass parameters needed by `FakeDockerClient` thru `ClientConfig` to dockershim
(TODO, the second part) Make dockershim tolerate when dockerd is down, otherwise it will still fail kubelet
Please note after this PR, kubelet will still fail if dockerd is down, this will be fixed in the subsequent PR by making dockershim tolerate dockerd failure (initializing docker client in a separate goroutine), and refactoring cgroup and log driver detection.
**Release note**:
```release-note
Remove docker dependency during kubelet start up
```
Automatic merge from submit-queue (batch tested with PRs 54460, 55258, 54858, 55506, 55510). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
redendancy code and error log message in cni
**What this PR does / why we need it**:
redendancy code and error log message in cni
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/sig-node
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubenet: disable DAD in the container.
Since kubenet externally guarantees that IP address will not conflict, we can short-circuit the kernel's normal wait. This lets us avoid the 1 second network wait.
**What this PR does / why we need it**:
Fixes the pod startup latency identified in #54651 and #55060
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Support copying "options" in resolv.conf into pod sandbox when dnsPolicy is Default
**What this PR does / why we need it**:
This PR adds support for copying "options" from host's /etc/resolv.conf (or --resolv-conf) into pod's resolv.conf when dnsPolicy is Default. Being able to customize options is important because it is common to leverage options to fine-tune the behavior of DNS client.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#42542
**Special notes for your reviewer**:
I originally wanted to also tackle the issue of copying options for when dnsPolicy is ClusterFirst, but with ability to "merge" with default options (ndots:5 more specifically) when it makes sense. I decided to leave it off for now because the "merging" may need more discussions. Happy to add that to this PR or create another PR for that if it makes sense and is clear what should be done. I think even when dnsPolicy is ClusterFirst it is important to allow customization.
**Release note**:
```kubelet: add support for copying "options" from /etc/resolv.conf (or --resolv-conf if it is used) into pod's /etc/resolv.conf when dnsPolicy is Default.```
Automatic merge from submit-queue (batch tested with PRs 53747, 54528, 55279, 55251, 55311). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
a typo in dockershim.cm.containerManager.doWork
Signed-off-by: yanxuean <yan.xuean@zte.com.cn>
**What this PR does / why we need it**:
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
/sig node
Automatic merge from submit-queue (batch tested with PRs 54177, 55203, 55120, 55275, 55260). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: provide an option to disable docker's live-restore
**What this PR does / why we need it**:
Provide an option to disable docker's live-restore for COS/ubuntu images on GCE. Some newer COS images have live-restore enabled by default. This allows users to override the option if needed.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
GCE: provide an option to disable docker's live-restore on COS/ubuntu
```
Automatic merge from submit-queue (batch tested with PRs 55331, 55272, 55228, 49763, 55242). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
use versiond group clients from client-go
**What this PR does / why we need it**:
Some **Deprecated** group clients are still used, replace them with versioned group clients.
**Which issue this PR fixes**: fixes#49760
**Special notes for your reviewer**:
/assign @caesarxuchao
**Release note**:
```release-note
NONE
```
Since kubenet externally guarantees that IP address will not conflict,
we can short-circuit the kernel's normal wait. This lets us avoid the 1
second network wait.
Automatic merge from submit-queue (batch tested with PRs 55114, 52976, 54871, 55122, 55140). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Make CRI logs parsing to a library
**What this PR does / why we need it**:
Make CRI logs parsing to a library.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#55136
**Special notes for your reviewer**:
**Release note**:
```release-note
Add CRI log parsing library at pkg/kubelet/apis/cri/logs
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Clean up redundant DNS related codes
**What this PR does / why we need it**:
As https://github.com/kubernetes/kubernetes/pull/54773#discussion_r148904955 described, resolv.conf setup for pod is handled by `generatePodSandboxConfig()`, though we have some redundant DNS related codes in `GenerateRunContainerOptions()` which seems to have no effect.
This PR cleans up the ineffective codes and rearranges the cluster DNS unit test and hopefully it would be less confusing.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#55201
**Special notes for your reviewer**:
cc @Random-Liu @phsiao
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 55034, 55068). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Close the file before renaming in FileStore
Also change the unit test to use a real file system to detect errors
like this.
Automatic merge from submit-queue (batch tested with PRs 53679, 51063). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fixes to enable Windows CNI
**What this PR does / why we need it**:
This PR has fixed which enables Kubelet to use Windows CNI plugin.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
#49646
**Special notes for your reviewer**:
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 55050, 53464, 54936, 55028, 54928). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix panic in kubelet because of uninitialized map
**What this PR does / why we need it**:
Initialized the uninitialized map in kubelet
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes [#54927](https://github.com/kubernetes/kubernetes/issues/54927)
**Special notes for your reviewer**:
The default value of --enable-controller-attach-detach is true, map will be initialized like:
```
if kl.enableControllerAttachDetach {
if node.Annotations == nil {
node.Annotations = make(map[string]string)
}
...
}
```
if set --enable-controller-attach-detach to false, map will have no Initialized.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 55050, 53464, 54936, 55028, 54928). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubelet: dockershim: remove orphaned checkpoint files
Fixes https://github.com/kubernetes/kubernetes/issues/55070
Currently, `ListPodSandbox()` returns a combined list of sandboxes populated from both the runtime and the dockershim checkpoint files. However the sandboxes in the checkpoint files might not exist anymore.
The kubelet sees the sandbox returned by `ListPodSandbox()` and determines it shouldn't be running and calls `StopPodSandbox()` on it. This generates an error when `StopContainer()` is called as the container does not exist. However the checkpoint file is not cleaned up. This leads to subsequent calls to `StopPodSandbox()` that fail in the same way each time.
This PR removes the checkpoint file if StopContainer fails due to container not found.
The only other place `RemoveCheckpoint()` is called, except if it is corrupt, is from `RemoveSandbox()`. If the container does not exist, what `RemoveSandbox()` would have done has been effectively been done already. So this is just clean up.
@derekwaynecarr @eparis @freehan @dcbw
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
StopPodSandbox should not log when container is already removed
**What this PR does / why we need it**:
StopPodSandbox should not log when a container is already gone. It should only log if it could not stop and the container was still present.
Fixes https://github.com/kubernetes/kubernetes/issues/55021
**Special notes for your reviewer**:
This was seen in our production logs, need to eliminate spam.
**Release note**:
```release-note
NONE
```
Following are part of this commit
+++++++++++++++++++++++++++++++++
* Windows CNI Support
(1) Support to use --network-plugin=cni
(2) Handled platform requirement of calling CNI ADD for all the containers.
(2.1) For POD Infra container, netNs has to be empty
(2.2) For all other containers, sharing the network namespace of POD container,
should pass netNS name as "container:<Pod Infra Container Id>", same as the
NetworkMode of the current container
(2.3) The Windows CNI plugin has to handle this to call into Platform.
Sample Windows CNI Plugin code to be shared soon.
* Sandbox support for Windows
(1) Sandbox support for Windows. Works only with Docker runtime.
(2) Retained CONTAINER_NETWORK as a backward compatibilty flag,
to not break existing deployments using it.
(3) Works only with CNI plugin enabled.
(*) Changes to reinvoke CNI ADD for every new container created. This is hooked up with PodStatus,
but would be ideal to move it outside of this, once we have CNI GET support
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add admission handler for device resources allocation
**What this PR does / why we need it**:
Add admission handler for device resources allocation to fail fast during pod creation
**Which issue this PR fixes**
fixes#51592
**Special notes for your reviewer**:
@jiayingz Sorry, there is something wrong with my branch in #51895. And I think the existing comments in the PR might be too long for others to view. So I closed it and opened the new one, as we have basically reach an agreement on the implement :)
I have covered the functionality and unit test part here, and would set about the e2e part ASAP
/cc @jiayingz @vishh @RenaudWasTaken
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add a file store utility package in kubelet
More and more components checkpoints (i.e., persist their states) in
kubelet. Refurbish and move the implementation in dockershim to a
utility package to improve code reusability.
Automatic merge from submit-queue (batch tested with PRs 52367, 53363, 54989, 54872, 54643). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Lift embedded structure out of ManifestURLHeader field
Related: #53833
```release-note
It is now possible to set multiple manifest url headers via the Kubelet's --manifest-url-header flag. Multiple headers for the same key will be added in the order provided. The ManifestURLHeader field in KubeletConfiguration object (kubeletconfig/v1alpha1) is now a map[string][]string, which facilitates writing JSON and YAML files.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
kubenet: yield lock while executing CNI plugin.
The CNI plugin can take up to 3 seconds to execute. CNI plugins can safely be
executed in parallel, so yield the lock to speed up pod creation.
This caused problems with the pod latency tests - previously, CNI plugins executed
in under 20ms. Now they must wait for DAD to finish and addresses to leave
tentative state.
Fixes: #54651
**What this PR does / why we need it**:
After upgrading CNI plugins to v0.6 in #51250, the pod latency tests began failing. This is because the plugins, in order to support IPv6, need to wait for DAD to finish. Because this
delay is while the kubenet lock is held, it significantly slows down the pod creation rate.
**Special notes for your reviewer**:
The CNI plugins also do locking for their critical paths, so it is safe to run them concurrently.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 54894, 54630, 54828, 54926, 54865). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
set leveled logging (v=4) for 'updating container' message
**What this PR does / why we need it**:
Currently cpu_manager.go logs a line for every pod at every reconcilePeriod (10 sec default) when it reconciles and updates the pod's cpuset setting. This creates a lot of logging information that is not very interesting and we should suppress that by default by increasing the logging level.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#54804
**Special notes for your reviewer**:
I chose V(4) because that seems to be a popular level for messages at this detail. Happy to follow logging guideline if there is any.
**Release note**:
``` kubelet: cpu_manager logs informative reconcile message at V(4) to reduce clutter ```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update volume OWNERS to reflect active sig-storage reviewers
**What this PR does / why we need it**:
Update sig-storage reviewers to add new members and remove those that don't have as much time to review storage PRs. Approvers are unchanged.
**Special notes for your reviewer**:
For all those that have been removed, please approve. If you want to remain as a reviewer, let me know and I will add you back.
**Release note**:
NONE
Automatic merge from submit-queue (batch tested with PRs 53962, 54708). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Prevent successful containers from restarting with OnFailure restart policy
**What this PR does / why we need it**:
This is a follow-on to #54597 which makes sure that its validation
also applies to pods with a restart policy of OnFailure. This
deficiency was pointed out by @smarterclayton here:
https://github.com/kubernetes/kubernetes/pull/54530#discussion_r147226458
**Which issue this PR fixes** This is another fix to address #54499
**Release note**:
```release-note
NONE
```