Commit Graph

114639 Commits

Author SHA1 Message Date
Joe Betz
7bbda746fe Implement secondary authz 2023-03-06 12:08:14 -05:00
Jan Safranek
9ca548fcf0 Add metrics for force cleaned mounts after failed reconstruction
Count nr. of force cleaned mounts + their failures after a volume fails
reconstruction.
2023-03-06 17:48:59 +01:00
Kubernetes Prow Robot
0270fc75d0
Merge pull request #116256 from atiratree/unmanaged-pods
improve message, log level and testing for unmanaged pods in disruption controller
2023-03-06 08:19:14 -08:00
Kubernetes Prow Robot
ff27ccfabc
Merge pull request #116255 from danwinship/controller-manager-ipv6dualstack
Belatedly remove controller-manager IPv6DualStack feature gate
2023-03-06 08:19:05 -08:00
Kubernetes Prow Robot
6bfa9371cf
Merge pull request #115978 from seans3/discovery-empty-response
"empty response" not logged as error in memcache discovery client
2023-03-06 08:18:56 -08:00
Kubernetes Prow Robot
388ad23561
Merge pull request #115968 from stlaz/sc_accessors
add SeccompProfile to Pod and Container accessors/mutators
2023-03-06 08:18:41 -08:00
Kubernetes Prow Robot
d6e9cff212
Merge pull request #115838 from torredil/remove-aws
Remove AWS legacy cloud provider + EBS in-tree storage plugin
2023-03-06 08:18:29 -08:00
Kubernetes Prow Robot
778b24c97e
Merge pull request #116297 from p0lyn0mial/upstream-reflector-list-n-watch-refactor
reflector: extract watch and startResyncAsync methods
2023-03-06 07:10:41 -08:00
Kubernetes Prow Robot
890d39f976
Merge pull request #114640 from swatisehgal/handle-device-mgr-recovery
node: device-mgr: Handle recovery flow by checking if healthy devices exist
2023-03-06 07:10:28 -08:00
Kubernetes Prow Robot
4185bf7238
Merge pull request #116273 from sourcelliu/improveresource
Improve the performance when Resource Clone
2023-03-06 06:06:41 -08:00
Kubernetes Prow Robot
68eea2468c
Merge pull request #114572 from huyinhou/fix-concurrent-map-access
kubelet/deviceplugin: fix concurrent map iteration and map write
2023-03-06 06:06:29 -08:00
torredil
6aebda9b1e Remove AWS legacy cloud provider + EBS in-tree storage plugin
Signed-off-by: torredil <torredil@amazon.com>
2023-03-06 14:01:15 +00:00
Swati Sehgal
937d330393 node: topologymgr: Remove ResourceAllocator as TM is always enabled
With Topology Manager enabled by default, we no longer need
`resourceAllocator` as Topology Manager serves as the main
PodAdmitHandler completely responsible for admission check
based on hints received from the hintProviders and the
subsequent allocation of the corresponding resources to a
pod as can be seen here:
https://github.com/kubernetes/kubernetes/blob/v1.26.0/pkg/kubelet/cm/topologymanager/scope.go#L150

With regard to DRA, the passing of `cm.draManager` into
resourceAllocator seems redundant as no admission checks
(and allocation of resources handled by DRA) is taking place
in `Admit` method of resourceAllocator. DRA has a completely
different model to the rest of the resource managers where
pod is only scheduled on a node once resources are reserved
for it. Because of this, admission checks or waiting for
resources to be provisioned after the pod has been scheduled
on the node is not required.

Before making the above change, it was verified that DRA Manager
is instantiated in `NewContainerManager`:
https://github.com/kubernetes/kubernetes/blob/v1.26.0/pkg/kubelet/cm/container_manager_linux.go#L318

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-06 12:51:11 +00:00
Swati Sehgal
6a62f0236a node: topologymgr: trivial internal variable renaming
Since Topology manager is graduating to GA, we remove
internal configuration variable names with `Experimental`
prefix.

There is no expected change in behavior, only trival
variable renaming.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-06 12:51:11 +00:00
Swati Sehgal
d536a342b4 node: topologymgr: GA graduation implies Feature Gate is ON by default
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-06 12:51:05 +00:00
Swati Sehgal
04438aa6f8 node: topologymgr: Graduate Kubelet Topology Manager to GA
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-06 12:47:59 +00:00
Lukasz Szaszkiewicz
34fe27355b reflector: extract watch and startResyncAsync methods 2023-03-06 13:40:35 +01:00
Kubernetes Prow Robot
30df862563
Merge pull request #115119 from seans3/openapi-query-param-v3
Open API V3 version of QueryParamVerifier
2023-03-06 04:40:29 -08:00
Swati Sehgal
01a9148887 node: device-mgr: e2e: adapt to sample device plugin refactoring
These updates are to adapt to the sample device plugin
refactoring done here: 92e00203e0.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-06 12:15:59 +00:00
Swati Sehgal
bae8a164e0 node: device-mgr: e2e: address e2e test review comments
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-06 12:15:58 +00:00
Swati Sehgal
674879a959 node: device-mgr: e2e: Update the e2e test to reproduce issue:109595
Breakdown of the steps implemented as part of this e2e test is as follows:
1. Create a file `registration` at path `/var/lib/kubelet/device-plugins/sample/`
2. Create sample device plugin with an environment variable with
   `REGISTER_CONTROL_FILE=/var/lib/kubelet/device-plugins/sample/registration` that
    waits for a client to delete the control file.
3. Trigger plugin registeration by deleting the abovementioned directory.
4. Create a test pod requesting devices exposed by the device plugin.
5. Stop kubelet.
6. Remove pods using CRI to ensure new pods are created after kubelet restart.
7. Restart kubelet.
8. Wait for the sample device plugin pod to be running. In this case,
   the registration is not triggered.
9. Ensure that resource capacity/allocatable exported by the device plugin is zero.
10. The test pod should fail with `UnexpectedAdmissionError`
11. Delete the test pod.
12. Delete the sample device plugin pod.
13. Remove `/var/lib/kubelet/device-plugins/sample/` and its content, the directory
    created to control registration

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-06 12:15:58 +00:00
Swati Sehgal
db7afc1cd8 node: device-mgr: e2e: Implement End to end test
This commit reuses e2e tests implmented as part of https://github.com/kubernetes/kubernetes/pull/110729.
The commit is borrowed from the aforementioned PR as is to preserve
authorship. Subsequent commit will update the end to end test to
simulate the problem this PR is trying to solve by reproducing
the issue: 109595.

Co-authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-06 11:52:23 +00:00
Swati Sehgal
5b2a3dbbdc node: device-mgr: explicitly check if pre-allocated devices are healthy
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-06 11:52:23 +00:00
Swati Sehgal
a799ffb571 node: device-mgr: unit-tests: admission failure due to unhealthy devices
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-06 11:52:23 +00:00
Swati Sehgal
7ac399c205 node: device-mgr: Handle recovery by checking if healthy devices exist
In case of node reboot/kubelet restart, the flow of events involves
obtaining the state from the checkpoint file followed by setting
the `healthDevices`/`unhealthyDevices` to its zero value. This is
done to allow the device plugin to re-register itself so that
capacity can be updated appropriately.

During the allocation phase, we need to check if the resources requested
by the pod have been registered AND healthy devices are present on
the node to be allocated.

Also we need to move this check above `needed==0` where needed is
required - devices allocated to the container (which is obtained from
the checkpoint file) because even in cases where no additional devices
have to be allocated (as they were pre-allocated), we still need to
make the devices that were previously allocated are healthy.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-03-06 11:52:23 +00:00
Kubernetes Prow Robot
b6acf6f805
Merge pull request #116294 from p0lyn0mial/upstream-flaky-getcurrentrvfromstorage
cacher: deflake TestGetCurrentResourceVersionFromStorage
2023-03-06 03:36:30 -08:00
Wojciech Tyczyński
280651abcc Autogenerated 2023-03-06 12:08:34 +01:00
Wojciech Tyczyński
760acbbbe3 Bump QPS limits for Kubelet 2023-03-06 12:07:52 +01:00
SataQiu
528a471302 remove unused resize.go from pkg/kubelet/container 2023-03-06 18:33:13 +08:00
Lukasz Szaszkiewicz
8fd9d573f0 cacher: deflake TestGetCurrentResourceVersionFromStorage 2023-03-06 11:30:39 +01:00
Kubernetes Prow Robot
b8aaaf380a
Merge pull request #116083 from SataQiu/clean-20230227
kubelet: remove unused DockerID type
2023-03-06 02:22:58 -08:00
Alexander Constantinescu
ec917850af Add proxy healthz result to ETP=local health check
Today, the health check response to the load balancers asking Kube-proxy for
the status of ETP:Local services does not include the healthz state of Kube-
proxy. This means that Kube-proxy might indicate to load balancers that they
should forward traffic to the node in question, simply because the endpoint
is running on the node - this overlooks the fact that Kube-proxy might be
not-healthy and hasn't successfully written the rules enabling traffic to
reach the endpoint.
2023-03-06 10:53:17 +01:00
vinay kulkarni
b0dce923f1 Add Get interfaces for container's checkpointed ResourcesAllocated and Resize values, remove error logging for valid standalone kubelet scenario 2023-03-06 09:50:12 +00:00
Kubernetes Prow Robot
931e07de16
Merge pull request #116284 from thockin/codegen_subprojects_cleanup_verify
Codegen: subprojects: clean up verify scripts
2023-03-06 00:14:59 -08:00
Alex Wang
13b941e120 feat: graduate matchLabelKeys in podTopologySpread to beta 2023-03-06 14:46:17 +08:00
huyinhou
88274d96fc update code style
Signed-off-by: huyinhou <huyinhou@bytedance.com>
2023-03-06 14:23:14 +08:00
csDengh
f762145e06
minor code improvement
minor code improvement 
from repeated assignments in loops to initialize outside the loop
2023-03-06 09:00:40 +08:00
Kensei Nakada
608f4808ff support PreFilter as well 2023-03-06 00:48:30 +00:00
Tim Hockin
357bfbc436
Codegen: subprojects: clean up verify scripts
They all run successfully.
2023-03-05 15:05:26 -08:00
Kubernetes Prow Robot
fafa45d13c
Merge pull request #116279 from bart0sh/PR105-fix-CDI-spec-version
DRA: fix CDI spec version
2023-03-05 12:22:57 -08:00
Ed Bartosh
35fd124f4d DRA: fix CDI spec version
The latest CDI release includes spec version check that fails
if version is less than 0.3.0:
  https://github.com/container-orchestrated-devices/container-device-interface/blob/v0.5.4/pkg/cdi/version.go#L42

Updating CDI spec version to 0.3.0 in the test kubelet plugin code
should fix e2e test failures on the CRI runtimes that use CDI >= 0.5.4
(Containerd master atm, CRI-O soon).
2023-03-05 16:49:56 +02:00
Kubernetes Prow Robot
bbbbfcd967
Merge pull request #116266 from SergeyKanzhelev/ExperimentalPodPidsLimit
rename ExperimentalPodPidsLimit to PodPidsLimit
2023-03-05 06:30:56 -08:00
Mateusz Puczyński
d1877f514a
adjust comment prefixes in k8s.io/api/apps/v1beta1/types.go 2023-03-04 21:20:24 +01:00
Mateusz Puczyński
f74724a3f4
update obsolete links 2023-03-04 19:57:52 +01:00
mantuliu
83fdbd76a1 Improve the performance when Resource Clone
Signed-off-by: mantuliu <240951888@qq.com>
2023-03-05 00:35:51 +08:00
vinay kulkarni
12435b26fc Fix nil pointer access panic in kubelet from uninitialized pod allocation checkpoint manager in standalone kubelet scenario 2023-03-04 08:07:40 +00:00
Kubernetes Prow Robot
d48b8167f7
Merge pull request #115463 from SergeyKanzhelev/containerStatusDocs
update docs for ContainerStatus fields
2023-03-03 20:17:06 -08:00
Yoon Park
8d2c81e7ec Fix comments at fit_test.go to increase redability 2023-03-04 13:03:15 +09:00
SataQiu
eb541bb819 controller-manager: fix a bug that the kubeconfig field of kubecontrollermanager.config.k8s.io configuration is not set correctly 2023-03-04 11:17:55 +08:00
Sergey Kanzhelev
04189b1fc4 rename ExperimentalPodPidsLimit to PodPidsLimit 2023-03-04 01:48:16 +00:00