Commit Graph

115766 Commits

Author SHA1 Message Date
Swati Sehgal
9697573703 node: device-mgr: e2e: adapt to sample device plugin refactoring
These updates are to adapt to the sample device plugin
refactoring done here: 92e00203e0.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-04-28 14:43:25 +01:00
Swati Sehgal
282a6a80b9 node: device-mgr: e2e: Update the e2e test to reproduce issue:109595
Breakdown of the steps implemented as part of this e2e test is as follows:
1. Create a file `registration` at path `/var/lib/kubelet/device-plugins/sample/`
2. Create the sample device plugin with the environment variable
   `REGISTER_CONTROL_FILE=/var/lib/kubelet/device-plugins/sample/registration` set,
   so that it waits for a client to delete the control file before registering.
3. Trigger plugin registration by deleting the abovementioned control file.
4. Create a test pod requesting devices exposed by the device plugin.
5. Stop kubelet.
6. Remove pods using CRI to ensure new pods are created after kubelet restart.
7. Restart kubelet.
8. Wait for the sample device plugin pod to be running. In this case,
   the registration is not triggered.
9. Ensure that resource capacity/allocatable exported by the device plugin is zero.
10. The test pod should fail with `UnexpectedAdmissionError`.
11. Delete the test pod.
12. Delete the sample device plugin pod.
13. Remove `/var/lib/kubelet/device-plugins/sample/` and its content (the directory
    created to control registration).

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-04-28 14:43:17 +01:00
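Steps 1 to 3 of commit 282a6a80b9 above rely on a control file that gates plugin registration. The following is a minimal sketch of that idea, not the sample device plugin's actual code: it polls for the file's removal, which is a technique chosen here for simplicity, and the function name `waitForRemoval`, the temporary directory, and the timings are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"time"
)

// waitForRemoval is a hypothetical helper: it polls until path no longer
// exists, and returns an error if the timeout expires first.
func waitForRemoval(path string, interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		_, err := os.Stat(path)
		if errors.Is(err, fs.ErrNotExist) {
			return nil // control file is gone: registration may proceed
		}
		if err != nil {
			return err
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("timed out waiting for %s to be removed", path)
		}
		time.Sleep(interval)
	}
}

func main() {
	// A temporary directory stands in for /var/lib/kubelet/device-plugins/sample/.
	dir, err := os.MkdirTemp("", "sample")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Step 1: create the control file.
	control := filepath.Join(dir, "registration")
	if err := os.WriteFile(control, nil, 0o600); err != nil {
		panic(err)
	}

	// Step 3: a client (here, a goroutine) deletes the control file.
	go func() {
		time.Sleep(50 * time.Millisecond)
		os.Remove(control)
	}()

	// Step 2: the plugin side blocks until the file is deleted.
	if err := waitForRemoval(control, 10*time.Millisecond, 2*time.Second); err != nil {
		panic(err)
	}
	fmt.Println("control file removed; plugin would register now")
}
```

If the control file is never deleted, as in step 8 after the kubelet restart, the plugin in this sketch would not register, which is the condition the test needs in order to reproduce the issue.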
Swati Sehgal
d509e79837 node: device-mgr: e2e: Implement End to end test
This commit reuses e2e tests implemented as part of https://github.com/kubernetes/kubernetes/pull/110729.
The commit is borrowed from the aforementioned PR as is to preserve
authorship. A subsequent commit will update the end to end test to
simulate the problem this PR is trying to solve by reproducing
issue 109595.

Co-authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-04-28 14:41:38 +01:00
Swati Sehgal
dc1a592632 node: device-mgr: Handle recovery by checking if healthy devices exist
In case of a node reboot or kubelet restart, the flow of events involves
obtaining the state from the checkpoint file followed by setting
`healthyDevices`/`unhealthyDevices` to their zero values. This is
done to allow the device plugin to re-register itself so that
capacity can be updated appropriately.

During the allocation phase, we need to check if the resources requested
by the pod have been registered AND healthy devices are present on
the node to be allocated.

We also need to move this check above the `needed==0` check, where `needed` is
the number of required devices minus the devices already allocated to the
container (obtained from the checkpoint file). Even in cases where no additional
devices have to be allocated (as they were pre-allocated), we still need to
make sure the devices that were previously allocated are healthy.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2023-04-28 14:41:30 +01:00
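The ordering described in commit dc1a592632 above can be sketched as follows. This is a simplified illustration under stated assumptions, not the kubelet's device manager code: the `deviceState` type, its fields, the error values, and the resource name are hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// deviceState is a hypothetical, simplified stand-in for the device
// manager's bookkeeping; it is not the real kubelet type.
type deviceState struct {
	// healthy maps a resource name to the IDs of its healthy devices.
	// It is empty after a kubelet restart until plugins re-register.
	healthy map[string]map[string]struct{}
	// allocated maps a resource name to the device IDs already recorded
	// for the container in the checkpoint file.
	allocated map[string][]string
}

var (
	errNotRegistered    = errors.New("resource not registered")
	errNoHealthyDevices = errors.New("no healthy devices present")
)

// devicesToAllocate returns how many additional devices are needed.
// The registration and health checks run before the "needed == 0"
// shortcut, so pre-allocated devices are not trusted blindly.
func (s *deviceState) devicesToAllocate(resource string, required int) (int, error) {
	healthy, registered := s.healthy[resource]
	if !registered {
		return 0, fmt.Errorf("%s: %w", resource, errNotRegistered)
	}
	if len(healthy) == 0 {
		return 0, fmt.Errorf("%s: %w", resource, errNoHealthyDevices)
	}
	needed := required - len(s.allocated[resource])
	if needed <= 0 {
		return 0, nil // everything was pre-allocated and devices are healthy
	}
	return needed, nil
}

func main() {
	const resource = "example.com/device" // hypothetical resource name
	s := &deviceState{
		healthy:   map[string]map[string]struct{}{},
		allocated: map[string][]string{resource: {"dev-0"}},
	}

	// Right after a restart: the checkpoint says dev-0 is allocated,
	// but the plugin has not re-registered, so admission should fail.
	_, err := s.devicesToAllocate(resource, 1)
	fmt.Println("before re-registration:", err)

	// After the plugin re-registers and reports a healthy device.
	s.healthy[resource] = map[string]struct{}{"dev-0": {}}
	needed, err := s.devicesToAllocate(resource, 1)
	fmt.Println("after re-registration:", needed, err)
}
```

Had the `needed <= 0` shortcut come first, the first call would have succeeded on the strength of the checkpoint alone, which is the situation the commit message argues against.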
Kubernetes Prow Robot
f66e1a3386
Merge pull request #116685 from czybjtu/fix_lease_remove_endpoints
Remove last endpoint for kubernetes Service during graceful shutdown of final kube-apiserver
2023-04-28 06:02:16 -07:00
Paweł Banaszewski
53c9103a1d Set ENABLE_AUTH_PROVIDER_GCP to true in gce tests 2023-04-28 11:47:08 +00:00
Kubernetes Prow Robot
9924dc65b7
Merge pull request #117614 from chendave/multi_cri
kubeadm: fix unit test failure on node with multiple cri endpoints
2023-04-28 01:30:16 -07:00
Dave Chen
2572a43034 kubeadm: fix unit test failure on node with multiple cri endpoints
Signed-off-by: Dave Chen <dave.chen@arm.com>
2023-04-28 15:00:16 +08:00
carlory
f81b49d873 Remove ability to re-enable serving deprecated eventv1beta1 APIs 2023-04-28 14:58:59 +08:00
carlory
48d01d6d9b Remove ability to re-enable serving deprecated batchapiv1beta1 APIs 2023-04-28 14:08:31 +08:00
Paco Xu
c6f4bee98d kubeadm: add deprecated FG UpgradeAddonsBeforeControlPlane 2023-04-28 13:55:46 +08:00
Kubernetes Prow Robot
28247d53d2
Merge pull request #117595 from mowangdk/cleanup/cleanup_new_added_tests
Chore: add ipfamilies tweak functions
2023-04-27 22:46:15 -07:00
Shiming Zhang
4d9261c756 Fix LocationOfOrigin shows up unexpectedly 2023-04-28 13:04:04 +08:00
Kubernetes Prow Robot
7077491f68
Merge pull request #117237 from yulng/cleanpolicy
Remove GA feature gates in 1.28 about network
2023-04-27 20:48:15 -07:00
Akhil Mohan
76fe41a996
chore: update cgroups and ttrpc versions
- update github.com/containerd/cgroups to v1.1.0
- update github.com/containerd/ttrpc to v1.2.1

Signed-off-by: Akhil Mohan <akhilerm@gmail.com>
2023-04-27 20:46:23 -07:00
yulng
0dbeff4b6e
remove GA feature gates in 1.28 about network
Signed-off-by: yulng <wei.yang@daocloud.io>
2023-04-28 10:51:37 +08:00
Kubernetes Prow Robot
0d0870e5b4
Merge pull request #117654 from SergeyKanzhelev/initContainerTests
added init containers tests to simplify the sidecar KEP large PR
2023-04-27 16:46:15 -07:00
Sergey Kanzhelev
fc0d2cd32f added init containers tests to simplify the sidecar KEP large PR 2023-04-27 22:42:49 +00:00
Dan Winship
258c4c4251 Remove duplicated config fields from ProxyServer
Rather than duplicating some of the KubeProxyConfiguration into
ProxyServer, just store the KubeProxyConfiguration itself so later
code can reference it directly.

For the fields that get platform-specific defaults (Mode,
DetectLocalMode), fill the defaults directly into the
KubeProxyConfiguration rather than keeping the original there and the
defaulted version in the ProxyServer.
2023-04-27 15:43:35 -04:00
Dan Winship
9d4f10f5d2 Fix up detect-local-mode validation
Validate the --detect-local-mode value in the API object validation
rather than doing it separately later. Also, remove runtime checks and
unit tests for cases that would be blocked by validation.
2023-04-27 15:43:35 -04:00
Kubernetes Prow Robot
00eee07272
Merge pull request #117641 from wojtek-t/cleanup_cacher_tests_847
Refactor some watchcache tests
2023-04-27 12:28:41 -07:00
Kubernetes Prow Robot
299db84401
Merge pull request #117057 from ffromani/e2e-device-plugin-test-fixes
node: e2e device plugin test improvements
2023-04-27 12:28:34 -07:00
Kubernetes Prow Robot
b44482a37c
Merge pull request #116797 from mengjiao-liu/contextual-looging-scheduler-plugin-podtopologyspread
Migrated `pkg/scheduler/framework/plugins/podtopologyspread` to contextual logging
2023-04-27 12:28:27 -07:00
Kubernetes Prow Robot
a38efaccc0
Merge pull request #116748 from mengjiao-liu/contextual-logging-scheduler-plugin-noderesource
Migrated `pkg/scheduler/framework/plugins/noderesources` to contextual logging
2023-04-27 12:28:15 -07:00
Lars Ekman
5ece6541b8 proxy/ipvs: don't bind nodeips to the dummy device 2023-04-27 21:02:25 +02:00
Lars Ekman
5310305098 proxy/ipvs: add a GetAllLocalAddressesExcept() function 2023-04-27 21:02:20 +02:00
Kubernetes Prow Robot
3a15029a95
Merge pull request #117643 from humblec/etcd
update the etcd base image to v1.4.2
2023-04-27 11:10:27 -07:00
Kubernetes Prow Robot
5170c25609
Merge pull request #116835 from mengjiao-liu/contextual-logging-scheduler-plugin-preemption
Migrated `pkg/scheduler/framework/preemption & defaultpreemption` to use contextual logging
2023-04-27 11:10:16 -07:00
Kubernetes Prow Robot
e8108b5a47
Merge pull request #117651 from humblec/cluster
use go 1.19.x for etcd version monitor compilation
2023-04-27 09:16:16 -07:00
Humble Chirammal
f24d1d2c95 use go 1.19.x for etcd version monitor compilation
Signed-off-by: Humble Chirammal <humble.devassy@gmail.com>
2023-04-27 20:21:00 +05:30
Kubernetes Prow Robot
926bb80f5d
Merge pull request #117644 from humblec/etcd-1
correct etcd base image reference in the doc
2023-04-27 07:22:17 -07:00
Humble Chirammal
91df71be54 correct etcd base image reference in the doc
Signed-off-by: Humble Chirammal <humble.devassy@gmail.com>
2023-04-27 18:11:39 +05:30
Kubernetes Prow Robot
041bb9a56c
Merge pull request #117534 from Mskxn/fix_watcher
stop watcher when error occurs
2023-04-27 04:56:14 -07:00
Humble Chirammal
6b40cd8cd3 update test/conformance/image version to v1.4.2
Signed-off-by: Humble Chirammal <humble.devassy@gmail.com>
2023-04-27 17:08:04 +05:30
Humble Chirammal
6c8be35fa8 update the etcd base image to v1.4.2
The current base image, v1.3.0, has many CVEs[1] which are addressed in the
latest bullseye versions.

[1] ex:
CVE-2022-2509
CVE-2021-46828

Signed-off-by: Humble Chirammal <humble.devassy@gmail.com>
2023-04-27 17:06:54 +05:30
Wojciech Tyczyński
1eca720dcc Refactor some watchcache tests 2023-04-27 13:06:01 +02:00
Kubernetes Prow Robot
78b56ce16d
Merge pull request #116570 from SataQiu/fix-kubeadm-20230314
kubeadm: support upgrade coredns and kube-proxy addons after all the control plane instances have been upgraded
2023-04-27 01:44:26 -07:00
Kubernetes Prow Robot
87f3acf7f6
Merge pull request #115398 from tangwz/add_NodeVolumeLimits_PreFilter
feat(NodeVolumeLimits): return Skip in PreFilter
2023-04-27 01:44:14 -07:00
Mengjiao Liu
7f370d651d Migrated pkg/scheduler/framework/plugins/podtopologyspread to contextual logging 2023-04-27 15:55:09 +08:00
Mengjiao Liu
54e6f609ce Migrated pkg/scheduler/framework/plugins/noderesources to contextual logging 2023-04-27 14:46:13 +08:00
Kubernetes Prow Robot
3554bcde87
Merge pull request #117368 from sunnylovestiramisu/fix
Fix nil pointer in test AfterEach for volumeperf.go
2023-04-26 21:54:14 -07:00
mowangdk
152c1a0272 Chore: Replace re-initialized variables with create new ones 2023-04-27 12:18:18 +08:00
Mengjiao Liu
37a9260d5c Migrate pkg/scheduler/framework/plugins/defaultpreemption/default_preemption.go to use contextual logging 2023-04-27 11:28:19 +08:00
Mengjiao Liu
eeb1399383 Migrated pkg/scheduler/framework/preemption to use contextual logging 2023-04-27 11:28:14 +08:00
Kubernetes Prow Robot
8ae8e77560
Merge pull request #117593 from jpbetz/test-join
Fix bug where CEL listOfString.join() results in unexpected error
2023-04-26 16:10:13 -07:00
Kubernetes Prow Robot
dd62a53e1a
Merge pull request #117196 from pohly/scheduler-perf-labels
scheduler_perf: support test case selection via labels
2023-04-26 14:26:14 -07:00
Kubernetes Prow Robot
7adcb3cb37
Merge pull request #117306 from marosset/update-go-winio-dep
updating microsoft/go-winio package to latest version
2023-04-26 13:04:14 -07:00
Kubernetes Prow Robot
569788cb20
Merge pull request #117619 from SataQiu/code-clean-20230426
Code clean up for kubeadm
2023-04-26 12:04:27 -07:00
Kubernetes Prow Robot
c5c2806e23
Merge pull request #117571 from seans3/agg-discovery-fix
Fixes bug when extra params added to discovery content-type
2023-04-26 12:04:15 -07:00
Patrick Ohly
550d4c0074 scheduler_perf: support test case selection via labels
Entire test cases and workloads can have labels attached to them. The union of
these must match the label filter, which works as in GitHub. By default, the
benchmark runs the tests that are labeled "performance", which is the same as
before.
2023-04-26 21:01:31 +02:00
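The label selection described in commit 550d4c0074 above can be sketched roughly as follows. The commit message does not spell out the filter syntax, so this sketch assumes a comma-separated list in which a plain or `+`-prefixed label must be present and a `-`-prefixed label must be absent. The function names `unionLabels` and `matchesFilter` and the label "fast" are hypothetical and are not taken from scheduler_perf.

```go
package main

import (
	"fmt"
	"strings"
)

// unionLabels combines the labels of a test case with those of one of its
// workloads. Duplicates are harmless because matching uses a set.
func unionLabels(testCase, workload []string) []string {
	return append(append([]string{}, testCase...), workload...)
}

// matchesFilter reports whether labels satisfy filter under the assumed
// syntax: "a,+b,-c" means a and b must be present and c must be absent.
func matchesFilter(labels []string, filter string) bool {
	have := make(map[string]bool, len(labels))
	for _, l := range labels {
		have[l] = true
	}
	for _, term := range strings.Split(filter, ",") {
		term = strings.TrimSpace(term)
		if term == "" {
			continue // an empty filter matches everything
		}
		if strings.HasPrefix(term, "-") {
			if have[strings.TrimPrefix(term, "-")] {
				return false // an excluded label is present
			}
			continue
		}
		if !have[strings.TrimPrefix(term, "+")] {
			return false // a required label is missing
		}
	}
	return true
}

func main() {
	labels := unionLabels([]string{"performance"}, []string{"fast"})
	fmt.Println(matchesFilter(labels, "performance"))       // true
	fmt.Println(matchesFilter(labels, "performance,-fast")) // false
}
```

With a default filter of "performance", a workload in this sketch runs only if it or its test case carries that label, which mirrors the default behaviour the commit message describes.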