Commit Graph

127106 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
af41aa1d9f
Merge pull request #128664 from carlory/fix-node
mark the status of fake node as NotReady
2024-11-07 16:10:52 +00:00
Kubernetes Prow Robot
a660b51201
Merge pull request #128498 from googs1025/flake/TestReconcilerAPIServerLeaseMultiCombined
flake(TestReconcilerAPIServerLeaseMultiCombined): fix TestReconcilerAPIServerLeaseMultiCombined flake
2024-11-07 16:10:43 +00:00
Kubernetes Prow Robot
e5f5975f96
Merge pull request #128472 from sanposhiho/qhint-beta
feat: graduate SchedulerQueueingHints to beta
2024-11-07 16:10:36 +00:00
Kubernetes Prow Robot
c93ba4e96f
Merge pull request #124817 from carlory/cleanup-VolumePluginMgr
remove loggedDeprecationWarnings from VolumePluginMgr
2024-11-07 16:10:28 +00:00
Francesco Romani
2a99bfc3d1 node: cm: don't share containerMap instances between managers
Since the GA graduation of memory manager in https://github.com/kubernetes/kubernetes/pull/128517
we are sharing the initial container map across managers.

The intention of this sharing was not to actually share a data
structure, but
1. save the relatively expensive relisting from runtime
2. have all the managers share a consistent view - even though the
   chance of misalignment tends to be tiny.

The unwanted side effect, though, is that all the managers now race
to modify a shared, non-thread-safe data structure.

The fix is to clone (deepcopy) the computed map when passing it
to each manager. This restores the old semantics of the code.

This issue raises the question of managers possibly going out of sync,
since each of them maintains a private view of the world.
This risk is real, yet this is how the code worked for
most of its lifetime, so the plan is to look at this and evaluate
possible improvements later on.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2024-11-07 16:02:55 +01:00
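
The fix described in commit 2a99bfc3d1 (handing each manager its own copy of the computed map instead of one shared instance) can be illustrated with a minimal sketch. The names `containerRef`, `containerMap`, and `Clone` below are hypothetical stand-ins chosen for illustration; they are assumptions and are not taken from the kubelet source.

```go
package main

import "fmt"

// containerRef is a hypothetical value type: which pod and container
// a given container ID belongs to.
type containerRef struct {
	podUID        string
	containerName string
}

// containerMap is a hypothetical, non-thread-safe map keyed by container ID.
type containerMap map[string]containerRef

// Clone returns an independent copy of the map. The values are plain
// structs without pointers, so copying each entry is a full deep copy.
func (cm containerMap) Clone() containerMap {
	out := make(containerMap, len(cm))
	for id, ref := range cm {
		out[id] = ref
	}
	return out
}

func main() {
	// Computed once, e.g. from a single (expensive) relisting.
	initial := containerMap{"c1": {podUID: "pod-a", containerName: "app"}}

	// Each manager receives its own private copy.
	cpuView := initial.Clone()
	memView := initial.Clone()

	// A mutation by one manager no longer affects the others.
	delete(cpuView, "c1")
	fmt.Println(len(initial), len(cpuView), len(memView)) // prints "1 0 1"
}
```

The trade-off matches the one the commit message names: the copies start from the same consistent view, but nothing keeps them aligned afterwards.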
Jordan Liggitt
ecb5fc450b
Update staging docs to add externaljwt 2024-11-07 09:59:27 -05:00
Kubernetes Prow Robot
33c64b380a
Merge pull request #128646 from pohly/dra-kubelet-separate-beta-api
DRA kubelet: separate beta and alpha gRPC APIs
2024-11-07 14:57:45 +00:00
Kubernetes Prow Robot
e30492f77a
Merge pull request #128495 from olyazavr/refresh-probed-plugins
refresh probed plugins on init to avoid probe race/erroneous unmounts
2024-11-07 14:57:37 +00:00
Omer Aplatony
9d816f1587
Replace PollImmediate with PollUntilContextTimeout (#128147)
* Replace PollImmediate with PollUntilContextTimeout

Signed-off-by: Omer Aplatony <omerap12@gmail.com>

* Add context to RetryErrorCondition function

Signed-off-by: Omer Aplatony <omerap12@gmail.com>

* lint: fix error comparison in scale package

Signed-off-by: Omer Aplatony <omerap12@gmail.com>

* Fix RetryErrorCondition function signature

Signed-off-by: Omer Aplatony <omerap12@gmail.com>

* revert to if err statement

Signed-off-by: Omer Aplatony <omerap12@gmail.com>

---------

Signed-off-by: Omer Aplatony <omerap12@gmail.com>
2024-11-07 14:57:29 +00:00
Yuki Iwai
5dda60ee4e Job: Add evaluation step comments in the syncJob
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-11-07 23:36:44 +09:00
Stanislav Láznička
9705024349
make update 2024-11-07 15:33:39 +01:00
Stanislav Láznička
c8b5401175
move CTB alpha deprecation 2 releases 2024-11-07 15:33:34 +01:00
Ben Luddy
42d3e9752c
Add E2E test for CBOR client compatibility with older apiservers.
Clients must be able to use CBOR without a guarantee that all apiservers support it. The apiserver
aggregation layer avoids changing in any way that would require aggregated apiservers to be
updated. This end-to-end test verifies that a client's content negotiation behaviors continue to
work over time when communicating with a 1.17 sample-apiserver.
2024-11-07 09:22:44 -05:00
Ben Luddy
a77f4c7ba2
Fix content type fallback when a client defaults to CBOR.
With the ClientsAllowCBOR client-go feature gate enabled, a 415 response to a CBOR-encoded REST request
causes all subsequent requests from the client to fall back to a JSON request encoding. This
mechanism had only worked as intended when CBOR was explicitly configured in the
ClientContentConfig. When both ClientsAllowCBOR and ClientsPreferCBOR are enabled, an
unconfigured (empty) content type defaults to CBOR instead of JSON. Both ways of configuring a
client to use the CBOR request encoding are now subject to the same fallback mechanism.
2024-11-07 09:14:59 -05:00
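
The fallback behavior described in commit a77f4c7ba2 can be sketched roughly as follows. This is an illustration under stated assumptions, not client-go's implementation: the `encodingChooser` type, its fields, and its methods are hypothetical, and the two booleans merely stand in for the ClientsAllowCBOR and ClientsPreferCBOR feature gates.

```go
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

const (
	contentTypeJSON = "application/json"
	contentTypeCBOR = "application/cbor"
)

// encodingChooser is a hypothetical helper that picks the request encoding.
type encodingChooser struct {
	configured string      // explicitly configured content type; may be empty
	allowCBOR  bool        // stands in for the ClientsAllowCBOR gate
	preferCBOR bool        // stands in for the ClientsPreferCBOR gate
	fellBack   atomic.Bool // set once a CBOR request was rejected with 415
}

// requestContentType resolves the effective content type first, then
// applies the fallback. Resolving the default before the fallback check
// means an explicit CBOR setting and a defaulted one are treated alike.
func (c *encodingChooser) requestContentType() string {
	ct := c.configured
	if ct == "" {
		if c.allowCBOR && c.preferCBOR {
			ct = contentTypeCBOR
		} else {
			ct = contentTypeJSON
		}
	}
	if ct == contentTypeCBOR && c.fellBack.Load() {
		return contentTypeJSON
	}
	return ct
}

// observeResponse records a 415 response to a CBOR-encoded request so
// that all subsequent requests use JSON.
func (c *encodingChooser) observeResponse(sentContentType string, status int) {
	if sentContentType == contentTypeCBOR && status == http.StatusUnsupportedMediaType {
		c.fellBack.Store(true)
	}
}

func main() {
	// No explicit content type: the default resolves to CBOR.
	c := &encodingChooser{allowCBOR: true, preferCBOR: true}
	first := c.requestContentType()
	c.observeResponse(first, http.StatusUnsupportedMediaType)
	fmt.Println(first, c.requestContentType())
	// prints "application/cbor application/json"
}
```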
Maciej Skoczeń
379bff8dc9 Fix pod scale down failure in EventHandlingPodUpdate scheduler_perf test case 2024-11-07 13:48:50 +00:00
Kubernetes Prow Robot
c9024e7ae6
Merge pull request #128640 from mengqiy/spreadkubeletlaod
Add random interval to nodeStatusReport interval every time after an actual node status change
2024-11-07 13:48:03 +00:00
Kubernetes Prow Robot
ef37cb503b
Merge pull request #128634 from thockin/remove_PodHostIPs_gate_for_1.32
Remove PodHostIPs feature gates
2024-11-07 13:47:54 +00:00
Kubernetes Prow Robot
7667a68b72
Merge pull request #128383 from carlory/cleanup-codes
remove csi translator from volume operation generator
2024-11-07 13:47:46 +00:00
Kubernetes Prow Robot
52ebcb11e7
Merge pull request #128148 from bzsuni/bz/dependence/opencontainers/selinux/v1.11.1
Dependencies: update opencontainers/selinux to v1.11.1
2024-11-07 13:47:37 +00:00
Kubernetes Prow Robot
40498ce561
Merge pull request #127224 from utam0k/test-qhint-csi
Add integration test for NodeVolumeLimits in requeueing scenarios
2024-11-07 13:47:29 +00:00
carlory
ba70c764c0 mark the status of fake node as NotReady 2024-11-07 21:26:22 +08:00
zhifei92
bed96b4eb6 fix: fix the issue of losing the pending phase after a node restart. 2024-11-07 21:10:11 +08:00
huweiwen
fd2dbe0d68 kubelet: don't check for mounted before update dsw PV size
We are still only calling NodeExpand after the volume is mounted.

Avoid depending on ASW from dswp.findAndAddNewPods(). It is odd to determine the desired state based on the actual state.
2024-11-07 20:59:54 +08:00
Kensei Nakada
b96eee847e feat: graduate SchedulerQueueingHints to beta 2024-11-07 21:45:18 +09:00
Kubernetes Prow Robot
48ead4e622
Merge pull request #128648 from pohly/dra-scheduler-perf-flake
scheduler_perf: fix steady-state pod creation/deletion
2024-11-07 12:33:41 +00:00
Kubernetes Prow Robot
9729ac8c6f
Merge pull request #128637 from jpbetz/fix-mutating-admission-defaulting
Bug fix: MutatingAdmissionPolicy should default builtin types after each mutation
2024-11-07 12:33:30 +00:00
HirazawaUi
ecf2b402be remove runonce mode 2024-11-07 19:54:11 +08:00
Kubernetes Prow Robot
4391d09367
Merge pull request #128630 from SergeyKanzhelev/cancelOldDRAPluginContext
call cancel on plugin that is replaced by another plugin with the same name
2024-11-07 11:13:36 +00:00
Kubernetes Prow Robot
9a9331afd6
Merge pull request #124952 from AxeZhan/maxContainerRestarts
[Sidecar Containers] Pods comparison by maxContainerRestarts should account for sidecar containers
2024-11-07 11:13:30 +00:00
utam0k
e828a4b40a
Add integration test for NodeVolumeLimits in requeueing scenarios
Signed-off-by: utam0k <k0ma@utam0k.jp>
2024-11-07 19:51:50 +09:00
Lionel Jouin
c1dd8c6d04 [KEP-4817] UPDATE_API_KNOWN_VIOLATIONS=true ./hack/update-codegen.sh
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 11:32:24 +01:00
Lionel Jouin
d28b50e0a0 [KEP-4817] make update
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 10:36:09 +01:00
Lionel Jouin
39f55e1cd0 [KEP-4817] Add data length limit (from #128601)
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 10:35:29 +01:00
Lionel Jouin
7e0035ec86 [KEP-4817] Update to v1beta1
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:59:56 +01:00
Lionel Jouin
4b76ba1a87 [KEP-4817] Rename Addresses to IPs
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:59:56 +01:00
Lionel Jouin
43d23b8994 [KEP-4817] Use structured.MakeDeviceID
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:59:56 +01:00
Lionel Jouin
8ab33b8413 [KEP-4817] Improve NetworkData Validation
* Add max length for InterfaceName and HardwareAddress
* Prevent duplicated Addresses

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:59:56 +01:00
Lionel Jouin
a062f91106 [KEP-4817] Fixes based on review
* Rename HWAddress to HardwareAddress
* Fix condition validation
* Remove feature gate validation
* Fix drop field on disabled feature gate

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:59:56 +01:00
Lionel Jouin
5df47a64d3 [KEP-4817] Remove unnecessary DeepCopy in validation tests
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:59:56 +01:00
Lionel Jouin
cb9ee1d4fe [KEP-4817] Remove pointer on Data, InterfaceName and HWAddress fields
Adapt validation and tests based on these API changes

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:59:51 +01:00
Lionel Jouin
8be335a755 [KEP-4817] E2E: Update ResourceClaim.Status.Devices
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:54:19 +01:00
Lionel Jouin
c59359289f [KEP-4817] Drop deallocated devices from resourceclaim.status.devices
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:54:19 +01:00
Lionel Jouin
5d7a16b0a5 [KEP-4817] improve testing
* Test feature-gate enabled/disabled for validation
* Test pkg/registry/resource/resourceclaim
* Add Data and NetworkData to integration test

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:54:19 +01:00
Lionel Jouin
4bd62e5234 [KEP-4817] Fix fuzz API tests and ./hack/update-featuregates.sh
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:54:19 +01:00
Lionel Jouin
3e595db0af [KEP-4817] API, validation and feature-gate
* Add status
* Add validation to check if fields are correct (Network field, device
  has been allocated)
* Add feature-gate
* Drop field if feature-gate not set

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-11-07 09:54:17 +01:00
Kubernetes Prow Robot
09e5e6269a
Merge pull request #128626 from dims/add-go-spew-to-unwanted-dependencies-we-track
Add go-spew to unwanted dependencies we track
2024-11-07 08:51:38 +00:00
Kubernetes Prow Robot
1ac23e24a0
Merge pull request #127956 from carlory/KEP-3902-test
node-lifecycle-controller: improve processPod test-coverage
2024-11-07 08:51:30 +00:00
Patrick Ohly
0301b6b504 scheduler_perf: fix steady-state pod creation/deletion
This fixes an issue in
TestSchedulerPerf/SteadyStateClusterResourceClaimTemplate:

    scheduler_perf.go:1542: FATAL ERROR: op 7: delete scheduled pods: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline

That occurs when the test is almost done, but hasn't observed all scheduled
pods yet. The previous attempt to address this error wasn't actually 100%
correct. It covered the case when the context has already been canceled, but
not this particular "will reach deadline soon" case.
2024-11-07 09:36:36 +01:00
bzsuni
f2ff07fcfa update opencontainers/selinux/go-selinux to v1.11.1
Signed-off-by: bzsuni <bingzhe.sun@daocloud.io>
2024-11-07 08:22:25 +00:00
Sreeram Venkitesh
8f1e69bbb0 Fix verify-gofmt.sh 2024-11-07 13:28:40 +05:30