126337 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
c0e0785fe4
Merge pull request #128427 from dom4ha/scheduler-perf
Fix Unschedulable test by using high priority churn pods to get processed right after they were injected
2024-10-30 22:23:25 +00:00
Joe Betz
c59fba7f26
Promote CRD field selector e2e test to conformance (#128109)
* Promote CRD field selector e2e test to conformance

* Fix release number for conformance test

* re-run update conformance
2024-10-30 21:19:25 +00:00
Kubernetes Prow Robot
dc1d7f41ef
Merge pull request #128456 from benluddy/nondeterministic-response-encoding
KEP-4222: Allow nondeterministic object encoding in HTTP response bodies.
2024-10-30 20:13:27 +00:00
Ben Luddy
dee76a460e
Allow nondeterministic object encoding in HTTP response bodies. 2024-10-30 15:10:16 -04:00
Kubernetes Prow Robot
16f9fdc705
Merge pull request #128273 from benluddy/cbor-apply
KEP-4222: Support CBOR encoding for apply requests.
2024-10-30 17:25:25 +00:00
Ben Luddy
41f55d7117
Regenerate clients to support application/apply-patch+cbor. 2024-10-30 12:21:15 -04:00
Ben Luddy
37ed906a33
Support application/apply-patch+cbor in patch requests. 2024-10-30 12:21:15 -04:00
Kubernetes Prow Robot
6435489064
Merge pull request #128275 from pohly/dra-resourceslice-controller-multiple-slices
DRA resourceslice controller: support publishing multiple slices
2024-10-30 16:01:26 +00:00
Patrick Ohly
1088f4fb44 DRA resourceslice controller: do DeepCopy for driver resources
The previous behavior was meant to avoid the unnecessary performance overhead
that occurs when the caller already provided a "fresh" copy and doesn't touch it
afterwards.

But this is something that DRA driver developers can easily get wrong, so it's
better to be safe than sorry.
2024-10-30 15:54:32 +01:00
Patrick Ohly
67f0428769 DRA resourceslice controller: delay sync
When deleting a bunch of slices, the delete events queue the pool while it is
being synced. It then got synced again immediately, while the deleted slices
were still being removed from the informer cache. The obsolete slice in the
cache caused the controller to delete it again, which fails with a "not
found". That error is ignored, but this still caused extra API calls.

Now syncing gets delayed by a configurable duration (default: 30 seconds) so
the informer cache is more likely to be up-to-date when the pool gets synced
again.
2024-10-30 15:54:32 +01:00
Patrick Ohly
99cf2d8a2e DRA resource slice controller: add E2E test
This test covers creating and deleting 100 large ResourceSlices. It is strict
about using the minimum number of calls.

The test also verifies that creating large slices works.
2024-10-30 15:54:32 +01:00
Patrick Ohly
7473e643fa DRA resource slice controller: use MutationCache to avoid race
This avoids the problem of creating an additional slice when the one from the
previous sync is not in the informer cache yet. It also avoids false
attempts to delete slices which were updated in the previous sync. Such
attempts would fail the ResourceVersion precondition check, but would
still cause work for the apiserver.
2024-10-30 15:54:32 +01:00
Patrick Ohly
e88d5c37e6 DRA resource claim controller: add statistics
This is primarily for testing. Proper metrics might be useful, but can still be
added later.
2024-10-30 15:54:32 +01:00
Patrick Ohly
d94752ebc8 DRA resourceslice controller: use preconditions for Delete
It's better to verify UID and ResourceVersion of the ResourceSlice that we want
to delete. If anything changed, the decision to remove it might not apply
anymore and we need to check again.
2024-10-30 15:54:32 +01:00
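The precondition check described in d94752ebc8 can be illustrated with a minimal in-memory sketch. This is not the controller's code: the `object`, `preconditions`, and `store` types and the `remove` method are hypothetical stand-ins, and the real controller talks to the apiserver rather than a map. The sketch only shows the idea that a delete is refused when the UID or ResourceVersion no longer matches what the caller last saw.

```go
package main

import (
	"errors"
	"fmt"
)

// object is a hypothetical stand-in for a stored API object.
type object struct {
	UID             string
	ResourceVersion string
}

// preconditions holds the values the caller expects; nil fields are not checked.
type preconditions struct {
	UID             *string
	ResourceVersion *string
}

var (
	errNotFound = errors.New("not found")
	errConflict = errors.New("precondition failed")
)

// store is a hypothetical in-memory object store keyed by name.
type store map[string]object

// remove deletes the named object only if all given preconditions match.
func (s store) remove(name string, pre preconditions) error {
	obj, ok := s[name]
	if !ok {
		return errNotFound
	}
	if pre.UID != nil && *pre.UID != obj.UID {
		return errConflict
	}
	if pre.ResourceVersion != nil && *pre.ResourceVersion != obj.ResourceVersion {
		return errConflict
	}
	delete(s, name)
	return nil
}

func main() {
	s := store{"slice-a": {UID: "u1", ResourceVersion: "5"}}
	uid, staleRV, freshRV := "u1", "4", "5"

	// A stale ResourceVersion means the object changed since we looked at it,
	// so the decision to delete has to be re-evaluated.
	fmt.Println(s.remove("slice-a", preconditions{UID: &uid, ResourceVersion: &staleRV}))

	// With up-to-date values the delete goes through.
	fmt.Println(s.remove("slice-a", preconditions{UID: &uid, ResourceVersion: &freshRV}))
}
```

In this sketch a failed precondition leaves the object in place, which corresponds to the "check again" step mentioned in the commit message.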
Patrick Ohly
a6d180c7d3 DRA: validate set of devices in a pool before using the pool
The ResourceSlice controller (theoretically) might end up creating too many
slices if it syncs again before its informer cache was updated. This could
cause the scheduler to allocate a device from a duplicated slice. They should
be identical, but it's still better to fail and wait until the controller
removes the redundant slice.
2024-10-30 15:54:32 +01:00
Patrick Ohly
26650371cc DRA resourceslice controller: support publishing multiple slices
The driver determines what each slice is meant to look like. The controller
then ensures that only those slices exist. It reuses existing slices where the
set of devices, as identified by their names, is the same as in some desired
slice. Such slices get updated to match the desired state.

In other words, attributes and the order of devices can be changed by updating
an existing slice, but adding or removing a device is done by deleting and
re-creating slices.

Co-authored-by: googs1025 <googs1025@gmail.com>

The test update is partly based on
https://github.com/kubernetes/kubernetes/pull/127645.
2024-10-30 15:54:32 +01:00
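The matching rule described in 26650371cc (reuse an existing slice when its set of device names equals that of a desired slice, otherwise create or delete) can be sketched as follows. This is a simplified illustration under stated assumptions, not the controller's implementation: `Slice`, `Plan`, `deviceKey`, and `planSync` are hypothetical names, and real ResourceSlices carry far more than a list of device names.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Slice is a hypothetical, simplified stand-in for a ResourceSlice:
// a name plus the names of the devices it publishes.
type Slice struct {
	Name    string
	Devices []string
}

// Plan lists what one sync would do.
type Plan struct {
	Update map[string]Slice // existing slice name -> desired content
	Create []Slice          // desired slices with no matching existing slice
	Delete []string         // existing slice names that are no longer wanted
}

// deviceKey returns an order-independent key for a set of device names.
func deviceKey(devices []string) string {
	sorted := append([]string(nil), devices...)
	sort.Strings(sorted)
	return strings.Join(sorted, "\x00")
}

// planSync reuses an existing slice when its device name set equals that of
// a desired slice; everything else is created or deleted.
func planSync(existing, desired []Slice) Plan {
	plan := Plan{Update: map[string]Slice{}}

	// Index existing slices by their device set.
	byKey := map[string][]string{}
	for _, s := range existing {
		k := deviceKey(s.Devices)
		byKey[k] = append(byKey[k], s.Name)
	}

	for _, d := range desired {
		k := deviceKey(d.Devices)
		if names := byKey[k]; len(names) > 0 {
			plan.Update[names[0]] = d
			byKey[k] = names[1:]
			continue
		}
		plan.Create = append(plan.Create, d)
	}

	// Whatever was not reused is obsolete.
	for _, names := range byKey {
		plan.Delete = append(plan.Delete, names...)
	}
	sort.Strings(plan.Delete)
	return plan
}

func main() {
	existing := []Slice{
		{Name: "slice-a", Devices: []string{"gpu-0", "gpu-1"}},
		{Name: "slice-b", Devices: []string{"gpu-2"}},
	}
	desired := []Slice{
		{Devices: []string{"gpu-1", "gpu-0"}}, // same set, different order: update slice-a
		{Devices: []string{"gpu-2", "gpu-3"}}, // device added: new slice, slice-b deleted
	}
	plan := planSync(existing, desired)
	fmt.Println("update:", len(plan.Update), "create:", len(plan.Create), "delete:", plan.Delete)
}
```

Reordering devices keeps the same key, so the sketch updates the existing slice in place, while adding `gpu-3` changes the set and leads to a delete plus a create, matching the behavior the commit message describes.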
dom4ha
ff584a76e0 Fix Unschedulable test by scheduling high priority churn pods to get processed right after they were injected (before the queued test pods) 2024-10-30 13:04:38 +00:00
Kubernetes Prow Robot
d001d5684e
Merge pull request #128417 from tenzen-y/self-nominate-job-controller-reviewer
Self nominate tenzen-y as a reviewer for the Job controller
2024-10-30 11:21:39 +00:00
Kubernetes Prow Robot
a18b50e7e4
Merge pull request #128373 from mimowo/job-cover-negative-codes
Job Pod Failure policy - cover testing of negative exit codes
2024-10-30 11:21:31 +00:00
Kubernetes Prow Robot
7529696b59
Merge pull request #128334 from mimowo/job-windows-e2e-test
Job Pod Failure policy refactor e2e test using exit codes
2024-10-30 11:21:25 +00:00
Kubernetes Prow Robot
daef8c2419
Merge pull request #127266 from pohly/dra-admin-access-in-status
DRA API: AdminAccess in DeviceRequestAllocationResult + DRAAdminAccess feature gate
2024-10-30 03:41:25 +00:00
Kubernetes Prow Robot
5fcef4f79d
Merge pull request #128422 from bart0sh/PR163-density-e2e_node-adjust-limits
density test: adjust CPU and memory limits
2024-10-30 02:37:31 +00:00
Kubernetes Prow Robot
db66e397d9
Merge pull request #128359 from matteriben/disable-caching-for-authoritative-zone
disable caching for authoritative zone to comply with rfc-1035 section 6.1.2
2024-10-30 02:37:24 +00:00
Kubernetes Prow Robot
a93e3e7ae1
Merge pull request #127483 from nokia/strict-cpu-reservation-core
KEP-4540: Add CPUManager policy option to restrict reservedSystemCPUs to system daemons and interrupt processing
2024-10-30 01:21:47 +00:00
Kubernetes Prow Robot
d702d265c7
Merge pull request #127291 from zhifei92/fix-apiserver-unexpected-panic
[FG:InPlacePodVerticalScaling] Fixed the apiserver panic issue that occurred when adding a container during pod updates in the InPlacePodVerticalScaling scenario.
2024-10-30 01:21:40 +00:00
Kubernetes Prow Robot
a0e5e244b3
Merge pull request #126875 from serathius/watchcache-test-indexers
Adding tests for using indexers in tests
2024-10-30 01:21:32 +00:00
Kubernetes Prow Robot
6737352b03
Merge pull request #125708 from hshiina/dopodresizeaction-error
[FG:InPlacePodVerticalScaling] Fix order of resizing pod cgroups in doPodResizeAction()
2024-10-30 01:21:25 +00:00
Kubernetes Prow Robot
e8a75ac53f
Merge pull request #128420 from tallclair/e2e-cleanup
Reuse cached client config for exec requests in e2e
2024-10-30 00:17:37 +00:00
Kubernetes Prow Robot
42b7cfecec
Merge pull request #128274 from eddycharly/fix-cel-type-provider
fix: cel type provider should return a type type
2024-10-30 00:17:30 +00:00
Kubernetes Prow Robot
a339a36a36
Merge pull request #127506 from ffromani/cpu-pool-size-metrics
node: metrics: add metrics about cpu pool sizes
2024-10-30 00:17:24 +00:00
Ed Bartosh
04f7a86001 density test: adjust CPU and memory limits
Adjusted limits based on recent job log:
  I1028 20:05:42.079182 1002 resource_usage_test.go:199] Resource usage:
  container cpu(cores) memory_working_set(MB) memory_rss(MB)
  "kubelet" 0.024      22.17                  14.20
  "runtime" 0.041      409.70                 84.21

  I1028 20:05:42.079274 1002 resource_usage_test.go:206] CPU usage of containers:
  container 50th% 90th% 95th% 99th% 100th%
  "/"       N/A   N/A   N/A   N/A   N/A
  "runtime" 0.014 0.834 0.834 0.834 1.083
  "kubelet" 0.023 0.093 0.093 0.093 0.164

Increasing 95th percentile for runtime CPU usage should also make
pull-kubernetes-node-kubelet-containerd-flaky less flaky.
2024-10-30 00:48:56 +02:00
Kubernetes Prow Robot
f087575f21
Merge pull request #127226 from myeunee/cleanup
Clean up unnecessary else block and redundant variable assignment
2024-10-29 22:41:25 +00:00
Matt Riben
30d9ed7203
disable caching for authoritative zone
Signed-off-by: Matt Riben <matt.riben@swirldslabs.com>
2024-10-29 17:10:07 -05:00
myeunee
9cc65ce872 Restrict cz variable scope within else clause 2024-10-30 06:31:06 +09:00
Kubernetes Release Robot
f01e0d64db CHANGELOG: Update directory for v1.32.0-alpha.3 release 2024-10-29 20:17:52 +00:00
myeunee
2faaedbe39 Refactor error handling for configz initialization
Improved code readability and limited variable scope as per reviewer's suggestion.
2024-10-30 04:53:51 +09:00
Marek Siarkowicz
711772a1e1 Adding tests for using indexers in tests 2024-10-29 20:22:16 +01:00
Kubernetes Prow Robot
988769933e
Merge pull request #128307 from NoicFank/bugfix-scheduler-preemption
bugfix(scheduler): preemption picks wrong victim node with higher priority pod on it
2024-10-29 19:05:02 +00:00
Kubernetes Prow Robot
a12a32cd12
Merge pull request #127146 from bart0sh/PR156-DRA-Kubelet-latency
Kubelet: add DRA latency metrics
2024-10-29 19:04:55 +00:00
Tim Allclair
2407a49956 Reuse cached client config for exec requests in e2e 2024-10-29 10:00:11 -07:00
Kubernetes Prow Robot
c3980f601c
Merge pull request #128267 from benluddy/cbor-response-negotiation
KEP-4222: Test response content negotiation for each CBOR enablement state.
2024-10-29 16:48:55 +00:00
Yuki Iwai
eca7ee877a Self nominate tenzen-y as a reviewer for the Job controller
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-10-30 01:14:47 +09:00
Kubernetes Prow Robot
b8e20b74dd
Merge pull request #128382 from carlory/rm-vac
remove unused vac code
2024-10-29 15:25:05 +00:00
Kubernetes Prow Robot
c5ccf59974
Merge pull request #128379 from pohly/dra-owners-wg-label
DRA: add wg/device-management label automatically
2024-10-29 15:24:57 +00:00
Kubernetes Prow Robot
eb445ac66c
Merge pull request #128414 from soltysh/improve_error
Provide link with e2e guidelines when verity-test-code.sh fails
2024-10-29 14:21:06 +00:00
Kubernetes Prow Robot
c83250d104
Merge pull request #126754 from serathius/watchcache-btree
Reimplement watch cache storage with btree
2024-10-29 14:20:58 +00:00
Kubernetes Prow Robot
d09d98e07c
Merge pull request #128022 from googs1025/cleanup/ut/preemption
chore(scheduler): add unit test for framework preemption part
2024-10-29 13:16:55 +00:00
Marek Siarkowicz
50d2fab279 Implement btree based storage indexer 2024-10-29 13:13:21 +01:00
Maciej Szulik
97fcb05374
Provide link with e2e guidelines when verity-test-code.sh fails
Signed-off-by: Maciej Szulik <soltysh@gmail.com>
2024-10-29 13:07:05 +01:00
NoicFank
68f7a7c682 bugfix(scheduler): preemption picks wrong victim node with higher priority pod on it.
Introducing PDBs to preemption had disrupted the ordering of pods in the victims list,
which could lead to picking the wrong victim node, one with a higher priority pod on it.
2024-10-29 19:50:55 +08:00