126634 Commits

Author SHA1 Message Date
Patrick Ohly
d94752ebc8 DRA resourceslice controller: use preconditions for Delete
It's better to verify UID and ResourceVersion of the ResourceSlice that we want
to delete. If anything changed, the decision to remove it might not apply
anymore and we need to check again.
2024-10-30 15:54:32 +01:00
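The precondition idea in d94752ebc8 can be illustrated with a minimal sketch. It does not use the real API server or client-go; `object`, `preconditions`, `store`, and `deleteWithPreconditions` are hypothetical names for a toy in-memory model. In client-go the equivalent knob is the `Preconditions` field (UID and ResourceVersion) of `metav1.DeleteOptions`.

```go
package main

import (
	"errors"
	"fmt"
)

// object is a hypothetical stand-in for a stored ResourceSlice.
type object struct {
	UID             string
	ResourceVersion string
}

// preconditions holds the identity and version the caller based its decision on.
type preconditions struct {
	UID             string
	ResourceVersion string
}

var (
	errNotFound = errors.New("not found")
	errConflict = errors.New("precondition failed: object changed")
)

// store is a toy in-memory replacement for the API server (assumption).
type store map[string]object

// deleteWithPreconditions removes the object only if it is still the exact
// object and version that the caller looked at.
func (s store) deleteWithPreconditions(name string, p preconditions) error {
	obj, ok := s[name]
	if !ok {
		return errNotFound
	}
	if obj.UID != p.UID || obj.ResourceVersion != p.ResourceVersion {
		return errConflict
	}
	delete(s, name)
	return nil
}

func main() {
	s := store{"slice-a": {UID: "u1", ResourceVersion: "7"}}

	// Decision was made on a stale view (version 6): the delete is rejected
	// and the caller has to check again.
	fmt.Println(s.deleteWithPreconditions("slice-a", preconditions{UID: "u1", ResourceVersion: "6"}))

	// Decision matches the current object: the delete goes through.
	fmt.Println(s.deleteWithPreconditions("slice-a", preconditions{UID: "u1", ResourceVersion: "7"}))
}
```

On a conflict the caller would re-read the object and decide again rather than retry the same delete.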
Patrick Ohly
a6d180c7d3 DRA: validate set of devices in a pool before using the pool
The ResourceSlice controller (theoretically) might end up creating too many
slices if it syncs again before its informer cache was updated. This could
cause the scheduler to allocate a device from a duplicated slice. They should
be identical, but it's still better to fail and wait until the controller
removes the redundant slice.
2024-10-30 15:54:32 +01:00
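One way to read the validation in a6d180c7d3 is as a uniqueness check over the devices that a pool's slices publish. The sketch below assumes exactly that and nothing more; `validatePool` and the representation of a slice as a plain list of device names are hypothetical, not the scheduler's actual code.

```go
package main

import "fmt"

// validatePool returns an error if any device name is published by more
// than one slice of the same pool. Each slice is modeled as a list of
// device names (assumption for illustration).
func validatePool(slices [][]string) error {
	seen := map[string]int{} // device name -> index of the slice that published it
	for i, devices := range slices {
		for _, name := range devices {
			if prev, ok := seen[name]; ok {
				return fmt.Errorf("device %q published by slice %d and slice %d", name, prev, i)
			}
			seen[name] = i
		}
	}
	return nil
}

func main() {
	// A redundant slice repeats gpu-0, so the pool is rejected for now.
	fmt.Println(validatePool([][]string{{"gpu-0", "gpu-1"}, {"gpu-0", "gpu-1"}}))

	// Once the duplicate is gone, the pool is usable.
	fmt.Println(validatePool([][]string{{"gpu-0", "gpu-1"}, {"gpu-2"}}))
}
```

Failing here is a temporary state: a later attempt succeeds once the controller has removed the redundant slice.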
Patrick Ohly
26650371cc DRA resourceslice controller: support publishing multiple slices
The driver determines what each slice is meant to look like. The controller
then ensures that only those slices exist. It reuses existing slices where the
set of devices, as identified by their names, is the same as in some desired
slice. Such slices get updated to match the desired state.

In other words, attributes and the order of devices can be changed by updating
an existing slice, but adding or removing a device is done by deleting and
re-creating slices.

Co-authored-by: googs1025 <googs1025@gmail.com>

The test update is partly based on
https://github.com/kubernetes/kubernetes/pull/127645.
2024-10-30 15:54:32 +01:00
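The matching rule described in 26650371cc (reuse an existing slice when its set of device names equals that of some desired slice, otherwise delete and re-create) can be sketched as follows. This is an illustration under simplifying assumptions: slices are reduced to lists of device names, and `deviceSetKey` and `plan` are hypothetical helpers, not the controller's real functions.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// deviceSetKey builds an order-independent key from device names, so that
// reordering devices does not change the identity of the set.
func deviceSetKey(names []string) string {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)
	return strings.Join(sorted, "\x00")
}

// plan compares existing slices (name -> device names) with desired slices.
// It returns which existing slices can be updated in place (mapped to the
// index of the desired slice), which are obsolete, and which desired slices
// still have to be created.
func plan(existing map[string][]string, desired [][]string) (reuse map[string]int, obsolete []string, create []int) {
	reuse = map[string]int{}
	byKey := map[string]int{}
	for i, devices := range desired {
		byKey[deviceSetKey(devices)] = i
	}

	// Iterate in sorted order to keep the result deterministic.
	names := make([]string, 0, len(existing))
	for name := range existing {
		names = append(names, name)
	}
	sort.Strings(names)

	matched := map[int]bool{}
	for _, name := range names {
		i, ok := byKey[deviceSetKey(existing[name])]
		if ok && !matched[i] {
			reuse[name] = i
			matched[i] = true
		} else {
			obsolete = append(obsolete, name)
		}
	}
	for i := range desired {
		if !matched[i] {
			create = append(create, i)
		}
	}
	return reuse, obsolete, create
}

func main() {
	existing := map[string][]string{
		"slice-1": {"gpu-0", "gpu-1"},
		"slice-2": {"gpu-2"},
	}
	desired := [][]string{
		{"gpu-1", "gpu-0"}, // same set as slice-1, different order: update in place
		{"gpu-2", "gpu-3"}, // gpu-3 was added: slice-2 is deleted, a new slice is created
	}
	reuse, obsolete, create := plan(existing, desired)
	fmt.Println(reuse, obsolete, create)
}
```

Keying on the device-name set is what makes attribute and order changes an update, while adding or removing a device turns into a delete plus a create.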
Antoni Zawodny
4afa554f65 Add --concurrent-daemonset-syncs flag to kube-controller-manager 2024-10-30 15:03:26 +01:00
dom4ha
ff584a76e0 Fix Unschedulable test by scheduling high priority churn pods to get processed right after they were injected (before the queued test pods) 2024-10-30 13:04:38 +00:00
Itamar Holder
f21473b924 Set pod-level CPUPeriod only if CPUQuota is changed
Signed-off-by: Itamar Holder <iholder@redhat.com>
2024-10-30 14:21:35 +02:00
Itamar Holder
c792c30b6a Refactor: remove no longer needed resourceName parameter
Signed-off-by: Itamar Holder <iholder@redhat.com>
2024-10-30 13:58:38 +02:00
Itamar Holder
7207ce20f0 Refactor: remove functions that are no longer used
Signed-off-by: Itamar Holder <iholder@redhat.com>
2024-10-30 13:58:38 +02:00
Itamar Holder
510ff67528 Use libcontainer's cgroup manager to update resources through systemd
libcontainer's cgroup manager is agnostic to the cgroup version and to
whether systemd is used. When systemd is used, the manager updates
resources through systemd, so the changes are not reverted if the
daemon is restarted.

Signed-off-by: Itamar Holder <iholder@redhat.com>
2024-10-30 13:58:38 +02:00
Itamar Holder
2a5a6c7fb8 Refactor: add import alias to libcontainer cgroup manager
Signed-off-by: Itamar Holder <iholder@redhat.com>
2024-10-30 13:58:38 +02:00
Kubernetes Prow Robot
d001d5684e
Merge pull request #128417 from tenzen-y/self-nominate-job-controller-reviewer
Self nominate tenzen-y as a reviewer for the Job controller
2024-10-30 11:21:39 +00:00
Kubernetes Prow Robot
a18b50e7e4
Merge pull request #128373 from mimowo/job-cover-negative-codes
Job Pod Failure policy - cover testing of negative exit codes
2024-10-30 11:21:31 +00:00
Kubernetes Prow Robot
7529696b59
Merge pull request #128334 from mimowo/job-windows-e2e-test
Job Pod Failure policy refactor e2e test using exit codes
2024-10-30 11:21:25 +00:00
yunwang0911
05493c0924
Update pkg/kubelet/status/state/state_checkpoint_test.go
Co-authored-by: Tim Allclair <timallclair@gmail.com>
2024-10-30 18:11:10 +08:00
yunwang0911
e4c8eefeb2
Update pkg/kubelet/status/state/state_checkpoint_test.go
Co-authored-by: Tim Allclair <timallclair@gmail.com>
2024-10-30 18:08:53 +08:00
Kubernetes Prow Robot
daef8c2419
Merge pull request #127266 from pohly/dra-admin-access-in-status
DRA API: AdminAccess in DeviceRequestAllocationResult + DRAAdminAccess feature gate
2024-10-30 03:41:25 +00:00
Kubernetes Prow Robot
5fcef4f79d
Merge pull request #128422 from bart0sh/PR163-density-e2e_node-adjust-limits
density test: adjust CPU and memory limits
2024-10-30 02:37:31 +00:00
Kubernetes Prow Robot
db66e397d9
Merge pull request #128359 from matteriben/disable-caching-for-authoritative-zone
disable caching for authoritative zone to comply with rfc-1035 section 6.1.2
2024-10-30 02:37:24 +00:00
Kubernetes Prow Robot
a93e3e7ae1
Merge pull request #127483 from nokia/strict-cpu-reservation-core
KEP-4540: Add CPUManager policy option to restrict reservedSystemCPUs to system daemons and interrupt processing
2024-10-30 01:21:47 +00:00
Kubernetes Prow Robot
d702d265c7
Merge pull request #127291 from zhifei92/fix-apiserver-unexpected-panic
[FG:InPlacePodVerticalScaling] Fixed the apiserver panic issue that occurred when adding a container during pod updates in the InPlacePodVerticalScaling scenario.
2024-10-30 01:21:40 +00:00
Kubernetes Prow Robot
a0e5e244b3
Merge pull request #126875 from serathius/watchcache-test-indexers
Adding tests for using indexers in tests
2024-10-30 01:21:32 +00:00
Kubernetes Prow Robot
6737352b03
Merge pull request #125708 from hshiina/dopodresizeaction-error
[FG:InPlacePodVerticalScaling] Fix order of resizing pod cgroups in doPodResizeAction()
2024-10-30 01:21:25 +00:00
Kubernetes Prow Robot
e8a75ac53f
Merge pull request #128420 from tallclair/e2e-cleanup
Reuse cached client config for exec requests in e2e
2024-10-30 00:17:37 +00:00
Kubernetes Prow Robot
42b7cfecec
Merge pull request #128274 from eddycharly/fix-cel-type-provider
fix: cel type provider should return a type type
2024-10-30 00:17:30 +00:00
Kubernetes Prow Robot
a339a36a36
Merge pull request #127506 from ffromani/cpu-pool-size-metrics
node: metrics: add metrics about cpu pool sizes
2024-10-30 00:17:24 +00:00
James Sturtevant
ac174f518c
Respond to sig-node feedback
Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
2024-10-29 16:56:37 -07:00
Ed Bartosh
04f7a86001 density test: adjust CPU and memory limits
Adjusted limits based on recent job log:
I1028 20:05:42.079182 1002 resource_usage_test.go:199] Resource usage:
  container cpu(cores) memory_working_set(MB) memory_rss(MB)
  "kubelet" 0.024      22.17                  14.20
  "runtime" 0.041      409.70                 84.21

I1028 20:05:42.079274 1002 resource_usage_test.go:206] CPU usage of containers:
  container 50th% 90th% 95th% 99th% 100th%
  "/"       N/A   N/A   N/A   N/A   N/A
  "runtime" 0.014 0.834 0.834 0.834 1.083
  "kubelet" 0.023 0.093 0.093 0.093 0.164

Increasing 95th percentile for runtime CPU usage should also make
pull-kubernetes-node-kubelet-containerd-flaky less flaky.
2024-10-30 00:48:56 +02:00
Kubernetes Prow Robot
f087575f21
Merge pull request #127226 from myeunee/cleanup
Clean up unnecessary else block and redundant variable assignment
2024-10-29 22:41:25 +00:00
Matt Riben
30d9ed7203
disable caching for authoritative zone
Signed-off-by: Matt Riben <matt.riben@swirldslabs.com>
2024-10-29 17:10:07 -05:00
myeunee
9cc65ce872 Restrict cz variable scope within else clause 2024-10-30 06:31:06 +09:00
Kubernetes Release Robot
f01e0d64db CHANGELOG: Update directory for v1.32.0-alpha.3 release 2024-10-29 20:17:52 +00:00
myeunee
2faaedbe39 Refactor error handling for configz initialization
Improved code readability and limited variable scope as per reviewer's suggestion.
2024-10-30 04:53:51 +09:00
Marek Siarkowicz
711772a1e1 Adding tests for using indexers in tests 2024-10-29 20:22:16 +01:00
Kubernetes Prow Robot
988769933e
Merge pull request #128307 from NoicFank/bugfix-scheduler-preemption
bugfix(scheduler): preemption picks wrong victim node with higher priority pod on it
2024-10-29 19:05:02 +00:00
Kubernetes Prow Robot
a12a32cd12
Merge pull request #127146 from bart0sh/PR156-DRA-Kubelet-latency
Kubelet: add DRA latency metrics
2024-10-29 19:04:55 +00:00
Tim Allclair
2407a49956 Reuse cached client config for exec requests in e2e 2024-10-29 10:00:11 -07:00
Kubernetes Prow Robot
c3980f601c
Merge pull request #128267 from benluddy/cbor-response-negotiation
KEP-4222: Test response content negotiation for each CBOR enablement state.
2024-10-29 16:48:55 +00:00
Yuki Iwai
eca7ee877a Self nominate tenzen-y as a reviewer for the Job controller
Signed-off-by: Yuki Iwai <yuki.iwai.tz@gmail.com>
2024-10-30 01:14:47 +09:00
Kubernetes Prow Robot
b8e20b74dd
Merge pull request #128382 from carlory/rm-vac
remove unused vac code
2024-10-29 15:25:05 +00:00
Kubernetes Prow Robot
c5ccf59974
Merge pull request #128379 from pohly/dra-owners-wg-label
DRA: add wg/device-management label automatically
2024-10-29 15:24:57 +00:00
Kubernetes Prow Robot
eb445ac66c
Merge pull request #128414 from soltysh/improve_error
Provide link with e2e guidelines when verity-test-code.sh fails
2024-10-29 14:21:06 +00:00
Kubernetes Prow Robot
c83250d104
Merge pull request #126754 from serathius/watchcache-btree
Reimplement watch cache storage with btree
2024-10-29 14:20:58 +00:00
Kubernetes Prow Robot
d09d98e07c
Merge pull request #128022 from googs1025/cleanup/ut/preemption
chore(scheduler): add unit test for framework preemption part
2024-10-29 13:16:55 +00:00
Talor Itzhak
d64f34eb2c memorymanager: areMemoryStatesEqual helper
Perform the memoryStates comparison in a helper function.

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2024-10-29 14:22:04 +02:00
Marek Siarkowicz
50d2fab279 Implement btree based storage indexer 2024-10-29 13:13:21 +01:00
Maciej Szulik
97fcb05374
Provide link with e2e guidelines when verity-test-code.sh fails
Signed-off-by: Maciej Szulik <soltysh@gmail.com>
2024-10-29 13:07:05 +01:00
NoicFank
68f7a7c682 bugfix(scheduler): preemption picks wrong victim node with higher priority pod on it.
Introducing PDB handling into preemption disrupted the ordering of pods in the victims list,
which could lead to picking the wrong victim node, one with a higher-priority pod on it.
2024-10-29 19:50:55 +08:00
Talor Itzhak
7476f46d71 memorymanager: fix checkpoint file comparison
For a resource within a group, such as memory,
we should validate the total `Free` and total `Reserved` size of the expected `machineState` against the state restored from the checkpoint file after kubelet start.
If total `Free` and total `Reserved` are equal, the restored state is valid.

The old comparison, however, was done by reflection.

There are times when the memory accounting is equal
but the allocations across the NUMA nodes vary.

In such cases we still need to consider the states equal.

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2024-10-29 12:10:27 +02:00
holder
6709317ae2 chore: optimize code logic
(cherry picked from commit 91a9a195ac0fe0e31301dc60af0ea868fc4756ff)
2024-10-29 12:08:28 +02:00
holder
6d7a1226d5 update the test case name
(cherry picked from commit de033352079c7d87417f88f073d6b7891e51e590)
2024-10-29 12:08:23 +02:00