Commit Graph

126703 Commits

Author SHA1 Message Date
Patrick Ohly
dd6ad66f5f prune-junit-xml: avoid appending semicolon
Appending a semicolon after some text is unnecessary if it's the last
entry. This led to visually distracting extra semicolons in Spyglass which
looked like a bug in Spyglass.

Now the code checks if a semicolon is necessary before inserting it.
2024-11-08 16:12:47 +01:00
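The idea behind the commit above can be sketched as follows. This is a minimal illustration, not the actual prune-junit-xml code: the helper name `joinEntries` and the "; " separator are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// joinEntries is a hypothetical helper. It writes the separator only
// between entries, so the last entry is never followed by a semicolon.
func joinEntries(entries []string) string {
	var sb strings.Builder
	for i, entry := range entries {
		if i > 0 {
			// A separator is only necessary when something precedes this entry.
			sb.WriteString("; ")
		}
		sb.WriteString(entry)
	}
	return sb.String()
}

func main() {
	fmt.Println(joinEntries([]string{"first failure", "second failure"}))
}
```

Checking before inserting, rather than trimming afterwards, keeps the output correct even when an entry itself ends in a semicolon.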
Kubernetes Prow Robot
e273349f3a
Merge pull request #127511 from pohly/dra-1.32-api
DRA 1.32 API: promotion to beta
2024-11-06 13:13:29 +00:00
Patrick Ohly
d6bad27b7d DRA apiserver: allow DRAAdminAccess feature without DynamicResourceAllocation
This makes a configuration with --feature-gates=AllAlpha=true valid
again. Without this change, that flag enabled DRAAdminAccess without
DynamicResourceAllocation being enabled (default off!) and the kube-apiserver
refused to start.

While DRAAdminAccess isn't usable without DynamicResourceAllocation, it's also
not really wrong to allow it - it simply won't matter.
2024-11-06 13:03:20 +01:00
Patrick Ohly
a1b8e9d3a7 DRA kubelet: increase plugin test coverage
Deleting slices was not covered to begin with and the recent registration
changes also could have been covered better. Now coverage is at 91%.
2024-11-06 13:03:20 +01:00
Patrick Ohly
1193ff1271 DRA driver: optionally support kubelet 1.31
Supporting the alpha gRPC interface isn't enough anymore to be compatible
with kubelet 1.31: the "supported versions" must contain version numbers,
otherwise the older kubelet refuses to register the driver.

With this change, a DRA driver can decide to support both kubelet 1.31 and
kubelet 1.32 by registering *only* the alpha gRPC interface (NodeV1alpha4(true)
and NodeV1beta1(false) as options for Start).

The default is to provide both interfaces and to use the registration mechanism
for 1.32, which makes DRA drivers compatible only with Kubernetes >= 1.32.
2024-11-06 13:03:20 +01:00
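The option-based selection described in the commit above might look roughly like the sketch below. Only the names `NodeV1alpha4`, `NodeV1beta1` and `Start` come from the commit message. The `config` type, the `Option` type, the signatures and the return value are assumptions made for this self-contained example and do not reproduce the real kubeletplugin package.

```go
package main

import "fmt"

// config is a hypothetical stand-in for the plugin's internal settings.
type config struct {
	nodeV1alpha4 bool
	nodeV1beta1  bool
}

// Option is a hypothetical functional option that modifies config.
type Option func(*config)

// NodeV1alpha4 enables or disables the alpha gRPC interface (assumed signature).
func NodeV1alpha4(enabled bool) Option {
	return func(c *config) { c.nodeV1alpha4 = enabled }
}

// NodeV1beta1 enables or disables the beta gRPC interface (assumed signature).
func NodeV1beta1(enabled bool) Option {
	return func(c *config) { c.nodeV1beta1 = enabled }
}

// Start applies the options on top of the default (both interfaces enabled)
// and reports which interfaces would be served. The real Start does far more;
// this stand-in only models the selection logic.
func Start(opts ...Option) []string {
	c := config{nodeV1alpha4: true, nodeV1beta1: true}
	for _, opt := range opts {
		opt(&c)
	}
	var served []string
	if c.nodeV1alpha4 {
		served = append(served, "v1alpha4")
	}
	if c.nodeV1beta1 {
		served = append(served, "v1beta1")
	}
	return served
}

func main() {
	fmt.Println("default:", Start())
	fmt.Println("kubelet 1.31 compatible:", Start(NodeV1alpha4(true), NodeV1beta1(false)))
}
```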
Patrick Ohly
2c23fe1b82 DRA kubelet: list supported gRPC services during registration
Listing supported gRPC services (e.g. drav1alpha3.Node, drav1beta1.DRAPlugin)
during registration enables the kubelet to determine in advance which methods
it can call.

Versioning by Kubernetes release makes less sense because it doesn't say
anything about which gRPC service is supported. New ones might get added and
obsolete ones removed. Some services might be optional.

In the past, this versioning support wasn't really used. At least one version
had to be provided and kubelet tried to use the plugin with the highest
version. This version comparison gets dropped. In the unlikely situation
that different plugins register under the same name, the most recent one is
used.

Because advertising gRPC services is a new convention, plugins only reporting
some version are treated as providing the old alpha gRPC service.
2024-11-06 13:03:20 +01:00
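A rough sketch of the fallback rule from the commit above: if a plugin advertises a known gRPC service, that service is used, and a plugin that only reports a version string is treated as providing the old alpha service. The function name `chooseService`, the service name constants and the error for an empty list are all assumptions for illustration. The kubelet's actual code and service names may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Placeholder service names, chosen for this example only.
const (
	alphaService = "example.v1alpha.Node"
	betaService  = "example.v1beta1.DRAPlugin"
)

// chooseService is a hypothetical helper that decides which gRPC service
// to call, based on what the plugin listed during registration.
func chooseService(supported []string) (string, error) {
	if len(supported) == 0 {
		// Assumption: at least one entry must be provided.
		return "", errors.New("no supported services or versions listed")
	}
	for _, s := range supported {
		if s == betaService {
			return betaService, nil
		}
	}
	// Either the alpha service was listed explicitly, or the plugin only
	// reported a version number (the old convention). Both map to alpha.
	return alphaService, nil
}

func main() {
	for _, supported := range [][]string{
		{alphaService, betaService},
		{"1.0.0"},
	} {
		service, err := chooseService(supported)
		fmt.Println(supported, "->", service, err)
	}
}
```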
Patrick Ohly
437be1e651 DRA kubelet: rename gRPC server from Node to DRAPlugin in v1beta1
The version bump is an opportunity to pick a name that is a bit more
descriptive. It matches the "DevicePlugin" service name.
2024-11-06 13:03:20 +01:00
Patrick Ohly
63a7865736 DRA CEL: properly define IntroducedVersion
Using 1.0 was a workaround to grant Kubernetes 1.31 access to things introduced
in that same release. In Kubernetes 1.32 we don't need that workaround anymore
because everything is still available after a downgrade and thus usable.
2024-11-06 13:03:20 +01:00
Patrick Ohly
ea51d975fc DRA: promote feature gate to beta 2024-11-06 13:03:20 +01:00
Patrick Ohly
30f5282656 DRA API: rename DeviceCapacity.Quantity to DeviceCapacity.Value
Based on review
feedback (https://github.com/kubernetes/kubernetes/pull/127511#discussion_r1823521172).
2024-11-06 13:03:20 +01:00
Patrick Ohly
0b8a85c54a update-openapi-spec.sh: enable all alpha and beta APIs
This becomes relevant once DynamicResourceAllocation becomes beta with
"disabled" as default. Otherwise AllAlpha=true enables DRAAdminAccess which
depends on DynamicResourceAllocation, which is disabled.
2024-11-06 13:03:19 +01:00
Patrick Ohly
33ea278c51 DRA: use v1beta1 API
No code is left which depends on v1alpha3, except of course the code
implementing that version.
2024-11-06 13:03:19 +01:00
Patrick Ohly
81fd64256c DRA API: use DeviceCapacity struct instead of plain Quantity
This enables a future extension where capacity of a single device gets consumed
by different claims. The semantics without any additional fields are the same as
before: a capacity cannot be split up and is only an attribute of a device.

Because it's semantically the same as before, two-way conversion to v1alpha3 is
possible.
2024-11-06 13:03:19 +01:00
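To illustrate why the conversion in the commit above is lossless in both directions, here is a simplified sketch. The types are stand-ins, not the real API types: `Quantity` is modeled as a plain string, the map-based layout is an assumption, and the field name `Value` follows the later rename in commit 30f5282656.

```go
package main

import "fmt"

// Quantity is a simplified stand-in for the real quantity type.
type Quantity string

// DeviceCapacity wraps a single quantity. With no other fields,
// it carries exactly the same information as a plain Quantity.
type DeviceCapacity struct {
	Value Quantity
}

// toBeta wraps each plain quantity in a DeviceCapacity struct.
func toBeta(in map[string]Quantity) map[string]DeviceCapacity {
	out := make(map[string]DeviceCapacity, len(in))
	for name, q := range in {
		out[name] = DeviceCapacity{Value: q}
	}
	return out
}

// toAlpha unwraps each DeviceCapacity back into a plain quantity.
func toAlpha(in map[string]DeviceCapacity) map[string]Quantity {
	out := make(map[string]Quantity, len(in))
	for name, c := range in {
		out[name] = c.Value
	}
	return out
}

func main() {
	alpha := map[string]Quantity{"memory": "40Gi"}
	beta := toBeta(alpha)
	fmt.Println(beta["memory"].Value, toAlpha(beta)["memory"])
}
```

If a future version adds fields to the struct, the unwrapping direction would no longer be lossless without extra handling, which is why the commit stresses that no additional fields exist yet.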
Patrick Ohly
142319bd92 DRA API: use v1beta1 as storage version
This is meant to make it easier to remove v1alpha3 because it won't be used
in clusters that started with DRA as beta in Kubernetes 1.32 when all clients
support v1beta1.
2024-11-06 13:03:19 +01:00
Patrick Ohly
0ee52b23cd DRA API: generated files 2024-11-06 13:03:19 +01:00
Patrick Ohly
2e64c72249 DRA API: register v1beta1
This is the minimal set of changes that are needed to make the new version
usable. The storage version is still v1alpha3. More changes will follow.
2024-11-06 13:03:18 +01:00
Patrick Ohly
584fdc9d1c DRA API: update lifecycle metadata
The tag is about the version/type combination, not just the type. The v1beta1
types will become deprecated automatically after three releases, starting in
1.32.

The v1alpha3 types get marked as replaced to ensure that the compatibility
version code doesn't force using v1alpha3 as storage
version (https://github.com/kubernetes/kubernetes/issues/128448).
2024-11-06 13:03:18 +01:00
Patrick Ohly
d685064ff7 DRA API: search/replace v1alpha3 -> v1beta1 2024-11-06 13:03:18 +01:00
Patrick Ohly
f1e5616f05 DRA API: verbatim copy of v1alpha3 -> v1beta1 2024-11-06 13:03:18 +01:00
Patrick Ohly
99acb67c68 DRA API: enhance validation testing
The line coverage is now at 98.5% and several more corner cases are
covered. The remaining lines are hard or impossible to reach.

The actual validation is the same as before, with some small tweaks to the
generated errors.

When failures are not as expected, it is useful to show what the expected and
actual failures look like to a user. Perhaps even better would be to put the
expected texts into the test files instead of the error structs. That would
be easier to review and shorter.
2024-11-06 13:03:18 +01:00
Kubernetes Prow Robot
50d0f920c0
Merge pull request #126750 from AMDEPYC/uncore_v1
Split L3 Cache Topology Awareness in CPU Manager
2024-11-06 11:13:29 +00:00
Patrick Ohly
51d5992335 DRA API: fix some comments
Wording in one case was wrong. The tombstone comment should use
the same field definition as before the removal.
2024-11-06 11:05:05 +01:00
Patrick Ohly
7b3a9afca3 DRA kubelet: add v1beta1 gRPC API
The v1beta1 API is identical to the previous v1alpha4, which erroneously was
still called "v1alpha3" in a few places, including the gRPC interface
definition itself.

The only reason for v1beta1 is to document the increased maturity of this API.

To simplify the transition, kubelet supports both v1alpha4 and v1beta1, picking
the more recent one automatically. All that DRA driver authors need to do to
implement v1beta1 is to update to the latest
k8s.io/dynamic-resource-allocation/kubeletplugin: it will automatically
register both API versions unless explicitly configured otherwise, which is
mostly just for testing.

DRA driver authors may replace their package import of v1alpha4 with v1beta1,
but they don't have to because the types in both packages are the same.
2024-11-06 11:05:05 +01:00
Kubernetes Prow Robot
f451aec237
Merge pull request #128296 from AnishShah/kubectl-resize
[FG:InPlacePodVerticalScaling] Remove restrictions on subresource flag in kubectl commands
2024-11-06 10:01:45 +00:00
Kubernetes Prow Robot
833ee8502e
Merge pull request #128194 from AnishShah/extended-resource
test: refactor logic to add/remove extended resources
2024-11-06 10:01:37 +00:00
Kubernetes Prow Robot
0fad78930f
Merge pull request #127904 from towca/jtuznik/dra-autoscaling
DRA: allow Cluster Autoscaler to integrate with DRA scheduler plugin
2024-11-06 10:01:29 +00:00
Kubernetes Prow Robot
ab4b869b52
Merge pull request #128590 from benluddy/protobuf-storage-integration-test
Add integration test for per-resource storage encoding.
2024-11-06 08:51:44 +00:00
Kubernetes Prow Robot
89c1925e23
Merge pull request #128582 from pohly/dra-resourceslice-unit-test-fix
DRA resource slice controller: fix unit test flake
2024-11-06 08:51:36 +00:00
Kubernetes Prow Robot
3dcad5f0db
Merge pull request #128532 from neolit123/1.32-handle-custom-addreses-comp-readyz
kubeadm: use advertise address for WaitForAllControlPlaneComponents
2024-11-06 08:51:29 +00:00
Anish Shah
e1ca63489f kubectl: remove subresource restrictions from all commands
Removing these restrictions will allow us to use these commands with the
new resize subresource.
2024-11-05 23:06:40 -08:00
Kubernetes Prow Robot
aafcf4e932
Merge pull request #128453 from tallclair/cacheless-pleg
Cleanup unused cacheless PLEG code
2024-11-06 06:59:35 +00:00
Kubernetes Prow Robot
648717cc74
Merge pull request #128266 from AnishShah/resize-subresource
[FG:InPlacePodVerticalScaling] Introduce /resize subresource to request pod resource resizing
2024-11-06 06:59:29 +00:00
Kubernetes Prow Robot
b631dae569
Merge pull request #128584 from thockin/compartmentalize_spew
Compartmentalize spew more
2024-11-06 04:19:51 +00:00
Kubernetes Prow Robot
a50b4e52a9
Merge pull request #128553 from thockin/master
Validation: merge TooLong and TooLongMaxLen
2024-11-06 04:19:43 +00:00
Kubernetes Prow Robot
5e0b818ff9
Merge pull request #128551 from tallclair/allocated-checkpoint
[FG:InPlacePodVerticalScaling] Don't checkpoint ResizeStatus
2024-11-06 04:19:36 +00:00
Kubernetes Prow Robot
bf75546494
Merge pull request #128432 from zhifei92/integrating-health-check
Integrate device plugin registration gRPC server health checks.
2024-11-06 04:19:29 +00:00
Ben Luddy
006146f58f
Add integration test for per-resource storage encoding. 2024-11-05 22:38:46 -05:00
Kubernetes Prow Robot
ce81cc70a6
Merge pull request #128403 from carlory/fix-128385
Fix failing test: PodRejectionStatus Kubelet should reject pod when the node didn't have enough resource
2024-11-06 02:29:36 +00:00
Kubernetes Prow Robot
8c5472ce66
Merge pull request #128189 from zylxjtu/bug
Fix the incorrect metrics setting/naming in nodeshutdown manager
2024-11-06 02:29:29 +00:00
Anish Shah
bfb0b83d45 update codegen 2024-11-06 01:43:50 +00:00
Anish Shah
e55bf09ca5 Fix unit tests 2024-11-06 01:33:16 +00:00
Anish Shah
5b5e4a87c3 apply feedback 2024-11-06 01:33:16 +00:00
Anish Shah
332d794559 remove redundant validation check for pod resize 2024-11-06 01:33:15 +00:00
Anish Shah
832d7f7dc2 apply feedback 2024-11-06 01:33:15 +00:00
Anish Shah
4c69bf2496 implement GetResetFieldsFilter
GetResetFieldsFilter returns the set of field filters that are reset
by the pod resize strategy. This is needed to make server-side apply
work correctly.
2024-11-06 01:33:15 +00:00
Anish Shah
0a80c5ecb7 better variable names 2024-11-06 01:33:15 +00:00
Anish Shah
79f45bce19 client-go: rename Resize to UpdateResize 2024-11-06 01:33:15 +00:00
Anish Shah
3b91edb660 unit tests to ensure pod metadata cannot be updated during resize. 2024-11-06 01:33:15 +00:00
Anish Shah
7ac302b47a test: cleanup validation tests 2024-11-06 01:33:15 +00:00
Anish Shah
dc3c4ed559 pod resize support in LimitRanger admission plugin 2024-11-06 01:33:15 +00:00