124107 Commits

Author SHA1 Message Date
Patrick Ohly
599fe605f9 DRA scheduler: adapt to v1alpha3 API
The structured parameter allocation logic was written from scratch in
staging/src/k8s.io/dynamic-resource-allocation/structured where it might be
useful for out-of-tree components.

Besides the new features (amount, admin access) and API it now supports
backtracking when the initial device selection doesn't lead to a complete
allocation of all claims.

Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>
Co-authored-by: John Belamaric <jbelamaric@google.com>
2024-07-22 18:09:34 +02:00
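The backtracking mentioned in 599fe605f9 can be illustrated with a minimal sketch. This is not the code from staging/src/k8s.io/dynamic-resource-allocation/structured: the `request` type and the `allocate` function are hypothetical, devices are modeled as plain strings, and each request is assumed to need exactly one device.

```go
package main

import "fmt"

// request is a hypothetical stand-in for a device request: it lists the
// devices that would satisfy it.
type request struct {
	name       string
	candidates []string
}

// allocate tries to give each request one distinct device. When a choice
// made for an earlier request leaves a later request without a free device,
// it undoes that choice and tries the next candidate (backtracking).
func allocate(reqs []request, used map[string]bool, result map[string]string) bool {
	if len(reqs) == 0 {
		return true // every request got a device
	}
	r := reqs[0]
	for _, dev := range r.candidates {
		if used[dev] {
			continue
		}
		used[dev] = true
		result[r.name] = dev
		if allocate(reqs[1:], used, result) {
			return true
		}
		// Dead end: release the device and try the next candidate.
		delete(used, dev)
		delete(result, r.name)
	}
	return false
}

func main() {
	reqs := []request{
		{name: "a", candidates: []string{"gpu-0", "gpu-1"}},
		{name: "b", candidates: []string{"gpu-0"}},
	}
	result := map[string]string{}
	ok := allocate(reqs, map[string]bool{}, result)
	fmt.Println(ok, result["a"], result["b"]) // prints "true gpu-1 gpu-0"
}
```

In this example the first pick for "a" (gpu-0) starves "b", so the search releases gpu-0 and retries with gpu-1. A greedy allocator without backtracking would report failure here.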
Patrick Ohly
0fc78b9bcc DRA resource claim controller: update test
The resource claim controller is completely agnostic to the claim spec. It
doesn't care about classes or devices, therefore it needs no changes in 1.31
besides the v1alpha2 -> v1alpha3 renaming from a previous commit.
2024-07-22 18:09:34 +02:00
Patrick Ohly
c526d7796e DRA e2e: use VAP to control "admin access" permissions
The advantages of using a validating admission policy (VAP) are that no changes
are needed in Kubernetes and that admins have full flexibility in whether and
how they control which users are allowed to use "admin access" in their
requests.

The downside is that without admins taking actions, the feature is enabled
out-of-the-box in a cluster. Documentation for DRA will have to make it very
clear that something needs to be done in multi-tenant clusters.

The test/e2e/testing-manifests/dra/admin-access-policy.yaml shows how to do
this. The corresponding E2E test ensures that it actually works as intended.

For some reason, adding the namespace to the message expression leads to
type check errors, so it's currently commented out.
2024-07-22 18:09:34 +02:00
Patrick Ohly
0b62bfb690 DRA e2e: adapt to v1alpha3 API 2024-07-22 18:09:34 +02:00
Patrick Ohly
877829aeaa DRA kubelet: adapt to v1alpha3 API
This adds the ability to select specific requests inside a claim for a
container.

NodePrepareResources is always called, even if the claim is not used by any
container. This could be useful for drivers where that call has some effect
other than injecting CDI device IDs into containers. It also ensures that
drivers can validate configs.

The pod resource API can no longer report a class for each claim because there
is no such 1:1 relationship anymore. Instead, that API reports claim,
API devices (with driver/pool/device as ID) and CDI device IDs. The kubelet
itself doesn't extract that information from the claim. Instead, it relies on
drivers to report this information when the claim gets prepared. This isolates
the kubelet from API changes.

Because of a faulty E2E test, kubelet was told to contact the wrong driver for
a claim. This was not visible in the kubelet log output. Now changes to the
claim info cache are getting logged. While at it, naming of variables and some
existing log output gets harmonized.

Co-authored-by: Oksana Baranova <oksana.baranova@intel.com>
Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>
2024-07-22 18:09:34 +02:00
Patrick Ohly
20f98f3a2f DRA: update helper packages
Publishing ResourceSlices now supports network-attached devices and the new
v1alpha3 API. The logic for splitting up across different slices is missing.
2024-07-22 18:09:34 +02:00
Patrick Ohly
91d7882e86 DRA: new API for 1.31
This is a complete revamp of the original API. Some of the key
differences:
- refocused on structured parameters and allocating devices
- support for constraints across devices
- support for allocating "all" or a fixed amount
  of similar devices in a single request
- no class for ResourceClaims, instead individual
  device requests are associated with a mandatory
  DeviceClass

For the sake of simplicity, optional basic types (ints, strings) where the null
value is the default are represented as values in the API types. This makes Go
code simpler because it doesn't have to check for nil (consumers) and values
can be set directly (producers). The effect is that in protobuf, these fields
always get encoded because `opt` only has an effect for pointers.

The roundtrip test data for v1.29.0 and v1.30.0 changes because of the new
"request" field. This is considered acceptable because the entire `claims`
field in the pod spec is still alpha.

The implementation is complete enough to bring up the apiserver.
Adapting other components follows.
2024-07-22 18:09:34 +02:00
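The trade-off described in 91d7882e86 for optional basic types can be sketched in plain Go. The types `withPointer` and `withValue` are hypothetical and not part of the real API; they model the same optional field once as a pointer and once as a value whose zero value is the default.

```go
package main

import "fmt"

// Hypothetical types, not the real API: the same optional field modeled
// once as a pointer and once as a plain value.
type withPointer struct {
	Count *int64
}

type withValue struct {
	Count int64
}

// countFromPointer (consumer) has to guard against nil before reading.
func countFromPointer(s withPointer) int64 {
	if s.Count == nil {
		return 0 // default
	}
	return *s.Count
}

// countFromValue (consumer) reads the field directly: the zero value is
// the default, so no nil check is needed.
func countFromValue(s withValue) int64 {
	return s.Count
}

func main() {
	// Producers need an addressable variable for the pointer variant...
	n := int64(2)
	fmt.Println(countFromPointer(withPointer{}), countFromPointer(withPointer{Count: &n}))
	// ...but can set the value variant directly.
	fmt.Println(countFromValue(withValue{}), countFromValue(withValue{Count: 2}))
}
```

The value form cannot tell "unset" apart from "explicitly set to the default", which is acceptable only because the two mean the same thing for these fields.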
Patrick Ohly
bcececadfb CEL: add QuantityDeclType
Most functions in k8s.io/apiserver/pkg/cel work with DeclType for type
definitions, which made the existing QuantityType unusable with them. The new
QuantityDeclType fills that gap.
2024-07-21 17:28:14 +02:00
Patrick Ohly
62d21589ef api test: update TestDefaulting
Logging and sub-tests were added to help debug this problem:
the test passes for ResourceClaim (same defaulting!) and fails
for the list, but only if run together with the other test cases?!

    $ go test ./pkg/api/testing
    --- FAIL: TestDefaulting (1.76s)
        --- FAIL: TestDefaulting/resource.k8s.io/v1alpha3,_Kind=ResourceClaimList (0.01s)
            defaulting_test.go:238: expected resource.k8s.io/v1alpha3, Kind=ResourceClaimList to trigger defaulting due to fuzzing
    FAIL
    FAIL	k8s.io/kubernetes/pkg/api/testing	17.294s
    FAIL
    $ go test -run=TestDefaulting/resource.k8s.io/v1alpha3,_Kind=ResourceClaimList ./pkg/api/testing
    ok  	k8s.io/kubernetes/pkg/api/testing	0.062s

What fixed that problem was increasing the likelihood of generating the right
test object by iterating more often before giving up.
2024-07-21 17:28:14 +02:00
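The fix in 62d21589ef, iterating more often before giving up, follows a general pattern that a small sketch can show. The helper `generateUntil` and the integer generator are hypothetical stand-ins; the real test fuzzes API objects and checks whether defaulting is triggered.

```go
package main

import (
	"fmt"
	"math/rand"
)

// generateUntil calls gen up to maxAttempts times and returns the first
// value accepted by ok. The boolean reports whether such a value was found.
func generateUntil(gen func() int, ok func(int) bool, maxAttempts int) (int, bool) {
	for i := 0; i < maxAttempts; i++ {
		if v := gen(); ok(v) {
			return v, true
		}
	}
	return 0, false
}

func main() {
	rng := rand.New(rand.NewSource(1))
	gen := func() int { return rng.Intn(1000) }
	// Stand-in for "the fuzzed object has a field that triggers defaulting":
	// only about 1% of the generated values qualify.
	rare := func(v int) bool { return v%100 == 0 }

	_, foundWithFew := generateUntil(gen, rare, 3)
	_, foundWithMany := generateUntil(gen, rare, 10000)
	fmt.Println(foundWithFew, foundWithMany)
}
```

With a small attempt budget the outcome depends on the state of the random source, which is one way a test can pass in isolation and fail when run alongside others. A larger budget makes a miss very unlikely but does not rule it out.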
Patrick Ohly
8a629b9f15 DRA: remove "sharable" from claim allocation result
Now all claims are shareable up to the limit imposed by the size of the
"reservedFor" array.

This is one of the agreed simplifications for 1.31.
2024-07-21 17:28:14 +02:00
Patrick Ohly
de5742ae83 DRA: remove immediate allocation
As agreed in https://github.com/kubernetes/enhancements/pull/4709, immediate
allocation is one of those features which can be removed because it makes no
sense for structured parameters and the justification for classic DRA is weak.
2024-07-21 17:28:14 +02:00
Patrick Ohly
b51d68bb87 DRA: bump API v1alpha2 -> v1alpha3
This is in preparation for revamping the resource.k8s.io completely. Because
there will be no support for transitioning from v1alpha2 to v1alpha3, the
roundtrip test data for that API in 1.29 and 1.30 gets removed.

Repeating the version in the import name of the API packages is not really
required. It was done for a while to support simpler grepping for usage of
alpha APIs, but there are better ways for that now. So during this transition,
"resourceapi" gets used instead of "resourcev1alpha3" and the version gets
dropped from informer and lister imports. The advantage is that the next bump
to v1beta1 will affect fewer source code lines.

Only source code where the version really matters (like API registration)
retains the versioned import.
2024-07-21 17:28:13 +02:00
Kubernetes Prow Robot
815efa2baa
Merge pull request #126250 from my-git9/pkiutil-consot
kubeadm: remove unused constants in util/pkiutil
2024-07-21 03:02:57 -07:00
Kubernetes Prow Robot
10496b35a8
Merge pull request #126015 from micahhausler/kubelet-cert-validation
Enhance node admission to validate kubelet CSR's CN
2024-07-20 21:27:42 -07:00
Kubernetes Prow Robot
558c9536a1
Merge pull request #123678 from kinvolk/userns-use-kubelet-user-mappings
kubelet: Add logs for userns custom mappings parsing
2024-07-20 19:59:57 -07:00
Micah Hausler
b251efe0ad Enhance node admission to validate kubelet CSR's CN
Signed-off-by: Micah Hausler <mhausler@amazon.com>
2024-07-20 19:06:00 -05:00
Kubernetes Prow Robot
b14769f2af
Merge pull request #126224 from neolit123/1.31-fix-bug-in-join-patches-healthz
kubeadm: fix join bug where kubeletconfig was not patched in memory
2024-07-20 14:27:24 -07:00
Kubernetes Prow Robot
90a84704d6
Merge pull request #126231 from seans3/websocket-https-proxy-fix
Falls back to SPDY for gorilla/websocket https proxy error
2024-07-20 13:23:16 -07:00
Kubernetes Prow Robot
8527092e02
Merge pull request #119024 from wafuwafu13/deprecated-node-label
chore(node/util): add more labels to `deprecatedNodeLabels`
2024-07-20 11:31:40 -07:00
Lubomir I. Ivanov
b90b280c5a kubeadm: fix join bug where kubeletconfig was not patched in memory
During kubeadm join in 1.30 kubeadm started respecting
the kubeletconfiguration healthz address/port. Previously
it hardcoded the health check to localhost:defaultport.

A corner case was not handled where the user applies --patches
on join to modify the local kubeletconfiguration. This results
in kubeletconfiguration patch target patches not being applied to
the KubeletConfiguration in memory and the health check
running on the address:port which are present in the kubelet-config
configmap.

Fix that by explicitly calling a new function to patch the
KubeletConfiguration in memory. This is scoped to only handle
the healthz checks *after* the kubelet config.yaml was already
patched and written to disk.
2024-07-20 19:31:19 +03:00
xin.li
c1dca0ad7c kubeadm: remove unused constants in util/pkiutil
Signed-off-by: xin.li <xin.li@daocloud.io>
2024-07-20 23:30:25 +08:00
Kubernetes Prow Robot
892acaa6a7
Merge pull request #126107 from enj/enj/i/svm_not_found_err
svm: set UID and RV on SSA patch to cause conflict on logical create
2024-07-20 08:18:01 -07:00
Sean Sullivan
bc52647251 moving for easier cherry-pick 2024-07-20 05:29:57 -07:00
Kubernetes Prow Robot
b293ca9057
Merge pull request #126229 from aojea/network_policies_0.5.0
bump kube-network-policies to v0.5.0
2024-07-20 05:13:54 -07:00
Kubernetes Prow Robot
f2f7708375
Merge pull request #126244 from googs1025/informer
chore(servicecidr): use WaitForCacheSync after sharedInformerFactory Start in integration test
2024-07-20 03:11:39 -07:00
googs1025
bc514ff68b chore: remove t.Fatal typo 2024-07-20 16:19:47 +08:00
googs1025
a6ee8599f1 chore: use WaitForCacheSync method after sharedInformerFactory Start 2024-07-20 16:17:57 +08:00
Sean Sullivan
9d560540c5 Falls back to SPDY for gorilla/websocket https proxy error 2024-07-20 00:10:32 -07:00
Kubernetes Prow Robot
8f265b6305
Merge pull request #126136 from cici37/removeFG
Remove feature gate CustomResourceValidationExpressions
2024-07-20 00:08:52 -07:00
Kubernetes Prow Robot
a8d354bf39
Merge pull request #126122 from HirazawaUi/remove-unused-options
kubelet: Remove unused run container options
2024-07-19 18:05:16 -07:00
Kubernetes Prow Robot
14b34fc255
Merge pull request #125834 from tallclair/log-cleanup
[kubelet] Cleanup incorrect log about static pod status change
2024-07-19 16:58:54 -07:00
Kubernetes Prow Robot
64ba17c605
Merge pull request #125571 from liggitt/filter-auth-02-sar
add field and label selectors to authorization
2024-07-19 15:30:01 -07:00
Kubernetes Prow Robot
ec8015daac
Merge pull request #124273 from panoswoo/fix/124255
Remove missing extended resources from init containers
2024-07-19 15:29:53 -07:00
Kubernetes Prow Robot
fa15f12fb5
Merge pull request #126174 from dobsonj/corruptedmnt-enodev
mount-utils: treat syscall.ENODEV as corrupted mount
2024-07-19 13:08:48 -07:00
Jordan Liggitt
5f22dd7c1a
Add integration test exercising webhook selector authz 2024-07-19 15:06:52 -04:00
Jordan Liggitt
9f8f36708a
Fixup lint warning 2024-07-19 15:06:52 -04:00
Jordan Liggitt
4d535db8be
Add selector authorization to the Node authorizer 2024-07-19 15:06:51 -04:00
Jordan Liggitt
a1398a8cca
Add structured labelSelector / fieldSelector to authorization webhook match conditions 2024-07-19 15:06:50 -04:00
Jordan Liggitt
83bd512861
Adjust CEL cost calculation and versioning for authorization library 2024-07-19 15:06:49 -04:00
David Eads
be2e32fa3e
Add CEL fieldSelector / labelSelector support to authorizer library 2024-07-19 15:06:49 -04:00
Jordan Liggitt
03d48b7683
Move CEL env initialization out of package init()
This ensures compatibility version and feature gates can be initialized
before cached CEL environments are created.
2024-07-19 15:06:48 -04:00
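The change in 03d48b7683 replaces eager construction in package init() with construction on first use. A minimal sketch of that pattern, assuming a hypothetical `env` type and a `featureEnabled` variable standing in for feature gates and the compatibility version (none of these names come from the actual code):

```go
package main

import (
	"fmt"
	"sync"
)

// featureEnabled stands in for state (feature gates, compatibility version)
// that is configured during program startup, after package init() has run.
var featureEnabled bool

// env is a hypothetical stand-in for an expensive, cached CEL environment.
type env struct {
	selectorsEnabled bool
}

var (
	envOnce   sync.Once
	cachedEnv *env
)

// getEnv builds the environment on first use instead of in init(), so it
// sees whatever configuration was applied before the first call.
func getEnv() *env {
	envOnce.Do(func() {
		cachedEnv = &env{selectorsEnabled: featureEnabled}
	})
	return cachedEnv
}

func main() {
	// Startup code configures the feature first...
	featureEnabled = true
	// ...and only then is the cached environment created.
	fmt.Println(getEnv().selectorsEnabled) // prints "true"
}
```

Had the environment been built in init(), it would have captured `featureEnabled` as false, before startup code had any chance to set it.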
Jordan Liggitt
1d2ad282cf
Improve CEL cost tests to catch unhandled estimates or types 2024-07-19 15:06:47 -04:00
David Eads
92e3445e9d
add field and label selectors to authorization attributes
Co-authored-by: Jordan Liggitt <liggitt@google.com>
2024-07-19 15:06:47 -04:00
Kubernetes Prow Robot
b3e769b72e
Merge pull request #126228 from googs1025/fix_informer
chore(Job): make trivial improvements to job controller unit test
2024-07-19 12:03:24 -07:00
Kubernetes Prow Robot
6f3f115378
Merge pull request #126222 from macsko/dont_lock_activeq_twice_in_activate_in_scheduling_queue
Don't lock activeQ twice when activating pod in scheduling queue
2024-07-19 12:03:10 -07:00
David Eads
f5e5bef2e0
generate 2024-07-19 14:35:37 -04:00
David Eads
90f0b88b6a
add subjectaccessreview field and label selectors
Co-authored-by: Jordan Liggitt <liggitt@google.com>
2024-07-19 14:34:49 -04:00
Kubernetes Prow Robot
acaec0c23a
Merge pull request #126124 from cici37/feature/validating-admission-policy/metrics-improvement
Feature/validating admission policy/metrics improvement
2024-07-19 10:34:58 -07:00
Kubernetes Prow Robot
ce961fdc84
Merge pull request #125165 from carlory/clean-volume-util
remove unused functions in volume/util
2024-07-19 10:34:45 -07:00
Antonio Ojea
0c10b4534c bump kube-network-policies to v0.5.0 2024-07-19 16:55:47 +00:00