kubernetes

mirror of https://github.com/k3s-io/kubernetes.git synced 2025-09-21 01:50:55 +00:00

Author	SHA1	Message	Date
Patrick Ohly	299ecde5cc	DRA quota: add ResourceClaim v1.ResourceQuota limits Dynamic resource allocation is similar to storage in the sense that users create ResourceClaim objects to request resources, same as with persistent volume claims. The actual resource usage is only known when allocating claims, but some limits can already be enforced at admission time: - "count/resourceclaims.resource.k8s.io" limits the number of ResourceClaim objects in a namespace; this is a generic feature that is already supported also without this commit. - "resourceclaims" is not an alias - use "count/resourceclaims.resource.k8s.io" instead. - <device-class-name>.deviceclass.resource.k8s.io/devices limits the number of ResourceClaim objects in a namespace such that the number of devices requested through those objects with that class does not exceed the limit. A single request may cause the allocation of multiple devices. For exact counts, the quota limit is based on the sum of those exact counts. For requests asking for "all" matching devices, the maximum number of allocated devices per claim is used as a worst-case upper bound. Requests asking for "admin access" contribute to the quota. DRA quota: remove admin mode exception	2024-07-23 18:52:34 +02:00
Patrick Ohly	1f43a80b3c	DRA quota: unit test case for resource.k8s.io quota names The names aren't actually special for validation. They are acceptable with and without the feature gate; the only difference is that they don't do anything when the feature is disabled.	2024-07-23 18:52:33 +02:00
Patrick Ohly	b5c94966bd	DRA e2e: fix the quota name The actual name has the k8s.io suffix.	2024-07-23 18:52:33 +02:00
Patrick Ohly	eaa1cad7fa	resource quota: clone PVC quota evaluator for DRA	2024-07-22 21:20:08 +02:00
Kubernetes Prow Robot	d21b17264e	Merge pull request #125488 from pohly/dra-1.31 DRA for 1.31	2024-07-22 11:45:55 -07:00
Kubernetes Prow Robot	f458a749e7	Merge pull request #125277 from iholder101/swap/skip_critical_pods [KEP-2400]: Restrict access to swap for containers in high priority Pods	2024-07-22 11:45:48 -07:00
Kubernetes Prow Robot	887def08b6	Merge pull request #126237 from cici37/promoteMetrics Promote metrics for VAP and CRD validation rules to beta.	2024-07-22 10:17:49 -07:00
Kubernetes Prow Robot	0caeba5cbe	Merge pull request #126204 from vrutkovs/unsafeRecordQueried-atomicPointer feature_gate: avoid extra copy when queried feature is already stored, use Set instead of map	2024-07-22 09:09:42 -07:00
Patrick Ohly	d11b58efe6	DRA kubelet: refactor gRPC call timeouts Some of the E2E node tests were flaky. Their timeout apparently was chosen under the assumption that kubelet would retry immediately after a failed gRPC call, with a factor of 2 as safety margin. But according to `0449cef8fd`, kubelet has a different, higher retry period of 90 seconds, which was exactly the test timeout. The test timeout has to be higher than that. As the tests don't use the gRPC call timeout anymore, it can be made private. While at it, the name and documentation get updated.	2024-07-22 18:09:34 +02:00
Patrick Ohly	357a2926a1	DRA e2e: update VAP for a kubelet plugin This fixes the message (node name and "cluster-scoped" were switched) and simplifies the VAP: - a single matchCondition short circuits completely unless they're a user we care about - variables to extract the userNodeName and objectNodeName once (using optionals to gracefully turn missing claims and fields into empty strings) - leaves very tiny concise validations Co-authored-by: Jordan Liggitt <liggitt@google.com>	2024-07-22 18:09:34 +02:00
Patrick Ohly	9f36c8d718	DRA: add DRAControlPlaneController feature gate for "classic DRA" In the API, the effect of the feature gate is that alpha fields get dropped on create. They get preserved during updates if already set. The PodSchedulingContext registration is not restricted by the feature gate. This enables deleting stale PodSchedulingContext objects after disabling the feature gate. The scheduler checks the new feature gate before setting up an informer for PodSchedulingContext objects and when deciding whether it can schedule a pod. If any claim depends on a control plane controller, the scheduler bails out, leading to: Status: Pending ... Warning FailedScheduling 73s default-scheduler 0/1 nodes are available: resourceclaim depends on disabled DRAControlPlaneController feature. no new claims to deallocate, preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling. The rest of the changes prepare for testing the new feature separately from "structured parameters". The goal is to have base "dra" jobs which just enable and test those, then "classic-dra" jobs which add DRAControlPlaneController.	2024-07-22 18:09:34 +02:00
Patrick Ohly	599fe605f9	DRA scheduler: adapt to v1alpha3 API The structured parameter allocation logic was written from scratch in staging/src/k8s.io/dynamic-resource-allocation/structured where it might be useful for out-of-tree components. Besides the new features (amount, admin access) and API it now supports backtracking when the initial device selection doesn't lead to a complete allocation of all claims. Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com> Co-authored-by: John Belamaric <jbelamaric@google.com>	2024-07-22 18:09:34 +02:00
Patrick Ohly	0fc78b9bcc	DRA resource claim controller: update test The resource claim controller is completely agnostic to the claim spec. It doesn't care about classes or devices, therefore it needs no changes in 1.31 besides the v1alpha2 -> v1alpha3 renaming from a previous commit.	2024-07-22 18:09:34 +02:00
Patrick Ohly	c526d7796e	DRA e2e: use VAP to control "admin access" permissions The advantages of using a validation admission policy (VAP) are that no changes are needed in Kubernetes and that admins have full flexibility if and how they want to control which users are allowed to use "admin access" in their requests. The downside is that without admins taking actions, the feature is enabled out-of-the-box in a cluster. Documentation for DRA will have to make it very clear that something needs to be done in multi-tenant clusters. The test/e2e/testing-manifests/dra/admin-access-policy.yaml shows how to do this. The corresponding E2E test ensures that it actually works as intended. For some reason, adding the namespace to the message expression leads to type check errors, so it's currently commented out.	2024-07-22 18:09:34 +02:00
Patrick Ohly	0b62bfb690	DRA e2e: adapt to v1alpha3 API	2024-07-22 18:09:34 +02:00
Patrick Ohly	877829aeaa	DRA kubelet: adapt to v1alpha3 API This adds the ability to select specific requests inside a claim for a container. NodePrepareResources is always called, even if the claim is not used by any container. This could be useful for drivers where that call has some effect other than injecting CDI device IDs into containers. It also ensures that drivers can validate configs. The pod resource API can no longer report a class for each claim because there is no such 1:1 relationship anymore. Instead, that API reports claim, API devices (with driver/pool/device as ID) and CDI device IDs. The kubelet itself doesn't extract that information from the claim. Instead, it relies on drivers to report this information when the claim gets prepared. This isolates the kubelet from API changes. Because of a faulty E2E test, kubelet was told to contact the wrong driver for a claim. This was not visible in the kubelet log output. Now changes to the claim info cache are getting logged. While at it, naming of variables and some existing log output gets harmonized. Co-authored-by: Oksana Baranova <oksana.baranova@intel.com> Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>	2024-07-22 18:09:34 +02:00
Patrick Ohly	20f98f3a2f	DRA: update helper packages Publishing ResourceSlices now supports network-attached devices and the new v1alpha3 API. The logic for splitting up across different slices is missing.	2024-07-22 18:09:34 +02:00
Patrick Ohly	91d7882e86	DRA: new API for 1.31 This is a complete revamp of the original API. Some of the key differences: - refocused on structured parameters and allocating devices - support for constraints across devices - support for allocating "all" or a fixed amount of similar devices in a single request - no class for ResourceClaims, instead individual device requests are associated with a mandatory DeviceClass For the sake of simplicity, optional basic types (ints, strings) where the null value is the default are represented as values in the API types. This makes Go code simpler because it doesn't have to check for nil (consumers) and values can be set directly (producers). The effect is that in protobuf, these fields always get encoded because `opt` only has an effect for pointers. The roundtrip test data for v1.29.0 and v1.30.0 changes because of the new "request" field. This is considered acceptable because the entire `claims` field in the pod spec is still alpha. The implementation is complete enough to bring up the apiserver. Adapting other components follows.	2024-07-22 18:09:34 +02:00
Kubernetes Prow Robot	af71138323	Merge pull request #124837 from carlory/rm-FindCreatablePluginBySpec remove unused FindCreatablePluginBySpec	2024-07-22 08:01:54 -07:00
Kubernetes Prow Robot	3f933ef262	Merge pull request #124053 from PichuChen/patch-1 Fix a typo	2024-07-22 08:01:40 -07:00
Itamar Holder	a6df16af85	node e2e test: exclude critical pods from swapping Signed-off-by: Itamar Holder <iholder@redhat.com>	2024-07-22 17:56:52 +03:00
Itamar Holder	6c1f14c468	unit tests: exclude critical pods from swapping Signed-off-by: Itamar Holder <iholder@redhat.com>	2024-07-22 17:56:52 +03:00
Itamar Holder	532cd5f84c	Exclude critical pods from having swap access Signed-off-by: Itamar Holder <iholder@redhat.com>	2024-07-22 17:56:52 +03:00
Kubernetes Prow Robot	8b8f84c6a7	Merge pull request #125862 from sanposhiho/cleanup-nominated cleanup: remove duplicated AddNominatedPod	2024-07-22 06:50:03 -07:00
Kubernetes Prow Robot	1f436e0fba	Merge pull request #124108 from carlory/update-test-InTreePluginXXXUnregister update unit test for adc to test volume migration	2024-07-22 06:49:49 -07:00
杨朱 · Kiki	bc3c07091b	Fix a bug where the target pod doesn't become schedulable within 5 minutes when a deleted pod uses the same PVC with the ReadWriteOncePod access mode. (#126263) Co-authored-by: Kensei Nakada <handbomusic@gmail.com>	2024-07-22 01:20:34 -07:00
Kubernetes Prow Robot	00d03ec049	Merge pull request #126259 from liggitt/node-get-authz Authorize Node reads via name, not graph	2024-07-21 13:08:21 -07:00
Jordan Liggitt	c75c07c8e1	Authorize Node reads via name, not graph	2024-07-21 15:01:46 -04:00
Kubernetes Prow Robot	69eee1c4a2	Merge pull request #126149 from sttts/sttts-aggregator-availability-controller-split Step 11 - Split aggregator availability controller into local and remote part	2024-07-21 09:54:46 -07:00
Dr. Stefan Schimanski	b27142852f	test/integration: adapt numbers in TestAPIServerTransportMetrics with less rest client creations Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>	2024-07-21 17:41:50 +02:00
Dr. Stefan Schimanski	834cd7ca4a	aggregator: split availability controller into local and remote part Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>	2024-07-21 17:31:24 +02:00
Patrick Ohly	bcececadfb	CEL: add QuantityDeclType Most functions in k8s.io/apiserver/pkg/cel work with DeclType for type definitions, which made the existing QuantityType unusable with them. The new QuantityDeclType fills that gap.	2024-07-21 17:28:14 +02:00
Patrick Ohly	62d21589ef	api test: update TestDefaulting Logging and sub-tests were added to help debug this problem: the test passes for ResourceClaim (same defaulting!) and fails for the list, but only if run together with the other test cases?! $ go test ./pkg/api/testing --- FAIL: TestDefaulting (1.76s) --- FAIL: TestDefaulting/resource.k8s.io/v1alpha3,_Kind=ResourceClaimList (0.01s) defaulting_test.go:238: expected resource.k8s.io/v1alpha3, Kind=ResourceClaimList to trigger defaulting due to fuzzing FAIL FAIL k8s.io/kubernetes/pkg/api/testing 17.294s FAIL $ go test -run=TestDefaulting/resource.k8s.io/v1alpha3,_Kind=ResourceClaimList ./pkg/api/testing ok k8s.io/kubernetes/pkg/api/testing 0.062s What fixed that problem was increasing the likelihood of generating the right test object by iterating more often before giving up.	2024-07-21 17:28:14 +02:00
Patrick Ohly	8a629b9f15	DRA: remove "shareable" from claim allocation result Now all claims are shareable up to the limit imposed by the size of the "reservedFor" array. This is one of the agreed simplifications for 1.31.	2024-07-21 17:28:14 +02:00
Patrick Ohly	de5742ae83	DRA: remove immediate allocation As agreed in https://github.com/kubernetes/enhancements/pull/4709, immediate allocation is one of those features which can be removed because it makes no sense for structured parameters and the justification for classic DRA is weak.	2024-07-21 17:28:14 +02:00
Patrick Ohly	b51d68bb87	DRA: bump API v1alpha2 -> v1alpha3 This is in preparation for revamping the resource.k8s.io API completely. Because there will be no support for transitioning from v1alpha2 to v1alpha3, the roundtrip test data for that API in 1.29 and 1.30 gets removed. Repeating the version in the import name of the API packages is not really required. It was done for a while to support simpler grepping for usage of alpha APIs, but there are better ways for that now. So during this transition, "resourceapi" gets used instead of "resourcev1alpha3" and the version gets dropped from informer and lister imports. The advantage is that the next bump to v1beta1 will affect fewer source code lines. Only source code where the version really matters (like API registration) retains the versioned import.	2024-07-21 17:28:13 +02:00
Dr. Stefan Schimanski	bbdc247406	aggregator: make linter happy Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>	2024-07-21 16:45:28 +02:00
Dr. Stefan Schimanski	b5759ad4f9	aggregator: (pre-)move availability controller Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>	2024-07-21 13:48:50 +02:00
Dr. Stefan Schimanski	c5095069a8	aggregator: separate out status controller metrics Signed-off-by: Dr. Stefan Schimanski <stefan.schimanski@gmail.com>	2024-07-21 13:48:49 +02:00
Kubernetes Prow Robot	815efa2baa	Merge pull request #126250 from my-git9/pkiutil-consot kubeadm: remove unused constants in util/pkiutil	2024-07-21 03:02:57 -07:00
Kensei Nakada	82a54e8cc8	cleanup: remove duplicated addNominatedPodUnlocked	2024-07-21 16:04:25 +09:00
Kubernetes Prow Robot	10496b35a8	Merge pull request #126015 from micahhausler/kubelet-cert-validation Enhance node admission to validate kubelet CSR's CN	2024-07-20 21:27:42 -07:00
Kubernetes Prow Robot	558c9536a1	Merge pull request #123678 from kinvolk/userns-use-kubelet-user-mappings kubelet: Add logs for userns custom mappings parsing	2024-07-20 19:59:57 -07:00
Micah Hausler	b251efe0ad	Enhance node admission to validate kubelet CSR's CN Signed-off-by: Micah Hausler <mhausler@amazon.com>	2024-07-20 19:06:00 -05:00
Kubernetes Prow Robot	b14769f2af	Merge pull request #126224 from neolit123/1.31-fix-bug-in-join-patches-healthz kubeadm: fix join bug where kubeletconfig was not patched in memory	2024-07-20 14:27:24 -07:00
Kubernetes Prow Robot	90a84704d6	Merge pull request #126231 from seans3/websocket-https-proxy-fix Falls back to SPDY for gorilla/websocket https proxy error	2024-07-20 13:23:16 -07:00
Kubernetes Prow Robot	8527092e02	Merge pull request #119024 from wafuwafu13/deprecated-node-label chore(node/util): add more labels to `deprecatedNodeLabels`	2024-07-20 11:31:40 -07:00
Lubomir I. Ivanov	b90b280c5a	kubeadm: fix join bug where kubeletconfig was not patched in memory During kubeadm join in 1.30 kubeadm started respecting the kubeletconfiguration healthz address/port. Previously it hardcoded the health check to localhost:defaultport. A corner case was not handled where the user applies --patches on join to modify the local kubeletconfiguration. This results in kubeletconfiguration patch target patches not being applied to the KubeletConfiguration in memory and the health check running on the address:port which are present in the kubelet-config configmap. Fix that by explicitly calling a new function to patch the KubeletConfiguration in memory. This is scoped to only handle the healthz checks after the kubelet config.yaml was already patched and written to disk.	2024-07-20 19:31:19 +03:00
xin.li	c1dca0ad7c	kubeadm: remove unused constants in util/pkiutil Signed-off-by: xin.li <xin.li@daocloud.io>	2024-07-20 23:30:25 +08:00
Kubernetes Prow Robot	892acaa6a7	Merge pull request #126107 from enj/enj/i/svm_not_found_err svm: set UID and RV on SSA patch to cause conflict on logical create	2024-07-20 08:18:01 -07:00

124141 Commits