128084 Commits

Author SHA1 Message Date
Filipe Xavier
14c0bc19ac kubelet: improve allocated resources checkpointing
Changed the calls that set the allocation on the status manager from container level to pod level.
2025-02-06 09:20:39 -03:00
Kubernetes Prow Robot
491a23f079
Merge pull request #129999 from pohly/test-e2e-node-timeout
E2E node: fix --timeout default
2025-02-06 03:59:55 -08:00
Patrick Ohly
46a17f60e4 E2E node: fix --timeout default
For unknown reasons, hack/make-rules/test-e2e-node.sh adds -timeout instead of
--timeout. Therefore the fallback code in test/e2e_node/remote/remote.go didn't
find it and added its own --timeout=60m after it. This effectively limits E2E
node test runs to 60 minutes, regardless of what is specified in the job:

    W0206 09:53:51.425532    7151 remote.go:158] ginkgo flags are missing explicit --timeout (ginkgo defaults to 60 minutes)
    I0206 09:53:51.425565    7151 remote.go:165] updated ginkgo flags: -timeout=24h --label-filter="Feature: containsAny DynamicResourceAllocation && Feature: isSubsetOf { Beta, DynamicResourceAllocation } && !Flaky && !Slow"  --no-color -v --timeout=60m
    ...
    I0206 09:53:57.767096    7151 ssh.go:146] Running the command ssh, with args: ... timeout -k 30s 3600.000000s ./ginkgo -timeout=24h --label-filter="Feature: containsAny DynamicResourceAllocation && Feature: isSubsetOf { Beta, DynamicResourceAllocation } && !Flaky && !Slow"  --no-color -v --timeout=60m ...

Note that the timeout for the test was 60m in this case (hence the "timeout -k
30s 3600.000000s") but it could also be something larger.
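
The failure mode described above can be illustrated with a minimal Go sketch. This is not the code from test/e2e_node/remote/remote.go, and it is not necessarily how the commit fixed the problem. The function names naiveHasTimeoutFlag, hasTimeoutFlag, and ensureTimeout are hypothetical. The sketch assumes a check that only looks for the double-dash spelling, and contrasts it with one that accepts both -timeout and --timeout before appending a fallback.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveHasTimeoutFlag (hypothetical) only recognizes the double-dash spelling,
// so it misses "-timeout=24h".
func naiveHasTimeoutFlag(flags string) bool {
	return strings.Contains(flags, "--timeout")
}

// hasTimeoutFlag (hypothetical) accepts "-timeout" and "--timeout",
// with or without an "=value" suffix.
func hasTimeoutFlag(flags string) bool {
	for _, f := range strings.Fields(flags) {
		name := strings.TrimLeft(f, "-")
		if f == name {
			continue // no leading dash, so not a flag
		}
		if name == "timeout" || strings.HasPrefix(name, "timeout=") {
			return true
		}
	}
	return false
}

// ensureTimeout appends "--timeout=<fallback>" only if no timeout flag is present.
func ensureTimeout(flags, fallback string) string {
	if hasTimeoutFlag(flags) {
		return flags
	}
	return strings.TrimSpace(flags + " --timeout=" + fallback)
}

func main() {
	flags := "-timeout=24h --no-color -v"
	fmt.Println("naive check finds a timeout:", naiveHasTimeoutFlag(flags))
	fmt.Println("with explicit timeout:", ensureTimeout(flags, "60m"))
	fmt.Println("without explicit timeout:", ensureTimeout("--no-color -v", "60m"))
}
```

With the naive check, the single-dash flag goes unnoticed and a second, conflicting --timeout=60m is appended, which matches the log output quoted in the commit message.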
2025-02-06 11:45:12 +01:00
Kubernetes Prow Robot
9a03243789
Merge pull request #129929 from serathius/deprecate-separate-rpc
Flip SeparateCacheWatchRPC feature gate to false and deprecate it
2025-02-05 17:18:16 -08:00
Siyuan Zhang
8fc3a33454 Refactor compatibility version code
Replace DefaultComponentGlobalsRegistry with new instance of componentGlobalsRegistry in test api server.

Signed-off-by: Siyuan Zhang <sizhang@google.com>

move kube effective version validation out of component base.

Signed-off-by: Siyuan Zhang <sizhang@google.com>

move DefaultComponentGlobalsRegistry out of component base.

Signed-off-by: Siyuan Zhang <sizhang@google.com>

move ComponentGlobalsRegistry out of featuregate pkg.

Signed-off-by: Siyuan Zhang <sizhang@google.com>

remove usage of DefaultComponentGlobalsRegistry in test files.

Signed-off-by: Siyuan Zhang <sizhang@google.com>

change non-test DefaultKubeEffectiveVersion to use DefaultBuildEffectiveVersion.

Signed-off-by: Siyuan Zhang <sizhang@google.com>

Restore useDefaultBuildBinaryVersion in effective version.

Signed-off-by: Siyuan Zhang <sizhang@google.com>

rename DefaultKubeEffectiveVersion to DefaultKubeEffectiveVersionForTest.

Signed-off-by: Siyuan Zhang <sizhang@google.com>

pass options.ComponentGlobalsRegistry into config for controller manager and scheduler.

Signed-off-by: Siyuan Zhang <sizhang@google.com>

Pass apiserver effective version to DefaultResourceEncodingConfig.

Signed-off-by: Siyuan Zhang <sizhang@google.com>

change statusz registry to take effective version from the components.

Signed-off-by: Siyuan Zhang <sizhang@google.com>

Address review comments

Signed-off-by: Siyuan Zhang <sizhang@google.com>

update vendor

Signed-off-by: Siyuan Zhang <sizhang@google.com>
2025-02-05 16:10:53 -08:00
Kubernetes Prow Robot
22f25efc2c
Merge pull request #128991 from Henrywu573/cm-statuz
Add statusz endpoint for kube-controller-manager
2025-02-05 15:54:15 -08:00
Kubernetes Prow Robot
72d74869e9
Merge pull request #129114 from bart0sh/PR167-fix-DRA-registration-test
kubelet: fix DRA registration test
2025-02-05 14:38:26 -08:00
Kubernetes Prow Robot
0634e21fb5
Merge pull request #128367 from vivzbansal/sidecar-2
[FG:InPlacePodVerticalScaling] Implement resize for sidecar containers
2025-02-05 14:38:15 -08:00
Kubernetes Release Robot
e54be1e133 CHANGELOG: Update directory for v1.33.0-alpha.1 release 2025-02-05 20:11:40 +00:00
Kubernetes Prow Robot
925cf7db71
Merge pull request #129930 from serathius/deprecate-watch-from-storage
Deprecate WatchFromStorageWithoutResourceVersion
2025-02-05 10:18:23 -08:00
Kubernetes Prow Robot
1527a145b1
Merge pull request #129921 from srivastav-abhishek/fix-etcd-test
Additional timeout to receive all watchEvents
2025-02-05 10:18:17 -08:00
Henry(Qishan) Wu
8bd4e1bab2 Update test/integration/serving/serving_test.go
Co-authored-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
2025-02-05 09:48:08 -08:00
Kubernetes Prow Robot
d2ad0cc7c0
Merge pull request #129956 from chrischdi/pr-kubeadm-cp-local-mode-fixes
kubeadm: Promote ControlPlaneKubeletLocalMode feature gate to beta - second attempt
2025-02-05 07:02:16 -08:00
Kubernetes Prow Robot
8b1307894d
Merge pull request #129962 from cpanato/update-go-123-main
[go] Bump images, dependencies and versions to go 1.23.5 and distroless iptables
2025-02-05 05:48:16 -08:00
Kubernetes Prow Robot
54eddbd50f
Merge pull request #128989 from Henrywu573/kube-proxy-statuz-new
Add statusz endpoint for kube-proxy
2025-02-05 04:34:15 -08:00
Christian Schlotter
6c093b1699
kubeadm: fix dry-run for kubelet-wait-bootstrap phase 2025-02-05 12:40:08 +01:00
Kubernetes Prow Robot
c7489b20f2
Merge pull request #129750 from googs1025/scheduler/add_integration_for_queuesortplugin
feature: add scheduler queuesort plugins integration test
2025-02-05 03:08:17 -08:00
Marek Siarkowicz
065bf2004d Deprecate WatchFromStorageWithoutResourceVersion
Around the 1.31 release, we discovered that a change introduced in 1.27 allowed
clients to open WATCH requests directly to etcd. This had detrimental consequences,
enabling abusive clients to bypass caching and overwhelm etcd.
Unlike the API server, etcd lacks protection against such behavior.

To mitigate this, we redirected all WATCH requests to be served from the cache.
The WatchFromStorageWithoutResourceVersion feature gate was retained as an escape hatch.
However, since we have no plans to allow direct WATCH requests to etcd again,
this flag is now obsolete.

Direct WATCH requests to etcd offer no advantage, as they don't provide stronger
consistency guarantees. WATCH operations are inherently inconsistent; unlike LIST
operations, they do not confirm the resource version with a quorum. While Kubernetes
uses the WithRequireLeader option on WATCH requests to prevent maintaining connections
to isolated etcd members, the API server provides the same level of guarantee through
its health checks, which fail if it cannot connect to an etcd member. Therefore,
the WatchFromStorageWithoutResourceVersion feature gate can be deprecated and removed.
2025-02-05 11:42:18 +01:00
Christian Schlotter
20fbdeac96
kubeadm: fix upgrade to be able to rollback ControlPlaneLocalMode 2025-02-05 11:33:55 +01:00
Christian Schlotter
bb36212342
kubeadm: Promote ControlPlaneKubeletLocalMode feature gate to beta 2025-02-05 11:33:34 +01:00
Kubernetes Prow Robot
569d1896e6
Merge pull request #129620 from neolit123/1.33-update-all-cp-components-check
kubeadm: graduate WaitForAllControlPlaneComponents to Beta
2025-02-05 01:54:17 -08:00
Marek Siarkowicz
b1ad53c533 Disable StorageNamespaceIndex feature gate when BtreeWatchCache is enabled and deprecate it
Previously, the cache used a map keyed by the full object key,
requiring iteration and filtering by namespace for namespace-scoped requests.
This index allowed for faster responses by avoiding this iteration.

With the introduction of the BtreeWatchCache, this optimization is no longer necessary.
The B-tree structure allows efficient prefix-based searches,
including fetching objects by namespace.
Furthermore, the B-tree returns elements ordered by key, eliminating the need for separate sorting.

Performance improvements with the BtreeWatchCache have been validated through benchmarks matching K8s scalability dimensions (see table below).
These results demonstrate that the B-tree approach provides comparable or better performance than the map with index.
Therefore, the StorageNamespaceIndex feature flag can be safely flipped to false and subsequently deprecated.

| Benchmark                                                                         | Btree with Index (current) | Btree without Index    | Map with Index         | Map without Index (sanity check) |
| --------------------------------------------------------------------------------- | -------------------------- | ---------------------- | ---------------------- | -------------------------------- |
| StoreList (10k Namespaces, 150k Pods, 5k Nodes, RV=, Namespace Scope)             | 20.77µs ± 10%              | 20.14µs ± 13% (~0%)    | 19.73µs ± 6% (~0%)     | 1067.34µs ± 10% (+5037.73%)      |
| StoreList (10k Namespaces, 150k Pods, 5k Nodes, RV=NotOlderThan, Namespace Scope) | 3.943µs ± 6%               | 3.928µs ± 6% (~0%)     | 3.665µs ± 3% (-7.05%)  | 944.641µs ± 1% (+23857.41%)      |
| StoreList (50 Namespaces, 150k Pods, 5k Nodes, RV=, Namespace Scope)              | 303.3µs ± 2%               | 258.2µs ± 2% (-14.85%) | 340.1µs ± 3% (+12.15%) | 1668.6µs ± 4% (+450.23%)         |
| StoreList (50 Namespaces, 150k Pods, 5k Nodes, RV=NotOlderThan, Namespace Scope)  | 286.2µs ± 3%               | 234.7µs ± 1% (-17.99%) | 326.9µs ± 2% (+14.22%) | 1347.7µs ± 4% (+370.91%)         |
| StoreList (100 Namespaces, 110k Pods, 1k Nodes, RV=, Namespace Scope)             | 125.3µs ± 2%               | 112.3µs ± 5% (-10.38%) | 137.5µs ± 2% (+9.81%)  | 1395.1µs ± 8% (+1013.78%)        |
| StoreList (100 Namespaces, 110k Pods, 1k Nodes, RV=NotOlderThan, Namespace Scope) | 120.6µs ± 2%               | 113.2µs ± 1% (-6.13%)  | 133.8µs ± 1% (+10.92%) | 1719.1µs ± 5% (+1325.35%)        |
| Geometric Mean                                                                    | 68.94µs                    | 62.73µs (-9.02%)       | 72.72µs (+5.48%)       | 1.326ms (+1823.40%)              |
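
The two lookup strategies contrasted in this commit message can be sketched in Go. This is an illustration only, not the Kubernetes watch cache: a sorted slice with binary search stands in for the B-tree, and the key layout /pods/<namespace>/<name> and the function names listByPrefix and listByScan are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// listByPrefix returns every key that starts with prefix. It relies on the
// keys being sorted, so all matches form one contiguous range that a binary
// search can locate. The result is already ordered by key.
func listByPrefix(sortedKeys []string, prefix string) []string {
	start := sort.SearchStrings(sortedKeys, prefix)
	end := start
	for end < len(sortedKeys) && strings.HasPrefix(sortedKeys[end], prefix) {
		end++
	}
	return sortedKeys[start:end]
}

// listByScan models a plain map without a namespace index: every key is
// visited and filtered, and the result needs a separate sort.
func listByScan(store map[string]struct{}, prefix string) []string {
	var out []string
	for k := range store {
		if strings.HasPrefix(k, prefix) {
			out = append(out, k)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	keys := []string{
		"/pods/kube-system/coredns",
		"/pods/default/web",
		"/pods/default/api",
		"/pods/dev/job",
	}
	store := make(map[string]struct{}, len(keys))
	for _, k := range keys {
		store[k] = struct{}{}
	}
	sort.Strings(keys)

	fmt.Println(listByPrefix(keys, "/pods/default/"))
	fmt.Println(listByScan(store, "/pods/default/"))
}
```

Both functions return the same keys. The sorted structure touches only the matching range, while the scan cost grows with the total number of objects, which is consistent with the "Map without Index" column in the table.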
2025-02-05 10:49:22 +01:00
Kubernetes Prow Robot
481cc1a392
Merge pull request #129560 from bart0sh/PR168-DRA-fix-All-allocation-mode
DRA: fix allocation mode `All`
2025-02-05 00:38:16 -08:00
Henry Wu
c5f66bfe70 Add statusz endpoint for kube-proxy 2025-02-04 22:20:31 -08:00
Kubernetes Prow Robot
c4434c3161
Merge pull request #129910 from bitoku/fix-129836
Fix flaky test for container life cycle
2025-02-04 16:23:09 -08:00
Kubernetes Prow Robot
fab0d76574
Merge pull request #129731 from gjkim42/promote-sidecar-containers-to-ga
Promote SidecarContainers feature to GA
2025-02-04 16:22:58 -08:00
Kubernetes Prow Robot
f82439f536
Merge pull request #129486 from iholder101/bugfix/swap-container-cri-stats
[KEP-2400] [Bugfix]: Ensure container-level swap metrics are collected
2025-02-04 08:14:59 -08:00
Patrick Ohly
1a8d8c9b4a client-go watch: NewIndexerInformerWatcherWithContext -> WithLogger
The ability to automatically stop on context cancellation was new functionality
that adds complexity and wasn't really used in Kubernetes. If someone wants
this, they can add it outside of the function.

A *WithLogger variant avoids the complexity and is consistent with
NewStreamWatcherWithLogger over in apimachinery.
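
Stopping on context cancellation outside of the function, as suggested above, could look roughly like the following Go sketch. It does not use client-go: the stopper interface, stopOnCancel, and fakeWatcher are hypothetical names, and the only assumption about a real watcher is that it exposes a Stop method.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// stopper is a hypothetical stand-in for any watcher that has a Stop method.
type stopper interface {
	Stop()
}

// stopOnCancel calls w.Stop once ctx is cancelled. The returned channel is
// closed after Stop has returned, so callers can wait for it if needed.
func stopOnCancel(ctx context.Context, w stopper) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		<-ctx.Done()
		w.Stop()
	}()
	return done
}

// fakeWatcher records whether Stop was called; it exists only for the demo.
type fakeWatcher struct {
	mu      sync.Mutex
	stopped bool
}

func (f *fakeWatcher) Stop() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.stopped = true
}

func (f *fakeWatcher) Stopped() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.stopped
}

// demoStopOnCancel cancels a context and reports whether the watcher was stopped.
func demoStopOnCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	w := &fakeWatcher{}
	done := stopOnCancel(ctx, w)
	cancel()
	<-done
	return w.Stopped()
}

func main() {
	fmt.Println("watcher stopped after cancel:", demoStopOnCancel())
}
```

The helper goroutine lives until the context is cancelled, so a caller that stops the watcher by other means should still cancel the context to release it.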
2025-02-04 16:32:55 +01:00
Kubernetes Prow Robot
7f9fdd65eb
Merge pull request #129968 from sanposhiho/patch-15
fix: remove the mention that DRA uses Pending
2025-02-04 01:36:59 -08:00
Kubernetes Prow Robot
a376ae5dad
Merge pull request #128845 from SergeyKanzhelev/staticPodUpgrade
static pod upgrade test with hostNetwork
2025-02-03 23:30:58 -08:00
Kubernetes Prow Robot
28ba942659
Merge pull request #129844 from cici37/bumCEL
Bump cel-go to v0.23.2
2025-02-03 17:26:58 -08:00
Cici Huang
e1ab6073ab Add more tests for optional. 2025-02-04 00:14:43 +00:00
Kensei Nakada
3701e39327 fix: remove the mention that DRA uses Pending 2025-02-04 06:45:05 +09:00
Cici Huang
8a3d0d68a2 Update the env option. 2025-02-03 18:07:23 +00:00
Cici Huang
7b1c7c639e Fixed the estimated cost for opt map. 2025-02-03 18:06:51 +00:00
Cici Huang
c1e0443232 Bump cel-go to v0.23.2. 2025-02-03 18:06:51 +00:00
cpanato
0ca45bd4f8
Bump images, dependencies and versions to go 1.23.5 and distroless iptables
Signed-off-by: cpanato <ctadeu@gmail.com>
2025-02-03 18:26:38 +01:00
Kubernetes Prow Robot
82e3a671e7
Merge pull request #129920 from googs1025/feature/integration_scoring
feature: Added score integration tests for missing part plugins: TaintToleration plugin
2025-02-03 08:46:57 -08:00
Kubernetes Prow Robot
1b7a059187
Merge pull request #128999 from macsko/improve_goroutines_metric_writes_in_parallelizer_until
Improve Goroutines metric calls in parallelizer.Until
2025-02-03 07:30:57 -08:00
Kubernetes Prow Robot
183ef23273
Merge pull request #129937 from pohly/dra-scheduler-perf-limits
scheduler_perf: add thresholds to DRA test cases
2025-02-03 05:24:57 -08:00
googs1025
5281152f07 feature: Added score integration tests for missing part plugins: TaintToleration plugin 2025-02-03 21:20:00 +08:00
Patrick Ohly
e2ff03486d scheduler_perf: add thresholds to DRA test cases
They were enabled yesterday and executed seven times, with results that (so
far) seem to be fairly stable with just one run that was slower across the
board.

The links in the YAML can be used to navigate to each test case quickly. The
thresholds were chosen with a 20% safety margin below what seems to be a
common result.
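
As a worked example of the margin arithmetic, the following Go sketch assumes that "20% below" means multiplying a typical result by 0.8. The function name thresholdWithMargin and the input value 100 are made up for illustration and do not come from the scheduler_perf configuration.

```go
package main

import "fmt"

// thresholdWithMargin returns a lower bound that sits the given fraction
// (for example 0.20 for 20%) below a typical measured result.
func thresholdWithMargin(typical, margin float64) float64 {
	return typical * (1 - margin)
}

func main() {
	// 100 is a hypothetical typical result, not a measured value.
	typical := 100.0
	fmt.Printf("typical=%.1f threshold=%.1f\n", typical, thresholdWithMargin(typical, 0.20))
}
```

A run then only counts as a regression when it falls below the threshold, which leaves room for the occasional slower run mentioned above.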
2025-02-03 13:10:10 +01:00
Kubernetes Prow Robot
fc268ecd09
Merge pull request #129823 from googs1025/chore/log_improve
fix(dra plugin): when there is no resourceclaim, return directly
2025-02-02 16:28:56 -08:00
Patrick Ohly
949385731f golangci-lint: remove "strict" checking
The corresponding "pull-kubernetes-verify-lint" job was already removed
earlier. Manual strict checking was still possible, but doesn't really make
sense for the same reasons why the job was removed (e.g. the decisions which
checks should be "strict" were too arbitrary).

The explanations for "hints" no longer end with "In general please prefer to
fix the error, ..." because that was misleading and only really applied to the
checks for existing code. For those checks we prefer to fix errors instead of
suppressing them, but not for hints.
2025-02-02 18:50:27 +01:00
Kubernetes Prow Robot
6e3546228d
Merge pull request #129895 from tallclair/refactor-allocation
Delete unused code: allocated state ClearState
2025-02-02 09:40:55 -08:00
Gunju Kim
8d27bf2108
Leave TODOs in pkg/kubelet/kuberuntime for later removal
This leaves TODOs in pkg/kubelet/kuberuntime to remove these redundant
code paths later, since they are supposed to be a subset of the new code
paths.
2025-02-02 17:45:50 +09:00
Gunju Kim
0bee0bcaa7
Promote SidecarContainers feature to GA 2025-02-02 17:45:36 +09:00
Kubernetes Prow Robot
b4f902f037
Merge pull request #129897 from vinayakankugoyal/testfix
Fix kubelet_authz_test.go
2025-01-31 08:52:56 -08:00
Vinayak Goyal
81f09811ca Fix kubelet_authz_test.go 2025-01-31 15:38:18 +00:00
Marek Siarkowicz
e0f548183c Graduate BtreeWatchCache feature gate to GA 2025-01-31 15:33:24 +01:00