125410 Commits

Author SHA1 Message Date
Matthieu MOREL
0006a3cc37 fix: enable expected-actual rule from testifylint in module k8s.io/apimachinery
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-09-27 07:49:18 +02:00
Kubernetes Prow Robot
a8c955ab42
Merge pull request #127606 from thockin/skip_test_target_normalization
Skip Go target normalization in integration tests
2024-09-27 04:42:01 +01:00
Kubernetes Prow Robot
466a6c3407
Merge pull request #127672 from jpbetz/apiextensions-versioned-feature-gates
Migrate apiextensions-apiserver to versioned feature gates
2024-09-27 03:38:08 +01:00
Kubernetes Prow Robot
f84bbfe94f
Merge pull request #127488 from dims/remove-unnecessary-stuff-for-nvidia-gpus
Remove remnants of broken stuff - nvidia/autoscaling
2024-09-27 03:38:01 +01:00
Kubernetes Prow Robot
960e3984b0
Merge pull request #127444 from dom4ha/fine-grained-qhints
Fine grain QueueHints for NodeAffinity plugin
2024-09-27 01:42:00 +01:00
Joe Betz
82415c3d9d Update feature gate lists 2024-09-26 20:09:41 -04:00
Joe Betz
138106896e Migrate apiextensions-apiserver to versioned feature gates 2024-09-26 20:09:01 -04:00
Kubernetes Prow Robot
5ebd0da6cc
Merge pull request #127662 from macsko/make_scheduler_perf_sleepop_duration_parametrizable
Make sleepOp duration parametrizable in scheduler_perf
2024-09-26 20:10:01 +01:00
Kubernetes Prow Robot
421436a94c
Merge pull request #127473 from dom4ha/fine-grain-qhints-fit
feature(scheduler): more fine-grained Node QHint for NodeResourceFit plugin
2024-09-26 18:34:02 +01:00
Kubernetes Prow Robot
c89205f7d6
Merge pull request #127647 from mmorel-35/testifylint/formatter@k8s.io/apiserver
fix: enable formatter rule from testifylint in module `k8s.io/apiserver`
2024-09-26 14:56:08 +01:00
Kubernetes Prow Robot
514de367df
Merge pull request #125502 from mas9612/fix-changelog-1.28
Fix pod-index label name for StatefulSet Pods
2024-09-26 14:56:01 +01:00
Maciej Skoczeń
837d917d91 Make sleepOp duration parametrizable in scheduler_perf 2024-09-26 13:07:22 +00:00
Kubernetes Prow Robot
4b33029691
Merge pull request #127646 from mmorel-35/testifylint/formatter@k8s.io/kubectl
fix: enable formatter rule from testifylint in module `k8s.io/kubectl`
2024-09-26 13:00:12 +01:00
Kubernetes Prow Robot
a83e295270
Merge pull request #127644 from kiashok/refactor-hcs-references
Add local reference to hcs structs in windows cri stats test
2024-09-26 13:00:03 +01:00
dom4ha
c7db4bb450 Fine grain QueueHints for nodeaffinity plugin.
Skip queue on unrelated change that keeps pod schedulable when QueueHints are enabled.

Split add from QHints disabled case

Remove case when QHints are disabled

Remove two QHint alternatives in unit tests

more fine-grained Node QHint for NodeResourceFit plugin

Return early when updated Node causes unmatch

Revert "more fine-grained Node QHint for NodeResourceFit plugin"

This reverts commit dfbceb60e0c1c4e47748c12722d9ed6dba1a8366.

Add integration test for requeue of a pod previously rejected by NodeAffinity plugin when a suitable Node is added

Add integration test for a Node update operation that does not trigger requeue in NodeAffinity plugin

Remove inaccurate comment

Apply review comments
2024-09-26 10:21:08 +00:00
dom4ha
903b1f7e28 more fine-grained Node QHint for NodeResourceFit plugin 2024-09-26 09:51:36 +00:00
Kubernetes Prow Robot
996e674ea7
Merge pull request #127650 from SataQiu/fix-etcd-20240926
kubeadm: fix a bug where the RemoveMember function did not return the correct member list when the member to be removed did not exist
2024-09-26 09:40:00 +01:00
Kubernetes Prow Robot
81ebfb3d0c
Merge pull request #127012 from Chaunceyctx/new-send-bookmark
send bookmark right now after sending all items in watchCache store
2024-09-26 08:22:01 +01:00
SataQiu
2dc0d2962a kubeadm: fix a bug where the RemoveMember function did not return the correct member list when the member to be removed did not exist 2024-09-26 14:29:30 +08:00
Matthieu MOREL
3e558fe604 fix: enable formatter rule from testifylint in module k8s.io/kubectl
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-09-26 08:26:24 +02:00
Matthieu MOREL
58d5acd598 fix: enable formatter rule from testifylint in module k8s.io/apiserver
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-09-26 08:22:57 +02:00
Chaunceyctx
7239202533 send bookmark right now after sending all items in watchCache store 2024-09-26 14:18:37 +08:00
Kubernetes Prow Robot
9770283c13
Merge pull request #127571 from AxeZhan/reviewer
nominate AxeZhan to sig-scheduling reviewers
2024-09-26 05:00:03 +01:00
Kubernetes Prow Robot
239802e4f7
Merge pull request #127574 from bouaouda-achraf/e2e-test-add-network-subnet-param
feat(test-e2e): support custom network and subnet on remote e2e mode
2024-09-26 03:50:08 +01:00
Kubernetes Prow Robot
a73a277715
Merge pull request #125678 from benluddy/cbor-nondeterministic-encode
KEP-4222:  Support nondeterministic encode for the CBOR serializer.
2024-09-26 03:50:00 +01:00
Kirtana Ashok
4a5513c19c Add local reference to hcs structs in windows cri stats test
Signed-off-by: Kirtana Ashok <kiashok@microsoft.com>
2024-09-25 18:56:03 -07:00
Kubernetes Prow Robot
b62d364195
Merge pull request #127200 from omerap12/version_fg_apiserver
chore: moving apiserver featuregates to versioned.
2024-09-26 02:19:28 +01:00
Kubernetes Prow Robot
45676184d4
Merge pull request #127560 from macsko/add_updateanyop_to_scheduler_perf
Add updateAnyOp to scheduler_perf
2024-09-26 00:47:28 +01:00
Tim Hockin
cf280dd6c2
Skip Go target normalization in integration tests 2024-09-25 13:15:48 -07:00
Kubernetes Prow Robot
3582dce115
Merge pull request #127626 from thockin/stop_disabling_GO111MODULE_in_tests
Stop setting GO111MODULE=off in tests
2024-09-25 21:02:17 +01:00
Kubernetes Prow Robot
9f42d51e2f
Merge pull request #127578 from skitt/drop-ptr-wrappers-pkg-api
pkg/api(s): drop pointer wrapper functions
2024-09-25 21:02:09 +01:00
Kubernetes Prow Robot
61c408a7d9
Merge pull request #125917 from skitt/drop-auth-path-kubernetes-auth
Drop references to auth-path and kubernetes_auth
2024-09-25 21:02:02 +01:00
Kubernetes Prow Robot
e542d9c8ca
Merge pull request #127608 from carlory/fix-127403
drop the option mark from the InvolvedObject field of internal event object
2024-09-25 19:58:07 +01:00
Kubernetes Prow Robot
f976be809e
Merge pull request #127552 from mmorel-35/testifylint/nil-compare@k8s.io/kubernetes
fix: enable nil-compare and error-nil rules from testifylint in module `k8s.io/kubernetes`
2024-09-25 19:58:00 +01:00
Tim Hockin
061c4f4f70
Stop setting GO111MODULE=off in tests 2024-09-25 10:00:57 -07:00
Kubernetes Prow Robot
2b196cff8b
Merge pull request #127589 from soltysh/timestamp_e2e
e2e: add test covering cronjob-scheduled-timestamp annotation added by cronjob
2024-09-25 17:46:09 +01:00
Kubernetes Prow Robot
5de3c1e93d
Merge pull request #127292 from skitt/fix-client-go-extensions-without-test
client-go: add missing template functions and types for extensions
2024-09-25 17:46:00 +01:00
Kubernetes Prow Robot
36bbdd692f
Merge pull request #127466 from guozheng-shen/fix-return
endpointsLeasesResourceLock and configMapsLeasesResourceLock have been removed
2024-09-25 14:36:01 +01:00
Maciej Skoczeń
40154baab0 Add updateAnyOp to scheduler_perf 2024-09-25 12:42:25 +00:00
Kubernetes Prow Robot
5fc4e71a30
Merge pull request #127499 from pohly/scheduler-perf-updates
scheduler_perf: updates to enhance performance testing of DRA
2024-09-25 13:32:00 +01:00
Maciej Szulik
f11ddad99d
e2e: add test covering cronjob-scheduled-timestamp annotation added by cronjob 2024-09-25 12:47:27 +02:00
Kubernetes Prow Robot
75214d11d5
Merge pull request #127428 from googs1025/scheduler/plugin
chore(scheduler): refactor import package ordering in scheduler
2024-09-25 11:40:07 +01:00
Kubernetes Prow Robot
4c4edfede5
Merge pull request #127398 from my-git9/patch-23
kubeadm: update comment for ArgumentsFromCommand function in app/util/arguments
2024-09-25 11:40:00 +01:00
Lukasz Szaszkiewicz
ae35048cb0
adds watchListEndpointRestrictions for watchlist requests (#126996)
* endpoints/handlers/get: intro watchListEndpointRestrictions

* consistencydetector/list_data_consistency_detector: expose IsDataConsistencyDetectionForListEnabled

* e2e/watchlist: extract common function for adding unstructured secrets

* e2e/watchlist: new e2e scenarios for covering watchListEndpointRestrict
2024-09-25 10:12:01 +01:00
Patrick Ohly
d100768d94 scheduler_perf: track and visualize progress over time
This is useful to see whether pod scheduling happens in bursts and how it
behaves over time, which is relevant in particular for dynamic resource
allocation where it may become harder at the end to find the node which still
has resources available.

Besides "pods scheduled" it's also useful to know how many attempts were
needed, so schedule_attempts_total also gets sampled and stored.

To visualize the result of one or more test runs, use:

     gnuplot.sh *.dat
2024-09-25 11:09:15 +02:00
xin.li
706e939382 kubeadm: update comment for ArgumentsFromCommand function in app/util/arguments
Signed-off-by: xin.li <xin.li@daocloud.io>
2024-09-25 16:19:28 +08:00
Patrick Ohly
ded96042f7 scheduler_perf + DRA: load up cluster by allocating claims
Having to schedule 4999 pods to simulate a "full" cluster is slow. Creating
claims and then allocating them more or less like the scheduler would when
scheduling pods is much faster and in practice has the same effect on the
dynamicresources plugin because it looks at claims, not pods.

This allows defining the "steady state" workloads with higher number of
devices ("claimsPerNode") again. This was prohibitively slow before.
2024-09-25 09:45:39 +02:00
Patrick Ohly
385599f0a8 scheduler_perf + DRA: measure pod scheduling at a steady state
The previous tests were based on scheduling pods until the cluster was
full. This is a valid scenario, but not necessarily realistic.

More realistic is how quickly the scheduler can schedule new pods when some
old pods finished running, in particular in a cluster that is properly
utilized (= almost full). To test this, pods must get created, scheduled, and
then immediately deleted. This can run for a certain period of time.

Scenarios with empty and full cluster have different scheduling rates. This was
previously visible for DRA because the 50% percentile of the scheduling
throughput was lower than the average, but one had to guess in which scenario
the throughput was lower. Now this can be measured for DRA with the new
SteadyStateClusterResourceClaimTemplateStructured test.

The metrics collector must watch pod events to figure out how many pods got
scheduled. Polling misses pods that already got deleted again. There seems to
be no relevant difference in the collected
metrics (SchedulingWithResourceClaimTemplateStructured/2000pods_200nodes, 6 repetitions):

     │            before            │                     after                     │
     │ SchedulingThroughput/Average │ SchedulingThroughput/Average  vs base         │
                         157.1 ± 0%                     157.1 ± 0%  ~ (p=0.329 n=6)

     │           before            │                    after                     │
     │ SchedulingThroughput/Perc50 │ SchedulingThroughput/Perc50  vs base         │
                        48.99 ± 8%                    47.52 ± 9%  ~ (p=0.937 n=6)

     │           before            │                    after                     │
     │ SchedulingThroughput/Perc90 │ SchedulingThroughput/Perc90  vs base         │
                       463.9 ± 16%                   460.1 ± 13%  ~ (p=0.818 n=6)

     │           before            │                    after                     │
     │ SchedulingThroughput/Perc95 │ SchedulingThroughput/Perc95  vs base         │
                       463.9 ± 16%                   460.1 ± 13%  ~ (p=0.818 n=6)

     │           before            │                    after                     │
     │ SchedulingThroughput/Perc99 │ SchedulingThroughput/Perc99  vs base         │
                       463.9 ± 16%                   460.1 ± 13%  ~ (p=0.818 n=6)
2024-09-25 09:45:39 +02:00
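
The commit above (385599f0a8) notes that the 50% percentile of the scheduling throughput can sit well below the average when scheduling happens in bursts. A minimal sketch of that effect follows. The sample values are invented for illustration, and the nearest-rank percentile used here is an assumption, not necessarily the method scheduler_perf uses.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// average returns the arithmetic mean of a non-empty slice of samples.
func average(samples []float64) float64 {
	sum := 0.0
	for _, s := range samples {
		sum += s
	}
	return sum / float64(len(samples))
}

// percentile returns the nearest-rank percentile p (0 < p <= 100)
// of a non-empty slice. The input slice is not modified.
func percentile(samples []float64, p float64) float64 {
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

func main() {
	// Hypothetical throughput samples (pods/s): a short burst while the
	// cluster is empty, then a much slower phase once it is nearly full.
	samples := []float64{400, 420, 50, 45, 40, 48, 52, 47}

	fmt.Printf("average=%.2f perc50=%.2f perc90=%.2f\n",
		average(samples), percentile(samples, 50), percentile(samples, 90))
}
```

With samples like these, a few fast intervals pull the average up while the median reflects the slower phase, which is why a single average cannot tell the two scenarios apart.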
Patrick Ohly
51cafb0053 scheduler_perf: more useful errors for configuration mistakes
Before, the first error was reported, which typically was the "invalid op code"
error from the createAny operation:

    scheduler_perf.go:900: parsing test cases error: error unmarshaling JSON: while decoding JSON: cannot unmarshal {"collectMetrics":true,"count":10,"duration":"30s","namespace":"test","opcode":"createPods","podTemplatePath":"config/dra/pod-with-claim-template.yaml","steadyState":true} into any known op type: invalid opcode "createPods"; expected "createAny"

Now the opcode is determined first, then decoding into exactly the matching operation is
tried and validated. Unknown fields are an error.

In the case above, decoding a string into time.Duration failed:

    scheduler_test.go:29: parsing test cases error: error unmarshaling JSON: while decoding JSON: decoding {"collectMetrics":true,"count":10,"duration":"30s","namespace":"test","opcode":"createPods","podTemplatePath":"config/dra/pod-with-claim-template.yaml","steadyState":true} into *benchmark.createPodsOp: json: cannot unmarshal string into Go struct field createPodsOp.Duration of type time.Duration

Some typos:

    scheduler_test.go:29: parsing test cases error: error unmarshaling JSON: while decoding JSON: unknown opcode "sleeep" in {"duration":"5s","opcode":"sleeep"}

    scheduler_test.go:29: parsing test cases error: error unmarshaling JSON: while decoding JSON: decoding {"countParram":"$deletingPods","deletePodsPerSecond":50,"opcode":"createPods"} into *benchmark.createPodsOp: json: unknown field "countParram"
2024-09-25 09:45:39 +02:00
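
The decoding order described in the commit above (51cafb0053) can be sketched as follows: read the opcode first, pick the matching operation type, then decode strictly so that unknown fields are rejected. This is only an illustration built on the standard `encoding/json` package. The `sleepOp` and `createPodsOp` structs, their fields, and `decodeOp` are simplified, hypothetical stand-ins, not the actual scheduler_perf types.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Hypothetical, simplified operation types for illustration only.
type sleepOp struct {
	Opcode   string `json:"opcode"`
	Duration string `json:"duration"`
}

type createPodsOp struct {
	Opcode     string `json:"opcode"`
	Count      int    `json:"count"`
	CountParam string `json:"countParam"`
}

// decodeOp determines the opcode first, then decodes the same bytes
// into exactly the matching type, treating unknown fields as errors.
func decodeOp(data []byte) (any, error) {
	// Step 1: a lenient probe that only extracts the opcode.
	var probe struct {
		Opcode string `json:"opcode"`
	}
	if err := json.Unmarshal(data, &probe); err != nil {
		return nil, fmt.Errorf("reading opcode: %w", err)
	}

	// Step 2: select the target type for that opcode.
	var op any
	switch probe.Opcode {
	case "sleep":
		op = &sleepOp{}
	case "createPods":
		op = &createPodsOp{}
	default:
		return nil, fmt.Errorf("unknown opcode %q in %s", probe.Opcode, data)
	}

	// Step 3: strict decode; a misspelled field name fails here.
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	if err := dec.Decode(op); err != nil {
		return nil, fmt.Errorf("decoding %s into %T: %w", data, op, err)
	}
	return op, nil
}

func main() {
	inputs := []string{
		`{"opcode":"sleep","duration":"5s"}`,
		`{"opcode":"sleeep","duration":"5s"}`,
		`{"opcode":"createPods","countParram":"$deletingPods"}`,
	}
	for _, in := range inputs {
		op, err := decodeOp([]byte(in))
		fmt.Printf("op=%v err=%v\n", op, err)
	}
}
```

Because the target type is chosen before the strict decode, a failure is reported against the one operation the user meant rather than against whichever candidate type happened to be tried first.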
Stephen Kitt
13dfa4cbf5
Run codegen
Signed-off-by: Stephen Kitt <skitt@redhat.com>
2024-09-25 07:59:49 +02:00