Commit Graph

125355 Commits

Kubernetes Prow Robot
36bbdd692f
Merge pull request #127466 from guozheng-shen/fix-return
endpointsLeasesResourceLock and configMapsLeasesResourceLock have been removed
2024-09-25 14:36:01 +01:00
Kubernetes Prow Robot
5fc4e71a30
Merge pull request #127499 from pohly/scheduler-perf-updates
scheduler_perf: updates to enhance performance testing of DRA
2024-09-25 13:32:00 +01:00
Kubernetes Prow Robot
75214d11d5
Merge pull request #127428 from googs1025/scheduler/plugin
chore(scheduler): refactor import package ordering in scheduler
2024-09-25 11:40:07 +01:00
Kubernetes Prow Robot
4c4edfede5
Merge pull request #127398 from my-git9/patch-23
kubeadm: update comment for ArgumentsFromCommand function in app/util/arguments
2024-09-25 11:40:00 +01:00
Lukasz Szaszkiewicz
ae35048cb0
adds watchListEndpointRestrictions for watchlist requests (#126996)
* endpoints/handlers/get: introduce watchListEndpointRestrictions

* consistencydetector/list_data_consistency_detector: expose IsDataConsistencyDetectionForListEnabled

* e2e/watchlist: extract common function for adding unstructured secrets

* e2e/watchlist: new e2e scenarios for covering watchListEndpointRestrictions
2024-09-25 10:12:01 +01:00
Patrick Ohly
d100768d94 scheduler_perf: track and visualize progress over time
This is useful to see whether pod scheduling happens in bursts and how it
behaves over time, which is relevant in particular for dynamic resource
allocation where it may become harder at the end to find the node which still
has resources available.

Besides "pods scheduled" it's also useful to know how many attempts were
needed, so schedule_attempts_total also gets sampled and stored.

To visualize the result of one or more test runs, use:

     gnuplot.sh *.dat
2024-09-25 11:09:15 +02:00
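
A minimal sketch of the sampling idea described above, using only the Go standard library: periodically scrape a Prometheus-style /metrics endpoint and append timestamped samples that gnuplot can plot. The endpoint URL, interval, and output file name are assumptions; this is not the scheduler_perf collector itself.

    // Hedged sketch: sample schedule_attempts_total over time into a .dat file.
    // Endpoint URL and file name are assumptions, not scheduler_perf defaults.
    package main

    import (
        "bufio"
        "fmt"
        "net/http"
        "os"
        "strings"
        "time"
    )

    func main() {
        const metricsURL = "http://localhost:10259/metrics" // assumption: locally reachable scheduler metrics endpoint
        out, err := os.Create("samples.dat")
        if err != nil {
            panic(err)
        }
        defer out.Close()

        start := time.Now()
        for {
            if resp, err := http.Get(metricsURL); err == nil {
                scanner := bufio.NewScanner(resp.Body)
                for scanner.Scan() {
                    line := scanner.Text()
                    // Counter lines look like: schedule_attempts_total{result="scheduled"} 42
                    if strings.HasPrefix(line, "schedule_attempts_total") {
                        fields := strings.Fields(line)
                        fmt.Fprintf(out, "%.0f %s\n", time.Since(start).Seconds(), fields[len(fields)-1])
                    }
                }
                resp.Body.Close()
            }
            time.Sleep(time.Second)
        }
    }
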
xin.li
706e939382 kubeadm: update comment for ArgumentsFromCommand function in app/util/arguments
Signed-off-by: xin.li <xin.li@daocloud.io>
2024-09-25 16:19:28 +08:00
Patrick Ohly
ded96042f7 scheduler_perf + DRA: load up cluster by allocating claims
Having to schedule 4999 pods to simulate a "full" cluster is slow. Creating
claims and then allocating them more or less like the scheduler would when
scheduling pods is much faster and in practice has the same effect on the
dynamicresources plugin because it looks at claims, not pods.

This allows defining the "steady state" workloads with a higher number of
devices ("claimsPerNode") again. This was prohibitively slow before.
2024-09-25 09:45:39 +02:00
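
Purely as an illustration of the idea, a tiny sketch of pre-allocating claims per node so the cluster looks "full" without scheduling placeholder pods. The claim type and allocateTo helper are hypothetical stand-ins, not the resource.k8s.io API or the scheduler_perf helpers.

    // Hedged, illustrative sketch only: hypothetical types, not the real API.
    package main

    import "fmt"

    type claim struct {
        name string
        node string // empty until allocated
    }

    // allocateTo marks a claim as allocated to a node, roughly what the
    // dynamicresources plugin would see after a pod using the claim is scheduled.
    func (c *claim) allocateTo(node string) { c.node = node }

    func main() {
        const nodes, claimsPerNode = 100, 10
        var claims []*claim
        for n := 0; n < nodes; n++ {
            node := fmt.Sprintf("node-%d", n)
            for i := 0; i < claimsPerNode; i++ {
                c := &claim{name: fmt.Sprintf("%s-claim-%d", node, i)}
                c.allocateTo(node) // cluster looks full to the plugin without scheduling any pods
                claims = append(claims, c)
            }
        }
        fmt.Printf("pre-allocated %d claims across %d nodes\n", len(claims), nodes)
    }
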
Patrick Ohly
385599f0a8 scheduler_perf + DRA: measure pod scheduling at a steady state
The previous tests were based on scheduling pods until the cluster was
full. This is a valid scenario, but not necessarily realistic.

More realistic is how quickly the scheduler can schedule new pods when some
old pods have finished running, in particular in a cluster that is properly
utilized (= almost full). To test this, pods must get created, scheduled, and
then immediately deleted. This can run for a certain period of time.

Scenarios with an empty and with a full cluster have different scheduling rates. This was
previously visible for DRA because the 50th percentile of the scheduling
throughput was lower than the average, but one had to guess in which scenario
the throughput was lower. Now this can be measured for DRA with the new
SteadyStateClusterResourceClaimTemplateStructured test.

The metrics collector must watch pod events to figure out how many pods got
scheduled. Polling misses pods that already got deleted again. There seems to
be no relevant difference in the collected
metrics (SchedulingWithResourceClaimTemplateStructured/2000pods_200nodes, 6 repetitions):

     │            before            │                     after                     │
     │ SchedulingThroughput/Average │ SchedulingThroughput/Average  vs base         │
                         157.1 ± 0%                     157.1 ± 0%  ~ (p=0.329 n=6)

     │           before            │                    after                     │
     │ SchedulingThroughput/Perc50 │ SchedulingThroughput/Perc50  vs base         │
                        48.99 ± 8%                    47.52 ± 9%  ~ (p=0.937 n=6)

     │           before            │                    after                     │
     │ SchedulingThroughput/Perc90 │ SchedulingThroughput/Perc90  vs base         │
                       463.9 ± 16%                   460.1 ± 13%  ~ (p=0.818 n=6)

     │           before            │                    after                     │
     │ SchedulingThroughput/Perc95 │ SchedulingThroughput/Perc95  vs base         │
                       463.9 ± 16%                   460.1 ± 13%  ~ (p=0.818 n=6)

     │           before            │                    after                     │
     │ SchedulingThroughput/Perc99 │ SchedulingThroughput/Perc99  vs base         │
                       463.9 ± 16%                   460.1 ± 13%  ~ (p=0.818 n=6)
2024-09-25 09:45:39 +02:00
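
A minimal sketch of the watch-based counting, assuming a cluster reachable via the default kubeconfig; the namespace and reporting interval are assumptions, and this is not the scheduler_perf metrics collector.

    // Hedged sketch: count scheduled pods from watch events instead of polling,
    // so pods that are deleted right after scheduling are still counted.
    package main

    import (
        "fmt"
        "sync/atomic"
        "time"

        corev1 "k8s.io/api/core/v1"
        "k8s.io/client-go/informers"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/tools/cache"
        "k8s.io/client-go/tools/clientcmd"
    )

    func main() {
        config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
        if err != nil {
            panic(err)
        }
        client := kubernetes.NewForConfigOrDie(config)

        var scheduled atomic.Int64
        factory := informers.NewSharedInformerFactoryWithOptions(client, 0, informers.WithNamespace("test"))
        factory.Core().V1().Pods().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
            UpdateFunc: func(oldObj, newObj interface{}) {
                oldPod, newPod := oldObj.(*corev1.Pod), newObj.(*corev1.Pod)
                // A pod counts as scheduled when NodeName transitions from empty to set,
                // even if the pod is deleted again immediately afterwards.
                if oldPod.Spec.NodeName == "" && newPod.Spec.NodeName != "" {
                    scheduled.Add(1)
                }
            },
        })

        stop := make(chan struct{})
        defer close(stop)
        factory.Start(stop)
        factory.WaitForCacheSync(stop)

        for range time.Tick(5 * time.Second) {
            fmt.Printf("pods scheduled so far: %d\n", scheduled.Load())
        }
    }
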
Patrick Ohly
51cafb0053 scheduler_perf: more useful errors for configuration mistakes
Before, the first error was reported, which typically was the "invalid op code"
error from the createAny operation:

    scheduler_perf.go:900: parsing test cases error: error unmarshaling JSON: while decoding JSON: cannot unmarshal {"collectMetrics":true,"count":10,"duration":"30s","namespace":"test","opcode":"createPods","podTemplatePath":"config/dra/pod-with-claim-template.yaml","steadyState":true} into any known op type: invalid opcode "createPods"; expected "createAny"

Now the opcode is determined first, then decoding into exactly the matching operation type is
attempted and validated. Unknown fields are an error.

In the case above, decoding a string into time.Duration failed:

    scheduler_test.go:29: parsing test cases error: error unmarshaling JSON: while decoding JSON: decoding {"collectMetrics":true,"count":10,"duration":"30s","namespace":"test","opcode":"createPods","podTemplatePath":"config/dra/pod-with-claim-template.yaml","steadyState":true} into *benchmark.createPodsOp: json: cannot unmarshal string into Go struct field createPodsOp.Duration of type time.Duration

Some typos:

    scheduler_test.go:29: parsing test cases error: error unmarshaling JSON: while decoding JSON: unknown opcode "sleeep" in {"duration":"5s","opcode":"sleeep"}

    scheduler_test.go:29: parsing test cases error: error unmarshaling JSON: while decoding JSON: decoding {"countParram":"$deletingPods","deletePodsPerSecond":50,"opcode":"createPods"} into *benchmark.createPodsOp: json: unknown field "countParram"
2024-09-25 09:45:39 +02:00
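
The two-phase decode behind those messages can be sketched with plain encoding/json; the op structs below are illustrative stand-ins, not the real scheduler_perf operations.

    // Hedged sketch: determine the opcode first, then decode strictly into the
    // matching op type, with unknown fields treated as errors.
    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
    )

    type createPodsOp struct {
        Opcode   string `json:"opcode"`
        Count    int    `json:"count"`
        Duration string `json:"duration,omitempty"` // kept as a string here for simplicity
    }

    type sleepOp struct {
        Opcode   string `json:"opcode"`
        Duration string `json:"duration"`
    }

    func decodeOp(data []byte) (interface{}, error) {
        // Phase 1: look only at the opcode.
        var header struct {
            Opcode string `json:"opcode"`
        }
        if err := json.Unmarshal(data, &header); err != nil {
            return nil, err
        }

        // Phase 2: strict decode into exactly the matching type.
        var op interface{}
        switch header.Opcode {
        case "createPods":
            op = &createPodsOp{}
        case "sleep":
            op = &sleepOp{}
        default:
            return nil, fmt.Errorf("unknown opcode %q in %s", header.Opcode, data)
        }
        dec := json.NewDecoder(bytes.NewReader(data))
        dec.DisallowUnknownFields()
        if err := dec.Decode(op); err != nil {
            return nil, fmt.Errorf("decoding %s into %T: %w", data, op, err)
        }
        return op, nil
    }

    func main() {
        // A typo in a field name is now reported as an unknown field.
        _, err := decodeOp([]byte(`{"opcode":"createPods","countParram":10}`))
        fmt.Println(err)
    }
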
Kubernetes Prow Robot
99ff62e87a
Merge pull request #127491 from SataQiu/fix-etcd-20240920
kubeadm: check whether the peer URL for the added etcd member already exists when the MemberAddAsLearner/MemberAdd fails
2024-09-25 05:08:07 +01:00
Kubernetes Prow Robot
2e6216170b
Merge pull request #127319 from p0lyn0mial/upstream-define-initial-events-list-blueprint
apimachinery/meta/types.go: define InitialEventsListBlueprintAnnotationKey const
2024-09-25 05:08:00 +01:00
Kubernetes Prow Robot
f3a54b68f9
Merge pull request #127579 from chrishenzie/context
Propagate existing ctx instead of context.TODO() in sample-controller
2024-09-25 04:02:06 +01:00
Kubernetes Prow Robot
5dd244ff00
Merge pull request #125796 from haorenfsa/fix-gc-sync-blocked
garbagecollector: controller should not be blocking on failed cache sync
2024-09-25 04:02:00 +01:00
Kubernetes Prow Robot
8ccc878de0
Merge pull request #127583 from mmorel-35/testifylint/disable/require-error
chore: disable require-error rule from testifylint
2024-09-24 23:08:00 +01:00
Kubernetes Prow Robot
e9cde03b91
Merge pull request #127598 from aojea/servicecidr_seconday_dualwrite
bugfix: initialize secondary range registry with the right value
2024-09-24 21:08:08 +01:00
Kubernetes Prow Robot
63fc917521
Merge pull request #127480 from thockin/skip_test_target_normalization
Skip test target normalization
2024-09-24 21:08:01 +01:00
Kubernetes Prow Robot
9e157c5450
Merge pull request #127357 from lengrongfu/feat/add-chan-buffer
add resourceupdates.Update chan buffer
2024-09-24 20:02:01 +01:00
Antonio Ojea
7a9bca3888 bugfix: initialize secondary range registry with the right value
When the MultiCIDRServiceAllocator feature is enabled, we added an
additional feature gate DisableAllocatorDualWrite that allows enabling
a mirror behavior on the old allocator to deal with problems during
cluster upgrades.

During the implementation the secondary range of the legacy allocator
was initialized with the value of the primary range, hence, when a
Service tried to allocate a new IP on the secondary range, it succeeded
in the new IP allocator but failed when it tried to allocate the same IP
on the legacy allocator, since it has a different range.

Expand the integration test that runs over all the combinations of
Service ClusterIP possibilities to run with all the possible
combinations of the feature gates.

The integration test needs to change the way of starting the apiserver,
otherwise it will time out.
2024-09-24 17:48:13 +00:00
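
The bug class is easy to illustrate outside the allocator code: if the legacy secondary allocator is constructed from the primary CIDR, an IP accepted by the new secondary allocator is rejected by the legacy one. A self-contained sketch using net/netip, not the kube-apiserver wiring:

    // Hedged sketch of the bug class; plain net/netip, not the real allocators.
    package main

    import (
        "fmt"
        "net/netip"
    )

    type rangeAllocator struct{ cidr netip.Prefix }

    func (r rangeAllocator) allocate(ip netip.Addr) error {
        if !r.cidr.Contains(ip) {
            return fmt.Errorf("%s is not in range %s", ip, r.cidr)
        }
        return nil // a real allocator would also track which IPs are taken
    }

    func main() {
        primary := netip.MustParsePrefix("10.0.0.0/16")
        secondary := netip.MustParsePrefix("fd00::/108")

        newSecondary := rangeAllocator{cidr: secondary}
        // Bug: legacy secondary allocator initialized with the primary range.
        legacySecondary := rangeAllocator{cidr: primary}

        ip := netip.MustParseAddr("fd00::5")
        fmt.Println("new allocator:   ", newSecondary.allocate(ip))    // succeeds
        fmt.Println("legacy allocator:", legacySecondary.allocate(ip)) // fails: different range
    }
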
Patrick Ohly
7bbb3465e5 scheduler_perf: more realistic structured parameters tests
Real devices are likely to have a handful of attributes and (for GPUs) the
memory as capacity. Most keys will be driver-specific; a few may eventually
have a domain (none are standardized right now).
2024-09-24 18:52:45 +02:00
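
As a rough illustration of "a handful of attributes plus memory as capacity", using plain maps and apimachinery's resource.Quantity; the key names and domains are made up, and these are not the real structured-parameters types.

    // Hedged sketch: illustrative attribute keys and capacity values only.
    package main

    import (
        "fmt"

        "k8s.io/apimachinery/pkg/api/resource"
    )

    func main() {
        attributes := map[string]string{
            "gpu.example.com/model":         "a100",   // driver-specific key
            "gpu.example.com/driverVersion": "550.54", // driver-specific key
            "dra.example.org/numa":          "0",      // hypothetical domain-qualified key
        }
        capacity := map[string]resource.Quantity{
            "memory": resource.MustParse("80Gi"), // GPU memory as capacity
        }

        mem := capacity["memory"]
        fmt.Println(attributes, mem.String())
    }
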
rongfu.leng
ead64fb8f0 add resourceupdates.Update chan buffer
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
2024-09-24 16:48:32 +00:00
Kubernetes Prow Robot
b071443187
Merge pull request #127592 from dims/wait-for-gpus-even-for-aws-kubetest2-ec2-harness
Wait for GPUs even for AWS kubetest2 ec2 harness
2024-09-24 17:26:08 +01:00
Kubernetes Prow Robot
56071089e2
Merge pull request #127573 from benluddy/dynamic-golden-response-test
Add test for unintended changes to dynamic client response handling.
2024-09-24 17:26:01 +01:00
Tim Hockin
8912df652b
Use Go workspaces + go list to find test targets
Plain old UNIX find requires us to do all sorts of silly filtering.
2024-09-24 09:04:13 -07:00
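
A hedged sketch of the general approach, shelling out to go list so that only packages containing test files are reported; this is not the actual Kubernetes build rule.

    // Hedged sketch: use `go list` instead of `find` to enumerate test targets.
    // The -f template prints an import path only when the package has test files.
    package main

    import (
        "fmt"
        "os/exec"
    )

    func main() {
        out, err := exec.Command("go", "list",
            "-f", "{{if or .TestGoFiles .XTestGoFiles}}{{.ImportPath}}{{end}}",
            "./...").Output()
        if err != nil {
            panic(err)
        }
        fmt.Print(string(out)) // one import path per line; packages without tests print an empty line
    }
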
SataQiu
9af1b25bec kubeadm: check the member list status before adding or removing an etcd member 2024-09-24 22:53:42 +08:00
Kubernetes Prow Robot
4c24b9337f
Merge pull request #127575 from alculquicondor/acondor-apps
Stepping down from SIG Apps reviewers
2024-09-24 15:38:06 +01:00
Kubernetes Prow Robot
9571d3b6c6
Merge pull request #125995 from carlory/remove-unnecessary-permissions
remove unneeded permissions for volume controllers
2024-09-24 15:38:00 +01:00
Davanum Srinivas
472ca3b279
skip control plane nodes, they may not have GPUs
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2024-09-24 10:09:33 -04:00
Kubernetes Prow Robot
6ded721910
Merge pull request #127496 from macsko/add_metricscollectionop_to_scheduler_perf
Add separate ops for collecting metrics from multiple namespaces in scheduler_perf
2024-09-24 14:34:00 +01:00
Davanum Srinivas
349c7136c9
Wait for GPUs even for AWS kubetest2 ec2 harness
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2024-09-24 09:11:18 -04:00
Maciej Skoczeń
a273e5381a Add separate ops for collecting metrics from multiple namespaces in scheduler_perf 2024-09-24 12:28:53 +00:00
Kubernetes Prow Robot
5973accf48
Merge pull request #127570 from soltysh/do_not_return_err
Do not return error where it's not needed
2024-09-24 10:20:01 +01:00
Kubernetes Prow Robot
2ade53e264
Merge pull request #124947 from toVersus/fix/eviction-message
[Sidecar Containers] Consider init containers in eviction message
2024-09-24 08:58:00 +01:00
Matthieu MOREL
64e9fd50ed chore: disable require-error rule from testifylint
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-09-24 07:17:52 +02:00
Kubernetes Prow Robot
f0036aac21
Merge pull request #127572 from soltysh/reuse_helper
Reuse CreateTestCRD helper for kubectl e2e
2024-09-24 06:05:59 +01:00
Kubernetes Prow Robot
4851ea85e0
Merge pull request #127582 from dims/avoid-collecting-dmesg-when-running-as-daemon
Avoid collecting dmesg when running as daemon
2024-09-24 04:55:59 +01:00
Davanum Srinivas
1dc29b74b9
Avoid collecting dmesg when running as daemon
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2024-09-23 21:32:05 -04:00
Kubernetes Prow Robot
94df29b8f2
Merge pull request #127464 from sanposhiho/trigger-nodedelete
fix(eventhandler): trigger Node/Delete event
2024-09-24 02:24:00 +01:00
Kubernetes Prow Robot
1137a6a0cc
Merge pull request #127093 from jpbetz/retry-generate-name-ga
Promote RetryGenerateName to GA
2024-09-24 00:46:06 +01:00
Kubernetes Prow Robot
d6bb550b10
Merge pull request #122890 from HirazawaUi/fix-pod-grace-period
[kubelet]: Fix the bug where pod grace period will be overwritten
2024-09-24 00:45:59 +01:00
Tim Hockin
7d89e9b4c0
Only normalize user-provided test targets 2024-09-23 16:25:29 -07:00
Chris Henzie
3f1c41d53e Propagate existing ctx instead of context.TODO() 2024-09-23 14:40:07 -07:00
Kubernetes Prow Robot
211d67a511
Merge pull request #125398 from AxeZhan/pvAffinity
[scheduler] When the hostname and nodename of a node do not match, ensure that pods carrying PVs with nodeAffinity are scheduled correctly.
2024-09-23 21:22:02 +01:00
Aldo Culquicondor
3d5525ec21 Stepping down from SIG Apps reviewers
Change-Id: I4ec085bfe9b5f65ae9b250bd2a7a519379874425
2024-09-23 19:11:54 +00:00
Kubernetes Prow Robot
851cf43a35
Merge pull request #127487 from hakuna-matatah/jobperf-delete-eventhandler
Offload the main Job reconciler w.r.t cleaning finalizers
2024-09-23 18:08:06 +01:00
Kubernetes Prow Robot
7ff0580bc8
Merge pull request #127458 from ii/promote-volume-attachment-status-test
Promote e2e test for VolumeAttachmentStatus Endpoints +3 Endpoints
2024-09-23 18:08:00 +01:00
Ben Luddy
c8b1037a58
Add test for unintended changes to dynamic client response handling.
The goal is to increase confidence that a change to the dynamic client does not unintentionally
introduce subtle changes to objects returned by dynamic clients in existing programs.
2024-09-23 12:45:22 -04:00
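
A generic sketch of such a golden check (file names and the canned response are illustrative, not the actual test added by the PR): decode a stored API response the way a client would and fail if the decoded object drifts from the committed golden copy.

    // Hedged sketch of a golden-response test; not the dynamic client's own test.
    package golden

    import (
        "encoding/json"
        "os"
        "reflect"
        "testing"
    )

    func TestGoldenResponse(t *testing.T) {
        response := []byte(`{"apiVersion":"v1","kind":"ConfigMap","metadata":{"name":"example"},"data":{"k":"v"}}`)

        var decoded map[string]interface{}
        if err := json.Unmarshal(response, &decoded); err != nil {
            t.Fatal(err)
        }

        goldenBytes, err := os.ReadFile("testdata/configmap.golden.json") // assumption: committed golden file
        if err != nil {
            t.Fatal(err)
        }
        var golden map[string]interface{}
        if err := json.Unmarshal(goldenBytes, &golden); err != nil {
            t.Fatal(err)
        }

        if !reflect.DeepEqual(decoded, golden) {
            t.Errorf("decoded response drifted from golden copy:\n got: %#v\nwant: %#v", decoded, golden)
        }
    }
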
Maciej Szulik
b51d6308a7
Reuse CreateTestCRD helper for kubectl e2e 2024-09-23 18:32:27 +02:00
Maciej Szulik
3bff2b7ee9
Do not return error where it's not needed 2024-09-23 18:12:31 +02:00
Kubernetes Prow Robot
ff391cefe2
Merge pull request #127547 from dims/skip-reinstallation-of-gpu-daemonset
Skip re-installation of GPU daemonset
2024-09-23 15:28:00 +01:00