kubernetes

mirror of https://github.com/k3s-io/kubernetes.git synced 2025-09-05 03:03:40 +00:00

Author	SHA1	Message	Date
Kubernetes Prow Robot	da87af638f	Merge pull request #115856 from lanycrost/e2e-115780-grpc-probe-tests Promote gRPC probe e2e test to Conformance	2023-03-09 01:06:03 -08:00
cpanato	99c80ac119	[go] Bump images, dependencies and versions to go 1.20.2	2023-03-09 09:57:45 +01:00
Kubernetes Prow Robot	f5ddaa152e	Merge pull request #116392 from seans3/fallback-verifier Fallback query param verifier	2023-03-08 23:06:00 -08:00
Todd Neal	78ca93e39c	rework init containers test to remove host file dependency Since we can't rely on the test runner and hosts under test to be on the same machine, we write to the terminate log from each container and concatenate the results.	2023-03-08 23:17:17 -06:00
Clayton Coleman	6b9a381185	kubelet: Force deleted pods can fail to move out of terminating
If a CRI error occurs during the terminating phase after a pod is force deleted (API or static) then the housekeeping loop will not deliver updates to the pod worker which prevents the pod's state machine from progressing. The pod will remain in the terminating phase but no further attempts to terminate or cleanup will occur until the kubelet is restarted.
The pod worker now maintains a store of the pods state that it is attempting to reconcile and uses that to resync unknown pods when SyncKnownPods() is invoked, so that failures in sync methods for unknown pods no longer hang forever.
The pod worker's store tracks desired updates and the last update applied on podSyncStatuses. Each goroutine now synchronizes to acquire the next work item, context, and whether the pod can start. This synchronization moves the pending update to the stored last update, which will ensure third parties accessing pod worker state don't see updates before the pod worker begins synchronizing them.
As a consequence, the update channel becomes a simple notifier (struct{}) so that SyncKnownPods can coordinate with the pod worker to create a synthetic pending update for unknown pods (i.e. no one besides the pod worker has data about those pods). Otherwise the pending update info would be hidden inside the channel.
In order to properly track pending updates, we have to be very careful not to mix RunningPods (which are calculated from the container runtime and are missing all spec info) and config-sourced pods. Update the pod worker to avoid using ToAPIPod() and instead require the pod worker to directly use update.Options.Pod or update.Options.RunningPod for the correct methods. Add a new SyncTerminatingRuntimePod to prevent accidental invocations of runtime only pod data.
Finally, fix SyncKnownPods to replay the last valid update for undesired pods which drives the pod state machine towards termination, and alter HandlePodCleanups to:
- terminate runtime pods that aren't known to the pod worker
- launch admitted pods that aren't known to the pod worker
Any started pods receive a replay until they reach the finished state, and then are removed from the pod worker. When a desired pod is detected as not being in the worker, the usual cause is that the pod was deleted and recreated with the same UID (almost always a static pod since API UID reuse is statistically unlikely). This simplifies the previous restartable pod support. We are careful to filter for active pods (those not already terminal or those which have been previously rejected by admission). We also force a refresh of the runtime cache to ensure we don't see an older version of the state.
Future changes will allow other components that need to view the pod worker's actual state (not the desired state the podManager represents) to retrieve that info from the pod worker.
Several bugs in pod lifecycle have been undetectable at runtime because the kubelet does not clearly describe the number of pods in use. To better report, add the following metrics:
  kubelet_desired_pods: Pods the pod manager sees
  kubelet_active_pods: "Admitted" pods that gate new pods
  kubelet_mirror_pods: Mirror pods the kubelet is tracking
  kubelet_working_pods: Breakdown of pods from the last sync in each phase, orphaned state, and static or not
  kubelet_restarted_pods_total: A counter for pods that saw a CREATE before the previous pod with the same UID was finished
  kubelet_orphaned_runtime_pods_total: A counter for pods detected at runtime that were not known to the kubelet. Will be populated at Kubelet startup and should never be incremented after.
Add a metric check to our e2e tests that verifies the values are captured correctly during a serial test, and then verify them in detail in unit tests.
Adds 23 series to the kubelet /metrics endpoint.	2023-03-08 22:03:51 -06:00
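The "simple notifier (struct{})" idea in commit 6b9a381185 can be illustrated with a small, generic Go sketch. This is not the kubelet's pod worker code: the `update` and `worker` types and the `Submit`, `next`, and `coalesce` names are hypothetical and only show the general pattern, where the data lives in a lock-protected store and the channel merely signals that something is pending.

```go
package main

import (
	"fmt"
	"sync"
)

// update is a hypothetical stand-in for a unit of pending work.
type update struct{ generation int }

// worker keeps the pending and last-applied updates in a store guarded by a
// mutex. The channel carries no data; it only says "look at the store".
type worker struct {
	mu      sync.Mutex
	pending *update
	last    *update
	notify  chan struct{}
}

func newWorker() *worker {
	return &worker{notify: make(chan struct{}, 1)}
}

// Submit overwrites the pending update and sends a non-blocking notification.
func (w *worker) Submit(u update) {
	w.mu.Lock()
	w.pending = &u
	w.mu.Unlock()
	select {
	case w.notify <- struct{}{}:
	default: // a notification is already queued; nothing more to do
	}
}

// next moves the pending update into the "last update" slot and returns it.
func (w *worker) next() (update, bool) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.pending == nil {
		return update{}, false
	}
	w.last, w.pending = w.pending, nil
	return *w.last, true
}

// coalesce submits three updates before any is processed, then drains the
// notifier. It returns how many updates were processed and the last one seen.
func coalesce() (processed int, lastSeen int) {
	w := newWorker()
	for g := 1; g <= 3; g++ {
		w.Submit(update{generation: g})
	}
	for {
		select {
		case <-w.notify:
			if u, ok := w.next(); ok {
				processed++
				lastSeen = u.generation
			}
		default:
			return processed, lastSeen
		}
	}
}

func main() {
	processed, lastSeen := coalesce()
	fmt.Println(processed, lastSeen)
}
```

Because the pending update is visible in the store rather than buffered inside the channel, another component holding the lock could inspect or synthesize it, which is the property the commit message describes.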
Paco Xu	a1def4b9c0	pod-infra-container-image: update comments as it will be removed in couple more releases Signed-off-by: Paco Xu <paco.xu@daocloud.io>	2023-03-09 11:14:32 +08:00
Kubernetes Prow Robot	bbe0eb7595	Merge pull request #116386 from kinvolk/rata/local-up-cleanup hack/local-up-cluster.sh: Cleanup on SIGINT	2023-03-08 18:46:07 -08:00
Kubernetes Prow Robot	625b8be09e	Merge pull request #115371 from pacoxu/cgroup-v2-memory-tuning default memoryThrottlingFactor to 0.9 and optimize the memory.high formulas	2023-03-08 18:46:00 -08:00
Kubernetes Prow Robot	8d5c96fed2	Merge pull request #116093 from swatisehgal/topologymanager-ga-graduation node: topologymgr: Graduate Kubelet Topology Manager to GA	2023-03-08 16:56:06 -08:00
Kubernetes Prow Robot	30ee6914c5	Merge pull request #115149 from nilekhc/encrypt-all Allow encryption for all resources	2023-03-08 16:55:59 -08:00
Kubernetes Prow Robot	7fe0fb7fbf	Merge pull request #116393 from liggitt/etcd-cancel-error Recognize etcd/grpc cancel errors correctly	2023-03-08 15:42:49 -08:00
Kubernetes Prow Robot	8fa82976fc	Merge pull request #116356 from pacoxu/cleanup-bump_qps_kubelet sync default qps of kubelet change everywhere	2023-03-08 15:42:41 -08:00
Maksim Nabokikh	c1431af4f8	KEP-3325: Promote SelfSubjectReview to Beta (#116274) * Promote SelfSubjectReview to Beta Signed-off-by: m.nabokikh <maksim.nabokikh@flant.com> * Fix whoami API Signed-off-by: m.nabokikh <maksim.nabokikh@flant.com> * Fixes according to code review Signed-off-by: m.nabokikh <maksim.nabokikh@flant.com> --------- Signed-off-by: m.nabokikh <maksim.nabokikh@flant.com>	2023-03-08 15:42:33 -08:00
Kubernetes Prow Robot	0a5310fe9a	Merge pull request #116232 from aojea/e2e_terminating_connectivity test connectivity for terminating pods	2023-03-08 15:42:21 -08:00
Kubernetes Prow Robot	8ee9b82b10	Merge pull request #115984 from tzneal/init-container-tests add more init container testing	2023-03-08 15:42:08 -08:00
Jefftree	361391117d	Enable aggregated discovery	2023-03-08 23:03:52 +00:00
Peter Schuurman	c57bc292de	Add e2e tests for StatefulSetStartOrdinal feature	2023-03-08 14:55:58 -08:00
Kubernetes Prow Robot	4a896644de	Merge pull request #116235 from Jefftree/oas-ga Promote OpenAPI V3 to GA	2023-03-08 14:44:20 -08:00
Kubernetes Prow Robot	b1ba5c5462	Merge pull request #116145 from seans3/discovery-stale Surface "stale" GroupVersions from AggregatedDiscovery	2023-03-08 14:44:08 -08:00
Sean Sullivan	f5865043ed	Fallback query param verifier	2023-03-08 22:20:39 +00:00
Nilekh Chaudhari	9382fab9b6	feat: implements encrypt all Signed-off-by: Nilekh Chaudhari <1626598+nilekhc@users.noreply.github.com>	2023-03-08 22:18:49 +00:00
Antoine Pelisse	4f3859ce91	managedfields: Move most of fieldmanager package to managefields	2023-03-08 13:44:00 -08:00
Jordan Liggitt	ac876e5038	Turn off P&F filter in standalone CRD server tests	2023-03-08 16:21:59 -05:00
Aldo Culquicondor	07a73bb2e1	One lock among PodNominator and SchedulingQueue Change-Id: I17fe5da40250e42c04124c25b530ce6c8dea4154	2023-03-08 16:18:36 -05:00
Kubernetes Prow Robot	8319ac5274	Merge pull request #116383 from Huang-Wei/fix/sched-perf-test fix: remove SchedulingMigratedInTreePVs feature gate in sched perf test	2023-03-08 13:12:20 -08:00
Kubernetes Prow Robot	2a22864d9c	Merge pull request #116381 from pohly/cronjob-integration-test-shutdown cronjob: shut down integration test quickly again	2023-03-08 13:12:08 -08:00
Jordan Liggitt	267eb25e60	Recognize etcd/grpc cancel errors correctly	2023-03-08 15:51:25 -05:00
Todd Neal	123ab80333	add more init container lifetime testing Add some additional init container tests that work via monitoring container lifetime based on logs written to a common file. This allows more easily writing assertions about the container lifetimes with respect to one another.	2023-03-08 14:39:10 -06:00
Kubernetes Prow Robot	8b413d224a	Merge pull request #116342 from msau42/unlock Unlock CSIMigrationvSphere feature gate	2023-03-08 11:27:24 -08:00
Kubernetes Prow Robot	7598ff36cf	Merge pull request #116333 from aojea/multiport_service e2e network test for multiple protocol services on same port	2023-03-08 11:27:12 -08:00
Kubernetes Prow Robot	3307b39bba	Merge pull request #116367 from pohly/lint-config-check golangci-lint: synchronize configs and add verification for that	2023-03-08 09:41:29 -08:00
Kubernetes Prow Robot	a66e139214	Merge pull request #116354 from pacoxu/cleanup-CronJobTimeZone cleanup: sync testdata as CronJobTimeZone is GAed	2023-03-08 09:41:22 -08:00
Jordan Liggitt	003b6d229c	Detect and clean up unneeded after_roundtrip fixtures	2023-03-08 11:38:32 -05:00
Rodrigo Campos	5f568d51be	hack/local-up-cluster.sh: Cleanup on SIGINT Currently we only clean up on exit. Let's trap SIGINT (ctrl-c) too, so we always clean up everything. Otherwise, if we ctrl-c, it is easy to leave something running, especially if we ctrl-c while the cleanup function is running. And when we leave something running and don't reuse the certs ($REUSE_CERTS), which is the default, it fails in weird ways as we can't auth with the new certs. Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>	2023-03-08 17:37:50 +01:00
Wei Huang	c9bc2f98d0	fix: remove SchedulingMigratedInTreePVs feature gate in sched perf test	2023-03-08 08:34:44 -08:00
Patrick Ohly	be82872eff	cronjob: shut down integration test quickly again `6f2cd1b5bd` swapped the order of cancel() and closeFn() so that closeFn got called first when the test was done. This caused it to block while waiting for goroutines which themselves were waiting for the context cancellation. The test still shut down, it just took ~86s instead of ~30s. The fix is to register the cancel twice: once as soon as the context is created (to clean up in case of an unexpected panic) and once after closeFn (because then it'll get called first, as before).	2023-03-08 17:26:47 +01:00
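The ordering problem described in commit be82872eff follows from Go running deferred calls in last-in, first-out order. The sketch below is a minimal illustration of "register the cancel twice", assuming the cleanup is done with `defer`. It is not the actual integration test: `runTest`, `shutdownOrder`, the `record` callback, and the worker goroutine are hypothetical stand-ins.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// runTest mimics a test body whose closeFn waits for goroutines that only
// stop once the context is cancelled. record notes each cleanup step.
func runTest(record func(string)) {
	ctx, cancel := context.WithCancel(context.Background())
	// First registration: guarantees cancellation even on an early panic.
	defer cancel()

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		<-ctx.Done() // blocks until the context is cancelled
		record("worker stopped")
	}()

	// closeFn blocks until the worker has exited.
	closeFn := func() {
		wg.Wait()
		record("closeFn returned")
	}
	defer closeFn()

	// Second registration: deferred last, so it runs first and unblocks
	// the worker before closeFn starts waiting.
	defer func() {
		record("cancel called")
		cancel()
	}()
}

// shutdownOrder runs the simulated test and returns the recorded steps.
func shutdownOrder() []string {
	var mu sync.Mutex
	var order []string
	runTest(func(step string) {
		mu.Lock()
		defer mu.Unlock()
		order = append(order, step)
	})
	return order
}

func main() {
	fmt.Println(shutdownOrder())
}
```

Without the second registration, `closeFn` would run before any cancellation and wait on a worker that cannot exit. Calling `cancel` twice is safe because a context's cancel function does nothing after its first call.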
Kubernetes Prow Robot	713ded7368	Merge pull request #115769 from mochizuki875/revert-115732-revert-root-test Revert "Revert #114605: its unit test requires root permission"	2023-03-08 07:53:24 -08:00
Kubernetes Prow Robot	499a03d88b	Merge pull request #115451 from zhucan/nodeexpandvolume-secret-e2e e2e: add e2e test to node expand volume with secret	2023-03-08 07:53:12 -08:00
Kubernetes Prow Robot	03ff890ef4	Merge pull request #116329 from dims/drop-aws-kubelet-credential-provider-and-cleanup-aws-storage-e2e-tests Drop aws kubelet credential provider and cleanup aws storage e2e tests	2023-03-08 06:49:11 -08:00
Patrick Ohly	a04e20f622	golangci-lint: synchronize configs and add verification for that https://github.com/kubernetes/kubernetes/pull/109728 added a golangci-strict.yaml where gingkolinter and stylecheck (some recent additions to golangci.yaml) were missing. To prevent such mistakes in the future, lines that are intentionally different get annotated with a comment about golangci-strict.yaml or golangci.yaml. Then a suitable diff command in the new verify-golangci-lint-config.sh checks that only such lines, comments and blank lines are different.	2023-03-08 15:23:27 +01:00
Patrick Ohly	cbf7d96a85	garbagecollector: structured logging of objectReference When using JSON as output format, we want objectReference values to be represented as a struct. For example, "item" is such a value: {"ts":1678135015708.349,"caller":"garbagecollector/garbagecollector.go:595","msg":"classify object references","v":5,"item":{"name":"dra-test-driver-g4tkd","namespace":"dra-1830","apiVersion":"v1","uid":"c3f88616-7282-488c-887c-3f04291e6f4f"},"solid":null,"dangling":[{"apiVersion":"apps/v1","kind":"ReplicaSet","name":"dra-test-driver","uid":"dbe9a90c-9dfd-4ad0-8395-e5fa228f9851","controller":true,"blockOwnerDeletion":true}],"waitingForDependentsDeletion":null}	2023-03-08 08:37:56 -05:00
Andy Goldstein	26e3dab78b	garbagecollector: use contextual logging Signed-off-by: Andy Goldstein <andy.goldstein@redhat.com>	2023-03-08 08:37:56 -05:00
nilskch	0c11171b7e	add tests for probe errors and ExecProbeTimeout	2023-03-08 11:59:59 +01:00
ZhangKe10140699	a239b9986b	Migrated the StatefulSet controller (within `kube-controller-manager`) to use [contextual logging](https://k8s.io/docs/concepts/cluster-administration/system-logs/#contextual-logging)	2023-03-08 18:57:57 +08:00
Kubernetes Prow Robot	f99c351992	Merge pull request #116174 from pacoxu/fix-TestNewNodeIpamControllerWithCIDRMasks node ipam controller ut: run test in parallel to avoid timeout	2023-03-08 02:25:12 -08:00
Arda Güçlü	0e98533d1b	Not share process namespace if user explicitly disables it This PR sets higher priority to the `share-processes` flag than provided profile. For example, if user tries to use copy-to debugging with restricted profiling, share process namespace should be false if user explicitly disables it via `--share-processes=false`.	2023-03-08 11:58:28 +03:00
calvin0327	0ffac50126	cleanup container runtime options Signed-off-by: calvin0327 <wen.chen@daocloud.io>	2023-03-08 16:53:19 +08:00
Paco Xu	f368413d65	sync default qps of kubelet change	2023-03-08 14:04:51 +08:00
Paco Xu	23a583dc36	cleanup: sync testdata as CronJobTimeZone is GAed	2023-03-08 13:15:46 +08:00
Paco Xu	0e6636eb33	nodeipam: return error instead of panics	2023-03-08 12:47:14 +08:00


114639 Commits