kubernetes

mirror of https://github.com/k3s-io/kubernetes.git synced 2025-07-27 05:27:21 +00:00

Author	SHA1	Message	Date
Michal Wozniak	3d68f362c3	Give terminal phase correctly to all pods that will not be restarted	2023-03-16 21:25:29 +01:00
Clayton Coleman	71a36529d1	kubelet: TestSyncKnownPods should not race SyncKnownPods began triggering UpdatePod() for pods that have been orphaned by desired config to ensure pods run to termination. This test reads a mutex protected value while pod workers are running in the background and as a consequence triggers a data race. Wait for the workers to stabilize before reading the value. Other tests validate that the correct sync events are triggered (see kubelet_pods_test.go#TestKubelet_HandlePodCleanups for full verification of this behavior). It is slightly concerning that I was unable to recreate the race locally even under stress testing, but I cannot identify why.	2023-03-13 16:24:37 -06:00
Clayton Coleman	6b9a381185	kubelet: Force deleted pods can fail to move out of terminating If a CRI error occurs during the terminating phase after a pod is force deleted (API or static) then the housekeeping loop will not deliver updates to the pod worker which prevents the pod's state machine from progressing. The pod will remain in the terminating phase but no further attempts to terminate or cleanup will occur until the kubelet is restarted. The pod worker now maintains a store of the pods state that it is attempting to reconcile and uses that to resync unknown pods when SyncKnownPods() is invoked, so that failures in sync methods for unknown pods no longer hang forever. The pod worker's store tracks desired updates and the last update applied on podSyncStatuses. Each goroutine now synchronizes to acquire the next work item, context, and whether the pod can start. This synchronization moves the pending update to the stored last update, which will ensure third parties accessing pod worker state don't see updates before the pod worker begins synchronizing them. As a consequence, the update channel becomes a simple notifier (struct{}) so that SyncKnownPods can coordinate with the pod worker to create a synthetic pending update for unknown pods (i.e. no one besides the pod worker has data about those pods). Otherwise the pending update info would be hidden inside the channel. In order to properly track pending updates, we have to be very careful not to mix RunningPods (which are calculated from the container runtime and are missing all spec info) and config- sourced pods. Update the pod worker to avoid using ToAPIPod() and instead require the pod worker to directly use update.Options.Pod or update.Options.RunningPod for the correct methods. Add a new SyncTerminatingRuntimePod to prevent accidental invocations of runtime only pod data. 
Finally, fix SyncKnownPods to replay the last valid update for undesired pods which drives the pod state machine towards termination, and alter HandlePodCleanups to: - terminate runtime pods that aren't known to the pod worker - launch admitted pods that aren't known to the pod worker Any started pods receive a replay until they reach the finished state, and then are removed from the pod worker. When a desired pod is detected as not being in the worker, the usual cause is that the pod was deleted and recreated with the same UID (almost always a static pod since API UID reuse is statistically unlikely). This simplifies the previous restartable pod support. We are careful to filter for active pods (those not already terminal or those which have been previously rejected by admission). We also force a refresh of the runtime cache to ensure we don't see an older version of the state. Future changes will allow other components that need to view the pod worker's actual state (not the desired state the podManager represents) to retrieve that info from the pod worker. Several bugs in pod lifecycle have been undetectable at runtime because the kubelet does not clearly describe the number of pods in use. To better report, add the following metrics: kubelet_desired_pods: Pods the pod manager sees kubelet_active_pods: "Admitted" pods that gate new pods kubelet_mirror_pods: Mirror pods the kubelet is tracking kubelet_working_pods: Breakdown of pods from the last sync in each phase, orphaned state, and static or not kubelet_restarted_pods_total: A counter for pods that saw a CREATE before the previous pod with the same UID was finished kubelet_orphaned_runtime_pods_total: A counter for pods detected at runtime that were not known to the kubelet. Will be populated at Kubelet startup and should never be incremented after. Add a metric check to our e2e tests that verifies the values are captured correctly during a serial test, and then verify them in detail in unit tests. 
Adds 23 series to the kubelet /metrics endpoint.	2023-03-08 22:03:51 -06:00
Kubernetes Prow Robot	0b57f4ed4b	Merge pull request #110071 from gjkim42/deflake-TestStaticPodExclusion Deflake TestStaticPodExclusion	2022-07-29 13:17:43 -07:00
Rodrigo Campos	466c4d24a9	pkg/kubelet: skip long test on short mode When adding functionality to the kubelet package and a test file, it is kind of painful to run unit tests locally today. We usually can't run specifying the test file, as if xx_test.go and xx.go use the same package, we need to specify all the dependencies. As soon as xx.go uses the Kubelet type (we need to do that to fake a kubelet in the unit tests), this is completely impossible to do in practice. So the other option is to run the unit tests for the whole package or run only a specific function. Running a single function can work in some cases, but it is painful when we want to test all the functions we wrote. On the other hand, running the test for the whole package is very slow. Today some unit tests try to connect to the API server (with retries), create and list lots of pods/volumes, etc. This makes running the unit test for the kubelet package slow. This patch tries to make running the unit test for the whole package more palatable. This patch adds a skip if the short version was requested (go test -short ...), so we don't try to connect to the API server and we skip other slow tests. Before this patch running the unit tests took on my computer (I've run it several times so the compilation is already done): $ time go test -v real 0m21.303s user 0m9.033s sys 0m2.052s With this patch it takes ~1/3 of the time: $ time go test -short -v real 0m7.825s user 0m9.588s sys 0m1.723s Around 8 seconds is something I can wait to run the tests :) Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>	2022-06-24 18:00:21 +02:00
Gunju Kim	563c99599f	Deflake TestStaticPodExclusion	2022-05-16 23:30:57 +09:00
Kubernetes Prow Robot	e04a4e1c5b	Merge pull request #105599 from jonyhy96/fix-pod-workers-test fix: pod workers test	2022-04-02 06:50:09 -07:00
Clayton Coleman	69a3820214	kubelet: Delay writing a terminal phase until the pod is terminated Other components must know when the Kubelet has released critical resources for terminal pods. Do not set the phase in the apiserver to terminal until all containers are stopped and cannot restart. As a consequence of this change, the Kubelet must explicitly transition a terminal pod to the terminating state in the pod worker which is handled by returning a new isTerminal boolean from syncPod. Finally, if a pod with init containers hasn't been initialized yet, don't default container statuses or not yet attempted init containers to the unknown failure state.	2022-03-16 13:15:00 -04:00
jonyhy96	60cd896602	fix: pod worker test Signed-off-by: jonyhy96 <hy352144278@gmail.com>	2022-02-24 16:35:33 +08:00
Gunju Kim	3ce5c944a8	kubelet: Clean up a static pod that has been terminated before starting - Allow a podWorker to start if it is blocked by a pod that has been terminated before starting - When a pod can't start AND has already been terminated, exit cleanly - Add a unit test that exercises race conditions in pod workers	2022-02-02 16:05:32 -05:00
Gunju Kim	3bce245279	Ensure there is one running static pod with the same full name	2021-10-19 16:30:18 +09:00
Ryan Phillips	0166d446b9	kubelet: set terminated podWorker status for terminated pods	2021-10-07 16:18:59 -05:00
Clayton Coleman	d5719800bf	kubelet: Handle UID reuse in pod worker If a pod is killed (no longer wanted) and then a subsequent create/ add/update event is seen in the pod worker, assume that a pod UID was reused (as it could be in static pods) and have the next SyncKnownPods after the pod terminates remove the worker history so that the config loop can restart the static pod, as well as return to the caller the fact that this termination was not final. The housekeeping loop then reconciles the desired state of the Kubelet (pods in pod manager that are not in a terminal state, i.e. admitted pods) with the pod worker by resubmitting those pods. This adds a small amount of latency (2s) when a pod UID is reused and the pod is terminated and restarted.	2021-09-15 14:02:00 -04:00
Kubernetes Prow Robot	047a6b9f86	Merge pull request #104874 from wojtek-t/migrate_clock_1 Unify towards k8s.io/utils/clock - part 1	2021-09-13 19:09:20 -07:00
wojtekt	53ce79a18a	Migrate to k8s.io/utils/clock in pkg/kubelet	2021-09-10 12:20:09 +02:00
Kubernetes Prow Robot	5724484bda	Merge pull request #104069 from pacoxu/fix-data-race-104057 fix data race in kubelet volume test: add lock for ut	2021-09-09 21:09:59 -07:00
paco	ab055e9ba4	fix data race in kubelet volume test: add lock Signed-off-by: Paco Xu <paco.xu@daocloud.io> Co-authored-by: Jian Zeng <zengjian.zj@bytedance.com>	2021-09-01 16:13:55 +08:00
Clayton Coleman	a2ca66d280	kubelet: Admission must exclude completed pods and avoid races Fixes two issues with how the pod worker refactor calculated the pods that admission could see (GetActivePods() and filterOutTerminatedPods()). First, completed pods must be filtered from the "desired" state for admission, which arguably should be happening earlier in config. Exclude the two terminal pod states from GetActivePods(). Second, the previous check introduced with the pod worker lifecycle ownership changes was subtly wrong for the admission use case. Admission has to include pods that haven't yet hit the pod worker, which CouldHaveRunningContainers was filtering out (because the pod worker hasn't seen them). Introduce a weaker check - IsPodKnownTerminated() - that returns true only if the pod is in a known terminated state (no running containers AND known to pod worker). This weaker check may only be called from components that need admitted pods, not other kubelet subsystems. This commit does not fix the long standing bug that force deleted pods are omitted from admission checks, which must be fixed by having GetActivePods() also include pods "still terminating".	2021-08-25 13:31:02 -04:00
Clayton Coleman	de9cdab5ae	kubelet: Prevent runtime-only pods from going into terminated phase If a pod is already in terminated and the housekeeping loop sees an out of date cache entry for a running container, the pod worker should ignore that running pod termination request. Once the worker completes, a subsequent housekeeping invocation will then invoke terminating because the worker is no longer processing any pod with that UID. This does leave the possibility of syncTerminatedPod being blocked if a container in the pod is started after killPod successfully completes but before syncTerminatedPod can exit successfully, perhaps because the terminated flow (detach volumes) is blocked on that running container. A future change will address that issue.	2021-07-13 15:41:49 -04:00
Clayton Coleman	3eadd1a9ea	Keep pod worker running until pod is truly complete A number of race conditions exist when pods are terminated early in their lifecycle because components in the kubelet need to know "no running containers" or "containers can't be started from now on" but were relying on outdated state. Only the pod worker knows whether containers are being started for a given pod, which is required to know when a pod is "terminated" (no running containers, none coming). Move that responsibility and podKiller function into the pod workers, and have everything that was killing the pod go into the UpdatePod loop. Split syncPod into three phases - setup, terminate containers, and cleanup pod - and have transitions between those methods be visible to other components. After this change, to kill a pod you tell the pod worker to UpdatePod({UpdateType: SyncPodKill, Pod: pod}). Several places in the kubelet were incorrect about whether they were handling terminating (should stop running, might have containers) or terminated (no running containers) pods. The pod worker exposes methods that allow other loops to know when to set up or tear down resources based on the state of the pod - these methods remove the possibility of race conditions by ensuring a single component is responsible for knowing each pod's allowed state and other components simply delegate to checking whether they are in the window by UID. Removing containers now no longer blocks final pod deletion in the API server and are handled as background cleanup. Node shutdown no longer marks pods as failed as they can be restarted in the next step. See https://docs.google.com/document/d/1Pic5TPntdJnYfIpBeZndDelM-AbS4FN9H2GTLFhoJ04/edit# for details	2021-07-06 15:55:22 -04:00
KeZhang	83ee5da75e	Fix:slow memory leak may be in kubelet podworkers.isWorking	2021-06-15 15:26:30 +08:00
Jordan Liggitt	124a5ddf72	Fix int->string casts	2020-07-24 16:23:12 -04:00
Ted Yu	2242e396d4	Pass desiredPods to CleanupPods	2019-07-03 10:35:13 +08:00
Davanum Srinivas	7b8c9acc09	remove unused code Change-Id: If821920ec8872e326b7d85437ad8d2620807799d	2019-04-19 08:36:31 -04:00
mYmNeo	e74cabe545	Correct TestUpdatePod comment Signed-off-by: mYmNeo <thomassong2012@gmail.com>	2017-10-20 09:41:18 +08:00
Klaus Ma	63b78a37e0	Added golint check for pkg/kubelet.	2017-07-19 11:33:06 +08:00
Chao Xu	60604f8818	run hack/update-all	2017-06-22 11:31:03 -07:00
Chao Xu	f4989a45a5	run root-rewrite-v1-..., compile	2017-06-22 10:25:57 -07:00
Clayton Coleman	3e095d12b4	Refactor move of client-go/util/clock to apimachinery	2017-05-20 14:19:48 -04:00
deads2k	8a12000402	move client/record	2017-01-31 19:14:13 -05:00
deads2k	5a8f075197	move authoritative client-go utils out of pkg	2017-01-24 08:59:18 -05:00
deads2k	c47717134b	move utils used in restclient to client-go	2017-01-19 07:55:14 -05:00
Clayton Coleman	9a2a50cda7	refactor: use metav1.ObjectMeta in other types	2017-01-17 16:17:19 -05:00
deads2k	6a4d5cd7cc	start the apimachinery repo	2017-01-11 09:09:48 -05:00
Chao Xu	5e1adf91df	cmd/kubelet	2016-11-23 15:53:09 -08:00
derekwaynecarr	ff017839c7	Log an event when container runtime exceeds grace-period during eviction	2016-09-07 13:28:08 -04:00
Harry Zhang	cb14b35bde	Refactor util clock into its own pkg	2016-07-28 02:29:04 -04:00
David McMahon	ef0c9f0c5b	Remove "All rights reserved" from all the headers.	2016-06-29 17:47:36 -07:00
derekwaynecarr	6fefb428c1	Add killPodNow to kubelet	2016-05-12 19:17:08 -04:00
Saad Ali	25f37007aa	Merge pull request #24846 from pmorie/kubelet-test-loc Reduce LOC in kubelet tests	2016-05-12 15:52:27 -07:00
Paul Morie	d1e0e726f2	Reduce LOC in kubelet tests	2016-05-03 22:45:08 -04:00
Random-Liu	4cca5b2290	Use fake clock in TestGetPodsToSync to fix flake.	2016-05-02 16:05:36 -07:00
Tim St. Clair	7b6d843309	Move test-only files to test-only packages	2016-03-01 09:11:32 -08:00
Yu-Ju Hong	ff04de4fc0	Remove RuntimeCache from sync path This change removes RuntimeCache in the pod workers and the syncPod() function. Note that it doesn't deprecate RuntimeCache completely as other components still rely on the cache.	2016-02-01 21:32:41 -08:00
Yu-Ju Hong	cfb5442b2d	Turn on kubecontainer.Cache in kubelet	2016-01-19 18:15:10 -08:00
Piotr Szczesniak	9659057986	Revert "Enable kubecontainer.Cache in kubelet"	2016-01-18 13:35:41 +01:00
Yu-Ju Hong	07cf5cff48	Enable kubecontainer.Cache in kubelet	2016-01-14 09:31:24 -08:00
Lantao Liu	a35220c321	cleanup pod_workers_test.go to use general runtime interface	2015-11-04 16:55:25 -08:00
Yu-Ju Hong	2eb17df46b	kubelet: independent pod syncs and backoff on error Currently kubelet syncs all pods every 10s. This is not preferred because * Some pods may have been sync'd recently. * This may cause all the pods to be sync'd at once, causing undesirable CPU spikes. This PR replaces the global syncs with independent, periodic pod syncs. At the end of syncing, each pod worker will enqueue itself with a future timestamp (current time + sync interval), when it will be due for another sync. * If the pod worker encounters a sync error, it may requeue with a different timestamp to retry sooner. * If a sync is triggered by the update channel (events or spec changes), the pod worker would enqueue a new sync time. This change is necessary for moving to long or no periodic sync period once pod lifecycle event generator is completed. We will still rely on the mechanism to requeue the pod on sync error. This change also makes sure that if a sync does not succeed (either due to real error or the per-container backoff mechanism), an error would be propagated back to the pod worker, which is responsible for requeuing.	2015-11-03 13:29:08 -08:00
eulerzgy	31c09bdcb8	Del capital local packagename for cadvisorApi	2015-10-16 11:03:50 +08:00

79 Commits