Commit Graph

50 Commits

Author SHA1 Message Date
Vinay Kulkarni
f2bd94a0de In-place Pod Vertical Scaling - core implementation
1. Core Kubelet changes to implement In-place Pod Vertical Scaling.
2. E2E tests for In-place Pod Vertical Scaling.
3. Refactor kubelet code and add missing tests (Derek's kubelet review)
4. Add a new hash over container fields without Resources field to allow feature gate toggling without restarting containers not using the feature.
5. Fix corner-case where resize A->B->A gets ignored
6. Add cgroup v2 support to pod resize E2E test.
KEP: /enhancements/keps/sig-node/1287-in-place-update-pod-resources

Co-authored-by: Chen Wang <Chen.Wang1@ibm.com>
2023-02-24 18:21:21 +00:00
arrowfeng
6a57404e28 kubelet: cleanup secretManager and configManager in podManager
Signed-off-by: arrowfeng <289716347@qq.com>
2022-11-14 23:05:32 +08:00
David Ashpole
64af1adace
Second attempt: Plumb context to Kubelet CRI calls (#113591)
* plumb context from CRI calls through kubelet

* clean up extra timeouts

* try fixing incorrectly cancelled context
2022-11-05 06:02:13 -07:00
Artur Żyliński
9f31669a53 New histogram: Pod start SLI duration 2022-10-26 11:28:17 +02:00
Jack
7655702313 add container probe duration metrics 2022-01-20 16:50:02 -08:00
Sergey Kanzhelev
a11453efbc remove ReallyCrashForTesting and cleaned up some references to HandleCrash behavior 2021-11-29 20:00:10 +00:00
Shiming Zhang
44e9f6175d Fix test
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
2021-04-14 17:02:27 +08:00
Krzysztof Gibuła
e46b280f96 Replace klog with with testing.T logging in pkg/kubelet tests 2021-03-07 23:10:02 +01:00
Matthias Bertschy
e2edd4a055 Stop probing a pod during graceful shutdown 2021-02-03 19:23:21 +01:00
Amim Knabben
0ed41c3f10 Deprecating --bootstrap-checkpoint-path flag 2020-06-09 15:27:01 -04:00
Eric Mountain
22e0ee768b Removes container RefManager 2020-03-16 14:30:57 +01:00
Matthias Bertschy
66595d54a0 Add startupProbe result handling to kuberuntime 2019-11-13 08:12:54 +01:00
Matthias Bertschy
1a08ea5984 startupProbe: Test changes 2019-08-30 00:40:26 +02:00
Tony Han
b49e07846f reset resultRun to 0 on pod restart 2018-04-19 22:58:19 +08:00
vikaschoudhary16
cedbd93255 Make 'pod' package to use unified checkpointManager
Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>
2018-04-16 01:30:20 -04:00
Di Xu
48388fec7e fix all the typos across the project 2018-02-11 11:04:14 +08:00
ymqytw
3dfc8bf7f3 update import 2017-07-20 11:03:49 -07:00
Jacob Simpson
29c1b81d4c Scripted migration from clientset_generated to client-go. 2017-07-17 15:05:37 -07:00
Chao Xu
60604f8818 run hack/update-all 2017-06-22 11:31:03 -07:00
Chao Xu
f4989a45a5 run root-rewrite-v1-..., compile 2017-06-22 10:25:57 -07:00
Kubernetes Submit Queue
b641aedcac Merge pull request #46371 from sjenning/fix-liveness-probe-reset
Automatic merge from submit-queue

reset resultRun on pod restart

xref https://bugzilla.redhat.com/show_bug.cgi?id=1455056

There is currently an issue where, if the pod is restarted due to liveness probe failures exceeding failureThreshold, the failure count is not reset on the probe worker.  When the pod restarts, if the liveness probe fails even once, the pod is restarted again, not honoring failureThreshold on the restart.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
  - name: busybox
    image: busybox
    command:
    - sleep
    - "3600"
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 3
      timeoutSeconds: 1
      periodSeconds: 3
      successThreshold: 1
      failureThreshold: 5
  terminationGracePeriodSeconds: 0
```

Before this PR:
```
$ kubectl create -f busybox-probe-fail.yaml 
pod "busybox" created
$ kubectl get pod -w
NAME      READY     STATUS    RESTARTS   AGE
busybox   1/1       Running   0          4s
busybox   1/1       Running   1         24s
busybox   1/1       Running   2         33s
busybox   0/1       CrashLoopBackOff   2         39s
```

After this PR:
```
$ kubectl create -f busybox-probe-fail.yaml
$ kubectl get pod -w
NAME      READY     STATUS              RESTARTS   AGE
busybox   0/1       ContainerCreating   0          2s
busybox   1/1       Running   0         4s
busybox   1/1       Running   1         27s
busybox   1/1       Running   2         45s
```

```release-note
Fix kubelet reset liveness probe failure count across pod restart boundaries
```

Restarts are now happen at even intervals.

@derekwaynecarr
2017-06-03 15:15:49 -07:00
Shyam Jeedigunta
4425864707 Migrate kubelet configmap management logic to an interface 2017-05-31 10:39:36 +02:00
Seth Jennings
2c866a7aaa reset resultRun on pod restart 2017-05-24 14:55:53 -05:00
David Ashpole
1d38818326 Revert "Merge pull request #41202 from dashpole/revert-41095-deletion_pod_lifecycle"
This reverts commit ff87d13b2c, reversing
changes made to 46becf2c81.
2017-02-15 08:44:03 -08:00
David Ashpole
b224f83c37 Revert "[Kubelet] Delay deletion of pod from the API server until volumes are deleted" 2017-02-09 08:45:18 -08:00
David Ashpole
67cb2704c5 delete volumes before pod deletion 2017-02-08 07:34:49 -08:00
deads2k
8a12000402 move client/record 2017-01-31 19:14:13 -05:00
Aleksandra Malinowska
74e1d8078e Revert "Delay deletion of pod from the API server until volumes are deleted" 2017-01-27 13:31:02 +01:00
David Ashpole
9094b57570 cleanup volumes before deleting from the api server 2017-01-25 10:21:15 -08:00
Wojciech Tyczynski
09e4de385c Enable nontrivial secret manager 2017-01-19 19:47:33 +01:00
deads2k
6a4d5cd7cc start the apimachinery repo 2017-01-11 09:09:48 -05:00
Chao Xu
03d8820edc rename /release_1_5 to /clientset 2016-12-14 12:39:48 -08:00
Clayton Coleman
3454a8d52c
refactor: update bazel, codec, and gofmt 2016-12-03 19:10:53 -05:00
Clayton Coleman
5df8cc39c9
refactor: generated 2016-12-03 19:10:46 -05:00
Chao Xu
5e1adf91df cmd/kubelet 2016-11-23 15:53:09 -08:00
David McMahon
ef0c9f0c5b Remove "All rights reserved" from all the headers. 2016-06-29 17:47:36 -07:00
Yu-Ju Hong
1a3d205faf kubelet: stop probing if liveness check fails
This change puts the worker on hold when the liveness check fails. It will
resume probing when observing a new container ID.
2016-02-26 17:12:27 -08:00
Chao Xu
ad46715f51 generate fake client for release_1_2 2016-02-17 16:10:02 -08:00
Tim St. Clair
da0d37f1e0 Fix panic from multiple probe cleanup calls. 2016-02-08 11:23:07 -08:00
Jan Chaloupka
4389b3f0d6 Rewritte util.* -> wait.* wherever reasonable 2016-02-07 12:02:20 +01:00
Chao Xu
cddd7b56a4 replace client with clientset in kubelet and other places 2016-02-02 20:28:45 -08:00
harry
1032067ff9 Replace runtime reference by pkg 2016-02-01 21:06:44 +08:00
Tim St. Clair
67cfed5bf3 Don't wait for sync to update readiness
Push status updates as soon as readiness state changes for containers,
rather than waiting for the sync loop to update the status. In
particular, this should help new containers to come online faster.

Additionally, consolidates prober test helpers into a single file.
2015-11-10 14:00:12 -08:00
Tim St. Clair
1e88a682da Add liveness/readiness probe parameters
- PeriodSeconds - How often to probe
- SuccessThreshold - Number of successful probes to go from failure to success state
- FailureThreshold - Number of failing probes to go from success to failure state

This commit includes to changes in behavior:

1. InitialDelaySeconds now defaults to 10 seconds, rather than the
kubelet sync interval (although that also defaults to 10 seconds).
2. Prober only retries on probe error, not failure. To compensate, the
default FailureThreshold is set to the maxRetries, 3.
2015-11-06 10:46:40 -08:00
Tim St. Clair
858126b42a Clean up static/mirror pod status logic
- status.Manager always deals with the local (static) pod, but gets the
  mirror pod when syncing
  - This lets components like the probe workers ignore mirror pods
2015-11-04 11:42:25 -08:00
Tim St. Clair
9a2089adc8 Concurrency fixes in status.Manager
- Fix deadlock when syncing deleted pods with full update channel
- Prevent sending stale updates to API server
- Don't delete cached status when sync fails (causes problems for prober)
2015-10-28 17:40:55 -07:00
Tim St. Clair
a263c77b65 Refactor liveness probing
This commit builds on previous work and creates an independent
worker for every liveness probe. Liveness probes behave largely the same
as readiness probes, so much of the code is shared by introducing a
probeType paramater to distinguish the type when it matters. The
circular dependency between the runtime and the prober is broken by
exposing a shared liveness ResultsManager, owned by the
kubelet. Finally, an Updates channel is introduced to the ResultsManager
so the kubelet can react to unhealthy containers immediately.
2015-10-19 15:15:59 -07:00
Tim St. Clair
9b8bc50357 Generalize readinessManager to handle liveness 2015-10-07 16:40:30 -07:00
Tim St. Clair
551eff63b8 Use strong type for container ID
Change all references to the container ID in pkg/kubelet/... to the
strong type defined in pkg/kubelet/container: ContainerID

The motivation for this change is to make the format of the ID
unambiguous, specifically whether or not it includes the runtime
prefix (e.g. "docker://").
2015-10-07 10:58:05 -07:00
Tim St. Clair
52ece0c34e Refactor readiness probing
Each container with a readiness has an individual go-routine which
handles periodic probing for that container. The results are cached, and
written to the status.Manager in the pod sync path.
2015-10-02 15:37:10 -07:00