111396 Commits

Author SHA1 Message Date
Sascha Grunert
b296f82c69
Sort kubelet pods by their creation time
There is a corner case when blocking Pod termination via a lifecycle
preStop hook, for example by using this StatefulSet:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: ubi
  serviceName: "ubi"
  replicas: 1
  template:
    metadata:
      labels:
        app: ubi
    spec:
      terminationGracePeriodSeconds: 1000
      containers:
      - name: ubi
        image: ubuntu:22.04
        command: ['sh', '-c', 'echo The app is running! && sleep 360000']
        ports:
        - containerPort: 80
          name: web
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - 'echo aaa; trap : TERM INT; sleep infinity & wait'
```

After creation, downscaling, forced deletion and upscaling of the
replica like this:

```
> kubectl apply -f sts.yml
> kubectl scale sts web --replicas=0
> kubectl delete pod web-0 --grace-period=0 --force
> kubectl scale sts web --replicas=1
```

We end up with two pods run by the container runtime, while
the API reports only one:

```
> kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          92s
```

```
> sudo crictl pods
POD ID              CREATED              STATE     NAME     NAMESPACE     ATTEMPT     RUNTIME
e05bb7dbb7e44       12 minutes ago       Ready     web-0    default       0           (default)
d90088614c73b       12 minutes ago       Ready     web-0    default       0           (default)
```

Running `kubectl exec -it web-0 -- ps -ef` now has a random chance of hitting the wrong
container, which reports the lifecycle command `/bin/sh -c echo aaa; trap : TERM INT; sleep infinity & wait`.

This is caused by looking up the container by its name only (without a podUID) at:
02109414e8/pkg/kubelet/kubelet_pods.go (L1905-L1914)

More specifically, it is caused by the conversion of the pod result map to a slice in `GetPods`:
02109414e8/pkg/kubelet/kuberuntime/kuberuntime_manager.go (L407-L411)

We now solve that unexpected behavior by tracking the creation time of
the pod and sorting the result based on that, so that the lookup always
matches the most recently created pod.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2022-10-13 16:32:44 +02:00
Khaled Hamed
74db48f510
Update to latest kubedns and nodelocaldns images. 2022-10-13 14:48:37 +02:00
Swati Sehgal
6c6865af28 node: e2e: memorymgr: Fix test failure
The change made in https://github.com/kubernetes/kubernetes/pull/112644
resulted in an update to the rejection message. The memory manager
node e2e test still checked against the old expected error message.
Because the check was not performed correctly, this gave the impression
that the pod had run successfully even though it failed as expected.

In this patch, we update the test to the correct rejection message so
that the memory manager test no longer fails.

NOTE: This test is supposed to run on multi-NUMA systems. If the
underlying node does not have multiple NUMA nodes, the test is skipped,
which is what happens in the upstream test infrastructure, as it is mainly
composed of single-NUMA nodes. Because of this, the test failure
wasn't evident via testgrid.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2022-10-13 12:45:14 +01:00
Paco Xu
2ce7a81169 fsnotify: use event.Has instead of "event.Op&h == h" 2022-10-13 13:42:26 +08:00
Paco Xu
3fee9d2735 update fsnotify to v1.6.0 2022-10-13 13:24:55 +08:00
Han Kang
03015c4bb7 update readme for instrumentation static analysis code
Change-Id: Ibd1261883b3d149de888c9ad2fd7897c00ea3158
2022-10-12 21:22:57 -07:00
Kubernetes Prow Robot
9b4b1c0e79
Merge pull request #113027 from logicalhan/stability-v2
account for timing ratio histogram function calls
2022-10-12 17:51:11 -07:00
Kubernetes Prow Robot
822d983147
Merge pull request #113022 from logicalhan/webhook-metrics
unparameterize 'webhook' from conversion metrics since it's the only one
2022-10-12 17:51:00 -07:00
Kubernetes Prow Robot
fd358d3606
Merge pull request #112219 from kidddddddddddddddddddddd/feat/ExtractParseError
Extract ParseError from PodInfo
2022-10-12 16:45:03 -07:00
Kubernetes Prow Robot
61ca612cbb
Merge pull request #112926 from jiahuif-forks/refactor/cel-out-of-apiextensions
split and move CEL package
2022-10-12 15:03:03 -07:00
Han Kang
bc5d3b06c9 account for timing ratio histogram function calls
Change-Id: Ib27d6018657e4221c36645860bdb9cb9fcf7ebf5
2022-10-12 14:44:28 -07:00
Mark Rossetti
ecd543be04
Remove out-of-support Windows 20H2 images
Signed-off-by: Mark Rossetti <marosset@microsoft.com>
2022-10-12 14:43:51 -07:00
Kubernetes Prow Robot
c56629803d
Merge pull request #112992 from logicalhan/clean-legacy-cp-metrics
fix metric labels in cloud-provider metrics since it breaks static analysis
2022-10-12 14:01:13 -07:00
Kubernetes Prow Robot
e4c4e0c50b
Merge pull request #112991 from logicalhan/explicit-stability
add explicit stability levels for shared metrics
2022-10-12 14:01:02 -07:00
Richa Banker
0dae5510b2 add metrics/slis to kube-scheduler health checks 2022-10-12 13:05:47 -07:00
Kubernetes Prow Robot
c8c955c4cb
Merge pull request #113014 from logicalhan/stability-v2
add support for parsing gauge func
2022-10-12 11:33:01 -07:00
Han Kang
849185a1fa unparameterize 'webhook' from conversion metrics since it's the only one
Change-Id: I6dda5c033786f128e9b2d5d889e47f3dc7937ed5
2022-10-12 10:58:06 -07:00
Dixita Narang
20fa9635d6 Adding ndixita@ to KubeletCredentialProviders feature owner, and capitalizing GA 2022-10-12 17:12:17 +00:00
Han Kang
f3cb904618 cleanup printlns
Change-Id: I49a48446029ba2e66b09f138a1477b837d55766a
2022-10-12 09:47:49 -07:00
Kubernetes Prow Robot
71ca3dad89
Merge pull request #112785 from MartinForReal/master
CloudProvider: service update event should be triggered when appProtocol in port is changed
2022-10-12 09:39:00 -07:00
Han Kang
658d7a184e parse time signatures for maxAge
Change-Id: I91e330d82c4ebbfa38bc52889beb64e6689bfb77
2022-10-12 09:34:14 -07:00
Han Kang
0e7814a647 fix parsing error on labels
Change-Id: I990967b93b10dbfa9a564ca4286ffbd051c69697
2022-10-12 09:25:43 -07:00
Han Kang
49c08947f7 add support for parsing gauge func
Change-Id: Id0b9cd51dead5ee9f4adac804d62f5d9742320a7
2022-10-12 08:30:41 -07:00
Brian McQueen
9c65abd2c2 bumped image version and upgraded to buster and bumped QEMUVERSION to v7.1.0-2 #109295 2022-10-12 07:48:49 -07:00
kidddddddddddddddddddddd
b901ef0f68 changes in test files 2022-10-12 22:11:04 +08:00
Kubernetes Prow Robot
4f2faa2f1c
Merge pull request #112944 from kishen-v/fix_test_failures_go_1_20
Switch to assert.ErrorEquals from assert.Equal to check error equality
2022-10-12 06:23:00 -07:00
kidddddddddddddddddddddd
121d24cfc7 changes in non-test files 2022-10-12 21:09:55 +08:00
kidddddddddddddddddddddd
1eb9d42c3f function changes 2022-10-12 21:00:48 +08:00
Kubernetes Prow Robot
525280d285
Merge pull request #112643 from SergeyKanzhelev/removeDynamicKubeletConfig
remove DynamicKubeletConfig feature gate from the code
2022-10-12 01:33:00 -07:00
Kubernetes Prow Robot
335fd41484
Merge pull request #112978 from logicalhan/kcm-fg
add 'metrics/slis' to kcm health checks
2022-10-11 23:39:00 -07:00
Kubernetes Prow Robot
a56ca8cb03
Merge pull request #112997 from liggitt/dep-approver
Add liggitt to dep-approvers alias
2022-10-11 20:41:10 -07:00
Kubernetes Prow Robot
054d86feb4
Merge pull request #112989 from ameukam/bump-golang.org/x/text-to-v0.3.8
Bump golang.org/x/text to v0.3.8
2022-10-11 20:40:59 -07:00
Kubernetes Prow Robot
73d0fac0db
Merge pull request #112995 from logicalhan/stability-v2
add support for timing histograms and const labels
2022-10-11 18:22:40 -07:00
Jordan Liggitt
ebce13bb69
Add liggitt to dep-approvers alias 2022-10-11 21:05:18 -04:00
Han Kang
52097bc02d add support for timing histograms and const labels
Change-Id: I8f77d5e16c01a403c7cfdccec464a81f4e3beba0
2022-10-11 17:19:14 -07:00
Kubernetes Release Robot
d0e735a664 CHANGELOG: Update directory for v1.26.0-alpha.2 release 2022-10-11 23:27:20 +00:00
Han Kang
44d3bbfbdf fix metric labels in cloud-provider metrics since it breaks static analysis
Change-Id: I8f4efed410cf4fae48d6340a7d45c8c6a28d60e1
2022-10-11 15:19:59 -07:00
Han Kang
e135b66845 add explicit stability levels for shared metrics
Change-Id: I1ad650379b8d0cad76596cdc3fe70397e24abba8
2022-10-11 15:07:08 -07:00
Arnaud Meukam
0d19690a54
Bump golang.org/x/text to v0.3.8
Signed-off-by: Arnaud Meukam <ameukam@gmail.com>
2022-10-11 23:30:39 +02:00
Kubernetes Prow Robot
c1602669a6
Merge pull request #112806 from dcbw/demote-service-affinity-timeout
test: demote service ClientIP affinity timeout tests from conformance
2022-10-11 14:12:40 -07:00
Kubernetes Prow Robot
5113b705d2
Merge pull request #112563 from kerthcet/cleanup/optimize-new-scheduler
Remove newScheduler for reducing complexity
2022-10-11 12:32:41 -07:00
Han Kang
bd2417b435 add 'metrics/slis' to kcm health checks
Change-Id: I8c2114e538bb417deff8c3f9f107758c089227dc
2022-10-11 09:18:42 -07:00
MartinForReal
f68345b2f0 service update event should be triggered when appProtocol in port is changed.
Signed-off-by: MartinForReal <fanshangxiang@gmail.com>
2022-10-11 15:55:46 +00:00
Kubernetes Prow Robot
4516c7972d
Merge pull request #112975 from pohly/e2e-storage-proxy
e2e storage: proxy workarounds
2022-10-11 07:09:02 -07:00
Patrick Ohly
1793132198 e2e storage: avoid usage of stdin for file creation
It turned out to be unreliable (see
https://github.com/kubernetes/kubernetes/issues/112834).  Encoding the data
inside the command as input for base64 is a workaround that is fine for small
amounts of data. It becomes less efficient and/or unusable for large amounts.
2022-10-11 15:02:25 +02:00
Patrick Ohly
64731baffe e2e storage: add logging to proxy
This is optional and can be used to capture the result of command execution in
the log output.
2022-10-11 15:02:25 +02:00
Gunju Kim
add4652352
Promote ExpandedDNSConfig feature to the beta stage
This adds an e2e test for the feature and promotes ExpandedDNSConfig
feature to the beta stage.
2022-10-11 21:00:00 +09:00
Kubernetes Prow Robot
5301d92150
Merge pull request #112945 from chendave/dry-run
kubeadm: Inherit `dry-run` flags for each sub-phases
2022-10-11 03:03:02 -07:00
andyzhangx
36b46ee010 print error msg when fsck failed
fix
2022-10-11 09:23:11 +00:00
Dave Chen
183a26f853 kubeadm: Inherit dry-run flags for each sub-phases
- The sub-phases like `kubeadm reset phase cleanup-node` which
could be run independently would be able to support the `dry-run`
mode as well.

- Consistent with the sub-phases which support the `dry-run` mode
already, such as `kubeadm init phase control-plane apiserver`.

- Prepare for the day when each of those sub-phases could be run
independently.

Signed-off-by: Dave Chen <dave.chen@arm.com>
2022-10-11 16:02:50 +08:00