Commit Graph

114824 Commits

Author · SHA1 · Message · Date
Kubernetes Prow Robot
1b647d5bf8
Merge pull request #114558 from TommyStarK/unit-tests/pkg-kubelet-nodestatus
kubelet/nodestatus: Improving test coverage
2023-03-09 21:34:00 -08:00
Kubernetes Prow Robot
10802e9be1
Merge pull request #114498 from runzhliu/patch-2
Update kuberuntime_manager_test.go
2023-03-09 21:33:52 -08:00
Kubernetes Prow Robot
78fe7c0dbc
Merge pull request #114426 from my-git9/ut/request_
[UT] add test for pkg/probe/http/request.go
2023-03-09 21:33:45 -08:00
Kubernetes Prow Robot
ccba890df9
Merge pull request #114420 from bzsuni/bz/optimization
Cleanup: fix variable names in comments
2023-03-09 21:33:37 -08:00
Kubernetes Prow Robot
8745423a7a
Merge pull request #114397 from my-git9/ut/util
[UT] Add test for pkg/probe/util.go
2023-03-09 21:33:30 -08:00
Kubernetes Prow Robot
f6564d33ba
Merge pull request #114357 from dengyufeng2206/1208pull
Fix log spelling and formatting
2023-03-09 21:33:22 -08:00
Kubernetes Prow Robot
37260b7648
Merge pull request #114290 from Jefftree/mime-remove
remove mime AddExtensionType in discovery test
2023-03-09 21:33:14 -08:00
Kubernetes Prow Robot
a3ad4d7623
Merge pull request #114017 from calvin0327/cleanup-containerruntime-options
cleanup container runtime options
2023-03-09 21:33:06 -08:00
Kubernetes Prow Robot
c58c1efd03
Merge pull request #112882 from pintuiitbhi/comment
apiserver: grammar correction of comment
2023-03-09 21:32:54 -08:00
Kubernetes Prow Robot
fdada9242f
Merge pull request #112061 from JoelSpeed/proto-gopath
Ensure go-to-protobuf gen can run when not in GOPATH
2023-03-09 21:32:42 -08:00
Kubernetes Prow Robot
15f5a5c6ef
Merge pull request #110949 from claudiubelu/adds-unittests-4
tests: Ports kubelet unit tests to Windows
2023-03-09 21:32:30 -08:00
Kubernetes Prow Robot
d241fcb4bd
Merge pull request #110760 from zhoumingcheng/master-unit-v2
add unit test coverage for pkg/kubelet/types/
2023-03-09 20:30:29 -08:00
Kubernetes Prow Robot
33d9543ceb
Merge pull request #111634 from KunWuLuan/pluginmanager_cache_log_amend
docs(desired_state_of_world.go): log in desired_state_of_world.go seems to be wrong
2023-03-09 19:08:29 -08:00
Kermit Alexander II
4e26f680a9 Implement MessageExpression. 2023-03-09 23:37:59 +00:00
Kermit Alexander II
6defbb4410 Update codegen/openapi spec. 2023-03-09 23:37:49 +00:00
Kubernetes Prow Robot
45b96eae98
Merge pull request #113145 from smarterclayton/zombie_terminating_pods
kubelet: Force deleted pods can fail to move out of terminating
2023-03-09 15:32:30 -08:00
Todd Neal
4096c9209c dedupe pod resource request calculation 2023-03-09 17:15:53 -06:00
Kermit Alexander II
4a54225bb4 Add MessageExpression field.
Update docs to note that generating line breaks from messageExpression is not allowed.
2023-03-09 22:25:38 +00:00
Kubernetes Prow Robot
c67953a2d0
Merge pull request #116428 from mborsz/fix
Avoid metric lookup in Parallelizer.Until on every work piece
2023-03-09 13:08:29 -08:00
Patrick Ohly
b4751a52d5 client-go: shut down watch reflector as soon as stop channel closes
Without this change, sometimes leaked goroutines were reported for
test/integration/scheduler_perf. The one that caused the cleanup to get delayed
was this one:

    goleak.go:50: found unexpected goroutines:
        [Goroutine 2704 in state chan receive, 2 minutes, with k8s.io/client-go/tools/cache.(*Reflector).watch on top of the stack:
        goroutine 2704 [chan receive, 2 minutes]:
        k8s.io/client-go/tools/cache.(*Reflector).watch(0xc00453f590, {0x0, 0x0}, 0x1f?, 0xc00a128080?)
                /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/reflector.go:388 +0x5b3
        k8s.io/client-go/tools/cache.(*Reflector).ListAndWatch(0xc00453f590, 0xc006e94900)
                /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/reflector.go:324 +0x3bd
        k8s.io/client-go/tools/cache.(*Reflector).Run.func1()
                /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/reflector.go:279 +0x45
        k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc007aafee0)
                /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:157 +0x49
        k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc003e18150?, {0x75e37c0, 0xc00389c280}, 0x1, 0xc006e94900)
                /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:158 +0xcf
        k8s.io/client-go/tools/cache.(*Reflector).Run(0xc00453f590, 0xc006e94900)
                /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/reflector.go:278 +0x257
        k8s.io/apimachinery/pkg/util/wait.(*Group).StartWithChannel.func1()
                /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:58 +0x3f
        k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()
                /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:75 +0x74
        created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start
                /nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:73 +0xe5

watch() was stuck in an exponential backoff timeout. Logging confirmed that:

        I0309 21:14:21.756149 1572727 reflector.go:387] k8s.io/client-go/informers/factory.go:150: watch of *v1.PersistentVolumeClaim returned Get "https://127.0.0.1:38269/api/v1/persistentvolumeclaims?allowWatchBookmarks=true&resourceVersion=1&timeout=7m47s&timeoutSeconds=467&watch=true": dial tcp 127.0.0.1:38269: connect: connection refused - backing off
2023-03-09 21:24:19 +01:00
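
The pattern behind this fix is to make any wait between watch retries select on the
stop channel as well as the backoff timer, so a reflector blocked between retries
exits as soon as its caller shuts it down. A minimal sketch of that pattern, with
illustrative names rather than the actual client-go code:

    package main

    import (
        "fmt"
        "time"
    )

    // sleepWithStop blocks for delay, but returns false immediately if stopCh
    // closes first, so a watcher stuck between retries can exit promptly.
    func sleepWithStop(delay time.Duration, stopCh <-chan struct{}) bool {
        t := time.NewTimer(delay)
        defer t.Stop()
        select {
        case <-stopCh:
            return false // stop requested: abandon the backoff
        case <-t.C:
            return true // backoff elapsed: retry the watch
        }
    }

    func main() {
        stopCh := make(chan struct{})
        go func() {
            time.Sleep(50 * time.Millisecond)
            close(stopCh) // simulate the reflector's stop channel closing
        }()
        if !sleepWithStop(5*time.Second, stopCh) {
            fmt.Println("stopped during backoff; goroutine exits instead of leaking")
        }
    }

Without the select on stopCh, the goroutine sits in the timer receive for the full
backoff interval, which is exactly the "chan receive, 2 minutes" state goleak reported.
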
Alexander Zielenski
9dec9e04cc update vendor modules 2023-03-09 11:54:24 -08:00
Alexander Zielenski
9597abd089 add explain tests for openapiv3 2023-03-09 11:54:24 -08:00
Alexander Zielenski
8249a827bd kubectl: alias plaintext-openapiv2 to old explain 2023-03-09 11:54:24 -08:00
Alexander Zielenski
81dd9e3d25 refactor factory to support fake openapiv3 2023-03-09 11:54:23 -08:00
Antoine Pelisse
88ec8fba32 Update kubernetes code for minor API changes to kube-openapi 2023-03-09 11:29:44 -08:00
Antoine Pelisse
9bbdb9f130 Update kube-openapi to 15aac26d736a 2023-03-09 11:29:40 -08:00
Kubernetes Prow Robot
54ec651ab5
Merge pull request #110741 from zhoumingcheng/master-unit-v1
add unit test coverage for pkg/kubelet/util/queue
2023-03-09 11:15:51 -08:00
Kubernetes Prow Robot
c9bbb6553d
Merge pull request #116422 from aojea/nodeslect
unexport buggy function nodeSelectorAsSelector
2023-03-09 10:06:03 -08:00
Alexander Constantinescu
6c5ab4263e Implement metrics agreed on the KEP 2023-03-09 18:27:47 +01:00
Jefftree
387d97605e Add metrics for aggregated discovery 2023-03-09 17:24:02 +00:00
Maciej Borsz
30bca1e1d5 Avoid metric lookup in Parallelizer.Until on every work piece 2023-03-09 17:12:30 +00:00
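
The title describes a classic hot-path optimization: a labeled metric lookup (a
synchronized lookup of the child series in Prometheus-style client libraries) was
being repeated for every work piece, and hoisting it out of the loop avoids that.
A hedged sketch of the pattern, with assumed metric and function names rather than
the scheduler's actual code:

    package parallel

    import (
        "time"

        "github.com/prometheus/client_golang/prometheus"
    )

    var workDuration = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{Name: "work_piece_duration_seconds"},
        []string{"operation"},
    )

    // processAll resolves the labeled child metric once, outside the loop;
    // calling WithLabelValues per piece would repeat a synchronized lookup
    // of the child series on every iteration.
    func processAll(operation string, pieces []func()) {
        observer := workDuration.WithLabelValues(operation)
        for _, piece := range pieces {
            start := time.Now()
            piece()
            observer.Observe(time.Since(start).Seconds())
        }
    }
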
Antonio Ojea
fd62265d19 unexport buggy function nodeSelectorAsSelector
Change-Id: I1e48ac0dd0b33c367fa9be4f4adb11a4531849f9
2023-03-09 16:58:25 +00:00
Francesco Romani
5ca235e0ee e2e: podresources: promote platform-independent test as NodeConformance
We have quite a few podresources e2e tests and, as the feature
progresses to GA, we should consider moving them to NodeConformance.
Unfortunately, most of them require Linux-specific features, not in the
tests themselves but in the test prelude (fixture), to check or create
the node conditions (e.g. presence or absence of devices, online
CPUs...) to be verified in the test proper.

For this reason we promote only a single test for starters.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-03-09 16:26:01 +01:00
Kubernetes Prow Robot
f90643435e
Merge pull request #113840 from 249043822/br-context-logging-statefulset
statefulset: use contextual logging
2023-03-09 06:42:02 -08:00
andyzhangx
5f65c4b562 update vendor 2023-03-09 13:24:11 +00:00
andyzhangx
0b61e0d9f7 fix intree test
remove azuredisk test
2023-03-09 13:24:11 +00:00
andyzhangx
5d0a54dcb5 remove Azure Disk in-tree driver code
fix
2023-03-09 13:24:08 +00:00
Kubernetes Prow Robot
20e4491385
Merge pull request #116409 from p0lyn0mial/upstream-reflector-list-n-watch
reflector: allow watch method to accept a watcher
2023-03-09 05:16:28 -08:00
Kubernetes Prow Robot
f02da82e36
Merge pull request #116404 from cpanato/go1202
[go] Bump images, dependencies and versions to go 1.20.2
2023-03-09 05:16:20 -08:00
Kubernetes Prow Robot
e0dd10e12d
Merge pull request #116394 from liggitt/cleanup-crd-test
Turn off P&F filter in standalone CRD server tests
2023-03-09 05:16:13 -08:00
Arda Güçlü
a901bb630b
Enable plugin resolution as subcommand for selected builtin commands (#116293)
* Enable plugin resolution as subcommand for selected builtin commands

This PR adds external plugin resolution as subcommand for selected builtin
commands if subcommand does not exist as builtin.

In its alpha stage, this will only be enabled for the create command,
and the feature is hidden behind the `KUBECTL_ENABLE_CMD_SHADOW`
environment variable.

* Rename parameter to exactMatch to better reflect its purpose
2023-03-09 05:16:01 -08:00
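
kubectl discovers external plugins as executables named kubectl-<name> on PATH,
so the fallback described here presumably maps a missing builtin subcommand like
`kubectl create foo` to an executable named kubectl-create-foo. A rough sketch of
that idea, with simplified gating and lookup logic, not the exact kubectl
implementation:

    package main

    import (
        "fmt"
        "os"
        "os/exec"
        "strings"
    )

    // resolvePluginFallback returns the path of an external plugin such as
    // kubectl-create-foo when "kubectl create foo" is not a builtin. The real
    // kubectl logic differs; this only illustrates the idea.
    func resolvePluginFallback(args []string) (string, bool) {
        if os.Getenv("KUBECTL_ENABLE_CMD_SHADOW") == "" {
            return "", false // feature is opt-in per the commit message
        }
        name := "kubectl-" + strings.Join(args, "-")
        path, err := exec.LookPath(name)
        return path, err == nil
    }

    func main() {
        if path, ok := resolvePluginFallback([]string{"create", "foo"}); ok {
            fmt.Println("would exec plugin:", path)
        }
    }
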
Alexander Constantinescu
d7060f02ce Implement KEP-3458 2023-03-09 12:04:51 +01:00
Alexander Constantinescu
e30c49b0e8 Add StableLoadBalancerNodeSet feature gate 2023-03-09 12:03:21 +01:00
Kubernetes Prow Robot
87a40ae670
Merge pull request #111658 from alexanderConstantinescu/etp-local-svc-taint-unsched
[CCM - service controller] addressing left over comments from #109706
2023-03-09 02:58:01 -08:00
Lukasz Szaszkiewicz
f6161a51e9 reflector: allow watch method to accept a watcher 2023-03-09 11:29:51 +01:00
Kubernetes Prow Robot
da87af638f
Merge pull request #115856 from lanycrost/e2e-115780-grpc-probe-tests
Promote gRPC probe e2e test to Conformance
2023-03-09 01:06:03 -08:00
cpanato
99c80ac119
[go] Bump images, dependencies and versions to go 1.20.2 2023-03-09 09:57:45 +01:00
Kubernetes Prow Robot
f5ddaa152e
Merge pull request #116392 from seans3/fallback-verifier
Fallback query param verifier
2023-03-08 23:06:00 -08:00
Todd Neal
78ca93e39c rework init containers test to remove host file dependency
Since we can't rely on the test runner and hosts under test to
be on the same machine, we write to the termination log from each
container and concatenate the results.
2023-03-08 23:17:17 -06:00
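
Kubernetes surfaces each container's termination message (read from
/dev/termination-log by default) in the pod status, which lets a test gather
per-container output without reading files on the host. A sketch of that idea
against the public API types; the helper name is hypothetical:

    package podtest

    import (
        "strings"

        v1 "k8s.io/api/core/v1"
    )

    // terminationMessages concatenates the termination messages recorded in
    // the pod status for init containers and regular containers, in order.
    func terminationMessages(pod *v1.Pod) string {
        var b strings.Builder
        statuses := append(append([]v1.ContainerStatus{},
            pod.Status.InitContainerStatuses...), pod.Status.ContainerStatuses...)
        for _, s := range statuses {
            if t := s.State.Terminated; t != nil {
                b.WriteString(t.Message)
            }
        }
        return b.String()
    }
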
Clayton Coleman
6b9a381185
kubelet: Force deleted pods can fail to move out of terminating
If a CRI error occurs during the terminating phase after a pod is
force-deleted (API or static), then the housekeeping loop will not
deliver updates to the pod worker, which prevents the pod's state
machine from progressing. The pod will remain in the terminating
phase, but no further attempts to terminate or clean up will occur
until the kubelet is restarted.

The pod worker now maintains a store of each pod's state that it is
attempting to reconcile, and uses that store to resync unknown pods
when SyncKnownPods() is invoked, so that failures in sync methods for
unknown pods no longer hang forever.

The pod worker's store tracks desired updates and the last update
applied on podSyncStatuses. Each goroutine now synchronizes to
acquire the next work item, context, and whether the pod can start.
This synchronization moves the pending update to the stored last
update, which will ensure third parties accessing pod worker state
don't see updates before the pod worker begins synchronizing them.

As a consequence, the update channel becomes a simple notifier
(struct{}) so that SyncKnownPods can coordinate with the pod worker
to create a synthetic pending update for unknown pods (i.e. no one
besides the pod worker has data about those pods). Otherwise the
pending update info would be hidden inside the channel.

In order to properly track pending updates, we have to be very
careful not to mix RunningPods (which are calculated from the
container runtime and are missing all spec info) and config-
sourced pods. Update the pod worker to avoid using ToAPIPod()
and instead require the pod worker to directly use
update.Options.Pod or update.Options.RunningPod for the
correct methods. Add a new SyncTerminatingRuntimePod to prevent
accidental invocations with runtime-only pod data.

Finally, fix SyncKnownPods to replay the last valid update for
undesired pods which drives the pod state machine towards
termination, and alter HandlePodCleanups to:

- terminate runtime pods that aren't known to the pod worker
- launch admitted pods that aren't known to the pod worker

Any started pods receive a replay until they reach the finished
state, and then are removed from the pod worker. When a desired
pod is detected as not being in the worker, the usual cause is
that the pod was deleted and recreated with the same UID (almost
always a static pod since API UID reuse is statistically
unlikely). This simplifies the previous restartable pod support.
We are careful to filter for active pods (those not already
terminal or those which have been previously rejected by
admission). We also force a refresh of the runtime cache to
ensure we don't see an older version of the state.

Future changes will allow other components that need to view the
pod worker's actual state (not the desired state the podManager
represents) to retrieve that info from the pod worker.

Several bugs in pod lifecycle have been undetectable at runtime
because the kubelet does not clearly describe the number of pods
in use. To better report, add the following metrics:

  kubelet_desired_pods: Pods the pod manager sees
  kubelet_active_pods: "Admitted" pods that gate new pods
  kubelet_mirror_pods: Mirror pods the kubelet is tracking
  kubelet_working_pods: Breakdown of pods from the last sync in
    each phase, orphaned state, and static or not
  kubelet_restarted_pods_total: A counter for pods that saw a
    CREATE before the previous pod with the same UID was finished
  kubelet_orphaned_runtime_pods_total: A counter for pods detected
    at runtime that were not known to the kubelet. Will be
    populated at Kubelet startup and should never be incremented
    after.

Add a metric check to our e2e tests that verifies the values are
captured correctly during a serial test, and then verify them in
detail in unit tests.

Adds 23 series to the kubelet /metrics endpoint.
2023-03-08 22:03:51 -06:00