If, for whatever reason, the context was context.Background(), the additional
goroutine was started and then got stuck forever because
context.Background().Done() is a nil channel. Found when indirectly
instantiating a broadcaster with such a context:
found unexpected goroutines:
[Goroutine 9106 in state chan receive (nil chan), with k8s.io/kubernetes/vendor/k8s.io/client-go/tools/record.NewBroadcaster.func1 on top of the stack:
goroutine 9106 [chan receive (nil chan)]:
k8s.io/kubernetes/vendor/k8s.io/client-go/tools/record.NewBroadcaster.func1()
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/record/event.go:206 +0x2c
created by k8s.io/kubernetes/vendor/k8s.io/client-go/tools/record.NewBroadcaster in goroutine 8957
/home/prow/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/record/event.go:205 +0x1a5
This can be fixed by checking for a nil channel.
Another problem also gets addressed: if Shutdown was called without canceling
the context, the goroutine also didn't stop. Now it waits for the cancelation
context and thus terminates in both cases.
27a68aee3a introduced context support for events. Creating an event
broadcaster with context makes tests more resilient against leaking goroutines
when that context gets canceled at the end of a test and enables per-test
output via ktesting.
27a68aee3a introduced context support for events. Creating an event
broadcaster with context makes tests more resilient against leaking goroutines
when that context gets canceled at the end of a test and enables per-test
output via ktesting.
To use this in kubelet, a more thorough code update is needed. For now,
context.TODO serves as a reminder that this is necessary.
Nov 29 22:56:18.559: INFO: Waited 371.058378ms for the sample-apiserver to be ready to handle requests.
Nov 29 22:56:49.503: INFO: At 0001-01-01 00:00:00 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: { } Scheduled: Successfully assigned e2e-openapiv3-5878/sample-apiserver-deployment-58dfd44dd-gp8nd to ip-10-0-80-91.us-west-2.compute.internal
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:53 +0000 UTC - event for sample-apiserver-deployment: {deployment-controller } ScalingReplicaSet: Scaled up replica set sample-apiserver-deployment-58dfd44dd to 1
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:53 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd: {replicaset-controller } SuccessfulCreate: Created pod: sample-apiserver-deployment-58dfd44dd-gp8nd
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:53 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {multus } AddedInterface: Add eth0 [10.129.2.137/23] from ovn-kubernetes
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:53 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Pulling: Pulling image "quay.io/openshift/community-e2e-images:e2e-3-registry-k8s-io-e2e-test-images-sample-apiserver-1-17-7-F6raNs0YQ76APdUD"
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:57 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Pulling: Pulling image "quay.io/openshift/community-e2e-images:e2e-11-registry-k8s-io-etcd-3-5-7-0-C5nYFPeT0lxaFCao"
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:57 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Started: Started container sample-apiserver
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:57 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Created: Created container sample-apiserver
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:55:57 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Pulled: Successfully pulled image "quay.io/openshift/community-e2e-images:e2e-3-registry-k8s-io-e2e-test-images-sample-apiserver-1-17-7-F6raNs0YQ76APdUD" in 3.068195738s (3.068206328s including waiting)
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:56:06 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Pulled: Successfully pulled image "quay.io/openshift/community-e2e-images:e2e-11-registry-k8s-io-etcd-3-5-7-0-C5nYFPeT0lxaFCao" in 9.1810668s (9.18107711s including waiting)
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:56:06 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Created: Created container etcd
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:56:06 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Started: Started container etcd
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:56:48 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Killing: Stopping container sample-apiserver
Nov 29 22:56:49.503: INFO: At 2023-11-29 22:56:48 +0000 UTC - event for sample-apiserver-deployment-58dfd44dd-gp8nd: {kubelet ip-10-0-80-91.us-west-2.compute.internal} Killing: Stopping container etcd
Nov 29 22:56:49.570: INFO: POD NODE PHASE GRACE CONDITIONS
Nov 29 22:56:49.570: INFO:
Nov 29 22:56:49.702: INFO: skipping dumping cluster info - cluster too large
k8s.io/kubernetes/test/e2e/apimachinery.glob..func19.3({0x7f470406fbf8, 0xc001c4b530})
k8s.io/kubernetes@v1.27.1/test/e2e/apimachinery/openapiv3.go:161 +0x6bf
fail [runtime/panic.go:260]: Test Panicked: runtime error: invalid memory address or nil pointer dereference
Ginkgo exit error 1: exit with code 1
Signed-off-by: Dan Williams <dcbw@redhat.com>
In the original version of "MinimizeIPTablesRestore", we skipped the
bottom half of the sync loop when we weren't re-syncing a service, so
certain things that couldn't be skipped had to be done in the top
half. But the code was later changed to always run through the whole
loop body (just not necessarily writing out rules in the bottom half),
so we can reorganize things now to put some related bits of code back
together.
(In particular, this also resolves the fact that we were accidentally
adding the endpoint chains to activeNATChains twice.)
Also change activeNATChains to be a proper generic Set type.