Commit Graph

108520 Commits

Author SHA1 Message Date
Steve Kuznetsov
37bb0679aa
customresource: clean up the storage constructor
The distinction between Storage and REST was lost when the constructor
for the latter began to do almost but not all of the former. No other
callers exist for newREST(), so merging the constructors allows us to be
clearer about what we're constructing and keeps us from
shallow-copying the genericregistry.Store every time even when no status
subresource is requested.

Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
2022-05-19 08:26:09 -07:00
Kubernetes Prow Robot
b215a8949a
Merge pull request #108746 from deads2k/proof-2
Handle panic during validating admission webhook admission
2022-05-19 07:11:21 -07:00
Wojciech Tyczyński
b56491e6cf Fix stop signal to drained signal in genericapiserver config 2022-05-19 08:09:56 +02:00
jlsong01
d0353e3214 clarify a comment on annotation key validation
Update staging/src/k8s.io/apimachinery/pkg/api/validation/objectmeta.go

Co-authored-by: Daniel Smith <dbsmith@google.com>
2022-05-19 09:31:13 +08:00
David Eads
d412bf92b3 Handle panic during validating admission webhook admission
Validating admission webhook evaluation can panic. If uncaught, this
crashes the kube-apiserver.  Add handling to catch the panic while
preserving the behavior of "must not fail".
2022-05-18 14:49:55 -04:00
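
The approach described in d412bf92b3 (catch a panic instead of letting it crash the process) can be illustrated with a minimal Go sketch. This is not the code from the commit: `webhookCall` and `safeCall` are hypothetical names, and the sketch only shows the general pattern of turning a panic into an ordinary error with `recover`.

```go
package main

import (
	"errors"
	"fmt"
)

// webhookCall is a hypothetical stand-in for evaluating one validating webhook.
type webhookCall func() error

// safeCall runs fn and converts any panic into an ordinary error, so that
// one misbehaving evaluation cannot take down the whole process.
func safeCall(fn webhookCall) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("webhook evaluation panicked: %v", r)
		}
	}()
	return fn()
}

func main() {
	allowed := func() error { return nil }
	denied := func() error { return errors.New("denied") }
	panicking := func() error {
		var m map[string]int
		m["x"] = 1 // assignment to a nil map panics at run time
		return nil
	}

	fmt.Println("allowed:", safeCall(allowed))
	fmt.Println("denied:", safeCall(denied))
	fmt.Println("panicking:", safeCall(panicking))
}
```

The caller sees an error in the panicking case and can treat it like any other failed evaluation.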
Kubernetes Prow Robot
92285fd74e
Merge pull request #110111 from neolit123/1.25-update-master-label-taint
kubeadm: remove RemoveOldControlPlaneLabel
2022-05-18 09:54:54 -07:00
David Eads
8d5360ccbc remove enabled by default beta resources that have been removed 2022-05-18 09:37:42 -04:00
Lubomir I. Ivanov
80e5bcae9b kubeadm: remove RemoveOldControlPlaneLabel
The function is no longer used and was missed in a
1.25 cleanup PR.
2022-05-18 15:42:08 +03:00
Kubernetes Prow Robot
a1c8e9386a
Merge pull request #110090 from wojtek-t/shutdown_broadcaster_in_controllers
Fix event broadcaster shutdown in multiple controllers
2022-05-18 03:38:53 -07:00
Kubernetes Prow Robot
b1aa1bd308
Merge pull request #110096 from tkashem/graceful-new-signal
apiserver: add lifecycle signal for preshutdown hook
2022-05-18 00:53:06 -07:00
Kubernetes Prow Robot
84c8afeba3
Merge pull request #110095 from neolit123/1.25-update-master-label-taint
kubeadm: cleanup the "master" taint on CP nodes during upgrade
2022-05-18 00:52:54 -07:00
Kubernetes Prow Robot
eebfd7b574
Merge pull request #110102 from MikeSpreitzer/fix-test-numerators
Fix more initial numerators
2022-05-17 23:02:53 -07:00
Mike Spreitzer
ba690c2257 Fix more initial numerators 2022-05-18 00:22:30 -04:00
Tim Allclair
eb88daeeae
Warn when adding PSA labels to exempt namespaces (#109680) 2022-05-17 21:18:53 -07:00
Kubernetes Prow Robot
71da53c28b
Merge pull request #108218 from cyclinder/remove_featuregate
remove featuregate in 1.25
2022-05-17 20:08:53 -07:00
Kubernetes Prow Robot
842b1b86fb
Merge pull request #109774 from JarHMJ/fix/err_msg
fix log err msg
2022-05-17 18:00:59 -07:00
Kubernetes Prow Robot
90d7400ca1
Merge pull request #109356 from pacoxu/kubeadm-2426-cri
kubeadm: remove temporary handling of CRI socket paths without URL scheme
2022-05-17 18:00:52 -07:00
Kubernetes Prow Robot
b07ee36547
Merge pull request #109752 from MadhavJivrajani/remove-apimachinery-clocks
apimachinery/clock: Delete the apimachinery/clock package
2022-05-17 16:46:42 -07:00
Kubernetes Prow Robot
9169f16841
Merge pull request #108447 from pacoxu/kubeadm-json-upgrade-plan
`Kubeadm upgrade plan` support json/yaml output
2022-05-17 16:46:31 -07:00
Kubernetes Prow Robot
f727b5af34
Merge pull request #110094 from tkashem/refactor-graceful
apiserver: refactor - move AuditBackend.Run out of NonBlockingRun
2022-05-17 15:04:42 -07:00
Kevin Delgado
91c016e4d5
Add unknown metadata field validation tests (#109316)
* add unknown metadata validation e2e tests

* Address PR Feedback

* explicitly check for unexpected nil errors or namespace errors
2022-05-17 15:04:30 -07:00
Abu Kashem
b1f7b60515
apiserver: add lifecycle signal for preshutdown hook 2022-05-17 17:24:11 -04:00
Wojciech Tyczyński
11b679c66a Fix event broadcaster shutdown in multiple controllers 2022-05-17 22:14:19 +02:00
Kubernetes Prow Robot
4bd396115d
Merge pull request #110061 from wojtek-t/shutdown_apiextensions
Cleanup CRD storage on shutdown
2022-05-17 12:17:44 -07:00
Kubernetes Prow Robot
17556d4d63
Merge pull request #110088 from ardaguclu/standartize-validate-func
Set validate functions requiring no parameters for all commands
2022-05-17 11:08:40 -07:00
Kubernetes Prow Robot
f0c47dc916
Merge pull request #110076 from karlkfi/patch-1
fix: reflector to return wrapped list errors
2022-05-17 11:08:28 -07:00
Lubomir I. Ivanov
ddd046f3dd kubeadm: cleanup the "master" taint on CP nodes during upgrade
- initconfiguration.go: stop applying the "master" taint
for new clusters; update related unit tests in _test.go
- apply.go: Remove logic related to cleanup of the "master" label
during upgrade
- apply.go: Add cleanup of the "master" taint on CP nodes
during upgrade
- controlplane_nodes_test.go: remove test for old "master" taint
on nodes (this needs backport to 1.24, because we have a kubeadm
1.25 vs kubernetes test suite 1.24 e2e test)
2022-05-17 19:21:49 +03:00
Abu Kashem
6b8398318c
apiserver: refactor - move AuditBackend.Run out of NonBlockingRun 2022-05-17 12:01:28 -04:00
Wojciech Tyczyński
01cf641ffb Cleanup CRD storage on shutdown 2022-05-17 15:25:39 +02:00
Kubernetes Prow Robot
c79b909de7
Merge pull request #110081 from wojtek-t/document_shutdown_sequence
Diagram for graceful shutdown
2022-05-17 06:20:39 -07:00
Kubernetes Prow Robot
ad2c625162
Merge pull request #110040 from astoycos/fix-panic
Fix additional panic
2022-05-17 06:20:27 -07:00
Kubernetes Prow Robot
ed522c7460
Merge pull request #110024 from stevekuznetsov/skuznets/split-list-test
storage: split paginated and non-paginated list tests, make them generic
2022-05-17 04:16:26 -07:00
Arda Güçlü
8fb423bfab Set validate functions requiring no parameters for all commands
Validate function is used to validate command options and should not get
any additional parameter. To preserve compatibility across all
kubectl commands, this PR removes all parameters in validate functions.
2022-05-17 11:38:20 +03:00
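
The convention described in 8fb423bfab (a validate function that takes no parameters and checks only the options it belongs to) might look like the following sketch. `ExampleOptions` and its fields are hypothetical and do not correspond to any real kubectl command.

```go
package main

import (
	"errors"
	"fmt"
)

// ExampleOptions is a hypothetical options struct for an imaginary command.
type ExampleOptions struct {
	Name     string
	Replicas int
}

// Validate checks the already-populated options. It takes no parameters:
// everything it needs has been stored on the struct beforehand.
func (o *ExampleOptions) Validate() error {
	if o.Name == "" {
		return errors.New("name must be set")
	}
	if o.Replicas < 0 {
		return errors.New("replicas must not be negative")
	}
	return nil
}

// validator is the uniform shape that every command's options can share
// once no Validate method needs extra arguments.
type validator interface {
	Validate() error
}

func main() {
	all := []validator{
		&ExampleOptions{Name: "web", Replicas: 3},
		&ExampleOptions{Replicas: 1},
	}
	for i, v := range all {
		fmt.Printf("options %d: %v\n", i, v.Validate())
	}
}
```

A uniform signature lets callers treat every command's options the same way, which is the compatibility benefit the commit message mentions.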
twilight0620
62298c0493 add test case TestValidateServiceNodePort for validateServiceNodePort method 2022-05-17 14:32:06 +08:00
Wojciech Tyczyński
1145582de3 Diagram for graceful shutdown 2022-05-17 07:57:07 +02:00
Karl Isenberg
9ace604b63
fix: reflector to return wrapped list errors
This fix allows Reflector/Informer callers to detect API errors using the standard Go errors.As unwrapping methods used by the apimachinery helper methods. Combined with a custom WatchErrorHandler, this can be used to stop an informer that encounters specific errors, like resource not found or forbidden.
2022-05-16 16:33:30 -07:00
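
The effect of 9ace604b63 (callers can detect the underlying API error through a wrapped list error) can be illustrated with the standard library alone. In this sketch `statusError`, `list`, `listAndWatch`, and `isForbidden` are hypothetical stand-ins, not the client-go or apimachinery types. The only real API used is `errors.As` together with `%w` wrapping.

```go
package main

import (
	"errors"
	"fmt"
)

// statusError is a hypothetical stand-in for an API error with a status code.
type statusError struct {
	Code   int
	Reason string
}

func (e *statusError) Error() string {
	return fmt.Sprintf("%d %s", e.Code, e.Reason)
}

// list simulates a list call that the server rejects.
func list() error {
	return &statusError{Code: 403, Reason: "Forbidden"}
}

// listAndWatch wraps the list error with %w, so the original error
// stays reachable through the error chain.
func listAndWatch() error {
	if err := list(); err != nil {
		return fmt.Errorf("failed to list: %w", err)
	}
	return nil
}

// isForbidden unwraps err and reports whether a 403 statusError is in the chain.
func isForbidden(err error) bool {
	var se *statusError
	return errors.As(err, &se) && se.Code == 403
}

func main() {
	err := listAndWatch()
	fmt.Println("error:", err)
	fmt.Println("forbidden:", isForbidden(err))
}
```

Had the error been formatted with `%v` instead of `%w`, `errors.As` would find nothing to unwrap and `isForbidden` would return false.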
Kubernetes Prow Robot
c84d0864dd
Merge pull request #110052 from brianpursley/completion-tests
Add unit tests for kubectl completion command
2022-05-16 12:32:17 -07:00
Kubernetes Prow Robot
9f460160c1
Merge pull request #110051 from brianpursley/apiresources-tests
Add unit tests for api-resources and api-versions commands
2022-05-16 10:34:19 -07:00
Lubomir I. Ivanov
29148f61ac kubeadm: add serializable health checks for etcd probes
Use the etcd 3.5.3+ HTTP(s) endpoint "/health?serializable=true",
to allow the kubelet liveness and startup probes in the
kubeadm generated etcd.yaml (static Pod) to track
individual member health instead of tracking the whole
etcd cluster health.
2022-05-16 20:18:35 +03:00
Andrew Stoycos
b7a37f5b3d Fix additional panic
Ensure we take the incomingBlock lock
in blockQueue so that there
is no possibility of sending on a
closed incoming channel.

Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
2022-05-16 11:26:36 -04:00
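
The idea behind b7a37f5b3d (hold a lock while sending so that shutdown cannot close the channel in the middle of a send) can be sketched as follows. This is a simplified illustration, not the broadcaster code: the `queue` type and its methods are hypothetical, and the channel is buffered so that the demo never blocks while holding the lock.

```go
package main

import (
	"fmt"
	"sync"
)

// queue is a hypothetical, simplified event queue.
type queue struct {
	mu       sync.Mutex
	incoming chan string
	closed   bool
}

func newQueue(size int) *queue {
	return &queue{incoming: make(chan string, size)}
}

// send delivers ev unless the queue has been shut down. The lock is held
// across both the closed check and the send, so shutdown cannot close the
// channel between the two steps.
func (q *queue) send(ev string) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.closed {
		return false
	}
	q.incoming <- ev
	return true
}

// shutdown closes the channel exactly once, under the same lock.
func (q *queue) shutdown() {
	q.mu.Lock()
	defer q.mu.Unlock()
	if !q.closed {
		q.closed = true
		close(q.incoming)
	}
}

func main() {
	q := newQueue(64) // large enough that senders in this demo never block
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := 0; j < 4; j++ {
				q.send(fmt.Sprintf("event %d/%d", id, j))
			}
		}(i)
	}
	q.shutdown() // races with the senders, but cannot cause a panic
	wg.Wait()
	fmt.Println("events delivered before shutdown:", len(q.incoming))
}
```

Without the shared lock, a sender could pass the closed check, lose the CPU, and then send on a channel that shutdown had closed in the meantime, which panics in Go.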
Andrew Stoycos
2d614a182c Write Unit test to imitate Panic
There was a race that caused a panic when shutting down
an eventbroadcaster and its associated watchers. This
test exposes it.

Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
2022-05-16 11:26:25 -04:00
Kubernetes Prow Robot
81261d4693
Merge pull request #110029 from ash2k/ash2k/no-double-tls-validation
tls.Dial() validates hostname, no need to do that manually
2022-05-16 07:34:18 -07:00
Francesco Romani
f3e157d168 e2e: node: re-enable the device plugin tests
Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-05-16 16:05:13 +02:00
Francesco Romani
48b5af49e0 e2e: node: reorder imports
trivial cleanup

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-05-16 16:04:01 +02:00
Francesco Romani
98eb6db7c0 e2e: node: fix plugins directory
Previously, the e2e test was overriding the plugins socket directory to
"/var/lib/kubelet/plugins_registry". This seems wrong, and with that
setting the e2e test was already failing, because the registration
process was timing out, in turn because the kubelet was trying to call
back the device plugin in the wrong place (see below for details).

I can't explain why it worked before - or if it worked at all - but
it really seems that `pluginapi.DevicePluginPath` is the right
setting here.

+++

In a nutshell, the device plugin registration process works like this:

1. The kubelet runs and creates the device plugin socket registration
   endpoint:
	KubeletSocket = DevicePluginPath + "kubelet.sock"
	DevicePluginPath = "/var/lib/kubelet/device-plugins/"
2. Each device plugin will listen on an ENDPOINT the kubelet will connect
   back to.  IOW the kubelet will act like a client to each device plugin,
   to perform allocation requests (and more).
   Each device plugin will serve from an endpoint.
   The endpoint name is plugin-specific, but they all must be inside a
   well-known directory: pluginapi.DevicePluginPath
3. The kubelet creates the device plugin pod, like any other pod
4. During the startup, each device plugin wants to register itself in the
   kubelet. So it sends a request through
   the registration endpoint. Key details:
	grpc.Dial(kubelet registration socket)
	registration request
	reqt := &pluginapi.RegisterRequest{
		Version:      pluginapi.Version,
		Endpoint:     endpointSocket, // socket relative to pluginapi.DevicePluginPath
		ResourceName: resourceName,   // resource name to be exposed
	}
5. While handling the registration request, the kubelet dials back the
   device plugin on socketDir + req.Endpoint.
   But socketDir is hardcoded in the device manager code to
   pluginapi.KubeletSocket

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-05-16 16:03:50 +02:00
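
The path resolution described in 98eb6db7c0 can be made concrete with a small sketch. The two constants mirror the values quoted in the commit message. Deriving `socketDir` as the directory of the kubelet socket is an assumption made for illustration, and `dialBackPath` is a hypothetical helper, not a kubelet function.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Values as quoted in the commit message.
const (
	devicePluginPath = "/var/lib/kubelet/device-plugins/"
	kubeletSocket    = devicePluginPath + "kubelet.sock"
)

// dialBackPath returns the path on which the kubelet would dial a plugin
// that registered with the given relative endpoint name. Assumption: the
// socket directory is the directory that contains the kubelet socket.
func dialBackPath(endpoint string) string {
	socketDir := filepath.Dir(kubeletSocket)
	return filepath.Join(socketDir, endpoint)
}

func main() {
	// A plugin listening under a different directory is never reached,
	// because the kubelet resolves the endpoint relative to its own
	// socket directory.
	listening := "/var/lib/kubelet/plugins_registry/sample.sock"
	dialed := dialBackPath("sample.sock")

	fmt.Println("plugin listens on:", listening)
	fmt.Println("kubelet dials:    ", dialed)
	fmt.Println("paths match:      ", listening == dialed)
}
```

A mismatch between the two paths is consistent with the registration timeout that the commit message reports.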
Mikhail Mazurskiy
29dc50c149 tls.Dial() validates hostname, no need to do that manually
Handshake() is still needed for tls.Client() code path. See https://github.com/kubernetes/kubernetes/pull/109750
2022-05-16 23:26:15 +10:00
Kubernetes Prow Robot
45844049fc
Merge pull request #110062 from wojtek-t/fix_storage_object_count_tracker_registration
Avoid leaking StorageObjectCountTracker goroutine
2022-05-16 06:04:17 -07:00
Francesco Romani
23147ff4b3 e2e: node: devplugin: tolerate node readiness flip
In the AfterEach check of the e2e node device plugin tests,
the tests try hard to clean up after themselves:
- delete the sample device plugin
- restart the kubelet again
- ensure that after the restart, no stale sample devices
  (provided by the sample device plugin) are reported anymore.

We observed that in the AfterEach block of these e2e tests
the kubelet readiness state flip-flops quite reliably,
possibly because of a race with a slow runtime/PLEG check.

What happens is that the kubelet readiness state is true,
goes false for a short interval, then goes true again
and is stable after that (observed by adding more logs
to the check loop).

The key factor here is that the function `getLocalNode` aborts the
test (as in `framework.ExpectNoError`) if the node state is
not ready. So any occurrence of this scenario, even if it
is transient, will cause a test failure. I believe this
makes the e2e test unnecessarily fragile without making it more
correct.

For the purpose of the test we can tolerate this kind of glitch,
with the kubelet flip-flopping the ready state, provided that we
eventually reach the final desired condition, in which the node reports
ready AND reports no sample devices present. That is the condition
the code was trying to check.

So, we add a variant of `getLocalNode`, which just fetches the
node object the e2e_node framework created, along with a flag
reporting the node readiness. The new helper does not implicitly
abort the test if the node is not ready; it just bubbles
up this information.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-05-16 14:22:25 +02:00
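
The polling strategy described in 23147ff4b3 (report readiness as data and keep polling until the node is ready AND has no sample devices) can be sketched like this. `nodeState` and `waitForCleanNode` are hypothetical names, and the sketch does not use the e2e framework.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// nodeState is a hypothetical snapshot: the readiness flag is returned to
// the caller instead of aborting the test when the node is not ready.
type nodeState struct {
	Ready         bool
	SampleDevices int
}

// waitForCleanNode polls get until the node is ready and reports no sample
// devices, tolerating transient not-ready states along the way.
func waitForCleanNode(get func() nodeState, attempts int, interval time.Duration) error {
	for i := 0; i < attempts; i++ {
		s := get()
		if s.Ready && s.SampleDevices == 0 {
			return nil
		}
		time.Sleep(interval)
	}
	return errors.New("node never became ready with no sample devices")
}

func main() {
	// Simulated observations: ready with stale devices, a brief not-ready
	// flip, then the desired final state.
	observed := []nodeState{
		{Ready: true, SampleDevices: 2},
		{Ready: false, SampleDevices: 0},
		{Ready: true, SampleDevices: 0},
	}
	i := 0
	get := func() nodeState {
		s := observed[i]
		if i < len(observed)-1 {
			i++
		}
		return s
	}

	err := waitForCleanNode(get, 10, 10*time.Millisecond)
	fmt.Println("result:", err)
}
```

The not-ready observation in the middle only delays the loop. A helper that aborted on the first not-ready state would fail here even though the final condition is reached.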
Francesco Romani
56c539bff0 e2e: node: deviceplug: deepcopy the pod dev template
Let's avoid unexpected side effects

Signed-off-by: Francesco Romani <fromani@redhat.com>
2022-05-16 14:22:24 +02:00
Wojciech Tyczyński
564b376812 Avoid leaking StorageObjectCountTracker goroutine 2022-05-16 11:12:00 +02:00