Currently, there are some unit tests that are failing on Windows due to
various reasons:
- filepath.IsAbs does not consider "/" or "\" as absolute paths, even
though files can be addressed as such.
- paths not properly joined (filepath.Join should be used).
- files not closed, which means that they cannot be removed / renamed.
- some assertions fail due to slashes / backslashes not matching.
- backslashes need to be escaped in yaml files, or put between ''
instead of "".
Currently, there are some unit tests that are failing on Windows due to
various reasons:
- Windows file permissions do not work the same way as the Linux ones.
- cp does not exist on Windows, and xcopy should be used instead.
- Get-Item does not work for hidden files / folders like AppData, but
works if given the -Force flag.
the fix introduced in #110634 also introduces a bug preventing `kubeadm
upgrade plan` from running on nodes having a different `os.Hostname()`
than node name. concretely, for a node `titi.company.ch`,
`os.Hostname()` will return `titi`, while the full node name is actually
`titi.company.ch`. this simple fix uses the `cfg.NodeRegistration.Name`
instead, which fixes the issue on my nodes with a FQDN node name
Keep previous hostname retrieval as fallback for dupURL CRI fix
`getClientSet` is used by both cmd `token` and `reset`, move this
method to cmd utils to decouple it from one specific cmd.
Signed-off-by: Dave Chen <dave.chen@arm.com>
- Run hack/update-codegen.sh
- Run hack/update-generated-device-plugin.sh
- Run hack/update-generated-protobuf.sh
- Run hack/update-generated-runtime.sh
- Run hack/update-generated-swagger-docs.sh
- Run hack/update-openapi-spec.sh
- Run hack/update-gofmt.sh
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
The experimental-watch-progress-notify-interval flag specifies an interval
at which etcd sends data to the kube-api server.
It is used by the WatchBookmark feature which is GA since 1.17.
It will be used by a new WatchList feature which is Alpha since 1.25
In addition to that the feature was graduated to GA (non-experiment) in etcd 3.5 without any code changes
- `cert-dir` could be specified to a value other than the default value
- we have tests that should be executed successfully on the working cluster
Signed-off-by: Dave Chen <dave.chen@arm.com>
Once we patch a kubelet configuration file, the patched output
is in JSON. Make sure it's converted back to YAML, given
the kubelet config in the cluster and on disk is always in YAML.
Add unit test for the new function applyKubeletConfigPatches()
In phases/kubelet/WriteConfigToDisk() create a patch
manager for the root patches directory and apply
the user patches with a target "kubeletconfiguration".
With phases/kubelet/WriteConfigToDisk() about to support patches
it is required that the function accepts an io.Writer
where the PatchManager can output to and also a patch directory.
Modify all call sites of the function WriteConfigToDisk()
to properly prepare an pass an io.Writer and patches dir to it.
This results in command phases for init/join/upgrade to pass
the root io.Writer (usually stdout) and the patchesDir populated
either via the config file or --patches flag.
If the user runs "kubeadm upgrade apply", kubeadm can download
a configuration from the cluster. If the configuration contains
the legacy default imageRepository of "k8s.gcr.io", mutate it
to the new default of "registry.k8s.io" and update the
configuration in the config map.
During "upgrade node/diff" download the configuration, mutate the
image repository locally, but do not mutate the in-cluster value.
That is done only on "apply".
This ensures that users are migrated from the old default registry
domain.
- lock the FG to true by default
- cleanup wrappers and logic related to versioned vs unversioned
naming of API objects (CMs and RBAC)
- update unit tests
The OldControlPlaneTaint taint (master) can be replaced
with the new ControlPlaneTaint (control-plane) taint.
Adapt unit tests in markcontrolplane_test.go
and cluster_test.go.
- iniconfiguration.go: stop applying the "master" taint
for new clusters; update related unit tests in _test.go
- apply.go: Remove logic related to cleanup of the "master" label
during upgrade
- apply.go: Add cleanup of the "master" taint on CP nodes
during upgrade
- controlplane_nodes_test.go: remove test for old "master" taint
on nodes (this needs backport to 1.24, because we have a kubeadm
1.25 vs kubernetes test suite 1.24 e2e test)
Use the etcd 3.5.3+ HTTP(s) endpoint "/health?serializable=true",
to allow the kubelet liveness and starup probes in the
kubeadm generated etcd.yaml (static Pod) to track
individual member health instead of tracking the whole
etcd cluster health.
Given kubeadm 1.25 only supports kubelet 1.25 and 1.24,
1.23 related logic around dockershim can be removed.
- Don't clean the directories
/var/lib/dockershim, /var/runkubernetes, /var/lib/cni
- Pass the CRISocket directly to the kubelet
--container-runtime-endpoint flag without extra handling
of dockershim
- No longer apply the --container-runtime=remote flag
as that is the only possible value in 1.24 and 1.25
- Update unit tests
Note: we are still passing --pod-infra-container-image
to avoid the pause image to be GCed by the kubelet.
During upgrade when a CP node is missing the old / legacy "master"
taint, assume the user has manually removed it to allow
workloads to schedule.
In such cases do not re-taint the node with the new "control-plane"
taint.
Include the flag "--experimental-initial-corrupt-check"
in etcd static pod manifests to ensure
etcd member data consistency.
The etcd feature is planned for graduation in 3.6,
at which point we should switch to using the flag
without the "experimental" prefix.
Some of these changes are cosmetic (repeatedly calling klog.V instead of
reusing the result), others address real issues:
- Logging a message only above a certain verbosity threshold without
recording that verbosity level (if klog.V().Enabled() { klog.Info... }):
this matters when using a logging backend which records the verbosity
level.
- Passing a format string with parameters to a logging function that
doesn't do string formatting.
All of these locations where found by the enhanced logcheck tool from
https://github.com/kubernetes/klog/pull/297.
In some cases it reports false positives, but those can be suppressed with
source code comments.
We now re-use the crictl tool path within the `ContainerRuntime` when
exec'ing into it. This allows introducing a convenience function to
create the crictl command and re-use it where necessary.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
For the YAML examples, make the indentation consistent
by starting with a space and following with a TAB.
Also adjust the indentation of some fields to place them under
the right YAML field parent - e.g. ignorePreflightErrors
is under nodeRegistration.
- Modify VerifyUnmarshalStrict to use serializer/json instead
of sigs.k8s.io/yaml. In strict mode, the serializers
in serializer/json use the new sigs.k8s.io/json library
that also catches case sensitive errors for field names -
e.g. foo vs Foo. Include test case for that in strict/testdata.
- Move the hardcoded schemes to check to the side of the
caller - i.e. accept a slice of runtime.Scheme.
- Move the klog warnings outside of VerifyUnmarshalStrict
and make them the responsibility of the caller.
- Call VerifyUnmarshalStrict when downloading the configuration
from kubeadm-config or the kube-proxy or kubelet-config CMs.
This validation is useful if the user has manually patched the CMs.
The legacy naming "kubelet-config-x.yy" is no longer the
default behavior. Rename instances in documentation and comments
of "kubelet-config-x.yy" to "kubelet-config".
- Graduate the feature gate to Beta and enable it by default.
- Pre-set the default value for UnversionedKubeletConfigMap
to "true" in test/e2e_kubeadm.
- Fix a couple of typos in "tolerate" introduced in the PR that
added the FG in 1.23.
Compare with two pointers will always show that they are different value,
so it will always print the warning message.
Signed-off-by: Dave Chen <dave.chen@arm.com>
- During "upgrade apply" call a new function AddNewControlPlaneTaint()
that finds all nodes with the new "control-plane" node-role label
and adds the new "control-plane" taint to them.
- The function is called in "apply" and is separate from
the step to remove the old "master" label for better debugging
if errors occur.
- Apply "control-plane" taint during init/join by adding the
taint in SetNodeRegistrationDynamicDefaults(). The old
taint "master" is still applied.
- Clarify API docs (v1beta2 and v1beta3) for nodeRegistration.Taint
to not mention "master" taint and be more generic. Remove
example for taints that includes the word "master".
- Update unit tests.
- Update the markcontrolplane phase used by init and join to
only label the nodes with the new control plane label.
- Cleanup TODOs about the old label.
- Remove outdated comment about selfhosting in staticpod/utils.go.
Selfhosting has not been supported in kubeadm for a while
and the comment also mentions the "master" label.
- Update unit tests.
- Rename the function in postupgrade.go to better reflect
what is being done.
- During "upgrade apply" find all nodes with the old label
and remove it by calling PatchNode.
- Update health check for CP nodes to not track "master"
labeled nodes. At this point all CP nodes should have
"control-plane" and we can use that selector only.
- Throw an error if there is more than one known socket on the host.
- Remove the special handling for docker+containerd.
- Remove the local instances of constants for endpoints for
Windows / Unix and use the defaultKnownCRISockets variable
which is populated from OS specific constants.
- Update error message in detectCRISocketImpl to have more
details.
- Make detectCRISocketImpl accept a list of "known" sockets
- Update unit tests for detectCRISocketImpl and make them
use generic paths such as "unix:///foo/bar.sock".
Change the default container runtime CRI socket endpoint to the
one of containerd. Previously it was the one for Docker
- Rename constants.DefaultDockerCRISocket to DefaultCRISocket
- Make the constants files include the endpoints for all supported
container runtimes for Unix/Windows.
- Update unit tests related to docker runtime testing.
- In kubelet/flags.go hardcode the legacy docker socket as a check
to allow kubeadm 1.24 to run against kubelet 1.23 if the user
explicitly sets the criSocket field to "npipe:////./pipe/dockershim"
on Windows or "unix:///var/run/dockershim.sock" on Linux.
The API was deprecated in 1.23 when output/v1alpha2 was
added. v1alpha1 is problematic since it embeds kubeadm/v1beta2
BootstrapToken related types directly. v1alpha2 imports
a new group dedicated to bootstrap tokens apis/bootstraptoken.
crictl already works with the current state of dockershim.
Using the docker CLI is not required and the DockerRuntime
can be removed from kubeadm. This means that crictl
can connect at the dockershim (or cri-dockerd) socket and
be used to list containers, pull images, remove containers, and
all actions that the kubelet can otherwise perform with the socket.
Ensure that crictl is now required for all supported container runtimes
in checks.go. In the help text in waitcontrolplane.go show only
the crictl example.
Remove the check for the docker service from checks.go.
Remove the DockerValidor check from checks.go.
These two checks were special casing Docker as CR and compensating
for the lack of the same checks in dockershim. With the
extraction of dockershim to cri-dockerd, ideally cri-dockerd
should perform the required checks whether it can support
a given Docker config / version running on a host.
During "upgrade node" and "upgrade apply" read the
kubelet env file from /var/lib/kubelet/kubeadm-flags.env
patch the --container-runtime-endpoint flag value to
have the appropriate URL scheme prefix (e.g. unix:// on Linux)
and write the file back to disk.
This is a temporary workaround that should be kept only for 1 release
cycle - i.e. remove this in 1.25.
The CRI socket that kubeadm writes as an annotation
on a particular Node object can include an endpoint that
does not have an URL scheme. This is undesired as long term
the kubelet can stop allowing endpoints without URL scheme.
For control plane nodes "kubeadm upgrade apply" takes
the locally defaulted / populated NodeRegistration and refreshes
the CRI socket in PerformPostUpgradeTasks. But for secondary
nodes "kubeadm upgrade node" does not.
Adapt "upgrade node" to fetch the NodeRegistration for this node
and fix the CRI socket missing URL scheme if needed in the Node
annotation.
- Update defaults for v1beta2 and 3 to have URL scheme
- Raname DefaultUrlScheme to DefaultContainerRuntimeURLScheme
- Prepend a missing URL scheme to user sockets and warn them
that this might not be supported in the future
- Update socket validation to exclude IsAbs() testing
(This is broken on Windows). Assume the path is not empty and has
URL scheme at this point (validation happens after defaulting).
- Use net.Dial to open Unix sockets
- Update all related unit tests
Signed-off-by: pacoxu <paco.xu@daocloud.io>
Signed-off-by: Lubomir I. Ivanov <lubomirivanov@vmware.com>
Currently when the dockershim socket is used, kubeadm only passes
the --network-plugin=cni to the kubelet and assumes the built-in
dockershim. This is valid for versions <1.24, but with dockershim
and related flags removed the kubelet will fail.
Use preflight.GetKubeletVersion() to find the version of the host
kubelet and if the version is <1.24 assume that it has built-in
dockershim. Newer versions should will be treated as "remote" even
if the socket is for dockershim, for example, provided by cri-dockerd.
Update related unit tests.
Apply a small fix to ensure the kubeconfig files
that kubeadm manages have a CA when printed in the table
of the "check expiration" command. "CAName" is the field used for that.
In practice kubeconfig files can contain multiple credentials
from different CAs, but this is not supported by kubeadm and there
is a single cluster CA that signs the single client cert/key
in kubeadm managed kubeconfigs.
In case stacked etcd is used, the code that does expiration checks
does not validate if the etcd CA is "external" (missing key)
and if the etcd CA signed certificates are valid.
Add a new function UsingExternalEtcdCA() similar to existing functions
for the cluster CA and front-proxy CA, that performs the checks for
missing etcd CA key and certificate validity.
This function only runs for stacked etcd, since if etcd is external
kubeadm does not track any certs signed by that etcd CA.
This fixes a bug where the etcd CA will be reported as local even
if the etcd/ca.key is missing during "certs check-expiration".
When the "kubeadm certs check-expiration" command is used and
if the ca.key is not present, regular on disk certificate reads
pass fine, but fail for kubeconfig files. The reason for the
failure is that reading of kubeconfig files currently
requires reading both the CA key and cert from disk. Reading the CA
is done to ensure that the CA cert in the kubeconfig is not out of date
during renewal.
Instead of requiring both a CA key and cert to be read, only read
the CA cert from disk, as only the cert is needed for kubeconfig files.
This fixes printing the cert expiration table even if the ca.key
is missing on a host (i.e. the CA is considered external).