Back up the kubelet config file for `kubeadm upgrade apply`; some code
refactoring is done to de-duplicate redundant logic.
Signed-off-by: Dave Chen <dave.chen@arm.com>
The root cause of that error is that `rename` doesn't work across
different mount points.
The kubelet config file and the backup directory are mounted on
different file systems in the kinder environment.
```
$ df /var/lib/kubelet/config.yaml | tail -n1 | awk '{print $1}'
/dev/sda2
$ df /etc/kubernetes/tmp/kubeadm-kubelet-configxxx | tail -n1 | awk '{print $1}'
overlay
```
Calling `cp` instead of `rename` to back up the kubelet config file
fixes that issue.
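For illustration, here is a minimal Go sketch of the copy-based backup
(the `backupFile` helper is hypothetical, not the actual kubeadm code):
unlike `os.Rename`, which fails with EXDEV when source and destination
sit on different mount points, a plain read/write copy works regardless
of the underlying file systems.

```
package backup

import (
	"fmt"
	"io"
	"os"
)

// backupFile copies src to dst instead of renaming it, so the backup
// succeeds even when src and dst live on different mount points.
func backupFile(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return fmt.Errorf("open %s: %w", src, err)
	}
	defer in.Close()

	out, err := os.Create(dst)
	if err != nil {
		return fmt.Errorf("create %s: %w", dst, err)
	}
	defer out.Close()

	if _, err := io.Copy(out, in); err != nil {
		return fmt.Errorf("copy %s to %s: %w", src, dst, err)
	}
	return out.Sync()
}
```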
Signed-off-by: Dave Chen <dave.chen@arm.com>
This addresses the TODO item so that the old kubelet config file can
be recovered if something goes wrong.
Signed-off-by: Dave Chen <dave.chen@arm.com>
Co-authored-by: Paco Xu <paco.xu@daocloud.io>
* Make policy decision object public
Signed-off-by: Max Smythe <smythe@google.com>
* Separate version conversion from validation
Signed-off-by: Max Smythe <smythe@google.com>
* Address review comments
Signed-off-by: Max Smythe <smythe@google.com>
* Fix variable name
Signed-off-by: Max Smythe <smythe@google.com>
---------
Signed-off-by: Max Smythe <smythe@google.com>
Fixes the issue caused when multiple ClusterCIDR objects have the same
nodeSelector values: the order of the requirements in the nodeSelector
is not preserved when the nodeSelector is marshalled and converted to a
string.
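The idea behind the fix, as a minimal Go sketch (the `requirement` type
is a hypothetical stand-in for the real nodeSelector API structs):
sorting the requirements and their values before stringifying yields
the same key for semantically equal selectors, regardless of input
order.

```
package main

import (
	"fmt"
	"sort"
)

// requirement is a hypothetical stand-in for a nodeSelector label
// requirement.
type requirement struct {
	Key      string
	Operator string
	Values   []string
}

// stableKey renders requirements in a deterministic order so that two
// semantically equal selectors always map to the same string key.
func stableKey(reqs []requirement) string {
	sorted := append([]requirement(nil), reqs...)
	for i := range sorted {
		// Copy before sorting so the caller's data is untouched.
		vals := append([]string(nil), sorted[i].Values...)
		sort.Strings(vals)
		sorted[i].Values = vals
	}
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Key != sorted[j].Key {
			return sorted[i].Key < sorted[j].Key
		}
		return sorted[i].Operator < sorted[j].Operator
	})
	return fmt.Sprint(sorted)
}

func main() {
	a := []requirement{{"zone", "In", []string{"us-east", "us-west"}}}
	b := []requirement{{"zone", "In", []string{"us-west", "us-east"}}}
	fmt.Println(stableKey(a) == stableKey(b)) // true
}
```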
Adds integration tests for the following scenarios with
MultiCIDRRangeAllocator enabled:
- ClusterCIDR is released when an associated node is deleted.
- ClusterCIDR deletion while a node is still associated: validate the
  finalizer behavior and make sure that the deleted ClusterCIDR is
  cleaned up after the associated node is deleted.
- A ClusterCIDR marked as terminating due to deletion must not be used
  for allocating PodCIDRs to new nodes.
- Tie-break behavior when multiple ClusterCIDRs are eligible to
  allocate PodCIDRs to a node.
The device plugin test expects that no other pods are running before
the test starts. However, it has been observed that in some cases
resources from previous tests may still be around. This is because
other tests clean up by deleting their framework's namespace, which is
done asynchronously, without waiting for the namespace deletion to
complete.
As a result, when the node e2e device plugin test starts, other pods
may still be in the process of terminating. To work around this, add a
retry to the device plugin test to account for the time it takes to
delete the resources from the prior test.
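A rough sketch of the retry idea in Go (`waitForNoLeftoverPods` and its
`listPods` callback are hypothetical; the actual test code differs):
keep re-checking until the leftover pods are gone or a timeout expires.

```
package e2enode

import (
	"fmt"
	"time"
)

// waitForNoLeftoverPods polls until listPods reports zero pods or the
// timeout expires, tolerating pods that are still terminating.
func waitForNoLeftoverPods(listPods func() (int, error), interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		n, err := listPods()
		if err == nil && n == 0 {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("leftover pods still present after %v (count=%d, err=%v)", timeout, n, err)
		}
		time.Sleep(interval)
	}
}
```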
Signed-off-by: David Porter <david@porter.me>
In `NodeSupportsPreconfiguredRuntimeClassHandler`, update the check
for the runtime handler to return a failure if the
`/etc/containerd/config.toml` or `/etc/crio/crio.conf` config file does
not exist. If an error is returned, the underlying test will be
skipped.
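The decision logic, sketched as a hypothetical local check in Go (the
real check inspects the node's runtime config rather than the local
filesystem): return an error when neither config file exists, so the
caller can skip rather than fail.

```
package runtimeclass

import (
	"fmt"
	"os"
)

// hasRuntimeConfig reports an error when neither the containerd nor
// the CRI-O config file is present, so the caller can skip the test.
func hasRuntimeConfig() error {
	for _, path := range []string{"/etc/containerd/config.toml", "/etc/crio/crio.conf"} {
		if _, err := os.Stat(path); err == nil {
			return nil
		}
	}
	return fmt.Errorf("no container runtime config found")
}
```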
Tested manually by starting a kind cluster, moving the containerd
config file, and verifying that the test is skipped:
```
$ docker exec -it kind-worker /bin/bash
root@kind-worker:/# mv /etc/containerd/config.toml /etc/containerd/config.toml.bak
```
```
$ make WHAT="test/e2e/e2e.test"
$ ./_output/bin/e2e.test -kubeconfig /tmp/kubeconfig_kind -ginkgo.focus=".*should run a Pod requesting a RuntimeClass with a configured handler.*" --num-nodes=1 2>&1 -ginkgo.v=1 | tee -i "/tmp/build-log.txt"
[sig-node] RuntimeClass [It] should run a Pod requesting a RuntimeClass with a configured handler [NodeFeature:RuntimeHandler]
test/e2e/common/node/runtimeclass.go:85
[SKIPPED] Skipping test as node does not have E2E runtime class handler preconfigured in container runtime config: command terminated with exit code 1
```
Signed-off-by: David Porter <david@porter.me>
The recently introduced failure handling in ExpectNoError depends on
error wrapping: if an error prefix gets added with
`fmt.Errorf("foo: %v", err)`, then ExpectNoError cannot detect that the
root cause is an assertion failure; it will then add another useless
"unexpected error" prefix and will not dump the additional failure
information (currently the backtrace inside the E2E framework).
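A minimal standalone illustration of the difference: `errors.Is` can
only see through wrappers created with `%w`, not `%v`, which is why the
wrapping style matters for ExpectNoError's root-cause detection.

```
package main

import (
	"errors"
	"fmt"
)

var errAssertion = errors.New("assertion failure")

func main() {
	opaque := fmt.Errorf("foo: %v", errAssertion)  // flattens err to text; the chain is lost
	wrapped := fmt.Errorf("foo: %w", errAssertion) // keeps err reachable via errors.Is/As

	fmt.Println(errors.Is(opaque, errAssertion))  // false
	fmt.Println(errors.Is(wrapped, errAssertion)) // true
}
```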
Instead of manually deciding on a case-by-case basis where `%w` is
needed, all error wrapping was updated automatically with

```
sed -i "s/fmt.Errorf\(.*\): '*\(%s\|%v\)'*\",\(.* err)\)/fmt.Errorf\1: %w\",\3/" $(git grep -l 'fmt.Errorf' test/e2e*)
```

This may be unnecessary in some cases, but it's not wrong.