Automatic merge from submit-queue
Resolve docker-daemon cgroup issue for both systemd and non-systemd node for node e2e tests
Fixed https://github.com/kubernetes/kubernetes/issues/29827
cc/ @coufon this should unblock your pr: #29764
I validated both containervm image and coreos image, and works as expected.
This is also required for adding gci image to node e2e test infrastructure.
Automatic merge from submit-queue
Node E2E: Add serial jenkins job.
This PR added a jenkins job for serial test. It will run all serial test one by one.
This will be useful for https://github.com/kubernetes/kubernetes/pull/29809.
@coufon @yujuhong @dchen1107
/cc @kubernetes/sig-node
Automatic merge from submit-queue
E2E & Node E2E: Add exec util in framework
For #29081.
Based on #29092 and #29494.
For first commit is a squashed commit of all old commits.
**The last 2 commits are new.**
This PR added exec util in framework, and moved `privileged.go` and `kubelet_etc_hosts` into `common` directory.
@vishh @timstclair
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Node E2E: Make node e2e parallel
For https://github.com/kubernetes/kubernetes/issues/29081.
Fix https://github.com/kubernetes/kubernetes/issues/26215.
Based on https://github.com/kubernetes/kubernetes/pull/28807, https://github.com/kubernetes/kubernetes/pull/29020, will rebase after they are merged.
**Only the last commit is new.**
We are going to move more tests into the node e2e test. However, currently node e2e test only run sequentially, the test duration will increase quickly when we add more test.
This PR makes the node e2e test run in parallel so as to shorten test duration, so that we can add more test to improve the test coverage.
* If you run the test locally with `make test-e2e-node`, it will use `-p` ginkgo flag, which uses `(cores-1)` parallel test nodes by default.
* If you run the test remotely or in the Jenkin, the parallelism will be controlled by the environment variable `PARALLELISM`. The default value is `8`, which is reasonable for our test node (n1-standard-1).
Before this PR, it took **833.592s** to run all test on my desktop.
With this PR, it only takes **234.058s** to run.
The pull request node e2e run with this PR takes **232.327s**.
The pull request node e2e run for other PRs takes **673.810s**.
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Adding GCI to node e2e.
Depends on https://github.com/kubernetes/kubernetes/pull/29486
Adding the dev release as of now since stable and beta run docker v1.9.1
which is incompatible with kubelet.
bindata and yaml, Gobindata automation
bindata utils for generating, go generate
match server version
gitignore for dirty, ca, rbase, KUBE_ROOT, buildfix
(rebased jul-25,29)
Automatic merge from submit-queue
Fix killing child sudo process in e2e_node tests
Fixes#29211.
The context is we are trying to kill a process started as `sudo kube-apiserver`, but `sudo` ignores signals from the same process group. Applying `Setpgid` means the `sudo kill` process won't be in the same process group, so will not fall foul of this nifty feature.
I also took the liberty of removing some code setting `Pdeathsig` because it claims to be doing something in the same area, but actually it doesn't do that at all. The setting is applied to the forked process, i.e. `sudo`, and it means the `sudo` will get killed if we (`e2e_node.test`) die. This (a) isn't what the comment says and (b) doesn't help because sending SIGKILL to the sudo process leaves sudo's child alive.
I didn't use the "hack for linux-only" approach because I think `Setpgid` is available on all platforms that `e2e_node` builds on.
Automatic merge from submit-queue
Syncing imaging pulling backoff logic
- Syncing the backoff logic in the parallel image puller and the sequential image puller to prepare for merging the two pullers into one.
- Moving image error definitions under kubelet/images
Automatic merge from submit-queue
Change some node e2e test to use the prepull image framework.
Fix https://github.com/kubernetes/kubernetes/issues/28868.
Node e2e test framework pre-pulls all images in [image_list.go](bc2f223f5a/test/e2e_node/image_list.go)
All node e2e test should use image from the "image_list". If a test needs new image, we should update the image_list to include the new image.
/cc @kubernetes/sig-node to notice people to use `image_list` when adding test. :)
Automatic merge from submit-queue
Start namespace controller in node e2e
Fix https://github.com/kubernetes/kubernetes/issues/28320.
Based on https://github.com/kubernetes/kubernetes/pull/28807, only the last 2 commits are new.
Before this PR, there was no namespace controller running in node e2e test infrastructure. We can not enable the [`delete-namespace`](f2ddd60eb9/test/e2e/framework/test_context.go (L109)) flag in the test framework.
So after the test running, there will be running pod left on the test node. This seems to be acceptable in our test infrastructure because we create an new instance each time.
However, in 1.4 we may want to provide part of the test as node conformance test to the user, they definitely don't want the test to leave tons of pods on their node after test running.
Currently, there is no easy way to only start namespace controller in kube-controller-manager (confirmed with @mikedanese), so in this PR I started a "uncontainerized" one in the test infrastructure.
This PR:
* Started the namespace controller in the node e2e test infrastructure and enable the automatic namespace deletion.
* Change the privileged test to use framework (@yujuhong), so that all node e2e tests are using the framework and test pods will be cleaned up by namespace controller.
/cc @kubernetes/sig-node
Automatic merge from submit-queue
Don't repeat the program name in healthCheckCommand.String()
The name is in both `Path` and `Args[0]`, so start printing args at 1.
Also refactor to avoid an extra space character in the output.
I pondered whether `healthCheckCommand.String()` should check if the slice is empty, to avoid a panic, but it didn't check for `Cmd==nil` before.
Fixes#29107
Automatic merge from submit-queue
Change the docker validation node e2e test to use gci-canary-test
This PR changed the continuous docker validation node e2e test to use the image config file introduced in https://github.com/kubernetes/kubernetes/pull/28708. @euank
This PR also changed the gci image family from `gci-preview-test` to `gci-canary-test`. @wonderfly
Automatic merge from submit-queue
Node E2E: Make it possible to share test between e2e and node e2e
This PR is part of the plan to improve node e2e test coverage.
* Now to improve test coverage, we have to copy test from e2e to node e2e.
* When adding a new test, we have to decide its destiny at the very beginning - whether it is a node e2e or e2e.
This PR makes it possible to share test between e2e and node e2e.
By leveraging the mechanism of ginkgo, as long as we can import the test package in the test suite, the corresponding `Describe` will be run to initialize the global variable `_`, and the test will be inserted into the test suite. (See https://github.com/onsi/composition-ginkgo-example)
In the future, we just need to use the framework to write the test, and put the test into `test/e2e/node`, then it will be automatically shared by the 2 test suites.
This PR:
1) Refactored the framework to make it automatically differentiate e2e and node e2e (Mainly refactored the `PodClient` and the apiserver client initialization).
2) Created a new directory `test/e2e/node` and make it shared by e2e and node e2e.
3) Moved `container_probe.go` into `test/e2e/node` to verify the change.
@kubernetes/sig-node
[]()
Automatic merge from submit-queue
Fix a bug in mirror pod node e2e test.
Fixed a bug in test/e2e_node/mirror_pod_test.go. The function 'checkMirrorPodDisappear' returns nil even when the pod does not disappear. It should return a non-nil error.
@Random-Liu