1. move GetOneTimeResourceUsageOnNode() from test/e2e/framework/kubelet/stats.go to getOneTimeResourceUsageOnNode() in test/e2e/framework/resource_usage_gatherer.go
2. copy GetKubeletPods() from test/e2e/framework/kubelet/kubelet_pods.go to getKubeletPods() in test/e2e/framework/util.go
Signed-off-by: clarklee92 <clarklee1992@hotmail.com>
- SimpleGET: Moved to ingress sub package of e2e framework
- PollURL: Moved to ingress sub package of e2e framework
- ProxyMode: Moved to service e2e test package
- ListNamespaceEvents: Moved to e2e_node test package
- NewE2ETestNodePreparer: Removed since 59533f0cd1
In case node object update fails due to revision conflict, it does not make sense to return success.
It needs to be retried so the v1.PreferAvoidPodsAnnotationKey annotation can be properly set in the next round.
The function is used at e2e framework util module only.
So this moves the function to the module for trying to remove
dependencies to subpackages from core e2e framework.
This cleans up the framework.util.go file and moves various Expect*
functions into their own file for better organization.
Signed-off-by: alejandrox1 <alarcj137@gmail.com>
This continues the work in
https://github.com/kubernetes/kubernetes/pull/81426 by also moving the
logger_test.go, moving the log helper code from util.go to log.go (a
more logical place, as it is only used there) and updating comments.
After merging
259bb3bef5 (diff-eb7b79470992813ea1905e96c298b47b)
ExpectEqual and some of the other wrappers logged the failure twice,
once inside the wrapper itself and once in the failure handler.
Logging the stack backtrace is useful because many assertions still
don't contain an explanation and therefore knowing where they occur is
crucial. Now all failures are logged with a "Full Stack Trace", not
just those with a wrapper. The stack is pruned to skip over wrapper
functions and removes Ginkgo internal functions to keep the stack
trace smaller.
Failures occuring in the wrappers were recorded as occuring in those
wrappers. Now the wrappers are skipped and the caller is recorded
instead.
The full stack trace recorded by Ginkgo 1.10.0 is currently off by one
entry. This needs to be fixed in Ginkgo, then the test can be updated.
Now that namespace deletion is reliable, use the suite tear down
to catch non terminated namespaces and stop waiting within each test
for deletion.
Serial tests that need a clean set of namespaces must use the
appropriate flag to control whether they wait for clean namespaces
on startup.
This is an effort to clean up the framework util.go file so that it
makes more sense and it is more managable.
Signed-off-by: alejandrox1 <alarcj137@gmail.com>
Remove the "OrDie" from the name (since it doesn't "or die") and add
an extra check that there is at least 1 node available, since many
callers already did that themselves, and many others should have.
PrettyPrintJSON is most used e2emetrics function and that doesn't seem
specific for metrics. The implementation itself is generic, so it is
nice to move it to core framework for avoiding circular dependency.
The `Command` will cause the container process not starting correctly,
so we now use the `Args` to end up running `/agnhost pause` as intended.
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
Adds a new flag which allows users to specify a regexp
which will effectively whitelist certain taints and
allow the test framework to startup tests despite having
tainted nodes.
Fixes an issue where e2e tests were unable to be run
in tainted environments due to the framework waiting for
all the nodes to be schedulable without any tolerations.
dumpAllNodeInfo() called Nodes().List() internally, but the function
is called from DumpAllNamespaceInfo() only and DumpAllNamespaceInfo()
calls Nodes().List() before calling dumpAllNodeInfo().
So this makes the result of Nodes().List() being passed to
dumpAllNodeInfo() then reduce a call of Nodes().List().
it turns out that the framework.TestContext.IPFamily variable is
not available for the DNS tests if they don't run in the initial
Ginkgo node when running in parallel.
We add a function to the framework to allow us to run command
only once per each Ginkgo node parallel execution.
It also adds a method to detect if the cluster is IPv6.
The use of the framework.TestContext.IPFamily variable guarantees
consistency all over the testing because this variable is only
assigned at the beginning of the testing.
All failures are worth logging immediately, not just unexpected
errors. That helps understand tests that have long-running cleanup
operations with their own logging, because the failure will be visible
inside the test output.
The logging in framework.ExpectNoError also was rather poor, because
it only showed the error, but not the additional information about it.
Tests suites now should use log.Fail as Gomega failure handler instead
of ginkgowrapper.Fail. log.Fail will handle the logging for all
failures before proceeding to record the failure in Ginkgo.
Because logging is always done also after a test failure, additional
failures during cleanup are now visible. Ginkgo itself just ignores
them.
Skips IPv6 tests on Windows.
Skips sysctl tests on Windows.
Skips network policy tests on Windows.
Skips RunAsUser / FSGroup / file permissions related tests, as those are
not supported on Windows.
Skips the test "should preserve source pod IP for traffic thru service cluster IP"
on Windows, as it creates a Pod with HostNetwork=true, which is unsupported.
What works and what doesn't work on Windows has been documented here:
https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md#windows--linux-considerations
Quite a few images are only used a few times in a few tests. Thus,
the images are being centralized into the agnhost image, reducing
the number of images that have to be pulled and used.
This PR replaces the usage of the following images with agnhost:
- audit-proxy
- crd-conversion-webhook
- entrypoint-tester
- inclusterclient
- iperf
- porter
- serve-hostname
Organized functions that abstract the access of
k8s.io/apimachinery/pkg/runtime.Objects into a framework subpackge.
Signed-off-by: Jorge Alarcon Ochoa <alarcj137@gmail.com>
- Add a package "node" under e2e/framework and alias e2enode;
- Rename some functions whose name have redundant string.
Signed-off-by: Jiatong Wang <wangjiatong@vmware.com>
Quite a few images are only used a few times in a few tests. Thus,
the images are being centralized into the agnhost image, reducing
the number of images that have to be pulled and used.
This PR replaces the usage of the following images with agnhost:
- net
- netexec
- nettest
- webhook
Many e2e tests expect two values are the same, such check is one of
common things in tests. Now gomega.Expect(foo).To(gomega.Equal(bar))
is doing that, this adds ExpectEqual() for replacing the above call
for readable code.
In addition, this replaces under apimachinery/generated_clientset.go
as a sample.
This fixes golint failures of the following files:
- test/e2e/framework/networking_utils.go
- test/e2e/framework/service_util.go
- test/e2e/framework/util.go
All golint failures in test/e2e/framework are fixed at this commit.
Remove 'test/e2e/framework' from 'hack/.golint_failures'
There is a lot of gomega.Expect(err).To(gomega.HaveOccurred()) callers
which expect an error happens in e2e tests.
However these test code seems confusing because the code readers
need to take care of To() or NotTo() on each test scenario.
This adds ExpectError() for more readable test code.
In addition, this applies ExpectError() to e2e provisioning.go as a
sample.
The framework/ssh.go code was heavily used throughout the framework
and could be useful elsewhere but reusing those methods requires
importing all of the framework.
Extracting these methods to their own package for reuse.
Only a few methods had to be copied into this package from the
rest of the framework to avoid an import cycle.
This test creates a pod with custom DNS configurations and expects
them to be set in the containers.
The test pod is using the agnhost image, which can output the container's
DNS configurations that the test checks.
This is a part of a series for fixing golint failures for util.go.
- fixes golint failures from line 2354 to line 3685 at original util.go
This fixes golint failures of the following file:
- test/e2e/framework/util.go
This is a part of a series for fixing golint failures for util.go.
- fixes golint failures from line 3686 to end of file at original util.go
This fixes golint failures of the following file:
- test/e2e/framework/util.go
This is a part of a series for fixing golint failures for util.go.
- fixes golint failures from line 1395 to line 2353 at original util.go
This fixes golint failures of the following file:
- test/e2e/framework/util.go
This changes following files because of change function name
in above file.
- test/e2e/apps/rc.go
- test/e2e/apps/replica_set.go
This is a part of a series for fixing golint failures for util.go.
- fixes `should not use dot imports` about `ginkgo` and `gomega`
- fixes golint failures from top of file to line 1394 at original util.go
This fixes golint failures of the following file:
- test/e2e/framework/util.go
This changes following files because of change function name
in above file.
- test/e2e/e2e.go
- test/e2e/network/network_tiers.go
- test/e2e/network/service.go
It turns out to be a frequent bug that is revealed when nodes don't have
external IP addresses. In the test we assume that in such case there
won't be any addresses of type 'NodeExternalIp', which is invalid. In
such case there will be an address of type 'NodeExternalIP', but with
the empty 'Address' field.
Ref. https://github.com/kubernetes/kubernetes/issues/76374
There are two reason why this is useful:
1. less code to vendor into external users of the framework
The following dependencies become obsolete due to this change (from `dep`):
(8/23) Removed unused project github.com/grpc-ecosystem/go-grpc-prometheus
(9/23) Removed unused project github.com/coreos/etcd
(10/23) Removed unused project github.com/globalsign/mgo
(11/23) Removed unused project github.com/go-openapi/strfmt
(12/23) Removed unused project github.com/asaskevich/govalidator
(13/23) Removed unused project github.com/mitchellh/mapstructure
(14/23) Removed unused project github.com/NYTimes/gziphandler
(15/23) Removed unused project gopkg.in/natefinch/lumberjack.v2
(16/23) Removed unused project github.com/go-openapi/errors
(17/23) Removed unused project github.com/go-openapi/analysis
(18/23) Removed unused project github.com/go-openapi/runtime
(19/23) Removed unused project sigs.k8s.io/structured-merge-diff
(20/23) Removed unused project github.com/go-openapi/validate
(21/23) Removed unused project github.com/coreos/go-systemd
(22/23) Removed unused project github.com/go-openapi/loads
(23/23) Removed unused project github.com/munnerz/goautoneg
2. works around https://github.com/kubernetes/kubernetes/issues/75338
which currently breaks vendoring
Some recent changes to crd_util.go must now be pulling in the broken
k8s.io/apiextensions-apiserver packages, because it was still working
in revision 2e90d92db9 (as demonstrated by
586ae281ac).
When running e2e conformance tests against a public https protected
APIserver the websocket tests would fail because it fell back to using
`ws://` instead of `wss://` for the websocket connection.
This happened because the code detect if HTTPS is used only looks for
HTTPS related configuration in the kubeconfig, like a custom CA or
certificates.
The fix is to always use HTTPS when the apiserver URL has the scheme `https://`.
Signed-off-by: Mikkel Oscar Lyderik Larsen <mikkel.larsen@zalando.de>
The feature is gated behind a newly introduced 'dump-systemd-journal' flag.
We want to dump the full systemd journal in our scalability performance tests.
Some tests which utilized SSH to run commands on nodes would
first look for external IPs but fall back to internal IPs
since those could be reachable by the testing program.
This change adds that same fallback logic to another method
used to find the appropriate SSH address for each node.
Fixes#68747
While debugging issues I found myself having to change the constant in the code
for a cluster > 20 nodes, and then on a very small cluster I found myself passing
0 to avoid the mostly useless output (it's useful in specific scenarios but generates
a *lot* of output that doesn't help debugging the rest of the time).
Some mounttest related tests are checking the file permissions set on the
container files, but the default file permissions on Windows is 775 instead of
644, causing some tests to fail.
Keep in mind that file permissions work differently on Windows, and setting file
permissions via Kubernetes is not currently supported on Windows.
There were providers which would:
- allow overrides of the file base name but not the path
- allow oeverrides of the file path
- not allow any overrides at all
This change makes it such that each of the supported providers
can override the SSH key location using an env var. The env
var itself may vary based on the provider though.
If given an absolute path to the key, it is used. If given
a relative path it will be made relative to ~/.ssh
Fixes: #68747
Ginkgo expects the caller to pass the appropriate skip, which we do
not do. Slightly improve call site messages by eliding the util.go
stack frame. Also drop the timestamp from skip messages since skip
is almost always called to check for global conditions, not temporal
ones.
- Move from the old github.com/golang/glog to k8s.io/klog
- klog as explicit InitFlags() so we add them as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
* github.com/kubernetes/repo-infra
* k8s.io/gengo/
* k8s.io/kube-openapi/
* github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods
Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
The existence of /proc/net/nf_conntrack depends on the Linux kernel
config CONFIG_NF_CONNTRACK_PROCFS, and Ubuntu 16.04's one is disabled.
So the e2e test "should set TCP CLOSE_WAIT timeout" fails on that.
This adds the existence check of /proc/net/nf_conntrack for skipping
the test if not existing.
In addition, this makes IssueSSHCommandWithResult return Stderr in
the error message if err is nil to check "No such file or directory"
as nonexistence of /proc/net/nf_conntrack.
This way we can be sure that the kubelet can't communicate with the
master, even if falls-back to the internal/external IP (which seems to
be the case with DNS)
Issue #56787
When we get an unsupported provider message, it often isn't clear what
method actually failed - add more information to the error message.
Issue #70280
Not all users of the E2E framework want to run cloud-provider specific
tests. By splitting out the code it becomes possible to decide in
a E2E test suite which providers are supported.
This is achieved in two ways:
- the framework calls certain functions through a provider
interface instead of calling specific cloud provider functions
directly
- tests that are cloud-provider specific directly import the
new provider packages
The ingress test utilities are only needed by a few tests. Splitting
them out into a separate package makes the framework simpler for test
suites not using those tests.
Fixes: #66649
A long time ago, We added the image prepulling as a workaround due to
the overwhelming amount of flake caused by pulling during the tests.
This functionality has been broken for a while now when we switched to a
COS image where mounting `docker` binary into `busybox` stopped working.
So we just have dead code we should clean up.
Change-Id: I538171a5c1d9361eee7f9e0a99655b88b1721e3e
Automatic merge from submit-queue (batch tested with PRs 66229, 67682, 67585, 67641, 67697). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Fix validation logic in E2ETestNodePreparer.
**What this PR does / why we need it**:
There is a bug in E2ETestNodePreparer validation logic.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 67071, 66906, 66722, 67276, 67039). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
#50102 Task 1: Move apimachinery/pkg/watch.Until into client-go/tools/watch.UntilWithoutRetry
**What this PR does / why we need it**:
This is a split off from https://github.com/kubernetes/kubernetes/pull/50102 to go in smaller pieces.
Moves `apimachinery/pkg/watch.Until` into `client-go/tools/watch.UntilWithoutRetry` and adds context so it is cancelable.
**Release note**:
```release-note
NONE
```
**Dev release note**:
```dev-release-note
`apimachinery/pkg/watch.Until` has been moved to `client-go/tools/watch.UntilWithoutRetry`.
While switching please consider using the new `client-go/tools/watch.UntilWithSync` or `client-go/tools/watch.Until`.
```
/cc @smarterclayton @kubernetes/sig-api-machinery-pr-reviews
/milestone v1.12
/priority important-soon
/kind bug
(bug after the main PR which is this split from)
All e2e test images are now using multi-arch manifests so we should stop
looking up and using images that are specific to runtime.GOARCH
Change-Id: I5f3fd6e9a42b9fb88891c19e28a2dfcf7a14be82
Use the same pattern everywhere in the e2e test
harness, use busybox (from dockerhub) instead
of using the one from k8s.gcr.io registry.
Change-Id: I57c3b867408c1f9478a8909c26744ea0368ff003
Automatic merge from submit-queue (batch tested with PRs 64140, 64898, 65022, 65037, 65027). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add e2e regression tests for the kubelet being secure
**What this PR does / why we need it**:
This PR does,
1. The kubelet cAdvisor port (4194) can't be reached, neither via the API server proxy nor directly on the public IP address
2. The kubelet read-only port (10255) can't be reached, neither via the API server proxy nor directly on the public IP address
3. The kubelet can delegate ServiceAccount tokens to the API server
4. The kubelet's main port (10250) has both authentication (should fail with no credentials) and authorization (should fail with insufficient permissions) set-up
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixeskubernetes/kubeadm#838
**Special notes for your reviewer**:
/cc luxas tallclair
**Release note**:
```release-note
Add e2e regression tests for the kubelet being secure
```
Automatic merge from submit-queue (batch tested with PRs 63859, 63979). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Drop reapers
/assign @deads2k @juanvallejo
**Release note**:
```release-note
kubectl delete does not use reapers for removing objects anymore, but relies on server-side GC entirely
```
Automatic merge from submit-queue (batch tested with PRs 62756, 63862, 61419, 64015, 64063). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Handle TERM signal to reduce pod terminating time.
**What this PR does / why we need it**:
Register signal handler for `TERM`.
For container, process is run as PID 1. PID 1 is special in linux, it ignore any signal with the default action. So process will not terminate on `SIGINT` or `SIGTERM` unless it is coded to do so.
By default, docker use `TERM` signal to termiante pods, it will wait 10 seconds before sending `KILL` signal.
```
$ docker run -d --rm --name test busybox sh -c 'while true; do sleep 1; done'
aa827df5d7bfffc5ca4fae2429d0b761a5a142c57ba3e1faa59b305c92f1c875
$ time docker stop test
test
real 0m10.408s
user 0m0.048s
sys 0m0.020s
```
It's better to register a exit handler for `TERM`, it reduces a lot of time in waiting pods to termiante.
```
$ docker run -d --rm --name test busybox sh -c 'trap exit TERM; while true; do sleep 1; done'
e331bff454dba8e45df6065c3bd2a928e1c41303aafdf88ede38def3e4e5781f
$ time docker stop test
test
real 0m0.690s
user 0m0.048s
sys 0m0.020s
```
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 63920, 63716, 63928, 60553, 63946). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Wait for pod deletion instead of termination in reconstruction test
**What this PR does / why we need it**:
Change volume test to wait for pod deletion instead of pod termination
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Addresses https://github.com/kubernetes/kubernetes/issues/63923
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 63246, 63185). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add scale-up test from 0 for GPU node-pool
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
should use time.Since instead of time.Now().Sub
**What this PR does / why we need it**:
should use time.Since instead of time.Now().Sub
**Special notes for your reviewer**:
- Add PodStartShortTimeout and ClaimProvisionShortTimeout constants.
- Change framework.PodStartTimeout to framework.PodStartShortTimeout in
persistent_volumes-local.go. Busybox image is very small, no need to
wait for a long time.
Automatic merge from submit-queue (batch tested with PRs 62694, 62569, 62646, 61633, 62433). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add GCE-PD CSI Driver test to E2E test suite
Fixes: #60462
/sig storage
/kind technical-debt
/assign @saad-ali @msau42
**What this PR does / why we need it**:
This PR adds an E2E test for the GCE-PD CSI driver that deploys the driver in a production-like setting and tests whether dynamic provisioning with the driver is possible.
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add prometheus cluster monitoring addon.
This PR adds new cluster monitoring addon based on prometheus.
It adds prometheus deployment with e2e tests.
Additional components will be added iterativly in future.
Manifests based on current Helm chart.
At current state it's not intended for production use.
cc @piosz @kawych @miekg
```release-note
Add prometheus cluster monitoring addon to kube-up
```
/sig instrumentation
/kind feature
/priority important-soon