Instead of allowing the cloud provider to guess at the zones that
should be applied for a cluster under test, allow the explicit list
of zones to consider to be passed as a new test context flag -gce-zones.
Only the GCE test cloud provider recognizes this value because only
the GCE test cloud provider makes assumptions about zones for verifying
values, and the default assumptions for GKE do not always match non-GKE
providers.
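As a rough illustration, the zone list could be consumed as sketched below. This is a minimal sketch, not the framework's code: the comma-separated format, the `parseZones` helper and the example zone names are assumptions made for this example.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// parseZones splits a zone list and drops empty entries.
// The comma-separated format is an assumption of this sketch.
func parseZones(raw string) []string {
	var zones []string
	for _, z := range strings.Split(raw, ",") {
		if z = strings.TrimSpace(z); z != "" {
			zones = append(zones, z)
		}
	}
	return zones
}

func main() {
	// A private flag set keeps the sketch independent of the global one.
	fs := flag.NewFlagSet("e2e", flag.ContinueOnError)
	raw := fs.String("gce-zones", "", "explicit list of zones to consider")
	if err := fs.Parse([]string{"-gce-zones=us-east1-b, us-east1-c"}); err != nil {
		panic(err)
	}
	fmt.Println(parseZones(*raw))
}
```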
A number of e2e tests are useful to run after the system has been
disrupted or is in the process of being disrupted, but the current
suite and test logic block progress while waiting for all nodes to be
healthy.
By passing -1 to the --minStartupPods or --allowed-not-ready-nodes flags,
the caller can bypass wait logic before and after test suites that
would prevent running e2e during disruption. This allows use of parts
of the e2e suite during cluster duress to verify that controllers or
components still function.
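The sentinel handling can be pictured roughly as follows. This is only a sketch: `shouldWait` is a hypothetical helper, not a function of the framework, and it only shows the idea that -1 disables a wait while 0 and positive values keep it.

```go
package main

import "fmt"

// shouldWait reports whether a readiness wait should run for the given
// limit. A limit of -1 is treated as "skip the wait entirely".
func shouldWait(limit int) bool {
	return limit != -1
}

func main() {
	for _, limit := range []int{-1, 0, 8} {
		fmt.Printf("limit=%d wait=%v\n", limit, shouldWait(limit))
	}
}
```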
Both of these are explicit arguments and are more elegantly logged
in a test framework by logging the arguments to the test.
The namespaces to be deleted are already logged inside
WaitForNamespacesDeleted
- re-enable e2e_node services
- call GenerateSecureToken for e2e_node Conformance test-suite
- add log messages indicating location in process
- move log messages to some more accurate locations
We print YAML, so you can use YAML tools like `yq`:
```
e2e.test --list-conformance-tests | yq r - --collect *.testname
```
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
The image "gcr.io/authenticated-image-pulling/windows-nanoserver:v1" is not a
manifest list, and it is only useful for Windows Server 1809, which means that the
test "should be able to pull from private registry with secret" will fail for
environments with Windows Server 1903, 1909, or any other future version we might
want to test.
This commit adds the ability to have an alternative private image to pull by
using a configurable docker config file which contains the necessary credentials
needed to pull the image.
The existing walk.go and conformance.txt have a few shortcomings
which we'd like to resolve:
- difficult to get the full test name due to test context nesting
- complicated AST logic and understanding necessary due to the
different ways a test can be invoked and written
This changes the AST parsing logic to be much simpler: it just
looks for the comments at/around a specific line. This file/line
information (and the full test name) is gathered by a custom ginkgo
reporter which dumps the SpecSummary data to a file.
Also, the SpecSummary dump can, itself, be potentially useful for
other post-processing and debugging tasks.
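The "look for the comment near a given line" idea can be sketched with the standard go/parser package. This is an illustration under assumptions, not the code from this change: `commentBefore` and the embedded sample source are hypothetical, and only a comment that ends on the line directly above the target line is considered.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strings"
)

// src is a made-up test file; run("demo") sits on line 8.
const src = `package demo

func tests() {
	/*
	  Testname: Demo, example
	  Description: A hypothetical conformance comment.
	*/
	run("demo")
}

func run(string) {}
`

// commentBefore returns the text of the comment group that ends on the
// line directly above the given line, or "" if there is none.
func commentBefore(filename, source string, line int) (string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, source, parser.ParseComments)
	if err != nil {
		return "", err
	}
	for _, cg := range f.Comments {
		if fset.Position(cg.End()).Line == line-1 {
			return strings.TrimSpace(cg.Text()), nil
		}
	}
	return "", nil
}

func main() {
	text, err := commentBefore("demo.go", src, 8)
	if err != nil {
		panic(err)
	}
	fmt.Println(text)
}
```

In this picture the line number would come from the reporter's dump, so no knowledge of how the test is nested or invoked is needed.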
Signed-off-by: John Schnake <jschnake@vmware.com>
We cannot anticipate all the possible configurations
needed by the SRIOV device plugin: there is too much variety.
Hence, we need to allow the test environment to supply
a host-specific ConfigMap to properly configure the device
plugin and avoid false negatives.
We still provide the default config map as fallback and reference.
Signed-off-by: Francesco Romani <fromani@redhat.com>
One common frustration of end users running the e2e suite is that
it takes a significant amount of time and it is difficult to
gauge progress.
Even when tailing the logs, it can be difficult to see where one
test starts and another ends, or to tell whether there have been
failures in the past hour of logs.
This change adds a new custom reporter which prints summary information
as tests complete. This includes the number of tests to run and how
many have been passed/failed/skipped along with which tests have failed.
A new flag can be set which pushes these values to an endpoint. This is
intended for integration with Sonobuoy but any endpoint could consume and
surface this data to the user so they can better understand the state
of the current test run.
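The kind of summary such a reporter could keep and push is sketched below. The `progress` type, its JSON field names and the use of a plain HTTP POST are assumptions made for illustration; the actual payload and transport are not described here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// progress is a hypothetical summary payload; the field names are
// illustrative only.
type progress struct {
	Total    int      `json:"total"`
	Passed   int      `json:"passed"`
	Failed   int      `json:"failed"`
	Skipped  int      `json:"skipped"`
	Failures []string `json:"failures,omitempty"`
}

// record updates the counters after one test has completed.
func (p *progress) record(name, state string) {
	switch state {
	case "passed":
		p.Passed++
	case "failed":
		p.Failed++
		p.Failures = append(p.Failures, name)
	case "skipped":
		p.Skipped++
	}
}

// send posts the current summary as JSON to the given URL.
func (p *progress) send(url string) error {
	body, err := json.Marshal(p)
	if err != nil {
		return err
	}
	resp, err := http.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	return nil
}

func main() {
	// A throwaway local server stands in for the real endpoint.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	p := &progress{Total: 3}
	p.record("test A", "passed")
	p.record("test B", "failed")
	if err := p.send(srv.URL); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", *p)
}
```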
e2e_node.test does not set a default kubectlPath, which leads to test
errors such as the following:
[Fail] [sig-storage] EmptyDir volumes [It] pod should support
shared volumes between containers [Conformance]
When the test tries to read a file in the shared volume, it uses
"kubectl exec namespace -c container_name -- cat file_name".
However, because the variable framework.TestContext.KubectlPath is not set,
the kubectl binary cannot be found and the test fails.
This patch moves the kubectlPath flag from RegisterClusterFlags to
RegisterCommonFlags, so that a default value for
framework.TestContext.KubectlPath is set, and
users can also use the --kubectl-path flag to set the kubectl path.
Signed-off-by: Howard Zhang <howard.zhang@arm.com>
Adds a new flag which allows users to specify a regexp
that effectively whitelists certain taints and
allows the test framework to start up tests despite having
tainted nodes.
Fixes an issue where e2e tests were unable to be run
in tainted environments due to the framework waiting for
all the nodes to be schedulable without any tolerations.
HandleFlags() was used in the e2e package, but it lived in the core e2e
framework and depended on the framework's "config" sub-package. That was an
invalid dependency, so this moves HandleFlags() to the e2e package to
simplify the dependencies.
Tests should never directly add to the global command line, because
some users of the tests might not want them there. For example,
options might only get set directly from a config file.
To achieve that, e2e/framework/config, e2e/framework/viperconfig, and
e2e/framework/test_context.go avoid using the global flag set and
instead expect to be told by the caller which flag set to use. Tests
that called flag directly either get updated or obsolete flags get
removed.
The exception is framework.HandleFlags, which as before directly
implements global command line handling.
This is a breaking change for test suites which do not use that
function (and only those): they now need to ensure that they copy
individual flags from tests. Because the RegisterCommonFlags prototype
has changed, test suite authors will notice due to the resulting
compilation errors.
Moved all flag code from `staging/src/k8s.io/apiserver/pkg/util/[flag|globalflag]` to `component-base/cli/[flag|globalflag]` except for the term function because of unwanted dependencies.
The feature is gated behind a newly introduced 'dump-systemd-journal' flag.
We want to dump the full systemd journal in our scalability performance tests.
Not accepting --provider= (i.e. setting an empty provider name) broke
some test jobs. As suggested in
https://github.com/kubernetes/kubernetes/pull/73402#issuecomment-459368230,
now --provider= and not passing --provider at all both trigger a
message and then continue as if --provider=skeleton had been used.
The empty string was the default and then triggered a special
warning. There's no good reason for that behavior, so now the special
handling for "unset provider" is gone and "skeleton" is the non-empty
default for the value.
This finishes the work started for 1.13: instead of merely warning
about an unknown value given to --profile, the test/e2e/e2e.test
binary will now print an error and refuse to run.
Fixes: #70200
While debugging issues I found myself having to change the constant in the code
for a cluster > 20 nodes, and then on a very small cluster I found myself passing
0 to avoid the mostly useless output (it's useful in specific scenarios but generates
a *lot* of output that doesn't help debugging the rest of the time).
- Move from the old github.com/golang/glog to k8s.io/klog
- klog has an explicit InitFlags(), so we add the flags as necessary
- we update the other repositories that we vendor that made a similar
change from glog to klog
* github.com/kubernetes/repo-infra
* k8s.io/gengo/
* k8s.io/kube-openapi/
* github.com/google/cadvisor
- Entirely remove all references to glog
- Fix some tests by explicit InitFlags in their init() methods
Change-Id: I92db545ff36fcec83afe98f550c9e630098b3135
The E2E refactoring tightened the sanity checking of the --provider
parameter such that it only allowed known providers. That seemed to
make sense because it catches typos, but it turned out that various
callers depended on the "accept arbitrary provider value" behavior,
therefore it gets restored.
Not all users of the E2E framework want to run cloud-provider specific
tests. By splitting out the code it becomes possible to decide in
an E2E test suite which providers are supported.
This is achieved in two ways:
- the framework calls certain functions through a provider
interface instead of calling specific cloud provider functions
directly
- tests that are cloud-provider specific directly import the
new provider packages
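The two points above can be sketched as a small registry. Everything here is hypothetical and heavily reduced: the real provider interface has more methods, and the names `provider`, `register` and `lookup` are made up for this example.

```go
package main

import "fmt"

// provider is a hypothetical, reduced provider interface.
type provider interface {
	Name() string
}

// skeleton is a do-nothing provider used when no cloud is involved.
type skeleton struct{}

func (skeleton) Name() string { return "skeleton" }

// registry maps provider names to factories. Provider packages would
// add themselves here when a test suite imports them.
var registry = map[string]func() provider{}

func register(name string, factory func() provider) {
	registry[name] = factory
}

// lookup returns the provider registered under name.
func lookup(name string) (provider, error) {
	factory, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("unknown provider %q", name)
	}
	return factory(), nil
}

func main() {
	register("skeleton", func() provider { return skeleton{} })
	p, err := lookup("skeleton")
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Name())
}
```

With this shape, the framework only talks to the interface, and a suite decides which providers exist simply by choosing which provider packages it imports.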
The ingress test utilities are only needed by a few tests. Splitting
them out into a separate package makes the framework simpler for test
suites not using those tests.
Fixes: #66649
Test settings should be defined in the test source code itself
because conceptually the framework is a separate entity that not all
test authors can modify.
For the sake of backwards compatibility the names of the command line
flags are not changed.
Test settings should be defined in the test source code itself
because conceptually the framework is a separate entity that not all
test authors can modify.
Using the new framework/config code also has several advantages:
- defaults can be set with less code
- no confusion around what's a duration
- the options can also be set via command line flags
While at it, a minor bug gets fixed:
- readConfig() returns only defaults when called while
registering Ginkgo tests because Viperize() gets called later,
so the scale in the logging soak test couldn't really be configured;
now the value is read when the test runs and thus can be changed
The options get moved into the "instrumentation.logging"
resp. "instrumentation.monitoring" group to make it more obvious where
they are used. This is a breaking change, but that was already
necessary to improve the duration setting from plain integer to a
proper time duration.
Tests shouldn't have to use the central context for their settings,
because conceptually tests and framework get developed independently.
This does not yet use the new framework/config utility code because
that code still needs to be reviewed.
Besides moving the flags, they also get renamed from the top-level
"--csiImage{Version|Registry}" to
"--storage.csi.image.{version|registry}". These flags were introduced
fairly recently and shouldn't be in use much, so now is a good time to
introduce a hierarchical naming for storage flags, in particular
because more flags will be added soon.
Storing settings in the framework's TestContext is not something that
out-of-tree test authors can do because for them the framework is a
read-only upstream component. Conceptually the same is true for
in-tree tests, so the recommended approach is to define configuration
settings in the code that uses them.
How to do that is a bit uncertain. Viper has several
drawbacks (maintenance status uncertain, cannot list supported
options, cannot validate the configuration file). How to handle
configuration files is currently getting discussed for kubeadm, with
similar concerns about
Viper (https://github.com/kubernetes/kubeadm/issues/1040).
Instead of making a choice now for E2E, the recommendation is that
test authors continue to define command line flags as before, except
that they should do it in their own code and with better flag names.
But the ability to read options also from a file is useful, so
several enhancements get added:
- all settings defined via flags can also be read from a
configuration file, without extra work for test authors
- framework/config makes it possible to populate a struct directly
and define flags with a single function call
- a path and file suffix can be given to --viper-config (as in
"--viper-config /tmp/e2e.json") instead of expecting the file in
the current directory; as before, just plain "--viper-config e2e"
still works
- if "--viper-config" is set, the file must exist; otherwise the
"e2e" config is optional (as before)
- errors from Viper are no longer silently ignored, so syntax errors
are detected early
- Viper support is optional: test suite authors who don't want
it are not forced to use it by the e2e/framework
Individual implementations are not yet being moved.
Fixed all dependencies which call the interface.
Fixed golint exceptions to reflect the move.
Added project info as per @dims and
https://github.com/kubernetes/kubernetes-template-project.
Added dims to the security contacts.
Fixed minor issues.
Added missing template files.
Copied ControllerClientBuilder interface to cp.
This allows us to break the only dependency on K8s/K8s.
Added TODO to ControllerClientBuilder.
Fixed GoDeps.
Factored in feedback from JustinSB.
Some e2e tests are skipped depending on the Linux distribution of the
master and nodes. The corresponding options can be one of "debian", "ubuntu",
"gci" or "custom". This updates the help messages of the options.
Automatic merge from submit-queue (batch tested with PRs 60699, 63780). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
e2e/storage: parameterize container images
**What this PR does / why we need it**:
The CSI integration test for hostpath was hard-coded to use the latest
stable release of the sidecar and hostpath container images. This
makes sense for regression testing of changes made in Kubernetes
itself, but the same test is also useful for testing the "canary"
images on quay.io before tagging them as a new release or for testing
locally produced images. Both are now possible via command line
parameters.
**Which issue(s) this PR fixes**:
Related-to: kubernetes-csi/docs#23
**Special notes for your reviewer**:
The commit message has usage instructions.
```release-note
NONE
```
/sig storage
Putting the command line argument handling into the central test
context seems like the better solution, in particular considering that
argument handling might get changed in the future to use Viper.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Deprecate InfluxDB cluster monitoring
InfluxDB cluster monitoring addon will no longer be supported and will be removed in k8s 1.12.
Default monitoring solution will be changed to `standalone`.
Heapster will still be deployed for backward compatibility of `kubectl top`
```release-note
Stop using InfluxDB as default cluster monitoring
InfluxDB cluster monitoring is deprecated and will be removed in v1.12
```
cc @piosz
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Auto-calculate allowed-not-ready-nodes in test framework
Actually we (sig-scalability) are pretty much the only users of this flag.
This reduces the overhead of having to provide its value based on num-nodes each time we run our tests.
/cc @wojtek-t
```release-note
NONE
```
We expect lots of tools to be able to install on provider=gce - the
cluster API, kops, kube-up etc.
We introduce a new optional flag to e2e ('tooling') to enable switching
on the tooling, not just the cloud.
This will prove useful for upgrade tests, for example, where the
mechanism will likely vary by tooling, but is currently tightly bound to
the provider (i.e. cloud)
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add prometheus cluster monitoring addon.
This PR adds a new cluster monitoring addon based on Prometheus.
It adds a Prometheus deployment with e2e tests.
Additional components will be added iteratively in the future.
The manifests are based on the current Helm chart.
In its current state it is not intended for production use.
cc @piosz @kawych @miekg
```release-note
Add prometheus cluster monitoring addon to kube-up
```
/sig instrumentation
/kind feature
/priority important-soon
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update test framework featuregates type
**What this PR does / why we need it**:
A cleanup following #53025 and #57962.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref: #53025
and #57962.
**Special notes for your reviewer**:
but yeah, not sure if it's worthy to do this :)
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 55283, 55461, 55288, 53970, 55487). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Support collecting log for alternative container runtime in e2e test.
Fixes https://github.com/kubernetes/kubernetes/issues/55629.
Add support to collect logs for alternative container runtime in e2e.
Example for `cri-containerd`:
```
$ go run hack/e2e.go -- --test -v --test_args="--report-dir=$PWD --container-runtime-services=cri-containerd,containerd,cri-containerd-installation"
```
```release-note
none
```
/cc @kubernetes/sig-node-pr-reviews @kubernetes/sig-testing-pr-reviews
In scalability testing influxdb was recently disabled, but we are still
trying to execute the corresponding test; as a result it fails all the time.
Skip the test if influxdb is disabled.