mirror of
				https://github.com/k3s-io/kubernetes.git
				synced 2025-10-24 17:10:44 +00:00 
			
		
		
		
	
		
			
				
	
	
		
			720 lines
		
	
	
		
			28 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
			
		
		
	
	
			720 lines
		
	
	
		
			28 KiB
		
	
	
	
		
			Markdown
		
	
	
	
	
	
| # End-to-End Testing in Kubernetes
 | |
| 
 | |
| Updated: 5/3/2016
 | |
| 
 | |
| **Table of Contents**
 | |
| <!-- BEGIN MUNGE: GENERATED_TOC -->
 | |
| 
 | |
| - [End-to-End Testing in Kubernetes](#end-to-end-testing-in-kubernetes)
 | |
|   - [Overview](#overview)
 | |
|   - [Building and Running the Tests](#building-and-running-the-tests)
 | |
|     - [Cleaning up](#cleaning-up)
 | |
|   - [Advanced testing](#advanced-testing)
 | |
|     - [Bringing up a cluster for testing](#bringing-up-a-cluster-for-testing)
 | |
|     - [Federation e2e tests](#federation-e2e-tests)
 | |
|       - [Configuring federation e2e tests](#configuring-federation-e2e-tests)
 | |
|       - [Image Push Repository](#image-push-repository)
 | |
|       - [Build](#build)
 | |
|       - [Deploy federation control plane](#deploy-federation-control-plane)
 | |
|       - [Run the Tests](#run-the-tests)
 | |
|       - [Teardown](#teardown)
 | |
|       - [Shortcuts for test developers](#shortcuts-for-test-developers)
 | |
|     - [Debugging clusters](#debugging-clusters)
 | |
|     - [Local clusters](#local-clusters)
 | |
|       - [Testing against local clusters](#testing-against-local-clusters)
 | |
|     - [Version-skewed and upgrade testing](#version-skewed-and-upgrade-testing)
 | |
|   - [Kinds of tests](#kinds-of-tests)
 | |
|     - [Viper configuration and hierarchichal test parameters.](#viper-configuration-and-hierarchichal-test-parameters)
 | |
|     - [Conformance tests](#conformance-tests)
 | |
|     - [Defining Conformance Subset](#defining-conformance-subset)
 | |
|   - [Continuous Integration](#continuous-integration)
 | |
|     - [What is CI?](#what-is-ci)
 | |
|     - [What runs in CI?](#what-runs-in-ci)
 | |
|       - [Non-default tests](#non-default-tests)
 | |
|     - [The PR-builder](#the-pr-builder)
 | |
|     - [Adding a test to CI](#adding-a-test-to-ci)
 | |
|     - [Moving a test out of CI](#moving-a-test-out-of-ci)
 | |
|   - [Performance Evaluation](#performance-evaluation)
 | |
|   - [One More Thing](#one-more-thing)
 | |
| 
 | |
| <!-- END MUNGE: GENERATED_TOC -->
 | |
| 
 | |
| ## Overview
 | |
| 
 | |
| End-to-end (e2e) tests for Kubernetes provide a mechanism to test end-to-end
 | |
| behavior of the system, and is the last signal to ensure end user operations
 | |
| match developer specifications. Although unit and integration tests provide a
 | |
| good signal, in a distributed system like Kubernetes it is not uncommon that a
 | |
| minor change may pass all unit and integration tests, but cause unforeseen
 | |
| changes at the system level.
 | |
| 
 | |
| The primary objectives of the e2e tests are to ensure a consistent and reliable
 | |
| behavior of the kubernetes code base, and to catch hard-to-test bugs before
 | |
| users do, when unit and integration tests are insufficient.
 | |
| 
 | |
| The e2e tests in kubernetes are built atop of
 | |
| [Ginkgo](http://onsi.github.io/ginkgo/) and
 | |
| [Gomega](http://onsi.github.io/gomega/). There are a host of features that this
 | |
| Behavior-Driven Development (BDD) testing framework provides, and it is
 | |
| recommended that the developer read the documentation prior to diving into the
 | |
|  tests.
 | |
| 
 | |
| The purpose of *this* document is to serve as a primer for developers who are
 | |
| looking to execute or add tests using a local development environment.
 | |
| 
 | |
| Before writing new tests or making substantive changes to existing tests, you
 | |
| should also read [Writing Good e2e Tests](writing-good-e2e-tests.md)
 | |
| 
 | |
| ## Building and Running the Tests
 | |
| 
 | |
| There are a variety of ways to run e2e tests, but we aim to decrease the number
 | |
| of ways to run e2e tests to a canonical way: `hack/e2e.go`.
 | |
| 
 | |
| You can run an end-to-end test which will bring up a master and nodes, perform
 | |
| some tests, and then tear everything down. Make sure you have followed the
 | |
| getting started steps for your chosen cloud platform (which might involve
 | |
| changing the `KUBERNETES_PROVIDER` environment variable to something other than
 | |
| "gce").
 | |
| 
 | |
| To build Kubernetes, up a cluster, run tests, and tear everything down, use:
 | |
| 
 | |
| ```sh
 | |
| go run hack/e2e.go -v --build --up --test --down
 | |
| ```
 | |
| 
 | |
| If you'd like to just perform one of these steps, here are some examples:
 | |
| 
 | |
| ```sh
 | |
| # Build binaries for testing
 | |
| go run hack/e2e.go -v --build
 | |
| 
 | |
| # Create a fresh cluster.  Deletes a cluster first, if it exists
 | |
| go run hack/e2e.go -v --up
 | |
| 
 | |
| # Run all tests
 | |
| go run hack/e2e.go -v --test
 | |
| 
 | |
| # Run tests matching the regex "\[Feature:Performance\]"
 | |
| go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\[Feature:Performance\]"
 | |
| 
 | |
| # Conversely, exclude tests that match the regex "Pods.*env"
 | |
| go run hack/e2e.go -v --test --test_args="--ginkgo.skip=Pods.*env"
 | |
| 
 | |
| # Run tests in parallel, skip any that must be run serially
 | |
| GINKGO_PARALLEL=y go run hack/e2e.go --v --test --test_args="--ginkgo.skip=\[Serial\]"
 | |
| 
 | |
| # Run tests in parallel, skip any that must be run serially and keep the test namespace if test failed
 | |
| GINKGO_PARALLEL=y go run hack/e2e.go --v --test --test_args="--ginkgo.skip=\[Serial\] --delete-namespace-on-failure=false"
 | |
| 
 | |
| # Flags can be combined, and their actions will take place in this order:
 | |
| # --build, --up, --test, --down
 | |
| #
 | |
| # You can also specify an alternative provider, such as 'aws'
 | |
| #
 | |
| # e.g.:
 | |
| KUBERNETES_PROVIDER=aws go run hack/e2e.go -v --build --up --test --down
 | |
| 
 | |
| # -ctl can be used to quickly call kubectl against your e2e cluster. Useful for
 | |
| # cleaning up after a failed test or viewing logs. Use -v to avoid suppressing
 | |
| # kubectl output.
 | |
| go run hack/e2e.go -v -ctl='get events'
 | |
| go run hack/e2e.go -v -ctl='delete pod foobar'
 | |
| ```
 | |
| 
 | |
| The tests are built into a single binary which can be run used to deploy a
 | |
| Kubernetes system or run tests against an already-deployed Kubernetes system.
 | |
| See `go run hack/e2e.go --help` (or the flag definitions in `hack/e2e.go`) for
 | |
| more options, such as reusing an existing cluster.
 | |
| 
 | |
| ### Cleaning up
 | |
| 
 | |
| During a run, pressing `control-C` should result in an orderly shutdown, but if
 | |
| something goes wrong and you still have some VMs running you can force a cleanup
 | |
| with this command:
 | |
| 
 | |
| ```sh
 | |
| go run hack/e2e.go -v --down
 | |
| ```
 | |
| 
 | |
| ## Advanced testing
 | |
| 
 | |
| ### Bringing up a cluster for testing
 | |
| 
 | |
| If you want, you may bring up a cluster in some other manner and run tests
 | |
| against it. To do so, or to do other non-standard test things, you can pass
 | |
| arguments into Ginkgo using `--test_args` (e.g. see above). For the purposes of
 | |
| brevity, we will look at a subset of the options, which are listed below:
 | |
| 
 | |
| ```
 | |
| --ginkgo.dryRun=false: If set, ginkgo will walk the test hierarchy without
 | |
| actually running anything. Best paired with -v.
 | |
| 
 | |
| --ginkgo.failFast=false: If set, ginkgo will stop running a test suite after a
 | |
| failure occurs.
 | |
| 
 | |
| --ginkgo.failOnPending=false: If set, ginkgo will mark the test suite as failed
 | |
| if any specs are pending.
 | |
| 
 | |
| --ginkgo.focus="": If set, ginkgo will only run specs that match this regular
 | |
| expression.
 | |
| 
 | |
| --ginkgo.skip="": If set, ginkgo will only run specs that do not match this
 | |
| regular expression.
 | |
| 
 | |
| --ginkgo.trace=false: If set, default reporter prints out the full stack trace
 | |
| when a failure occurs
 | |
| 
 | |
| --ginkgo.v=false: If set, default reporter print out all specs as they begin.
 | |
| 
 | |
| --host="": The host, or api-server, to connect to
 | |
| 
 | |
| --kubeconfig="": Path to kubeconfig containing embedded authinfo.
 | |
| 
 | |
| --prom-push-gateway="": The URL to prometheus gateway, so that metrics can be
 | |
| pushed during e2es and scraped by prometheus. Typically something like
 | |
| 127.0.0.1:9091.
 | |
| 
 | |
| --provider="": The name of the Kubernetes provider (gce, gke, local, vagrant,
 | |
| etc.)
 | |
| 
 | |
| --repo-root="../../": Root directory of kubernetes repository, for finding test
 | |
| files.
 | |
| ```
 | |
| 
 | |
| Prior to running the tests, you may want to first create a simple auth file in
 | |
| your home directory, e.g. `$HOME/.kube/config`, with the following:
 | |
| 
 | |
| ```
 | |
| {
 | |
|   "User": "root",
 | |
|   "Password": ""
 | |
| }
 | |
| ```
 | |
| 
 | |
| As mentioned earlier there are a host of other options that are available, but
 | |
| they are left to the developer.
 | |
| 
 | |
| **NOTE:** If you are running tests on a local cluster repeatedly, you may need
 | |
| to periodically perform some manual cleanup:
 | |
| 
 | |
|   - `rm -rf /var/run/kubernetes`, clear kube generated credentials, sometimes
 | |
| stale permissions can cause problems.
 | |
| 
 | |
|   - `sudo iptables -F`, clear ip tables rules left by the kube-proxy.
 | |
| 
 | |
| ### Federation e2e tests
 | |
| 
 | |
| By default, `e2e.go` provisions a single Kubernetes cluster, and any `Feature:Federation` ginkgo tests will be skipped.
 | |
| 
 | |
| Federation e2e testing involve bringing up multiple "underlying" Kubernetes clusters,
 | |
| and deploying the federation control plane as a Kubernetes application on the underlying clusters.
 | |
| 
 | |
| The federation e2e tests are still managed via `e2e.go`, but require some extra configuration items.
 | |
| 
 | |
| #### Configuring federation e2e tests
 | |
| 
 | |
| The following environment variables will enable federation e2e building, provisioning and testing.
 | |
| 
 | |
| ```sh
 | |
| $ export FEDERATION=true
 | |
| $ export E2E_ZONES="us-central1-a us-central1-b us-central1-f"
 | |
| ```
 | |
| 
 | |
| A Kubernetes cluster will be provisioned in each zone listed in `E2E_ZONES`. A zone can only appear once in the `E2E_ZONES` list.
 | |
| 
 | |
| #### Image Push Repository
 | |
| 
 | |
| Next, specify the docker repository where your ci images will be pushed.
 | |
| 
 | |
| * **If `KUBERNETES_PROVIDER=gce` or `KUBERNETES_PROVIDER=gke`**:
 | |
| 
 | |
|   If you use the same GCP project where you to run the e2e tests as the container image repository,
 | |
|   FEDERATION_PUSH_REPO_BASE environment variable will be defaulted to "gcr.io/${DEFAULT_GCP_PROJECT_NAME}".
 | |
|   You can skip ahead to the **Build** section.
 | |
| 
 | |
| 	You can simply set your push repo base based on your project name, and the necessary repositories will be
 | |
|   auto-created when you first push your container images.
 | |
| 
 | |
| 	```sh
 | |
| 	$ export FEDERATION_PUSH_REPO_BASE="gcr.io/${GCE_PROJECT_NAME}"
 | |
| 	```
 | |
| 
 | |
| 	Skip ahead to the **Build** section.
 | |
| 
 | |
| * **For all other providers**:
 | |
| 
 | |
| 	You'll be responsible for creating and managing access to the repositories manually.
 | |
| 
 | |
| 	```sh
 | |
| 	$ export FEDERATION_PUSH_REPO_BASE="quay.io/colin_hom"
 | |
| 	```
 | |
| 
 | |
| 	Given this example, the `federation-apiserver` container image will be pushed to the repository
 | |
| 	`quay.io/colin_hom/federation-apiserver`.
 | |
| 
 | |
| 	The docker client on the machine running `e2e.go` must have push access for the following pre-existing repositories:
 | |
| 
 | |
| 	* `${FEDERATION_PUSH_REPO_BASE}/federation-apiserver`
 | |
| 	* `${FEDERATION_PUSH_REPO_BASE}/federation-controller-manager`
 | |
| 
 | |
| 	These repositories must allow public read access, as the e2e node docker daemons will not have any credentials. If you're using
 | |
| 	GCE/GKE as your provider, the repositories will have read-access by default.
 | |
| 
 | |
| #### Build
 | |
| 
 | |
| * Compile the binaries and build container images:
 | |
| 
 | |
|   ```sh
 | |
|   $ KUBE_RELEASE_RUN_TESTS=n KUBE_FASTBUILD=true go run hack/e2e.go -v -build
 | |
|   ```
 | |
| 
 | |
| * Push the federation container images
 | |
| 
 | |
|   ```sh
 | |
|   $ build-tools/push-federation-images.sh
 | |
|   ```
 | |
| 
 | |
| #### Deploy federation control plane
 | |
| 
 | |
| The following command will create the underlying Kubernetes clusters in each of `E2E_ZONES`, and then provision the
 | |
| federation control plane in the cluster occupying the last zone in the `E2E_ZONES` list.
 | |
| 
 | |
| ```sh
 | |
| $ go run hack/e2e.go -v --up
 | |
| ```
 | |
| 
 | |
| #### Run the Tests
 | |
| 
 | |
| This will run only the `Feature:Federation` e2e tests. You can omit the `ginkgo.focus` argument to run the entire e2e suite.
 | |
| 
 | |
| ```sh
 | |
| $ go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\[Feature:Federation\]"
 | |
| ```
 | |
| 
 | |
| #### Teardown
 | |
| 
 | |
| ```sh
 | |
| $ go run hack/e2e.go -v --down
 | |
| ```
 | |
| 
 | |
| #### Shortcuts for test developers
 | |
| 
 | |
| * To speed up `e2e.go -up`, provision a single-node kubernetes cluster in a single e2e zone:
 | |
| 
 | |
|   `NUM_NODES=1 E2E_ZONES="us-central1-f"`
 | |
| 
 | |
|   Keep in mind that some tests may require multiple underlying clusters and/or minimum compute resource availability.
 | |
| 
 | |
| * You can quickly recompile the e2e testing framework via `go install ./test/e2e`. This will not do anything besides
 | |
|   allow you to verify that the go code compiles.
 | |
| 
 | |
| * If you want to run your e2e testing framework without re-provisioning the e2e setup, you can do so via
 | |
|   `make WHAT=test/e2e/e2e.test` and then re-running the ginkgo tests.
 | |
| 
 | |
| * If you're hacking around with the federation control plane deployment itself,
 | |
|   you can quickly re-deploy the federation control plane Kubernetes manifests without tearing any resources down.
 | |
|   To re-deploy the federation control plane after running `-up` for the first time:
 | |
| 
 | |
|   ```sh
 | |
|   $ federation/cluster/federation-up.sh
 | |
|   ```
 | |
| 
 | |
| ### Debugging clusters
 | |
| 
 | |
| If a cluster fails to initialize, or you'd like to better understand cluster
 | |
| state to debug a failed e2e test, you can use the `cluster/log-dump.sh` script
 | |
| to gather logs.
 | |
| 
 | |
| This script requires that the cluster provider supports ssh. Assuming it does,
 | |
| running:
 | |
| 
 | |
| ```
 | |
| cluster/log-dump.sh <directory>
 | |
| ````
 | |
| 
 | |
| will ssh to the master and all nodes and download a variety of useful logs to
 | |
| the provided directory (which should already exist).
 | |
| 
 | |
| The Google-run Jenkins builds automatically collected these logs for every
 | |
| build, saving them in the `artifacts` directory uploaded to GCS.
 | |
| 
 | |
| ### Local clusters
 | |
| 
 | |
| It can be much faster to iterate on a local cluster instead of a cloud-based
 | |
| one. To start a local cluster, you can run:
 | |
| 
 | |
| ```sh
 | |
| # The PATH construction is needed because PATH is one of the special-cased
 | |
| # environment variables not passed by sudo -E
 | |
| sudo PATH=$PATH hack/local-up-cluster.sh
 | |
| ```
 | |
| 
 | |
| This will start a single-node Kubernetes cluster than runs pods using the local
 | |
| docker daemon. Press Control-C to stop the cluster.
 | |
| 
 | |
| You can generate a valid kubeconfig file by following instructions printed at the
 | |
| end of aforementioned script.
 | |
| 
 | |
| #### Testing against local clusters
 | |
| 
 | |
| In order to run an E2E test against a locally running cluster, point the tests
 | |
| at a custom host directly:
 | |
| 
 | |
| ```sh
 | |
| export KUBECONFIG=/path/to/kubeconfig
 | |
| export KUBE_MASTER_IP="http://127.0.0.1:<PORT>"
 | |
| export KUBE_MASTER=local
 | |
| go run hack/e2e.go -v --test
 | |
| ```
 | |
| 
 | |
| To control the tests that are run:
 | |
| 
 | |
| ```sh
 | |
| go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\"Secrets\""
 | |
| ```
 | |
| 
 | |
| ### Version-skewed and upgrade testing
 | |
| 
 | |
| We run version-skewed tests to check that newer versions of Kubernetes work
 | |
| similarly enough to older versions.  The general strategy is to cover the following cases:
 | |
| 
 | |
| 1. One version of `kubectl` with another version of the cluster and tests (e.g.
 | |
|    that v1.2 and v1.4 `kubectl` doesn't break v1.3 tests running against a v1.3
 | |
|    cluster).
 | |
| 1. A newer version of the Kubernetes master with older nodes and tests (e.g.
 | |
|    that upgrading a master to v1.3 with nodes at v1.2 still passes v1.2 tests).
 | |
| 1. A newer version of the whole cluster with older tests (e.g. that a cluster
 | |
|    upgraded---master and nodes---to v1.3 still passes v1.2 tests).
 | |
| 1. That an upgraded cluster functions the same as a brand-new cluster of the
 | |
|    same version (e.g. a cluster upgraded to v1.3 passes the same v1.3 tests as
 | |
|    a newly-created v1.3 cluster).
 | |
| 
 | |
| [hack/e2e-runner.sh](http://releases.k8s.io/HEAD/hack/jenkins/e2e-runner.sh) is
 | |
| the authoritative source on how to run version-skewed tests, but below is a
 | |
| quick-and-dirty tutorial.
 | |
| 
 | |
| ```sh
 | |
| # Assume you have two copies of the Kubernetes repository checked out, at
 | |
| # ./kubernetes and ./kubernetes_old
 | |
| 
 | |
| # If using GKE:
 | |
| export KUBERNETES_PROVIDER=gke
 | |
| export CLUSTER_API_VERSION=${OLD_VERSION}
 | |
| 
 | |
| # Deploy a cluster at the old version; see above for more details
 | |
| cd ./kubernetes_old
 | |
| go run ./hack/e2e.go -v --up
 | |
| 
 | |
| # Upgrade the cluster to the new version
 | |
| #
 | |
| # If using GKE, add --upgrade-target=${NEW_VERSION}
 | |
| #
 | |
| # You can target Feature:MasterUpgrade or Feature:ClusterUpgrade
 | |
| cd ../kubernetes
 | |
| go run ./hack/e2e.go -v --test --check_version_skew=false --test_args="--ginkgo.focus=\[Feature:MasterUpgrade\]"
 | |
| 
 | |
| # Run old tests with new kubectl
 | |
| cd ../kubernetes_old
 | |
| go run ./hack/e2e.go -v --test --test_args="--kubectl-path=$(pwd)/../kubernetes/cluster/kubectl.sh"
 | |
| ```
 | |
| 
 | |
| If you are just testing version-skew, you may want to just deploy at one
 | |
| version and then test at another version, instead of going through the whole
 | |
| upgrade process:
 | |
| 
 | |
| ```sh
 | |
| # With the same setup as above
 | |
| 
 | |
| # Deploy a cluster at the new version
 | |
| cd ./kubernetes
 | |
| go run ./hack/e2e.go -v --up
 | |
| 
 | |
| # Run new tests with old kubectl
 | |
| go run ./hack/e2e.go -v --test --test_args="--kubectl-path=$(pwd)/../kubernetes_old/cluster/kubectl.sh"
 | |
| 
 | |
| # Run old tests with new kubectl
 | |
| cd ../kubernetes_old
 | |
| go run ./hack/e2e.go -v --test --test_args="--kubectl-path=$(pwd)/../kubernetes/cluster/kubectl.sh"
 | |
| ```
 | |
| 
 | |
| ## Kinds of tests
 | |
| 
 | |
| We are working on implementing clearer partitioning of our e2e tests to make
 | |
| running a known set of tests easier (#10548). Tests can be labeled with any of
 | |
| the following labels, in order of increasing precedence (that is, each label
 | |
| listed below supersedes the previous ones):
 | |
| 
 | |
|   - If a test has no labels, it is expected to run fast (under five minutes), be
 | |
| able to be run in parallel, and be consistent.
 | |
| 
 | |
|   - `[Slow]`: If a test takes more than five minutes to run (by itself or in
 | |
| parallel with many other tests), it is labeled `[Slow]`. This partition allows
 | |
| us to run almost all of our tests quickly in parallel, without waiting for the
 | |
| stragglers to finish.
 | |
| 
 | |
|   - `[Serial]`: If a test cannot be run in parallel with other tests (e.g. it
 | |
| takes too many resources or restarts nodes), it is labeled `[Serial]`, and
 | |
| should be run in serial as part of a separate suite.
 | |
| 
 | |
|   - `[Disruptive]`: If a test restarts components that might cause other tests
 | |
| to fail or break the cluster completely, it is labeled `[Disruptive]`. Any
 | |
| `[Disruptive]` test is also assumed to qualify for the `[Serial]` label, but
 | |
| need not be labeled as both. These tests are not run against soak clusters to
 | |
| avoid restarting components.
 | |
| 
 | |
|   - `[Flaky]`: If a test is found to be flaky and we have decided that it's too
 | |
| hard to fix in the short term (e.g. it's going to take a full engineer-week), it
 | |
| receives the `[Flaky]` label until it is fixed. The `[Flaky]` label should be
 | |
| used very sparingly, and should be accompanied with a reference to the issue for
 | |
| de-flaking the test, because while a test remains labeled `[Flaky]`, it is not
 | |
| monitored closely in CI. `[Flaky]` tests are by default not run, unless a
 | |
| `focus` or `skip` argument is explicitly given.
 | |
| 
 | |
|   - `[Feature:.+]`: If a test has non-default requirements to run or targets
 | |
| some non-core functionality, and thus should not be run as part of the standard
 | |
| suite, it receives a `[Feature:.+]` label, e.g. `[Feature:Performance]` or
 | |
| `[Feature:Ingress]`. `[Feature:.+]` tests are not run in our core suites,
 | |
| instead running in custom suites. If a feature is experimental or alpha and is
 | |
| not enabled by default due to being incomplete or potentially subject to
 | |
| breaking changes, it does *not* block the merge-queue, and thus should run in
 | |
| some separate test suites owned by the feature owner(s)
 | |
| (see [Continuous Integration](#continuous-integration) below).
 | |
| 
 | |
| ### Viper configuration and hierarchichal test parameters.
 | |
| 
 | |
| The future of e2e test configuration idioms will be increasingly defined using viper, and decreasingly via flags.
 | |
| 
 | |
| Flags in general fall apart once tests become sufficiently complicated.  So, even if we could use another flag library, it wouldn't be ideal.
 | |
| 
 | |
| To use viper, rather than flags, to configure your tests:
 | |
| 
 | |
| - Just add "e2e.json" to the current directory you are in, and define parameters in it... i.e. `"kubeconfig":"/tmp/x"`.
 | |
| 
 | |
| Note that advanced testing parameters, and hierarchichally defined parameters, are only defined in viper, to see what they are, you can dive into [TestContextType](../../test/e2e/framework/test_context.go).
 | |
| 
 | |
| In time, it is our intent to add or autogenerate a sample viper configuration that includes all e2e parameters, to ship with kubernetes.
 | |
| 
 | |
| ### Conformance tests
 | |
| 
 | |
| Finally, `[Conformance]` tests represent a subset of the e2e-tests we expect to
 | |
| pass on **any** Kubernetes cluster. The `[Conformance]` label does not supersede
 | |
| any other labels.
 | |
| 
 | |
| As each new release of Kubernetes providers new functionality, the subset of
 | |
| tests necessary to demonstrate conformance grows with each release. Conformance
 | |
| is thus considered versioned, with the same backwards compatibility guarantees
 | |
| as laid out in [our versioning policy](../design/versioning.md#supported-releases).
 | |
| Conformance tests for a given version should be run off of the release branch
 | |
| that corresponds to that version. Thus `v1.2` conformance tests would be run
 | |
| from the head of the `release-1.2` branch. eg:
 | |
| 
 | |
|  - A v1.3 development cluster should pass v1.1, v1.2 conformance tests
 | |
| 
 | |
|  - A v1.2 cluster should pass v1.1, v1.2 conformance tests
 | |
| 
 | |
|  - A v1.1 cluster should pass v1.0, v1.1 conformance tests, and fail v1.2
 | |
| conformance tests
 | |
| 
 | |
| Conformance tests are designed to be run with no cloud provider configured.
 | |
| Conformance tests can be run against clusters that have not been created with
 | |
| `hack/e2e.go`, just provide a kubeconfig with the appropriate endpoint and
 | |
| credentials.
 | |
| 
 | |
| ```sh
 | |
| # setup for conformance tests
 | |
| export KUBECONFIG=/path/to/kubeconfig
 | |
| export KUBERNETES_CONFORMANCE_TEST=y
 | |
| export KUBERNETES_PROVIDER=skeleton
 | |
| 
 | |
| # run all conformance tests
 | |
| go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\[Conformance\]"
 | |
| 
 | |
| # run all parallel-safe conformance tests in parallel
 | |
| GINKGO_PARALLEL=y go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\[Conformance\] --ginkgo.skip=\[Serial\]"
 | |
| 
 | |
| # ... and finish up with remaining tests in serial
 | |
| go run hack/e2e.go -v --test --test_args="--ginkgo.focus=\[Serial\].*\[Conformance\]"
 | |
| ```
 | |
| 
 | |
| ### Defining Conformance Subset
 | |
| 
 | |
| It is impossible to define the entire space of Conformance tests without knowing
 | |
| the future, so instead, we define the compliment of conformance tests, below
 | |
| (`Please update this with companion PRs as necessary`):
 | |
| 
 | |
|   - A conformance test cannot test cloud provider specific features (i.e. GCE
 | |
| monitoring, S3 Bucketing, ...)
 | |
| 
 | |
|   - A conformance test cannot rely on any particular non-standard file system
 | |
| permissions granted to containers or users (i.e. sharing writable host /tmp with
 | |
| a container)
 | |
| 
 | |
|   - A conformance test cannot rely on any binaries that are not required for the
 | |
| linux kernel or for a kubelet to run (i.e. git)
 | |
| 
 | |
|   - A conformance test cannot test a feature which obviously cannot be supported
 | |
| on a broad range of platforms (i.e. testing of multiple disk mounts, GPUs, high
 | |
| density)
 | |
| 
 | |
| ## Continuous Integration
 | |
| 
 | |
| A quick overview of how we run e2e CI on Kubernetes.
 | |
| 
 | |
| ### What is CI?
 | |
| 
 | |
| We run a battery of `e2e` tests against `HEAD` of the master branch on a
 | |
| continuous basis, and block merges via the [submit
 | |
| queue](http://submit-queue.k8s.io/) on a subset of those tests if they fail (the
 | |
| subset is defined in the [munger config]
 | |
| (https://github.com/kubernetes/contrib/blob/master/mungegithub/mungers/submit-queue.go)
 | |
| via the `jenkins-jobs` flag; note we also block on	`kubernetes-build` and
 | |
| `kubernetes-test-go` jobs for build and unit and integration tests).
 | |
| 
 | |
| CI results can be found at [ci-test.k8s.io](http://ci-test.k8s.io), e.g.
 | |
| [ci-test.k8s.io/kubernetes-e2e-gce/10594](http://ci-test.k8s.io/kubernetes-e2e-gce/10594).
 | |
| 
 | |
| ### What runs in CI?
 | |
| 
 | |
| We run all default tests (those that aren't marked `[Flaky]` or `[Feature:.+]`)
 | |
| against GCE and GKE. To minimize the time from regression-to-green-run, we
 | |
| partition tests across different jobs:
 | |
| 
 | |
|   - `kubernetes-e2e-<provider>` runs all non-`[Slow]`, non-`[Serial]`,
 | |
| non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel.
 | |
| 
 | |
|   - `kubernetes-e2e-<provider>-slow` runs all `[Slow]`, non-`[Serial]`,
 | |
| non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel.
 | |
| 
 | |
|   - `kubernetes-e2e-<provider>-serial` runs all `[Serial]` and `[Disruptive]`,
 | |
| non-`[Flaky]`, non-`[Feature:.+]` tests in serial.
 | |
| 
 | |
| We also run non-default tests if the tests exercise general-availability ("GA")
 | |
| features that require a special environment to run in, e.g.
 | |
| `kubernetes-e2e-gce-scalability` and `kubernetes-kubemark-gce`, which test for
 | |
| Kubernetes performance.
 | |
| 
 | |
| #### Non-default tests
 | |
| 
 | |
| Many `[Feature:.+]` tests we don't run in CI. These tests are for features that
 | |
| are experimental (often in the `experimental` API), and aren't enabled by
 | |
| default.
 | |
| 
 | |
| ### The PR-builder
 | |
| 
 | |
| We also run a battery of tests against every PR before we merge it. These tests
 | |
| are equivalent to `kubernetes-gce`: it runs all non-`[Slow]`, non-`[Serial]`,
 | |
| non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel. These
 | |
| tests are considered "smoke tests" to give a decent signal that the PR doesn't
 | |
| break most functionality. Results for your PR can be found at
 | |
| [pr-test.k8s.io](http://pr-test.k8s.io), e.g.
 | |
| [pr-test.k8s.io/20354](http://pr-test.k8s.io/20354) for #20354.
 | |
| 
 | |
| ### Adding a test to CI
 | |
| 
 | |
| As mentioned above, prior to adding a new test, it is a good idea to perform a
 | |
| `-ginkgo.dryRun=true` on the system, in order to see if a behavior is already
 | |
| being tested, or to determine if it may be possible to augment an existing set
 | |
| of tests for a specific use case.
 | |
| 
 | |
| If a behavior does not currently have coverage and a developer wishes to add a
 | |
| new e2e test, navigate to the ./test/e2e directory and create a new test using
 | |
| the existing suite as a guide.
 | |
| 
 | |
| TODO(#20357): Create a self-documented example which has been disabled, but can
 | |
| be copied to create new tests and outlines the capabilities and libraries used.
 | |
| 
 | |
| When writing a test, consult #kinds_of_tests above to determine how your test
 | |
| should be marked, (e.g. `[Slow]`, `[Serial]`; remember, by default we assume a
 | |
| test can run in parallel with other tests!).
 | |
| 
 | |
| When first adding a test it should *not* go straight into CI, because failures
 | |
| block ordinary development. A test should only be added to CI after is has been
 | |
| running in some non-CI suite long enough to establish a track record showing
 | |
| that the test does not fail when run against *working* software. Note also that
 | |
| tests running in CI are generally running on a well-loaded cluster, so must
 | |
| contend for resources; see above about [kinds of tests](#kinds_of_tests).
 | |
| 
 | |
| Generally, a feature starts as `experimental`, and will be run in some suite
 | |
| owned by the team developing the feature. If a feature is in beta or GA, it
 | |
| *should* block the merge-queue. In moving from experimental to beta or GA, tests
 | |
| that are expected to pass by default should simply remove the `[Feature:.+]`
 | |
| label, and will be incorporated into our core suites. If tests are not expected
 | |
| to pass by default, (e.g. they require a special environment such as added
 | |
| quota,) they should remain with the `[Feature:.+]` label, and the suites that
 | |
| run them should be incorporated into the
 | |
| [munger config](https://github.com/kubernetes/contrib/blob/master/mungegithub/mungers/submit-queue.go)
 | |
| via the `jenkins-jobs` flag.
 | |
| 
 | |
| Occasionally, we'll want to add tests to better exercise features that are
 | |
| already GA. These tests also shouldn't go straight to CI. They should begin by
 | |
| being marked as `[Flaky]` to be run outside of CI, and once a track-record for
 | |
| them is established, they may be promoted out of `[Flaky]`.
 | |
| 
 | |
| ### Moving a test out of CI
 | |
| 
 | |
| If we have determined that a test is known-flaky and cannot be fixed in the
 | |
| short-term, we may move it out of CI indefinitely. This move should be used
 | |
| sparingly, as it effectively means that we have no coverage of that test. When a
 | |
| test is demoted, it should be marked `[Flaky]` with a comment accompanying the
 | |
| label with a reference to an issue opened to fix the test.
 | |
| 
 | |
| ## Performance Evaluation
 | |
| 
 | |
| Another benefit of the e2e tests is the ability to create reproducible loads on
 | |
| the system, which can then be used to determine the responsiveness, or analyze
 | |
| other characteristics of the system. For example, the density tests load the
 | |
| system to 30,50,100 pods per/node and measures the different characteristics of
 | |
| the system, such as throughput, api-latency, etc.
 | |
| 
 | |
| For a good overview of how we analyze performance data, please read the
 | |
| following [post](http://blog.kubernetes.io/2015/09/kubernetes-performance-measurements-and.html)
 | |
| 
 | |
| For developers who are interested in doing their own performance analysis, we
 | |
| recommend setting up [prometheus](http://prometheus.io/) for data collection,
 | |
| and using [promdash](http://prometheus.io/docs/visualization/promdash/) to
 | |
| visualize the data.  There also exists the option of pushing your own metrics in
 | |
| from the tests using a
 | |
| [prom-push-gateway](http://prometheus.io/docs/instrumenting/pushing/).
 | |
| Containers for all of these components can be found
 | |
| [here](https://hub.docker.com/u/prom/).
 | |
| 
 | |
| For more accurate measurements, you may wish to set up prometheus external to
 | |
| kubernetes in an environment where it can access the major system components
 | |
| (api-server, controller-manager, scheduler). This is especially useful when
 | |
| attempting to gather metrics in a load-balanced api-server environment, because
 | |
| all api-servers can be analyzed independently as well as collectively. On
 | |
| startup, configuration file is passed to prometheus that specifies the endpoints
 | |
| that prometheus will scrape, as well as the sampling interval.
 | |
| 
 | |
| ```
 | |
| #prometheus.conf
 | |
| job: {
 | |
|   name: "kubernetes"
 | |
|   scrape_interval: "1s"
 | |
|   target_group: {
 | |
|     # apiserver(s)
 | |
|     target: "http://localhost:8080/metrics"
 | |
|     # scheduler
 | |
|     target: "http://localhost:10251/metrics"
 | |
|     # controller-manager
 | |
|     target: "http://localhost:10252/metrics"
 | |
|   }
 | |
| }
 | |
| ```
 | |
| 
 | |
| Once prometheus is scraping the kubernetes endpoints, that data can then be
 | |
| plotted using promdash, and alerts can be created against the assortment of
 | |
| metrics that kubernetes provides.
 | |
| 
 | |
| ## One More Thing
 | |
| 
 | |
| You should also know the [testing conventions](coding-conventions.md#testing-conventions).
 | |
| 
 | |
| **HAPPY TESTING!**
 | |
| 
 | |
| 
 | |
| 
 | |
| <!-- BEGIN MUNGE: GENERATED_ANALYTICS -->
 | |
| []()
 | |
| <!-- END MUNGE: GENERATED_ANALYTICS -->
 |