From 63b7ba0414ab85745b5f458c0011edeadbd2696f Mon Sep 17 00:00:00 2001 From: Prashanth Balasubramanian Date: Wed, 25 Nov 2015 10:50:44 -0800 Subject: [PATCH 1/4] [wip] CI testing guidelines --- docs/devel/testing.md | 66 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+) create mode 100644 docs/devel/testing.md diff --git a/docs/devel/testing.md b/docs/devel/testing.md new file mode 100644 index 00000000000..ed51dcc68fa --- /dev/null +++ b/docs/devel/testing.md @@ -0,0 +1,66 @@ + + + + +WARNING +WARNING +WARNING +WARNING +WARNING + +

PLEASE NOTE: This document applies to the HEAD of the source tree

+ +If you are using a released version of Kubernetes, you should +refer to the docs that go with that version. + + +The latest release of this document can be found +[here](http://releases.k8s.io/release-1.1/docs/devel/testing.md). + +Documentation for other releases can be found at +[releases.k8s.io](http://releases.k8s.io). + +-- + + + + +Kubernetes Commit Queue Testing +=============================== + +A quick overview of how we add, remove and recycle tests from CI. + +## What is CI? + +Throughout this document we will refer to CI as any suite of e2e tests that can potentially hold up the submit queue, this means Kubernetes PRs must pass these tests prior to getting merged. + +## Adding a test to CI + +When first adding a test it should *not* go straight into CI, because failures block ordinary development. A test should only be added to CI after is has been running in some non-CI suite long enought to establish a track record showing that the test does not fail when run against *working* software. A suite named `flaky` exists, and can be overloaded to mean `experimental` and used for this reason (can it really?). In addition to this track record, consider the following as requirements: +* The test must be short (20m?) +* Failures must indicate that the product is unfit (TODO: establish a firmer bar, what I'm trying to say is testing random controller X in CI doesn't help anyone, but maybe it does if controller X is a cluster addon, or our largest customer wants X, or what?) +* Failures must reliably indicate a bug in the product, not a bug in the test + +(TODO: is there a parallelism requirement here?) + +## Moving a test out of CI + +Do *not* move a test to flaky as soon as it starts failing just to clear up the submit queue, this risks introducing more bugs and compounding the problem even further (TODO: or do this? but why). Build cop can use their better judgement to call a test `flaky`. This means it fails for presumably random reasons, once in X runs (TODO: is X == 0?). Move flaky tests out of CI, create a P0/1 bug and try to triage it along to the right person. Adding the `kind/flake` label on github will grab the attention of the grumpy CI shamer bot, which will include the bug in its daily report. + +If your test got moved to flaky, it must demonstrate the run of greens required for getting added to CI once again (or nah?). + +## Non CI channels for testing + +If you want to test against Kubernetes but your test doesn't meet the following requirements, peruse [this list](../../hack/jenkins/e2e.sh) and add your test in whatever makes sense (TODO: What about release-lists, shouldn't they mirror other lists?). + + + + + +[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/docs/devel/testing.md?pixel)]() + From b6cb8be2bda76f00196b25241ea5830162b08768 Mon Sep 17 00:00:00 2001 From: Isaac Hollander McCreery Date: Fri, 29 Jan 2016 16:20:53 -0800 Subject: [PATCH 2/4] CI testing guidelines redux --- docs/devel/e2e-tests.md | 45 ++++++++++++++++++++++++---- docs/devel/testing.md | 66 ----------------------------------------- 2 files changed, 40 insertions(+), 71 deletions(-) delete mode 100644 docs/devel/testing.md diff --git a/docs/devel/e2e-tests.md b/docs/devel/e2e-tests.md index 388e25f08e6..63ebd16e25c 100644 --- a/docs/devel/e2e-tests.md +++ b/docs/devel/e2e-tests.md @@ -112,19 +112,54 @@ We are working on implementing clearer partitioning of our e2e tests to make run - `[Disruptive]`: If a test restarts components that might cause other tests to fail or break the cluster completely, it is labeled `[Disruptive]`. Any `[Disruptive]` test is also assumed to qualify for the `[Serial]` label, but need not be labeled as both. These tests are not run against soak clusters to avoid restarting components. - `[Flaky]`: If a test is found to be flaky, it receives the `[Flaky]` label until it is fixed. A `[Flaky]` label should be accompanied with a reference to the issue for de-flaking the test, because while a test remains labeled `[Flaky]`, it is not monitored closely in CI. `[Flaky]` tests are by default not run, unless a `focus` or `skip` argument is explicitly given. - `[Skipped]`: `[Skipped]` is a legacy label that we're phasing out. If a test is marked `[Skipped]`, there should be an issue open to label it properly. `[Skipped]` tests are by default not run, unless a `focus` or `skip` argument is explicitly given. -- `[Feature:...]`: If a test has non-default requirements to run or targets some non-core functionality, and thus should not be run as part of the standard suite, it receives a `[Feature:...]` label, e.g. `[Feature:Performance]` or `[Feature:Ingress]`. `[Feature:...]` tests are not run in our core suites, instead running in custom suites. There are a few use-cases for `[Feature:...]` tests: - - If a feature is experimental or alpha and is not enabled by default due to being incomplete or potentially subject to breaking changes, it should *not* block the merge-queue, and thus should run in some separate test suites owned by the feature owner(s). - - If a feature is in beta or GA, it *should* block the merge-queue. In moving from experimental to beta or GA, tests that are expected to pass by default should simply remove the `[Feature:...]` label, and will be incorporated into our core suites. If tests are not expected to pass by default, (e.g. they require a special environment such as added quota,) they should remain with the `[Feature:...]` label, and the suites that run them should be incorporated into our merge-queue, owned by the Build Cop. +- `[Feature:.+]`: If a test has non-default requirements to run or targets some non-core functionality, and thus should not be run as part of the standard suite, it receives a `[Feature:.+]` label, e.g. `[Feature:Performance]` or `[Feature:Ingress]`. `[Feature:.+]` tests are not run in our core suites, instead running in custom suites. There are a few use-cases for `[Feature:.+]` tests: + - If a feature is experimental or alpha and is not enabled by default due to being incomplete or potentially subject to breaking changes, it does *not* block the merge-queue, and thus should run in some separate test suites owned by the feature owner(s) (see #continuous_integration below). Finally, `[Conformance]` tests are tests we expect to pass on **any** Kubernetes cluster. The `[Conformance]` label does not supersede any other labels. `[Conformance]` test policies are a work-in-progress; see #18162. -## Adding a New Test +## Continuous Integration + +A quick overview of how we run e2e CI on Kubernetes. + +### What is CI? + +We run a battery of `e2e` tests against `HEAD` of the master branch on a continuous basis, and block merges via the [submit queue](http://submit-queue.k8s.io/) on a subset of those tests if they fail (the subset is defined in the [munger config](https://github.com/kubernetes/contrib/blob/master/mungegithub/mungers/submit-queue.go) via the `jenkins-jobs` flag; note we also block on `kubernetes-build` and `kubernetes-test-go` jobs for build and unit and integration tests). + +CI results can be found at [ci-test.k8s.io](ci-test.k8s.io), e.g. [ci-test.k8s.io/kubernetes-e2e-gce/10594](ci-test.k8s.io/kubernetes-e2e-gce/10594). + +### What runs in CI? + +We run all default tests (those that aren't marked `[Flaky]` or `[Feature:.+]`) against GCE and GKE. To minimize the time from regression-to-green-run, we partition tests across different jobs: + +- `kubernetes-` runs all non-`[Slow]`, non-`[Serial]`, non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel. +- `kubernetes--slow` runs all `[Slow]`, non-`[Serial]`, non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel. +- `kubernetes--serial` runs all `[Serial]` and `[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in serial. + +We also run non-default tests if the tests exercise general-availability ("GA") features that require a special environment to run in, e.g. `kubernetes-e2e-gce-scalability` and `kubernetes-kubemark-gce`, which test for Kubernetes performance. + +#### Non-default tests + +Many `[Feature:.+]` tests we don't run in CI. These tests are for features that are experimental (often in the `experimental` API), and aren't enabled by default. + +### Adding a test to CI As mentioned above, prior to adding a new test, it is a good idea to perform a `-ginkgo.dryRun=true` on the system, in order to see if a behavior is already being tested, or to determine if it may be possible to augment an existing set of tests for a specific use case. If a behavior does not currently have coverage and a developer wishes to add a new e2e test, navigate to the ./test/e2e directory and create a new test using the existing suite as a guide. -**TODO:** Create a self-documented example which has been disabled, but can be copied to create new tests and outlines the capabilities and libraries used. +TODO(#20357): Create a self-documented example which has been disabled, but can be copied to create new tests and outlines the capabilities and libraries used. + +When writing a test, consult #kinds_of_tests above to determine how your test should be marked, (e.g. `[Slow]`, `[Serial]`; remember, by default we assume a test can run in parallel with other tests!). + +When first adding a test it should *not* go straight into CI, because failures block ordinary development. A test should only be added to CI after is has been running in some non-CI suite long enough to establish a track record showing that the test does not fail when run against *working* software. + +Generally, a feature starts as `experimental`, and will be run in some suite owned by the team developing the feature. If a feature is in beta or GA, it *should* block the merge-queue. In moving from experimental to beta or GA, tests that are expected to pass by default should simply remove the `[Feature:.+]` label, and will be incorporated into our core suites. If tests are not expected to pass by default, (e.g. they require a special environment such as added quota,) they should remain with the `[Feature:.+]` label, and the suites that run them should be incorporated into the [munger config](https://github.com/kubernetes/contrib/blob/master/mungegithub/mungers/submit-queue.go) via the `jenkins-jobs` flag. + +Occasionally, we'll want to add tests to better exercise features that are already GA. These tests also shouldn't go straight to CI. They should begin by being marked as `[Flaky]` to be run outside of CI, and once a track-record for them is established, they may be promoted out of `[Flaky]`. + +### Moving a test out of CI + +TODO(ihmccreery) do we want to keep the `[Flaky]` label at all? ## Performance Evaluation diff --git a/docs/devel/testing.md b/docs/devel/testing.md deleted file mode 100644 index ed51dcc68fa..00000000000 --- a/docs/devel/testing.md +++ /dev/null @@ -1,66 +0,0 @@ - - - - -WARNING -WARNING -WARNING -WARNING -WARNING - -

PLEASE NOTE: This document applies to the HEAD of the source tree

- -If you are using a released version of Kubernetes, you should -refer to the docs that go with that version. - - -The latest release of this document can be found -[here](http://releases.k8s.io/release-1.1/docs/devel/testing.md). - -Documentation for other releases can be found at -[releases.k8s.io](http://releases.k8s.io). - --- - - - - -Kubernetes Commit Queue Testing -=============================== - -A quick overview of how we add, remove and recycle tests from CI. - -## What is CI? - -Throughout this document we will refer to CI as any suite of e2e tests that can potentially hold up the submit queue, this means Kubernetes PRs must pass these tests prior to getting merged. - -## Adding a test to CI - -When first adding a test it should *not* go straight into CI, because failures block ordinary development. A test should only be added to CI after is has been running in some non-CI suite long enought to establish a track record showing that the test does not fail when run against *working* software. A suite named `flaky` exists, and can be overloaded to mean `experimental` and used for this reason (can it really?). In addition to this track record, consider the following as requirements: -* The test must be short (20m?) -* Failures must indicate that the product is unfit (TODO: establish a firmer bar, what I'm trying to say is testing random controller X in CI doesn't help anyone, but maybe it does if controller X is a cluster addon, or our largest customer wants X, or what?) -* Failures must reliably indicate a bug in the product, not a bug in the test - -(TODO: is there a parallelism requirement here?) - -## Moving a test out of CI - -Do *not* move a test to flaky as soon as it starts failing just to clear up the submit queue, this risks introducing more bugs and compounding the problem even further (TODO: or do this? but why). Build cop can use their better judgement to call a test `flaky`. This means it fails for presumably random reasons, once in X runs (TODO: is X == 0?). Move flaky tests out of CI, create a P0/1 bug and try to triage it along to the right person. Adding the `kind/flake` label on github will grab the attention of the grumpy CI shamer bot, which will include the bug in its daily report. - -If your test got moved to flaky, it must demonstrate the run of greens required for getting added to CI once again (or nah?). - -## Non CI channels for testing - -If you want to test against Kubernetes but your test doesn't meet the following requirements, peruse [this list](../../hack/jenkins/e2e.sh) and add your test in whatever makes sense (TODO: What about release-lists, shouldn't they mirror other lists?). - - - - - -[![Analytics](https://kubernetes-site.appspot.com/UA-36037335-10/GitHub/docs/devel/testing.md?pixel)]() - From 89dfad4087d00c5c4576d0a055a9e0551c591fd0 Mon Sep 17 00:00:00 2001 From: Isaac Hollander McCreery Date: Fri, 29 Jan 2016 16:23:59 -0800 Subject: [PATCH 3/4] Add docs about the PR builder --- docs/devel/e2e-tests.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/docs/devel/e2e-tests.md b/docs/devel/e2e-tests.md index 63ebd16e25c..12915543ef5 100644 --- a/docs/devel/e2e-tests.md +++ b/docs/devel/e2e-tests.md @@ -141,6 +141,10 @@ We also run non-default tests if the tests exercise general-availability ("GA") Many `[Feature:.+]` tests we don't run in CI. These tests are for features that are experimental (often in the `experimental` API), and aren't enabled by default. +### The PR-builder + +We also run a battery of tests against every PR before we merge it. These tests are equivalent to `kubernetes-gce`: it runs all non-`[Slow]`, non-`[Serial]`, non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel. These tests are considered "smoke tests" to give a decent signal that the PR doesn't break most functionality. Results for you PR can be found at [pr-test.k8s.io](pr-test.k8s.io), e.g. [pr-test.k8s.io/20354](pr-test.k8s.io/20354) for #20354. + ### Adding a test to CI As mentioned above, prior to adding a new test, it is a good idea to perform a `-ginkgo.dryRun=true` on the system, in order to see if a behavior is already being tested, or to determine if it may be possible to augment an existing set of tests for a specific use case. From bb4a41d84c3f9f9afb2362774a171bb1e4490b8e Mon Sep 17 00:00:00 2001 From: Isaac Hollander McCreery Date: Mon, 1 Feb 2016 16:24:41 -0800 Subject: [PATCH 4/4] Updates based on comments --- docs/devel/e2e-tests.md | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/docs/devel/e2e-tests.md b/docs/devel/e2e-tests.md index 12915543ef5..54c7caefe01 100644 --- a/docs/devel/e2e-tests.md +++ b/docs/devel/e2e-tests.md @@ -112,8 +112,7 @@ We are working on implementing clearer partitioning of our e2e tests to make run - `[Disruptive]`: If a test restarts components that might cause other tests to fail or break the cluster completely, it is labeled `[Disruptive]`. Any `[Disruptive]` test is also assumed to qualify for the `[Serial]` label, but need not be labeled as both. These tests are not run against soak clusters to avoid restarting components. - `[Flaky]`: If a test is found to be flaky, it receives the `[Flaky]` label until it is fixed. A `[Flaky]` label should be accompanied with a reference to the issue for de-flaking the test, because while a test remains labeled `[Flaky]`, it is not monitored closely in CI. `[Flaky]` tests are by default not run, unless a `focus` or `skip` argument is explicitly given. - `[Skipped]`: `[Skipped]` is a legacy label that we're phasing out. If a test is marked `[Skipped]`, there should be an issue open to label it properly. `[Skipped]` tests are by default not run, unless a `focus` or `skip` argument is explicitly given. -- `[Feature:.+]`: If a test has non-default requirements to run or targets some non-core functionality, and thus should not be run as part of the standard suite, it receives a `[Feature:.+]` label, e.g. `[Feature:Performance]` or `[Feature:Ingress]`. `[Feature:.+]` tests are not run in our core suites, instead running in custom suites. There are a few use-cases for `[Feature:.+]` tests: - - If a feature is experimental or alpha and is not enabled by default due to being incomplete or potentially subject to breaking changes, it does *not* block the merge-queue, and thus should run in some separate test suites owned by the feature owner(s) (see #continuous_integration below). +- `[Feature:.+]`: If a test has non-default requirements to run or targets some non-core functionality, and thus should not be run as part of the standard suite, it receives a `[Feature:.+]` label, e.g. `[Feature:Performance]` or `[Feature:Ingress]`. `[Feature:.+]` tests are not run in our core suites, instead running in custom suites. If a feature is experimental or alpha and is not enabled by default due to being incomplete or potentially subject to breaking changes, it does *not* block the merge-queue, and thus should run in some separate test suites owned by the feature owner(s) (see #continuous_integration below). Finally, `[Conformance]` tests are tests we expect to pass on **any** Kubernetes cluster. The `[Conformance]` label does not supersede any other labels. `[Conformance]` test policies are a work-in-progress; see #18162. @@ -125,15 +124,15 @@ A quick overview of how we run e2e CI on Kubernetes. We run a battery of `e2e` tests against `HEAD` of the master branch on a continuous basis, and block merges via the [submit queue](http://submit-queue.k8s.io/) on a subset of those tests if they fail (the subset is defined in the [munger config](https://github.com/kubernetes/contrib/blob/master/mungegithub/mungers/submit-queue.go) via the `jenkins-jobs` flag; note we also block on `kubernetes-build` and `kubernetes-test-go` jobs for build and unit and integration tests). -CI results can be found at [ci-test.k8s.io](ci-test.k8s.io), e.g. [ci-test.k8s.io/kubernetes-e2e-gce/10594](ci-test.k8s.io/kubernetes-e2e-gce/10594). +CI results can be found at [ci-test.k8s.io](http://ci-test.k8s.io), e.g. [ci-test.k8s.io/kubernetes-e2e-gce/10594](http://ci-test.k8s.io/kubernetes-e2e-gce/10594). ### What runs in CI? We run all default tests (those that aren't marked `[Flaky]` or `[Feature:.+]`) against GCE and GKE. To minimize the time from regression-to-green-run, we partition tests across different jobs: -- `kubernetes-` runs all non-`[Slow]`, non-`[Serial]`, non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel. -- `kubernetes--slow` runs all `[Slow]`, non-`[Serial]`, non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel. -- `kubernetes--serial` runs all `[Serial]` and `[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in serial. +- `kubernetes-e2e-` runs all non-`[Slow]`, non-`[Serial]`, non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel. +- `kubernetes-e2e--slow` runs all `[Slow]`, non-`[Serial]`, non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel. +- `kubernetes-e2e--serial` runs all `[Serial]` and `[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in serial. We also run non-default tests if the tests exercise general-availability ("GA") features that require a special environment to run in, e.g. `kubernetes-e2e-gce-scalability` and `kubernetes-kubemark-gce`, which test for Kubernetes performance. @@ -143,7 +142,7 @@ Many `[Feature:.+]` tests we don't run in CI. These tests are for features that ### The PR-builder -We also run a battery of tests against every PR before we merge it. These tests are equivalent to `kubernetes-gce`: it runs all non-`[Slow]`, non-`[Serial]`, non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel. These tests are considered "smoke tests" to give a decent signal that the PR doesn't break most functionality. Results for you PR can be found at [pr-test.k8s.io](pr-test.k8s.io), e.g. [pr-test.k8s.io/20354](pr-test.k8s.io/20354) for #20354. +We also run a battery of tests against every PR before we merge it. These tests are equivalent to `kubernetes-gce`: it runs all non-`[Slow]`, non-`[Serial]`, non-`[Disruptive]`, non-`[Flaky]`, non-`[Feature:.+]` tests in parallel. These tests are considered "smoke tests" to give a decent signal that the PR doesn't break most functionality. Results for you PR can be found at [pr-test.k8s.io](http://pr-test.k8s.io), e.g. [pr-test.k8s.io/20354](http://pr-test.k8s.io/20354) for #20354. ### Adding a test to CI @@ -155,7 +154,7 @@ TODO(#20357): Create a self-documented example which has been disabled, but can When writing a test, consult #kinds_of_tests above to determine how your test should be marked, (e.g. `[Slow]`, `[Serial]`; remember, by default we assume a test can run in parallel with other tests!). -When first adding a test it should *not* go straight into CI, because failures block ordinary development. A test should only be added to CI after is has been running in some non-CI suite long enough to establish a track record showing that the test does not fail when run against *working* software. +When first adding a test it should *not* go straight into CI, because failures block ordinary development. A test should only be added to CI after is has been running in some non-CI suite long enough to establish a track record showing that the test does not fail when run against *working* software. Note also that tests running in CI are generally running on a well-loaded cluster, so must contend for resources; see above about [kinds of tests](#kinds_of_tests). Generally, a feature starts as `experimental`, and will be run in some suite owned by the team developing the feature. If a feature is in beta or GA, it *should* block the merge-queue. In moving from experimental to beta or GA, tests that are expected to pass by default should simply remove the `[Feature:.+]` label, and will be incorporated into our core suites. If tests are not expected to pass by default, (e.g. they require a special environment such as added quota,) they should remain with the `[Feature:.+]` label, and the suites that run them should be incorporated into the [munger config](https://github.com/kubernetes/contrib/blob/master/mungegithub/mungers/submit-queue.go) via the `jenkins-jobs` flag. @@ -163,7 +162,7 @@ Occasionally, we'll want to add tests to better exercise features that are alrea ### Moving a test out of CI -TODO(ihmccreery) do we want to keep the `[Flaky]` label at all? +If we have determined that a test is known-flaky and cannot be fixed in the short-term, we may move it out of CI indefinitely. This move should be used sparingly, as it effectively means that we have no coverage of that test. When a test if demoted, it should be marked `[Flaky]` with a comment accompanying the label with a reference to an issue opened to fix the test. ## Performance Evaluation