Production-Grade Container Scheduling and Management
Antonio Murdaca 48f12259b1 test/e2e/apps: fix race in cronjob test
With CRI-O we've been hitting a lot of flakes with the following test:

[sig-apps] CronJob should remove from active list jobs that have been deleted

The events shown in the test failures in both kube and openshift were the following:

STEP: Found 13 events.
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:02 +0000 UTC - event for forbid: {cronjob-controller } SuccessfulCreate: Created job forbid-1540412040
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:02 +0000 UTC - event for forbid-1540412040: {job-controller } SuccessfulCreate: Created pod: forbid-1540412040-z7n7t
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:02 +0000 UTC - event for forbid-1540412040-z7n7t: {default-scheduler } Scheduled: Successfully assigned e2e-tests-cronjob-rjr2m/forbid-1540412040-z7n7t to 127.0.0.1
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:03 +0000 UTC - event for forbid-1540412040-z7n7t: {kubelet 127.0.0.1} Pulled: Container image "docker.io/library/busybox:1.29" already present on machine
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:03 +0000 UTC - event for forbid-1540412040-z7n7t: {kubelet 127.0.0.1} Created: Created container
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:03 +0000 UTC - event for forbid-1540412040-z7n7t: {kubelet 127.0.0.1} Started: Started container
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:14:12 +0000 UTC - event for forbid: {cronjob-controller } MissingJob: Active job went missing: forbid-1540412040
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:15:02 +0000 UTC - event for forbid: {cronjob-controller } SuccessfulCreate: Created job forbid-1540412100
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:15:02 +0000 UTC - event for forbid-1540412100: {job-controller } SuccessfulCreate: Created pod: forbid-1540412100-rq89l
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:15:02 +0000 UTC - event for forbid-1540412100-rq89l: {default-scheduler } Scheduled: Successfully assigned e2e-tests-cronjob-rjr2m/forbid-1540412100-rq89l to 127.0.0.1
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:15:06 +0000 UTC - event for forbid-1540412100-rq89l: {kubelet 127.0.0.1} Started: Started container
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:15:06 +0000 UTC - event for forbid-1540412100-rq89l: {kubelet 127.0.0.1} Created: Created container
Oct 24 20:20:05.541: INFO: At 2018-10-24 20:15:06 +0000 UTC - event for forbid-1540412100-rq89l: {kubelet 127.0.0.1} Pulled: Container image "docker.io/library/busybox:1.29" already present on machine

The test code is racy because the Forbid policy can still let the controller create
a new pod for the cronjob. CRI-O is fast at re-creating the pod, so by the time
the test code reaches the check, the check fails. The test output is as follows:

[It] should remove from active list jobs that have been deleted
  /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/apps/cronjob.go:192
STEP: Creating a ForbidConcurrent cronjob
STEP: Ensuring a job is scheduled
STEP: Ensuring exactly one is scheduled
STEP: Deleting the job
STEP: deleting Job.batch forbid-1540412040 in namespace e2e-tests-cronjob-rjr2m, will wait for the garbage collector to delete the pods
Oct 24 20:14:02.533: INFO: Deleting Job.batch forbid-1540412040 took: 2.699182ms
Oct 24 20:14:02.634: INFO: Terminating Job.batch forbid-1540412040 pods took: 100.223228ms
STEP: Ensuring job was deleted
STEP: Ensuring there are no active jobs in the cronjob
[AfterEach] [sig-apps] CronJob
  /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:148

It looks clear that by the time we're ensuring that there are no more active jobs, there
could _already_ be a new job spun up, making the test flake.

This PR fixes all of the above by making sure that the _deleted_ job is no longer in the
Active list, regardless of any other pod already running with a different UUID, which is
fine for the purpose of the test.
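
A minimal sketch of the difference between the two checks, not the actual
change in test/e2e/apps/cronjob.go. The ObjectRef type, the noLongerActive
helper, and the UID values are hypothetical stand-ins for the references a
cronjob keeps in its Active list:

```go
package main

import "fmt"

// ObjectRef is a hypothetical, simplified stand-in for an entry in a
// cronjob's Active list.
type ObjectRef struct {
	Name string
	UID  string
}

// noLongerActive reports whether no entry in active carries the UID of the
// deleted job. Entries with a different UID are ignored.
func noLongerActive(active []ObjectRef, deletedUID string) bool {
	for _, ref := range active {
		if ref.UID == deletedUID {
			return false
		}
	}
	return true
}

func main() {
	// The controller has already created a replacement job (made-up UID).
	active := []ObjectRef{{Name: "forbid-1540412100", UID: "uid-b"}}

	// Racy check: requires the Active list to be empty.
	fmt.Println(len(active) == 0)

	// UID-based check: only requires the deleted job (made-up UID "uid-a")
	// to be gone from the list.
	fmt.Println(noLongerActive(active, "uid-a"))
}
```

With a replacement job already in the list, the emptiness check reports
false while the UID-based check reports true, which is the behaviour the
paragraph above describes.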

Signed-off-by: Antonio Murdaca <runcom@linux.com>
2018-10-25 00:14:19 +02:00
.github Add guidance for testing in the PR template 2018-10-18 14:29:18 +08:00
api Generated files 2018-10-23 13:50:03 -04:00
build Merge pull request #69152 from mkumatag/fix_manifest_push 2018-10-24 01:06:10 -07:00
cluster Merge pull request #70094 from mwwolters/prepare-log-file-args 2018-10-24 12:32:53 -07:00
cmd Merge pull request #70049 from fabriziopandini/kubeadm-graduate-kubelet-start 2018-10-24 09:44:17 -07:00
docs autogenerated 2018-10-24 12:57:42 +02:00
Godeps Merge pull request #69718 from andyzhangx/azurefile-premium 2018-10-18 05:18:50 -07:00
hack Merge pull request #68544 from tanshanshan/lint912-2 2018-10-23 16:44:31 -07:00
logo
pkg Merge pull request #70002 from andyzhangx/getdisklun 2018-10-24 10:53:51 -07:00
plugin switch informer in token authn 2018-10-24 15:46:55 +08:00
staging Merge pull request #69795 from yue9944882/chore/resync-psp-api 2018-10-24 09:44:09 -07:00
test test/e2e/apps: fix race in cronjob test 2018-10-25 00:14:19 +02:00
third_party
translations
vendor Merge pull request #69718 from andyzhangx/azurefile-premium 2018-10-18 05:18:50 -07:00
.bazelrc
.generated_files
.gitattributes
.gitignore
.kazelcfg.json
BUILD.bazel
CHANGELOG-1.2.md
CHANGELOG-1.3.md
CHANGELOG-1.4.md
CHANGELOG-1.5.md
CHANGELOG-1.6.md
CHANGELOG-1.7.md
CHANGELOG-1.8.md
CHANGELOG-1.9.md Update CHANGELOG-1.9.md for v1.9.11. 2018-10-01 18:56:09 +00:00
CHANGELOG-1.10.md Update CHANGELOG-1.10.md for v1.10.9. 2018-10-16 13:47:04 +00:00
CHANGELOG-1.11.md
CHANGELOG-1.12.md Merge pull request #69199 from dstrebel/patch-1 2018-10-16 17:52:53 -07:00
CHANGELOG-1.13.md Update CHANGELOG-1.13.md for v1.13.0-alpha.2. 2018-10-24 14:58:18 +00:00
CHANGELOG.md Upgrade the release version info in CHANGELOG.md 2018-09-29 16:09:58 +08:00
code-of-conduct.md
CONTRIBUTING.md
LICENSE
Makefile
Makefile.generated_files
OWNERS
OWNERS_ALIASES Remove ericchiang from OWNERS files 2018-10-11 18:11:15 -07:00
README.md
SECURITY_CONTACTS
SUPPORT.md
vendordiff.patch Update etcd client to 3.3.9 2018-10-08 13:34:34 -07:00
WORKSPACE

Kubernetes



Kubernetes is an open source system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications.

Kubernetes builds upon a decade and a half of experience at Google running production workloads at scale using a system called Borg, combined with best-of-breed ideas and practices from the community.

Kubernetes is hosted by the Cloud Native Computing Foundation (CNCF). If you are a company that wants to help shape the evolution of technologies that are container-packaged, dynamically-scheduled and microservices-oriented, consider joining the CNCF. For details about who's involved and how Kubernetes plays a role, read the CNCF announcement.


To start using Kubernetes

See our documentation on kubernetes.io.

Try our interactive tutorial.

Take a free course on Scalable Microservices with Kubernetes.

To start developing Kubernetes

The community repository hosts all information about building Kubernetes from source, how to contribute code and documentation, who to contact about what, etc.

If you want to build Kubernetes right away there are two options:

You have a working Go environment.
$ go get -d k8s.io/kubernetes
$ cd $GOPATH/src/k8s.io/kubernetes
$ make
You have a working Docker environment.
$ git clone https://github.com/kubernetes/kubernetes
$ cd kubernetes
$ make quick-release

For the full story, head over to the developer's documentation.

Support

If you need support, start with the troubleshooting guide, and work your way through the process that we've outlined.

That said, if you have questions, reach out to us one way or another.
