Automatic merge from submit-queue
Bump upgrade test timeout to 10 hours
@spxtr is it reasonable to expect that running the v1.2 tests in serial would take longer than ~ 5 hours (assuming the upgrade beforehand takes < 1 hour)?
Automatic merge from submit-queue
Improve script for running scheduler benchmarks
Without this change, the script didn't work in my environment; this also makes it more consistent with other scripts.
@hongchaodeng @xiang90
Automatic merge from submit-queue
Clarify api-group docs by a tiny bit.
I realize this area is in flux and the doc is out of date, but it's strictly better with this update than without.
Automatic merge from submit-queue
Run test-go less often on release branches.
I made 1.2 run every 3 hours and 1.1 run every 6 hours. They'll still run right away once a build completes.
I'm going to have to lower the number of executors on the Jenkins slaves that run test-go jobs, since running 3 at a time makes them use up all the CPU and flake.
Automatic merge from submit-queue
Replace tab with eight spaces
This file only uses spaces for indentation, and my text editor highlighted the one tab.
- Add keep-alive to ssh connection
- Don't try to stop services on image-based runs
- Increase Jenkins CI timeout to 90 minutes to accommodate unpredictable Go build times
- Remove spammy log statement
Automatic merge from submit-queue
etcd3/store: watcher implementation
ref: https://github.com/kubernetes/kubernetes/issues/22448
This PR does:
- Provide a watcher that uses the etcd v3 API to watch for changes and process them following the existing logic of storage.Interface.Watch() and WatchList() (a minimal sketch follows this list).
- With this watcher in place, it is trivial to implement Watch() and WatchList() in the etcd3 storage.Interface implementation.
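For illustration only (not this PR's code): a minimal sketch of watching a key prefix with the etcd v3 client and translating raw events into added/modified/deleted notifications, which is roughly what such a watcher must do before handing events to the storage layer. The prefix, the endpoint, and the translation rules are assumptions; import paths match the etcd client of that era.

```go
package main

import (
	"context"
	"fmt"

	"github.com/coreos/etcd/clientv3"
	"github.com/coreos/etcd/mvcc/mvccpb"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"localhost:2379"}})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// Watch every key under the prefix; each WatchResponse may carry
	// several events that need to be decoded and dispatched in order.
	for resp := range cli.Watch(context.Background(), "/registry/pods/", clientv3.WithPrefix()) {
		for _, ev := range resp.Events {
			switch ev.Type {
			case mvccpb.PUT:
				// A create is a PUT whose mod revision equals its create
				// revision; anything else is a modification.
				if ev.Kv.CreateRevision == ev.Kv.ModRevision {
					fmt.Printf("ADDED %s\n", ev.Kv.Key)
				} else {
					fmt.Printf("MODIFIED %s\n", ev.Kv.Key)
				}
			case mvccpb.DELETE:
				fmt.Printf("DELETED %s\n", ev.Kv.Key)
			}
		}
	}
}
```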
Automatic merge from submit-queue
shared controller informers
Related to https://github.com/kubernetes/kubernetes/issues/14978
This demonstrates how controllers which use an `Informer` would be able to share the same watch and store. A similar "setup and run" approach could be done for an `IndexInformer` to share that cache. I found adding listeners here to be easier than intercepting at the watch interface (problems with resourceVersion) or the reflector (same plumbing, but you have to fan out to multiple stores).
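Not this PR's implementation, just a self-contained sketch of the fan-out idea: one event loop backed by a single watch, with several registered listeners, so each controller no longer needs its own reflector and store. All names below are made up for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// sharedInformer owns one event source and fans events out to every handler.
type sharedInformer struct {
	mu       sync.Mutex
	handlers []func(event string)
}

// AddHandler registers another listener on the same underlying watch.
func (s *sharedInformer) AddHandler(h func(event string)) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.handlers = append(s.handlers, h)
}

// Run consumes the single event stream and notifies every handler in turn.
func (s *sharedInformer) Run(events <-chan string) {
	for ev := range events {
		s.mu.Lock()
		hs := append([]func(string){}, s.handlers...)
		s.mu.Unlock()
		for _, h := range hs {
			h(ev)
		}
	}
}

func main() {
	inf := &sharedInformer{}
	inf.AddHandler(func(ev string) { fmt.Println("replication controller saw:", ev) })
	inf.AddHandler(func(ev string) { fmt.Println("admission cache saw:", ev) })

	events := make(chan string, 2)
	events <- "pod added"
	events <- "pod deleted"
	close(events)
	inf.Run(events)
}
```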
We could also use the cache we build here to back several of the admission plugins that currently run their own lookup caches.
If there's interest, I can finish out the `SharedInformer` and switch the low hanging fruit over.
@kubernetes/rh-cluster-infra @smarterclayton @liggitt @wojtek-t
Automatic merge from submit-queue
Fix PullImage and add corresponding node e2e test
Fixes #24101. This is a bug introduced by #23506; see also #23563.
The root cause of #24101 is described [here](https://github.com/kubernetes/kubernetes/issues/24101#issuecomment-208547623).
This PR:
1) Fixes #24101 by decoding the messages returned while pulling an image and returning an error if any of them contains an error (see the sketch after this list).
2) Adds a node e2e test to detect this kind of failure.
3) Moves the presence check out of `ConformanceImage.Remove()` and `ConformanceImage.Pull()`, because sometimes we expect an error from `PullImage()` or `RemoveImage()`, and even when that error doesn't happen, the `Present()` check would still return an error and let the test pass.
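A hedged sketch of the idea in (1), not the kubelet's actual code: Docker's image-pull endpoint streams JSON progress messages, and a failed pull may only surface as an `error` field inside one of those messages, so the stream has to be decoded rather than relying on the call-level error alone. The message struct and helper below are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// pullMessage mirrors the fields we care about in a pull progress message.
type pullMessage struct {
	Status string `json:"status"`
	Error  string `json:"error"`
}

// checkPullStream decodes each streamed message and fails if any reports an error.
func checkPullStream(r io.Reader) error {
	dec := json.NewDecoder(r)
	for {
		var m pullMessage
		if err := dec.Decode(&m); err == io.EOF {
			return nil
		} else if err != nil {
			return err
		}
		if m.Error != "" {
			return fmt.Errorf("image pull failed: %s", m.Error)
		}
	}
}

func main() {
	stream := strings.NewReader(`{"status":"Pulling from library/busybox"}
{"error":"unauthorized: authentication required"}`)
	fmt.Println(checkPullStream(stream)) // image pull failed: unauthorized: ...
}
```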
@yujuhong @freehan @liangchenye
Also /cc @resouer, since he is refactoring the image-related functions.
- rebase: ForEach only on Running pods
- add waitFor step in guestbook describe and wrapper
- simplify logs in polling, make panics immediate, and give rolled-up stats in the logs.
Improve logging for failure on ForEach
We've encountered flakes in our e2e infrastructure when kubelet took more than
one minute to detach a volume used by a deleted pod.
Let's increase the wait period from 1 to 3 minutes. This slows down the test
by 2 minutes, but it makes the test more stable.
In addition, when kubelet cannot detach a volume within 3 minutes, let the test
wait for an additional recycle controller retry interval (10 minutes) and hope the
volume is deleted by then. This should not increase the usual test time; it only makes
the test stable when kubelet is _extremely_ slow to release the volume.
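Purely illustrative (assumed helper, not the e2e framework's code): the two-phase wait described above, i.e. poll for the volume to be released for up to 3 minutes, then fall back to one extra recycler retry interval of 10 minutes before failing.

```go
package main

import (
	"fmt"
	"time"
)

// waitForVolumeRelease polls released() until it returns true or the timeout
// expires. It is a stand-in for the e2e framework's own wait helpers.
func waitForVolumeRelease(released func() bool, timeout time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if released() {
			return true
		}
		time.Sleep(5 * time.Second)
	}
	return false
}

func main() {
	released := func() bool { return false } // replace with a real check against the API server

	// Normal case: kubelet detaches the volume within 3 minutes.
	if waitForVolumeRelease(released, 3*time.Minute) {
		fmt.Println("volume released")
		return
	}
	// Slow kubelet: give the recycle controller one more retry interval.
	if waitForVolumeRelease(released, 10*time.Minute) {
		fmt.Println("volume released after recycler retry")
		return
	}
	fmt.Println("volume still not released; failing test")
}
```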