Automatic merge from submit-queue
First pass at a GKE large cluster Jenkins job
Runs a 1000 node GKE parallel e2e test. On demand only. We'll add more
tests as we see what actually works - this job is going to have some
flakiness of its own.
Automatic merge from submit-queue
update hack/build-go to build federation/cmd/federated-apiserver as well
federation/cmd/federated-apiserver was added in https://github.com/kubernetes/kubernetes/pull/23509
cc @jianhuiz
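As a quick sanity check, the new binary can also be built on its own. This is only a sketch of the expected usage, assuming `hack/build-go.sh` still accepts target paths as arguments the way it does for the other binaries.

```sh
# Assumed usage, not copied from the change: build just the new federation
# apiserver binary by passing its target path to the build script.
hack/build-go.sh federation/cmd/federated-apiserver

# Building with no arguments should now include federated-apiserver in the
# default target set as well.
hack/build-go.sh
```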
Automatic merge from submit-queue
Disable heapster job which has been broken for a month
https://github.com/kubernetes/kubernetes/issues/23538
This job is no longer producing a useful signal. http://kubekins.dls.corp.google.com/ shows that the last pass was nearly two months ago. I would like to disable the job until someone has the chance to fix it, so we are not wasting Jenkins resources and contributing to system instability.
Automatic merge from submit-queue
Disable flannel job until it works
https://github.com/kubernetes/kubernetes/issues/24520
See the bug: this job fails every time and has done so for two months. Until someone has time to investigate and fix it, disable the job on Jenkins so we're not wasting resources and reducing system stability.
Automatic merge from submit-queue
Enable protobuf compilation by default
Enables protobuf compilation and build verification checks, and generates all of the initial code.
kubectl is now 47M on OS X; a clean build on a 2014 MBP (4 cores) with Go 1.6 takes ~150s.
@wojtek-t
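For context, a hedged sketch of what "protobuf compilation" and the "build verification checks" look like from the hack/ directory; the script names below are assumptions about where this lands rather than something stated in this change.

```sh
# Assumed script names (not confirmed by this change):
hack/update-generated-protobuf.sh   # regenerate the protobuf serialization code
hack/verify-generated-protobuf.sh   # verification check run by CI / verify-all
```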
Automatic merge from submit-queue
hack: change update-swagger-spec.sh apiserver defaults
Removing the explicit list of KUBE_API_VERSIONS will cause the apiserver
to enable all APIs by default. This change reduces the amount of script
hacking needed to add new API groups in the future.
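A minimal before/after sketch of the idea; the real apiserver invocation in hack/update-swagger-spec.sh carries more flags, and the group list shown here is illustrative.

```sh
# Before: every API group had to be listed explicitly, so adding a new group
# meant editing this script.
KUBE_API_VERSIONS="v1,extensions/v1beta1,autoscaling/v1,batch/v1" \
  kube-apiserver --insecure-bind-address=127.0.0.1 --insecure-port=8080

# After: with KUBE_API_VERSIONS left unset, the apiserver enables all API
# groups by default and the generated swagger spec covers all of them.
kube-apiserver --insecure-bind-address=127.0.0.1 --insecure-port=8080
```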
Automatic merge from submit-queue
Incremental improvements to kubelet e2e tests
- Add keep-alive to the ssh connection (see the sketch after this list)
- Don't try to stop services on image-based runs
- Increase Jenkins CI timeout to 90 minutes to accommodate unpredictable Go build times
- Remove spammy log statement
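On the keep-alive item: a sketch of the general technique only, assuming an OpenSSH client connection; whether the suite uses the ssh binary or a Go SSH library isn't stated here, and the host and script names are placeholders.

```sh
# Illustrative only: client-side keep-alives stop an idle-looking connection
# from being dropped while a long remote step (e.g. a go build) runs.
# "${remote_host}" and run-kubelet-e2e.sh are hypothetical placeholders.
ssh -o ServerAliveInterval=30 \
    -o ServerAliveCountMax=10 \
    "${remote_host}" ./run-kubelet-e2e.sh
```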
Automatic merge from submit-queue
Add some more info to the Jenkins README.
This is a bit of a work-in-progress, and I'd appreciate feedback on what to add or remove. I'm not sure that I need to say so much about the GCS format, and I should probably say some more about JJB.
@kubernetes/sig-testing
Automatic merge from submit-queue
Removing call to update-swagger-spec.sh from update-generated-swagger-docs.sh
Fixes https://github.com/kubernetes/kubernetes/issues/24233
Right now `update-generated-swagger-docs.sh` calls `update-swagger-spec.sh`, but `verify-generated-swagger-docs.sh` does not verify swagger spec (that is done by `verify-swagger-spec.sh`).
Hence, `verify-swagger-spec` breaks if it is called after `verify-generated-swagger-docs`.
Fixing it by removing the call to `update-swagger-spec.sh` from `update-generated-swagger-docs.sh`.
This will require users to run both `update-swagger-spec` and `update-generated-swagger-docs` when they update api types, but they already need to run many more scripts (`update-api-reference-docs`, `update-codegen`).
People should mostly be running hack/update-all.sh directly :)
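Concretely, after editing API types the sequence looks roughly like this (the ordering is a reasonable guess rather than something this change prescribes):

```sh
hack/update-codegen.sh
hack/update-generated-swagger-docs.sh   # no longer regenerates the swagger spec
hack/update-swagger-spec.sh             # now has to be run explicitly
hack/update-api-reference-docs.sh

# ...or skip the bookkeeping and run everything:
hack/update-all.sh
```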
Automatic merge from submit-queue
Shorten cluster names in GKE Jenkins on Trusty
We identified an issue where the PD tests in the GKE-on-Trusty Jenkins jobs fail because the PD name exceeds the 63-character limit. The PD name embeds the "E2E_NAME" env variable exported in the Jenkins job configuration. This PR shortens that string for all GKE-on-Trusty Jenkins jobs, so the PD name stays within the limit.
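To make the constraint concrete, a small check sketch; the PD name composition and the E2E_NAME value below are assumptions for illustration, not taken from the job configs.

```sh
# Hypothetical values, for illustration of the 63-character GCE name limit only.
E2E_NAME="jenkins-gke-ci-trusty-parallel"
pd_name="${E2E_NAME}-dynamic-pvc-0123456789abcdef"

if [ "${#pd_name}" -gt 63 ]; then
  echo "PD name '${pd_name}' exceeds the 63-character limit" >&2
fi
```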
Automatic merge from submit-queue
Bump kubernetes-test-go timeout.
It looks like the run times got more inconsistent because of load on the VM. Adding another Jenkins slave improved things so we're not constantly timing out, but it still gets a little close to timing out at times.
Average runtime is ~45 mins so I went with a 100 min timeout.
Fixes #24285
Automatic merge from submit-queue
Remove soak and disruptive 1.1 Jenkins jobs.
They're both in the kubernetes-jenkins project, not their own. The disruptive one isn't a critical build, and I don't think the soak should be critical at all, since it's never green for a week anyway and I don't think we ever plan for it to be.
Automatic merge from submit-queue
Bump upgrade test timeout to 10 hours
@spxtr is it reasonable to expect that running the v1.2 tests in serial would take longer than ~ 5 hours (assuming the upgrade beforehand takes < 1 hour)?
Automatic merge from submit-queue
Run test-go less often on release branches.
I made 1.2 run every 3 hours and 1.1 run every 6 hours. They'll still run right away once a build completes.
I'm going to have to lower the number of executors on the Jenkins slaves that run test-go jobs, since running 3 at a time makes them use up all the CPU and flake.
Automatic merge from submit-queue
Replace tab with eight spaces
This file only uses spaces for indentation, and my text editor highlighted the one tab.
Automatic merge from submit-queue
Make etcd cache size configurable
Instead of the prior 50K limit, allow users to specify a more sensible size for their cluster.
I'm not sure what a sensible default is here. I'm still experimenting on my own clusters. 50 gives me a 270MB max footprint. 50K caused my apiserver to run out of memory as it exceeded >2GB. I believe that number is far too large for most people's use cases.
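For illustration, this is what setting the knob might look like on the apiserver command line; the flag name is an assumption about how the option is exposed, not copied from this change.

```sh
# --deserialization-cache-size is an assumed flag name for the new knob;
# the other flags are just enough to make the invocation look realistic.
kube-apiserver \
  --etcd-servers=http://127.0.0.1:2379 \
  --service-cluster-ip-range=10.0.0.0/16 \
  --deserialization-cache-size=1000
```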
There are some other fundamental issues that I'm not addressing here:
- Old etcd items are cached and potentially never removed (it stores using modifiedIndex, and doesn't remove the old object when it gets updated)
- Cache isn't LRU, so there's no guarantee the cache remains hot. This makes its performance difficult to predict. More of an issue with a smaller cache size.
- 1.2 etcd entries seem to have a larger memory footprint (I never had an issue in 1.1, even though this cache existed there). I suspect that's due to image lists on the node status.
This is provided as a fix for #23323