Automatic merge from submit-queue (batch tested with PRs 39898, 39904)
[scheduler] interface for config
**What this PR fixes**
This PR converts the Scheduler configuration factory into an interface, so that
- the scheduler_perf and scheduler integration tests dont rely on the struct for their implementation
- the exported functionality of the factory (i.e. what it needs to provide to create a scheduler configuration) is completely explicit, rather then completely coupled to a struct.
- makes some parts of the factory immutable, again to minimize possible coupling.
This makes it easier to make a custom factory in instances where we might specifically want to import scheduler logic without actually reusing the entire scheduler codebase.
Automatic merge from submit-queue
Build release tars using bazel
**What this PR does / why we need it**: builds equivalents of the various kubernetes release tarballs, solely using bazel.
For example, you can now do
```console
$ make bazel-release
$ hack/e2e.go -v -up -test -down
```
**Special notes for your reviewer**: this is currently dependent on 3b29803eb5, which I have yet to turn into a pull request, since I'm still trying to figure out if this is the best approach.
Basically, the issue comes up with the way we generate the various server docker image tarfiles and load them on nodes:
* we `md5sum` the binary being encapsulated (e.g. kube-proxy) and save that to `$binary.docker_tag` in the server tarball
* we then build the docker image and tag using that md5sum (e.g. `gcr.io/google_containers/kube-proxy:$MD5SUM`)
* we `docker save` this image, which embeds the full tag in the `$binary.tar` file.
* on cluster startup, we `docker load` these tarballs, which are loaded with the tag that we'd created at build time. the nodes then use the `$binary.docker_tag` file to find the right image.
With the current bazel `docker_build` rule, the tag isn't saved in the docker image tar, so the node is unable to find the image after `docker load`ing it.
My changes to the rule save the tag in the docker image tar, though I don't know if there are subtle issues with it. (Maybe we want to only tag when `--stamp` is given?)
Also, the docker images produced by bazel have the timestamp set to the unix epoch, which is not great for debugging. Might be another thing to change with a `--stamp`.
Long story short, we probably need to follow up with bazel folks on the best way to solve this problem.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Curating Owners: test/e2e_node
cc @random-liu @timstclair @vishh
In an effort to expand the existing pool of reviewers and establish a
two-tiered review process (first someone lgtms and then someone
experienced in the project approves), we are adding new reviewers to
existing owners files.
If You Care About the Process:
------------------------------
We did this by algorithmically figuring out who’s contributed code to
the project and in what directories. Unfortunately, that doesn’t work
well: people that have made mechanical code changes (e.g change the
copyright header across all directories) end up as reviewers in lots of
places.
Instead of using pure commit data, we generated an excessively large
list of reviewers and pruned based on all time commit data, recent
commit data and review data (number of PRs commented on).
At this point we have a decent list of reviewers, but it needs one last
pass for fine tuning.
Also, see https://github.com/kubernetes/contrib/issues/1389.
TLDR:
-----
As an owner of a sig/directory and a leader of the project, here’s what
we need from you:
1. Use PR https://github.com/kubernetes/kubernetes/pull/35715 as an example.
2. The pull-request is made editable, please edit the `OWNERS` file to
remove the names of people that shouldn't be reviewing code in the
future in the **reviewers** section. You probably do NOT need to modify
the **approvers** section. Names asre sorted by relevance, using some
secret statistics.
3. Notify me if you want some OWNERS file to be removed. Being an
approver or reviewer of a parent directory makes you a reviewer/approver
of the subdirectories too, so not all OWNERS files may be necessary.
4. Please use ALIAS if you want to use the same list of people over and
over again (don't hesitate to ask me for help, or use the pull-request
above as an example)
Automatic merge from submit-queue (batch tested with PRs 39625, 39842)
Add RBAC v1beta1
Add `rbac.authorization.k8s.io/v1beta1`. This scrubs `v1alpha1` to remove cruft, then add `v1beta1`. We'll update other bits of infrastructure to code to `v1beta1` as a separate step.
```release-note
The `attributeRestrictions` field has been removed from the PolicyRule type in the rbac.authorization.k8s.io/v1alpha1 API. The field was not used by the RBAC authorizer.
```
@kubernetes/sig-auth-misc @liggitt @erictune
Automatic merge from submit-queue
Enable lazy initialization of ext3/ext4 filesystems
**What this PR does / why we need it**: It enables lazy inode table and journal initialization in ext3 and ext4.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#30752, fixes#30240
**Release note**:
```release-note
Enable lazy inode table and journal initialization for ext3 and ext4
```
**Special notes for your reviewer**:
This PR removes the extended options to mkfs.ext3/mkfs.ext4, so that the defaults (enabled) for lazy initialization are used.
These extended options come from a script that was historically located at */usr/share/google/safe_format_and_mount* and later ported to GO so this dependency to the script could be removed. After some search, I found the original script here: https://github.com/GoogleCloudPlatform/compute-image-packages/blob/legacy/google-startup-scripts/usr/share/google/safe_format_and_mount
Checking the history of this script, I found the commit [Disable lazy init of inode table and journal.](4d7346f7f5). This one introduces the extended flags with this description:
```
Now that discard with guaranteed zeroing is supported by PD,
initializing them is really fast and prevents perf from being affected
when the filesystem is first mounted.
```
The problem is, that this is not true for all cloud providers and all disk types, e.g. Azure and AWS. I only tested with magnetic disks on Azure and AWS, so maybe it's different for SSDs on these cloud providers. The result is that this performance optimization dramatically increases the time needed to format a disk in such cases.
When mkfs.ext4 is told to not lazily initialize the inode tables and the check for guaranteed zeroing on discard fails, it falls back to a very naive implementation that simply loops and writes zeroed buffers to the disk. Performance on this highly depends on free memory and also uses up all this free memory for write caching, reducing performance of everything else in the system.
As of https://github.com/kubernetes/kubernetes/issues/30752, there is also something inside kubelet that somehow degrades performance of all this. It's however not exactly known what it is but I'd assume it has something to do with cgroups throttling IO or memory.
I checked the kernel code for lazy inode table initialization. The nice thing is, that the kernel also does the guaranteed zeroing on discard check. If it is guaranteed, the kernel uses discard for the lazy initialization, which should finish in a just few seconds. If it is not guaranteed, it falls back to using *bio*s, which does not require the use of the write cache. The result is, that free memory is not required and not touched, thus performance is maxed and the system does not suffer.
As the original reason for disabling lazy init was a performance optimization and the kernel already does this optimization by default (and in a much better way), I'd suggest to completely remove these flags and rely on the kernel to do it in the best way.
Automatic merge from submit-queue
Start making `k8s.io/client-go` authoritative for generic client packages
Right now, client-go has copies of various generic client packages which produces golang type incompatibilities when you want to switch between different kinds of clients. In many cases, there's no reason to have two sets of packages. This pull eliminates the copy for `pkg/client/transport` and makes `client-go` the authoritative copy.
I recommend going by commits, the first just synchronizes the client-go code again so that I could test the copy script to make sure it correctly preserves the original package.
@kubernetes/sig-api-machinery-misc @lavalamp @sttts
Automatic merge from submit-queue (batch tested with PRs 34763, 38706, 39939, 40020)
Use Statefulset instead in e2e and controller
Quick fix ref: #35534
We should finish the issue to meet v1.6 milestone.
Automatic merge from submit-queue
Move PatchType to apimachinery/pkg/types
Fixes https://github.com/kubernetes/kubernetes/issues/39970
`PatchType` is shared by the client and server, they have to agree, and its critical for our API to function.
@smarterclayton @kubernetes/sig-api-machinery-misc
Automatic merge from submit-queue (batch tested with PRs 39911, 40002, 39969, 40012, 40009)
Fix RBAC role for kube-proxy in Kubemark
Ref #39959
This should ensure that kube-proxy (in Kubemark) has the required role and RBAC binding.
@deads2k PTAL
cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
move api/errors to apimachinery
`pkg/api/errors` is a set of helpers around `meta/v1.Status` that help to create and interpret various apiserver errors. Things like `.NewNotFound` and `IsNotFound` pairings. This pull moves it into apimachinery for use by the clients and servers.
@smarterclayton @lavalamp First commit is the move plus minor fitting. Second commit is straight replace and generation.
Automatic merge from submit-queue (batch tested with PRs 38592, 39949, 39946, 39882)
Add optional per-request context to restclient
**What this PR does / why we need it**: It adds per-request contexts to restclient's API, and uses them to add timeouts to all proxy calls in the e2e tests. An entire e2e shouldn't hang for hours on a single API call.
**Which issue this PR fixes**: #38305
**Special notes for your reviewer**:
This adds a feature to the low-level rest client request feature that is entirely optional. It doesn't affect any requests that don't use it. The api of the generated clients does not change, and they currently don't take advantage of this.
I intend to patch this in to 1.5 as a mostly test only change since it's not going to affect any controller, generated client, or user of the generated client.
cc @kubernetes/sig-api-machinery
cc @saad-ali
Automatic merge from submit-queue
Fix examples e2e permission check
Ref #39382
Follow-up from #39896
Permission check should be done within the e2e test namespace, not cluster-wide
Also improved RBAC audit logging to make the scope of the permission check clearer
Automatic merge from submit-queue
Remove sleep from DynamicProvisioner test.
The comment says that the sleep is there because of 10 minute PV controller
sync. The controller sync is now 15 seconds and it should be quick enough
to hide this in subsequent `WaitForPersistentVolumeDeleted(.. , 20*time.Minute)`
Automatic merge from submit-queue (batch tested with PRs 38427, 39896, 39889, 39871, 39895)
Grant permissions to e2e examples test service account
ref #39382
Automatic merge from submit-queue
Updated unit tests
@janetkuo updated the flaky unit test to have the same structure with regard to uncasting as the rest of the tests. ptal
Automatic merge from submit-queue (batch tested with PRs 39807, 37505, 39844, 39525, 39109)
Made cache.Controller to be interface.
**What this PR does / why we need it**:
#37504
Automatic merge from submit-queue
run staging client-go update
Chasing to see what real problems we have in staging-client-go.
@sttts you get similar results?
Automatic merge from submit-queue
replace global registry in apimachinery with global registry in k8s.io/kubernetes
We'd like to remove all globals, but our immediate problem is that a shared registry between k8s.io/kubernetes and k8s.io/client-go doesn't work. Since client-go makes a copy, we can actually keep a global registry with other globals in pkg/api for now.
@kubernetes/sig-api-machinery-misc @lavalamp @smarterclayton @sttts