This moves licenses of vendored code from one monolith file into a tree
of individual files for easier reviews. This fixes both the bash and
bazel paths.
An etcd unix user is currently created in configure-helper.sh if it does not exist
on the master.
cloud-init is the only supported mechanism to add users on COS VMs. If an attempt
is made to add a key using OS Login or the instance metadata mechanism the
google_accounts_daemon will race with useradd and potentially attempt to use
the same UID. This will lock out any attempt to SSH into the VM. We therefore
migrate to using cloud-init to create this user and prevent this issue from occurring.
This change removes support for basic authn in v1.19 via the
--basic-auth-file flag. This functionality was deprecated in v1.16
in response to ATR-K8S-002: Non-constant time password comparison.
Similar functionality is available via the --token-auth-file flag
for development purposes.
Signed-off-by: Monis Khan <mok@vmware.com>
If we don't install docker and install just containerd apt packages,
there is no docker group. In this scenario, we should not add the gid to
config.toml
For bug #87482.
Newer OSs are now defaulting to Python3.
This breaks the kube-up scripts for GCE.
Adding code to detect this and explicitly use Python2.
Got the proxy-server coming up in the master.
Added certs and have it comiung up with those certs.
Added a daemonset to run the network-agent.
Adding support for agent running as a sameon set on every node.
Added quick hack to test that proxy server/agent were correctly
tunneling traffic to the kubelet.
Added more WIP for reading network proxy configuration.
Get flags set correctly and fix connection services.
Adding missing ApplyTo
Added ConnectivityService.
Fixed build directives. Added connectivity service configuration.
Fixed log levels.
Fixed minor issues for feature turned off.
Fixed boilerplate and format.
Moved log dialer initialization earlier as per Liggits suggestion.
Fixed a few minor issues in the configuration for GCE.
Fixed scheme allocation
Adding unit test.
Added test for direct connectivity service.
Switching to injecting the Lookup method rather than using a Singleton.
First round of mikedaneses feedback.
Fixed deployment to use yaml and other changes suggested by MikeDanese.
Switched network proxy server/agent which are kebab-case not camelCase.
Picked up DIAL_RSP fix.
Factored in deads2k feedback.
Feedback from mikedanese
Factored in second round of feedback from David.
Fix path in verify.
Factored in anfernee's feedback.
First part of lavalamps feedback.
Factored in more changes from lavalamp and mikedanese.
Renamed network-proxy to konnectivity-server and konnectivity-agent.
Fixed tolerations and config file checking.
Added missing strptr
Finished lavalamps requested rename.
Disambiguating konnectivity service by renaming it egress selector.
Switched feature flag to KUBE_ENABLE_EGRESS_VIA_KONNECTIVITY_SERVICE
This commit adds support for using `gke-exec-auth-plugin` (vTPM-based
certificates for mTLS) for webhooks when calling endpoints matching
`*.googleapis.com`, and integrates this support with
ValidatingAdmissionWebhook.
To enable it, request ValidatingAdmissionWebhook with
`ADMISSION_CONTROL=...,ValidatingAdmissionWebhook,...` (default) and
opt in to `gke-exec-auth-plugin` using `WEBHOOK_GKE_EXEC_AUTH=true`
during the configuration process.
If you don't opt-in, ValidatingAdmissionWebhook will be deployed as
before.
Requesting `WEBHOOK_GKE_EXEC_AUTH=true` will fail if you have not
provided other configuration variables:
* `EXEC_AUTH_PLUGIN_URL`: controls whether `gke-exec-auth-plugin` is
downloaded during the installation step. A prerequisite for
actually using the plugin.
* `TOKEN_URL`, `TOKEN_BODY`, and `TOKEN_BODY_UNQUOTED`:
configuration values used when calling the plugin. `TOKEN_URL`
and `TOKEN_BODY` have existing usage. `TOKEN_BODY_UNQUOTED` is a
new variable that is meant to sidestep the problem of inverting
`strconv.Quote` in Bash.
The existing configuration process for ImagePolicyWebhook has been
reworked to make it play nicely with ValidatingAdmissionWebhook under
`WEBHOOK_GKE_EXEC_AUTH=true`.
* It originally placed the ImagePolicyWebhook configuration object
at the top-level of the file specified by
`--admission-control-config-file`. I can't see why this worked;
it must have been hitting some sort of lucky path through the
various config file loading mechanisms. Now, it places its
configuration in a sub-field of that file, which is shared among
all admission control plugins.
* It mounted its various config files read-write. I reviewed the
code and couldn't see why it was necessary, so I moved the config
files into the existing read-only mount at `/etc/srv/kubernetes`.
* It now checks that all the configuration values it requires have
been provided.
Co-authored-by: Mike Danese <mikedanese@google.com>
Co-authored-by: Taahir Ahmed <taahm@google.com>
using `local -r` will blow up, example output:
```
/home/kubernetes/bin/configure.sh: line 388: local: manifest_name: readonly variable
```
Change-Id: Id379180803d44dd9c7ac0da41c1cd56de0fe54a4
Split arguments to be passed to cluster autoscaler binary,
so each argument is passed separately.
This is preparatory work for migrating CA to disroless base image
and passing multiple arguments together does not work if CA is
not wrapped around with shell script
Change-Id: I26b5a764d2a12079c7f4ed6633ccabf8d623e232
* Touched containers: kube-apiserver, kube-scheduler,
kube-controller-manager.
* Remove the shell dependencies when upstart the containers.
* Reformat the command parameters to ["Exec", "Param1", "Param2"]
* Touched containers: kube-apiserver, kube-scheduler,
kube-controller-manager.
* Remove the shell dependencies when upstart the containers.
* Reformat the command parameters to ["Exec", "Param1", "Param2"]
This change renames the '--experimental-encryption-provider-config'
flag to '--encryption-provider-config'. The old flag is accepted but
generates a warning.
In 1.14, we will drop support for '--experimental-encryption-provider-config'
entirely.
Co-authored-by: Stanislav Laznicka <slaznick@redhat.com>
This change includes the yaml files and gce startup script changes
to run this addon. It is disabled by default, can be enabled by setting
KUBE_ENABLE_NODELOCAL_DNS=true
An ip address is required for the cache instance to listen for
requests on, default is a link local ip address of value 169.254.25.10
addressed review comments, updated image location
Picked a different prometheus port so stats port is not same as the
coredns deployment
Removed the nodelocaldns-ready label.
Set memory limit to 30Mi
Automatic merge from submit-queue (batch tested with PRs 67950, 68195). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Remove e2e-image-puller
**What this PR does / why we need it**:
A long time ago, We added the image prepulling as a workaround due to
the overwhelming amount of flake caused by pulling during the tests.
This functionality has been broken for a while now when we switched to a
COS image where mounting `docker` binary into `busybox` stopped working.
So we just have dead code we should clean up.
Change-Id: I538171a5c1d9361eee7f9e0a99655b88b1721e3e
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#63355
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 68087, 68256, 64621, 68299, 68296). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
gce: use getrandom instead of urandom for on node rng
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 68161, 68023, 67909, 67955, 67731). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Register RuntimeClass CRD as an addon
**What this PR does / why we need it**:
Register the RuntimeClass CRD when the RuntimeClass feature gate is enabled. This is done in through the addon manager.
This is an alternative approach to https://github.com/kubernetes/kubernetes/pull/67924
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
For https://github.com/kubernetes/features/issues/585
**Release note**:
Covered by #67737
```release-note
NONE
```
/sig node
/kind feature
/priority important-soon
/milestone v1.12
In the context, our urandoms where generally safe, however getrandom has
built in invariants around entropy pool initialization, making getrandom
safe in all contexts. This should protect us from cryptopasta errors or
weird entropy issues.
Automatic merge from submit-queue (batch tested with PRs 67736, 68123, 68138). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Added support to get clusters in gce cloud provider.
**What this PR does / why we need it**:
Implemented the call to get all cluster objects in a zone for a project.
Also added code to allow the container api to be set in the gce.conf
file.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
A long time ago, We added the image prepulling as a workaround due to
the overwhelming amount of flake caused by pulling during the tests.
This functionality has been broken for a while now when we switched to a
COS image where mounting `docker` binary into `busybox` stopped working.
So we just have dead code we should clean up.
Change-Id: I538171a5c1d9361eee7f9e0a99655b88b1721e3e
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Support extra prune resources in kube-addon-manager.
The default prune whitelist resources in https://github.com/kubernetes/kubernetes/blob/master/pkg/kubectl/cmd/apply.go#L531 are sometimes not enough.
One example is that when we remove an admission webhook running as an addon pod, after we remove the addon yaml file, the admission webhook pod will be pruned, but the `MutatingWebhookConfiguration`/`ValidationWebhookConfiguration` won't... If the webhook failure policy is `Fail`, this will break the cluster, and users can't create new pods anymore.
It would be good to at least make this configurable, so that users and vendors can configure it based on their requirement.
This PR keeps the default prune resource list exactly the same with before, just makes it possible to add extra ones.
@dchen1107 @MrHohn @kubernetes/sig-cluster-lifecycle-pr-reviews @kubernetes/sig-gcp-pr-reviews
Signed-off-by: Lantao Liu <lantaol@google.com>
**Release note**:
```release-note
Support extra `--prune-whitelist` resources in kube-addon-manager.
```
Automatic merge from submit-queue (batch tested with PRs 64283, 67910, 67803, 68100). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Adding GCE node termination handler as an optional addon.
This step is a pre-requisite for auto-deploying that addon in GKE
cc @mikedanese
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Promote AdvancedAuditing to GA
**What this PR does / why we need it**:
Removes deprecated legacy code used for basic audit logging in favor of advanced audit logging.
```release-note
Promote AdvancedAuditing to GA, replacing the previous (legacy) audit logging mechanisms.
```
Automatic merge from submit-queue (batch tested with PRs 67745, 67432, 67569, 67825, 67943). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.
Add flag for disabling prometheus-to-sd only for daemon sets
```release-note
NONE
```
The requested Service Protocol is checked against the supported protocols of GCE Internal LB. The supported protocols are TCP and UDP.
SCTP is not supported by OpenStack LBaaS. If SCTP is requested in a Service with type=LoadBalancer, the request is rejected. Comment style is also corrected.
SCTP is not allowed for LoadBalancer Service and for HostPort. Kube-proxy can be configured not to start listening on the host port for SCTP: see the new SCTPUserSpaceNode parameter
changed the vendor github.com/nokia/sctp to github.com/ishidawataru/sctp. I.e. from now on we use the upstream version.
netexec.go compilation fixed. Various test cases fixed
SCTP related conformance tests removed. Netexec's pod definition and Dockerfile are updated to expose the new SCTP port(8082)
SCTP related e2e test cases are removed as the e2e test systems do not support SCTP
sctp related firewall config is removed from cluster/gce/util.sh. Variable name sctp_addr is corrected to sctpAddr in pkg/proxy/ipvs/proxier.go
cluster/gce/util.sh is copied from master
Implemented the call to get all cluster objects in a zone for a project.
Also added code to allow the container api to be set in the gce.conf
file.
Requested fix for @lavalamp. Fixed GetClusters to be GetManagedClusters.
Leaving ListClusters as ListClusters as it is part of the Cloud Clusters
interface, despite also being a "managed" call.
Remove copy pasta :D
Fixed method variable name.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
remove rescheduler since scheduling DS pods by default scheduler is moving to beta
**What this PR does / why we need it**:
remove rescheduler since scheduling DS pods by default scheduler is moving to beta
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#64725
**Special notes for your reviewer**:
**Release note**:
```release-note
Remove rescheduler since scheduling DS pods by default scheduler is moving to beta.
```
Automatic merge from submit-queue (batch tested with PRs 66177, 66185, 67136, 67157, 65065). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update configure-helper.sh to support heapster resource optimizations
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 66177, 66185, 67136, 67157, 65065). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Templatize the scaling policy for metrics-server
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
GCE: Add OWNERS for image (gci) configuration
**What this PR does / why we need it**:
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 59030, 64666, 66251, 66485, 66813). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
A large set of improvements to the Stackdriver components.
**What this PR does / why we need it**:
This PR delivers a large set of improvements for both the Stackdriver Logging agent and the Stackdriver Metadata agent.
**Release note**:
```release-note
Metadata Agent Improvements
Bump metadata agent version to 0.2-0.0.21-1.
Expand the metadata agent's access to all API groups.
Remove metadata agent config maps in favor of command line flags.
Update the metadata agent's liveness probe to a new /healthz handler.
Logging Agent Improvements
Bump logging agent version to 0.2-1.5.33-1-k8s-1.
Appropriately set log severity for k8s_container.
Fix detect exceptions plugin to analyze message field instead of log field.
Fix detect exceptions plugin to analyze streams based on local resource id.
Disable the metadata agent for monitored resource construction in logging.
Disable timestamp adjustment in logs to optimize performance.
Reduce logging agent buffer chunk limit to 512k to optimize performance.
```
Metadata Agent Improvements
Bump metadata agent version to 0.2-0.0.21-1.
Expand the metadata agent's access to all API groups.
Remove metadata agent config maps in favor of command line flags.
Update the metadata agent's liveness probe to a new /healthz handler.
Logging Agent Improvements
Bump logging agent version to 0.2-1.5.33-1-k8s-1.
Appropriately set log severity for k8s_container.
Fix detect exceptions plugin to analyze message field instead of log field.
Fix detect exceptions plugin to analyze streams based on local resource id.
Disable the metadata agent for monitored resource construction in logging.
Disable timestamp adjustment in logs to optimize performance.
Reduce logging agent buffer chunk limit to 512k to optimize performance.
In addition to the shell script changes the heapster yaml has been
updated to use addon resizer 1.8.3 for the heapster-nanny. Addon resizer 1.8.3
is being used to take advantage of the new minClusterSize flag. Note this is a
no-op change. The values specified for heapster-nanny reflect the current
configuration used with version 1.8.2.
Automatic merge from submit-queue (batch tested with PRs 66152, 66406, 66218, 66278, 65660). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Update crictl to v1.11.1.
Update `crictl` to v1.11.1 to fix several bugs. Release note: https://github.com/kubernetes-incubator/cri-tools/releases/tag/v1.11.1
@kubernetes/sig-node-pr-reviews @kubernetes/sig-cluster-lifecycle-pr-reviews
@kubernetes/sig-gcp-pr-reviews
Signed-off-by: Lantao Liu <lantaol@google.com>
```release-note
Update crictl to v1.11.1.
```
The container used by build/run.sh doesn't necessarily have an entry in
/etc/passwd for the host user's uid, and this missing data causes
`whoami` to fail.
Switch `whoami` to `id -un` to fall back to the uid if the /etc/passwd
entry is missing.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
stop using deprecated --etcd-quorum-read
etcd-quorum-read was deprecated, but it is still used.
This pr stops using it.
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 65319, 64513, 65474, 65601, 65634). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Allow custom manifests in GCP master setup
Add a hook in GCE setup script to allow using custom manifests on master, so we can decouple some GKE changes from k8s. Note that this PR just adds a hook there is no change in default behavior.
```release-note
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Set pod priority on kube-proxy manifest by default
**What this PR does / why we need it**:
Follow up of https://github.com/kubernetes/kubernetes/pull/59237, set pod priority on kube-proxy by default and remove the unneeded logic in startup script.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #NONE
**Special notes for your reviewer**:
/assign @bsalamat @bowei
cc @tanshanshan
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add /home/kubernetes/bin into sudoers path, so that `sudo crictl` works.
Add `/home/kubernetes/bin` to sudoers path, so that user can call `sudo crictl` directly.
Without this fix, user has to either use the full path `sudo /home/kubernetes/bin/crictl` or switch to root, which is not a good user experience.
/cc @yujuhong @feiskyer @filbranden @kubernetes/sig-node-pr-reviews @kubernetes/sig-gcp-pr-reviews
**Release note**:
```release-note
User can now use `sudo crictl` on GCE cluster.
```
Automatic merge from submit-queue (batch tested with PRs 65024, 65287, 65345, 64693, 64941). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add a helper function to customize K8s addon yamls and use it to customize Calico addons on GKE
**What this PR does / why we need it**:
Allow customizing Calico addon in GCP. With #65022, this allows us to do a couple of things:, e.g., run Calico 3.0+ on GCP, use a non-default MTU etc.
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#65045, #65067
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 65123, 65176, 65139, 65084, 65056). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Pass cluster_location argument to Heapster
**What this PR does / why we need it**:
Fixes Stackdriver monitoring on GCE clusters where cluster location is not a single zone, for example regional clusters.
**Release note**:
```release-note
Pass cluster_location argument to Heapster
```
Automatic merge from submit-queue (batch tested with PRs 64503, 64903, 64643, 64987). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Create system:cluster-autoscaler account & role and introduce it to C…
**What this PR does / why we need it**:
This PR adds cluster-autoscaler ClusterRole & binding, to be used by the Cluster Autoscaler (kubernetes/autoscaler repository).
It also updates GCE scripts to make CA use the cluster-autoscaler user account.
User account instead of Service account is chosen to be more in line with kube-scheduler.
**Which issue(s) this PR fixes**:
Fixes [issue 383](https://github.com/kubernetes/autoscaler/issues/383) from kubernetes/autoscaler.
**Special notes for your reviewer**:
This PR might be treated as a security fix since prior to it CA on GCE was using system:cluster-admin account, assumed due to default handling of unsecured & unauthenticated traffic over plain HTTP.
**Release note**:
```release-note
A cluster-autoscaler ClusterRole is added to cover only the functionality required by Cluster Autoscaler and avoid abusing system:cluster-admin role.
action required: Cloud providers other than GCE might want to update their deployments or sample yaml files to reuse the role created via add-on.
```
Automatic merge from submit-queue (batch tested with PRs 63453, 64592, 64482, 64618, 64661). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Revert "Remove rescheduler and corresponding tests from master"
Reverts kubernetes/kubernetes#64364
After discussing with @bsalamat on how DS controllers(ref: https://github.com/kubernetes/kubernetes/pull/63223#discussion_r192277527) cannot create pods if the cluster is at capacity and they have to rely on rescheduler for making some space, we thought it is better to
- Bring rescheduler back.
- Make rescheduler priority aware.
- If cluster is full and if **only** DS controller is not able to create pods, let rescheduler be run and let it evict some pods which have less priority.
- The DS controller pods will be scheduled now.
So, I am reverting this PR now. Step 2, 3 above are going to be in rescheduler.
/cc @bsalamat @aveshagarwal @k82cn
Please let me know your thoughts on this.
```release-note
Revert #64364 to resurrect rescheduler. More info https://github.com/kubernetes/kubernetes/issues/64725 :)
```
Automatic merge from submit-queue (batch tested with PRs 61610, 64591, 58143, 63929). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add netd as an addon for GCP
**What this PR does / why we need it**:
Add netd as an addon for GKE.
The PR will add setup functions and var to help deploy netd daemon on GKE.
Please checkout more detail for netd at https://github.com/GoogleCloudPlatform/netd
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 61610, 64591, 58143, 63929). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Create CoreDNS and kube-dns folders
**What this PR does / why we need it**:
Separate the CoreDNS and kube-dns manifests by creating their own folders (dns/coredns and dns/kube-dns)
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes#61435
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
cc @MrHohn
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Add ipvs module loading logic to gce scripts
**What this PR does / why we need it**:
Add ipvs module loading logic to gce scripts.
Fixes a part of #59402.
/cc @Lion-Wei
/assign @roberthbailey @m1093782566
**Release note**:
```release-note
None
```
Automatic merge from submit-queue (batch tested with PRs 63434, 64172, 63975, 64180, 63755). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Dump Stack when docker fails on healthcheck
Save stack dump of docker daemon in order to be able to
investigate why docker daemon was unresposive to `docker ps`
See https://github.com/moby/moby/blob/master/daemon/daemon.go on
how docker sets up a trap for SIGUSR1 with `setupDumpStackTrap()`
**What this PR does / why we need it**:
This allows us to investigate why docker daemon was unresponsive to "docker ps" command.
**Special notes for your reviewer**:
Manually tested on Ubuntu and COS.
**Release note**:
```release-note
NONE
```
Send SIGUSR1 to dockerd to save stack dump of docker daemon
in order to be able to investigate why docker daemon was
unresposive to health check done by `docker ps`.
See https://github.com/moby/moby/blob/master/daemon/daemon.go on
how docker sets up a trap for SIGUSR1 with `setupDumpStackTrap()`
Automatic merge from submit-queue (batch tested with PRs 63969, 63902, 63689, 63973, 63978). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Reuse existing CA cert path for kubelet certs
**What this PR does / why we need it**: configure-helper.sh already knows the path to CA cert, re-use that to avoid typos.
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 63569, 63918, 63980, 63295, 63989). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
New event exporter config with support for new stackdriver resources
New event exporter, with support for use new and old stackdriver resource model.
This should also be cherry-picked to release-1.10 branch, as all fluentd-gcp components support new and stackdriver resource model.
```release-note
Update event-exporter to version v0.2.0 that supports old (gke_container/gce_instance) and new (k8s_container/k8s_node/k8s_pod) stackdriver resources.
```
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
gce: Prefer MASTER_ADVERTISE_ADDRESS in apiserver setup
MASTER_ADVERTISE_ADDRESS is used to set the --advertise-address flag
for the apiserver. It's useful for running the apiserver behind a load
balancer.
However, if PROJECT_ID, TOKEN_URL, TOKEN_BODY, and NODE_NETWORK are
all set, the GCE VM's external IP address will be fetched and used
instead and MASTER_ADVERTISE_ADDRESS will be ignored.
Change this behavior so that MASTER_ADVERTISE_ADDRESS takes precedence
because it's more specific. We still fall back to using the VM's
external IP address if the other variables are set.
Also: Move the setting of --ssh-user and --ssh-keyfile based on
PROXY_SSH_USER) to a top-level block because this is common to all
codepaths.
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 63167, 63357). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.
Install and use crictl in gce kube-up.sh
Download and use crictl in gce kube-up.sh.
This PR:
1. Downloads crictl `v1.0.0-beta.0` onto the node, which supports CRI v1alpha2. We'll upgrade it to `v1.0.0-beta.1` soon after the release is cut.
2. Change `kube-docker-monitor` to `kube-container-runtime-monitor`, and let it use `crictl` to do health monitoring.
3. Change `e2e-image-puller` to use `crictl`. Because of https://github.com/kubernetes/kubernetes/issues/63355, it doesn't work now. But in `crictl v1.0.0-beta.1`, we are going to statically link it, and the `e2e-image-puller` should work again.
4. Use `systemctl kill --kill-who=main` instead of `pkill`, the reason is that:
a. `pkill docker` will send `SIGTERM` to all processes including `dockerd`, `docker-containerd`, `docker-containerd-shim`. This is not a problem for Docker 17.03 CE, because `containerd-shim` in containerd 0.2.x doesn't exit with SIGERM (see [code](https://github.com/containerd/containerd/blob/v0.2.x/containerd-shim/main.go#L123)). However, `containerd-shim` in containerd 1.0+ does exit with SIGTERM (see [code](https://github.com/containerd/containerd/blob/master/cmd/containerd-shim/main_unix.go#L200)). This means that `pkill docker` and `pkill containerd` will kill all shim processes for Docker 17.11+ and containerd 1.0+.
b. We can use `pkill -x` instead. However, docker systemd service name is `docker`, but daemon process name is `dockerd`. We have to introduce another environment variable to specify "daemon process name". Given so, it seems easier to just use `systemctl kill` which only requires systemd service name. `systemctl kill --kill-who=main` will make sure only main process receives SIGTERM.
Signed-off-by: Lantao Liu <lantaol@google.com>
/cc @filbranden @yujuhong @feiskyer @mrunalp @kubernetes/sig-node-pr-reviews @kubernetes/sig-cluster-lifecycle-pr-reviews
**Release note**:
```release-note
Kubernetes cluster on GCE have crictl installed now. Users can use it to help debug their node. The documentation of crictl can be found https://github.com/kubernetes-incubator/cri-tools/blob/master/docs/crictl.md.
```
MASTER_ADVERTISE_ADDRESS is used to set the --advertise-address flag
for the apiserver. It's useful for running the apiserver behind a load
balancer.
However, if PROJECT_ID, TOKEN_URL, TOKEN_BODY, and NODE_NETWORK are
all set, the GCE VM's external IP address will be fetched and used
instead and MASTER_ADVERTISE_ADDRESS will be ignored.
Change this behavior so that MASTER_ADVERTISE_ADDRESS takes precedence
because it's more specific. We still fall back to using the VM's
external IP address if the other variables are set.
Also: Pass --ssh-user and --ssh-keyfile flags if both PROXY_SSH_USER
and MASTER_ADVERTISE_ADDRESS is set.