Automatic merge from submit-queue (batch tested with PRs 49081, 49318, 49219, 48989, 48486)
Better message if we dont find appropriate BlockStorage API
**What this PR does / why we need it**:
With latest devstack, v1 and v2 are DEPRECATED and v3 is marked
as CURRENT. So we fail to attach the disk, the error message is
shown when one does "kubectl describe pod" but the operator has
to dig into find the problem.
So log a better message if we can't find the appropriate version
of the API that we support with an explicit error message that
the operator can see how to fix the situation.
Note support for v3 block storage API is being added to gophercloud
and will take a bit of time before we can support it.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Warn if aws has no cluster id provided
**What this PR does / why we need it**:
we info log a message when no cluster id is provided that should be a warning given its impact.
fixes https://github.com/kubernetes/kubernetes/issues/49568
**Release note**:
```release-note
NONE
```
With latest devstack, v1 and v2 are DEPRECATED and v3 is marked
as CURRENT. So we fail to attach the disk, the error message is
shown when one does "kubectl describe pod" but the operator has
to dig into find the problem.
So log a better message if we can't find the appropriate version
of the API that we support with an explicit error message that
the operator can see how to fix the situation.
Note support for v3 block storage API is being added to gophercloud
and will take a bit of time before we can support it.
Automatic merge from submit-queue (batch tested with PRs 49420, 49296, 49299, 49371, 46514)
Avoid looking up instance id until we need it
**What this PR does / why we need it**:
currently kube-controller-manager cannot run outside of a vm started
by openstack (with --cloud-provider=openstack params). We try to read
the instance id from the metadata provider or the config drive or the
file location only when we really need it. In the normal scenario, the
controller-manager uses the node name to get the instance id.
41541910e1/pkg/volume/cinder/attacher.go (L149)
The localInstanceID is currently used only in the test case, so let
us not read it until it is really needed.
So let's try to find the instance-id only when we need it.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Bump up gce minNodesHealthCheckVersion due to known issues
**What this PR does / why we need it**: There are some known issues in previous 1.7 versions causing kube-proxy not correctly responding healthz traffic.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: From #49263.
**Special notes for your reviewer**:
/assign @nicksardo @freehan
cc @bowei @thockin
**Release note**:
```release-note
GCE Cloud Provider: New created LoadBalancer type Service will have health checks for nodes by default if all nodes have version >= v1.7.2.
```
Automatic merge from submit-queue (batch tested with PRs 49107, 47177, 49234, 49224, 49227)
Added logging to AWS api calls. #46969
Additionally logging of when AWS API calls start and end to help diagnose problems with kubelet on cloud provider nodes not reporting node status periodically. There's some inconsistency in logging around this PR we should discuss.
IMO, the API logging should be at a higher level than most other types of logging as you would probably only want it in limited instances. For most cases that is easy enough to do, but there are some calls which have some logging around them already, namely in the instance groups. My preference would be to keep the existing logging as it and just add the new API logs around the API call.
currently kube-controller-manager cannot run outside of a vm started
by openstack (with --cloud-provider=openstack params). We try to read
the instance id from the metadata provider or the config drive or the
file location only when we really need it. In the normal scenario, the
controller-manager uses the node name to get the instance id.
41541910e1/pkg/volume/cinder/attacher.go (L149)
The localInstanceID is currently used only in the test case, so let
us not read it until it is really needed.
Automatic merge from submit-queue (batch tested with PRs 49276, 49235)
Don't fail fast if LoadBalancer section is missing
**What this PR does / why we need it**:
We should allow scenarios where cinder can be used even if the
operator does not want to use the openstack load balancer. So
let's warn in the beginning if subnet-id is missing but fail only
if they try to use the load balancer
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
We should allow scenarios where cinder can be used even if the
operator does not want to use the openstack load balancer. So
let's warn in the beginning if subnet-id is missing but fail only
if they try to use the load balancer
Automatic merge from submit-queue
Tolerate Flavor information for computing instance type
**What this PR does / why we need it**:
Current devstack seems to return "id", and an upcoming change using
nova's microversion will be returning "original_name":
https://blueprints.launchpad.net/nova/+spec/instance-flavor-api
So let's just inspect what is present and use that to figure out
the instance type.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Fix on-premises term in error string and comments for aws provider
**What this PR does / why we need it**: fix for correct terminology of "on-premises" over "on-premise"
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: n/a
**Special notes for your reviewer**: Updated error string while doing a scrub for the incorrect term in the docs (kubernetes/kubernetes.github.io#4413).
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49083, 45540, 46862)
Add extra logging to azure API get calls
**What this PR does / why we need it**:
This PR adds extra logging for external calls to the Azure API, specifically get calls.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
This will help troubleshoot problems arising from the usage of this cloudprovider. For example, it looks like #43516 is caused by a call to the cloudprovider taking too much time.
Automatic merge from submit-queue (batch tested with PRs 49218, 48253, 48967, 48460, 49230)
additional backoff in azure cloudprovider
Fixes#48971
**What this PR does / why we need it**:
We want to be able to opt in to backoff retry logic for kubelet-originating request behavior: node IP address resolution and node load balancer pool membership enforcement.
**Special notes for your reviewer**:
The use-case for this is azure cloudprovider clusters with large node counts, especially during cluster installation, or other scenarios when lots of nodes come online at once and attempt to register all resources with the backend API. To allow clusters at scale more control over the API request rate in-cluster, backoff config has the ability to meaningful slow down this rate, when appropriate.
**Release note**:
```additional backoff in azure cloudprovider
```
Current devstack seems to return "id", and an upcoming change using
nova's microversion will be returning "original_name":
https://blueprints.launchpad.net/nova/+spec/instance-flavor-api
So let's just inspect what is present and use that to figure out
the instance type.
Automatic merge from submit-queue (batch tested with PRs 48702, 48965, 48740, 48974, 48232)
Rackspace for cloud-controller-manager
This implements the NodeAddressesByProviderID and InstanceTypeByProviderID
methods used by the cloud-controller-manager to the RackSpace provider.
The instance type returned is the flavor name, for consistency
InstanceType has been implemented too returning the same value.
This is part of #47257 cc @wlan0
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 48890, 46893, 48872, 48896)
Fix the order of deletion
1. EnsureLoadBalancer can't delete pool without deleting members,
just let EnsureLoadBalancerDeleted do it.
2. Add some friendly error message
**Release note**:
```release-note
NONE
```
EnsureHostInPool() submits a GET to azure API for VM info. We’re seeing this on agent node kubelets and would like to enable configurable backoff engagement for 4xx responses to be able to slow down the rate of reconciliation, when appropriate.
Automatic merge from submit-queue (batch tested with PRs 47066, 48892, 48933, 48854, 48894)
azure: msi: add managed identity field, logic
**What this PR does / why we need it**: Enables managed service identity support for the Azure cloudprovider. "Managed Service Identity" allows us to ask the Azure Compute infra to provision an identity for the VM. Users can then retrieve the identity and assign it RBAC permissions to talk to Azure ARM APIs for the purpose of the cloudprovider needs.
Per the commit text:
```
The azure cloudprovider will now use the Managed Service Identity
to retrieve access tokens for the Azure ARM APIs, rather than
requiring hard-coded, user-specified credentials.
```
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: n/a
**Special notes for your reviewer**: none
**Release note**:
```release-note
azure: support retrieving access tokens via managed identity extension
```
cc: @brendandburns @jdumars @anhowe
Automatic merge from submit-queue
remove some people from OWNERS so they don't get reviews anymore
These are googlers who don't work on the project anymore but are still
getting reviews assigned to them:
- @bprashanth
- @rjnagal
- @vmarmol
Automatic merge from submit-queue
Azure PD (Managed/Blob)
This is exactly the same code as this [PR](https://github.com/kubernetes/kubernetes/pull/41950). It has a clean set of generated items. We created a separate PR to accelerate the accept/merge the PR
CC @colemickens
CC @brendandburns
**What this PR does / why we need it**:
1. Adds K8S support for Azure Managed Disks.
2. Adds support for dedicated blob disks (1:1 to storage account) in addition to shared blob disks (n:1 to storage account).
3. Automatically manages the underlying storage accounts. New storage accounts are created at 50% utilization. Max is 100 disks, 60 disks per storage account.
2. Addresses the current issues with Blob Disks:
..* Significantly faster attach process. Disks are now usually available for pods on nodes under 30 sec if formatted, under a min if not formatted.
..* Adds support to move disks between nodes.
..* Adds consistent attach/detach behavior, checks if the disk is leased/attached on a different node before attempting to attach to target nodes.
..* Fixes a random hang behavior on Azure VMs during mount/format (for both blob + managed disks).
..* Fixes a potential conflict by avoiding the use of disk names for mount paths. The new plugin uses hashed disk uri for mount path.
The existing AzureDisk is used as is. Additional "kind" property was added allowing the user to decide if the pd will be shared, dedicated or managed (Azure Managed Disks are used).
Due to the change in mounting paths, existing PDs need to be recreated as PV or PVCs on the new plugin.
The azure cloudprovider will now use the Managed Service Identity
to retrieve access tokens for the Azure ARM APIs, rather than
requiring hard-coded, user-specified credentials.
Automatic merge from submit-queue (batch tested with PRs 48555, 48849)
GCE: Fix panic when service loadbalancer has static IP address
Fixes#48848
```release-note
Fix service controller crash loop when Service with GCP LoadBalancer uses static IP (#48848, @nicksardo)
```
Automatic merge from submit-queue (batch tested with PRs 48781, 48817, 48830, 48829, 48053)
vSphere for cloud-controller-manager
**What this PR does / why we need it**:
This is to implement the `NodeAddressesByProviderID` and `InstanceTypeByProviderID` methods for cloud-controller-manager for vSphere cloud provider.
Currently vSphere cloud provider only supports VMs in the same folder.
Thus `NodeAddressesByProviderID` is similar to `NodeAddresses` with a simple ProviderID to NodeName translation.
`InstanceTypeByProviderID` returns nil as same as `InstanceType`.
**Which issue this PR fixes**
Part of Issue https://github.com/kubernetes/kubernetes/issues/47257
**Release note**:
```NONE
```
Automatic merge from submit-queue (batch tested with PRs 48594, 47042, 48801, 48641, 48243)
Add initial support for the Azure instance metadata service.
Part of fixing #46632
@colemickens @rootfs @jdumars @kris-nova
Automatic merge from submit-queue (batch tested with PRs 48594, 47042, 48801, 48641, 48243)
Fix panic of DeleteRoute()
Fix#48800
It should be 'addr_pairs', not 'routes'.
**Release note**:
```release-note
NONE
```