Windows Kernel now exposes "Internal Load Balancing"
using VFP (Virtual Filtering Platform) part of Virtual Switch. An inbuild
windows service HNS (Host Networking Service) acts as interface to program
the VFP. VFP is synonymous to iptables in functionality. HNS uses json based
data as input.
With the help of the interface available in github.com/Microsoft/hcsshim,
these APIs are exposed to the world in github to program HNS and use
the feature.
*** More info about the changes in this PR ***
(1) For every endpoint available in the system, an HNS Endpoint is added
(1.a) for local endpoints, a local HNS Endpoint would already exist, as part of
container creation.
(1.b) For all remote endpoints, a remote HNS Endpoint is created via HNS
(2) For every Service, a HNS ILB LoadBalancer is added referring the endpoints
created in (1)
Sample Input to HNS:
{
"Policies": [
{
"ExternalPort": 80,
"InternalPort": 80,
"Protocol": 6,
"Type": "ELB",
"VIPs": [
"11.0.98.129"
]
}
],
"References": [
"/endpoints/ca8b877b-ab90-499a-bc0e-7d736c425632",
"/endpoints/ee0ef08b-8434-4f8b-b748-393884e77465"
]
}
(2-a) This is done for Cluster IP, LoadBalancer Ingress IP, NodePort, External IP
Following the regular service and endpoint updates,
the HNS is notified of the updates and the system is kept in sync.
Automatic merge from submit-queue
Improve description for --masquerade-all and --cluster-cidr flags
**What this PR does / why we need it**:
Improves the help text for the kube-proxy's `--masquerade-all` and `--cluster-cidr` flags, which previously were vague and confusing.
Fixes https://github.com/kubernetes/kubernetes/issues/47213
```release-note
NONE
```
Automatic merge from submit-queue
Add test for kube-proxy running with "--cleanup-iptables=true"
**What this PR does / why we need it**:
Add test to prevent such kube-proxy panic to happen again.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#48177
**Special notes for your reviewer**:
Forgot to add this in last PR #48183. Should we also add this to v1.7 milestone?
/cc @ncdc @dchen1107
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49992, 48861, 49267, 49356, 49886)
Emit event and retry when fail to start healthz server on kube-proxy
**What this PR does / why we need it**: Enhance kube-proxy's logic when fail to start healthz server.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: From #49263.
**Special notes for your reviewer**:
/assign @thockin @nicksardo @bowei
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 49058, 49072, 49137, 49182, 49045)
check for nil value in interface for proxier health
golang allows for a non-nil interface to have a nil value (not type). This results in an NPE at runtime.
@sttts remember that bit about go? Trivia becomes real :(
Automatic merge from submit-queue (batch tested with PRs 48043, 48200, 49139, 36238, 49130)
expose method to allow externally setting defaults on an external type
The options are an exposed type. This allows you to set the defaults on them.
@derekwaynecarr who normally owns this bit?
Automatic merge from submit-queue
fix#46039: iptables proxier need use '--bind-address' if set
**What this PR does / why we need it**:
iptables proxier need use '--bind-address' if set
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes#46039
**Special notes for your reviewer**:
**Release note**:
```release-note
```
change import of client-go/api/helper to kubernetes/api/helper
remove unnecessary use of client-go/api.registry
change use of client-go/pkg/util to kubernetes/pkg/util
remove dependency on client-go/pkg/apis/extensions
remove unnecessary invocation of k8s.io/client-go/extension/intsall
change use of k8s.io/client-go/pkg/apis/authentication to v1
Automatic merge from submit-queue (batch tested with PRs 44727, 45409, 44968, 45122, 45493)
Separate healthz server from metrics server in kube-proxy
From #14661, proposal is on kubernetes/community#552.
Couple bullet points as in commit:
- /healthz will be served on 0.0.0.0:10256 by default.
- /metrics and /proxyMode will be served on port 10249 as before.
- Healthz handler will verify timestamp in iptables mode.
/assign @nicksardo @bowei @thockin
**Release note**:
```release-note
NONE
```
- /healthz will be served on 0.0.0.0:10256 by default.
- /metrics and /proxyMode will be served on port 10249
as before.
- Healthz handler will verify timestamp in iptables mode.
Automatic merge from submit-queue
Optimize getProxyMode() to get proxy mode
1. getProxyMode() does not need nodeGetter args after remove
proxy-mode annotation.
2. we should get error when the version of iptables less than
MinVersion.
Use shared informers instead of creating local controllers/reflectors
for the proxy's endpoints and service configs. This allows downstream
integrators to pass in preexisting shared informers to save on memory &
cpu usage.
This also enables the cache mutation detector for kube-proxy for those
presubmit jobs that already turn it on.
Automatic merge from submit-queue
Extensible Userspace Proxy
This PR refactors the userspace proxy to allow for custom proxy socket implementations.
It changes the the ProxySocket interface to ensure that other packages can properly implement it (making sure all arguments are publicly exposed types, etc), and adds in a mechanism for an implementation to create an instance of the userspace proxy with a non-standard ProxySocket.
Custom ProxySockets are useful to inject additional logic into the actual proxying. For example, our idling proxier uses a custom proxy socket to hold connections and notify the cluster that idled scalable resources need to be woken up.
Also-Authored-By: Ben Bennett bbennett@redhat.com
Automatic merge from submit-queue (batch tested with PRs 42200, 39535, 41708, 41487, 41335)
Update kube-proxy support for Windows
**What this PR does / why we need it**:
The kube-proxy is built upon the sophisticated iptables NAT rules. Windows does not have an equivalent capability. This introduces a change to the architecture of the user space mode of the Windows version of kube-proxy to match the capabilities of Windows.
The proxy is organized around service ports and portals. For each service a service port is created and then a portal, or iptables NAT rule, is opened for each service ip, external ip, node port, and ingress ip. This PR merges the service port and portal into a single concept of a "ServicePortPortal" where there is one connection opened for each of service IP, external ip, node port, and ingress IP.
This PR only affects the Windows kube-proxy. It is important for the Windows kube-proxy because it removes the limited portproxy rule and RRAS service and enables full tcp/udp capability to services.
**Special notes for your reviewer**:
**Release note**:
```
Add tcp/udp userspace proxy support for Windows.
```
This changes the userspace proxy so that it cleans up its conntrack
settings when a service is removed (as the iptables proxy already
does). This could theoretically cause problems when a UDP service
as deleted and recreated quickly (with the same IP address). As
long as packets from the same UDP source IP and port were going to
the same destination IP and port, the the conntrack would apply and
the packets would be sent to the old destination.
This is astronomically unlikely if you did not specify the IP address
to use in the service, and even then, only happens with an "established"
UDP connection. However, in cases where a service could be "switched"
between using the iptables proxy and the userspace proxy, this case
becomes much more frequent.
Add flags to control max connections (set to 256k vs 64k default) and TCP
established timeout (set to 1 day vs 5 day default). Flags can be set to 0 to
mean "don't change it".
This is only set at startup, and not wrapped in a rectifier loop.
Tested manually.
Default to hardcodes for components that had them, and 5.0 qps, 10 burst
for those that relied on client defaults
Unclear if maybe it'd be better to just assume these are set as part of
the incoming kubeconfig. For now just exposing them as flags since it's
easier for me to manually tweak.
This changes the --legacy-userspace-proxy flag to be a string flag
--proxy-mode. If specified, the flag will be respected ('userspace' and
'iptables' being valid values). If left blank (default) we will choose the
"best". best means userspace for now UNLESS the user adds an annotation
(net.experimental.kubernetes.io/proxy-mode) to their node, in which case we
will try to use that.
This allows people to try it on a single machine without fear of global failure
and without it getting rolled back on reboots. It is a poor-man's config blob.
Check to make sure there is not an alphanumeric character immeditely
before or after the 'flag'. It there is an alphanumeric character then
this is obviously not actually the flag we care about. For example if
the project declares a flag "valid-name" but the regex finds something
like "invalid_name" we should not match. Clearly this "invalid_name" is
not actually a wrong usage of the "valid-name" flag.
1. Add HostnameOverride parameter for kube-proxy as kubelet did.
2. Add Birthcry event for kube-proxy.
3. Because record event need apiserver client, adjust order of code partly.
pflag can handle IP addresses so use the pflag code instead of doing it
ourselves. This means our code just uses net.IP and we don't have all of
the useless casting back and forth!
Moves the userspace code in proxy to a sub-package and adds the
ProxyProvider interface.
This is in preparation for landing an implementation of
https://github.com/GoogleCloudPlatform/kubernetes/issues/3760, which
will mostly be in another sub package for iptables.
--master option still supported.
--kubeconfig option added to kube-proxy,
kube-scheduler, and kube-controller-manager
binaries.
Kube-proxy now always makes some kind of API
source, since that is its only kind of config.
Warn if it is using a default client, which probably won't work.
Uses the clientcmd builder.
--master flag is still supported for distros that need it.
But now, --kubeconfig flag can be used instead, or in addition,
to specify the auth info, and/or the location of the master.
A subsequent PR will change salt to generate a kubeconfig,
and to make kube-proxy use it, for salt-based clouds.
It currently is impossible to use two healthz handlers on different
ports in the same process. This removes the global variables in favor
of requiring the consumer to specify all health checks up front.
All the distros that use this have been updated,
or have PRs out to update them, or owners
have been asked to fix RPMs.
Removing this prevents further use of this model.
Remove now dead code: EtcdClientOrDie
Remove now dead pkg/proxy/config/etcd.go.
Remove unused imports.