Merge pull request #63319 from soundcloud/always-masquerade-service-vips

Automatic merge from submit-queue (batch tested with PRs 63319, 64248, 64250, 63890, 64233). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Always masquerade node-originating traffic with a service VIP source ip

**What this PR does / why we need it**:
This is a follow-up to make IPVS work on systems without cluster-cidr or masquerade-all.
On these systems, the best-matching source IP for reaching a service VIP is the service VIP itself, at least for the host network.
The workaround is simple: everything originating on the host (OUTPUT nat chain) with a source IP that is a VIP should be masqueraded.

The relevant rule change is the first rule in `KUBE-SERVICES`:
```
Chain KUBE-SERVICES (2 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 KUBE-MARK-MASQ  all  --  *      *       0.0.0.0/0            0.0.0.0/0            match-set KUBE-CLUSTER-IP src,dst
  104  6240 KUBE-MARK-MASQ  tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp match-set KUBE-NODE-PORT-TCP dst
```

The matching rule could be stricter by matching src(ip),dst(ip),dst(port), but the VIP is only selected as the source IP when the destination is itself a VIP.
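
To make the difference between the `dst,dst` and `src,dst` match directions concrete, here is a minimal Go sketch that simulates the lookup against a `hash:ip,port`-style set. It is an illustration only, not kube-proxy or ipset code: the `entry`, `packet`, `matchSet`, and `matchStrict` names and the example addresses are hypothetical, the protocol part of the set entry is omitted, and `matchStrict` reflects one reading of the stricter src(ip),dst(ip),dst(port) match.

```go
package main

import "fmt"

// entry is one member of a simplified ip,port set (protocol omitted).
type entry struct {
	ip   string
	port int
}

// packet holds only the fields the set match inspects.
type packet struct {
	srcIP   string
	dstIP   string
	dstPort int
}

// matchSet mimics "--match-set <set> <ipDir>,dst": ipDir selects whether the
// source or destination IP is looked up; the port is always the destination port.
func matchSet(set map[entry]bool, p packet, ipDir string) bool {
	ip := p.dstIP
	if ipDir == "src" {
		ip = p.srcIP
	}
	return set[entry{ip, p.dstPort}]
}

// matchStrict is an assumed form of the stricter variant: both the source IP
// and the destination IP must be in the set together with the destination port.
func matchStrict(set map[entry]bool, p packet) bool {
	return matchSet(set, p, "src") && matchSet(set, p, "dst")
}

func main() {
	// Hypothetical service VIP 10.96.0.10 exposing port 53.
	set := map[entry]bool{{"10.96.0.10", 53}: true}

	cases := []struct {
		name string
		p    packet
	}{
		{"host to VIP", packet{"10.96.0.10", "10.96.0.10", 53}},
		{"pod to VIP", packet{"172.17.0.5", "10.96.0.10", 53}},
		{"VIP source, other destination", packet{"10.96.0.10", "192.0.2.10", 53}},
	}
	for _, c := range cases {
		fmt.Printf("%-30s src,dst=%-5t dst,dst=%-5t strict=%t\n",
			c.name, matchSet(set, c.p, "src"), matchSet(set, c.p, "dst"), matchStrict(set, c.p))
	}
}
```

Only the third case separates `src,dst` from the stricter match, and per the note above that case is not expected to occur in practice.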

**Which issue(s) this PR fixes**
Fixes #63241

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
Kubernetes Submit Queue 2018-05-24 19:46:10 -07:00 committed by GitHub
commit 9587272e1e

@@ -1218,17 +1218,25 @@ func (proxier *Proxier) writeIptablesRules() {
 			"-A", string(kubeServicesChain),
 			"-m", "comment", "--comment", proxier.ipsetList[kubeClusterIPSet].getComment(),
 			"-m", "set", "--match-set", kubeClusterIPSet,
-			"dst,dst",
 		)
 		if proxier.masqueradeAll {
-			writeLine(proxier.natRules, append(args, "-j", string(KubeMarkMasqChain))...)
+			writeLine(proxier.natRules, append(args, "dst,dst", "-j", string(KubeMarkMasqChain))...)
 		} else if len(proxier.clusterCIDR) > 0 {
 			// This masquerades off-cluster traffic to a service VIP. The idea
 			// is that you can establish a static route for your Service range,
 			// routing to any node, and that node will bridge into the Service
 			// for you. Since that might bounce off-node, we masquerade here.
 			// If/when we support "Local" policy for VIPs, we should update this.
-			writeLine(proxier.natRules, append(args, "! -s", proxier.clusterCIDR, "-j", string(KubeMarkMasqChain))...)
+			writeLine(proxier.natRules, append(args, "dst,dst", "! -s", proxier.clusterCIDR, "-j", string(KubeMarkMasqChain))...)
+		} else {
+			// Masquerade all OUTPUT traffic coming from a service ip.
+			// The kube dummy interface has all service VIPs assigned which
+			// results in the service VIP being picked as the source IP to reach
+			// a VIP. This leads to a connection from VIP:<random port> to
+			// VIP:<service port>.
+			// Always masquerading OUTPUT (node-originating) traffic with a VIP
+			// source ip and service port destination fixes the outgoing connections.
+			writeLine(proxier.natRules, append(args, "src,dst", "-j", string(KubeMarkMasqChain))...)
 		}
 	}
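
The three branches in this hunk can be summarized with a small standalone Go sketch that builds the resulting rule line for each configuration. It is a simplified illustration under stated assumptions, not the proxier code: `buildMasqRule` is a hypothetical helper, the chain and set names are written as string literals taken from the `KUBE-SERVICES` listing in the description, the `-m comment` arguments are left out, and the example CIDR is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// buildMasqRule returns a simplified rule line for the cluster IP masquerade
// rule, choosing the match-set directions the same way the hunk above does.
func buildMasqRule(masqueradeAll bool, clusterCIDR string) string {
	args := []string{
		"-A", "KUBE-SERVICES",
		"-m", "set", "--match-set", "KUBE-CLUSTER-IP",
	}
	switch {
	case masqueradeAll:
		// Masquerade everything addressed to a service VIP.
		args = append(args, "dst,dst")
	case clusterCIDR != "":
		// Masquerade only traffic that does not come from the cluster CIDR.
		args = append(args, "dst,dst", "! -s", clusterCIDR)
	default:
		// Neither option set: masquerade traffic whose source IP is a VIP.
		args = append(args, "src,dst")
	}
	args = append(args, "-j", "KUBE-MARK-MASQ")
	return strings.Join(args, " ")
}

func main() {
	fmt.Println(buildMasqRule(true, ""))
	fmt.Println(buildMasqRule(false, "10.244.0.0/16")) // hypothetical cluster CIDR
	fmt.Println(buildMasqRule(false, ""))
}
```

Moving the direction flags out of the shared `args` prefix is what lets the fallback branch use `src,dst` while the two existing branches keep `dst,dst`.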