On ADD save the host device's name into its IFLA_ALIAS property and
rename the device to the requested CNI_IFNAME inside the container
to conform to the CNI specification. On DEL rename the device to
the original name and move it back into the host namespace.
For IP allocation schemes that cannot be interface agnostic, the
ipvlan plugin can be chained with an earlier plugin that handles this
logic. If "master" is omitted from the ipvlan configuration, then the
previous Result must contain a single interface name for the ipvlan
plugin to enslave. If "ipam" is omitted, then the previous Result is
used to configure the ipvlan interface.
- start list of linux_only plugins; ignore them when testing on Windows
- Isolate linux-only code by filename suffix
- Remove stub (NotImplemented) functions
- other misc. fixes for Windows compatibility
This change improves the performance of the portmap plugin and fixes
hairpin, when a container is mapped back to itself.
Performance is improved by using a multiport test to reduce rule
traversal, and by using a masquerade mark.
Hairpin is fixed by enabling masquerading for hairpin traffic.
For IP allocation schemes that cannot be interface agnostic, master can be set
to "ipam". In this configuration, the IPAM plugin is required to return a single
interface name for the ipvlan plugin to enslave.
There are at least two reasons why a lease is not present:
* The dhcp ipam daemon was restarted
* On add the IPAM plugin failed
Don't fail the IPAM invocation when the lease is not present, to allow
proper device cleanup on CNI delete invocations.
In real-world address allocations, disjoint address ranges are common.
Therefore, the host-local allocator should support them.
This change still allows for multiple IPs in a single configuration, but
also allows for a "set of subnets."
Fixes: #45
The source address selection was random, and sometimes we picked a
source address that the container didn't have a route to. Adding a
default route fixes that!
* Wait for addresses to leave tentative state before setting routes
* Enable forwarding correctly
* Set up masquerading according to the active protocol
This change adds support for IPv6 container/pod addresses to the CNI
bridge plugin, both for dual-stack (IPv4 + IPv6) and for IPv6-only
network configurations.
The proposed changes support multiple IPv6 addresses on a container
interface. If isGW is configured, the bridge will also be configured with
gateway addresses for each IPv6 subnet.
Please note that both the dual-stack functionality and support for multiple
IPv6 container/gateway addresses depends upon containernetworking/cni
PR 451 "ipam/host-local: support multiple IP ranges".
This change could potentially be committed independently from this host-local
plugin change, however the dual-stack and multiple IPv6 address
functionality that is enabled by this change can't be exercised/tested
until the host-local plugin change is committed.
There are some IPv6 unit test cases that are currently commented out
in the proposed changes because these test cases will fail without the
prior commits of the multiple IP range host-local change.
This pull request includes a temporary workaround for Kubernetes
Issue #32291 (Container IPv6 address is marked as duplicate, or dadfailed).
The problem is that kubelet enables hairpin mode on bridge veth
interfaces. Hairpin mode causes the container/pod to see echos of its
IPv6 neighbor solicitation packets, so that it declares duplicate address
detection (DAD) failure. The long-term fix is to use enhanced-DAD
when that feature is readily available in kernels. The short-term fix is
to disable IPv6 DAD in the container. Unfortunately, this has to be done
unconditionally (i.e. without a check for whether hairpin mode is enabled)
because hairpin mode is turned on by kubelet after the CNI bridge plugin
has completed cmdAdd processing. Disabling DAD should be okay if
IPv6 addresses are guaranteed to be unique (which is the case for
host-local IPAM plugin).
This change allows the host-local allocator to allocate multiple IPs.
This is intended to enable dual-stack, but is not limited to only two
subnets or separate address families.
Updates the spec and plugins to return an array of interfaces and IP details
to the runtime including:
- interface names and MAC addresses configured by the plugin
- whether the interfaces are sandboxed (container/VM) or host (bridge, veth, etc)
- multiple IP addresses configured by IPAM and which interface they
have been assigned to
Returning interface details is useful for runtimes, as well as allowing
more flexible chaining of CNI plugins themselves. For example, some
meta plugins may need to know the host-side interface to be able to
apply firewall or traffic shaping rules to the container.
Using a new ".configlist" file format that allows specifying
a list of CNI network configurations to run, add new libcni
helper functions to call each plugin in the list, injecting
the overall name, CNI version, and previous plugin's Result
structure into the configuration of the next plugin.
Chaining sends different config JSON to each plugin, but the same
environment, and if we want to test multiple noop plugin runs in
the same chain we need a way of telling each run to use a different
debug file.
This adds the option `resolvConf` to the host-local IPAM configuration.
If specified, the plugin will try to parse the file as a resolv.conf(5)
type file and return it in the DNS response.
It doesn't seem like container IDs should really have whitespace or
newlines in them. As a complete edge-case, manipulating the host-local
store's IP reservations with 'echo' puts a newline at the end, which
caused matching to fail in ReleaseByID(). Don't ask...
Add an e2e host-local plugin testcase, which requires being able
to pass the datadir into the plugin so we can erase it later.
We're not always guaranteed to have access to the default data
dir location, plus it should probably be configurable anyway.
- Add optional 'stateDir' to flannel NetConf, if not present default to
/var/lib/cni/flannel
Signed-off-by: Jay Dunkelberger <ldunkelberger@pivotal.io>
highlights:
- NetConf struct finally includes cniVersion field
- improve test coverage of current version report behavior
- godoc a few key functions
- allow tests to control version list reported by no-op plugin
When RangeEnd is given, a.end = RangeEnd+1.
If when getSearchRange() is called and lastReservedIP equals
RangeEnd, a.nextIP() only compares lastReservedIP (which in this
example is RangeEnd) against a.end (which in this example is
RangeEnd+1) and they clearly don't match, so a.nextIP() returns
start=RangeEnd+1 and end=RangeEnd.
Get() happily allocates RangeEnd+1 because it only compares 'cur'
to the end returned by getSearchRange(), not to a.end, and thus
allocates past RangeEnd.
Since a.end is inclusive (eg, host-local will allocate a.end) the
fix is to simply set a.end equal to RangeEnd.
Add possibility to reconfigure bridge IP address when there is a new value.
New boolean flag added to net configuration to force IP change if it is need.
Otherwise code behaves as previously and throws error
* bridge: Test the following interface's hardware address for the CNI specific
prefix:
- bridge with IP address
- container veth
* plugins/macvlan test: ensure hardware addr
This will give deterministic MAC addresses for all interfaces CNI
creates and manages the IP for:
* bridge: container veth and host bridge
* macvlan: container veth
* ptp: container veth and host veth
This changes the ip allocation logic to round robin. Before this, host-local IPAM searched for available IPs from start of subnet. Hence it tends to allocate IPs that had been used recently. This is not ideal since it may cause collisions.
When isDefaultGateway is true it automatically sets isGateway to true.
The default route will be added via the (bridge's) gateway IP.
If a default gateway has been configured via IPAM in the same
configuration file, the plugin will error out.
Add a namespace object interface for somewhat cleaner code when
creating and switching between network namespaces. All created
namespaces are now mounted in /var/run/netns to ensure they
have persistent inodes and paths that can be passed around
between plugin components without relying on the current namespace
being correct.
Also remove the thread-locking arguments from the ns package
per https://github.com/appc/cni/issues/183 by doing all the namespace
changes in a separate goroutine that locks/unlocks itself, instead of
the caller having to track OS thread locking.
Resolves CNI part of https://github.com/coreos/rkt/issues/1765
Second part would be adding similar lines into kvm flavored macvlan
support (in time of creating macvtap device).
Not using NewLinkAttrs() or not initializing TxQLen leaves
the value as 0, which tells the kernel to set a zero-length
tx_queue_len. That messes up FIFO traffic shapers (like pfifo)
that use the device TX queue length as the default packet
limit. This leads to a default packet limit of 0, which drops
all packets.
Resolves CNI part of https://github.com/coreos/rkt/issues/1765
Second part would be adding similar lines into kvm flavored macvlan
support (in time of creating macvtap device).