Removes the it `fails to execute confListDel given no 'plugins' key"` test.
This test no longer fails after libcni version 1.2.3.
It probably shouldn't failduring a DEL action as it is, we want the least error prone path.
The GC test now uses both cni.dev attachment formats.
Uses both attachment formats as per https://github.com/containernetworking/cni/issues/1101 for GC's cni.dev/valid-attachments & cni.dev/attachments
sometimes pods get deleted super fast (like jobs or CI) and they come back as not found.
instead of erroring, just return an empty CNI result so things don't blow up.
adds a sentinel errPodNotFound and skips the rest of CmdAdd when we hit it.
shouts to race conditions.
This change adds toleration for such errors like:
```
failed to [query/update] the pod pod-name-here in out of cluster comm: pod "pod-name-here" not found
```
During CNI ADD. While this change is a trade off in terms of debugability for RBAC, it's potentially noisy in scaled clusters when it is working properly.
This code changes CNI's GC command argument. Previously it just
passes from parent CNI runtime, however, it may causes unexpected
resource deletion if one CNI plugin is used in both cluster
network and net-attach-def. This change generates valid attachments
from multus CNI cache and passed to delegate CNI plugin.
It was explained to me that informers are almost always are more efficient, and in most cases will work, but a live lookup is appropriate after a number of failures.
This happens only on the retry portion, so we're still getting the benefits of informers, but, on a retry situation, we don't get a cache miss.
Additionally, changes out use of cache get on this, since it already bails out before it on CNI DEL.
If Multus plugin gets a DEL request, but the API Server is down (e.g.
via 'crictl rmp'), the call takes so long, it actually never finishes.
This prevents CRI-O from deleting the Pods.
Because deletes should favor a successful path, the readiness check should be skipped for pod removals.
This can cause an issue where there's pods pending deletes and that might impact scheduling of a pod that may be necessary in order to set the readiness indicator.
Adds a new method to check for readiness indicator alone in order to immediately log a warning.
This change stops to update status in CNI's DEL command.
There are two reasons:
1. cmd DEL is invoked at only pod deletion, hence k8s does not
guarantee the pod and it may be already deleted. Hence this
API may failed.
2. In stateful set's pod recreation case, it may have race
condition to update the status at cmd DEL case.
In stateful set case, same pod name, i.e. stateful-0, is deleted
and then created again. In this case, if old Pod's CNI DEL command
is not finished before new Pod's creation, then SetStatus function
is failed due to pod UID mismatch.
When running in server mode we can use a shared informer to listen for
Pod events from the apiserver, and grab pod info from that cache rather
than doing direct apiserver requests each time.
This reduces apiserver load and retry latency, since multus can poll
the local cache more frequently than it should do direct apiserver
requests.
Oddly static pods don't show up in the informer by the timeout and
require a direct apiserver request. Since static pods are not common
and are typically long-running, it should not be a big issue to
fall back to direct apiserver access for them.
Signed-off-by: Dan Williams <dcbw@redhat.com>
This change fixes multus to support CNI plugin which does not
create any interface and return empty result. Some CNI plugin
may do network configuration change only and does not create
any interface and return empty CNI result. Current multus assumes
that CNI config always creates some interfaces hence above CNI
plugin is out of assumption and multus may not work with such
plugins.
This change fix license boilerplate and its copyright.
The updated year in copyright is based on the file creation date.
If older than 2021, added copyright is transfered to multus
authors from Intel corporation as the multus code was officially
transfered to Kubernetes Networking Plumbing Working Group on
March 11, 2021.
This changes introduce delegate API function in multus-daemon.
This API will be consumed from other programs for hot-plug
interface into running pod. This change also cleanups server
code to split into client code and server code to easy to import
from other golang code.
This change skips to update pod's network-status annotation
when getPod is failed at the beginning of CmdDel. If getPod is
failed, K8s api gets stucked in many cases, hence pod update
might be failed in most cases.
Fix thick plugin daemonset to add volume mapping required for
sr-iov and fix code to update network status.
In addition, fix checkpoint structures to support K8s without
kubelet pod resources API.
fix#665 and #778
setenv refers environment variables, which is unique in process,
not unique to go routine. Hence it may causes some issue in multi
threaded case, hence it is replaced with libcni's runtimeConfig
value set to set these variables at libcni side, after process
fork.
Multus is refactored as a thick plugin, featuring 2 main components:
- a server listening to a unix domain socket, running in a pod
- a shim, a binary on the host that will send JSON requests built from
its environment / stdin values to the aforementioned server.
The pod where the multus daemon is running must share the host's PID
namespace.
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
react to maintainers review
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
thick, deployment: update the daemonset spec
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
thick, config: validate the cni config passed by the runtime
Without this patch, we're blindly trusting anything sent by the server.
This way, we assure the requests arriving at the multus controller are
valid before hand.
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
thick: model client / server config
Also add a new command line parameter on the multus controller, pointing
it to the server configuration.
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
SQUASH candidate, thick, config: cleanup the configuration
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
multus: use args.args instead of an env variable
CNI is already filling the args structure; we should consume that
rather than rely on the environment variables.
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
unit tests: remove weird tests that check an impossible scenario
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
docs, thick: document the thick plugin variant
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
thick, server, multus: re-use common types
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
setenv refers environment variables, which is unique in process,
not unique to go routine. Hence it may causes some issue in multi
threaded case, hence it is replaced with libcni's runtimeConfig
value set to set these variables at libcni side, after process
fork.
To keep consistency between actual network and CNI result in cache,
update libcni cache when multus add/del default routes by
`default-route` network selection.