Commit Graph

64 Commits

Author SHA1 Message Date
dougbtv
528d4f150c Functionality for Aux CNI Chain using subdirectory based CNI configuration loading.
Removes the it `fails to execute confListDel given no 'plugins' key"` test.

This test no longer fails after libcni version 1.2.3.
It probably shouldn't failduring a DEL action as it is, we want the least error prone path.

The GC test now uses both cni.dev attachment formats.

Uses both attachment formats as per https://github.com/containernetworking/cni/issues/1101 for GC's cni.dev/valid-attachments & cni.dev/attachments
2025-04-15 15:53:00 -04:00
dougbtv
ccfd8f5fea When returning an empty CNI result, it must be properly structured
For a previous fix of returning an empty CNI result when pods are not found, the CNI result wasn't properly structured. This fixes the structuring.
2025-03-25 14:45:04 -04:00
dougbtv
641f6a3b63 handle pod not found in CNI ADD gracefully
sometimes pods get deleted super fast (like jobs or CI) and they come back as not found.

instead of erroring, just return an empty CNI result so things don't blow up.

adds a sentinel errPodNotFound and skips the rest of CmdAdd when we hit it.

shouts to race conditions.
2025-03-24 09:58:38 -04:00
dougbtv
5892d705da Tolerate issues writing network status annotation on CNI ADD.
This change adds toleration for such errors like:

```
failed to [query/update] the pod pod-name-here in out of cluster comm: pod "pod-name-here" not found
```

During CNI ADD. While this change is a trade off in terms of debugability for RBAC, it's potentially noisy in scaled clusters when it is working properly.
2025-03-20 14:20:00 -04:00
Tomofumi Hayashi
7eb9673a1a Call GC command with valid attachments from multus cache
This code changes CNI's GC command argument. Previously it just
passes from parent CNI runtime, however, it may causes unexpected
resource deletion if one CNI plugin is used in both cluster
network and net-attach-def. This change generates valid attachments
from multus CNI cache and passed to delegate CNI plugin.
2024-12-20 11:28:41 +09:00
Tomofumi Hayashi
a439f91721 Support GC and STATUS command for cluster network
This change supports up to date CNI 1.1 command, GC and STATUS for
cluster network.
2024-12-20 11:28:41 +09:00
dougbtv
f186370654 adds context to GetPodAPILiveQuery 2024-12-19 14:41:32 -05:00
dougbtv
fb03b0f754 This makes sure that stale caches never result in NotFound errors.
It was explained to me that informers are almost always are more efficient, and in most cases will work, but a live lookup is appropriate after a number of failures.

This happens only on the retry portion, so we're still getting the benefits of informers, but, on a retry situation, we don't get a cache miss.

Additionally, changes out use of cache get on this, since it already bails out before it on CNI DEL.
2024-12-19 13:57:57 -05:00
Patryk Matuszak
4ff141c18d Don't wait too long for an answer from API Server
If Multus plugin gets a DEL request, but the API Server is down (e.g.
via 'crictl rmp'), the call takes so long, it actually never finishes.
This prevents CRI-O from deleting the Pods.
2024-12-19 16:13:38 +01:00
dougbtv
bc6c8d5c76 Updates to use CreateNetworkStauses from net-attach-def client for multiple interfaces in CNI results 2024-07-25 16:15:50 -04:00
dougbtv
a1915e1a8e Skips checking for readiness on CNI DEL (and instead warns)
Because deletes should favor a successful path, the readiness check should be skipped for pod removals.

This can cause an issue where there's pods pending deletes and that might impact scheduling of a pod that may be necessary in order to set the readiness indicator.

Adds a new method  to check for readiness indicator alone in order to immediately log a warning.
2024-02-22 09:15:11 -05:00
Tomofumi Hayashi
6ac6fe675f Add net-attach-def informer for thick plugin
This change introduces net-attach-def informer in multus-daemon,
thick pluign case. It could reduced API calls to get
net-attach-def.
2024-01-20 02:04:21 +09:00
Tomofumi Hayashi
46fe38e2c5 Suppress status unset in cmdDel
This change stops to update status in CNI's DEL command.
There are two reasons:

1. cmd DEL is invoked at only pod deletion, hence k8s does not
guarantee the pod and it may be already deleted. Hence this
API may failed.

2. In stateful set's pod recreation case, it may have race
condition to update the status at cmd DEL case.
In stateful set case, same pod name, i.e. stateful-0, is deleted
and then created again. In this case, if old Pod's CNI DEL command
is not finished before new Pod's creation, then SetStatus function
is failed due to pod UID mismatch.
2023-10-04 23:28:26 +09:00
Dan Williams
50c0357467 server: use a shared informer pod cache rather than direct apiserver access
When running in server mode we can use a shared informer to listen for
Pod events from the apiserver, and grab pod info from that cache rather
than doing direct apiserver requests each time.

This reduces apiserver load and retry latency, since multus can poll
the local cache more frequently than it should do direct apiserver
requests.

Oddly static pods don't show up in the informer by the timeout and
require a direct apiserver request. Since static pods are not common
and are typically long-running, it should not be a big issue to
fall back to direct apiserver access for them.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2023-09-14 08:57:12 -05:00
Tomofumi Hayashi
41d5d08686 Support readinessIndicator file in thick multus-daemon
This change supports readinessIndicatorfile in multus-daemon and
refines goroutine termination in case of signal with context.
2023-08-01 23:01:17 +09:00
Tomofumi Hayashi
fa60329105 Refine and fix parameters
This changes refines parameters in multus thick/thin.
- delete unused parameter, confDir
- add multus-cni-conf-dir
- fix multusConfigPath in non-default params case
2023-07-20 21:22:54 +09:00
Tomofumi Hayashi
22304806c8 Fix multus to support CNI plugin which does not create interface
This change fixes multus to support CNI plugin which does not
create any interface and return empty result. Some CNI plugin
may do network configuration change only and does not create
any interface and return empty CNI result. Current multus assumes
that CNI config always creates some interfaces hence above CNI
plugin is out of assumption and multus may not work with such
plugins.
2023-05-18 02:40:27 +09:00
Tomofumi Hayashi
1b01e3e486 Change gopkg.in to v4 for v4 release 2023-04-13 23:36:40 +09:00
Tomofumi Hayashi
5bce250398
Fix linter warning message (#1057) 2023-04-07 00:20:04 +09:00
杨刚 (成都)
43e2008107
code clean for if condtion (#1037)
Signed-off-by: yanggang <gang.yang@daocloud.io>
2023-02-14 01:41:51 +09:00
Tomofumi Hayashi
82c09a98b7 Change conditional flow to use cache file in CmdDel
This fix changes conditional flow to use net-attach-def if
cache file is collapsed in CmdDel.

Fix #948
2022-11-11 00:22:15 +09:00
yanggang
4f91106f29
remove io/ioutil for advanced golang (#951)
Signed-off-by: yanggang <gang.yang@daocloud.io>

Signed-off-by: yanggang <gang.yang@daocloud.io>
2022-11-10 00:18:54 +09:00
Tomofumi Hayashi
77e0150afe
Fix license boilerplate/copyright in go files (#947)
This change fix license boilerplate and its copyright.
The updated year in copyright is based on the file creation date.
If older than 2021, added copyright is transfered to multus
authors from Intel corporation as the multus code was officially
transfered to Kubernetes Networking Plumbing Working Group on
March 11, 2021.
2022-11-02 21:49:57 +09:00
Gao PeiLiang
a3f9694d09
delegate plugin delete success, delete cache file (#926) 2022-10-31 22:32:51 +09:00
Tomofumi Hayashi
3d9cec4ec9 Merge remote-tracking branch 'origin/master' into feature/multus-4.0 2022-08-19 00:07:30 +09:00
Tomofumi Hayashi
505ab4567c
Add delegate API in multus-daemon (#890)
This changes introduce delegate API function in multus-daemon.
This API will be consumed from other programs for hot-plug
interface into running pod. This change also cleanups server
code to split into client code and server code to easy to import
from other golang code.
2022-08-10 00:45:23 +09:00
Tomofumi Hayashi
99dd6678d5 Refine build-go.sh and update 'version' output 2022-07-07 01:44:13 +09:00
Tomofumi Hayashi
107624ccff Use *[]net.IP for 'default-route' network selection element. 2022-06-22 02:12:08 +09:00
Tomofumi Hayashi
a735987501 Merge remote-tracking branch 'origin/master' into feature/multus-4.0 2022-06-14 18:04:33 +09:00
Doug Smith
8bbb594dad
Merge pull request #862 from s1061123/fix/cmddel-nostatus-update
Skip status update in CmdDel if getPod is failed
2022-06-13 13:22:17 -04:00
Tomofumi Hayashi
fcc8e44f14 Skip status update in CmdDel if getPod is failed
This change skips to update pod's network-status annotation
when getPod is failed at the beginning of CmdDel. If getPod is
failed, K8s api gets stucked in many cases, hence pod update
might be failed in most cases.
2022-06-14 02:14:43 +09:00
疯狂的小企鹅
be56f8dc30
Fixed that in.Delegates may remain in the CmdDel (#859)
Co-authored-by: jinda.ljd <jinda.ljd@alibaba-inc.com>
2022-06-08 21:03:14 +09:00
Tomofumi Hayashi
485642c18f Merge remote-tracking branch 'origin/master' into feature/multus-4.0 2022-05-07 00:36:30 +09:00
Doug Smith
dcbc215b93
The cachefile name should be the delegate configuration name (#841)
It was previously using the net-attach-def name, which doesn't align with the cache file. Causing default-route selection to not succeed.
2022-05-07 00:06:37 +09:00
Tomofumi Hayashi
4670f1f240 Fix sr-iov support
Fix thick plugin daemonset to add volume mapping required for
sr-iov and fix code to update network status.
In addition, fix checkpoint structures to support K8s without
kubelet pod resources API.

fix #665 and #778
2022-04-18 21:28:13 +09:00
Tomofumi Hayashi
bf4d6c716c Merge remote-tracking branch 'origin/master' into feature/multus-4.0 2022-04-12 21:42:19 +09:00
Doug Smith
588ee8f192
Fixes log message on CNI Check (#825) 2022-04-11 20:35:47 +09:00
Tomofumi Hayashi
51c39205a8 Remove error handling for getPod to force to proceed cmdDel.
In cmdDel, CNI Spec mentioned that plugin should proceed cmdDel
without any error, hence the change removes error returning
at cmdDel.

fix #822
2022-04-06 00:34:53 +09:00
Tomofumi Hayashi
4180f88442 Refine multus-daemon config 2022-04-06 00:34:53 +09:00
Doug Smith
e7aaf8f5d5 only warn when netns can't be opened (#803) 2022-04-06 00:34:53 +09:00
Tomofumi Hayashi
93ec0c121e Support CNI 1.0.0
Fix #792
2022-04-06 00:34:53 +09:00
Tomofumi Hayashi
d4a3ea4fd0 Replace setenv with runtimeConfig set (#785)
setenv refers environment variables, which is unique in process,
not unique to go routine. Hence it may causes some issue in multi
threaded case, hence it is replaced with libcni's runtimeConfig
value set to set these variables at libcni side, after process
fork.
2022-04-06 00:34:53 +09:00
Miguel Duarte Barroso
fb31217e2c thick-plugin: refactor multus
Multus is refactored as a thick plugin, featuring 2 main components:
  - a server listening to a unix domain socket, running in a pod
  - a shim, a binary on the host that will send JSON requests built from
    its environment / stdin values to the aforementioned server.

The pod where the multus daemon is running must share the host's PID
namespace.

Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

react to maintainers review

Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

thick, deployment: update the daemonset spec

Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

thick, config: validate the cni config passed by the runtime

Without this patch, we're blindly trusting anything sent by the server.
This way, we assure the requests arriving at the multus controller are
valid before hand.

Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

thick: model client / server config

Also add a new command line parameter on the multus controller, pointing
it to the server configuration.

Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

SQUASH candidate, thick, config: cleanup the configuration

Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

multus: use args.args instead of an env variable

CNI is already filling the args structure; we should consume that
rather than rely on the environment variables.

Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

unit tests: remove weird tests that check an impossible scenario

Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

docs, thick: document the thick plugin variant

Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>

thick, server, multus: re-use common types

Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
2022-04-06 00:34:52 +09:00
Tomofumi Hayashi
2d53334211 Remove error handling for getPod to force to proceed cmdDel.
In cmdDel, CNI Spec mentioned that plugin should proceed cmdDel
without any error, hence the change removes error returning
at cmdDel.

fix #822
2022-04-05 02:29:41 +09:00
Doug Smith
7559625a38
only warn when netns can't be opened (#803) 2022-03-04 02:02:24 +09:00
Tomofumi Hayashi
6dd45f38f9
Replace setenv with runtimeConfig set (#785)
setenv refers environment variables, which is unique in process,
not unique to go routine. Hence it may causes some issue in multi
threaded case, hence it is replaced with libcni's runtimeConfig
value set to set these variables at libcni side, after process
fork.
2022-02-21 23:55:33 +09:00
Tomofumi Hayashi
2e474f4c95 Suppress uid mismatch error/warning in case of static pod
In static pod case, kube api returns mirror pod UID hence
uid must be mismatched. This fix suppress error/warning message
in such case.

Fix #773
2022-01-15 23:17:53 +09:00
Miguel Duarte Barroso
12df5bda72
run gofmt on the code (#772)
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
2022-01-05 01:33:58 +09:00
Tomofumi Hayashi
d52f2b6a45 Update libcni cache when default-route net selection is used
To keep consistency between actual network and CNI result in cache,
update libcni cache when multus add/del default routes by
`default-route` network selection.
2021-12-15 01:57:51 +09:00
Doug Smith
b9d0d93d6e
Pod UID mismatches should only warn on CNI DEL (#763) 2021-11-23 17:52:45 +09:00