On a CNI request failure, multus-cni prints out cmdArgs. In all
cases except debug printing, this is done with %s and a special
printing function. However, handleCNIRequest is an exception for
some reason. That leads to unintelligible error messages in case
of CNI request failures (severely abridged):
CmdAdd (shim): CNI request failed with status 400:
'&{ContainerID:<id> Netns:/var/run/netns/<uuid> IfName:eth0
Args:<args> Path: StdinData:[125 121 111 117 114 32 97 100 118
101 114 116 105 115 101 109 101 110 116 32 99 111 117 108 100
32 98 101 32 104 101 114 101 125 ... another 650 numbers ]}
ContainerID:"<id>" Netns:"/var/run/netns/<uuid>" IfName:"eth0"
Args:"<args>" Path:"" ERRORED: error configuring pod ...
printCmdArgs() should be used for this case as well to avoid huge,
hardly readable logs.
At the same time, the content of cniCmdArgs is always appended to
the error twice, as seen in the example above: the first time by
HandleCNIRequest and another time by handleCNIRequest. The same
applies to the HandleDelegateRequest path.
Fix this by removing the prefix from the lower-level handlers while
keeping it in the higher-level ones. The 'ERRORED' part is moved to
the higher-level handler functions to preserve the overall look of
the error.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
This change introduces certDuration as a parameter to customize
the certificate duration. In addition, the environment variable for
the node name is aligned with its other usages.
We used to run chroot in the multus main process when calling other
CNI plugin binaries. We also used a mutex to lock access to pod
files. But this causes performance issues under heavy
CNI_ADD/CNI_DEL load.
With this patch, we chroot in the child processes instead, so
file operations in the main process are not affected by chroot.
This change requires the multus thick plugin pod to mount the CNI bin
directory at the same path as on the container host.
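One way to chroot only the child process in Go is the Chroot field of
syscall.SysProcAttr, sketched below. This is an illustration under
assumptions, not necessarily how the patch does it: chrootCommand()
and both paths in main() are hypothetical placeholders.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// chrootCommand builds a command whose child process is chrooted into
// hostRoot before the binary is executed. The calling process keeps
// its own root, so no lock around its file operations is needed.
// The function name and parameters are hypothetical.
func chrootCommand(hostRoot, pluginPath string, args ...string) *exec.Cmd {
	cmd := exec.Command(pluginPath, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Chroot: hostRoot}
	// The working directory is resolved inside the new root.
	cmd.Dir = "/"
	return cmd
}

func main() {
	// Placeholder paths. Actually running the command requires the
	// privilege to call chroot(2), so it is only constructed here.
	cmd := chrootCommand("/hostroot", "/opt/cni/bin/example-plugin")
	fmt.Println(cmd.SysProcAttr.Chroot, cmd.Path)
}
```

In this sketch the plugin path is looked up after the chroot takes
effect in the child, so the binary has to be reachable under the same
path in both views of the filesystem.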
Signed-off-by: Peng Liu <pliu@redhat.com>
This change introduces per-node certificates for multus pods.
When a multus pod is launched, the specified bootstrap kubeconfig
is used for initial access, and multus sends a CSR to the
kube API to get its own certs for kube API access. Once the CSR is
accepted, the multus pod uses the generated certs for kube access.
For whatever reason calling os.Stat() on the readiness indicator file
from CmdAdd()/CmdDel() when multus is running in server mode and is
containerized often returns "file not found", which triggers the
polling behavior of GetReadinessIndicatorFile(). This greatly delays
CNI operations that should be pretty quick. Even if an exponential
backoff is used, os.Stat() can still return "file not found"
multiple times, even though the file clearly exists.
But it turns out we don't need to check the readiness file in server
mode when running with MultusConfigFile == "auto". In this mode the
server starts the ConfigManager, which (a) waits until the file
exists, (b) watches the readiness file with fsnotify, and (c) exits
the daemon immediately if the file is deleted or moved.
This means we can assume that the readiness file exists while the
daemon is running and the server is handling CNI requests;
otherwise the daemon would have exited. Thus CmdAdd/CmdDel don't
need to run a lot of possibly failing os.Stat() calls in the CNI
hot paths.
Signed-off-by: Dan Williams <dcbw@redhat.com>
When running in server mode we can use a shared informer to listen for
Pod events from the apiserver, and grab pod info from that cache rather
than doing direct apiserver requests each time.
This reduces apiserver load and retry latency, since multus can poll
the local cache more frequently than it should make direct apiserver
requests.
Oddly, static pods don't show up in the informer by the timeout and
require a direct apiserver request. Since static pods are not common
and are typically long-running, it should not be a big issue to
fall back to direct apiserver access for them.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Move server start code to a common function that both regular
and test code can use. Also shut down the server from the
test cases.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Then we can just use the Server struct's kube client and exec rather
than passing them through the function parameters.
Signed-off-by: Dan Williams <dcbw@redhat.com>
The thick server's chroot mutex is missing in GetDefaultNetworks,
which touches the pod filesystem. This change adds the mutex lock
there to prevent a race condition.
Fix #1072
* config, daemon: shim socket path is not needed
The shim socket dir attribute is only required for the shim (cni
configuration). Thus, it can be removed from the daemon configuration.
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
* config, daemon: rename socket dir attribute
Now the socketDir parameter no longer stutters.
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
* docs, thick plugin: align docs with new configuration reference
PR #1053 - [0] - changed the thick plugin configuration to happen
exclusively via the user provided config map. This PR aligns the multus
documentation with the existing code.
[0] - https://github.com/k8snetworkplumbingwg/multus-cni/pull/1053
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
---------
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
* Draft for refining options
* config: remove command line args; use configMap/JSON config
The `socketDir` configuration was split in two, since the multus daemon
and the multus shim have the socket in different paths. This allows the
user to customize these paths.
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
* deployment, ci: update daemonset spec
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
---------
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
Co-authored-by: Tomofumi Hayashi <tohayash@redhat.com>
Co-authored-by: dougbtv <dosmith@redhat.com>
From the node (or any privileged pod that has mounted the multus socket)
you can now query the multus-cni server liveness - for instance:
```
root@kind-worker:/# curl -v --unix-socket /run/multus/multus.sock localhost/healthz
* Trying /run/multus/multus.sock:0...
* Connected to localhost (/host/run/multus/multus.sock) port 80 (#0)
> GET /healthz HTTP/1.1
> Host: localhost
> User-Agent: curl/7.74.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Mon, 14 Nov 2022 17:21:07 GMT
< Content-Length: 0
< Connection: close
<
* Closing connection 0
```
Signed-off-by: Miguel Duarte Barroso <mdbarroso@redhat.com>
This change fixes the license boilerplate and its copyright.
The updated year in the copyright is based on the file creation date.
For files older than 2021, the added copyright is transferred to the
multus authors from Intel Corporation, as the multus code was
officially transferred to the Kubernetes Network Plumbing Working
Group on March 11, 2021.
This change introduces a delegate API function in multus-daemon.
This API will be consumed by other programs to hot-plug an
interface into a running pod. This change also cleans up the server
code, splitting it into client code and server code so that it is
easy to import from other golang code.
This change makes the binary file and directory names consistent.
In addition, it changes the package name from cni to server because
cni is a bit ambiguous in a CNI plugin's repository.