Commit Graph

964 Commits

Author SHA1 Message Date
Tomofumi Hayashi
03fcb34abe Update kind e2e 2024-02-16 02:30:17 +09:00
Doug Smith
ba18cf5ab3
Merge pull request #1214 from s1061123/add-netdef-informer
Add net-attach-def informer for thick plugin
2024-02-15 09:40:57 -05:00
Doug Smith
b271fbf84d
Merge pull request #1229 from s1061123/fix/filepath
Add filepath sanity check
2024-02-14 10:48:12 -05:00
Tomofumi Hayashi
748930239d Add filepath sanity check 2024-02-15 00:29:07 +09:00
Doug Smith
c550826675
Merge pull request #1228 from s1061123/fix/reload-kubeconfig-if-failed
Reload bootstrap kubeconfig if cert mgr failed to load valid certs
2024-02-13 10:49:49 -05:00
Tomofumi Hayashi
a337317533 Reload bootstrap kubeconfig if cert mgr failed to load valid certs
When a user recreates all cluster certs, the multus thick plugin's
previous cert is no longer valid. In that case, we need to stop
using the cert manager's old certs and restart from the bootstrap
kubeconfig. This fix reloads the client config from the bootstrap
kubeconfig if the cert manager's cert fails to load the pod.
2024-02-14 00:46:12 +09:00
Tomofumi Hayashi
8e5060b9a7
Opt out to mount service account token (#1219) 2024-02-01 17:33:59 +09:00
Dennis Periquet
6c982f3fee
supplement log with stringified version of StdinData to enhance debug (#1215) 2024-01-26 01:30:58 +09:00
Doug Smith
1071115e90
Merge pull request #1217 from s1061123/add-sleep-thin
Add additional sleep in thick entrypoint
2024-01-25 11:10:57 -05:00
Tomofumi Hayashi
493d421cf7
Update github actions (#1216) 2024-01-26 00:46:09 +09:00
Tomofumi Hayashi
24b2d55c84 Add additional sleep in thick entrypoint 2024-01-26 00:45:47 +09:00
Tomofumi Hayashi
6812ce0ed6
Update e2e related tools (#1212) 2024-01-24 22:39:54 +09:00
Tomofumi Hayashi
6ac6fe675f Add net-attach-def informer for thick plugin
This change introduces a net-attach-def informer in multus-daemon
(the thick plugin case). It can reduce the API calls needed to get
net-attach-defs.
2024-01-20 02:04:21 +09:00
Fish-pro
3477c9c827
fix(quick start)-You do not need to clone the repository and directly deliver the installation file (#1210)
Signed-off-by: Zechun Chen <zechun.chen@daocloud.io>
2024-01-18 23:45:57 +09:00
Lionel Jouin
36ba3039ae
Add watch permission to thick e2e template (#1208)
As described in #1171, the watch function is required in the clusterrole
for the thick Multus version, otherwise "Failed to watch *v1.Pod" would
be returned.
2024-01-18 23:45:40 +09:00
Tomofumi Hayashi
40687759fb
Reduce informer memory usage by informer transform (#1203)
This fix reduces multus-daemon memory usage by using the k8s 0.29 informer
transform to trim Pod object information that multus does not need.
2024-01-18 23:32:21 +09:00
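The transform idea in the commit above can be sketched with the standard library alone. client-go lets a shared informer run a transform function of the shape `func(interface{}) (interface{}, error)` on each object before it is stored. The `Pod` type and the choice of fields to drop below are assumptions made for illustration, not the fields multus actually trims.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified stand-in for a Kubernetes Pod object.
type Pod struct {
	Name          string
	UID           string
	Annotations   map[string]string
	ManagedFields []string // stands in for bulky metadata the consumer never reads
	Containers    []string // stands in for spec details the consumer never reads
}

// trimPod has the same shape as an informer transform function: it receives
// the object about to be cached and returns the (smaller) object to store.
func trimPod(obj interface{}) (interface{}, error) {
	pod, ok := obj.(*Pod)
	if !ok {
		// Leave objects of other types untouched.
		return obj, nil
	}
	// Assumption: these fields are not needed by the consumer of the cache.
	pod.ManagedFields = nil
	pod.Containers = nil
	return pod, nil
}

func main() {
	p := &Pod{
		Name:          "web-0",
		UID:           "uid-1",
		Annotations:   map[string]string{"example/network": "net-a"},
		ManagedFields: []string{"large", "metadata"},
		Containers:    []string{"app", "sidecar"},
	}
	out, err := trimPod(p)
	if err != nil {
		fmt.Println("transform failed:", err)
		return
	}
	fmt.Printf("%+v\n", *out.(*Pod))
}
```

Because the trimmed object is what ends up in the cache, the memory saved scales with the number of pods the informer tracks.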
Tomofumi Hayashi
a70da3556a
Fix a wait to account for the possibility of a not ready unix socket (#1207) 2024-01-11 13:34:37 +09:00
Doug Smith
003fbd5785
Merge pull request #1202 from s1061123/add-timeout
Add timeout
2024-01-05 08:04:02 -05:00
Tomofumi Hayashi
6e4f62f2f2 disable revive's dot-imports in unit test files 2024-01-05 14:32:09 +09:00
Tomofumi Hayashi
197877d113 Adds a wait to account for the possibility of a not ready unix socket 2024-01-05 14:27:31 +09:00
Doug Smith
ab7d64e96f
Refactors the configuration options document reference (#1180) 2024-01-04 23:54:56 +09:00
dependabot[bot]
acfbd42719
Bump google.golang.org/grpc from 1.53.0 to 1.56.3 (#1182)
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.53.0 to 1.56.3.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.53.0...v1.56.3)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-12-11 14:48:54 +09:00
Doug Smith
c76db9c7a0
Merge pull request #1194 from s1061123/fix-logging
Fix to use lumberjack only for logging files
2023-12-07 09:14:30 -05:00
Tomofumi Hayashi
540a887651 Fix to use lumberjack only for logging files 2023-12-07 21:08:17 +09:00
Tomofumi Hayashi
d97514f841 Ignore dot-imports error message only for go test files 2023-12-07 20:56:36 +09:00
Moshe Levi
e4404b2645
fix e2e test ModuleNotFoundError: No module named 'pkg_resources' (#1189)
Signed-off-by: Moshe Levi <moshele@nvidia.com>
Signed-off-by: Tomofumi Hayashi <tohayash@redhat.com>
2023-12-07 20:51:02 +09:00
Jonatan
a373a2286d
Deployments: Add watch permission to thick example (#1171)
The ClusterRole was missing the watch permission on pods, which resulted in Multus throwing this error message every few seconds:

Failed to watch *v1.Pod: unknown (get pods)
2023-12-04 20:28:18 +09:00
dependabot[bot]
e2e8cfb677
Bump golang.org/x/net from 0.8.0 to 0.17.0 (#1176)
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.8.0 to 0.17.0.
- [Commits](https://github.com/golang/net/compare/v0.8.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-10-17 14:21:43 +09:00
Doug Smith
b710020f7b
Merge pull request #1173 from s1061123/remove-status-set-del
Suppress status unset in cmdDel
2023-10-04 11:50:16 -04:00
Tomofumi Hayashi
46fe38e2c5 Suppress status unset in cmdDel
This change stops updating status in CNI's DEL command.
There are two reasons:

1. cmd DEL is invoked only at pod deletion, so k8s does not
guarantee that the pod still exists; it may already be deleted.
Hence this API call may fail.

2. When a stateful set's pod is recreated, updating the status
in cmd DEL may race with the new pod.
In the stateful set case, a pod with the same name, e.g. stateful-0,
is deleted and then created again. If the old Pod's CNI DEL command
has not finished before the new Pod is created, the SetStatus
function fails due to a pod UID mismatch.
2023-10-04 23:28:26 +09:00
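The UID mismatch described in the second reason can be illustrated with a minimal sketch. The `Pod` type and `checkUID` helper are hypothetical and only model the comparison; they are not the multus SetStatus code.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified view of what the API server returns.
type Pod struct {
	Name string
	UID  string
}

// checkUID reports an error when the pod currently known to the API server
// is not the pod instance the CNI request was issued for.
func checkUID(live Pod, requestUID string) error {
	if live.UID != requestUID {
		return fmt.Errorf("pod %q UID mismatch: request has %s, API server has %s",
			live.Name, requestUID, live.UID)
	}
	return nil
}

func main() {
	// stateful-0 was deleted and recreated: same name, new UID.
	recreated := Pod{Name: "stateful-0", UID: "uid-new"}

	// A late DEL for the old instance still carries the old UID.
	fmt.Println(checkUID(recreated, "uid-old"))

	// A request for the new instance matches.
	fmt.Println(checkUID(recreated, "uid-new"))
}
```

Since a late DEL can only ever hit the mismatch branch for a recreated pod, skipping the status update in DEL avoids the failure altogether.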
Doug Smith
d7e391e006
Merge pull request #1167 from s1061123/fix-params
Per node certificates: Add duration parameter
2023-09-26 14:14:03 -04:00
Tomofumi Hayashi
6a0c905347 Fix per node cert feature
This change introduces certDuration as a parameter to customize
cert duration. In addition, the environment variable for the node name
now matches other usages.
2023-09-27 00:54:32 +09:00
Peng Liu
4d69fed8ad
Fix incorrect mount volume name in the thick plugin manifest (#1166)
Signed-off-by: Peng Liu <pliu@redhat.com>
2023-09-25 22:31:02 +09:00
Peng Liu
1dd4edded2
Move chroot from multus main process to its child processes (#1161)
We used to run chroot in the multus main process when calling other CNI
plugin binaries. We also used a mutex to lock access to pod files.
But this caused performance issues under heavy
CNI_ADD/CNI_DEL request load.

With this patch, we do chroot in the child processes instead, so
file operations in the main process are not affected by chroot.

This change requires the multus thick plugin pod to mount the CNI bin
directory at the same path as on the container host.

Signed-off-by: Peng Liu <pliu@redhat.com>
2023-09-22 17:08:57 +09:00
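One way to chroot only the child process in Go is the `Chroot` field of `syscall.SysProcAttr`, sketched below. This is an illustration under assumptions, not the code from the patch: the helper name, `/hostroot`, and `/opt/cni/bin/bridge` are hypothetical, and the field is only available on Unix-like platforms.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// newChrootedCmd builds a command whose child process, not the caller,
// changes its root directory before executing the binary. The parent's
// view of the filesystem is left untouched, so no lock is needed around
// the parent's own file operations.
func newChrootedCmd(hostRoot, pluginPath string, args ...string) *exec.Cmd {
	cmd := exec.Command(pluginPath, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Chroot: hostRoot}
	return cmd
}

func main() {
	// Hypothetical paths, chosen for illustration only.
	cmd := newChrootedCmd("/hostroot", "/opt/cni/bin/bridge")
	fmt.Println("binary:", cmd.Path)
	fmt.Println("child root:", cmd.SysProcAttr.Chroot)
	// cmd.Run() is deliberately not called: chroot needs privileges and
	// the paths above are unlikely to exist on the machine running this.
}
```

In this sketch the child changes root before it executes the binary, so the binary path is resolved inside the new root. That is one plausible reading of why the commit asks for the CNI bin directory to be mounted at the same path as on the host.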
Doug Smith
857d070679
Merge pull request #1159 from s1061123/per-node-cert
Add per-node-certification support
2023-09-18 12:16:03 -04:00
Tomofumi Hayashi
e5d19fff6b Add per-node-certification support
This change introduces per-node certification for multus pods.
Once a multus pod is launched, the specified bootstrap kubeconfig
is used for initial access; multus then sends a CSR to the
kube API to get its own certs for kube API access. Once the CSR is
accepted, the multus pod uses the generated certs for kube access.
2023-09-19 00:38:29 +09:00
Doug Smith
acfdc64991
Merge pull request #1158 from s1061123/bump-ver
Bump golang and k8s API version
2023-09-17 13:18:16 -04:00
Tomofumi Hayashi
f8afd78120 Bump golang and k8s API version 2023-09-18 01:40:44 +09:00
Doug Smith
ddb977f4b9
Merge pull request #1154 from dcbw/shared-informer
Performance and efficiency improvements in daemon/server mode
2023-09-15 09:56:52 -04:00
Dan Williams
d9c06e99d1 server: don't set CNI config readinessindicatorfile when using ConfigManager
For whatever reason calling os.Stat() on the readiness indicator file
from CmdAdd()/CmdDel() when multus is running in server mode and is
containerized often returns "file not found", which triggers the
polling behavior of GetReadinessIndicatorFile(). This greatly delays
CNI operations that should be pretty quick. Even if an exponential
backoff is used, os.Stat() can still return "file not found"
multiple times, even though the file clearly exists.

But it turns out we don't need to check the readiness file in server
mode when running with MultusConfigFile == "auto". In this mode the
server starts the ConfigManager which (a) waits until the file exists
and (b) fsnotify watches the readiness file and (c) exits the daemon
immediately if the file is deleted or moved.

This means we can assume that while the daemon is running and the
server is handling CNI requests that the readiness file exists;
otherwise the daemon would have exited. Thus CmdAdd/CmdDel don't
need to run a lot of possibly failing os.Stat() calls in the CNI
hot paths.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2023-09-14 08:58:19 -05:00
Dan Williams
b0df7dd5e3 server/config: use filepath.Join()
Signed-off-by: Dan Williams <dcbw@redhat.com>
2023-09-14 08:58:19 -05:00
Dan Williams
fb4f4aa4c1 server/config: un-export some functions no longer used outside the module
Signed-off-by: Dan Williams <dcbw@redhat.com>
2023-09-14 08:58:19 -05:00
Dan Williams
c2add82b93 server/config: fix MonitorPluginConfiguration test
The test was comparing the same configuration to itself, since
nothing in the changed CNI configuration is used in the written
multus configuration.

Instead make sure the updated CNI config contains something
that will be reflected in the written multus configuration,
and while we're there use a more robust way to wait for the
config to be written via gomega.Eventually().

Signed-off-by: Dan Williams <dcbw@redhat.com>
2023-09-14 08:58:19 -05:00
Dan Williams
8539a476fd server/config: consolidate ConfigManager start and fsnotify watching
Simplify setup by moving the post-creation operations like
GenerateConfig() and PersistMultusConfig() into a new Start() function
that also begins watching the configuration directory. This better
encapsulates the manager functionality in the object.

We can also get rid of the done channel passed to the config
manager and just use the existing WaitGroup to determine when to
exit the daemon main().

Signed-off-by: Dan Williams <dcbw@redhat.com>
2023-09-14 08:58:19 -05:00
Dan Williams
4ade85669b server/config: simplify ConfigManager creation
A couple of the setup variables for NewManager*() are already in the
multus config that it gets passed, so use those instead of passing
explicitly.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2023-09-14 08:58:19 -05:00
Dan Williams
50c0357467 server: use a shared informer pod cache rather than direct apiserver access
When running in server mode we can use a shared informer to listen for
Pod events from the apiserver, and grab pod info from that cache rather
than doing direct apiserver requests each time.

This reduces apiserver load and retry latency, since multus can poll
the local cache more frequently than it should do direct apiserver
requests.

Oddly static pods don't show up in the informer by the timeout and
require a direct apiserver request. Since static pods are not common
and are typically long-running, it should not be a big issue to
fall back to direct apiserver access for them.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2023-09-14 08:57:12 -05:00
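The lookup order described in the commit above (poll the local cache until a timeout, then fall back to a direct apiserver request) can be sketched with the standard library. Every name here is hypothetical: `podCache` stands in for the informer's store, `direct` for an apiserver call, and the timeout values are arbitrary. A real informer store is safe for concurrent use, which this plain map is not.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Pod is a hypothetical, simplified stand-in for a Kubernetes Pod.
type Pod struct {
	Namespace string
	Name      string
	UID       string
}

// podCache stands in for the informer's local store, keyed by "namespace/name".
type podCache map[string]Pod

// Arbitrary demo values; not taken from multus.
const (
	demoTimeout  = 20 * time.Millisecond
	demoInterval = 5 * time.Millisecond
)

var errNotFound = errors.New("pod not found")

// getPod polls the local cache until the timeout expires and only then
// falls back to a direct request. The bool reports whether the cache answered.
func getPod(cache podCache, direct func(ns, name string) (Pod, error),
	ns, name string, timeout, interval time.Duration) (Pod, bool, error) {
	key := ns + "/" + name
	deadline := time.Now().Add(timeout)
	for {
		if p, ok := cache[key]; ok {
			return p, true, nil
		}
		if !time.Now().Before(deadline) {
			break
		}
		time.Sleep(interval)
	}
	// Pods that never reach the cache (static pods, per the commit) end up here.
	p, err := direct(ns, name)
	return p, false, err
}

func main() {
	cache := podCache{
		"default/web-0": {Namespace: "default", Name: "web-0", UID: "uid-1"},
	}
	direct := func(ns, name string) (Pod, error) {
		if ns == "kube-system" && name == "etcd-node1" {
			return Pod{Namespace: ns, Name: name, UID: "uid-static"}, nil
		}
		return Pod{}, errNotFound
	}

	p, fromCache, err := getPod(cache, direct, "default", "web-0", demoTimeout, demoInterval)
	fmt.Println(p.UID, fromCache, err)

	p, fromCache, err = getPod(cache, direct, "kube-system", "etcd-node1", demoTimeout, demoInterval)
	fmt.Println(p.UID, fromCache, err)
}
```

Polling a local map is cheap, which is why the cache can be checked far more often than the apiserver could reasonably be queried.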
Dan Williams
cec1a53cd8 server: simplify server start
Move server start code to a common function that both regular
and test code can use. Also shut down the server from the
testcases.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2023-09-13 07:54:41 -05:00
Dan Williams
1605ffcad5 daemon: remove unused done channel
Signed-off-by: Dan Williams <dcbw@redhat.com>
2023-09-13 07:54:41 -05:00
Dan Williams
7c68481e43 vendor: add client-go and more apimachinery modules
We'll need these for the next commit.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2023-09-13 07:54:41 -05:00
Dan Williams
6b8d24c1ef server: make CmdAdd/Del/Check struct member functions
Then we can just use the Server struct kube client and exec rather
than passing them through the function parameters.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2023-09-13 07:54:41 -05:00