For whatever reason calling os.Stat() on the readiness indicator file
from CmdAdd()/CmdDel() when multus is running in server mode and is
containerized often returns "file not found", which triggers the
polling behavior of GetReadinessIndicatorFile(). This greatly delays
CNI operations that should be pretty quick. Even if an exponential
backoff is used, os.Stat() can still return "file not found"
multiple times, even though the file clearly exists.
But it turns out we don't need to check the readiness file in server
mode when running with MultusConfigFile == "auto". In this mode the
server starts the ConfigManager which (a) waits until the file exists
and (b) fsnotify watches the readiness and (c) exits the daemon
immediately if the file is deleted or moved.
This means we can assume that while the daemon is running and the
server is handling CNI requests that the readiness file exists;
otherwise the daemon would have exited. Thus CmdAdd/CmdDel don't
need to run a lot of possibly failing os.Stat() calls in the CNI
hot paths.
Signed-off-by: Dan Williams <dcbw@redhat.com>
The test was comparing the same configuration to itself, since
nothing in the changed CNI configuration is used in the written
multus configuration.
Instead make sure the updated CNI config contains something
that will be reflected in the written multus configuration,
and while we're there use a more robust way to wait for the
config to be written via gomega.Eventually().
Signed-off-by: Dan Williams <dcbw@redhat.com>
Simplify setup by moving the post-creation operations like
GenerateConfig() and PersistMultusConfig() into a new Start() function
that also begins watching the configuration directory. This better
encapsulates the manager functionality in the object.
We can also get rid of the done channel passed to the config
manager and just use the existing WaitGroup to determine when to
exit the daemon main().
Signed-off-by: Dan Williams <dcbw@redhat.com>
A couple of the setup variables for NewManager*() are already in the
multus config that it gets passed, so use those instead of passing
explicitly.
Signed-off-by: Dan Williams <dcbw@redhat.com>
When running in server mode we can use a shared informer to listen for
Pod events from the apiserver, and grab pod info from that cache rather
than doing direct apiserver requests each time.
This reduces apiserver load and retry latency, since multus can poll
the local cache more frequently than it should do direct apiserver
requests.
Oddly static pods don't show up in the informer by the timeout and
require a direct apiserver request. Since static pods are not common
and are typically long-running, it should not be a big issue to
fall back to direct apiserver access for them.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Move server start code to a common function that both regular
and test code can use. Also shut down the server from the
testcases.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Then we can just use the Server struct kube client and exec rather
than passing them through the function parameters.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Multus is a pretty critical piece of infrastructure, so it shouldn't
be subject to the same lower QPS limits as most components are.
Signed-off-by: Dan Williams <dcbw@redhat.com>
We want the in-cluster client that the multus server uses to use
the same client config (QPS, protobuf, grpc, etc) as the regular
client.
Signed-off-by: Dan Williams <dcbw@redhat.com>
The logging package contains two functions, SetLogOptions and SetLogFile, that could experience race conditions when multiple goroutines access and modify the logger struct concurrently.
To address these issues, a copy of the logger struct is now created in each function to eliminate data races.
In addition, the test-go.sh script is updated to include the '-race' flag, enabling race detection during testing. This change helps prevent future race conditions by activating the Go race detector.
Signed-off-by: Alina Sudakov <asudakov@redhat.com>
This change fixes multus to support CNI plugin which does not
create any interface and return empty result. Some CNI plugin
may do network configuration change only and does not create
any interface and return empty CNI result. Current multus assumes
that CNI config always creates some interfaces hence above CNI
plugin is out of assumption and multus may not work with such
plugins.
Thick server's chroot mutex is missing in GetDefaultNetworks,
that touch the pod filesystem. This change adds mutex lock there
and prevent race condition.
Fix#1072