TestStart was previously flaky. In approx 100_000 local runs, it failed
about 70% of the time, and has been mentioned as a flaky unit test in
the past.
This flake was due to a race condition with the logic as written and the
go scheduler. UpdateThreshold calls `notifier.Start(events)` in a new Go
Routine, which is not guarunteed to be called immediately.
This meant that if `m.Start()` was called before `notifier.Start()`, the
test would fail, as the notifier would not have been started before the
4 events were processed and lock released.
Here, we update the test to more closely match the intended application
behaviour, and have events passed to the channel when `Start` is called
on the notifier.
This ensures that -Start gets called and additionally validates
that the correct channel is provided to the notifier.
Stop was never called previously, as it only gets called on a subsequent
call to UpdateThreshold. `AnyTimes()` hid that this did not occur.
cAdvisor has code to expose OOM metrics since 0.40.0, but this was not
included in Kubelet so far. This commit enables it.
Signed-off-by: Jorik Jonker <jorik.jonker@eu.equinix.com>
- Modify GetAllEventsSinceThreadUnsafe to return a watchCacheInterval
- Modify Watch() to compute a watchCacheInterval rather than a slice
of all "initEvents" and pass this interval to process()
- Use interval::Next() to obtain events to process rather than obtain
them all at once
- Modify tests accordingly to use interval
- On invalidation, stop processing and stop the watch.
- Make indexValidator injectable for testing
- Add unit test for verifying the behaviour of stopping the watch.
Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
"-bench=PerfScheduling/Preemption/500Nodes" ran both the
PerfScheduling/Preemption/500Nodes and the
PerfScheduling/PreemptionPVs/500Nodes benchmark.
This can be avoided by choosing names where none is the prefix of another.
When running an integration test that measures performance, like for example
test/integration/scheduler_perf, running etcd with debug level output is
undesirable because it creates additional load on the system and isn't
realistic.
The default is still "debug", but ETCD_LOGLEVEL=warn can be used to override
that.
GetNamespaceLabelsSnapshot has a fallback when it gets errors when looking up a
namespace, therefore reporting the error is more informational than a real
error. In particular, not finding the namespace is normal when running
test/integration/scheduler_perf and happens so frequently that there is a lot
of output on stderr:
E0120 12:19:09.204768 95305 plugin.go:138] "getting namespace, assuming empty set of namespace labels" err="namespace \"namespace-1\" not found" namespace="namespace-1"
to match their kubectl config subcommand names and reflect that they
are used for setting values, not just creating.
Deprecated NewCmdConfigSetAuthInfo in favor of NewCmdConfigSetCredentials.
Did some minor refactoring of one of the complete functions to eliminate
an unused argument and not wrap returned errors that do not add detail.