- test/e2e/framework/*.go should have very minimal dependencies.
We can enforce that via import-boss.
- What each test/e2e/framework/* sub-package uses is less relevant,
although ideally it also should be as minimal as possible in each case.
Enforcing this via import-boss ensures that new dependencies get flagged as
problem and thus will get additional scrutiny. It might be okay to add them,
but it needs to be considered.
All code must use the context from Ginkgo when doing API calls or polling for a
change, otherwise the code would not return immediately when the test gets
aborted.
ginkgo.DeferCleanup has multiple advantages:
- The cleanup operation can get registered if and only if needed.
- No need to return a cleanup function that the caller must invoke.
- Automatically determines whether a context is needed, which will
simplify the introduction of context parameters.
- Ginkgo's timeline shows when it executes the cleanup operation.
The "todo" packages were necessary while moving code around to avoid hitting
cyclic dependencies. Now that any sub package can depend on the framework, they
are no longer needed and the code can be moved into the normal sub packages.
The MetricsGrabber checked whether a component supported metrics
grabbing, but then tests didn't have an API to use the result of that
check. Because metrics grabbing is an optional debug feature, tests
must skip checks that depend on metrics data or, when the entire
test is about metrics data, skip the test.
This is now supported with a special error that gets wrapped and
returned by the individual Grab functions.
This can be checked by trying to retrieve log output. As in the case
of no pod found, a warning gets emitted when log retrieval fails and
metrics grabbing gets disabled.
Logging is checked instead of actual metrics retrieval because the
latter is more complex and thus more likely to fail for other reasons.
The previous approach with grabbing via a nginx proxy had some
drawbacks:
- it did not work when the pods only listened on localhost (as
configured by kubeadm) and the proxy got deployed on a different
node
- starting the proxy raced with starting the pods, causing
sporadic test failures because the proxy was not set up
properly unless it saw all pods when starting the e2e.test
- the proxy was always started, whether it is needed or not
- the proxy was left running after a test and then the next
test run triggered potentially confusing messages when
it failed to create objects for the proxy
The new approach is similar to "kubectl port-forward" + "kubectl get
--raw". It uses the port forwarding feature to establish a TCP
connection via a custom dialer, then lets client-go handle TLS and
credentials.
Somehow verifying the server certificate did not work. As this
shouldn't be a big concern for E2E testing, certificate checking gets
disabled on the client side instead of investigating this further.
As its name, DeprecatedMightBeMasterNode is deprecated.
In e2e metrics, the function was used for knowing master node name to
get metrics from kube-scheduler and kube-controller-manager pods.
This make e2e metrics get these metrics directly by getting those pod
names without calling DeprecatedMightBeMasterNode().
Some e2e tests depend on the controller-manager to expose metrics
on the path /metrics.
It may happen that when the test runs, the pod is not available or the
URL not ready, causing it to fail.
Previously, the test were waiting until the pod was running, but we
need to wait until the /metrics URL is ready.
The MetricsGrabber may use the controller-manager pod
to gather metrics, however, it doesn't wait until
it is ready to serve, failing the test if this is the
case.
We wait until the controller-manager pod is running
before trying to get metrics from it.
This PR does minimal changes to interface to allow removing all
references to prometheus from `test` directory. In future I would expect
wrapping prometheus samples to provide better abstraction. Changes:
Move generic_metrics.go to testutil/metrics.go
Remove etcd.go as it was not called
Move prometheus label consts to testutil.
PrettyPrintJSON is most used e2emetrics function and that doesn't seem
specific for metrics. The implementation itself is generic, so it is
nice to move it to core framework for avoiding circular dependency.