Currently, some of the E2E test images have Windows support, and one of the goals is for most of
them to have it. To that end, the Image Builder currently builds those Windows container images
using a few Windows Server nodes (for 1809, 1903, and 1909) with Remote Docker enabled, hosted
on an Azure subscription dedicated to CNCF.
With this change, the dependency on Windows nodes is removed entirely, as the images can also be
built with docker buildx. An additional benefit is that adding new supported Windows OS versions
to the E2E test images' manifest lists becomes a lot easier (we wouldn't have to create a new Windows
Server node matching the new OS version, assign a DNS name, update certificates, etc.), and it
also becomes easier for other people to build their own E2E Windows test images.
However, some dependencies are still required to run on a Windows machine. To solve this, we can
simply pull helper images: e2eteam/powershell-helper:6.2.7 and e2eteam/busybox-helper:1.29.0. Their
Dockerfiles and a Makefile for them have been included in this commit. If any change to them is
ever required, a new image will be built and tagged under a different version, but they are fairly
straightforward and shouldn't need changes.
However, there is a small concern when it comes to build time: Windows servercore images are
very large (for example, mcr.microsoft.com/windows/servercore:ltsc2019 is 4.99 GB uncompressed and
about 2 GB compressed; those images are already cached on the Windows Server builder nodes, so
this isn't an issue there), and we currently support 1809, 1903, and 1909 (with 2004 to be added
soon). This can lead to excessively long build times.
We have changed the base image to nanoserver (uncompressed size: 250 MB), but some images still
require DLLs or other dependencies that have to be fetched from a servercore image.
A separate job has been defined that builds a scratch windows-servercore-cache image monthly,
and we can then fetch those dependencies from that cache image, which is very small.
This approach is preferred because the Windows images are updated periodically, and those
dependencies could be updated as well.
vendor/k8s.io/metrics/pkg/client/custom_metrics/multi_client.go:49:4: ineffective break statement. Did you mean to break out of the outer loop? (SA4011)
vendor/k8s.io/metrics/pkg/client/custom_metrics/versioned_client.go:38:2: var codecs is unused (U1000)
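For context on the SA4011 fix: in Go, a plain break inside a switch only exits the switch itself,
so breaking out of the surrounding loop requires a labeled break. A minimal, self-contained sketch
of the pattern (not the actual multi_client.go code):

```go
package main

import "fmt"

func main() {
	kinds := []string{"Pod", "Deployment", "Service"}
outer:
	for _, kind := range kinds {
		switch kind {
		case "Deployment":
			// A plain "break" here would only exit the switch statement
			// (staticcheck SA4011); the labeled break exits the for loop.
			break outer
		default:
			fmt.Println("skipping", kind)
		}
	}
	fmt.Println("done")
}
```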
1. cd to the root dir before removing the temp installer path. It was failing because we were trying to remove the directory while still inside it.
2. Expand variables in a regular (double-quoted) string and use it in the command. Expansion was failing because the variables were inside single quotes, which are not interpolated.
The available_controller creates short-lived clients to sync remote APIService
objects. These clients are constructed with HTTP transports that client-go cannot
cache (because client-go cannot know whether the TLS configs contain dynamic
functions), which may spam idle connections. A local cache works because we know
all the configs share the same dialer function and can vary only in the dynamic
cert/key.
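A minimal sketch of such a local cache, keyed only by the client cert/key material; the names and
shapes here are illustrative, not the actual aggregator code:

```go
package transportcache

import (
	"crypto/sha256"
	"net/http"
	"sync"
)

// key identifies a transport solely by its client cert/key material,
// which is the only thing that varies between configs in this controller.
type key struct {
	cert [sha256.Size]byte
	key  [sha256.Size]byte
}

// Cache memoizes transports so repeated syncs reuse connections instead of
// spawning a fresh transport (and its idle connections) on every sync.
type Cache struct {
	mu           sync.Mutex
	transports   map[key]http.RoundTripper
	newTransport func(certPEM, keyPEM []byte) http.RoundTripper
}

func New(newTransport func(certPEM, keyPEM []byte) http.RoundTripper) *Cache {
	return &Cache{
		transports:   map[key]http.RoundTripper{},
		newTransport: newTransport,
	}
}

// Get returns a cached transport for the cert/key pair, building one on miss.
func (c *Cache) Get(certPEM, keyPEM []byte) http.RoundTripper {
	k := key{cert: sha256.Sum256(certPEM), key: sha256.Sum256(keyPEM)}
	c.mu.Lock()
	defer c.mu.Unlock()
	if rt, ok := c.transports[k]; ok {
		return rt
	}
	rt := c.newTransport(certPEM, keyPEM)
	c.transports[k] = rt
	return rt
}
```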
when the DefaultPodTopologySpread feature is enabled:
If SelectorSpreadPriority is in use, PodTopologySpread inevitably gets enabled.
When only EvenPodsSpreadPriority is in use, PodTopologySpread is configured without system defaults.
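A sketch of that selection logic; the function and type names are hypothetical, and the assumption
that SelectorSpreadPriority implies the system default constraints is inferred from the contrast
drawn above, not taken from the scheduler's actual code:

```go
package main

import "fmt"

// spreadConfig says whether PodTopologySpread is enabled and whether it
// should carry the cluster-wide system default constraints.
type spreadConfig struct {
	enabled           bool
	useSystemDefaults bool
}

// configureSpread mirrors the behavior described above: any use of
// SelectorSpreadPriority forces PodTopologySpread on (assumed here to use
// system defaults), while EvenPodsSpreadPriority alone enables it without them.
func configureSpread(usesSelectorSpread, usesEvenPodsSpread bool) spreadConfig {
	switch {
	case usesSelectorSpread:
		return spreadConfig{enabled: true, useSystemDefaults: true}
	case usesEvenPodsSpread:
		return spreadConfig{enabled: true, useSystemDefaults: false}
	default:
		return spreadConfig{}
	}
}

func main() {
	fmt.Printf("%+v\n", configureSpread(true, false))
	fmt.Printf("%+v\n", configureSpread(false, true))
}
```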
Change-Id: I2389a585cd8ad0bd35b0d2acae1665cd46908b3e
When a pod is deleted, it is given a deletion timestamp. However, the
pod might still run for some time during graceful shutdown. During
this time it might still produce CPU utilization metrics and be in
the Running phase.
Currently the HPA replica calculator attempts to ignore deleted pods
by skipping over them. However, because they are not added to the
ignoredPods set, their metrics are not removed from the average
utilization calculation. This allows pods in the process of shutting
down to drag down the recommended number of replicas by producing
near-0% utilization metrics.
In fact, the ignoredPods set is a misnomer. Those pods are not fully
ignored: when the replica calculator recommends a scale-up, 0%
utilization metrics are filled in for those pods to limit the scale-up.
This prevents overscaling when pods take some time to start up. There
should really be four sets considered (readyPods, unreadyPods,
missingPods, ignoredPods), not just three.
This change renames ignoredPods to unreadyPods and keeps the scale-up-limiting
semantics. Another set, ignoredPods (this time actually ignored), is
added, and deleted pods are placed in it during grouping instead of
being skipped. Both ignoredPods and unreadyPods have their metrics
removed from consideration, but only unreadyPods have 0% utilization
metrics filled in upon scale-up.
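A minimal sketch of the grouping described above, with illustrative names rather than the actual
replica calculator code:

```go
package main

import "fmt"

type pod struct {
	name      string
	deleted   bool // deletion timestamp set; may still report metrics
	ready     bool
	hasMetric bool
}

// groupPods splits pods into the four sets discussed above. Metrics for
// both ignored and unready pods are dropped from the average; only unready
// pods later get 0% metrics filled in to damp a scale-up.
func groupPods(pods []pod) (ready, unready, missing, ignored []string) {
	for _, p := range pods {
		switch {
		case p.deleted:
			// Previously these were skipped entirely, so their near-0%
			// metrics stayed in the average; now they are truly ignored.
			ignored = append(ignored, p.name)
		case !p.hasMetric:
			missing = append(missing, p.name)
		case !p.ready:
			unready = append(unready, p.name)
		default:
			ready = append(ready, p.name)
		}
	}
	return
}

func main() {
	pods := []pod{
		{name: "a", ready: true, hasMetric: true},
		{name: "b", deleted: true, hasMetric: true},
		{name: "c"},
		{name: "d", hasMetric: true},
	}
	ready, unready, missing, ignored := groupPods(pods)
	fmt.Println(ready, unready, missing, ignored)
}
```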