By default check the KCM and scheduler on 127.0.0.1:<port> as that is the
defaall --bind-address kubeamd uses for these components.
For kube-apiserver take the value from APIEndpoint.AdvertiseAddress which is
dynamically detected from the host. Unless the user has passed explicitly --advertise-address
as an extra arg.
Read the <port> values for all components from the --secure-port flag
value if needed. Otherwise use defaults.
Use /livez for apiserver and scheduler. Add TODO for KCM to
switch to /livez as well.
- Split the code that tries to get node name from SSR into
a new function getNodeNameFromSSR(). Unit test the function.
- Fix error that the "system:nodes:" prefix was not trimmed.
- Fix mislearding errors around FetchInitConfigurationFromCluster.
This function performs multiple actions, and the "get node"
action can also be of type apierrors.NotFound(). This creates
confusion in the returned error in enforceRequirement during
upgrade. Fix this problem.
Make the following changes:
- When dryrunning if the given kubeconfig does not exist
create a DryRun object without a real client. This means only
a fake client will be used for all actions.
- Skip the preflight check if manifests exist during dryrun.
Print "would ..." instead.
- Add new reactors that handle objects during upgrade.
- Add unit tests for new reactors.
- Print message on "upgrade node" that this is not a CP node
if the apiserver manifest is missing.
- Add a new function GetNodeName() that uses 3 different methods
for fetching the node name. Solves a long standing issue where
we only used the cert in kubelet.conf for determining node name.
- Various other minor fixes.
The new DRAAdminAccess feature gate has the following effects:
- If disabled in the apiserver, the spec.devices.requests[*].adminAccess
field gets cleared. Same in the status. In both cases the scenario
that it was already set and a claim or claim template get updated
is special: in those cases, the field is not cleared.
Also, allocating a claim with admin access is allowed regardless of the
feature gate and the field is not cleared. In practice, the scheduler
will not do that.
- If disabled in the resource claim controller, creating ResourceClaims
with the field set gets rejected. This prevents running workloads
which depend on admin access.
- If disabled in the scheduler, claims with admin access don't get
allocated. The effect is the same.
The alternative would have been to ignore the fields in claim controller and
scheduler. This is bad because a monitoring workload then runs, blocking
resources that probably were meant for production workloads.
Since 1.31 the core component cloud provider logic should not exist,
this disables the existing code in the kube-controller-manager that still
expects to work with the cloud-provider logic to avoid having time bombs
in the code base that can break the component.
The code can not be completely removed because this will impact existing
users that may be using some of the flags breaking their deployments, so
this just removes the code that is no longer to be used becuase it
depends on options that no longer are exposed to users.
It also adds validation on the configuration/flag level to ensure that
the --cloud-provider flag can only be set to external or empty.
This removes the DRAControlPlaneController feature gate, the fields controlled
by it (claim.spec.controller, claim.status.deallocationRequested,
claim.status.allocation.controller, class.spec.suitableNodes), the
PodSchedulingContext type, and all code related to the feature.
The feature gets removed because there is no path towards beta and GA and DRA
with "structured parameters" should be able to replace it.
Refactor Healthz with Metrics Address for internal configuration of
kube-proxy adhering to the v1alpha2 version specifications as detailed
in https://kep.k8s.io/784.
Signed-off-by: Daman Arora <aroradaman@gmail.com>
In Go unit tests, the entire test output becomes the failure message because
`go test` doesn't track why a test fails. This can make the failure message
pretty large, in particular in integration tests.
We cannot identify the real failure either because Kubernetes has no convention
for how to format test failures. What we can do is recognize log output added
by klog.
prune-junit-xml now moves the full text to to the test output and only keep
those lines in the failure which are not from klog.
The klog output parsing might eventually get moved to
k8s.io/logtools/logparse. For now it is developed as a sub-package of
prune-junit-xml.