cpu.cfs_period_us is 100μs by default despite having an "ms" unit
for some unfortunate reason. Documentation:
https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html#management
The desired effect of that change is to match
k8s default `CPUCFSQuotaPeriod` value (100ms before that change)
with one used in k8s without the `CustomCPUCFSQuotaPeriod` flag enabled
and Linux CFS (100us, 1000x smaller than 100ms).
To preserve loose coupling, it is needed to pass `RESTClientGetter`
instead `cmdutil.Factory` for all kubectl commands.
This PR removes `cmdutil.Factory` usage in `cluster-info` command and
instead passes `RESTClientGetter`.
The functionality provided by the finalURLTemplate is still used by
certain external projects to track the request latency for requests
performed to kube-apiserver.
Using a template of the URL, instead of the URL itself, prevents the
explosion of label cardinality in exposed metrics since it aggregates
the URLs in a way that common URLs requests are reported as being the
same.
This reverts commit bebf5a608f.
Signed-off-by: André Martins <aanm90@gmail.com>
Currently `kubectl apply` determines correct patch type for given
GVKs by trying to register schema and if it succeeds, it uses
strategic-merge-patch.
But OpenAPI endpoint already stores which patch types are supported
by GVKs. This PR checks OpenAPI endpoint to retrieve patch type,
if OpenAPI is enabled. If it is not enabled, patch type determination
will be done as conventional registration method.
Services which fail to be successfully synced as a cause by a triggered node event
are actually never retried. The commit before this one gave an example of when such
services used to be retried before, but that was not really efficient nor fully
correct. Moving to a workqueue for node events is a more modern approach to syncing
nodes, and placing all service keys that have failed on the service workqueue, in
case they do, fixes the re-sync problem
Also, now that we are using a node workqueue and use one go-routine to service items
from that queue, we don't need the `nodeSyncLock` anymore. So further clean that up
from the controller.
It dawned on me that `needsFullSync` can never be false. `needsFullSync` was used
to compare the set of nodes that were existing last time the node event handler was
triggered, with the current set of node for this run. However, if `triggerNodeSync`
gets called it's always because the set of nodes have changed due to a condition
changing on one node, or a new node being added/removed. If `needsFullSync` can
never be false then a lot of things in the service sync path was just spurious, for
ex: `servicesToRetry`, `knownHosts`. Essentially: if we ever need to `triggerNodeSync`
then the set of nodes have somehow changed and we always need to re-sync all services.
Before this patch series there was a possibility for `needsFullSync` to be set to false.
`shouldSyncNode` and the predicates used to list nodes were not aligned, specifically
for Unschedulable nodes. This means that we could have been triggered by a change to
the schedulable state but not actually computed any diffs between the old vs. new nodes.
Meaning, whenever there was a change in schedulable state we would just try to re-sync
all service updates that might have failed when we synced last time. But I believe this
to be an overlooked coincidence, rather than something actually intended.
To preserve loose coupling, it is needed to pass `RESTClientGetter`
instead `cmdutil.Factory` for all kubectl commands.
This PR removes `cmdutil.Factory` usage and instead
passes `RESTClientGetter` as well as required changes in unit tests.
The Priority is determined as follows:
P0: ClusterCIDR with higher number of matching labels has highest
priority.
P1: ClusterCIDR having cidrSet with fewer allocatable Pod CIDRs has
higher priority.
P2: ClusterCIDR with a PerNodeMaskSize having fewer IPs has higher
priority.
P3: ClusterCIDR having label with lower alphanumeric value has higher
priority.
P4: ClusterCIDR with a cidrSet having a smaller IP address value has
higher priority.
Add a new cidrset named `multicidrset` which extends the current
cidrset mechanism to track allocatable Pod and Service CIDRs.
multicidrset stores the info about allocated CIDRs in a Map as opposed
to the current cidrset implementation where it is stored in a bitmap.
Introduce networking/v1alpha1 api group.
Add `ClusterCIDR` type to networking/v1alpha1 api group, this type
will enable the NodeIPAM controller to support multiple ClusterCIDRs.