Commit Graph

113052 Commits

Author SHA1 Message Date
Nikhita Raghunath
fd8d92a29d pkg/controller/job: re-honor exponential backoff
This commit makes the job controller re-honor exponential backoff for
failed pods. Before this commit, the controller created pods without any
backoff. This is a regression because the controller previously
created pods with an exponential backoff delay (10s, 20s, 40s ...).

The issue occurs only when the JobTrackingWithFinalizers feature is
enabled (which is enabled by default right now). With this feature, we
get an extra pod update event when the finalizer of a failed pod is
removed.

Note that the pod failure detection and new pod creation happen in the
same reconcile loop so the 2nd pod is created immediately after the 1st
pod fails. The backoff is only applied on 2nd pod failure, which means
that the 3rd pod is created 10s after the 2nd pod, the 4th pod is created 20s
after the 3rd pod, and so on.

This commit fixes a few bugs:

1. Right now, each time `uncounted != nil` and the job does not see a
_new_ failure, `forget` is set to true and the job is removed from the
queue. As a result, this condition is also triggered each time the
finalizer for a failed pod is removed, and `NumRequeues` is reset, which
results in a backoff of 0s.

2. Updates `updatePod` to only apply backoff when we see a particular
pod failed for the first time. This is necessary to ensure that the
controller does not apply backoff when it sees a pod update event
for finalizer removal of a failed pod.

3. If `JobsReadyPods` feature is enabled and backoff is 0s, the job is
now enqueued after `podUpdateBatchPeriod` seconds, instead of 0s. The
unit test for this check also had a few bugs:
    - `DefaultJobBackOff` was overwritten to 0 in certain unit tests,
    which meant that the expected backoff was always 0, so the tests
    effectively did not run any meaningful checks.
    - `JobsReadyPods` was not enabled for test cases that ran tests
    which required the feature gate to be enabled.
    - The check for expected and actual backoff had incorrect
    calculations.
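
The idea behind item 2 can be sketched as a check on the pod update event. This is a simplified illustration with hypothetical types (`pod`, `podPhase`) and a hypothetical helper (`failedForFirstTime`); the real `updatePod` works on Kubernetes API objects and may differ in detail.

```go
package main

import "fmt"

// podPhase and pod are hypothetical stand-ins for the real API types.
type podPhase string

const (
	phaseRunning podPhase = "Running"
	phaseFailed  podPhase = "Failed"
)

type pod struct {
	phase      podPhase
	finalizers []string
}

// failedForFirstTime reports whether this update is the transition into
// the Failed phase. An update that only removes a finalizer from an
// already-failed pod returns false, so no backoff would be applied for it.
func failedForFirstTime(oldPod, newPod pod) bool {
	return oldPod.phase != phaseFailed && newPod.phase == phaseFailed
}

func main() {
	running := pod{phase: phaseRunning, finalizers: []string{"tracking"}}
	failed := pod{phase: phaseFailed, finalizers: []string{"tracking"}}
	failedNoFinalizer := pod{phase: phaseFailed}

	fmt.Println(failedForFirstTime(running, failed))           // true
	fmt.Println(failedForFirstTime(failed, failedNoFinalizer)) // false
}
```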
2023-01-12 20:34:10 +05:30
Kubernetes Prow Robot
0ff0d0b94e
Merge pull request #115010 from cpanato/go119-main
releng: Update images, dependencies and version to Go 1.19.5
2023-01-12 06:00:54 -08:00
cpanato
eb38f1508a
releng: Update images, dependencies and version to Go 1.19.5
Signed-off-by: cpanato <ctadeu@gmail.com>
2023-01-12 13:43:57 +01:00
Kubernetes Prow Robot
457341c3d4
Merge pull request #114647 from kannon92/remove-legacy-job-tracking-job-controller
Removing Legacy Job Tracking Code
2023-01-12 04:38:53 -08:00
Kubernetes Prow Robot
0d6dc14051
Merge pull request #114783 from pohly/e2e-framework-timeouts
e2e framework: consolidate timeouts and intervals
2023-01-12 03:29:08 -08:00
Kubernetes Prow Robot
5b241820b7
Merge pull request #114417 from chendave/ginkgo_fix_spec
e2e: bring back total test spec for Ginkgo v2
2023-01-12 03:28:56 -08:00
Kubernetes Prow Robot
45df8f0bb3
Merge pull request #115002 from SataQiu/clean-20230112
kubeadm: remove the unused variable DefaultAuditPolicyLogMaxAge
2023-01-11 23:26:54 -08:00
SataQiu
3df577ea28 kubeadm: remove unused variable DefaultAuditPolicyLogMaxAge 2023-01-12 12:30:30 +08:00
Kubernetes Prow Robot
c9ed04762f
Merge pull request #114370 from enj/enj/r/reload_nits
encryption-at-rest: clean up context usage and duplicated code
2023-01-11 15:32:06 -08:00
Kubernetes Prow Robot
8fdaac238e
Merge pull request #114879 from olivierlemasle/bump-kube-openapi
Bump kube-openapi
2023-01-11 14:28:20 -08:00
Kubernetes Prow Robot
08d9a0ef5b
Merge pull request #113467 from pacoxu/psp-cleanup
Remove PodSecurityPolicy related code except client-go & API type
2023-01-11 14:28:07 -08:00
Kubernetes Prow Robot
97bbf07d3f
Merge pull request #114977 from apelisse/simplify-fieldmanager-test
fieldmanagertest: Reduce API surface of the test package
2023-01-11 10:31:57 -08:00
Kubernetes Prow Robot
7372e7e807
Merge pull request #114724 from tnqn/fix-lb-svc-delete-error
Do not log errors when ServiceHealthServer is closed normally
2023-01-11 10:31:45 -08:00
Kubernetes Prow Robot
280473ebc4
Merge pull request #114773 from yangjunmyfm192085/fixsmallerrorlog
fix a small log error about proxy
2023-01-11 07:51:43 -08:00
Kubernetes Prow Robot
14c2d7b39b
Merge pull request #114980 from mimowo/do-not-include-scheduler-name-in-event
Do not include scheduler name in the preemption event message
2023-01-11 06:43:56 -08:00
Kubernetes Prow Robot
6f6c468168
Merge pull request #114802 from moshe010/pod-resource-metrics
kubelet podresource: fix GetAllocatableResources metrics
2023-01-11 06:43:44 -08:00
Kubernetes Prow Robot
6699db9f59
Merge pull request #114957 from claudiubelu/kubeadm-preflight-checks-admin
unit tests: Fixes kubeadm enforce requirements test
2023-01-11 03:33:43 -08:00
Olivier Lemasle
8b8e20fcdb Bump kube-openapi 2023-01-11 11:48:07 +01:00
Rafa de Castro
a887a3b4fd
Changed code to improve output messages on error for files under test/e2e/apps (#109944)
* Improving the output of tests in case of error

* Better error message

Also, the condition in the second case was reversed

* Fixing 2 tests whose condition was inverted

* Again I got the conditions wrong

* Sorry for the confusion

* Improved error messages on failures
2023-01-11 02:11:44 -08:00
Kubernetes Prow Robot
cfa6ad50e6
Merge pull request #114972 from seans3/remove-openapi-printing
Removes deprecated kubectl openapi column printing
2023-01-11 00:53:45 -08:00
Michal Wozniak
437179afc3 Do not include scheduler name in the preemption event message 2023-01-11 09:32:21 +01:00
Kubernetes Prow Robot
6882e76c60
Merge pull request #114063 from ruquanzhao/fixNetworkTypesDoc
fix doc of types.go of network v1, v1alpha1, v1beta1
2023-01-10 23:47:56 -08:00
Kubernetes Prow Robot
f56c79398e
Merge pull request #112365 from dgrisonnet/consolidate-isomorphic-events
Update isomorphic event definition in the events/v1 client to match aggregation logic from core/v1
2023-01-10 23:47:44 -08:00
Kubernetes Prow Robot
990b2f86fa
Merge pull request #114938 from seans3/patcher-remove-kube-openapi
Removes kube-openapi dependency from Patcher
2023-01-10 21:35:22 -08:00
Antoine Pelisse
7899157345 fieldmanagertest: Reduce API surface of the test package 2023-01-10 20:05:41 -08:00
Kubernetes Prow Robot
5d794f881a
Merge pull request #114910 from SataQiu/update-staging-readme
Update staging README.md
2023-01-10 19:51:19 -08:00
Sean Sullivan
2f184814b8 Removes deprecated kubectl openapi column printing 2023-01-10 17:37:18 -08:00
Kubernetes Prow Robot
7e97b4b322
Merge pull request #114868 from apelisse/private-internal-managers
fieldmanager: Make internal managers private
2023-01-10 16:33:19 -08:00
Kubernetes Prow Robot
cf81822d38
Merge pull request #114970 from tkashem/waitgroup-refactor
apiserver: refactor WithWaitGroup handler
2023-01-10 15:25:45 -08:00
Kubernetes Prow Robot
a11ad04564
Merge pull request #114859 from pohly/e2e-ginkgo-spec-ordering
dependencies: update ginkgo to v2.7.0
2023-01-10 15:25:37 -08:00
Kubernetes Prow Robot
2a2f994c24
Merge pull request #114187 from claudiubelu/refactor-platform-deps-3
Refactors kubelet's plugin watcher
2023-01-10 15:25:26 -08:00
Monis Khan
70b414b0e5
encryption-at-rest: clean up context usage and duplicated code
This change is a no-op refactor of the encryption at rest code that
primarily changes the wiring to consistently use context for
lifecycle management (instead of a mixture of context and stop
channels).

Signed-off-by: Monis Khan <mok@microsoft.com>
2023-01-10 17:13:27 -05:00
Kubernetes Prow Robot
564f438892
Merge pull request #114691 from thockin/fix-pod-warning-string
Make the warning about pod name clearer
2023-01-10 13:47:38 -08:00
Kubernetes Prow Robot
5a896bf379
Merge pull request #114677 from kl52752/epd-warning-address-type
Generate warning for EndpointSlice AddressType FQDN
2023-01-10 13:47:27 -08:00
Abu Kashem
9093f126b8
apiserver: refactor WithWaitGroup handler 2023-01-10 16:18:55 -05:00
Kubernetes Prow Robot
f1e74f77ff
Merge pull request #114959 from ncdc/make-cr-conversions-safer
CR conversion: protect from converter input edits
2023-01-10 12:05:37 -08:00
Kubernetes Prow Robot
d4faca5386
Merge pull request #114954 from liggitt/head-prune
Include head and tail of clipped test messages
2023-01-10 12:05:25 -08:00
Kubernetes Prow Robot
aab3fb3a1e
Merge pull request #114940 from Rajalakshmi-Girish/fix-apiserver-ut-timeout-fail
Fixes the issue #114145
2023-01-10 10:39:59 -08:00
Kubernetes Prow Robot
eca8503574
Merge pull request #114742 from Transmitt0r/autoscaling-enhance-assertions
Changed remaining code to improve output for files under test/e2e/autoscaling
2023-01-10 10:39:47 -08:00
Kubernetes Prow Robot
1c30eee9a8
Merge pull request #114693 from wzshiming/fix/test
Fix this e2e failure causes subsequent e2e failures altogether
2023-01-10 10:39:36 -08:00
Kubernetes Prow Robot
2d08117e9e
Merge pull request #114065 from ruquanzhao/fixNodeTypesDoc
fix doc of types.go of node
2023-01-10 10:39:25 -08:00
Damien Grisonnet
21f2f746ab event_broadcaster: update isomorphic event def
Update the definition of an isomorphic event in the events/v1 client to
match the aggregation logic that was already present in the core/v1
implementation.

The note field was omitted even though the message was used in the core
API aggregation because we didn't reach consensus.

Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
2023-01-10 19:29:27 +01:00
Andy Goldstein
f14cc7fdfc
CR conversion: protect from converter input edits
Deep copy the input list before invoking the converter to protect from a
converter that mutates the input list.
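
A rough sketch of the defensive copy described above, using hypothetical stand-in types (`object`, `converter`) and helpers (`deepCopyList`, `safeConvert`) rather than the real conversion machinery:

```go
package main

import "fmt"

// object is a hypothetical stand-in for a custom resource.
type object map[string]string

// converter is a hypothetical stand-in for a conversion function that
// may or may not mutate its input.
type converter func([]object) []object

// deepCopyList copies the slice and every object in it, so the result
// shares no memory with the input.
func deepCopyList(in []object) []object {
	out := make([]object, len(in))
	for i, obj := range in {
		c := make(object, len(obj))
		for k, v := range obj {
			c[k] = v
		}
		out[i] = c
	}
	return out
}

// safeConvert hands the converter a deep copy, so a converter that
// edits its input cannot affect the caller's list.
func safeConvert(conv converter, in []object) []object {
	return conv(deepCopyList(in))
}

func main() {
	// A converter that mutates its input in place.
	mutating := func(list []object) []object {
		for _, o := range list {
			o["apiVersion"] = "v2"
		}
		return list
	}

	in := []object{{"apiVersion": "v1"}}
	out := safeConvert(mutating, in)
	fmt.Println(in[0]["apiVersion"], out[0]["apiVersion"]) // v1 v2
}
```

The copy costs an extra allocation per object, which is the trade-off for not having to trust the converter.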

Signed-off-by: Andy Goldstein <andy.goldstein@redhat.com>
2023-01-10 12:53:16 -05:00
Claudiu Belu
3af2c257e8 unit tests: Fixes kubeadm enforce requirements test
enforceRequirements will run preflight checks, including whether the user
is privileged or not. Because of this, the test will make different assertions
based on the user's UID. However, we don't have UIDs on Windows, so we're asserting
the wrong thing.

This fix addresses the issue.
2023-01-10 16:56:14 +00:00
Kubernetes Prow Robot
50a0bc8de1
Merge pull request #114953 from enj/enj/i/csi_migration_file_gate
Prevent CSIMigrationAzureFile gate from being disabled
2023-01-10 08:55:38 -08:00
Kubernetes Prow Robot
ef2ef15476
Merge pull request #114952 from liggitt/verify-vendor-tidy
Improve vendor verification works for each staging repo
2023-01-10 08:55:26 -08:00
Jordan Liggitt
3b64cb5f11
Include head and tail of clipped test messages 2023-01-10 11:26:34 -05:00
Kubernetes Prow Robot
5cbd6960c8
Merge pull request #114937 from seans3/export-delete-option
Exports WarningPrinter field in DeleteOptions
2023-01-10 06:59:28 -08:00
kannon92
6dfaeff33c Remove Legacy Job Tracking 2023-01-10 14:52:54 +00:00
Monis Khan
0b22cb0b72
Prevent CSIMigrationAzureFile gate from being disabled
Signed-off-by: Monis Khan <mok@microsoft.com>
2023-01-10 09:43:35 -05:00