Commit Graph

118712 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
b7e3b9f7d9
Merge pull request #120508 from aojea/aojea_owner
add aojea as client-go approver
2023-09-11 13:12:11 -07:00
Ryan Phillips
43b434f66a sig-node: add rphillips to reviewers 2023-09-11 14:34:30 -05:00
Benjamin Elder
d1b5a9953a handle GOTOOLCHAIN in kube::golang::verify_go_version
for now:
- shim FORCE_HOST_GO to GOTOOLCHAIN=local
- treat GOTOOLCHAIN set and !=auto like FORCE_HOST_GO
- otherwise set GOTOOLCHAIN=go${GO_VERSION} and fall back to gimme if necessary

TODO: set toolchain statements in go.mod files and keep them in sync
2023-09-11 12:04:45 -07:00
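
The three rules listed in d1b5a9953a can be read as a small decision function. The real logic lives in a shell function (kube::golang::verify_go_version), so the Go below is only a sketch of the decision order. The names chooseToolchain and toolchainChoice and the version "1.21.1" are assumptions made for illustration, and the gimme fallback is not modeled.

```go
package main

import "fmt"

// toolchainChoice is a hypothetical result type used only in this sketch.
type toolchainChoice struct {
	GoToolchain string // value that would be exported as GOTOOLCHAIN
	UseHostGo   bool   // true when the host's Go is used as-is
}

// chooseToolchain applies the rules from the commit message in order:
//  1. FORCE_HOST_GO set                 -> GOTOOLCHAIN=local
//  2. GOTOOLCHAIN set and not "auto"    -> treated like FORCE_HOST_GO
//  3. otherwise                         -> GOTOOLCHAIN=go${GO_VERSION}
func chooseToolchain(forceHostGo, goToolchain, goVersion string) toolchainChoice {
	switch {
	case forceHostGo != "":
		return toolchainChoice{GoToolchain: "local", UseHostGo: true}
	case goToolchain != "" && goToolchain != "auto":
		return toolchainChoice{GoToolchain: goToolchain, UseHostGo: true}
	default:
		return toolchainChoice{GoToolchain: "go" + goVersion, UseHostGo: false}
	}
}

func main() {
	// "1.21.1" is an example value, not taken from the commit.
	fmt.Printf("%+v\n", chooseToolchain("y", "", "1.21.1"))
	fmt.Printf("%+v\n", chooseToolchain("", "go1.20.8", "1.21.1"))
	fmt.Printf("%+v\n", chooseToolchain("", "auto", "1.21.1"))
}
```

Checking FORCE_HOST_GO first keeps the older variable authoritative while it is being shimmed onto GOTOOLCHAIN.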
Kubernetes Prow Robot
be968597e3
Merge pull request #120310 from gjkim42/use-container-log-instead-of-termination-log
Use container log instead of termination log
2023-09-11 11:52:23 -07:00
Kubernetes Prow Robot
74f6c263d8
Merge pull request #118544 from sohankunkerkar/remove-sandbox-image-ref
pkg/kubelet: allow sandbox image pinning from CRI
2023-09-11 11:52:12 -07:00
Ben Luddy
4d55c0687d
Reuse SupportedMediaTypes for CR content-type negotiation.
In the course of calling NegotiateOutputMediaType, each CR endpoint handler invocation instantiates
six JSON serializers. Each instantiation marshals the serializer options to JSON to construct the
serializer identifier. Under heavy CR GET load, CPU profiling shows approximately 16% of the time
spent in WriteObjectNegotiated was in SupportedMediaTypes, most of that within json.identifier().
2023-09-11 13:59:21 -04:00
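
The cost described in 4d55c0687d comes from rebuilding serializers, and marshaling their options to JSON, on every request. A minimal sketch of the reuse idea follows, assuming hypothetical types (serializerOptions, serializer, handler) that only imitate the shape of the problem; it is not the apiserver code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serializerOptions and serializer are hypothetical stand-ins.
type serializerOptions struct {
	Pretty bool
	Strict bool
}

type serializer struct {
	identifier string
}

// built counts how many serializers have been constructed.
var built int

// newSerializer marshals its options to JSON to derive an identifier,
// which is the expensive step the commit message points at.
func newSerializer(opts serializerOptions) serializer {
	built++
	id, _ := json.Marshal(opts)
	return serializer{identifier: string(id)}
}

// handler builds its serializers once and reuses them for every request.
type handler struct {
	serializers []serializer
}

func newHandler() *handler {
	return &handler{serializers: []serializer{
		newSerializer(serializerOptions{}),
		newSerializer(serializerOptions{Pretty: true}),
	}}
}

// serve stands in for content-type negotiation; it constructs nothing.
func (h *handler) serve() int {
	return len(h.serializers)
}

func main() {
	h := newHandler()
	for i := 0; i < 1000; i++ {
		h.serve()
	}
	fmt.Println("serializers built:", built)
}
```

With the slice held on the handler, the construction count stays fixed no matter how many requests are served.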
Han Kang
09d64ed7d4 promote sli metrics to stable 2023-09-11 10:17:42 -07:00
Kubernetes Prow Robot
aa4ec3c5b0
Merge pull request #119944 from Sharpz7/jm/backup-finalizers
Adding backup code for removing finalizers to more Job End States.
2023-09-11 09:30:30 -07:00
Lars Ekman
0df4a69f5c
Kube-proxy: Get nodeIPs for both families with dual-stack (#119525)
* Kube-proxy: handle dual-stack in detectNodeIPs()

* Updates
2023-09-11 09:30:23 -07:00
Kensei Nakada
0d3eafdfa3
fix(scheduling_queue): always put Pods with no unschedulable plugins into activeQ/backoffQ (#119105)
* always put Pods with no unschedulable plugins into activeQ/backoffQ

* address review comments
2023-09-11 09:30:11 -07:00
Han Kang
e6435e98ed promote component SLIs to GA; remove feature gates for component slis 2023-09-11 09:15:32 -07:00
Kubernetes Prow Robot
b1161a8ac4
Merge pull request #120559 from pohly/e2e-framework-WaitForPodsResponding-retry
e2e pods: fix WaitForPodsResponding retry
2023-09-11 07:52:10 -07:00
Stephen Kitt
e2c1c0d34a
kubeadm: drop deprecated pointer package
This replaces deprecated k8s.io/utils/pointer functions with their ptr
equivalent.

Signed-off-by: Stephen Kitt <skitt@redhat.com>
2023-09-11 16:41:12 +02:00
Stephen Kitt
357d7804b8
kube-proxy: drop deprecated pointer package
This replaces deprecated k8s.io/utils/pointer functions with their ptr
equivalent.

Signed-off-by: Stephen Kitt <skitt@redhat.com>
2023-09-11 16:38:37 +02:00
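
Both Stephen Kitt commits above swap type-specific pointer helpers for a generic one. The sketch below defines a local helper named to that imitates that generic shape; it is written here so the example needs no external module and is not the k8s.io/utils/ptr source.

```go
package main

import "fmt"

// to returns a pointer to a copy of v. One generic function covers
// every type that previously needed its own helper.
func to[T any](v T) *T {
	return &v
}

func main() {
	replicas := to(int32(3))
	name := to("kube-proxy")
	enabled := to(true)
	fmt.Println(*replicas, *name, *enabled)
}
```

Because the helper takes its argument by value, the returned pointer never aliases the caller's variable.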
Paco Xu
678b958567 use universal decoder and add a check on default dns Policy of static pod for test 2023-09-11 22:31:35 +08:00
Gunju Kim
1fb4eee94e
Use container log instead of termination log
Since the termination log cannot be accessed until the container is
terminated, use the container log.
2023-09-11 22:55:09 +09:00
Kubernetes Prow Robot
6c578bc982
Merge pull request #120428 from pohly/dra-scheduler-reallocation-flake
DRA: scheduler reallocation flake
2023-09-11 02:56:12 -07:00
Patrick Ohly
fc3ee07b51 e2e pods: fix WaitForPodsResponding retry
The status error was embedded inside the new error constructed by
WaitForPodsResponding's get function, but not wrapped. Therefore
`apierrors.IsServiceUnavailable(err)` didn't find it and returned false -> no
retries.

Wrapping fixes this and Gomega formatting of the error remains useful:

	err := &errors.StatusError{}
	err.ErrStatus.Code = 503
	err.ErrStatus.Message = "temporary failure"

	err2 := fmt.Errorf("Controller %s: failed to Get from replica pod %s:\n%w\nPod status:\n%s",
		"foo", "bar",
		err, "some status")
	fmt.Println(format.Object(err2, 1))
	fmt.Println(errors.IsServiceUnavailable(err2))

=>

    <*fmt.wrapError | 0xc000139340>:
    Controller foo: failed to Get from replica pod bar:
    temporary failure
    Pod status:
    some status
    {
        msg: "Controller foo: failed to Get from replica pod bar:\ntemporary failure\nPod status:\nsome status",
        err: <*errors.StatusError | 0xc0001a01e0>{
            ErrStatus: {
                TypeMeta: {Kind: "", APIVersion: ""},
                ListMeta: {
                    SelfLink: "",
                    ResourceVersion: "",
                    Continue: "",
                    RemainingItemCount: nil,
                },
                Status: "",
                Message: "temporary failure",
                Reason: "",
                Details: nil,
                Code: 503,
            },
        },
    }

    true
2023-09-11 11:54:15 +02:00
Patrick Ohly
6f9140e421 DRA scheduler: stop allocating before deallocation
This fixes a test flake:

    [sig-node] DRA [Feature:DynamicResourceAllocation] multiple nodes reallocation [It] works
    /nvme/gopath/src/k8s.io/kubernetes/test/e2e/dra/dra.go:552

      [FAILED] number of deallocations
      Expected
          <int64>: 2
      to equal
          <int64>: 1
      In [It] at: /nvme/gopath/src/k8s.io/kubernetes/test/e2e/dra/dra.go:651 @ 09/05/23 14:01:54.652

This can be reproduced locally with

    stress -p 10 go test ./test/e2e -args -ginkgo.focus=DynamicResourceAllocation.*reallocation.works  -ginkgo.no-color -v=4 -ginkgo.v

Log output showed that the sequence of events leading to this was:
- claim gets allocated because of selected node
- a different node has to be used, so PostFilter sets
  claim.status.deallocationRequested
- the driver deallocates
- before the scheduler can react and select a different node,
  the driver allocates *again* for the original node
- the scheduler asks for deallocation again
- the driver deallocates again (causing the test failure)
- eventually the pod runs

The fix is to disable allocations first by removing the selected node and then
starting to deallocate.
2023-09-11 10:56:17 +02:00
Junhao Zou
2dd7db306a
Update modules.txt 2023-09-11 16:33:16 +08:00
Rohit Singh
61ecc2ad88 Retry operations if CSI Driver Isn't Found by Treating this Error as Transient 2023-09-11 06:07:40 +00:00
Qiutong Song
d3eb082568 Create a node startup latency tracker
Signed-off-by: Qiutong Song <songqt01@gmail.com>
2023-09-11 05:54:25 +00:00
Kubernetes Prow Robot
cc0a24d2e8
Merge pull request #120406 from wlq1212/cheanup/framework/timeout
e2e_framework: stop using deprecated wait.ErrWaitTimeout
2023-09-10 21:10:10 -07:00
Paco Xu
2d86c333f5 add test case for generating etcd manifests 2023-09-11 10:35:50 +08:00
Paco Xu
912041ce41 kubeadm: fix diff order and add test for new default value manifest 2023-09-11 10:35:50 +08:00
Stephen Heywood
41b62c4dd7 Promote PV/PVC e2e test to Conformance 2023-09-11 10:25:07 +12:00
Kubernetes Prow Robot
0ee315b94c
Merge pull request #120375 from pegasas/proxy
Improve logging on kube-proxy exit
2023-09-10 12:08:10 -07:00
Riaan Kleinhans
0936c8de59
remove persistentvolume endpoints from pending_eligible_endpoints.yaml 2023-09-11 06:53:28 +12:00
Kubernetes Prow Robot
098d4c7b9e
Merge pull request #120546 from SaumyaBhushan/issue
added documentation about the format of certificateKey
2023-09-10 10:26:10 -07:00
pegasas
f446745777 Improve logging on kube-proxy exit 2023-09-11 00:50:29 +08:00
Kubernetes Prow Robot
25c7a1439a
Merge pull request #120069 from aojea/service_conformance
promote to conformance Service multiprotocol tests
2023-09-10 07:26:09 -07:00
SaumyaBhushan
df5c1bb1ea added documentation about the format of certificateKey
Signed-off-by: SaumyaBhushan <saumya.bhushan666@gmail.com>
2023-09-10 19:50:42 +05:30
Kubernetes Prow Robot
49768134e5
Merge pull request #119754 from pbxqdown/kubelet-fix-typo
Fix some typos in kubelet component source code
2023-09-09 19:36:11 -07:00
Kubernetes Prow Robot
b343878daa
Merge pull request #120438 from ritazh/kmsv2-metrics-apiserverid
kmsv2: add apiserver identity to metrics
2023-09-09 16:46:09 -07:00
Rita Zhang
43ccf6c4e8
kmsv2: add apiserver identity to metrics
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
2023-09-09 15:31:32 -07:00
Kubernetes Prow Robot
33c5bd631d
Merge pull request #120008 from skitt/drop-intstr-ptr-wrappers
Use ptr.To to retrieve intstr addresses
2023-09-09 07:24:09 -07:00
Kubernetes Prow Robot
fd8f2c7fc6
Merge pull request #120541 from pacoxu/kubeadm-fix-hash
kubeadm: add log for static pod manifest diff
2023-09-09 06:08:08 -07:00
HirazawaUi
c1a0aa08e3 Add cni plugin auto Arch and OS selection 2023-09-09 20:33:12 +08:00
Paco Xu
b443a841e3 kubeadm: add log for static pod manifest diff 2023-09-09 20:00:31 +08:00
Kubernetes Prow Robot
21f7bf66fa
Merge pull request #120272 from tzneal/add-tzneal-to-sig-node-reviewers
OWNERS_ALIASES: add tzneal to sig-node-reviewer
2023-09-08 18:28:11 -07:00
Kubernetes Prow Robot
37cf2638c9
Merge pull request #119619 from skitt/intstr-parse-parseint
Limit intstr.Parse() to 32-bit integer parsing
2023-09-08 13:04:29 -07:00
Kubernetes Prow Robot
41689233b4
Merge pull request #120334 from pohly/scheduler-clear-unschedulable-plugins
scheduler: avoid false "unschedulable" pod state
2023-09-08 12:01:23 -07:00
Kubernetes Prow Robot
817488e4fa
Merge pull request #120082 from aojea/hostnetwork_services_fallback
e2e network test for udp services with hostNetwork clients
2023-09-08 12:01:12 -07:00
Alexander Zielenski
f135eed37b update codegen 2023-09-08 09:49:35 -07:00
Kubernetes Prow Robot
bec95ed575
Merge pull request #120527 from cpanato/bump-distroless
Bump distroless-iptables to v0.3.2
2023-09-08 09:36:29 -07:00
Kubernetes Prow Robot
15a019d841
Merge pull request #120526 from cpanato/update-prom
[releng] Update publishing-bot rules for active release branches that uses go1.20 to Go 1.20.8
2023-09-08 09:36:18 -07:00
Aleksandra Malinowska
d7264d0af0 Make StatefulSet restart pods with phase Succeeded 2023-09-08 17:47:17 +02:00
Kubernetes Prow Robot
d7aeb7f853
Merge pull request #120524 from jprzychodzen/kcm-args
[cluster/gce] Add possibility to specify KCM specific args for scalability tests
2023-09-08 08:24:26 -07:00
Kubernetes Prow Robot
f6a87aebe6
Merge pull request #120499 from tukwila/gorilla/websocket_v1.5.0
bump: upgrade gorilla/websocket from v1.4.2 to v1.5.0
2023-09-08 08:24:15 -07:00
Patrick Ohly
4e73634b53 scheduler: start scheduling attempt with clean UnschedulablePlugins
When some plugin was registered as "unschedulable" in some previous scheduling
attempt, it kept that attribute for a pod forever. When that plugin then later
failed with an error that requires backoff, the pod was incorrectly moved to the
"unschedulable" queue where it got stuck until the periodic flushing because
there was no event that the plugin was waiting for.

Here's an example where that happened:

    framework.go:1280: E0831 20:03:47.184243] Reserve/DynamicResources: Plugin failed err="Operation cannot be fulfilled on podschedulingcontexts.resource.k8s.io \"test-dragxd5c\": the object has been modified; please apply your changes to the latest version and try again" node="scheduler-perf-dra-7l2v2" plugin="DynamicResources" pod="test/test-dragxd5c"
    schedule_one.go:1001: E0831 20:03:47.184345] Error scheduling pod; retrying err="running Reserve plugin \"DynamicResources\": Operation cannot be fulfilled on podschedulingcontexts.resource.k8s.io \"test-dragxd5c\": the object has been modified; please apply your changes to the latest version and try again" pod="test/test-dragxd5c"
    ...
    scheduling_queue.go:745: I0831 20:03:47.198968] Pod moved to an internal scheduling queue pod="test/test-dragxd5c" event="ScheduleAttemptFailure" queue="Unschedulable" schedulingCycle=9576 hint="QueueSkip"

Pop still needs the information about unschedulable plugins to update the
UnschedulableReason metric. It can reset that information before returning the
PodInfo for the next scheduling attempt.
2023-09-08 16:52:36 +02:00
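
The last paragraph of 4e73634b53 says Pop first uses the recorded plugins for a metric update and then clears them before handing the pod out. A minimal sketch of that ordering follows, assuming hypothetical types (queuedPod, queue) and a plain map named metricUpdates standing in for the real metric; it is not the scheduler's implementation.

```go
package main

import "fmt"

// queuedPod is a hypothetical stand-in for the scheduler's queued pod info.
type queuedPod struct {
	Name                 string
	UnschedulablePlugins map[string]struct{}
}

// queue is a hypothetical, single-goroutine stand-in for the scheduling queue.
type queue struct {
	active        []*queuedPod
	metricUpdates map[string]int // stand-in for the per-plugin metric
}

// pop returns the next pod. It reads the plugins recorded by the previous
// attempt for the metric update, then resets them so the next scheduling
// attempt starts clean.
func (q *queue) pop() *queuedPod {
	if len(q.active) == 0 {
		return nil
	}
	p := q.active[0]
	q.active = q.active[1:]
	for plugin := range p.UnschedulablePlugins {
		q.metricUpdates[plugin]++
	}
	p.UnschedulablePlugins = map[string]struct{}{}
	return p
}

func main() {
	q := &queue{
		active: []*queuedPod{{
			Name:                 "test/test-dragxd5c",
			UnschedulablePlugins: map[string]struct{}{"DynamicResources": {}},
		}},
		metricUpdates: map[string]int{},
	}
	p := q.pop()
	fmt.Println(len(p.UnschedulablePlugins), q.metricUpdates["DynamicResources"])
}
```

Resetting inside pop, rather than where the failure is recorded, means a stale plugin from an earlier attempt cannot steer a later failure into the unschedulable queue.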