Commit Graph

93232 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
1fdd8fb213
Merge pull request #93263 from liggitt/windows
Fix windows kubelet startup
2020-07-20 19:51:57 -07:00
Kubernetes Prow Robot
275eabdf72
Merge pull request #93259 from jpbetz/revert-88936
Revert nested trace PR#88936
2020-07-20 19:51:47 -07:00
Kubernetes Prow Robot
b467072a55
Merge pull request #93256 from ahg-g/ahg-metric
Rename pod_preemption_metrics to preemption_metrics.
2020-07-20 19:51:37 -07:00
Kubernetes Prow Robot
bb079afdef
Merge pull request #93253 from liggitt/utils-trace
Update k8s.io/utils
2020-07-20 19:51:28 -07:00
Kubernetes Prow Robot
c09ecf13a5
Merge pull request #93248 from giuseppe/cgroup-set-max-shares
kubelet: clamp cpu.shares to max allowed
2020-07-20 19:51:14 -07:00
Stephen Heywood
86ba88d52f Promote: Discovery PreferredVersion test 2020-07-21 00:30:25 +00:00
Kubernetes Prow Robot
5a529aa3a0
Merge pull request #91399 from danwinship/endpoint-ipfamily
multiple IPv6/dual-stack endpoint fixes
2020-07-20 13:31:14 -07:00
wawa0210
aea228f5dd fix no-new-privileges on windows 2020-07-20 16:14:52 -04:00
Jordan Liggitt
886727a4c0 Revert "Add deviceManager in windows container manager"
This reverts commit 056d73b1a1.
2020-07-20 16:13:53 -04:00
Joe Betz
02cf58102a Revert nested trace PR#88936 2020-07-20 09:55:05 -07:00
Benjamin Pineau
fcb3f1f64c Tests fixes for Azure per-VMSS VMs caches
Signed-off-by: Benjamin Pineau <benjamin.pineau@datadoghq.com>
2020-07-20 18:35:23 +02:00
Benjamin Pineau
85ecd0e17c Azure: per VMSS, incremental VMSS VMs cache
Azure's cloud provider VMSS VMs API accesses are mediated through
a cache holding and refreshing all VMSS together.

As a result we hit the VMSSVM.List API more often than necessary: an
instance's cache miss or expiration should only require a single
VMSS re-list, while it's currently O(n) relative to the number of
attached Scale Sets.

Under hard pressure (clusters with many attached VMSS that can't all
be listed in one sequence of successive API calls) the controller
manager might be stuck trying to re-list everything from scratch,
then aborting the whole operation; then re-trying and re-triggering
API rate-limits, affecting the whole Subscription.

This patch replaces the global VMSS VMs cache with per-VMSS VMs caches.
Refreshes (VMSS VMs lists) are scoped to the single relevant VMSS; under
severe throttling the various caches can be incrementally refreshed.

Signed-off-by: Benjamin Pineau <benjamin.pineau@datadoghq.com>
2020-07-20 18:35:23 +02:00
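The per-VMSS caching idea described in 85ecd0e17c can be illustrated with a minimal Go sketch. This is not the cloud provider's code: `perVMSSCache`, `listFunc` and the TTL handling are hypothetical names and simplifications chosen for illustration. The point it shows is that a miss or expiry for one scale set triggers a list call for that scale set only.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// listFunc is a hypothetical stand-in for "list the VMs of one scale set".
type listFunc func(vmss string) ([]string, error)

// entry holds the cached VM names of a single scale set.
type entry struct {
	vms     []string
	expires time.Time
}

// perVMSSCache keeps one independently expiring entry per scale set.
type perVMSSCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	list    listFunc
	entries map[string]entry
}

func newPerVMSSCache(ttl time.Duration, list listFunc) *perVMSSCache {
	return &perVMSSCache{ttl: ttl, list: list, entries: map[string]entry{}}
}

// get returns the VMs of one scale set, re-listing only that scale set
// when its entry is missing or expired.
// Simplification: the lock is held during the list call.
func (c *perVMSSCache) get(vmss string) ([]string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[vmss]; ok && time.Now().Before(e.expires) {
		return e.vms, nil
	}
	vms, err := c.list(vmss)
	if err != nil {
		// Other scale sets keep their cached entries untouched.
		return nil, err
	}
	c.entries[vmss] = entry{vms: vms, expires: time.Now().Add(c.ttl)}
	return vms, nil
}

func main() {
	calls := map[string]int{}
	cache := newPerVMSSCache(time.Minute, func(vmss string) ([]string, error) {
		calls[vmss]++
		return []string{vmss + "_0", vmss + "_1"}, nil
	})
	cache.get("vmss-a")
	cache.get("vmss-a") // served from cache, no second list call
	cache.get("vmss-b")
	fmt.Println("list calls:", calls)
}
```

Because each entry fails and expires on its own, a throttled list call for one scale set leaves the others usable, which is what allows the incremental refresh the commit message mentions.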
Kubernetes Prow Robot
5feab0aa1e
Merge pull request #93207 from hasheddan/nvidia-gpu-installer
Use local daemonset manifest for installing Nvidia drivers
2020-07-20 09:02:51 -07:00
Abdullah Gharaibeh
6f9794d5e9 Rename pod_preemption_metrics to preemption_metrics. Since this metric's type was changed from Gauge to Histogram, renaming it should make it easier for providers to migrate 2020-07-20 11:44:10 -04:00
Giuseppe Scrivano
ef935bd991
kubelet: clamp cpu shares to max allowed
Clamp cpu.shares to the maximum value allowed by the kernel.

This is not an issue when using cgroupfs, as the kernel makes sure
the value is not out of range and clamps it automatically. systemd,
however, has an additional check that prevents the cgroup creation.

Closes: https://github.com/kubernetes/kubernetes/issues/92855

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-07-20 17:18:03 +02:00
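The clamping described in ef935bd991 can be sketched in Go as follows. The constants are assumptions: 2 and 262144 are the cgroup v1 limits for cpu.shares as commonly documented, and the function name is illustrative rather than copied from the kubelet.

```go
package main

import "fmt"

// Assumed cgroup v1 bounds and conversion factors; verify against your kernel.
const (
	minShares     = 2
	maxShares     = 262144
	sharesPerCPU  = 1024
	milliCPUToCPU = 1000
)

// milliCPUToShares converts a milli-CPU request into cpu.shares and
// clamps the result to [minShares, maxShares], so that a stricter
// validator (such as systemd) never sees an out-of-range value.
func milliCPUToShares(milliCPU int64) uint64 {
	if milliCPU <= 0 {
		return minShares
	}
	shares := milliCPU * sharesPerCPU / milliCPUToCPU
	if shares < minShares {
		return minShares
	}
	if shares > maxShares {
		return maxShares
	}
	return uint64(shares)
}

func main() {
	for _, m := range []int64{0, 500, 1000, 512000} {
		fmt.Println(m, milliCPUToShares(m))
	}
}
```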
Jordan Liggitt
7aacbeac14 Update k8s.io/utils 2020-07-20 11:12:29 -04:00
Kubernetes Prow Robot
c237804533
Merge pull request #92755 from chelseychen/event-e2e-conformance
Promote Event CRUD tests to conformance
2020-07-20 05:50:51 -07:00
Kevin Klues
00df26a985 Fix a bug whereby reusable CPUs and devices were not being honored
Previously, it was possible for reusable CPUs and reusable devices (i.e.
those previously consumed by init containers) to not be reused by
subsequent init containers or app containers if the TopologyManager was
enabled. This would happen because hint generation for the
TopologyManager was not considering the reusable devices when it made
its hint calculation.

As such, it would sometimes:
1) Generate a hint for a different NUMA node, causing the CPUs and
devices to be allocated from that node instead of the one where the
reusable devices live; or
2) End up thinking there were not enough CPUs or devices to allocate and
throw a TopologyAffinity admission error

This patch fixes this by ensuring that reusable CPUs and devices are
considered as part of TopologyHint generation. This functionality is
difficult to unit test since it spans multiple components, but an e2e
test will be added in a subsequent patch to test this functionality.
2020-07-20 11:41:13 +00:00
Kevin Klues
74fe9364c3 Simplify logic in devicemanager TopologyHint generation 2020-07-20 11:41:13 +00:00
Kevin Klues
9f5f401d60 Add AnySet() to topologymanager bitmask API 2020-07-20 11:41:13 +00:00
Nikhita Raghunath
c3b75416a8 publishing: use go 1.14.6 for master and release-1.19
The `default-go-version` field specifies the go version used for the
master branch, and for any release branch whose go version is not
explicitly specified.

This commit also uses go 1.14.6 for the `release-1.19` branch.
2020-07-20 14:02:30 +05:30
Kubernetes Prow Robot
43fbe17dc6
Merge pull request #93128 from gaurav1086/convertMaptoMapPointer_fix_range_iterator_issue
[staging/azure] azure_utils: fix range iterator issue in convertMaptoMapPointer
2020-07-19 21:02:50 -07:00
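The commit title above does not spell out the bug, so the following is an assumption: it looks like the classic Go pitfall of storing the address of a range loop variable. Before Go 1.22 that variable was shared across iterations, so every stored pointer ended up referring to the same value. A minimal sketch of the safe pattern, using the hypothetical name `toPointerMap`:

```go
package main

import "fmt"

// toPointerMap converts map[string]string to map[string]*string.
// Each value is copied into a fresh variable before its address is
// taken, so every key gets its own pointer regardless of Go version.
func toPointerMap(origin map[string]string) map[string]*string {
	out := make(map[string]*string, len(origin))
	for k, v := range origin {
		value := v // per-iteration copy; &v would alias one variable before Go 1.22
		out[k] = &value
	}
	return out
}

func main() {
	m := toPointerMap(map[string]string{"env": "prod", "team": "infra"})
	fmt.Println(*m["env"], *m["team"])
}
```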
Caleb Woodbine
125e839d77 Fix formatting 2020-07-20 13:16:35 +12:00
Caleb Woodbine
05163497bc Fix bazel build 2020-07-20 11:15:57 +12:00
Caleb Woodbine
b38d7f25fe Remove watch tooling 2020-07-20 11:00:37 +12:00
Kubernetes Prow Robot
6ceb6c6845
Merge pull request #93134 from logicalhan/metric-handler
Add reset handler to the instrumentation metric library and expose Reset on the metric registries
2020-07-19 15:48:50 -07:00
Caleb Woodbine
dc30156fb8 Update error handling formatting, handling of type conversion in watch event loop 2020-07-20 10:03:49 +12:00
Caleb Woodbine
6e04fbdde1 Update error statements 2020-07-20 10:03:49 +12:00
Caleb Woodbine
a2c19d7ae0 Add watch checks 2020-07-20 10:03:49 +12:00
Caleb Woodbine
a4e29f2481 Fix formatting 2020-07-20 10:03:49 +12:00
Caleb Woodbine
cb7835bcb0 Add check for unmarshalling onto a Pod object type 2020-07-20 10:03:49 +12:00
Caleb Woodbine
c6a86b5fed Fix test to use values from v1, wording; Update variables to be more templatable 2020-07-20 10:03:49 +12:00
Caleb Woodbine
47cd8dde56 Update to check response data of UpdateStatus instead of listing after updating the status 2020-07-20 10:03:49 +12:00
Caleb Woodbine
19e9368eb8 Create Pod+PodStatus resource lifecycle test 2020-07-20 10:03:49 +12:00
Amim Knabben
1044840f6e Documenting TEST_ARGS on Node E2E helper 2020-07-19 14:37:28 -04:00
Kubernetes Prow Robot
363c3b89f5
Merge pull request #93198 from justaugustus/go1146
Update Golang to v1.14.6
2020-07-19 09:10:50 -07:00
hasheddan
4e4d629af7
Return error instead of panic if container index outside bounds
Adds a bounds check that returns an error instead of panicking when the
container index passed to kubectl exec is out of range.

Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
2020-07-19 10:04:53 -05:00
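The general pattern behind 4e4d629af7, returning an error rather than letting a slice index panic, might look like the sketch below. `containerAt` and its error text are hypothetical and are not taken from kubectl.

```go
package main

import "fmt"

// containerAt returns the container name at index i, or an error when i
// is outside the slice bounds (indexing directly would panic instead).
func containerAt(containers []string, i int) (string, error) {
	if i < 0 || i >= len(containers) {
		return "", fmt.Errorf("container index %d out of range (pod has %d containers)", i, len(containers))
	}
	return containers[i], nil
}

func main() {
	containers := []string{"app", "sidecar"}
	if _, err := containerAt(containers, 5); err != nil {
		fmt.Println("error:", err)
	}
}
```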
Kubernetes Prow Robot
66020d2292
Merge pull request #93110 from dims/adding-dims-as-reviewer-for-test
Adding dims as reviewer for test/
2020-07-19 04:32:50 -07:00
Kubernetes Prow Robot
4804fbe4c1
Merge pull request #93121 from liggitt/resource-quota
kube-up: limit critical pods to kube-system by default
2020-07-19 00:00:50 -07:00
Kubernetes Prow Robot
92e471a0bd
Merge pull request #93216 from liggitt/deflake-preferred-version
Deflake PreferredVersion e2e test
2020-07-18 21:44:50 -07:00
Jordan Liggitt
9718e7906f Deflake PreferredVersion e2e test 2020-07-18 22:51:56 -04:00
Kubernetes Prow Robot
eda07adf6e
Merge pull request #91177 from MikeSpreitzer/more-concurrency-details
Introduce more metrics on concurrency
2020-07-18 19:20:50 -07:00
hasheddan
e990698d5f
Use local daemonset manifest for installing Nvidia drivers
Updates sig-scheduling e2e Nvidia GPU tests to install drivers using
local manifest by default. Currently the DaemonSet is fetched from the
GoogleCloudPlatform/container-engine-accelerators repo by default.
Using a local manifest allows for manually specifying the
cos-gpu-installer image rather than always using latest. A remote
manifest can still be fetched by setting
NVIDIA_DRIVER_INSTALLER_DAEMONSET env var.

Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
2020-07-18 21:01:00 -05:00
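The "local by default, remote when the env var is set" behaviour described in e990698d5f could be sketched as below. Only the environment variable name comes from the commit message; the local path and the `manifestSource` helper are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// Name taken from the commit message.
const envVar = "NVIDIA_DRIVER_INSTALLER_DAEMONSET"

// Hypothetical location of the checked-in manifest.
const localManifest = "testdata/nvidia-driver-installer.yaml"

// manifestSource picks the remote URL when the env var is set and
// non-empty, and falls back to the local manifest otherwise.
// The lookup function is injected so the choice can be tested.
func manifestSource(lookup func(string) (string, bool)) (source string, remote bool) {
	if url, ok := lookup(envVar); ok && url != "" {
		return url, true
	}
	return localManifest, false
}

func main() {
	src, remote := manifestSource(os.LookupEnv)
	fmt.Printf("manifest=%s remote=%t\n", src, remote)
}
```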
Kubernetes Prow Robot
a789d56b65
Merge pull request #93119 from dcbw/e2e-ingress-misisng-return
test/e2e/ingress: add missing return to fix panics on !GCE
2020-07-18 13:58:49 -07:00
Jordan Liggitt
9d83ca4b02 Deflake GCEPD namespace deletion test 2020-07-18 15:32:02 -04:00
Jordan Liggitt
5678d40f76 Make CRDList lifecycle consistent with CRD 2020-07-18 13:53:49 -04:00
Kubernetes Prow Robot
1f14cbac54
Merge pull request #93118 from bart0sh/PR0091-update-etcd
go.mod: update etcd to fix e2e tests
2020-07-18 10:24:50 -07:00
Kubernetes Prow Robot
3a0b683c01
Merge pull request #93084 from ii/heyste-get-code-version-test
Promote Check Server Version e2e test to conformance - 1 Endpoint Coverage
2020-07-18 06:14:50 -07:00
Kubernetes Prow Robot
05f6812c2d
Merge pull request #90822 from deads2k/csr-separate-signer-flags-02
allow setting different certificates for kube-controller-managed CSR signers
2020-07-18 03:10:50 -07:00
Kubernetes Prow Robot
242f3d9dce
Merge pull request #80917 from aarnaud/windows-devicemanager
Port deviceManager to windows container manager to enable GPU access
2020-07-17 21:04:50 -07:00