96344 Commits

Author SHA1 Message Date
chenyw1990
a8add50ab6 don't add pod to podQueue when the NodeName of pod is not empty 2020-11-19 08:01:59 +08:00
Kubernetes Prow Robot
379ed6644d
Merge pull request #96484 from aojea/e2etest
add e2e test for dual-stack secondary service IPs
2020-11-18 15:28:51 -08:00
Haowei Cai
40a65577c7 generated 2020-11-18 12:48:26 -08:00
Haowei Cai
8bcf34a203 unit and integration tests
apiserver dedups and adds warning in CREATE/UPDATE/PATCH requests;
also handles duplication caused by mutating admission.
2020-11-18 12:46:20 -08:00
Kubernetes Prow Robot
1c49b4425b
Merge pull request #96646 from adtac/apfe2e-2
APF e2e tests: add request drown-out fairness test
2020-11-18 12:45:37 -08:00
jay vyas
0663e190ec Update to include windows / description of how NodeOSDistro is used in the e2es 2020-11-18 15:45:03 -05:00
Haowei Cai
ffc54ed1d2 apiserver dedups owner references and adds warning
for CREATE and UPDATE requests, we check duplication before managedFields
update, and after mutating admission; for PATCH requests, we check
duplication after mutating admission
2020-11-18 12:35:45 -08:00
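As a rough illustration of the deduplication described in ffc54ed1d2, the sketch below drops repeated owner references and collects a warning for each one dropped. The `ownerRef` type, the `dedupOwnerRefs` helper, the warning text, and the choice of UID as the deduplication key are all assumptions made for this example, not the apiserver's actual code.

```go
package main

import "fmt"

// ownerRef is a hypothetical, trimmed-down owner reference.
type ownerRef struct {
	Kind string
	Name string
	UID  string
}

// dedupOwnerRefs keeps the first reference seen for each UID (an assumed key)
// and returns one warning string per dropped duplicate.
func dedupOwnerRefs(refs []ownerRef) (deduped []ownerRef, warnings []string) {
	seen := map[string]bool{}
	for _, r := range refs {
		if seen[r.UID] {
			warnings = append(warnings,
				fmt.Sprintf("duplicate owner reference for uid %s dropped", r.UID))
			continue
		}
		seen[r.UID] = true
		deduped = append(deduped, r)
	}
	return deduped, warnings
}

func main() {
	refs := []ownerRef{
		{Kind: "ReplicaSet", Name: "web", UID: "uid-1"},
		{Kind: "ReplicaSet", Name: "web", UID: "uid-1"},
		{Kind: "Job", Name: "batch", UID: "uid-2"},
	}
	deduped, warnings := dedupOwnerRefs(refs)
	fmt.Println(len(deduped), len(warnings))
}
```

Per the commit message, such a check would run after mutating admission, since admission plugins can themselves introduce duplicates.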
Jing Xu
2a568c95dd Add linuxonly on one multivolume test
This test does not work on Windows yet because the commands issued in the pod
are not available on Windows.

Change-Id: Ia0b03afd6dfe0bbb1ab00dc821775450a7e8ce54
2020-11-18 11:58:12 -08:00
Kubernetes Prow Robot
b381baab66
Merge pull request #96681 from tkashem/request-timout-e2e
Use default value when the specified timeout for a request is 0s
2020-11-18 11:44:05 -08:00
Mike Danese
7fc57a207e gce: move iptables rule to mangle
This avoids a conflict with rules that calico installs. Also, acquire
the lock everywhere.
2020-11-18 11:28:03 -08:00
Adhityaa Chandrasekar
5d2fdde120 APF defaults.go: use already defined catch-all name constant
Signed-off-by: Adhityaa Chandrasekar <adtac@google.com>
2020-11-18 19:00:08 +00:00
Adhityaa Chandrasekar
16fc690d3a APF e2e tests: add request drown-out fairness test
Signed-off-by: Adhityaa Chandrasekar <adtac@google.com>
2020-11-18 18:50:17 +00:00
Abu Kashem
2e6cb784d4
add e2e tests for request timeout 2020-11-18 13:34:48 -05:00
Abu Kashem
0090e27bd3
use default value when the specified timeout is 0s 2020-11-18 12:01:27 -05:00
Kubernetes Prow Robot
1df1f882c4
Merge pull request #96619 from SergeyKanzhelev/runtimeAPIConf
convert the runtimeclass API tests to conformance
2020-11-18 08:40:05 -08:00
Jan Safranek
92c3895115 Fix Cinder volume detection on OpenStack Train
Newer OpenStack does not truncate volumeID to 20 characters.
/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_033fa19a-a5e3-445a-8631-3e9349e540e5
was seen on an OpenStack Train node.
2020-11-18 14:46:05 +01:00
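One way to picture the fix in 92c3895115 is a lookup that tries both the full volume ID and the 20-character truncated form. The `candidateDevicePaths` helper below is hypothetical; the path prefix is taken from the example in the commit message, and the real change may probe other prefixes as well.

```go
package main

import "fmt"

// candidateDevicePaths is a hypothetical helper that lists the by-id paths
// worth probing for a Cinder volume: the full ID first (seen on OpenStack
// Train), then the form truncated to 20 characters used by older releases.
func candidateDevicePaths(volumeID string) []string {
	const prefix = "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_"
	paths := []string{prefix + volumeID}
	if len(volumeID) > 20 {
		paths = append(paths, prefix+volumeID[:20])
	}
	return paths
}

func main() {
	for _, p := range candidateDevicePaths("033fa19a-a5e3-445a-8631-3e9349e540e5") {
		fmt.Println(p)
	}
}
```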
Sean McGinnis
be131457ef
Remove stale analytics links from docs
Many README files and other docs contained a link to an appspot
tracking app that is no longer active. Following the links leads to an
error about Go 1.9 no longer being supported. Go 1.9 support was dropped
in appspot in 2019 and disabled June 2020.

This also resulted in a broken image link displaying when viewing these
files on GitHub. Since the app is no longer functioning, and since it
causes a potentially confusing (though admittedly minor) error to display,
this just removes those links, as I don't believe they are needed
anymore.

Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
2020-11-18 07:04:48 -06:00
Kubernetes Prow Robot
b3fc888863
Merge pull request #96586 from Doude/for/upstream/master/96585
Fixes fake client test generation
2020-11-18 01:44:05 -08:00
Harshal Patil
b76abcd243 Remove the typo in the logs while configuring firewall for node e2e
Signed-off-by: Harshal Patil <harpatil@redhat.com>
2020-11-18 14:20:59 +05:30
10177505
b464e51063 CHANGELOG: Update error link in 1.20 2020-11-18 16:39:51 +08:00
Kobayashi Daisuke
fa68cda13f fix staticcheck failure in apiserver/pkg/endpoints/request 2020-11-18 15:28:35 +09:00
Kubernetes Prow Robot
36d12390a7
Merge pull request #95906 from harche/iptables_fix
Verify iptable rules are applied for tcp, udp and icmp
2020-11-17 22:08:04 -08:00
Kubernetes Prow Robot
bd2f96d10b
Merge pull request #96640 from aojea/kubenetsctp
e2e SCTP test must not depend on kubenet
2020-11-17 21:10:05 -08:00
Anago GCB
89b52c729c CHANGELOG: Update directory for v1.20.0-beta.2 release 2020-11-18 03:08:44 +00:00
Kubernetes Prow Robot
160c33a6a1
Merge pull request #96533 from gnufied/reduce-vsphere-volume-name
Reduce volume name length for vsphere
2020-11-17 17:34:05 -08:00
Kubernetes Prow Robot
6715318ee7
Merge pull request #96644 from jingxu97/nov/tests
Mark some storage tests as LinuxOnly
2020-11-17 15:44:04 -08:00
Kubernetes Prow Robot
afeac926fa
Merge pull request #95981 from caesarxuchao/http2-healthcheck
Enables HTTP/2 health check
2020-11-17 14:48:05 -08:00
Hemanth Malla
2b697e68f3
Using UpperCamelCase event reason - DeletingNode, instead of verbose msg 2020-11-17 17:12:20 -05:00
Kubernetes Prow Robot
3af376d3ad
Merge pull request #96626 from jingxu97/nov/topology
Update topology tests for windows
2020-11-17 13:22:17 -08:00
Kubernetes Prow Robot
114f9988ff
Merge pull request #96322 from zshihang/conformance
Promote TokenRequest e2e test to Conformance
2020-11-17 13:22:04 -08:00
Kubernetes Prow Robot
19b15b2fc2
Merge pull request #96526 from celestehorgan/use-k8s-more
Use K8s in the README
2020-11-17 12:14:20 -08:00
Kubernetes Prow Robot
e1ab99e0d6
Merge pull request #92743 from liggitt/gc
Fix GC uid races and handling of conflicting ownerReferences
2020-11-17 12:14:06 -08:00
Adhityaa Chandrasekar
e827708635 APF e2e tests: rename request drown-out priority client names
Signed-off-by: Adhityaa Chandrasekar <adtac@google.com>
2020-11-17 18:41:08 +00:00
Kubernetes Prow Robot
6dddea5abf
Merge pull request #96613 from tosi3k/deprecate-log-dump
Add a deprecation note to k/k/cluster/log-dump directory
2020-11-17 10:28:04 -08:00
hasheddan
97c358fe5b
Fix link to cadvisor CRI-O sock path
Fixes link to point to CRI-O sock constant defined in cadvisor. We
cannot pin directly because of linux build tags in transitive dependency
opencontainers/runc.

Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
2020-11-17 12:02:27 -06:00
Jing Xu
079a4ea30c Mark some storage tests as LinuxOnly
Some storage tests have commands not available in Windows. Mark them as
LinuxOnly now. Will check later to see whether equivalent windows
commands are available.

Change-Id: I41b5668c855b2754a2e332cff4e90ebf2981aca0
2020-11-17 09:53:50 -08:00
Bryan Boreham
beceee6815 Remove unused argument from generateEvent 2020-11-17 16:51:10 +00:00
Jordan Liggitt
e491c3bc70 Add GC unit tests
Adds unit tests covering the problematic scenarios identified
around conflicting data in child owner references

                      Before   After
package level         51%      68%
garbagecollector.go   60%      75%
graph_builder.go      50%      81%
graph.go              50%      68%

Added/improved coverage of key functions that lacked unit test coverage:

* attemptToDeleteWorker
* attemptToDeleteItem
* processGraphChanges (added coverage of all added code)
2020-11-17 10:49:32 -05:00
Jordan Liggitt
603a0b016e Log cluster-scoped owners referencing namespaced owners, avoid retrying lookups forever
If a cluster-scoped dependent references a namespace-scoped owner,
this is an invalid relationship, and the lookup will never succeed in attemptToDelete.

Short-circuit requeueing in attemptToDelete and log.
2020-11-17 10:49:30 -05:00
Jordan Liggitt
221e4aa2c2 Queue non-matching children for deletion when a virtual node is marked as observed
When we observe valid coordinates for a previously virtual node,
if there are dependents that do not agree with those coordinates,
add them to the attemptToDelete queue.

This queue will check the dependent's ownerReferences using the coordinates specified by the dependent.
If all of the owners can be verified absent, the dependent will be deleted.
If some are still present, or if there are errors looking them up, the dependent will not be deleted.

If the verified owner is namespaced, and the dependent is not in the same namespace,
an event will be recorded for user visibility, since cross-namespace ownerReferences are not supported.
2020-11-17 10:49:27 -05:00
Jordan Liggitt
b655f22509 Handle virtual delete events when children don't agree on owner coordinates
If a virtual delete event is received for a node whose dependents disagree on the parent's coordinates:
1. propagate the delete to children that matched the verified absent coordinates
2. if the existing node is virtual, select a new set of coordinates from the remaining dependents
3. do not delete the parent node from the graph if the parent node is non-virtual,
   or if there are dependents that do not agree with the virtual delete event coordinates
2020-11-17 10:49:07 -05:00
Jordan Liggitt
b8d7ecf73b Make node removal conditional in processGraphChanges 2020-11-17 10:49:04 -05:00
Jordan Liggitt
ac8d419b4c Enqueue dependents for deletion when their ownerReference does not match observed parent coordinates
When adding a dependent to the graph, we ensure there is a node representing each owner reference,
and add the dependent to each parent node.

If the parent node already exists, and the dependent's ownerReference
coordinates disagree with the verified coordinates, add the dependent to the attemptToDelete queue.

This queue will check the dependent's ownerReferences using the coordinates specified by the dependent.
If all of the owners can be verified absent, the dependent will be deleted.
If some are still present, or if there are errors looking them up, the dependent will not be deleted.

If the parent node has been observed via informer event (so we know the coordinates are accurate),
and the verified owner is namespaced, and the dependent is not in the same namespace,
an event will be recorded for user visibility, since cross-namespace ownerReferences are not supported.
2020-11-17 10:47:39 -05:00
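The checks described in ac8d419b4c can be sketched as two predicates over hypothetical types. `coordinates`, `dependent`, `needsDeletionCheck`, and `crossNamespace` are names invented for this example; the real graph builder works on its own node and event types.

```go
package main

import "fmt"

// coordinates is a hypothetical identity tuple for an owner object.
type coordinates struct {
	APIVersion, Kind, Namespace, Name, UID string
}

// dependent is a hypothetical child object. Claimed holds the owner
// coordinates as the child's ownerReference implies them.
type dependent struct {
	Name      string
	Namespace string
	Claimed   coordinates
}

// needsDeletionCheck reports whether the dependent disagrees with the
// verified owner coordinates and should go to the attemptToDelete queue.
func needsDeletionCheck(d dependent, verified coordinates) bool {
	return d.Claimed != verified
}

// crossNamespace reports whether the verified owner is namespaced and the
// dependent lives in a different namespace (worth an event for the user).
func crossNamespace(d dependent, verified coordinates) bool {
	return verified.Namespace != "" && d.Namespace != verified.Namespace
}

func main() {
	verified := coordinates{"apps/v1", "ReplicaSet", "ns1", "web", "uid-1"}
	deps := []dependent{
		{Name: "pod-a", Namespace: "ns1", Claimed: verified},
		{Name: "pod-b", Namespace: "ns2",
			Claimed: coordinates{"apps/v1", "ReplicaSet", "ns2", "web", "uid-1"}},
	}
	var attemptToDelete []string // stand-in for the real work queue
	for _, d := range deps {
		if needsDeletionCheck(d, verified) {
			attemptToDelete = append(attemptToDelete, d.Name)
		}
		if crossNamespace(d, verified) {
			fmt.Println("would record event for", d.Name)
		}
	}
	fmt.Println(attemptToDelete)
}
```

Queueing a dependent does not delete it: as the commit message notes, deletion happens only after every owner it names has been verified absent.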
Jordan Liggitt
78317edb8b Short-circuit attemptToDelete loop for virtual nodes that are removed or observed
Virtual nodes are added to the attemptToDelete queue, and continue getting requeued
until they are successfully verified absent or are observed via informer.

In the meantime, if the real object associated with that UID is observed via informer,
or is observed to be deleted via informer, the graph node for that UID can be removed
or marked as observed. In that case, we should stop retrying to get the virtual node coordinates.
2020-11-17 10:46:00 -05:00
Jordan Liggitt
cae56bea0a Replace virtual node with observed node if identity differs
If the graph contains a virtual node (because some child object referenced it in an OwnerRef),
and a real informer event is observed for that uid at different coordinates,
we want to fix the coordinates of the node in the graph to match the actual coordinates.

The safe way to do this is to clone the node, replace the identity in the clone,
then replace the node with the clone.

Modifying the identity directly is not safe because it is accessed lock-free from many code paths.

Replacing the node in the graph from processGraphChanges is safe because it is the only graph writer.
2020-11-17 10:42:48 -05:00
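The clone-and-replace approach from cae56bea0a might look roughly like this. The `identity`, `node`, and `graph` types and the `fixIdentity` method are simplified stand-ins invented for the example, and the sketch runs on a single goroutine, so it omits the locking a real graph would need.

```go
package main

import "fmt"

// identity is a hypothetical set of coordinates for a graph node.
type identity struct {
	Kind, Namespace, Name, UID string
}

// node is a simplified graph node; readers may hold a pointer to it.
type node struct {
	identity   identity
	dependents []string
}

// cloneWithIdentity copies the node and swaps in the new identity, leaving
// the original untouched for anyone still reading it.
func (n *node) cloneWithIdentity(id identity) *node {
	return &node{
		identity:   id,
		dependents: append([]string(nil), n.dependents...),
	}
}

// graph maps uid to node. Only one writer is assumed, as in the commit text.
type graph struct {
	nodes map[string]*node
}

// fixIdentity replaces the node for observed.UID with a clone carrying the
// observed coordinates. It reports whether a replacement happened.
func (g *graph) fixIdentity(observed identity) bool {
	existing, ok := g.nodes[observed.UID]
	if !ok || existing.identity == observed {
		return false
	}
	g.nodes[observed.UID] = existing.cloneWithIdentity(observed)
	return true
}

func main() {
	old := &node{
		identity:   identity{"ReplicaSet", "ns1", "web", "uid-1"},
		dependents: []string{"pod-a"},
	}
	g := &graph{nodes: map[string]*node{"uid-1": old}}
	replaced := g.fixIdentity(identity{"ReplicaSet", "ns2", "web", "uid-1"})
	fmt.Println(replaced, old.identity.Namespace, g.nodes["uid-1"].identity.Namespace)
}
```

The original node is never mutated, which is the point of the design: code that already holds the old pointer keeps seeing a consistent identity.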
Jordan Liggitt
cb7b9ed532 Refactor identityFromEvent 2020-11-17 10:42:48 -05:00
Jordan Liggitt
30eb6683e6 Avoid marking virtual nodes as observed when they haven't been
Virtual nodes can be added to the GC graph in order to represent objects
which have not been observed via an informer, but are referenced via ownerReferences.

These virtual nodes are requeued into attemptToDelete until they are observed via an informer,
or successfully verified absent via a live lookup. Previously, both of those code paths
called markObserved() to stop requeuing into attemptToDelete.

Because it is useful to know whether a particular node has been observed via
a real informer event, this commit does the following:

* adds a `virtual bool` attribute to graph events so we know which ones came from a real informer
* limits the markObserved() call to the code path where a real informer event is observed
* uses an alternative mechanism to stop requeueing into attemptToDelete when a virtual node is verified absent via a live lookup
2020-11-17 10:42:48 -05:00
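A minimal sketch of the first two bullets in 30eb6683e6: events carry a `virtual` flag, and only real informer events mark a node as observed. The `event` and `node` types and the `applyEvent` function are hypothetical simplifications; the alternative mechanism for stopping requeues after a verified-absent lookup is not modeled here.

```go
package main

import "fmt"

// event is a hypothetical graph event. virtual is true when the event was
// synthesized (for example after a live lookup) rather than delivered by an
// informer.
type event struct {
	uid     string
	virtual bool
}

// node is a simplified graph node that starts out virtual.
type node struct {
	uid     string
	virtual bool
}

// markObserved records that a real informer event was seen for this node.
func (n *node) markObserved() { n.virtual = false }

// applyEvent marks the node observed only for real informer events.
func applyEvent(n *node, e event) {
	if e.virtual {
		return
	}
	n.markObserved()
}

func main() {
	n := &node{uid: "uid-1", virtual: true}
	applyEvent(n, event{uid: "uid-1", virtual: true})
	fmt.Println(n.virtual)
	applyEvent(n, event{uid: "uid-1", virtual: false})
	fmt.Println(n.virtual)
}
```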
Jordan Liggitt
445f20dbdb Switch GC absentOwnerCache to full reference
Before deleting an object based on absent owners, GC verifies absence of those owners with a live lookup.

The coordinates used to perform that live lookup are the ones specified in the ownerReference of the child.

In order to performantly delete multiple children from the same parent (e.g. 1000 pods from a replicaset),
a 404 response to a lookup is cached in absentOwnerCache.

Previously, the cache was a simple uid set. However, since children can disagree on the coordinates
that should be used to look up a given uid, the cache should record the exact coordinates verified absent.
This is an [apiVersion, kind, namespace, name, uid] tuple.
2020-11-17 10:42:48 -05:00
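To illustrate the change in 445f20dbdb, the sketch below keys the cache on the full coordinate tuple rather than on the uid alone. `objectReference` and `absentCache` are names made up for this example, and the cache here is an unbounded map, whereas a production cache would likely cap its size.

```go
package main

import "fmt"

// objectReference is a hypothetical stand-in for the tuple named in the
// commit message: [apiVersion, kind, namespace, name, uid].
type objectReference struct {
	APIVersion string
	Kind       string
	Namespace  string
	Name       string
	UID        string
}

// absentCache records coordinates that a live lookup has verified absent.
type absentCache struct {
	entries map[objectReference]struct{}
}

func newAbsentCache() *absentCache {
	return &absentCache{entries: map[objectReference]struct{}{}}
}

// Add records that a lookup at exactly these coordinates returned 404.
func (c *absentCache) Add(ref objectReference) { c.entries[ref] = struct{}{} }

// Has reports whether these exact coordinates were verified absent.
func (c *absentCache) Has(ref objectReference) bool {
	_, ok := c.entries[ref]
	return ok
}

func main() {
	cache := newAbsentCache()
	verified := objectReference{"apps/v1", "ReplicaSet", "ns1", "web", "uid-1"}
	cache.Add(verified)

	// Same uid, different namespace: the cached 404 does not cover it.
	other := verified
	other.Namespace = "ns2"

	fmt.Println(cache.Has(verified), cache.Has(other))
}
```

With a uid-only set, the second lookup would have been a cache hit even though nothing was ever verified at those coordinates.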
Jordan Liggitt
14f7f3201f Add GC integration race test 2020-11-17 10:42:48 -05:00
Jordan Liggitt
09bdf76b8a Plumb event recorder to garbage collector controller 2020-11-17 10:42:45 -05:00