Automatic merge from submit-queue
Add a custom main instead of the standard test main, to reduce stack …
Adds a custom test main handler (see: `TestMain` in https://golang.org/pkg/testing/ for details)
Partial fix for https://github.com/kubernetes/kubernetes/issues/25965
This does the standard timeout, but strips non-kubernetes stacks out of the stack trace (e.g. it filters things like:
```
goroutine 466 [IO wait, 7 minutes]:
net.runtime_pollWait(0x7fd74c4672c0, 0x72, 0xc821614000)
/usr/local/go/src/runtime/netpoll.go:160 +0x60
net.(*pollDesc).Wait(0xc8215c21b0, 0x72, 0x0, 0x0)
/usr/local/go/src/net/fd_poll_runtime.go:73 +0x3a
net.(*pollDesc).WaitRead(0xc8215c21b0, 0x0, 0x0)
/usr/local/go/src/net/fd_poll_runtime.go:78 +0x36
net.(*netFD).Read(0xc8215c2150, 0xc821614000, 0x1000, 0x1000, 0x0, 0x7fd74c491050, 0xc820014058)
/usr/local/go/src/net/fd_unix.go:250 +0x23a
net.(*conn).Read(0xc820a5a090, 0xc821614000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/local/go/src/net/net.go:172 +0xe4
net/http.noteEOFReader.Read(0x7fd74c465258, 0xc820a5a090, 0xc8215f0068, 0xc821614000, 0x1000, 0x1000, 0x405773, 0x0, 0x0)
/usr/local/go/src/net/http/transport.go:1687 +0x67
net/http.(*noteEOFReader).Read(0xc8215ae1a0, 0xc821614000, 0x1000, 0x1000, 0xc82159ad1d, 0x0, 0x0)
<autogenerated>:284 +0xd0
bufio.(*Reader).fill(0xc8202a2b40)
/usr/local/go/src/bufio/bufio.go:97 +0x1e9
bufio.(*Reader).Peek(0xc8202a2b40, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/local/go/src/bufio/bufio.go:132 +0xcc
net/http.(*persistConn).readLoop(0xc8215f0000)
/usr/local/go/src/net/http/transport.go:1073 +0x177
created by net/http.(*Transport).dialConn
/usr/local/go/src/net/http/transport.go:857 +0x10a6
```
We may want to get even more aggressive in the future.
@kubernetes/sig-testing
Automatic merge from submit-queue
Move quota usage testing for loadbalancers into unit tests
Fixes https://github.com/kubernetes/kubernetes/issues/26319
* moved testing for node port and load balancer usage in quota to unit tests
* remove node port and node port -> loadbalancer service testing out of e2e
* covered already in replenishment_controller_test scenario
Given the time it takes to even allocate a load balancer, it seems better to test that outside of this test case to avoid unnecessary flakes.
/cc @bprashanth
Automatic merge from submit-queue
Flake 26210: decouple explicit access from port 80
Flake #26210 only happens for port 80. To decouple the possible causes, all
tests with explicit port 80 are moved to port 1080 (these were 80% of the flakes).
The urls without a specified port (which map to port 80 though) are left untouched.
If port 1080 does not show up as flake now, there is really a connection to the
actual port number.
Automatic merge from submit-queue
Reduce huge amount of logs in large cluster tests
When running tests in 2000-node clusters, I got more than 100.000 lines like this:
```
Jun 8 01:03:11.850: INFO: Condition NetworkUnavailable of node gke-gke-large-cluster-default-pool-1-03ee5a12-knrw is true instead of false. Reason: NoRouteCreated, message: Node created w ithout a route
```
that doesn't give much value.
This is PR is reducing the number of logs.
Also includes other improvements:
- Makefile rule to run tests against remote instance using existing host or image
- Makefile will reuse an instance created from an image if it was not torn down
- Runner starts gce instances in parallel with building source
- Runner uses instance ip instead of hostname so that it doesn't need to resolve
- Runner supports cleaning up files and processes on an instance without stopping / deleting it
- Runner runs tests using `ginkgo` binary to support running tests in parallel
Now that GCE routes take an extremely long time to come up and there's
a variance in "Ready" and "Schedulable", start cherry-picking tests
where we really want to have all nodes routable/schedulable for
testing. Adding logging. This will increase test times on large
clusters but should have 0 impact on normal testing.
Flake #26210 only happens for port 80. To decouple the possible causes, all
tests with explicit port 80 are moved to port 1080 (these were 80% of the flakes).
The urls without a specified port (which map to port 80 though) are left untouched.
If port 1080 does not show up as flake now, there is really a connection to the
actual port number.
Automatic merge from submit-queue
Mark runtime conformance tests as flaky and not run them in jenkins CI.
For #26809
@pwittrock As discussed offline, marking runtime tests as flaky for now. I'm not sure if those tests are required. Testing docker in every Kubernetes PR is un-necessary.
These tests can be run periodically in a separate CI. AFAIK, these tests don't seem to exercise any kube features.
When we create a PV, we should created it withoud Spec.ClaimRef.UID.
In rare cases, when 'PV added' event with UID is processed before 'PVC
added' (created by for loop few lines above), the controller does not know
a PVC with this UID and considers the PV as released. Reclaim policy is
then executed and the PV is deleted and it's never bound.
With UID="", the controller waits for the PVC to get created and binds
it.
Different tests should use different objects and watchers - I noticed
sometimes an event from old tests leaked into subsequent test in the
same function.
And add some logs.
Automatic merge from submit-queue
Update Node e2e Core OS image to run systemd with CPU & Memory accounting enabled by default
cc @derekwaynecarr
For #26289