This adds a call to createBalancedPods during the ubernetes_lite scheduling e2es,
which are prone to improper score balancing due to unbalanced utilization.
The test "validates that there is no conflict between pods with same
hostPort but different hostIP and protocol" was testing the scheduler
capability to schedule pods on the same node with hostPorts, however,
it wasn´t validating that the HostPorts was working, causing false
positives, because the pods were scheduled, but the HostPort exposed
wasn´t working.
In order to test the HostPort functionality, we have to use HostNetwork
pods, that are incompatible with Windows platforms. Also, since this
is touching both network and scheduling, there is no clear the ownership,
but sig-network is happy to adopt it.
We also add a new test for scheduling only under "scheduling", so Windows
folks can use it to test the scheduled in that platform.
The test is not cleaning all pods it created.
Memory balancing pods are deleted once the test namespace is.
Thus, leaving the pods running or in terminating state when a new test is run.
In case the next test is "[sig-scheduling] SchedulerPredicates [Serial] validates resource limits of pods that are allowed to run",
the test can fail.
The e2e test, included as part of Conformance,
"validates that there is no conflict between
pods with same hostPort but different hostIP and protocol"
was only testing that the pods were scheduled without conflict
but was never testing the functionality.
The test should check that pods with containers forwarding the same
hostPort can be scheduled without conflict, and that those exposed
HostPort are forwarding the ports to the corresponding pods.
the predicate tests were using loopback addresses for the the
hostPort test, however, those have different semantics depending
on the IP family, i.e. you can not bind to ::1 and ::2 simultanously,
in addition, IP forwarding from localhost to localhost in IPv6 is
not working since it doesn't have the kernel route_localnet hack.
* Rename const for topology.../zone
* Rename const for topology.../region
* Rename const for failure-domain.../zone
* Rename const for failure-domain.../region
* Restore old names for compat
A spreading test is more meaningful with a greater number of Pods. However, we cannot always expect perfect spreading. We accept a skew of 2 for 5*z Pods, where z is the number of zones.
Change-Id: Iab0de06a95974fbfec604f003b550f15db618ebd
Updates sig-scheduling e2e Nvidia GPU tests to install drivers using
local manifest by default. Currently the DaemonSet is fetched from the
GoogleCloudPlatform/container-enginer-accelerators repo by default.
Using a local manifest allows for manually specifying the image
cos-gpu-installer image rather than always using latest. A remote
manifest can still be fetched by setting
NVIDIA_DRIVER_INSTALLER_DAEMONSET env var.
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
Currently when checking for unscheduled pods an exception will be raised
if a pod is not scheduled and the status is unknown. This update modifies
the logic to include any pod without a NodeName in the not scheduled
pods returned.
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
When node scheduling tests were updated to use worker instead of master
nodes the GetPodsScheduled function, which is tasked with getting all
scheduled and not scheduled pods inadvertently was changed to ignore all
pods that have an empty NodeName before checking whether pods had been
scheduled or not. This updates the function to include pods without a
NodeName in the check for unscheduled pods.
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
Ready schedulable nodes are being inserted into an unitialized string
set, causing an assignment to entry in nil map in the underlying data
structure. This initializes the string set before attempting to insert
nodes.
Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
The namespace parameter "ns" of getScheduledAndUnscheduledPods() is
always metav1.NamespaceAll.
This removes the parameter from the function for cleanup.
WaitForStableCluster() checks all pods run on worker nodes, and the
function used to refer master nodes to skip checking controller plane
pods.
GetMasterAndWorkerNodes() was used for getting master nodes, but the
implementation is not good because it usesDeprecatedMightBeMasterNode().
This makes WaitForStableCluster() refer worker nodes directly to avoid
using GetMasterAndWorkerNodes().
Conformance tests must not rely on the kubelet API in order to
pass. SchedulerPredicates tests attempt to use the kubelet API
in their BeforeEach, some of which are tagged as Conformance.
Is there a compelling reason to use the kubelet's view of pods
for a given node instead of the apiserver's view of the pods?
WaitForPod*() are just wrapper functions for e2epod package, and they
made an invalid dependency to sub e2e framework from the core framework.
So this replaces WaitForPodRunning() with the e2epod function.
This is gross but because NewDeleteOptions is used by various parts of
storage that still pass around pointers, the return type can't be
changed without significant refactoring within the apiserver. I think
this would be good to cleanup, but I want to minimize apiserver side
changes as much as possible in the client signature refactor.
1. move the integration test of TaintBasedEvictions to test/integration/node
2. move the e2e test of TaintBasedEvictions e2e test/e2e/node
3. modify the conformance file to adapt the TaintBasedEviction test
Similar functionality is required across e2e tests for RuntimeClass.
Let's create runtimeclass as part of the framework/node package.
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
NodeResourceFit plugin's Filter method responsible for checking if a pod fits
a given node ignores resource limits and acknowledge resource requests only.
Given both tests validating resource limits of pods were setting only pod resource limits,
ability of NodeResourceFit plugin to properly filter nodes was not tested at all.