Before the addition of GetAllocatableResources, the
podresources API had just one endpoint `List()`, thus we could just
account for the total of the calls to have a good pulse of the API usage.
Now that we extend the API with more endpoints
(`GetAlloctableResources`), in order to improve the observability we add
per-endpoint counters, in addition to the existing counter of the total
API calls.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Add feature gate to disable the GetAllocatableResources API.
The feature gate isd alpha stage, disabled by default.
Add e2e test to demonstrate the behaviour with feature gate disabled.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Add test to reflect the correct behaviour according to
review comments.
Most notably, we should consider that -as the device plugin API
allows to express- a device ID can have multiple "NUMA" node IDs.
(example: AMD Rome).
More details:
https://github.com/kubernetes/kubernetes/pull/95734#discussion_r539545041
Signed-off-by: Francesco Romani <fromani@redhat.com>
From https://github.com/kubernetes/kubernetes/pull/96553
we are reminded we need to handle the case on which
a device plugin reports nil Topology, which is legal.
Add unit test to ensure this case is handled.
Signed-off-by: Francesco Romani <fromani@redhat.com>
It's legal for device plugins to not expose topology informations.
Previously, the code was just skipping these devices.
Review highlighted is better to report them anyway and let the
client application decide if they still want somehow to track them
or skip them entirely.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Document why teardownSRIOVPod has to wait for all the containers
to be gone before to end, and why is important.
Additionally, change the code to wait for all the containers to be gone,
not just the first. This is both a little cleaner and a little safer,
even though it seems the current code caused no issues so far.
Signed-off-by: Francesco Romani <fromani@redhat.com>
speedup the cleanup after testcases deleting pods in separate
goroutines.
The post-test cleanup stage must be done carefully since pod require
exclusive allocation - so pods must take all the steps to properly
cleanup the tests to avoid to pollute the environment, but
this has a negative effect on test duration (take longer).
Hence, we add safe speedups like doing pod deletions in parallel.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Add e2e tests for the new GetAllocatableResources API.
The tests are added in the `podresources_test` suite
created previously in this series.
Signed-off-by: Francesco Romani <fromani@redhat.com>
during the review, we convened that the manager types
(CPUSet, ResourceDeviceInstances) should not cross the
containermanager API boundary; thus, the ContainerManager layer
is the correct place to do the type conversion
We push back the type conversions from the podresources server
layer, fixing tests accordingly.
Signed-off-by: Francesco Romani <fromani@redhat.com>
We want to make the return type of the GetDevices() method of the
podresources DevicesProvider interface consistent with
the newly added GetAllocatableDevices type.
This makes the code easier to read and reduces the coupling between
the podresourcesapi server and the devicemanager code.
No intended changes in behaviour, but the different return types
now requires some data massaging. Tests are updated accordingly.
Signed-off-by: Francesco Romani <fromani@redhat.com>
a upcoming patch wants to add GetAllocatableCPUs() returning a cpuset.
To make the code consistent and a bit more flexible, we change the
existing interface to also return a cpuset.
Signed-off-by: Francesco Romani <fromani@redhat.com>