This is the second and final step towards making kubelet independent of the
resource.k8s.io API versioning because it now doesn't need to copy structs
defined by that API from the driver to the API server.
This is a first step towards making kubelet independent of the resource.k8s.io
API versioning because it now doesn't need to copy structs defined by that API
from the driver to the API server. The next step is removing the other
direction (reading ResourceClaim status and passing the resource handle to
drivers).
The drivers must get deployed so that they have their own connection to the API
server. Securing at least the writes via a validating admission policy should
be possible.
As before, the kubelet removes all ResourceSlices for its node at startup, then
DRA drivers recreate them if (and only if) they start up again. This ensures
that there are no orphaned ResourceSlices when a driver gets removed while the
kubelet was down.
While at it, logging gets cleaned up and updated to use structured, contextual
logging as much as possible. gRPC requests and streams now use a shared,
per-process request ID and streams also get logged.
2e34e187c9 enabled kubelet to do List and Watch
requests with the caveat that kubelet should better use a field selector (which
it does). The same is now also needed for DeleteCollection because kubelet will
use that to clean up in one operation instead of using multiple.
the exact number of lines/ro lines is not important, just that there are more than 0 ro lines
and more than 1 line total.
this helps accomodate different architectures that implement different kernel APIs
Signed-off-by: Peter Hunt <pehunt@redhat.com>
The manual deep comparison code is hard to maintain (would need to be updated
in https://github.com/kubernetes/kubernetes/pull/125488) and error prone.
In fact, one test case failed when doing a full automatic comparison with
cmp.Diff because it wasn't setting allMemory.
Recording the expected and actual checksum in the error makes it possible to
provide that information, for example in a failed test like the ones for DRA.
Otherwise developers have to manually step through the test with a debugger to
figure out what the new checksum is.
This reverts commit 0c0e19b343.
During stress test for SVM controller, the controller is unable to
make a list call due to following error:
resourceversion.go:155: I0716 21:49:26.973127] storage-version-migrator-controller: Error syncing SVM resource, retrying svm="crdsvm" err="error getting latest resourceVersion for stable.example.com/v1, Resource=testcrds: Timeout: Too large resource version: 28976, current: 20349"
With the feature disabled, the stress test passes.
Signed-off-by: Monis Khan <mok@microsoft.com>
including one Alpha only test, as the feature is in alpha
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Co-authored-by: Sohan Kunkerkar <sohank2602@gmail.com>