The watch.Interface design is hard to change, because it would break
most client-go users that perform watches. So instead of changing the
interface to be more user friendly, this change updates the method
comments to explain the different responsibilities of the consumer
(client user) and the producer (interface implementer).
Kubernetes-commit: 1f35231a1d4f7b8586a7ec589c799729eeb4f7c4
- Extract watchWithResync to simplify ListAndWatch
- Wrap watchHandler with two variants, one for WatchList and one for
just Watch.
- Replace a bool pointer arg with a bool arg and bool return, to
improve readability.
- Use errors.Is to satisfy the linter
- Use %w to wrap the store.Replace error, to allow unwrapping.
Kubernetes-commit: 65fc1bb463c85a4c85e619bf7acac9503e23a253
- Add tests to confirm that Stop is always called.
- Add TODOs to show were Stop is not currently being called
(to fix in a future PR)
Kubernetes-commit: ab5aa4762fd5206d0dbd8412d7c6f3b76533a122
the signature of the method was tightly connected to the reflector,
making it difficult to use for anything other than a reflector.
this simple refactor makes the method more generic.
Kubernetes-commit: 83c7542abc8c542c01ecb67376f134b2071c5304
In particular, document that ListAllByNamespace delegates to ListAll
if no namespace is specified.
Signed-off-by: Stephen Kitt <skitt@redhat.com>
Kubernetes-commit: 54e899317ef46e3b70827cacee244717022db0ad
until #115478(use streaming against the etcd storage)
is resolved the cacher need a way to disable the streaming.
Kubernetes-commit: 41e706600aea7468f486150d951d3b8948ce89d5
checkWatchListConsistencyIfRequested performs a data consistency check only when
the KUBE_WATCHLIST_INCONSISTENCY_DETECTOR environment variable was set during a binary startup.
The consistency check is meant to be enforced only in the CI, not in production.
The check ensures that data retrieved by the watch-list api call
is exactly the same as data received by the standard list api call.
Note that this function will panic when data inconsistency is detected.
This is intentional because we want to catch it in the CI.
Kubernetes-commit: b31e7793d0d873a71c90caf8455556aa905cf88d
originally we honored only apierrors.IsInvalid
but decided to fallback on every error
because it is better to make progress than deadlocking
Kubernetes-commit: 4b3915017950a114124a88c5d308bd8bfb9ec48e
`SetWatchErrorHandler` claims it will fail if Run() has already started.
But if they are called concurrently, it will actually trigger a data
race.
With this PR:
```
62702 runs so far, 0 failures (100.00% pass rate). 59.152682ms avg, 189.068387ms max, 26.623785ms min
```
Without this PR:
```
5012 runs so far, 38 failures (99.25% pass rate). 58.675502ms avg, 186.018084ms max, 29.468104ms min
```
Kubernetes-commit: 35d2431b3a89c5bd693846952e9d27ce4e3a0754
Before, we've used two separate backoff managers for List and Watch
calls, now they share single backoff manager.
Kubernetes-commit: 337728b02559dec8a613fdef174f732da9cae310
Without this change, sometimes leaked goroutines were reported for
test/integration/scheduler_perf. The one that caused the cleanup to get delayed
was this one:
goleak.go:50: found unexpected goroutines:
[Goroutine 2704 in state chan receive, 2 minutes, with k8s.io/client-go/tools/cache.(*Reflector).watch on top of the stack:
goroutine 2704 [chan receive, 2 minutes]:
k8s.io/client-go/tools/cache.(*Reflector).watch(0xc00453f590, {0x0, 0x0}, 0x1f?, 0xc00a128080?)
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/reflector.go:388 +0x5b3
k8s.io/client-go/tools/cache.(*Reflector).ListAndWatch(0xc00453f590, 0xc006e94900)
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/reflector.go:324 +0x3bd
k8s.io/client-go/tools/cache.(*Reflector).Run.func1()
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/reflector.go:279 +0x45
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc007aafee0)
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:157 +0x49
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc003e18150?, {0x75e37c0, 0xc00389c280}, 0x1, 0xc006e94900)
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:158 +0xcf
k8s.io/client-go/tools/cache.(*Reflector).Run(0xc00453f590, 0xc006e94900)
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/client-go/tools/cache/reflector.go:278 +0x257
k8s.io/apimachinery/pkg/util/wait.(*Group).StartWithChannel.func1()
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:58 +0x3f
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:75 +0x74
created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start
/nvme/gopath/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:73 +0xe5
watch() was stuck in an exponential backoff timeout. Logging confirmed that:
I0309 21:14:21.756149 1572727 reflector.go:387] k8s.io/client-go/informers/factory.go:150: watch of *v1.PersistentVolumeClaim returned Get "https://127.0.0.1:38269/api/v1/persistentvolumeclaims?allowWatchBookmarks=true&resourceVersion=1&timeout=7m47s&timeoutSeconds=467&watch=true": dial tcp 127.0.0.1:38269: connect: connection refused - backing off
Kubernetes-commit: b4751a52d53d4acc6a5ce3e796938c9a12f81fcb
Since the behavior is now changed, and the old behavior leaked objects,
this adds a new comment about how Replace works.
Signed-off-by: Odin Ugedal <ougedal@palantir.com>
Signed-off-by: Odin Ugedal <odin@uged.al>
Kubernetes-commit: 27f4bcae5c52a3bb88141f940ec23d907a15cde5
This is useful to both reduce the code complexity, and to ensure clients
get the "newest" version of an object known when its deleted. This is
all best-effort, but for clients it makes more sense giving them the
newest object they observed rather than an old one.
This is especially useful when an object is recreated. eg.
Object A with key K is in the KnownObjects store;
- DELETE delta for A is queued with key K
- CREATE delta for B is queued with key K
- Replace without any object with key K in it.
In this situation its better to create a DELETE delta with
DeletedFinalStateUnknown with B (with this patch), than it is to give
the client an DeletedFinalStateUnknown with A (without this patch).
Signed-off-by: Odin Ugedal <ougedal@palantir.com>
Signed-off-by: Odin Ugedal <odin@uged.al>
Kubernetes-commit: 7bcc3e00fc28b2548886d04639a2e352ab37fb55
This fixes an issue where a relist could result in a DELETED delta
with an object wrapped in a DeletedFinalStateUnknown object; and then on
the next relist, it would wrap that object inside another
DeletedFinalStateUnknown, leaving the user with a "double" layer
of DeletedFinalStateUnknown's.
Signed-off-by: Odin Ugedal <ougedal@palantir.com>
Signed-off-by: Odin Ugedal <odin@uged.al>
Kubernetes-commit: 0bf0546d9f75d92c801e81c9f7adf040bba64102
This fixes a race condition when a "short lived" object
is created and the create event is still present on the queue
when a relist replaces the state. Previously that would lead in the
object being leaked.
The way this could happen is roughly;
1. new Object is added O, agent gets CREATED event for it
2. watch is terminated, and the agent runs a new list, L
3. CREATE event for O is still on the queue to be processed.
4. informer replaces the old data in store with L, and O is not in L
- Since O is not in the store, and not in the list L, no DELETED event
is queued
5. CREATE event for O is still on the queue to be processed.
6. CREATE event for O is processed
7. O is <leaked>; its present in the cache but not in k8s.
With this patch, on step 4. above it would create a DELETED event
ensuring that the object will be removed.
Signed-off-by: Odin Ugedal <ougedal@palantir.com>
Signed-off-by: Odin Ugedal <odin@uged.al>
Kubernetes-commit: 25d77218acdac2f793071add9ea878b08c7d328b