If the informers handlers are slow processing the objects, the deltaFIFO
blocks the queue and the streamWatchers can not add new elements to the
queue, creating contention and causing different problems, like high
memory usage.
The problem is not easy to identify from a user perspective, typically
you can use pprof to identify a high memory usage on the StreamWatchers
or some handler consuming most of the cpu time, but users should not
have to profile the golang binary in order to know that.
Metrics were disabled on the reflector because of memory leaks, also
monitoring the queue depth can't give a good signal, since it never goes high
However, we can trace slow handlers and inform users about the problem.
Kubernetes-commit: d38c2df2c4b945bcf1f81714fc6bfd01bbd0f538
https://github.com/kubernetes/kubernetes/pull/87795 most likely
unintentionally increased the log level of "Starting reflector" and
"Stopping reflector", with the result that since Kubernetes 1.21
clients have printed that message by default. This is undesirable, we
should use the original level 3 again.
Kubernetes-commit: fd972934e4916879b04508686302659ce82cfa75
Currently there is no way to specify WatchListPageSize used by Controller. This PR adds a field that can be used to specify this.
Change-Id: I241454a45dd94d3ea65a91b297f530e217f843aa
Kubernetes-commit: 43f5afe1a1dd058a2564cd3b2f330fc2a401f607
Currently when ListAndWatch() receives a connection refused error, it is
assumed to be due to the apiserver being transiently unresponsive. In
situations where a controller is running outside the k8s cluster it's
controlling, it is more common for the controller to lose connection
permanently to the apiserver and needs to exponentially backoff its
retry rather than continously spamming logs with Watch attempts that
will never succeed.
Kubernetes-commit: 1ff789f2bb9bf7fbb3df35977bc249c0dd019d31
When dedupDeltas does the impossible and the key is already queued,
return an error rather than maintain the data structure invariants.
Kubernetes-commit: a39481a4f6cf33f9bf4555adcffa28077863e7a9
Some comments and code incorrectly contemplated violating the
invariant that a keys is in `f.items` if and only if it is in
`f.queue`.
Also fixed up some comment wording.
Kubernetes-commit: 5efd727d112206ef9a8ede93c5878b0d40707ae9
Fixes: kubernetes#90581 (the first part)
When `Close()` is invoked on an empty queue, the control loop inside `Pop()` has a small chance of missing the signal and blocks indefinitely due to a race condition. This PR eliminates the race and allows the control loop inside any blocking `Pop()` to successfully exit after Close() is called.
Kubernetes-commit: d8b90955519d10b99415515f8314dd6d35caae8d
When creating an informer, this adds a way to add custom error handling, so that
Kubernetes tooling can properly surface the errors to the end user.
Fixes https://github.com/kubernetes/client-go/issues/155
Kubernetes-commit: 435b40aa1e5c0ae44e0aeb9aa6dbde79838b3390
Removed the incorrect promise of coherency in the answer to a query to
an informer's local cache. Removed the definition of "collection
state", because it was only used in the now-removed promise. Added a
remark about ordering of states that appear in an informer's local
cache.
Brushed up the commentary on resync period. Changed the relevant
parameter of NewSharedInformer to have the same name as the
corresponding parameter to NewSharedIndexInformer.
Kubernetes-commit: b8e2ad5926c3a6872422ad25cf9867e10e052a7d