The kube-apiserver validation expects the Count of an EventSeries to be
at least 2, otherwise it rejects the Event. There was is discrepancy
between the client and the server since the client was iniatizing an
EventSeries to a count of 1.
According to the original KEP, the first event emitted should have an
EventSeries set to nil and the second isomorphic event should have an
EventSeries with a count of 2. Thus, we should matcht the behavior
define by the KEP and update the client.
Also, as an effort to make the old clients compatible with the servers,
we should allow Events with an EventSeries count of 1 to prevent any
unexpected rejections.
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
Kubernetes-commit: 62c9fa8fe6c2e04b1a40970e93055c2e92259b12
There was a data race in the recordToSink function that caused changes
to the events cache to be overriden if events were emitted
simultaneously via Eventf calls.
The race lies in the fact that when recording an Event, there might be
multiple calls updating the cache simultaneously. The lock period is
optimized so that after updating the cache with the new Event, the lock
is unlocked until the Event is recorded on the apiserver side and then
the cache is locked again to be updated with the new value returned by
the apiserver.
The are a few problem with the approach:
1. If two identical Events are emitted successively the changes of the
second Event will override the first one. In code the following
happen:
1. Eventf(ev1)
2. Eventf(ev2)
3. Lock cache
4. Set cache[getKey(ev1)] = &ev1
5. Unlock cache
6. Lock cache
7. Update cache[getKey(ev2)] = &ev1 + Series{Count: 1}
8. Unlock cache
9. Start attempting to record the first event &ev1 on the apiserver side.
This can be mitigated by recording a copy of the Event stored in
cache instead of reusing the pointer from the cache.
2. When the Event has been recorded on the apiserver the cache is
updated again with the value of the Event returned by the server.
This update will override any changes made to the cache entry when
attempting to record the new Event since the cache was unlocked at
that time. This might lead to some inconsistencies when dealing with
EventSeries since the count may be overriden or the client might even
try to record the first isomorphic Event multiple time.
This could be mitigated with a lock that has a larger scope, but we
shouldn't want to reflect Event returned by the apiserver in the
cache in the first place since mutation could mess with the
aggregation by either allowing users to manipulate values to update
a different cache entry or even having two cache entries for the same
Events.
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
Kubernetes-commit: cfdd40b569d7630b9b31ddbe0557159b1f8b0f9e
When attempting to record a new Event and a new Serie on the apiserver
at the same time, the patch of the Serie might happen before the Event
is actually created. In that case, we handle the error and try to create
the Event. But the Event might be created during that period of time and
it is treated as an error today. So in order to handle that scenario, we
need to retry when a Create call for a Serie results in an AlreadyExist
error.
Signed-off-by: Damien Grisonnet <dgrisonn@redhat.com>
Kubernetes-commit: 92c7042e640e148f54aa112591a4550d5450a132
Currenlty an event recorder can send an event to a
broadcaster that is already stopped, resulting
in a panic. This ensures the broadcaster holds
a lock while it is shutting down and then forces
any senders to drop queued events following
broadcaster shutdown.
It also updates the Action, ActionOrDrop, Watch,
and WatchWithPrefix functions to return an error
in the case where data is sent on the closed bradcaster
channel rather than panicing.
Lastly it updates unit tests to ensure the fix works correctly
fixes: https://github.com/kubernetes/kubernetes/issues/108518
Signed-off-by: Andrew Stoycos <astoycos@redhat.com>
Kubernetes-commit: 6aa779f4ed3d3acdad2f2bf17fb27e11e23aabe4