This was causing tests to pass which ought not be passing. This is not
an API change because we default the value of it when needed. So we
would never see this in the wild, but it makes the tests sloppy.
This scaffolding allows us to assert more on each test case, and more
consistently.
Set input fields from output fields IFF they are expected AND not set on
input. This allows us to verify the "after" state (expected) whether
the test case specified the value or not, and still pass the generic
cmp.Equal.
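As a rough illustration of the "set input from output" step, here is a minimal sketch. It is not the actual test helper: `svc`, `testCase`, and `fillExpected` are hypothetical names, the Service is reduced to three fields, and plain struct equality stands in for cmp.Equal.

```go
package main

import "fmt"

// svc is a hypothetical, heavily reduced stand-in for a Service.
type svc struct {
	Type      string
	ClusterIP string
	NodePort  int
}

// testCase records which allocated fields the "after" state must carry.
type testCase struct {
	input           svc
	expectClusterIP bool
	expectNodePort  bool
}

// fillExpected copies allocated values from the output into a copy of the
// input, but only if the field is expected AND the input left it unset.
func fillExpected(tc testCase, out svc) svc {
	in := tc.input
	if tc.expectClusterIP && in.ClusterIP == "" {
		in.ClusterIP = out.ClusterIP
	}
	if tc.expectNodePort && in.NodePort == 0 {
		in.NodePort = out.NodePort
	}
	return in
}

func main() {
	tc := testCase{
		input:           svc{Type: "NodePort"},
		expectClusterIP: true,
		expectNodePort:  true,
	}
	out := svc{Type: "NodePort", ClusterIP: "10.0.0.7", NodePort: 30001}
	// After filling, a generic equality check passes even though the
	// test case never named the allocated values.
	fmt.Println(fillExpected(tc, out) == out)
}
```

A field that is not expected stays unset in the patched input, so an unexpected allocation still shows up as a mismatch.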
Use this in a few tests to prove its worth, more to do.
Some of the existing tests that are focused on create and delete can
probably be replaced by these.
This could be used in other test cases that are open-coding a lot of the
same stuff. Later commits.
This is the last layered method. All allocator logic is moved to the
beginUpdate() path. Removing the now-useless layer will happen in a
subsequent commit.
This commit ports the ExternalTrafficPolicy and HealthCheckNodePort
tests from rest_test to storage_test. It's not a direct port, though.
I have added more cases (much more exhaustive) and more assertions.
This commit ports the NodePort test from rest_test to storage_test.
It's not a direct port, though. I have added many more cases (much more
exhaustive) and more assertions.
This includes cases for gate MixedProtocolLBService.
This includes a few cases.
1) TestCreateIgnoresIPFamilyForExternalName: Prove that ExternalName is
ignored for dual-stack. A small set of test cases were chosen to
demonstrate.
2) TestCreateIgnoresIPFamilyWithoutDualStack: Prove that when the
dual-stack gate is off, all services are ignored for dual-stack. A
small set of test cases were chosen to demonstrate.
3) TestCreateInitIPFields: Run over a huge array of test cases for
dual-stack. This was generated by this program:
https://gist.github.com/thockin/cccc9c9a580b4830ee0946ddd43eeafe and
then updated by hand.
Gut the "outer" Create() and move it to the inner BeginCreate(). This
uses a "transaction" type to make cleanup functions easy to read.
Background:
Service has an "outer" and "inner" REST handler. This is because of how we do IP and port allocations synchronously, but since we don't have API transactions, we need to roll those back in case of a failure. Both layers use the same `Strategy`, but the outer calls into the inner, which causes a lot of complexity in the code (including an open-coded partial reimplementation of a date-unknown snapshot of the generic REST code) and results in `Prepare` and `Validate` hooks being called twice.
The "normal" REST flow seems to be:
```
mutating webhooks
generic REST store Create {
cleanup = BeginCreate
BeforeCreate {
strategy.PrepareForCreate {
dropDisabledFields
}
strategy.Validate
strategy.Canonicalize
}
createValidation (validating webhooks)
storage Create
cleanup
AfterCreate
Decorator
}
```
Service (before this commit) does:
```
mutating webhooks
svc custom Create {
BeforeCreate {
strategy.PrepareForCreate {
dropDisabledFields
}
strategy.Validate
strategy.Canonicalize
}
Allocations
inner (generic) Create {
cleanup = BeginCreate
BeforeCreate {
strategy.PrepareForCreate {
dropDisabledFields
}
strategy.Validate
strategy.Canonicalize
}
createValidation (validating webhooks)
storage Create
cleanup
AfterCreate
Decorator
}
}
```
After this commit:
```
mutating webhooks
generic REST store Create {
cleanup = BeginCreate
Allocations
BeforeCreate {
strategy.PrepareForCreate {
dropDisabledFields
}
strategy.Validate
strategy.Canonicalize
}
createValidation (validating webhooks)
storage Create
cleanup
AfterCreate
Rollback allocations on error
Decorator
}
```
This same fix pattern will be applied to Delete and Update in subsequent
commits.
All the logic remains unchanged, just reorganized. The functions are
imperfect but emphasize the change being made and can be cleaned up
subsequently.
This makes the following steps easier to comprehend.
Move all allocator-related methods onto the alloc object so it can be
used in either REST layer. There's an INORDINATE amount of test code
here and I am skeptical that it is all useful. That's for later
commits.
Prior to 1.22 a user could change NodePort values within a service
during an update, and the apiserver would allocate values for any that
were not specified.
Consider a YAML like:
```
apiVersion: v1
kind: Service
metadata:
name: foo
spec:
type: NodePort
ports:
- name: p
port: 80
- name: q
port: 81
selector:
app: foo
```
When this is created, nodeport values will be allocated for each port.
Something like:
```
apiVersion: v1
kind: Service
metadata:
name: foo
spec:
clusterIP: 10.0.149.11
type: NodePort
ports:
- name: p
nodePort: 30872
port: 80
protocol: TCP
targetPort: 9376
- name: q
nodePort: 31310
port: 81
protocol: TCP
targetPort: 81
selector:
app: foo
```
If the user PUTs (kubectl replace) the original YAML, we would see that
`.nodePort = 0`, and allocate new ports. This was ugly at best.
In 1.22 we fixed this to not allocate new values if we still had the old
values, but instead re-assign them. Net new ports would still be seen
as `.nodePort = 0` and so new allocations would be made.
This broke a corner case as follows:
Prior to 1.22, the user could PUT this YAML:
```
apiVersion: v1
kind: Service
metadata:
name: foo
spec:
type: NodePort
ports:
- name: p
nodePort: 31310 # note this is the `q` value
port: 80
- name: q
# note this nodePort is not specified
port: 81
selector:
app: foo
```
The `p` port would take the `q` port's value. The `q` port would be
seen as `.nodePort = 0` and a new value allocated. In 1.22 this results
in an error (duplicate value in `p` and `q`).
This is VERY minor but it is an API regression, which we try to avoid,
and the fix is not too horrible.
This commit adds more robust testing of this logic.
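One possible way to handle both behaviors is sketched below. The message above does not spell out the actual fix, so treat this purely as an assumption: `port` and `reuseNodePorts` are hypothetical names, ports are matched by name only, and a remaining `NodePort == 0` means "allocate a fresh value later".

```go
package main

import "fmt"

// port is a hypothetical, reduced stand-in for a ServicePort.
type port struct {
	Name     string
	Port     int
	NodePort int
}

// reuseNodePorts carries old nodePort values forward for ports that left
// the field unset, unless another port in the update already claims that
// value explicitly.
func reuseNodePorts(oldPorts, newPorts []port) []port {
	// Values the user set explicitly in the update always win.
	claimed := map[int]bool{}
	for _, p := range newPorts {
		if p.NodePort != 0 {
			claimed[p.NodePort] = true
		}
	}
	oldByName := map[string]int{}
	for _, p := range oldPorts {
		oldByName[p.Name] = p.NodePort
	}
	out := make([]port, len(newPorts))
	copy(out, newPorts)
	for i := range out {
		if out[i].NodePort != 0 {
			continue
		}
		if np, ok := oldByName[out[i].Name]; ok && np != 0 && !claimed[np] {
			out[i].NodePort = np
			claimed[np] = true
		}
	}
	return out
}

func main() {
	oldPorts := []port{{"p", 80, 30872}, {"q", 81, 31310}}
	// The corner case: p takes q's old value, q is left unset.
	newPorts := []port{{"p", 80, 31310}, {"q", 81, 0}}
	fmt.Printf("%+v\n", reuseNodePorts(oldPorts, newPorts))
}
```

In this sketch `q` stays at 0 in the corner case, so it would receive a new allocation instead of colliding with `p`, while a plain replace of the original YAML gets both old values back.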
Rename `NewCIDRRange()` to `NewInMemory()`
Rename `NewAllocatorCIDRRange()` to `New()`
Rename `NewPortAllocator()` to `NewInMemory()`
Rename `NewPortAllocatorCustom()` to `New()`
Add 4 new metrics to the ClusterIP allocators:
- current number of available IPs per Service CIDR
- current number of used IPs per Service CIDR
- total number of allocations per Service CIDR
- total number of allocation errors per Service CIDR
It is not uncommon for users to Create a Service and not specify things
like ClusterIP and NodePort, which we then allocate for them. They save
that YAML somewhere and later use it again in an Update, but then it
fails.
That's because we detected them trying to set a ClusterIP from a value
to "", which is not allowed. If it was just NodePort, they would
actually succeed and reallocate a new port.
After this change, we try to "patch" updates where the user did not
specify those values from the old object.
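For the ClusterIP half of this, the "patch" can be pictured as below. This is a minimal sketch with hypothetical names (`spec`, `patchAllocatedValues`), not the real implementation, and it ignores any interaction with Service type changes.

```go
package main

import "fmt"

// spec is a hypothetical, reduced stand-in for a ServiceSpec.
type spec struct {
	Type      string
	ClusterIP string
}

// patchAllocatedValues fills in values the user omitted on update from
// the previously stored object.
func patchAllocatedValues(newSpec, oldSpec *spec) {
	if newSpec.ClusterIP == "" && oldSpec.ClusterIP != "" {
		newSpec.ClusterIP = oldSpec.ClusterIP
	}
}

func main() {
	stored := &spec{Type: "ClusterIP", ClusterIP: "10.0.149.11"}
	// The user re-submits the YAML they originally created with.
	update := &spec{Type: "ClusterIP"}
	patchAllocatedValues(update, stored)
	fmt.Println(update.ClusterIP)
}
```

An explicit value in the update is left untouched, so any later check that rejects changing an already assigned ClusterIP still sees what the user asked for.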
* pkg/features: promote the ServiceInternalTrafficPolicy field to Beta and on by default
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/api/service/testing: update Service test fixture functions to set internalTrafficPolicy=Cluster by default
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/apis/core/validation: add more Service validation tests for internalTrafficPolicy
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/registry/core/service/storage: fix failing Service REST storage tests to use internalTrafficPolicy: Cluster
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/registry/core/service/storage: add two test cases for Service REST TestServiceRegistryInternalTrafficPolicyClusterThenLocal and TestServiceRegistryInternalTrafficPolicyLocalThenCluster
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/registry/core/service: update strategy unit tests to expect default
internalTrafficPolicy=Cluster
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/proxy/ipvs: fix unit test Test_EndpointSliceReadyAndTerminatingLocal to use internalTrafficPolicy=Cluster
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/apis/core: update fuzzers to set Service internalTrafficPolicy field
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/api/service/testing: refactor Service test fixtures to use Tweak funcs
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
1. add AllocateLoadBalancerNodePorts fields in specs for validation test cases
2. update fuzzer
3. in resource quota e2e, allocate node port for loadbalancer type service and
exceed the node port quota
Signed-off-by: Hanlin Shi <shihanlin9@gmail.com>
This test was sometimes using the "inner" REST and sometimes using the
"outer" REST. This commit changes all but one test to use the outer.
The remaining test needs rework.
The rest api for services was validating that, on updates, both
the old and new service have the same type. That guarantees that
the type is going to be the same after that, thus we don't need
to validate the service type on the old and the new service.
Adds and implements ResetFieldsProvider interface in order to ensure that
the fieldmanager no longer owns fields that get reset before the object
is persisted.
Co-authored-by: Kevin Wiesmueller <kwiesmul@redhat.com>
Co-authored-by: Kevin Delgado <kevindelgado@google.com>
1. Add API definitions;
2. Add feature gate and drop the field when feature gate is not on;
3. Set default values for the field;
4. Add API Validation
5. add kube-proxy iptables and ipvs implementations
6. add tests
Improved testing turned these up:
1) Headless+Selectorless, on a single-stack cluster, policy=PreferDual
Prior to this commit, the result was a single IPFamily (because we
checked that the 2nd allocator was present). This changes that case to
populate both families (we don't care if the allocator exists), which is
the same as RequireDual.
2) ClusterIP, user specifies 2 families but no IPs
Prior to this commit, the policy was inferred to be SingleStack. This
changes that case to correctly default to RequireDual when 2 families
are present but no IPs.
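The second case can be summarized with a small sketch. It assumes a hypothetical `defaultPolicy` helper and plain strings for families and IPs; treating two ClusterIPs the same as two families is also an assumption made for illustration, and the real defaulting code handles many more inputs.

```go
package main

import "fmt"

// defaultPolicy infers an IP family policy when the user did not set one.
// Two requested families are enough to imply RequireDualStack, even when
// no IPs were specified.
func defaultPolicy(families, clusterIPs []string) string {
	if len(families) == 2 || len(clusterIPs) == 2 {
		return "RequireDualStack"
	}
	return "SingleStack"
}

func main() {
	// Two families, no IPs: previously inferred as SingleStack.
	fmt.Println(defaultPolicy([]string{"IPv4", "IPv6"}, nil))
	fmt.Println(defaultPolicy([]string{"IPv4"}, nil))
}
```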
* Mixed protocol support for Services with type=LoadBalancer
KEP: https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/20200103-mixed-protocol-lb.md
Add new feature gate to control the support of mixed protocols in Services with type=LoadBalancer
Add new fields to the ServiceStatus
Add Ports to the LoadBalancerIngress, so cloud provider implementations can report the status of the requested load balancer ports
Add ServiceCondition to the ServiceStatus so Service controllers can indicate the conditions of the Service
* regenerate conflicting stuff
Old stored services will not have the `clusterIPs` field when read back
without this.
This includes some renaming for clarity and expanded comments, and a new
test for default on read.
Service has had a problem since forever:
- User creates a service type=LoadBalancer
- We silently allocate them a NodePort
- User changes type to ClusterIP
- We fail the operation because they did not clear NodePort
They never asked for or used the NodePort!
Dual-stack introduced some dependent fields that get auto-wiped on
updates. This carries it further.
If you squint, you can see Service as a big, messy discriminated union,
with type as the discriminator. Ignoring fields for non-selected
union-modes seems right.
This introduces the potential for an apply loop. Specifically, we will
accept YAML that we did not previously accept. Apply could see the
field in local YAML and not in the server and repeatedly try to patch it
in. But since that YAML is currently an error, it seems like a very low
risk. Almost nobody actually specifies their own NodePort values.
To mitigate this somewhat, we only auto-wipe on updates. The same YAML
would fail to create. This is a little inconsistent. We could
auto-wipe on create, too, at the risk of more potential impact.
To do this properly, we need to know the old and new values, which means
we can not do it in defaulting or conversion. So we do it in strategy.
This change also adds unit tests and updates e2e tests to rely on and
verify this behavior.
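A rough sketch of the update-only auto-wipe follows. The names (`service`, `needsNodePort`, `sameNodePorts`, `dropTypeDependentFields`) are hypothetical, and the rule that values are wiped only when they are unchanged from the old object is an assumption made for this illustration.

```go
package main

import "fmt"

// service is a hypothetical, reduced stand-in for a Service.
type service struct {
	Type      string
	NodePorts []int
}

// needsNodePort reports whether the type uses node ports at all.
func needsNodePort(s *service) bool {
	return s.Type == "NodePort" || s.Type == "LoadBalancer"
}

// sameNodePorts reports whether both objects carry identical values.
func sameNodePorts(a, b *service) bool {
	if len(a.NodePorts) != len(b.NodePorts) {
		return false
	}
	for i := range a.NodePorts {
		if a.NodePorts[i] != b.NodePorts[i] {
			return false
		}
	}
	return true
}

// dropTypeDependentFields runs on update only, because it needs both the
// old and the new object. It clears node ports that the new type no
// longer uses and that the user merely carried over.
func dropTypeDependentFields(newSvc, oldSvc *service) {
	if needsNodePort(oldSvc) && !needsNodePort(newSvc) && sameNodePorts(oldSvc, newSvc) {
		for i := range newSvc.NodePorts {
			newSvc.NodePorts[i] = 0
		}
	}
}

func main() {
	oldSvc := &service{Type: "LoadBalancer", NodePorts: []int{31000}}
	newSvc := &service{Type: "ClusterIP", NodePorts: []int{31000}}
	dropTypeDependentFields(newSvc, oldSvc)
	fmt.Println(newSvc.NodePorts)
}
```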
* api: structure change
* api: defaulting, conversion, and validation
* [FIX] validation: auto remove second ip/family when service changes to SingleStack
* [FIX] api: defaulting, conversion, and validation
* api-server: clusterIPs alloc, printers, storage and strategy
* [FIX] clusterIPs default on read
* alloc: auto remove second ip/family when service changes to SingleStack
* api-server: repair loop handling for clusterIPs
* api-server: force kubernetes default service into single stack
* api-server: tie dualstack feature flag with endpoint feature flag
* controller-manager: feature flag, endpoint, and endpointSlice controllers handling multi family service
* [FIX] controller-manager: feature flag, endpoint, and endpointSlice controllers handling multi family service
* kube-proxy: feature-flag, utils, proxier, and meta proxier
* [FIX] kubeproxy: call both proxier at the same time
* kubenet: remove forced pod IP sorting
* kubectl: modify describe to include ClusterIPs, IPFamilies, and IPFamilyPolicy
* e2e: fix tests that depend on IPFamily field AND add dual stack tests
* e2e: fix expected error message for ClusterIP immutability
* add integration tests for dualstack
the third phase of dual stack is a very complex change in the API,
basically it introduces Dual Stack services. Main changes are:
- It pluralizes the Service IPFamily field to IPFamilies,
and removes the singular field.
- It introduces a new field IPFamilyPolicyType that can take
3 values to express the "dual-stack(mad)ness" of the cluster:
SingleStack, PreferDualStack and RequireDualStack
- It pluralizes ClusterIP to ClusterIPs.
The goal is to add coverage to the services API operations,
taking into account the 6 different modes a cluster can have:
- single stack: IPv4 or IPv6 (as of today)
- dual stack: IPv4 only, IPv6 only, IPv4 - IPv6, IPv6 - IPv4
* [FIX] add integration tests for dualstack
* generated data
* generated files
Co-authored-by: Antonio Ojea <aojea@redhat.com>
The test suite was using a /24 cluster network for the allocator.
The ip allocator, if no ip is specified when creating the cluster,
picks one randomly, which means that we had a 1/256 chance of
collision.
The TestServiceRegistryUpdateDryRun was creating a service without
a ClusterIP, the ip allocator assigned a random one, and the test was
never deleting it. The same test was checking later if one
specific IP was not allocated, not taking into consideration
that the same ip may have been allocated to the first Service.
To avoid any randomness, we create the first Service with a specific
IP address.
Errors from staticcheck:
pkg/registry/autoscaling/horizontalpodautoscaler/storage/storage_test.go:207:7: this value of err is never used (SA4006)
pkg/registry/core/namespace/storage/storage.go:256:5: options.OrphanDependents is deprecated: please use the PropagationPolicy, this field will be deprecated in 1.7. Should the dependent objects be orphaned. If true/false, the "orphan" finalizer will be added to/removed from the object's finalizers list. Either this field or PropagationPolicy may be set, but not both. +optional (SA1019)
pkg/registry/core/namespace/storage/storage.go:257:11: options.OrphanDependents is deprecated: please use the PropagationPolicy, this field will be deprecated in 1.7. Should the dependent objects be orphaned. If true/false, the "orphan" finalizer will be added to/removed from the object's finalizers list. Either this field or PropagationPolicy may be set, but not both. +optional (SA1019)
pkg/registry/core/namespace/storage/storage.go:266:5: options.OrphanDependents is deprecated: please use the PropagationPolicy, this field will be deprecated in 1.7. Should the dependent objects be orphaned. If true/false, the "orphan" finalizer will be added to/removed from the object's finalizers list. Either this field or PropagationPolicy may be set, but not both. +optional (SA1019)
pkg/registry/core/namespace/storage/storage.go:267:11: options.OrphanDependents is deprecated: please use the PropagationPolicy, this field will be deprecated in 1.7. Should the dependent objects be orphaned. If true/false, the "orphan" finalizer will be added to/removed from the object's finalizers list. Either this field or PropagationPolicy may be set, but not both. +optional (SA1019)
pkg/registry/core/persistentvolumeclaim/storage/storage_test.go:165:2: this value of err is never used (SA4006)
pkg/registry/core/resourcequota/storage/storage_test.go:202:7: this value of err is never used (SA4006)
pkg/registry/core/service/ipallocator/allocator_test.go:338:2: this value of other is never used (SA4006)
pkg/registry/core/service/portallocator/allocator_test.go:199:2: this value of other is never used (SA4006)
pkg/registry/core/service/storage/rest_test.go:1843:2: this value of location is never used (SA4006)
pkg/registry/core/service/storage/rest_test.go:1849:2: this value of location is never used (SA4006)
pkg/registry/core/service/storage/rest_test.go:3174:20: use net.IP.Equal to compare net.IPs, not bytes.Equal (SA1021)
pkg/registry/core/service/storage/rest_test.go:3178:20: use net.IP.Equal to compare net.IPs, not bytes.Equal (SA1021)
pkg/registry/core/service/storage/rest_test.go:3185:20: use net.IP.Equal to compare net.IPs, not bytes.Equal (SA1021)
pkg/registry/core/service/storage/rest_test.go:3189:20: use net.IP.Equal to compare net.IPs, not bytes.Equal (SA1021)
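The SA1021 findings matter because a `net.IP` can hold the same IPv4 address in either 4-byte or 16-byte form. The short sketch below shows the difference using only the standard library; `compare` is a helper name made up for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"net"
)

// compare builds the 16-byte and 4-byte forms of the same IPv4 address
// and reports what each comparison says about the pair.
func compare(s string) (bytesEqual, ipEqual bool) {
	long := net.ParseIP(s) // IPv4 addresses come back in 16-byte form
	short := long.To4()    // same address, 4-byte form
	return bytes.Equal(long, short), long.Equal(short)
}

func main() {
	// bytes.Equal sees different lengths; net.IP.Equal sees one address.
	fmt.Println(compare("10.0.0.1")) // prints "false true"
}
```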
Several of the functions in pkg/registry/core/service/ipallocator were
moved to k8s.io/utils/net, but then the original code was never
updated to use the vendored versions.
(utilnet's version of RangeSize does not have the IPv6 special case
that the original code did, so we need to move that to
NewAllocatorCIDRRange now.)
If the dual-stack flag is enabled and the cluster is single stack IPv6,
the allocator logic for service clusterIP does not properly handle rejecting
a request for an IPv4 family. Return a 422 Invalid on the ipFamily field
when the dual stack flag is on (as it would when it hits beta) and the
cluster is configured for single-stack IPv6.
The family is now defaulted or cleared in BeforeCreate/BeforeUpdate,
and is either inherited from the previous object (if nil or unchanged),
or set to the default strategy's family as necessary. The existing
family defaulting when cluster ip is provided remains in the api
section. We add additional family defaulting at the time we allocate
the IP to ensure that IPFamily is a consequence of the ClusterIP
and prevent accidental reversion. This defaulting also ensures that
old clients that submit a nil IPFamily for non ClusterIP services
receive a default.
To properly handle validation, make the strategy and the validation code
path condition on which configuration options are passed to service
storage. Move validation and preparation logic inside the strategy where
it belongs. Service validation is now dependent on the configuration of
the server, and as such ValidateConditionService needs to know what the
allowed families are.
The service allocator is used to allocate ip addresses for the
Service IP allocator and NodePorts for the Service NodePort
allocator. It uses a bitmap backed by etcd to store the allocation
and tries to allocate the resources directly from the local memory
instead of from etcd, which can cause issues in environments with
high concurrency.
It may happen, in deployments with multiple apiservers, that the
resource allocation information is out of sync. This is more
noticeable with NodePorts, for example:
1. apiserver A creates a service with NodePort X
2. apiserver B deletes the service
3. apiserver A creates the service again
If the allocation data of apiserver A wasn't refreshed with the
deletion of apiserver B, apiserver A fails the allocation because
the data is out of sync. The Repair loops solve the problem later,
but some use cases require better concurrency handling
in the allocation logic.
We can try to not do the Allocation and Release operations locally,
and try instead to check if the local data is up to date with etcd,
and operate over the most recent version of the data.
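The idea of always operating on the most recent data can be sketched as an optimistic read-modify-write loop. Everything here is a hypothetical stand-in: `snapshot` plays the role of the persisted bitmap, `store` fakes etcd with a version check, and `allocate` and `release` are illustrative names, not the allocator's API.

```go
package main

import (
	"errors"
	"fmt"
)

// snapshot is a stand-in for the allocation bitmap persisted in etcd.
type snapshot struct {
	version int
	used    map[int]bool
}

// store fakes the backing storage with compare-and-swap semantics.
type store struct{ current snapshot }

var errConflict = errors.New("conflict")

// get returns a private copy of the latest persisted state.
func (s *store) get() snapshot {
	cp := snapshot{version: s.current.version, used: map[int]bool{}}
	for k := range s.current.used {
		cp.used[k] = true
	}
	return cp
}

// update persists next only if it was derived from the current version.
func (s *store) update(next snapshot) error {
	if next.version != s.current.version {
		return errConflict
	}
	next.version++
	s.current = next
	return nil
}

// mutate re-reads the latest state, applies fn, and retries on conflict
// instead of trusting a locally cached copy.
func mutate(s *store, fn func(snapshot) error) error {
	for {
		snap := s.get()
		if err := fn(snap); err != nil {
			return err
		}
		err := s.update(snap)
		if !errors.Is(err, errConflict) {
			return err
		}
	}
}

func allocate(s *store, port int) error {
	return mutate(s, func(snap snapshot) error {
		if snap.used[port] {
			return errors.New("already allocated")
		}
		snap.used[port] = true
		return nil
	})
}

func release(s *store, port int) error {
	return mutate(s, func(snap snapshot) error {
		delete(snap.used, port)
		return nil
	})
}

func main() {
	shared := &store{current: snapshot{used: map[int]bool{}}}
	// Steps 1-3 from above, with both apiservers going through the store.
	fmt.Println(allocate(shared, 30080)) // apiserver A creates
	fmt.Println(release(shared, 30080))  // apiserver B deletes
	fmt.Println(allocate(shared, 30080)) // apiserver A creates again
}
```

Because no caller keeps its own copy of the bitmap between operations, the third step succeeds without waiting for a repair loop.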
conflict.
Adding unit test to verify that deleteValidation is retried.
adding e2e test verifying the webhook can intercept configmap and custom
resource deletion, and the existing object is sent via the
admissionreview.OldObject.
update the admission integration test to verify that the existing object
is passed to the deletion admission webhook as oldObject, in case of an
immediate deletion and in case of an update-on-delete.
* fix duplicated imports of api/core/v1
* fix duplicated imports of client-go/kubernetes
* fix duplicated imports of rest code
* change import name to more reasonable