Headless+selectorless -> RequireDualStack
Headless+selector -> SingleStack
Add test cases to cover this and ExternalName and dual-stack init (which
I think can never trigger, but best to be safe).
Since the PR to do this deeper in the stack was declined, we'll do it
ourselves. This ensures that we don't accidentally mutate the input and
then compare that mutated form to the result (which caused previous test
failures).
This is the culmination of all the previous commits which made this last
move less dramatic. More tests and cleanup commits will follow.
Background, for future archaeologists:
Service has (had) an "outer" and "inner" REST handler. This is because of how we do IP and port allocations synchronously, but since we don't have API transactions, we need to roll those back in case of a failure. Both layers use the same `Strategy`, but the outer calls into the inner, which causes a lot of complexity in the code (including an open-coded partial reimplementation of a date-unknown snapshot of the generic REST code) and results in `Prepare` and `Validate` hooks being called twice.
The "normal" REST flow seems to be:
```
mutating webhooks
generic REST store Create {
cleanup = BeginCreate
BeforeCreate {
strategy.PrepareForCreate {
dropDisabledFields
}
strategy.Validate
strategy.Canonicalize
}
createValidation (validating webhooks)
storage Create
cleanup
AfterCreate
Decorator
}
```
Service (before this series of commits) does:
```
mutating webhooks
svc custom Create {
BeforeCreate {
strategy.PrepareForCreate {
dropDisabledFields
}
strategy.Validate
strategy.Canonicalize
}
Allocations
inner (generic) Create {
cleanup = BeginCreate
BeforeCreate {
strategy.PrepareForCreate {
dropDisabledFields
}
strategy.Validate
strategy.Canonicalize
}
createValidation (validating webhooks)
storage Create
cleanup
AfterCreate
Decorator
}
}
```
After this:
```
mutating webhooks
generic REST store Create {
cleanup = BeginCreate
Allocations
BeforeCreate {
strategy.PrepareForCreate {
dropDisabledFields
}
strategy.Validate
strategy.Canonicalize
}
createValidation (validating webhooks)
storage Create
cleanup
AfterCreate
Rollback allocations on error
Decorator
}
```
Previously we would try to infer the `ipFamilyPolicy` from `clusterIPs`
and/or `ipFamilies`. That is too tricky. Now you MUST specify
`ipFamilyPolicy` as one of the dual-stack options in order to get a
dual-stack service.
This removes the old rest_tests and adds significantly more coverage.
Maybe too much. The v4,v6 and v6,v4 tables are identical but for the
order of families.
This exposed that `trimFieldsForDualStackDowngrade` is called too late
to do anything (since we don't run strategy twice any more). I moved
similar logic to `PatchAllocatedValues` but I hit on some unclarity.
Specifically, consider a PATCH operation.
Assume I have a valid dual-stack service (with 2 IPs, 2 families, and
policy either require or prefer). What fields can I patch, on their own,
to trigger a downgrade to single-stack?
I think patching policy to "single" is pretty clear intent.
But what if I leave policy and only patch `ipFamilies` down to a single
value (without violating the "can't change first family" rule)?
Or what if I patch `clusterIPs` down to a single IP value?
After much discussion, we decided to make a small API change (OK since
we are beta). When you want a dual-stack Service you MUST specify the
`ipFamilyPolicy`. Now we can infer less and avoid ambiguity.
This commit started as removing FIXME comments, but in doing so I
realized that the IP allocation process was using unvalidated user
input. Before de-layering, validation was called twice - once before
init and once after, which the init code depended on.
Fortunately (or not?) we had duplicative checks that caught errors but
with less friendly messages.
This commit calls validation before initializing the rest of the
IP-related fields.
This also re-organizes that code a bit, cleans up error messages and
comments, and adds a test SPECIFICALLY for the errors in those cases.
This was causing tests to pass which ought not be passing. This is not
an API change because we default the value of it when needed. So we
would never see this in the wild, but it makes the tests sloppy.
This scaffolding allows us to assert more on each test case, and more
consistently.
Set input fields from output fields IFF they are expected AND not set on
input. This allows us to verify the "after" state (expected) whether
the test case specified the value or not, and still pass the generic
cmp.Equal.
Use this in a few tests to prove its worth, more to do.
Some of the existing tests that are focused on create and delete can
probably be replaced by these.
This could be used in other test cases that are open-coding a lot of the
same stuff. Later commits.
This is the last layered method. All allocator logic is moved to the
beginUpdate() path. Removing the now-useless layer will happen in a
subsequent commit.
This commit ports the ExternalTrafficPolicy and HealthCheckNodePort
tests from rest_test to storage_test. It's not a direct port, though.
I have added more cases (much more exhaustive) and more assertions.
This commit ports the NodePort test from rest_test to storage_test.
It's not a direct port, though. I have added many more cases (much more
exhaustive) and more assertions.
This includes cases for gate MixedProtocolLBService.
This includes a few cases.
1) TestCreateIgnoresIPFamilyForExternalName: Prove that ExternalName is
ignored for dual-stack. A small set of test cases were chosen to
demonstrate.
2) TestCreateIgnoresIPFamilyWithoutDualStack: Prove that when the
dual-stack gate is off, all services are ignored for dual-stack. A
small set of test cases were chosen to demonstrate
3) TestCreateInitIPFields: Run over a huge array of test cases for
dual-stack. This was generated by this program:
https://gist.github.com/thockin/cccc9c9a580b4830ee0946ddd43eeafe and
then updated by hand.
Gut the "outer" Create() and move it to the inner BeginCreate(). This
uses a "transaction" type to make cleanup functions easy to read.
Background:
Service has an "outer" and "inner" REST handler. This is because of how we do IP and port allocations synchronously, but since we don't have API transactions, we need to roll those back in case of a failure. Both layers use the same `Strategy`, but the outer calls into the inner, which causes a lot of complexity in the code (including an open-coded partial reimplementation of a date-unknown snapshot of the generic REST code) and results in `Prepare` and `Validate` hooks being called twice.
The "normal" REST flow seems to be:
```
mutating webhooks
generic REST store Create {
cleanup = BeginCreate
BeforeCreate {
strategy.PrepareForCreate {
dropDisabledFields
}
strategy.Validate
strategy.Canonicalize
}
createValidation (validating webhooks)
storage Create
cleanup
AfterCreate
Decorator
}
```
Service (before this commit) does:
```
mutating webhooks
svc custom Create {
BeforeCreate {
strategy.PrepareForCreate {
dropDisabledFields
}
strategy.Validate
strategy.Canonicalize
}
Allocations
inner (generic) Create {
cleanup = BeginCreate
BeforeCreate {
strategy.PrepareForCreate {
dropDisabledFields
}
strategy.Validate
strategy.Canonicalize
}
createValidation (validating webhooks)
storage Create
cleanup
AfterCreate
Decorator
}
}
```
After this commit:
```
mutating webhooks
generic REST store Create {
cleanup = BeginCreate
Allocations
BeforeCreate {
strategy.PrepareForCreate {
dropDisabledFields
}
strategy.Validate
strategy.Canonicalize
}
createValidation (validating webhooks)
storage Create
cleanup
AfterCreate
Rollback allocations on error
Decorator
}
```
This same fix pattern will be applied to Delete and Update in subsequent
commits.
All the logic remains unchanged, just reorganized. The functions are
imperfect but emphasize the change being made and can be cleaned up
subsequently.
This makes the following steps easier to comprehend.
Move all allocator-related methods onto the alloc object so it can be
used in either REST layer. There's an INORDINATE amount of test code
here and I am skeptical that it is all useful. That's for later
commits.
Prior to 1.22 a user could change NodePort values within a service
during an update, and the apiserver would allocate values for any that
were not specified.
Consider a YAML like:
```
apiVersion: v1
kind: Service
metadata:
name: foo
spec:
type: NodePort
ports:
- name: p
port: 80
- name: q
port: 81
selector:
app: foo
```
When this is created, nodeport values will be allocated for each port.
Something like:
```
apiVersion: v1
kind: Service
metadata:
name: foo
spec:
clusterIP: 10.0.149.11
type: NodePort
ports:
- name: p
nodePort: 30872
port: 80
protocol: TCP
targetPort: 9376
- name: q
nodePort: 31310
port: 81
protocol: TCP
targetPort: 81
selector:
app: foo
```
If the user PUTs (kubectl replace) the original YAML, we would see that
`.nodePort = 0`, and allocate new ports. This was ugly at best.
In 1.22 we fixed this to not allocate new values if we still had the old
values, but instead re-assign them. Net new ports would still be seen
as `.nodePort = 0` and so new allocations would be made.
This broke a corner case as follows:
Prior to 1.22, the user could PUT this YAML:
```
apiVersion: v1
kind: Service
metadata:
name: foo
spec:
type: NodePort
ports:
- name: p
nodePort: 31310 # note this is the `q` value
port: 80
- name: q
# note this nodePort is not specified
port: 81
selector:
app: foo
```
The `p` port would take the `q` port's value. The `q` port would be
seen as `.nodePort = 0` and a new value allocated. In 1.22 this results
in an error (duplicate value in `p` and `q`).
This is VERY minor but it is an API regression, which we try to avoid,
and the fix is not too horrible.
This commit adds more robust testing of this logic.
Rename `NewCIDRRange()` to `NewInMemory()`
Rename `NewAllocatorCIDRRange()` to `New()`
Rename `NewPortAllocator()` to `NewInMemory()`
Rename `NewPortAllocatorCustom()` to `New()`
Add 4 new metrics to the ClusterIP allocators:
- current number of available IPs per Service CIDR
- current number of used IPs per Service CIDR
- total number of allocation per Service CIDR
- total number of allocation errors per ServiceCIDR
It is not uncommon for users to Create a Service and not specify things
like ClusterIP and NodePort, which we then allocate for them. They same
that YAML somewhere and later use it again in an Update, but then it
fails.
That's because we detected them trying to set a ClusterIP from a value
to "", which is not allowed. If it was just NodePort, they would
actually succeed and reallocate a new port.
After this change, we try to "patch" updates where the user did not
specify those values from the old object.
* pkg/features: promote the ServiceInternalTrafficPolicy field to Beta and on by default
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/api/service/testing: update Service test fixture functions to set internalTrafficPolicy=Cluster by default
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/apis/core/validation: add more Service validation tests for internalTrafficPolicy
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/registry/core/service/storage: fix failing Service REST storage tests to use internalTrafficPolicy: Cluster
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/registry/core/service/storage: add two test cases for Service REST TestServiceRegistryInternalTrafficPolicyClusterThenLocal and TestServiceRegistryInternalTrafficPolicyLocalThenCluster
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/registry/core/service: update strategy unit tests to expect default
internalTrafficPolicy=Cluster
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/proxy/ipvs: fix unit test Test_EndpointSliceReadyAndTerminatingLocal to use internalTrafficPolicy=Cluster
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/apis/core: update fuzzers to set Service internalTrafficPolicy field
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
* pkg/api/service/testing: refactor Service test fixtures to use Tweak funcs
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>