ScheduledJob: proposal updates
parent
d6854cbb6b
commit
87cba77e34
@ -62,14 +62,27 @@ report generation and the like. Each of these tasks should be allowed to run
|
|||||||
repeatedly (once a day/month, etc.) or once at a given point in time.
|
repeatedly (once a day/month, etc.) or once at a given point in time.
|
||||||
|
|
||||||
|
|
||||||
## Implementation
|
## Design Overview
|
||||||
|
|
||||||
|
Users create a ScheduledJob object. One ScheduledJob object
|
||||||
|
is like one line of a crontab file. It has a schedule of when to run,
|
||||||
|
in [Cron](https://en.wikipedia.org/wiki/Cron) format.
|
||||||
|
|
||||||
|
|
||||||
|
The ScheduledJob controller creates a [Job](job.md) object
|
||||||
|
about once per execution time of the schedule (e.g. once per
|
||||||
|
day for a daily schedule.) We say "about" because there are certain
|
||||||
|
circumstances where two jobs might be created, or no job might be
|
||||||
|
created. We attempt to make these rare, but do not completely prevent
|
||||||
|
them. Therefore, Jobs should be idempotent.
|
||||||
|
|
||||||
|
The Job object is responsible for any retrying of Pods, and any parallelism
|
||||||
|
among pods it creates, and determining the success or failure of the set of
|
||||||
|
pods. The ScheduledJob does not examine pods at all.
|
||||||
|
|
||||||
|
|
||||||
### ScheduledJob resource
|
### ScheduledJob resource
|
||||||
|
|
||||||
The ScheduledJob controller relies heavily on the [Job API](job.md)
|
|
||||||
for running actual jobs, on top of which it adds information regarding the date
|
|
||||||
and time part according to [Cron](https://en.wikipedia.org/wiki/Cron) format.
|
|
||||||
|
|
||||||
The new `ScheduledJob` object will have the following contents:
|
The new `ScheduledJob` object will have the following contents:
|
||||||
|
|
||||||
```go
|
```go
|
||||||
@ -173,31 +186,12 @@ type ScheduledJobStatus struct {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
### Modifications to Job resource
|
Users must use a generated selector for the job.
|
||||||
|
|
||||||
In order to distinguish Job runs, we need to add `UniqueLabelKey` field to `JobSpec`.
|
## Modifications to Job resource
|
||||||
This field will be used for creating unique label selectors.
|
|
||||||
|
|
||||||
```go
|
TODO for beta: forbid manual selector since that could cause confusion between
|
||||||
type JobSpec {
|
subsequent jobs.
|
||||||
|
|
||||||
//...
|
|
||||||
|
|
||||||
// Key of the selector that is added to prevent concurrently running Jobs
|
|
||||||
// selecting their pods.
|
|
||||||
// Users can set this to an empty string to indicate that the system should
|
|
||||||
// not add any selector and label. If unspecified, system uses
|
|
||||||
// "scheduledjob.kubernetes.io/podTemplateHash".
|
|
||||||
// Value of this key is hash of ScheduledJobSpec.PodTemplateSpec.
|
|
||||||
// No label is added if this is set to an empty string.
|
|
||||||
UniqueLabelKey *string
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
Although at Job level empty string is perfectly valid, `ScheduledJob` cannot have
|
|
||||||
empty selector, it needs to be defined, either by user or generated automatically.
|
|
||||||
For this to happen, validation will be tightened at ScheduledJob level for this
|
|
||||||
field to be either nil or non-empty string.
|
|
||||||
|
|
||||||
### Running ScheduledJobs using kubectl
|
### Running ScheduledJobs using kubectl
|
||||||
|
|
||||||
@ -216,6 +210,151 @@ In the above example:
|
|||||||
July 7th, 2pm. This value will be validated according to the same rules which
|
July 7th, 2pm. This value will be validated according to the same rules which
|
||||||
apply to `.spec.schedule`.
|
apply to `.spec.schedule`.
|
||||||
|
|
||||||
|
## Fields Added to Job Template
|
||||||
|
|
||||||
|
When the controller creates a Job from the JobTemplateSpec in the ScheduledJob, it
|
||||||
|
adds the following fields to the Job:
|
||||||
|
|
||||||
|
- a name, based on the ScheduledJob's name, but with a suffix to distinguish
|
||||||
|
multiple executions, which may overlap.
|
||||||
|
- the standard created-by annotation on the Job, pointing to the SJ that created it
|
||||||
|
The standard key is `kubernetes.io/created-by`. The value is a serialized JSON object, like
|
||||||
|
`{ "kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ScheduledJob","namespace":"default",`
|
||||||
|
`"name":"nightly-earnings-report","uid":"5ef034e0-1890-11e6-8935-42010af0003e","apiVersion":...`
|
||||||
|
This serialization contains the UID of the parent. This is used to match the Job to the SJ that created
|
||||||
|
it.
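
As a rough illustration of that matching step, the sketch below decodes the annotation value and compares the embedded UID. The struct types and the `createdBy` helper are hypothetical and mirror only the fields visible in the example above; the sample value omits the fields that the example truncates.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// objectReference and serializedReference are illustrative stand-ins that
// mirror only the fields shown in the example annotation. They are not the
// real API types.
type objectReference struct {
	Kind      string `json:"kind"`
	Namespace string `json:"namespace"`
	Name      string `json:"name"`
	UID       string `json:"uid"`
}

type serializedReference struct {
	Kind      string          `json:"kind"`
	Reference objectReference `json:"reference"`
}

// createdBy (hypothetical helper) reports whether the created-by annotation
// value names the ScheduledJob with the given UID.
func createdBy(annotation, sjUID string) (bool, error) {
	var ref serializedReference
	if err := json.Unmarshal([]byte(annotation), &ref); err != nil {
		return false, err
	}
	return ref.Reference.Kind == "ScheduledJob" && ref.Reference.UID == sjUID, nil
}

func main() {
	value := `{"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ScheduledJob","namespace":"default","name":"nightly-earnings-report","uid":"5ef034e0-1890-11e6-8935-42010af0003e"}}`
	ok, err := createdBy(value, "5ef034e0-1890-11e6-8935-42010af0003e")
	fmt.Println(ok, err) // true <nil>
}
```

Matching on the UID rather than the name is what lets a replaced ScheduledJob of the same name ignore Jobs left behind by its predecessor.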
|
||||||
|
|
||||||
|
## Updates to ScheduledJobs
|
||||||
|
|
||||||
|
If the schedule is updated on a ScheduledJob, it will:
|
||||||
|
- continue to use the Status.Active list of jobs to detect conflicts.
|
||||||
|
- try to fulfill all recently-passed times for the new schedule, by starting
|
||||||
|
new jobs. But it will not try to fulfill times prior to the
|
||||||
|
Status.LastScheduledTime.
|
||||||
|
- Example: If you have a schedule to run every 30 minutes, and change that to hourly, then the previously started
|
||||||
|
top-of-the-hour run, in Status.Active, will be seen and no new job started.
|
||||||
|
- Example: If you have a schedule to run every hour, change that to 30-minutely, at 31 minutes past the hour,
|
||||||
|
one run will be started immediately for the starting time that has just passed.
|
||||||
|
|
||||||
|
If the job template of a ScheduledJob is updated, then future executions use the new template
|
||||||
|
but old ones still satisfy the schedule and are not re-run just because the template changed.
|
||||||
|
|
||||||
|
If you delete and replace a ScheduledJob with one of the same name, it will:
|
||||||
|
- not use any old Status.Active, and not consider any existing running or terminated jobs from the previous
|
||||||
|
ScheduledJob (with a different UID) at all when determining conflicts, what needs to be started, etc.
|
||||||
|
- be unable to create a new Job if an existing Job has the same time-based hash in its name
|
||||||
|
(see below). Delete that Job if you want to re-run it.
|
||||||
|
- not "re-run" jobs for "start times" before the creation time of the new ScheduledJob object.
|
||||||
|
- not consider executions from the previous UID when making decisions about what executions to
|
||||||
|
start, or status, etc.
|
||||||
|
- lose the history of the old SJ.
|
||||||
|
|
||||||
|
To preserve status, you can suspend the old one, and make one with a new name, or make a note of the old status.
|
||||||
|
|
||||||
|
|
||||||
|
## Fault-Tolerance
|
||||||
|
|
||||||
|
### Starting Jobs in the face of controller failures
|
||||||
|
|
||||||
|
If the process with the scheduledJob controller in it fails,
|
||||||
|
and takes a while to restart, the scheduledJob controller
|
||||||
|
may miss the time window, after which it is too late to start a job.
|
||||||
|
|
||||||
|
With a single scheduledJob controller process, we cannot give
|
||||||
|
very strong assurances about not missing starting jobs.
|
||||||
|
|
||||||
|
With a suggested HA configuration, there are multiple controller
|
||||||
|
processes, and they use master election to determine which one
|
||||||
|
is active at any time.
|
||||||
|
|
||||||
|
If the Job's StartingDeadlineSeconds is long enough, and the
|
||||||
|
lease for the master lock is short enough, and other controller
|
||||||
|
processes are running, then a Job will be started.
|
||||||
|
|
||||||
|
TODO: consider hard-coding the minimum StartingDeadlineSeconds
|
||||||
|
at say 1 minute. Then we can offer a clearer guarantee,
|
||||||
|
assuming we know what the setting of the lock lease duration is.
|
||||||
|
|
||||||
|
### Ensuring jobs are run at most once
|
||||||
|
|
||||||
|
There are three problems here:
|
||||||
|
|
||||||
|
- ensure at most one Job created per "start time" of a schedule.
|
||||||
|
- ensure that at most one Pod is created per Job
|
||||||
|
- ensure at most one container start occurs per Pod
|
||||||
|
|
||||||
|
#### Ensuring one Job
|
||||||
|
|
||||||
|
Multiple jobs might be created in the following sequence:
|
||||||
|
|
||||||
|
1. scheduled job controller sends request to start Job J1 to fulfill start time T.
|
||||||
|
1. the create request is accepted by the apiserver and enqueued but not yet written to etcd.
|
||||||
|
1. scheduled job controller crashes
|
||||||
|
1. new scheduled job controller starts, and lists the existing jobs, and does not see one created.
|
||||||
|
1. it creates a new one.
|
||||||
|
1. the first one eventually gets written to etcd.
|
||||||
|
1. there are now two jobs for the same start time.
|
||||||
|
|
||||||
|
We can solve this in several ways:
|
||||||
|
|
||||||
|
1. with three-phase protocol, e.g.:
|
||||||
|
1. controller creates a "suspended" job.
|
||||||
|
1. controller writes an annotation in the SJ saying that it created a job for this time.
|
||||||
|
1. controller unsuspends that job.
|
||||||
|
1. by picking a deterministic name, so that at most one object create can succeed.
|
||||||
|
|
||||||
|
#### Ensuring one Pod
|
||||||
|
|
||||||
|
Job object does not currently have a way to ask for this.
|
||||||
|
Even if it did, the Job controller is not written to support it.
|
||||||
|
Same problem as above.
|
||||||
|
|
||||||
|
#### Ensuring one container invocation per Pod
|
||||||
|
|
||||||
|
Kubelet is not written to ensure at-most-one-container-start per pod.
|
||||||
|
|
||||||
|
#### Decision
|
||||||
|
|
||||||
|
This is too hard to do for the alpha version. We will await user
|
||||||
|
feedback to see if the "at most once" property is needed in the beta version.
|
||||||
|
|
||||||
|
This is awkward but possible for a containerized application to ensure on its own, as it needs
|
||||||
|
to know what ScheduledJob name and Start Time it is from, and then record the attempt
|
||||||
|
in a shared storage system. We should ensure it could extract this data from its annotations
|
||||||
|
using the downward API.
|
||||||
|
|
||||||
|
## Name of Jobs
|
||||||
|
|
||||||
|
A ScheduledJob creates one Job at each time when a Job should run.
|
||||||
|
Since there may be concurrent jobs, and since we might want to keep failed
|
||||||
|
non-overlapping Jobs around as a debugging record, each Job created by the same ScheduledJob
|
||||||
|
needs a distinct name.
|
||||||
|
|
||||||
|
To make the Jobs from the same ScheduledJob distinct, we could use a random string,
|
||||||
|
in the way that pods have a `generateName`. For example, a scheduledJob named `nightly-earnings-report`
|
||||||
|
in namespace `ns1` might create a job `nightly-earnings-report-3m4d3`, and later create
|
||||||
|
a job called `nightly-earnings-report-6k7ts`. This is consistent with pods, but
|
||||||
|
does not give the user much information.
|
||||||
|
|
||||||
|
Alternatively, we can use time as a uniquifier. For example, the same scheduledJob could
|
||||||
|
create a job called `nightly-earnings-report-2016-May-19`.
|
||||||
|
However, for Jobs that run more than once per day, we would need to represent
|
||||||
|
time as well as date. Standard date formats (e.g. RFC 3339) use colons for time.
|
||||||
|
Kubernetes names cannot include colons. Using a non-standard date format without colons
|
||||||
|
will annoy some users.
|
||||||
|
|
||||||
|
Also, date strings are much longer than random suffixes, which means that
|
||||||
|
the pods will also have long names, and that we are more likely to exceed the
|
||||||
|
253 character name limit when combining the scheduled-job name,
|
||||||
|
the time suffix, and pod random suffix.
|
||||||
|
|
||||||
|
One option would be to compute a hash of the nominal start time of the job,
|
||||||
|
and use that as a suffix. This would not provide the user with an indication
|
||||||
|
of the start time, but it would prevent creation of the same execution
|
||||||
|
by two instances (replicated or restarting) of the controller process.
|
||||||
|
|
||||||
|
We chose to use the hashed-date suffix approach.
|
||||||
|
|
||||||
## Future evolution
|
## Future evolution
|
||||||
|
|
||||||
|