mirror of
https://github.com/rancher/rke.git
synced 2025-08-31 14:36:32 +00:00
README.md (183 lines changed)
@@ -414,78 +414,73 @@ Using RKE's pluggable user addons, it's possible to deploy Rancher 2.x server in
 # chown <user> /var/run/docker.sock
 ```
 
-## Etcd Snapshot
+## Etcd Snapshots
 
-You can configure a Rancher Kubernetes Engine (RKE) cluster to automatically create backups of etcd. In a disaster scenario, you can restore these backups, which are stored on other cluster nodes.
+You can configure a Rancher Kubernetes Engine (RKE) cluster to automatically take snapshots of etcd. In a disaster scenario, you can restore these snapshots, which are stored on other cluster nodes.
 
-### Etcd Rolling-Backup
+### One-Time Snapshots
 
-To schedule a recurring automatic etcd snapshot save, enable the `etcd-backup` service. `etcd-backup` runs in a service container alongside the `etcd` container. `etcd-backup` automatically creates snapshots of etcd and stores them on its local disk.
-
-To enable `etcd-backup` in the RKE CLI, configure the following three variables:
+RKE introduces a new command that takes a snapshot of a running etcd node in an RKE cluster. The snapshot is automatically saved in `/opt/rke/etcd-snapshots`. The command works as follows:
 
 ```
+./rke etcd snapshot-save --config cluster.yml
+
+WARN[0000] Name of the snapshot is not specified using [rke_etcd_snapshot_2018-05-17T23:32:08+02:00]
+INFO[0000] Starting saving snapshot on etcd hosts
+INFO[0000] [dialer] Setup tunnel for host [x.x.x.x]
+INFO[0001] [dialer] Setup tunnel for host [y.y.y.y]
+INFO[0002] [dialer] Setup tunnel for host [z.z.z.z]
+INFO[0003] [etcd] Saving snapshot [rke_etcd_snapshot_2018-05-17T23:32:08+02:00] on host [x.x.x.x]
+INFO[0004] [etcd] Successfully started [etcd-snapshot-once] container on host [x.x.x.x]
+INFO[0004] [etcd] Saving snapshot [rke_etcd_snapshot_2018-05-17T23:32:08+02:00] on host [y.y.y.y]
+INFO[0005] [etcd] Successfully started [etcd-snapshot-once] container on host [y.y.y.y]
+INFO[0005] [etcd] Saving snapshot [rke_etcd_snapshot_2018-05-17T23:32:08+02:00] on host [z.z.z.z]
+INFO[0006] [etcd] Successfully started [etcd-snapshot-once] container on host [z.z.z.z]
+INFO[0006] Finished saving snapshot [rke_etcd_snapshot_2018-05-17T23:32:08+02:00] on all etcd hosts
 ```
 
+The command saves a snapshot of etcd from each etcd node in the cluster config file to `/opt/rke/etcd-snapshots`. It also creates a container for taking the snapshot; when the process completes, the container is automatically removed.
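The snapshot directory above also appears later in this diff as the `EtcdSnapshotPath` constant. As a minimal sketch (the constant value is taken from this diff; the helper name is illustrative, not RKE's API), resolving a named snapshot to its on-host path is a simple join:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// EtcdSnapshotPath matches the services constant introduced in this diff.
const EtcdSnapshotPath = "/opt/rke/etcd-snapshots"

// snapshotPath mirrors how the restore path in this diff is built with
// filepath.Join; the function name itself is illustrative.
func snapshotPath(name string) string {
	return filepath.Join(EtcdSnapshotPath, name)
}

func main() {
	// A snapshot named via `--name snapshot.db` lands here on each etcd host.
	fmt.Println(snapshotPath("snapshot.db"))
}
```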
 
+### Etcd Recurring Snapshots
+
+To schedule a recurring automatic etcd snapshot save, enable the `etcd-snapshot` service. `etcd-snapshot` runs in a service container alongside the `etcd` container. `etcd-snapshot` automatically takes snapshots of etcd and stores them on its local disk in `/opt/rke/etcd-snapshots`.
+
+To enable `etcd-snapshot` in the RKE CLI, configure the following three variables:
+
 ```
 services:
   etcd:
-    backup: true
+    snapshot: true
     creation: 5m0s
     retention: 24h
 ```
 
-- `backup`: Enables/disables etcd backups in the RKE cluster.
+- `snapshot`: Enables/disables the recurring etcd snapshot service in the RKE cluster.
 
   Default value: `false`.
 
-- `creation`: Time period after which `etcd-backup` creates and stores local backups.
+- `creation`: Time period after which `etcd-snapshot` takes a snapshot.
 
   Default value: `5m0s`
 
-- `retention`: Time period before an etcd backup expires. Expired backups are purged.
+- `retention`: Time period before an etcd snapshot expires. Expired snapshots are purged.
 
   Default value: `24h`
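The `creation` and `retention` values are Go duration strings (`time.ParseDuration` syntax). Assuming the service prunes strictly by age, a quick sketch shows how many snapshots accumulate at steady state with the documented defaults:

```go
package main

import (
	"fmt"
	"time"
)

// steadyStateCount estimates how many snapshots exist before the retention
// window starts purging them, given the creation interval. This is an
// illustrative helper, not part of RKE.
func steadyStateCount(creation, retention string) (int, error) {
	c, err := time.ParseDuration(creation)
	if err != nil {
		return 0, err
	}
	r, err := time.ParseDuration(retention)
	if err != nil {
		return 0, err
	}
	return int(r / c), nil
}

func main() {
	// Defaults: one snapshot every 5 minutes, kept for 24 hours.
	n, _ := steadyStateCount("5m0s", "24h")
	fmt.Println(n) // 288
}
```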
 
-After RKE runs, view the `etcd-backup` logs to confirm backups are being created automatically:
+After RKE runs, view the `etcd-snapshot` logs to confirm snapshots are being created automatically:
 ```
-# docker logs etcd-backup
+# docker logs etcd-snapshot
 
 time="2018-05-04T18:39:16Z" level=info msg="Initializing Rolling Backups" creation=1m0s retention=24h0m0s
 time="2018-05-04T18:40:16Z" level=info msg="Created backup" name="2018-05-04T18:40:16Z_etcd" runtime=108.332814ms
 time="2018-05-04T18:41:16Z" level=info msg="Created backup" name="2018-05-04T18:41:16Z_etcd" runtime=92.880112ms
 time="2018-05-04T18:42:16Z" level=info msg="Created backup" name="2018-05-04T18:42:16Z_etcd" runtime=83.67642ms
 time="2018-05-04T18:43:16Z" level=info msg="Created backup" name="2018-05-04T18:43:16Z_etcd" runtime=86.298499ms
 ```
-Backups are saved to the following directory: `/opt/rke/etcdbackup/`. Backups are created on each node that runs etcd.
+Snapshots are saved to the following directory: `/opt/rke/etcd-snapshots/`. Snapshots are created on each node that runs etcd.
 
-### Etcd onetime Snapshots
-
-RKE also added two commands for etcd v3 snapshot management:
-```
-./rke etcd snapshot-save --name NAME
-```
-and
-```
-./rke etcd snapshot-restore --name NAME
-```
-
-The backup command saves a snapshot of etcd from each etcd node in the cluster config file and saves it in `/opt/rke/etcdbackup`. This command also creates a container for the backup. When the backup completes, the container is removed.
-
-```
-# ./rke etcd snapshot-save --name snapshot --config cluster.yml
-
-INFO[0000] Starting Backup on etcd hosts
-INFO[0000] [dialer] Setup tunnel for host [x.x.x.x]
-INFO[0002] [dialer] Setup tunnel for host [y.y.y.y]
-INFO[0004] [dialer] Setup tunnel for host [z.z.z.z]
-INFO[0006] [etcd] Starting backup on host [x.x.x.x]
-INFO[0007] [etcd] Successfully started [etcd-backup-once] container on host [x.x.x.x]
-INFO[0007] [etcd] Starting backup on host [y.y.y.y]
-INFO[0009] [etcd] Successfully started [etcd-backup-once] container on host [y.y.y.y]
-INFO[0010] [etcd] Starting backup on host [z.z.z.z]
-INFO[0011] [etcd] Successfully started [etcd-backup-once] container on host [z.z.z.z]
-INFO[0011] Finished backup on all etcd hosts
-```
 ### Etcd Disaster recovery
 
-`etcd snapshot-restore` is used for etcd disaster recovery; it reverts to any snapshot stored in `/opt/rke/etcdbackup` that you explicitly define. When you run `etcd snapshot-restore`, RKE removes the old etcd container if it still exists. To restore operations, RKE creates a new etcd cluster using the snapshot you choose.
+`etcd snapshot-restore` is used for etcd disaster recovery; it reverts to any snapshot stored in `/opt/rke/etcd-snapshots` that you explicitly define. When you run `etcd snapshot-restore`, RKE removes the old etcd container if it still exists. To restore operations, RKE creates a new etcd cluster using the snapshot you choose.
 
 >**Warning:** Restoring an etcd snapshot deletes your current etcd cluster and replaces it with a new one. Before you run the `etcd snapshot-restore` command, back up any important data in your current cluster.
@@ -530,6 +525,112 @@ INFO[0027] [etcd] Successfully started etcd plane..
 INFO[0027] Finished restoring on all etcd hosts
 ```
 
+## Example
+
+In this example we will assume that you started RKE on two nodes:
+
+| Name  | IP       | Role                   |
+|:-----:|:--------:|:----------------------:|
+| node1 | 10.0.0.1 | [controlplane, worker] |
+| node2 | 10.0.0.2 | [etcd]                 |
+
+### 1. Setting up the RKE cluster
+
+A minimal cluster configuration file for running Kubernetes on these nodes should look something like the following:
+
+```
+nodes:
+- address: 10.0.0.1
+  hostname_override: node1
+  user: ubuntu
+  role: [controlplane,worker]
+- address: 10.0.0.2
+  hostname_override: node2
+  user: ubuntu
+  role: [etcd]
+```
+
+After running `rke up` you should have a two-node cluster. The next step is to run a few pods on node1:
+
+```
+kubectl --kubeconfig=kube_config_cluster.yml run nginx --image=nginx --replicas=3
+```
+
+### 2. Back up the etcd cluster
+
+Now let's take a snapshot using RKE:
+
+```
+rke etcd snapshot-save --name snapshot.db --config cluster.yml
+```
+
+
+
+### 3. Store the snapshot externally
+
+After taking the etcd snapshot on node2, we should store it somewhere persistent, for example in an S3 bucket or on tape:
+
+```
+root@node2:~# s3cmd mb s3://rke-etcd-backup
+root@node2:~# s3cmd put /opt/rke/etcd-snapshots/snapshot.db s3://rke-etcd-backup/
+```
+
+### 4. Pull the snapshot on a new node
+
+To simulate the failure, let's power down node2 completely:
+
+```
+root@node2:~# poweroff
+```
+
+Now it's time to pull the snapshot saved on S3 onto a new node:
+
+| Name      | IP           | Role                   |
+|:---------:|:------------:|:----------------------:|
+| node1     | 10.0.0.1     | [controlplane, worker] |
+| ~~node2~~ | ~~10.0.0.2~~ | ~~[etcd]~~             |
+| node3     | 10.0.0.3     | [etcd]                 |
+
+```
+root@node3:~# mkdir -p /opt/rke/etcd-snapshots
+root@node3:~# s3cmd get s3://rke-etcd-backup/snapshot.db /opt/rke/etcd-snapshots/snapshot.db
+```
+
+### 5. Restore etcd on the new node
+
+Now let's restore and run etcd on the third node. To do that, first add the third node to the cluster configuration file:
+```
+nodes:
+- address: 10.0.0.1
+  hostname_override: node1
+  user: ubuntu
+  role: [controlplane,worker]
+# - address: 10.0.0.2
+#   hostname_override: node2
+#   user: ubuntu
+#   role: [etcd]
+- address: 10.0.0.3
+  hostname_override: node3
+  user: ubuntu
+  role: [etcd]
+```
+and then run `rke etcd snapshot-restore`:
+```
+rke etcd snapshot-restore --name snapshot.db --config cluster.yml
+```
+
+The previous command restores the etcd data dir from the snapshot and runs the etcd container on this node. The final step is to restore cluster operations by pointing the Kubernetes API at the new etcd; to do that, we run `rke up` again with the new cluster.yml:
+```
+rke up --config cluster.yml
+```
+You can verify that operations have been restored by checking the nginx deployment we created earlier:
+```
+> kubectl get pods
+NAME                     READY     STATUS    RESTARTS   AGE
+nginx-65899c769f-kcdpr   1/1       Running   0          17s
+nginx-65899c769f-pc45c   1/1       Running   0          17s
+nginx-65899c769f-qkhml   1/1       Running   0          17s
+```
 
 ## License
 
 Copyright (c) 2018 [Rancher Labs, Inc.](http://rancher.com)
@@ -77,12 +77,12 @@ func (c *Cluster) DeployControlPlane(ctx context.Context) error {
 	if len(c.Services.Etcd.ExternalURLs) > 0 {
 		log.Infof(ctx, "[etcd] External etcd connection string has been specified, skipping etcd plane")
 	} else {
-		etcdBackup := services.EtcdBackup{
-			Backup:    c.Services.Etcd.Backup,
+		etcdRollingSnapshot := services.EtcdSnapshot{
+			Snapshot:  c.Services.Etcd.Snapshot,
 			Creation:  c.Services.Etcd.Creation,
 			Retention: c.Services.Etcd.Retention,
 		}
-		if err := services.RunEtcdPlane(ctx, c.EtcdHosts, etcdNodePlanMap, c.LocalConnDialerFactory, c.PrivateRegistriesMap, c.UpdateWorkersOnly, c.SystemImages.Alpine, etcdBackup); err != nil {
+		if err := services.RunEtcdPlane(ctx, c.EtcdHosts, etcdNodePlanMap, c.LocalConnDialerFactory, c.PrivateRegistriesMap, c.UpdateWorkersOnly, c.SystemImages.Alpine, etcdRollingSnapshot); err != nil {
 			return fmt.Errorf("[etcd] Failed to bring up Etcd Plane: %v", err)
 		}
 	}
@@ -11,16 +11,16 @@ import (
 	"github.com/rancher/types/apis/management.cattle.io/v3"
 )
 
-func (c *Cluster) BackupEtcd(ctx context.Context, backupName string) error {
+func (c *Cluster) SnapshotEtcd(ctx context.Context, snapshotName string) error {
 	for _, host := range c.EtcdHosts {
-		if err := services.RunEtcdBackup(ctx, host, c.PrivateRegistriesMap, c.SystemImages.Alpine, c.Services.Etcd.Creation, c.Services.Etcd.Retention, backupName, true); err != nil {
+		if err := services.RunEtcdSnapshotSave(ctx, host, c.PrivateRegistriesMap, c.SystemImages.Alpine, c.Services.Etcd.Creation, c.Services.Etcd.Retention, snapshotName, true); err != nil {
 			return err
 		}
 	}
 	return nil
 }
 
-func (c *Cluster) RestoreEtcdBackup(ctx context.Context, backupPath string) error {
+func (c *Cluster) RestoreEtcdSnapshot(ctx context.Context, snapshotPath string) error {
 	// Stopping all etcd containers
 	for _, host := range c.EtcdHosts {
 		if err := tearDownOldEtcd(ctx, host, c.SystemImages.Alpine, c.PrivateRegistriesMap); err != nil {
@@ -30,8 +30,8 @@ func (c *Cluster) RestoreEtcdBackup(ctx context.Context, backupPath string) erro
 	// Start restore process on all etcd hosts
 	initCluster := services.GetEtcdInitialCluster(c.EtcdHosts)
 	for _, host := range c.EtcdHosts {
-		if err := services.RestoreEtcdBackup(ctx, host, c.PrivateRegistriesMap, c.SystemImages.Etcd, backupPath, initCluster); err != nil {
-			return fmt.Errorf("[etcd] Failed to restore etcd backup: %v", err)
+		if err := services.RestoreEtcdSnapshot(ctx, host, c.PrivateRegistriesMap, c.SystemImages.Etcd, snapshotPath, initCluster); err != nil {
+			return fmt.Errorf("[etcd] Failed to restore etcd snapshot: %v", err)
 		}
 	}
 	// Deploy Etcd Plane
@@ -40,12 +40,12 @@ func (c *Cluster) RestoreEtcdBackup(ctx context.Context, backupPath string) erro
 	for _, etcdHost := range c.EtcdHosts {
 		etcdNodePlanMap[etcdHost.Address] = BuildRKEConfigNodePlan(ctx, c, etcdHost, etcdHost.DockerInfo)
 	}
-	etcdBackup := services.EtcdBackup{
-		Backup:    c.Services.Etcd.Backup,
+	etcdRollingSnapshots := services.EtcdSnapshot{
+		Snapshot:  c.Services.Etcd.Snapshot,
 		Creation:  c.Services.Etcd.Creation,
 		Retention: c.Services.Etcd.Retention,
 	}
-	if err := services.RunEtcdPlane(ctx, c.EtcdHosts, etcdNodePlanMap, c.LocalConnDialerFactory, c.PrivateRegistriesMap, c.UpdateWorkersOnly, c.SystemImages.Alpine, etcdBackup); err != nil {
+	if err := services.RunEtcdPlane(ctx, c.EtcdHosts, etcdNodePlanMap, c.LocalConnDialerFactory, c.PrivateRegistriesMap, c.UpdateWorkersOnly, c.SystemImages.Alpine, etcdRollingSnapshots); err != nil {
 		return fmt.Errorf("[etcd] Failed to bring up Etcd Plane: %v", err)
 	}
 	return nil
cmd/etcd.go (54 lines changed)
@@ -3,20 +3,22 @@ package cmd
 import (
 	"context"
 	"fmt"
+	"time"
 
 	"github.com/rancher/rke/cluster"
 	"github.com/rancher/rke/hosts"
 	"github.com/rancher/rke/log"
 	"github.com/rancher/rke/pki"
 	"github.com/rancher/types/apis/management.cattle.io/v3"
+	"github.com/sirupsen/logrus"
 	"github.com/urfave/cli"
 )
 
 func EtcdCommand() cli.Command {
-	backupRestoreFlags := []cli.Flag{
+	snapshotFlags := []cli.Flag{
 		cli.StringFlag{
 			Name:  "name",
-			Usage: "Specify Backup name",
+			Usage: "Specify Snapshot name",
 		},
 		cli.StringFlag{
 			Name: "config",
@@ -26,33 +28,33 @@ func EtcdCommand() cli.Command {
 		},
 	}
 
-	backupRestoreFlags = append(backupRestoreFlags, commonFlags...)
+	snapshotFlags = append(snapshotFlags, commonFlags...)
 
 	return cli.Command{
 		Name:  "etcd",
-		Usage: "etcd backup/restore operations in k8s cluster",
+		Usage: "etcd snapshot save/restore operations in k8s cluster",
 		Subcommands: []cli.Command{
 			{
 				Name:   "snapshot-save",
 				Usage:  "Take snapshot on all etcd hosts",
-				Flags:  backupRestoreFlags,
-				Action: BackupEtcdHostsFromCli,
+				Flags:  snapshotFlags,
+				Action: SnapshotSaveEtcdHostsFromCli,
 			},
 			{
 				Name:   "snapshot-restore",
 				Usage:  "Restore existing snapshot",
-				Flags:  backupRestoreFlags,
-				Action: RestoreEtcdBackupFromCli,
+				Flags:  snapshotFlags,
+				Action: RestoreEtcdSnapshotFromCli,
 			},
 		},
 	}
 }
 
-func BackupEtcdHosts(
+func SnapshotSaveEtcdHosts(
 	ctx context.Context,
 	rkeConfig *v3.RancherKubernetesEngineConfig,
 	dockerDialerFactory hosts.DialerFactory,
-	configDir, backupName string) error {
+	configDir, snapshotName string) error {
 
 	log.Infof(ctx, "Starting saving snapshot on etcd hosts")
 	kubeCluster, err := cluster.ParseCluster(ctx, rkeConfig, clusterFilePath, configDir, dockerDialerFactory, nil, nil)
@@ -63,19 +65,19 @@ func BackupEtcdHosts(
 	if err := kubeCluster.TunnelHosts(ctx, false); err != nil {
 		return err
 	}
-	if err := kubeCluster.BackupEtcd(ctx, backupName); err != nil {
+	if err := kubeCluster.SnapshotEtcd(ctx, snapshotName); err != nil {
 		return err
 	}
 
-	log.Infof(ctx, "Finished saving snapshot on all etcd hosts")
+	log.Infof(ctx, "Finished saving snapshot [%s] on all etcd hosts", snapshotName)
 	return nil
 }
 
-func RestoreEtcdBackup(
+func RestoreEtcdSnapshot(
 	ctx context.Context,
 	rkeConfig *v3.RancherKubernetesEngineConfig,
 	dockerDialerFactory hosts.DialerFactory,
-	configDir, backupName string) error {
+	configDir, snapshotName string) error {
 
 	log.Infof(ctx, "Starting restoring snapshot on etcd hosts")
 	kubeCluster, err := cluster.ParseCluster(ctx, rkeConfig, clusterFilePath, configDir, dockerDialerFactory, nil, nil)
@@ -86,15 +88,15 @@ func RestoreEtcdBackup(
 	if err := kubeCluster.TunnelHosts(ctx, false); err != nil {
 		return err
 	}
-	if err := kubeCluster.RestoreEtcdBackup(ctx, backupName); err != nil {
+	if err := kubeCluster.RestoreEtcdSnapshot(ctx, snapshotName); err != nil {
 		return err
 	}
 
-	log.Infof(ctx, "Finished restoring snapshot on all etcd hosts")
+	log.Infof(ctx, "Finished restoring snapshot [%s] on all etcd hosts", snapshotName)
 	return nil
 }
 
-func BackupEtcdHostsFromCli(ctx *cli.Context) error {
+func SnapshotSaveEtcdHostsFromCli(ctx *cli.Context) error {
 	clusterFile, filePath, err := resolveClusterFile(ctx)
 	if err != nil {
 		return fmt.Errorf("Failed to resolve cluster file: %v", err)
@@ -110,11 +112,16 @@ func BackupEtcdHostsFromCli(ctx *cli.Context) error {
 	if err != nil {
 		return err
 	}
 
-	return BackupEtcdHosts(context.Background(), rkeConfig, nil, "", ctx.String("name"))
+	// Check snapshot name
+	etcdSnapshotName := ctx.String("name")
+	if etcdSnapshotName == "" {
+		etcdSnapshotName = fmt.Sprintf("rke_etcd_snapshot_%s", time.Now().Format(time.RFC3339))
+		logrus.Warnf("Name of the snapshot is not specified using [%s]", etcdSnapshotName)
+	}
+	return SnapshotSaveEtcdHosts(context.Background(), rkeConfig, nil, "", etcdSnapshotName)
 }
 
-func RestoreEtcdBackupFromCli(ctx *cli.Context) error {
+func RestoreEtcdSnapshotFromCli(ctx *cli.Context) error {
 	clusterFile, filePath, err := resolveClusterFile(ctx)
 	if err != nil {
 		return fmt.Errorf("Failed to resolve cluster file: %v", err)
@@ -130,7 +137,10 @@ func RestoreEtcdBackupFromCli(ctx *cli.Context) error {
 	if err != nil {
 		return err
 	}
 
-	return RestoreEtcdBackup(context.Background(), rkeConfig, nil, "", ctx.String("name"))
+	etcdSnapshotName := ctx.String("name")
+	if etcdSnapshotName == "" {
+		return fmt.Errorf("You must specify the snapshot name to restore")
+	}
+	return RestoreEtcdSnapshot(context.Background(), rkeConfig, nil, "", etcdSnapshotName)
}
 
@@ -21,17 +21,17 @@ import (
 )
 
 const (
-	EtcdBackupPath  = "/opt/rke/etcdbackup/"
-	EtcdRestorePath = "/opt/rke/etcdrestore/"
-	EtcdDataDir     = "/var/lib/rancher/etcd/"
+	EtcdSnapshotPath = "/opt/rke/etcd-snapshots"
+	EtcdRestorePath  = "/opt/rke/etcd-snapshots-restore/"
+	EtcdDataDir      = "/var/lib/rancher/etcd/"
 )
 
-type EtcdBackup struct {
-	// Enable or disable backup creation
-	Backup bool
-	// Creation period of the etcd backups
+type EtcdSnapshot struct {
+	// Enable or disable snapshot creation
+	Snapshot bool
+	// Creation period of the etcd snapshots
 	Creation string
-	// Retention period of the etcd backups
+	// Retention period of the etcd snapshots
 	Retention string
 }
 
@@ -43,7 +43,7 @@ func RunEtcdPlane(
 	prsMap map[string]v3.PrivateRegistry,
 	updateWorkersOnly bool,
 	alpineImage string,
-	etcdBackup EtcdBackup) error {
+	etcdSnapshot EtcdSnapshot) error {
 	log.Infof(ctx, "[%s] Building up etcd plane..", ETCDRole)
 	for _, host := range etcdHosts {
 		if updateWorkersOnly {
@@ -54,8 +54,8 @@ func RunEtcdPlane(
 		if err := docker.DoRunContainer(ctx, host.DClient, imageCfg, hostCfg, EtcdContainerName, host.Address, ETCDRole, prsMap); err != nil {
 			return err
 		}
-		if etcdBackup.Backup {
-			if err := RunEtcdBackup(ctx, host, prsMap, alpineImage, etcdBackup.Creation, etcdBackup.Retention, EtcdBackupContainerName, false); err != nil {
+		if etcdSnapshot.Snapshot {
+			if err := RunEtcdSnapshotSave(ctx, host, prsMap, alpineImage, etcdSnapshot.Creation, etcdSnapshot.Retention, EtcdSnapshotContainerName, false); err != nil {
 				return err
 			}
 		}
@@ -219,8 +219,8 @@ func IsEtcdMember(ctx context.Context, etcdHost *hosts.Host, etcdHosts []*hosts.
 	return false, nil
 }
 
-func RunEtcdBackup(ctx context.Context, etcdHost *hosts.Host, prsMap map[string]v3.PrivateRegistry, etcdBackupImage string, creation, retention, name string, once bool) error {
-	log.Infof(ctx, "[etcd] Starting backup on host [%s]", etcdHost.Address)
+func RunEtcdSnapshotSave(ctx context.Context, etcdHost *hosts.Host, prsMap map[string]v3.PrivateRegistry, etcdSnapshotImage string, creation, retention, name string, once bool) error {
+	log.Infof(ctx, "[etcd] Saving snapshot [%s] on host [%s]", name, etcdHost.Address)
 	imageCfg := &container.Config{
 		Cmd: []string{
 			"/opt/rke/rke-etcd-backup",
@@ -231,7 +231,7 @@ func RunEtcdBackup(ctx context.Context, etcdHost *hosts.Host, prsMap map[string]
 			"--name", name,
 			"--endpoints=" + etcdHost.InternalAddress + ":2379",
 		},
-		Image: etcdBackupImage,
+		Image: etcdSnapshotImage,
 	}
 	if once {
 		imageCfg.Cmd = append(imageCfg.Cmd, "--once")
@@ -242,28 +242,28 @@ func RunEtcdBackup(ctx context.Context, etcdHost *hosts.Host, prsMap map[string]
 	}
 	hostCfg := &container.HostConfig{
 		Binds: []string{
-			fmt.Sprintf("%s:/backup", EtcdBackupPath),
+			fmt.Sprintf("%s:/backup", EtcdSnapshotPath),
 			fmt.Sprintf("%s:/etc/kubernetes:z", path.Join(etcdHost.PrefixPath, "/etc/kubernetes"))},
 		NetworkMode: container.NetworkMode("host"),
 	}
 
 	if once {
-		if err := docker.DoRunContainer(ctx, etcdHost.DClient, imageCfg, hostCfg, EtcdBackupOnceContainerName, etcdHost.Address, ETCDRole, prsMap); err != nil {
+		if err := docker.DoRunContainer(ctx, etcdHost.DClient, imageCfg, hostCfg, EtcdSnapshotOnceContainerName, etcdHost.Address, ETCDRole, prsMap); err != nil {
 			return err
 		}
-		status, err := docker.WaitForContainer(ctx, etcdHost.DClient, etcdHost.Address, EtcdBackupOnceContainerName)
+		status, err := docker.WaitForContainer(ctx, etcdHost.DClient, etcdHost.Address, EtcdSnapshotOnceContainerName)
 		if status != 0 || err != nil {
-			return fmt.Errorf("Failed to take etcd backup exit code [%d]: %v", status, err)
+			return fmt.Errorf("Failed to take etcd snapshot exit code [%d]: %v", status, err)
 		}
-		return docker.RemoveContainer(ctx, etcdHost.DClient, etcdHost.Address, EtcdBackupOnceContainerName)
+		return docker.RemoveContainer(ctx, etcdHost.DClient, etcdHost.Address, EtcdSnapshotOnceContainerName)
 	}
-	return docker.DoRunContainer(ctx, etcdHost.DClient, imageCfg, hostCfg, EtcdBackupContainerName, etcdHost.Address, ETCDRole, prsMap)
+	return docker.DoRunContainer(ctx, etcdHost.DClient, imageCfg, hostCfg, EtcdSnapshotContainerName, etcdHost.Address, ETCDRole, prsMap)
 }
 
-func RestoreEtcdBackup(ctx context.Context, etcdHost *hosts.Host, prsMap map[string]v3.PrivateRegistry, etcdRestoreImage, backupName, initCluster string) error {
-	log.Infof(ctx, "[etcd] Restoring [%s] snapshot on etcd host [%s]", backupName, etcdHost.Address)
+func RestoreEtcdSnapshot(ctx context.Context, etcdHost *hosts.Host, prsMap map[string]v3.PrivateRegistry, etcdRestoreImage, snapshotName, initCluster string) error {
+	log.Infof(ctx, "[etcd] Restoring [%s] snapshot on etcd host [%s]", snapshotName, etcdHost.Address)
 	nodeName := pki.GetEtcdCrtName(etcdHost.InternalAddress)
-	backupPath := filepath.Join(EtcdBackupPath, backupName)
+	snapshotPath := filepath.Join(EtcdSnapshotPath, snapshotName)
 
 	imageCfg := &container.Config{
 		Cmd: []string{
@@ -273,7 +273,7 @@ func RestoreEtcdBackup(ctx context.Context, etcdHost *hosts.Host, prsMap map[str
 			"--cacert", pki.GetCertPath(pki.CACertName),
 			"--cert", pki.GetCertPath(nodeName),
 			"--key", pki.GetKeyPath(nodeName),
-			"snapshot", "restore", backupPath,
+			"snapshot", "restore", snapshotPath,
 			"--data-dir=" + EtcdRestorePath,
 			"--name=etcd-" + etcdHost.HostnameOverride,
 			"--initial-cluster=" + initCluster,
 
@@ -21,19 +21,19 @@ const (
 	SidekickServiceName   = "sidekick"
 	RBACAuthorizationMode = "rbac"
 
-	KubeAPIContainerName        = "kube-apiserver"
-	KubeletContainerName        = "kubelet"
-	KubeproxyContainerName      = "kube-proxy"
-	KubeControllerContainerName = "kube-controller-manager"
-	SchedulerContainerName      = "kube-scheduler"
-	EtcdContainerName           = "etcd"
-	EtcdBackupContainerName     = "etcd-backup"
-	EtcdBackupOnceContainerName = "etcd-backup-once"
-	EtcdRestoreContainerName    = "etcd-restore"
-	NginxProxyContainerName     = "nginx-proxy"
-	SidekickContainerName       = "service-sidekick"
-	LogLinkContainerName        = "rke-log-linker"
-	LogCleanerContainerName     = "rke-log-cleaner"
+	KubeAPIContainerName          = "kube-apiserver"
+	KubeletContainerName          = "kubelet"
+	KubeproxyContainerName        = "kube-proxy"
+	KubeControllerContainerName   = "kube-controller-manager"
+	SchedulerContainerName        = "kube-scheduler"
+	EtcdContainerName             = "etcd"
+	EtcdSnapshotContainerName     = "etcd-rolling-snapshots"
+	EtcdSnapshotOnceContainerName = "etcd-snapshot-once"
+	EtcdRestoreContainerName      = "etcd-restore"
+	NginxProxyContainerName       = "nginx-proxy"
+	SidekickContainerName         = "service-sidekick"
+	LogLinkContainerName          = "rke-log-linker"
+	LogCleanerContainerName       = "rke-log-cleaner"
 
 	KubeAPIPort   = 6443
 	SchedulerPort = 10251

@@ -25,4 +25,4 @@ github.com/ugorji/go/codec ccfe18359b55b97855cee1d3f74e5efbda4869d
 github.com/Microsoft/go-winio ab35fc04b6365e8fcb18e6e9e41ea4a02b10b175
 
 github.com/rancher/norman ff60298f31f081b06d198815b4c178a578664f7d
-github.com/rancher/types d289637bccd1ac6a8eaa46556733890a5f204fbc
+github.com/rancher/types f08dc626d420185972a3bcf11504b81d7f2d37e0
vendor/github.com/rancher/types/apis/management.cattle.io/v3/rke_types.go (8 lines changed, generated, vendored)
@@ -171,11 +171,11 @@ type ETCDService struct {
 	Key string `yaml:"key" json:"key,omitempty"`
 	// External etcd prefix
 	Path string `yaml:"path" json:"path,omitempty"`
-	// Etcd Backup Service
-	Backup bool `yaml:"backup" json:"backup,omitempty"`
-	// Etcd Backup Retention period
+	// Etcd Recurring snapshot Service
+	Snapshot bool `yaml:"snapshot" json:"snapshot,omitempty"`
+	// Etcd snapshot Retention period
 	Retention string `yaml:"retention" json:"retention,omitempty"`
-	// Etcd Backup Creation period
+	// Etcd snapshot Creation period
 	Creation string `yaml:"creation" json:"creation,omitempty"`
 }