Previously the network policy yaml had to be hard-coded in the image.
This patch allows the policy to be added via the metadata directories:
- /var/config/cni/etc/net.d/
- /var/config/kube-system.init/
Signed-off-by: David Scott <dave.scott@docker.com>
This has no kube object(s) but just arranges for the CNI configuration to be
written in the right place. The CNI bridge, loopback etc binaries are already
included since they are in the reference set.
Signed-off-by: Ian Campbell <ijc@docker.com>
KUBE_NETWORK now specifies a yml which is passed to the Moby tool, which can
introduce files into /etc/kubeadm/kube-system.init/ or do other things as it
likes.
In the case of weave this just adds the weave yaml to that directory. To avoid
too much confusion between weave.yml (Moby tool input) and `weave.yaml` (the
kubernetes `ServiceAccount`, `DeamonsSet` etc object specs) name the latter
`kube-weave.yaml`.
Signed-off-by: Ian Campbell <ijc@docker.com>
Building both BIOS and EFI variants is a waste of time in most cases, instead
just build whichever one is relevant to the platform (which currently means EFI
on Darwin and BIOS everywhere else).
At the same time make it possible to pass "KUBE_FORMATS" (a space separated
list of targets) to the build e.g. `make KUBE_FORMATS="iso-efi iso-bios"` will
preserve the behaviour prior to this patch.
Signed-off-by: Ian Campbell <ijc@docker.com>
Specifically ignore present-but-empty files entirely and ignore (but log)
failure to apply any one file.
Ignoring an empty file is useful because it means you can clobber a file which
might be referenced from an images binds without needing to override those
binds (since that generally means duplicating the whole lot which is annoying).
Ignoring any failures to apply means the rest gets applied and the rest of the
script (including untaint and the stamp file creation) still happen, resulting
in a system where the admin just has to address the failures rather than the
remaining updates. We touch a file to indicate failure generally plus one to
indicate the specific yaml which failed to apply.
Signed-off-by: Ian Campbell <ijc@docker.com>
Adding a `PRETTY_NAME` to this causes it to appear in the node information:
$ kubectl --namespace=kube-system get -o wide nodes
NAME STATUS ROLES AGE VERSION EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
linuxkit-b6e09efea36e Ready master 29m v1.8.0 <none> LinuxKit Kubernetes Project 4.9.53-linuxkit containerd://1.0.0-beta.1
Previously it would be `Unknown`.
A later yaml passed to `moby build` can override this by simply duplicating
the path.
Signed-off-by: Ian Campbell <ijc@docker.com>
Done as follows:
find -name build.yml | xargs dirname | while read d ; do
t=$(linuxkit pkg show-tag $d)
./scripts/update-component-sha.sh --image ${t%:*} ${t#*:}
done
git commit -s test pkg tools blueprints examples projects/kubernetes projects/swarmd docs linuxkit.yml Makefile src
This explicitly excludes projects/* which I did not know whether to update.
Then:
git reset --hard
for i in init runc containerd ca-certificates sysctl dhcpcd getty rngd ; do
o=$(git grep -h "\(image:\|-\) *linuxkit/$i:[0-9a-f]\{40\}" origin/master:linuxkit.yml | awk '// { print $2 }')
n=$(linuxkit pkg show-tag pkg/$i)
./scripts/update-component-sha.sh "$o" "$n"
done
git commit --amend projects
This updates any projects which were using components with the same hash as the
top-level linuxkit.yml.
Signed-off-by: Ian Campbell <ijc@docker.com>
By running:
./scripts/update-component-sha.sh --image linuxkit/alpine ad35b6ddbc70faa07e59a9d7dee7707c08122e8d
Signed-off-by: Ian Campbell <ijc@docker.com>
This implements the proposal in #2564 and converts a handful of representative
or especially interesting (from a build PoV) packages to use it.
For now those pkg/* affected get a stub-`Makefile`, once all packages are
converted then `pkg/Makefile` can be adjusted and those stubs can be removed.
For now only `pkg/package.mk`'s functionality is implemented. In particular:
- `push-manifest.sh` remains a separate script, to enable calling it on systems
with just the LinuxKit tools installed arrange to install it under a less
generic name.
- `kernel` and `tools/alpine` do not use `pkg/package.mk` and those cases are
not yet fully considered/covered.
I have updated the documentation assuming that the existing uses of
`pkg/package.mk` will be removed quite soon in a follow up PR rather than
trying to document the situation which results after just this commit.
Due to `cmd/linuxkit` now gaining a library the build needs adjusting slightly to
allow both `make bin/linuxkit` and `go build` to work.
`go vet` has forced me to write some rather asinine comments for things that
are rather obvious from the name.
Signed-off-by: Ian Campbell <ijc@docker.com>
In particular also fix the wireguard test whose kernel
tag hasn't been updated for quite some time...
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Unfortunately some options (such as enabling dynamic registration of
initializers) can only be enabled by a `--config foo.yaml` argument.
Furthermore some command-line options (such as the kubernetes version)
cannot be used in combination with the config file.
This patch checks for a supplied /etc/kubeadm/kubeadm.yaml and uses
it if it exists, otherwise it falls back to the original command-line.
Note it is safe to use the `--skip-*` options in combination with the
`--config` option.
Signed-off-by: David Scott <dave.scott@docker.com>
It is possible to get rebooted halfway through the init process, after key
files like `/etc/kubernetes/kubelet.conf` have been created but before full
cluster setup is complete or networking is applied.
Right now the idempotency of kubeadm (or backing out from this half-way state
and resuming the initialisation) is not something I have investigated. By
dropped stamps before and after at least the situation will be somewhat
detectable/diagnosable so the user can e.g. nuke their persistent disk and
start again.
Signed-off-by: Ian Campbell <ijc@docker.com>
If a stamp file is present in the metadata then untaint.
This is useful for dev environments where you only want to start a single vm.
The construction of the metadata becomes a little more complex to produce
correct json syntax now that there are two (independent) possible options.
Likewise the kubelet.sh script now takes the presence of /var/config/kubeadm
(rather than /var/config/kubeadm/init) as the signal to use the more structured
setup, since we may now have /var/config/kubeadm/untaint-master but not
/var/config/kubeadm/init so would otherwise end up passing the contents of
`/var/config/userdata` (something like `{ "kubeadm": { "untaint-master": "" }
}`) to `kubeadm` and confusing it enormously.
Signed-off-by: Ian Campbell <ijc@docker.com>
/usr/libexec/kubernetes/kubelet-plugins is a new path in Kube 1.8 (related to
flexvolumes) which should be persisted. Like /etc/cni and /opt/cni we also need
to arrange for this path to be valid in the host environment (since various
system containers will try and mount bind mount it).
Signed-off-by: Ian Campbell <ijc@docker.com>
With kube 1.8 kubeadm initially configures worker nodes with a
bootstrap-kubelet.conf. Adjust our start of day scripting to DTRT.
Signed-off-by: Ian Campbell <ijc@docker.com>
To help reduce confusion from this file (which configures our `kubelet.sh`
wrapper) vs `/var/lib/kubeadm/kubelet.conf` (which is created by `kubeadm` and
configures `kubelet` itself).
Signed-off-by: Ian Campbell <ijc@docker.com>
This vendors containerd v1.0.0-beta.1
Enable seccomp support at build time.
Requires /dev bind mount so it can use /dev/disk/by-uuid to resolve devices to
uuids.
Signed-off-by: Ian Campbell <ijc@docker.com>
Set KUBE_MASTER_AUTOINIT when using boot.sh to enable. User will need to pick
up the token for other nodes using `kubeadm token list`.
Signed-off-by: Ian Campbell <ijc@docker.com>
Rework the kubelet.sh script by adding an explicit step which waits for the
configuration to be valid, either by finding appropriate metadata or by waiting
explicitly for kubelet.conf to be created (e.g. by kubeadm) before launching
kubelet. The previous construct was implicitly waiting for kubelet.conf to be
created since kubelet fails if that file is not present.
Pull the set of start of day yaml files to be applied (currently just weave)
out of the kubelet image and into the LinuxKit yaml by providing a directory
which is searched for *.yaml after init.
Signed-off-by: Ian Campbell <ijc@docker.com>
With the master tailoring for docker now being in docker-master.yml,
kube-master and kube-node are identical, so just use a single kube.yml.
The reference to kube-master.yml in README.md is obsolete, so just drop it.
Signed-off-by: Ian Campbell <ijc@docker.com>
This allows cri-containerd and docker based systems to pass the correct options
via composition of yml files, while keeping the kubelet service stanza common.
Since bind mounts are not conditional on the presence of the source we need to
create an empty file in the docker case.
Signed-off-by: Ian Campbell <ijc@docker.com>
This doesn't seem to be necessary when using Docker Engine as the CRI backend,
but in general it is.
The sysctl container must be writeable to allow the
/etc/sysctl.d/01-kubernetes.conf mount point to be created. See #2503.
Signed-off-by: Ian Campbell <ijc@docker.com>
Depending on the configuration/components used the system can expect to be able
to share `/var/run/netns` (=`/run/netns` via symlink) bind mounts with other
system level containers, which requires exposing those to the host.
This doesn't appear to be needed when using Docker engine but it is with
cri-containerd.
Signed-off-by: Ian Campbell <ijc@docker.com>
Kubernetes assumes (for now) that various paths are valid at the host level to
be mounted into containers, including /opt/cni and /etc/cni.
We cannot (easily) use symlinks here because the weave.yml mounts /opt and /etc
rather than /opt/cni and /etc/cni (this seems likely to be common pattern). So
if /etc/cni were a symlink to the persistent disk (under /var/lib) then it will
be dangling link within the weave container.
So add bind mounts to the runtime configuration of the kubernetes image. This
also means we must create the target mount points in the yml.
Signed-off-by: Ian Campbell <ijc@docker.com>
This avoids a slightly tricky sequence of nested bind mounts by just unpacking
a tarball on boot (with a stamp so it only happens once).
Signed-off-by: Ian Campbell <ijc@docker.com>
For example to tell kubelet to use cri-containerd:
command: ["/usr/bin/kubelet.sh", "--container-runtime=remote", "--container-runtime-endpoint=unix:///var/run/cri-containerd.sock"]
Signed-off-by: Ian Campbell <ijc@docker.com>
Not having to redo the kubeadm-init.sh step massively speeds up the test/dev
cycle. Having the same MAC (and hence same IP) is useful there too since you
don't need to figure out the mac on each boot.
Signed-off-by: Ian Campbell <ijc@docker.com>
The sock-shop demo[0] requires around 5G of images on a worker node and 3G of
RAM (if there is only one worker node and therefore everything runs on that
node).
Since the master is more than happy with the 4G disk and 1G RAM it is given
today split the settings into master and node specific and bump only the
latter.
KUBE_PORT_BASE is unused and was already removed in 54ddde0d43 but
accidentally reintroduced (by me) in 62aa9248a4, whack it again.
[0] https://microservices-demo.github.io/microservices-demo
Signed-off-by: Ian Campbell <ijc@docker.com>
This is less confusing as there is also an output option to set the file.
See https://github.com/moby/tool/pull/146
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Currently they will be on a read only partition so broken;
previously this would have been a non persistent read write partition
in an initramfs but this no longer works.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Because Kubernetes is 1.5GB, ISO makes sense as the files do not
take up memory, so you can boot a 1GB machine rather than a 4GB one.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
It is not in any wa=y a required container, and now that arm64
and other architecture machines are widely available we should
start to deprecate it, as it has many issues, eg requires patches
to qemu for Go support, will mislabel images etc.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
These were removed as unused in 8acecf1b62 but with the update to 1.7.2 they
are now pulled in (again?) by the default system.
Signed-off-by: Ian Campbell <ijc@docker.com>
PR #2314 turned /var into a tmpfs (possibly overmounted by a persistent disk)
and made /var/run into a symlink to /run. Adjust various containers and bind
mount settings to allow for this change. In particular ensuring that everything
can find the correct shared /var/run/docker.sock, which due to the symlink is
now actually at /run.
Signed-off-by: Ian Campbell <ijc@docker.com>
After #2289 we need to bind /etc/resolv.conf into the docker and kubelet
containers on both node and master.
Also since #2289 the metadata container requires /etc/resolv.conf to be
populated on the host, which means running DHCP earlier in oneshot onboot mode,
do so.
Signed-off-by: Ian Campbell <ijc@docker.com>
These contribute ~140M to the common image cache but do not appear to be used
by either the base system nor the sock-shop demo. They can/will still be pulled
on demands as necessary.
Signed-off-by: Ian Campbell <ijc@docker.com>
Bumps kubernetes and associated tools and images to v1.6.7 (from v1.6.1).
Updates weave from v1.9.4 to v2.0.1
Updates cni from a snapshot to v0.5.2. Note that the download location has
changed and the tarball no longer includes the `bin` subdirectory, so adjust
build to compensate.
Signed-off-by: Ian Campbell <ijc@docker.com>
Much smaller than the CentOS based one.
Note that ijc25/alpine-ssh has entrypoint==ssh.
Drop Compression=yes, this is used for local ssh so no point compressing (just uses CPU).
Signed-off-by: Ian Campbell <ijc@docker.com>
Remove `-publish` (which is currently Linux/QEMU specific) and replace with a
generic $KUBE_RUN_ARGS envvar. Usage:
KUBE_RUN_ARGS="-publish 2222:22" ./boot.sh
KUBE_PORT_BASE is thus obsolete and removed.
Signed-off-by: Ian Campbell <ijc@docker.com>
This commits an initial version of the Memorizer tracing tool. It collects and
outputs detailed data on the objects (traced from kmalloc/kmem_cache_alloc) and
accesses, tracking the context of each event with respect to thread ID, program
counter, and for allocations name of process.
Signed-off-by: Nathan Dautenhahn <ndd@cis.upenn.edu>
It's slightly embarrassing that this old snapshot was kept around here
rotting for so long, but thankfully something is finally being done
about it.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Apart from adding the recursive target itself this required:
- Unescaping the @ in the image names, this was confusing `make` into always
rebuilding and wasn't necessary (I had previously thought I had seen oddities
due to these being interpreted by the `patsubst`, but I think that was just the
colons.
- Making the recursive rules silent (prepending an @), those command lines are
not especially enlightening and they obscure the output in the show-tags case.
With this the output is like:
$ make --no-print-directory -C image-cache/ show-tags
linuxkitprojects/kubernetes-image-cache-common:94a0715c6b3604e909bc0da74260dc7f1142d90d-dirty
linuxkitprojects/kubernetes-image-cache-control-plane:94a0715c6b3604e909bc0da74260dc7f1142d90d-dirty
Signed-off-by: Ian Campbell <ijc@docker.com>
The total size of the images in the common and control-plane cache is 251M and
528M respectively.
This changes drops the size of the cache images from 353M to 274M and from 630M
to 530M, reducing the overhead from ~100M to ~20M.
The initrd images shrink from 273M to 246M and from 416M to 363M (the initrd's
are compressed).
Signed-off-by: Ian Campbell <ijc@docker.com>
This updates the build of the two image caches to use the `pkg/package.mk`
infrastructure, albeit in a slightly (ok, very) atypical way.
In order to share the bulk of the build code (including the `Dockerfile` and
the `Makefile` machinery to download the images) we arrange for the necessary
bits to be copied at build time into distinct subdirectories and for the
`pkg/package.mk` to be aware of this possibility.
Since pkg/package.mk is only set up to build a single package we use a single
`image-cache/Makefile` to drive the whole process and recurse into
`Makefile.pkg` to build individual packages.
One particular subtlety is that the package hash is based on the `image-cache`
directory (which is in `git`) rather than the generated subdirectories (which
are not in `git`). Since all the generators (and their inputs) are in the
`image-cache` directory this is what we want. This means that the two images
are given the same tag, but this is deliberate and desirable.
The generated directories are completely temporary to avoid picking up stale
versions of images when versions are updated. Images are hardlinked into place.
The images are moved to the linuxkitprojects org. Using a dev tag for now, will
update once everything is in place.
Also use "tag" rather than "build" where appropriate in the Makefile.
There is no point in the .dockerignore now, but add a .gitignore.
Signed-off-by: Ian Campbell <ijc@docker.com>
It is pretty close to our docker package, if we adjust the command
that is run to avoid the actual dind startup script. We can't use
the normal docker image as it does not have mkfs and so on.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
3 components:
- network: read eht0 and proxy only DHCP traffic
- engine: read DHCP traffic, handle DHCP client state machine, and call the
host actuator to change the host config when a lease is obtained
host system configuration.
- actuator: perform the acutall net syscalls, read and write host configuration
files, etc
These three components can either be linked together in a single binary
(see src/dhcp-client/main.ml) or can be used as 3 binaries communicating
over cap-n-proto.
Signed-off-by: Thomas Gazagnaire <thomas@gazagnaire.org>
Still a flat/unstructured config space, but at least uses the mounting
machinery.
`boot.sh` continues to just work without modification.
Signed-off-by: Ian Campbell <ijc@docker.com>
These were originally generated by some box builder runes and then taken
wholesale here. Format them to be more readable.
Signed-off-by: Ian Campbell <ijc@docker.com>
It doesn't support it. This makes "make cache-images" work. Previously it would
fail with various:
Error: remote trust data does not exist for gcr.io/google_containers/pause-amd64: gcr.io does not have trust data for gcr.io/google_containers/pause-amd64
Signed-off-by: Ian Campbell <ijc@docker.com>
This is a pretty straight port of the previous box stuff, without much attempt
to clean things up.
Image label is a placeholder, will update once a batch of changes are complete.
Signed-off-by: Ian Campbell <ijc@docker.com>
Apart from the /var/lib mount itself the custom package:
- Made host /etc/cni and /opt/cni rshared. This has been handled by init make /
rshared since 3c326bebdf ("Make / rshared").
- Make /var/lib/kubeadm after mount. For now handle this with a dedicated start
of day container instead.
Signed-off-by: Ian Campbell <ijc@docker.com>
Port base is configurable (via $KUBE_PORT_BASE envvar). Master uses this and
nodes use subsequent ports.
Check that the node number is numeric so we can add them to things, but avoid
worker node 0 since the port will clash with master.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
pull in newer containerd v1.0.0-alpha0 via updated alpine base, update runc to
429a5387123625040bacfbb60d96b1cbd02293ab which is vendored by that version of
containerd (and also update alpine base for runc)
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
These are not needed, but we are inconsistent. Been waiting for a
quiet moment to fix this since I noticed while doing a presentation...
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Firstly add option to disable content trust, for the use of e.g. projects which
are pushing to the linuxkitprojects org (which has no trust setup) rather than
the main linuxkit org.
Secondly, when trust _is_ enabled then enable it globally, in particular it is
now active for the `docker build` and hence containers referenced in
Dockerfiles via "FROM" will be checked.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
This version is more up to date and works with the current containerd packages.
swarmd needs /tmp to share /tmp/containerd with containerd.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Lots of boilerplate for now on, will work on upstreaming that in the tool
properly if needed later.
Signed-off-by: Thomas Gazagnaire <thomas@gazagnaire.org>
```
$ fdd init &
$ fdd share /tmp/foo # serve a fresh socketpair on that path
$ fdd test /tmp/foo # read the socketpair and test that it works
```
Instead of `fdd test` (which is only useful for testing), users are expected to
connect to the unix domain socket and call `recvmsg(2)`. They will get one side
of the socketpair. Two different processes can do this and they will be able to
talk to each other.
Signed-off-by: Thomas Gazagnaire <thomas@gazagnaire.org>
- Use the new style kernel tags with the full kernel version
- Update packages with new alpine base and new/simplified Makefiles.
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
- IO has been upstreamed in mirage-flow-lwt
- Init.Flow.Fd has been upstreamed in mirage-flow-unix
- Init.Flow.Rawlink has been upstreamed in mirage-flow-rawlink
- Remove some dead-code in unikernel.ml
Signed-off-by: Thomas Gazagnaire <thomas@gazagnaire.org>
This includes https://github.com/containerd/containerd/pull/994 and hence
requires updating the various instances of `/etc/containerd/config.toml`.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Currently it supports only `service start <SERVICE>`, but it could grow e.g.
`stop`, `exec` etc in the future (although you can still use `ctr` for those).
In order to be able to use go-compile.sh the containerd build needs to move
from /root/go to /go as the GOPATH.
The vendoring situation is not ideal, but since this tool wants to be an exact
match for the containerd it seems tollerable to reuse its vendoring.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
This doesn't exist with newer ctr or in systems where service containers are
not started using the ctr tool. All it contains today are the stdio FIFOs,
which are not in general useful to access after container creation.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
The new init adds the usermode helper which is needed with
the soon to be pushed new 4.11 kernel update.
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Instead of figuring out which config files to use inside of makeconfig.sh,
let's figure that out in the Dockerfile and pass them into the script.
Signed-off-by: Tycho Andersen <tycho@docker.com>
Instead of having a special case sed script, we can just put this in the
.debug config file, and have a special case when it's being checked.
Signed-off-by: Tycho Andersen <tycho@docker.com>
Let's use the kernel machine architecture for this value.
Also remove a broken check. The "arch" binary on OSX outputs different
stuff than on linux. Since we don't need this check anyway (the variable
is mostly to demonstrate how cross platform stuff would work, not to
actually do it yet), let's just remove the check.
Signed-off-by: Tycho Andersen <tycho@docker.com>