This puts the build side in charge of the runtime layout, which enables
additional optimisations later, like sharing the rootfs if it is
used multiple times.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
This could be used in LinuxKit now, as there are some examples, eg
https://github.com/linuxkit/linuxkit/blob/master/blueprints/docker-for-mac/base.yml#L33
which are creating containers to do a mount.
The main reason though is to in future change the ad hoc code that generates
overlay mounts for writeable containers with a runtime config which does
the same thing; this code needs to create both tmpfs and overlay mounts.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
This adds a `runtime` section in the config that can be used
to move network interfaces into a container, create directories,
and bind mount container namespaces into the filesystem.
See also https://github.com/linuxkit/linuxkit/pull/2413
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Rather than using an initrd, unpack full filesystem for ISO BIOS.
Stream docker output direct to file rather than via a buffer, to save
memory.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
When we converted these to cpio we were not noticing that they
were invalid as they had incorrect paths as we converted the
path to a symlink anyway. Only the busybox images have hard links
in, the Alpine ones are symlinks anyway, which is why it was
less visible too.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Also do some code cleanup.
Related to #131 we need to read the OCI config to find if the container
is read only, not rely on the yaml, as it may just be set in the label.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
To work with truly immutable filesystems, rather than ones
we sneakily remount `rw`, we are going to use overlay for
writeable containers. To leave the final mount as `rootfs`,
in the writeable case we make a new `lower` path for the read
only filesystem, and leave `rootfs` as a mount point for an
overlay, with the writable layer and workdir mounted as a tmpfs
on `tmp`.
See https://github.com/linuxkit/linuxkit/issues/2288
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Unfortunately there are a lot of issues with resolv.conf as we
cannot actually write it into the image from any docker image, as docker will
always have something bind mounted in.
In addition, normally we expect the filesystem to br read only for images
that moby generates, so the actual etc/resolv.conf is likely not to be writeable.
Previously we were adding in a default resolv.conf into every image pointing at
Google's name servers but that is really a bad idea.
Instead, normal images now get an empty default, while images in the `init`
section will get a symlink, currently hard coded to `/run/resolvconf/resolv.conf`
but you can override this with the `files` section to be static or a different
link.
In future, if we have an easy way to build and extract images with user control
of this, we can drop this.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Some of these are arbitrary and just syncing for the sake of it, however the
image- and runtime-spec are relevant. Interesting changes:
- runtime spec:
- LinuxRLimit is now POSIXRLimit.
- Specs.Config is now a pointer.
- LinuxResources.DisableOOMKiller moved to
LinuxResources.LinuxMemory.DisableOOMKiller
- image spec:
- Platform.Features is removed (unused here).
Signed-off-by: Ian Campbell <ijc@docker.com>
This is a list of images to run on a clean shutdown. Note that you must not rely on these
being run at all, as machines may be be powered off or shut down without having time to run
these scripts. If you add anything here you should test both in the case where they are
run and when they are not. Most systems are likely to be "crash only" and not have any setup here,
but you can attempt to deregister cleanly from a network service here, rather than relying
on timeouts, for example.
Fix https://github.com/linuxkit/linuxkit/issues/1988
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Currently this supports "yaml" as the only option, which will output
the yaml config (as JSON) into the file specified in the image.
Fix#107
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Previously I was forcing them to be strings, which is horrible. Now you
can either specify a numeric uid or the name of a service to use the
allocated id for that service.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Continue to allow onboot to have duplicates as we do not run simultaneously
so that is ok (and we number them anyway), but services are run together
so we will get a runtime error if duplicated as this is the containerd/runc
id.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
We were pulling in this whole stack of packages just for `trust.ReleasesRole`.
Just define it locally.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
Note that various fields have changed moved around in the JSON as a result:
* `Platform` has been removed.
* `Process` is now a pointer.
* `OOMScoreAdj` has moved into `Process`, from `Linux.Resources` (resolving a
TODO here).
Also updates golang.org/x/sys which is less critical.
Signed-off-by: Ian Campbell <ian.campbell@docker.com>
This adds the OCI parts needed into the yaml, but there are still
permissions issues in practise so marked as experimental.
It may just need further documentation to resolve the issues.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
In order to support not running containers as root, allocate
each of them a uid and gid, a bit like traditional Unix system
service IDs. These can be referred to elsewhere by the name of
the container, eg if you wish to create a file owned by a
particular esrvice.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Allow setting ambient capabilities, as a seperate option to the standard
ones. If you are running as a non root user you should use these.
Note that unless you add `CAP_DAC_OVERRIDE` and similar permissions you
need to be careful about file ownership. Added support to set ownership
in the `files` section to help out with this.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Rather than build the image and have something weird happen, let's check
that the capabilities specified are actually valid capabilities.
Signed-off-by: Tycho Andersen <tycho@docker.com>
- this is pretty much the smallest change to split this out and it
exposes a few things that can be improved later
- no change to logging yet
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
In the WIP code in `moby` we now have a standard base tarball format,
that includes the kernel and cmdline as files in `/boot` so that the
entire output of the yaml file can default to a single tarball. Then
this can be split back up by LinuxKit into initrd, kernel and cmdline
as needed. This will probably become the only output of the `moby build`
stage, with a `moby package` stage dealing with output formats.
We may remove the output format specification from the yaml file as well,
and just have it in the command.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Instead, make a hard link a symlink. This isn't much better, but it allows
some cases (e.g. installing GCC on moby via alpine) to work.
Signed-off-by: Tycho Andersen <tycho@docker.com>
This does not get everything where we want it finally, see #1266
nor the optimal way of building, but it gets it out of top level.
Added instructions to build if you have a Go installation.
Not moving `vendor` yet.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
This currently only changes the `gcp` target, but is the new
model - the `build` command will only do things locally, then
you need to `push` to an image store such as GCP or other ones
in order to `run` for platforms that cannot boot directly from
a local image.
Fix#1618
Signed-off-by: Justin Cormack <justin.cormack@docker.com>