NOTE: This will be a shared mount, due to root being turned into a
shared with `MC_REC` set: `mount("", "/", "", rec|shared, "")`.
For some reason setting `shared` when mounting `/sys/fs/bpf` doesn't
work at all, perhaps that's just a kernel feature.
Signed-off-by: Ilya Dmitrichenko <errordeveloper@gmail.com>
- use the mkimage hashes that we had in LinuxKit as more up to date than tool.
- update docs
- move the code from moby under src/cmd/linuxkit
Signed-off-by: Justin Cormack <justin@specialbusservice.com>
When logging directly to files (the not-using-memlogd case) the onboot
services must log to /run/log because /var/log might be overmounted
by a persistent disk. Therefore we create a symlink at the end of
the onboot section.
When logging via memlogd, all logs are buffered until a logwrite service
starts, so no symlink is needed.
Signed-off-by: David Scott <dave.scott@docker.com>
If external logging is enabled, this patch sets the stdout and stderr
of the `runc` invocations to one end of a socketpair and the other end is
sent to the logging service. Otherwise we log to files as before.
Signed-off-by: David Scott <dave.scott@docker.com>
An external logging system exists if the socket
/var/run/linuxkit-external-logging.sock
exists.
If an external logging system is enabled then create FIFOs for
containerd and send the other end of the FIFOs to the logging service.
Otherwise use /var/log files as before.
Signed-off-by: David Scott <dave.scott@docker.com>
When busybox's reboot processing occurs in init, it runs all SHUTDOWN
actions that are defined in inittab. Once those are complete, it will
trigger either a halt, poweroff, or reboot, depending upon what signal
is received. The mechanism that's used to shell out through inittab
does not allow us to pass through exactly which invocation was
requested.
Due to the way that rc.shutdown works, it invokes the poweroff action
for any and all SHUTDOWN callbacks, whether they're a reboot, poweroff,
or halt. Instead of handling the reboot(2) syscall in rc.shutdown,
return after killing and unmounting and let busybox's init process
decide which reboot(2) action to use.
Signed-off-by: Krister Johansen <krister.johansen@oracle.com>
Since we are building containerd v1.1.0 with go 1.10 (as it requires) to the
same for init and runc too for consistency. In the case of init it is actually
required since we use the containerd client library there.
The subreaper interfaces have been removed from containerd and replaced with a
similar interface in runc/libcontainer, update init to use that now.
Signed-off-by: Ian Campbell <ijc@docker.com>
$ git diff linuxkit.yml
diff --git a/linuxkit.yml b/linuxkit.yml
index e2ec829db..21b84e4ad 100644
--- a/linuxkit.yml
+++ b/linuxkit.yml
@@ -1,6 +1,6 @@
kernel:
image: linuxkit/kernel:4.14.32
- cmdline: "console=tty0 console=ttyS0 console=ttyAMA0"
+ cmdline: "console=ttyS0 console=foobar"
init:
- linuxkit/init:v0.3
- linuxkit/runc:v0.3
$ linuxkit build linuxkit.yml
[...]
$ linuxkit run linuxkit
[...]
getty: cmdline has console=foobar but /dev/foobar is not a character device; not starting getty for foobar
linuxkit-2ae2c420a11c login: root (automatic login)
Welcome to LinuxKit!
NOTE: This system is namespaced.
The namespace you are currently in may not be the root.
(ns: getty) linuxkit-2ae2c420a11c:~# ls -l /proc/1/root/dev/foobar
-rw-r--r-- 1 root root 311 Apr 9 13:19 /proc/1/root/dev/foobar
(ns: getty) linuxkit-2ae2c420a11c:~# cat /proc/1/root/dev/foobar
Welcome to LinuxKit
## .
## ## ## ==
## ## ## ## ## ===
/"""""""""""""""""\___/ ===
{ / ===-
\______ O __/
\ \ __/
\____\_______/
Also added quotes around $tty for good measure.
Signed-off-by: Ian Campbell <ijc@docker.com>
This is similar to ae64ab6b82 from #2849 which
did the same for runtime.mkdir.
This makes it possible to specify both host (absolute) or container (relative)
paths.
Signed-off-by: Ian Campbell <ijc@docker.com>
This PR correctly plumbs a single context to propagate the containerd
namespace to the necessary commands. Services launched with containerd
after this change will now be in a default namespace of
`services.linuxkit`.
A top-level flag is added to the service command,
`--containerd-namespace` which can be used to change, if needed.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Use unix.Reboot from golang.org/x/sys/unix for poweroff and reboot
instead of relying on external commands.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Implements https://github.com/moby/tool/pull/181
Design for things like Kubernetes setup that requires some cgroups to
exist when the service starts but it is not running in these, other
services are, so there would be a race if they are not created in each.
Essentially it is just a sugared `mkdir` in all the cgroup dirs.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
By running:
./scripts/update-component-sha.sh --image linuxkit/alpine ad35b6ddbc70faa07e59a9d7dee7707c08122e8d
Signed-off-by: Ian Campbell <ijc@docker.com>
This was done with the following "script":
git rm pkg/{auditd,binfmt,init}/Makefile
sed -e 's/IMAGE=/image: /g' -i pkg/*/Makefile
sed -e 's/NETWORK=1/network: true/g' -i pkg/*/Makefile
sed -e 's/ARCHES=x86_64/arches:\n - amd64/g' -i pkg/*/Makefile
sed -e '/DEPS:\?=/d' -i pkg/*/Makefile
sed -e '/ARCHES=SKIP/d' -i pkg/node_exporter/Makefile
sed -e 's/include \.\.\/package.mk//g' -i pkg/*/Makefile
sed -e '/^$/d' -i pkg/*/Makefile
git mv pkg/node_exporter/Makefile pkg/node_exporter/build.yml-skip
for i in pkg/*/Makefile ; do git mv $i ${i%Makefile}build.yml ; done
and manual update of pkg/Makefile.
Signed-off-by: Ian Campbell <ijc@docker.com>
This implements the proposal in #2564 and converts a handful of representative
or especially interesting (from a build PoV) packages to use it.
For now those pkg/* affected get a stub-`Makefile`, once all packages are
converted then `pkg/Makefile` can be adjusted and those stubs can be removed.
For now only `pkg/package.mk`'s functionality is implemented. In particular:
- `push-manifest.sh` remains a separate script, to enable calling it on systems
with just the LinuxKit tools installed arrange to install it under a less
generic name.
- `kernel` and `tools/alpine` do not use `pkg/package.mk` and those cases are
not yet fully considered/covered.
I have updated the documentation assuming that the existing uses of
`pkg/package.mk` will be removed quite soon in a follow up PR rather than
trying to document the situation which results after just this commit.
Due to `cmd/linuxkit` now gaining a library the build needs adjusting slightly to
allow both `make bin/linuxkit` and `go build` to work.
`go vet` has forced me to write some rather asinine comments for things that
are rather obvious from the name.
Signed-off-by: Ian Campbell <ijc@docker.com>
golint on pkg/init now complains:
golint...
./init.go:199:2: redundant if ...; err != nil check, just return error instead.
Resulting in a change which doesn't seem like an improvement to me.
Signed-off-by: Ian Campbell <ijc@docker.com>
This removes more shell scripts to improve maintainability.
This now also works correctly in userspace, so it can be used for
running LinuxKit images in Docker and other such use cases.
It is a literal conversion of the shell scripts with a few small
tweaks.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Previously we would pass the path `/var/log/service.log` for both
stdout and stderr to containerd. containerd would construct a dict
with the paths as keys[1] and, due to the duplicate key, would only
open one of the files and start one `io.Copy` instance. Writes to
the other stream would be buffered by the pipe connected to
containerd-shim and would eventually block.
If we modified containerd to open the file twice and start 2
`io.Copy` instances, we would end up with the two streams interleaved
together. It seems cleaner to keep the streams separate; therefore
this patch logs stdout to `/var/log/service.out.log` and stderr to
`/var/log/service.err.log`.
[1]
49437711c3/linux/shim/io.go (L51)
Signed-off-by: David Scott <dave.scott@docker.com>
This removes all the code that had knowledge of how to do read only
and read write container mounts, and just uses the runtime config.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
This could be used in LinuxKit now, as there are some examples, eg
https://github.com/linuxkit/linuxkit/blob/master/blueprints/docker-for-mac/base.yml#L33
which are creating containers to do a mount.
The main reason though is to in future change the ad hoc code that generates
overlay mounts for writeable containers with a runtime config which does
the same thing; this code needs to create both tmpfs and overlay mounts.
See https://github.com/moby/tool/pull/145
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
The metadata package has binds
- /dev
- /var
- /sys
- /etc/resolv.conf
- /etc/ssl/certs
but unfortunately `/etc/ssl/certs` doesn't exist and this causes the
following commands:
cd blueprints/docker-for-mac # easy example
moby build -name docker-for-mac base.yml docker-17.06-ce.yml
linuxkit run hyperkit -networking=vpnkit -vsock-ports=2376 -disk size=500M docker-for-mac
to produce the following error on the VM console:
container_linux.go:265: starting container process caused "process_linux.go:348: container init caused \"rootfs_linux.go:57: mounting \\\"/etc/ssl/certs\\\" to rootfs \\\"/containers/onboot/000-metadata/rootfs\\\" at \\\"/etc/ssl/certs\\\" caused \\\"stat /etc/ssl/certs: no such file or directory\\\"\""
2017/08/21 16:39:40 Error creating 000-metadata: exit status 1
This patch creates /etc/ssl/certs in the `init` package. The metadata package
will now say things like
2017/08/21 16:44:39 No metadata/userdata found. Bye
Signed-off-by: David Scott <dave.scott@docker.com>
As discussed before, as we use this in three places, cloning in
base makes more sense.
Update base image.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
This adds support for a runtime configuration file that can do:
- `mkdir` to make a directory at runtime, eg in `/var` or `/tmp`, to avoid workarounds
- `interface` that can create network interfaces in a container or move them
- `bindNS` that can bind mount namespaces of an `onboot` container to a file so a service can be started in that namespace.
It merges the `service` and `onboot` tools (in `init`) to avoid duplication. This also saves some size for
eg LCOW which did not use the `onboot` code in `runc`.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Chown clears suid bits even for root on Linux.
Also move a few functions to x/sys/unix from syscall, to be
more arm64 friendly.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
Previously we were cheating and remounting /var `rw` but this does not
work if the filesystem is really read only. Nount a tmpfs, which may
be overmounted later by a persistent filesystem.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
At present they use a small shared function called "prepare"
that does the read-write remounts, that I will switch to doing overlay
mounts soon.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
The filesystem is supposed to be immutable, so do not try to make
a symlink; new versions of moby tool should add one anyway. But
try to make the directory a symlink points to, assuming that it
will be on a writeable filesystem.
fix#1920
see also #2288
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
This uses a more memory efficient copy, and gets us closer to
not having a shell in the base system if not required.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
These can be added by other packages if they need to do something on
clean shutdown.
Crash only software can ignore this.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
We want them to run in sequence. For example we want mounts to be done (done by
`pkg/runc/etc/init.d/010-onboot`) before we start services (done by
`pkg/containerd/etc/init.d/020-containerd`). This was most likely introduced by
28b4245b12 ("Move onboot startup script to runc package").
None of the initscripts in pkg/* block, but some in projects (selinux and
logging, not updated here) do.
Signed-off-by: Ian Campbell <ijc@docker.com>
As this does not use containerd at all, this means you can run very
minimal setups with just `runc` if you use no services, for example
most of our tests do not actually use services, or if you have other
similar very minimal use cases.
Move ulimit setup to `init` which makes more sense.
Signed-off-by: Justin Cormack <justin.cormack@docker.com>
This commit moves the include statement to the bottom of the file to
ensure that all variables are set before conditionals are evaluated.
I also changed the ifndef NETWORK to ifdef NETWORK as the former was
incorrect. We want `NET_OPTS="--network=none"` in cases where NETWORK is
not defined.
Fixes: #2134
Signed-off-by: Dave Tucker <dt@docker.com>
In a subsequent commit, all YAML files will be updated with
new package hashes since all packages needed rebuild due to
build system changes in commit adae27b8d1 ("Simplify
Makefiles for Packages"). So, we might as well bring all
packages up to the latest alpine base package.
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>