The container images for Windows Server 2022 have been published, so
we can start building test images from them and adding jobs for
them.
The image versions for the e2e test images have been bumped in a previous
commit, but haven't been promoted yet. We don't need to bump them here.
We're starting with windows-servercore-cache and busybox images, since
they are needed for the other images the most.
A previous commit added LD_FLAGS for the go binary compilation, but it's not
defined for all images.
In the test image build jobs, the image-util.sh script is not being run in a git
repository, which causes git log to fail.
In this case, we can use the PULL_BASE_SHA set in cloudbuild.yaml instead.
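A minimal sketch of the fallback described above. The helper name `get_commit_sha` and the `unknown` default are hypothetical; the actual logic in image-util.sh may differ. It tries git log first and falls back to PULL_BASE_SHA when that fails:

```shell
#!/bin/sh
# Hypothetical helper: print a commit identifier for the sources in directory $1.
# Falls back to PULL_BASE_SHA when git log fails, for example because the
# directory is not a git repository. "unknown" is an assumed last resort.
get_commit_sha() {
  dir="${1:-.}"
  if sha="$(git -C "$dir" log -1 --format=%H 2>/dev/null)" && [ -n "$sha" ]; then
    printf '%s\n' "$sha"
  else
    printf '%s\n' "${PULL_BASE_SHA:-unknown}"
  fi
}
```

Usage: `PULL_BASE_SHA=abc123 get_commit_sha test/images` prints the git commit when one is available, and `abc123` otherwise.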
Currently, whenever agnhost/VERSION is bumped, the version in
agnhost/agnhost.go has to be bumped as well. This is also verified
on presubmit (build/dependencies.yaml).
This means that whenever we need to bump the agnhost image version,
someone has to approve the change to build/dependencies.yaml, which
makes the bump harder than it needs to be.
This commit removes the need for this check by automatically setting
the Version inside agnhost.go at build time, simplifying the process.
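One common way to set a Go string variable at build time is the linker's `-X` flag. The sketch below is an assumption about how this could look, not the commit's actual implementation: the helper name `version_ldflags` and the variable path `main.Version` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the -ldflags value that stamps the contents of a
# VERSION file into a Go string variable at link time.
# "main.Version" is an assumed variable path, not taken from the commit.
version_ldflags() {
  version_file="$1"
  # Drop the trailing newline and any other whitespace from the file.
  version="$(tr -d '[:space:]' < "$version_file")"
  printf '%s\n' "-X main.Version=$version"
}

# Example usage (not executed here):
#   go build -ldflags "$(version_ldflags agnhost/VERSION)" -o agnhost ./agnhost
```

With this approach the VERSION file stays the single source of truth, so there is no second copy of the version to keep in sync.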
For manifest lists containing Windows images, it is important to also have the "os.version"
annotation set, as it is needed by the Windows nodes, so they can pull the appropriate image
from the list.
Previously, the docker manifest CLI could not set it, so we had to set
it ourselves in the manifest list's image JSON file. This is no longer necessary since
docker 20.10.0, which includes docker manifest annotate --os-version.
The docker installed in the image gcr.io/k8s-testimages/gcb-docker-gcloud:v20210622-762366a
satisfies this version requirement.
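As an illustration, the annotate call for one Windows image could be assembled as below. This sketch only prints the command instead of running it; the helper name, the registry, and the example os.version value are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the annotate command for one Windows image
# in a manifest list. Arguments: manifest list, image, os.version value.
annotate_windows_cmd() {
  manifest_list="$1"
  image="$2"
  os_version="$3"
  printf '%s\n' "docker manifest annotate --os windows --arch amd64 --os-version $os_version $manifest_list $image"
}

# Example (registry, tags, and build number are illustrative values only):
#   annotate_windows_cmd example.io/busybox:1.29 \
#     example.io/busybox:1.29-windows-amd64-1809 10.0.17763.1935
```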
Looking deeper into the logs, there are a lot of errors like:
`script exited with error 1`
The initial reaction was that there was a problem with the download, but it
looks like the script we use to register the qemu emulators may be at
fault. Let's try this alternate mechanism.
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
Adds the httpd and nginx images that are used in tests.
Two different versions of nginx have to be built, and thus, they have
different folders. An ALIAS file was added to nginx-new in order to
keep the same image name.
- resetting `binfmt_misc` is not needed when the build platform is non-amd64 and the
target arch is the same as the build platform
- non-amd64 platforms don't support cross-builds well, and there is no
`qemu-user-static` binary able to do that, so skip the cross-build on non-amd64
platforms.
Signed-off-by: Dave Chen <dave.chen@arm.com>
The default value for the progress is ``auto``, which will eat the output of RUN commands. This makes it a bit hard to debug when issues occur. Changing that option to ``plain`` will ensure that the output is properly kept.
The metadata-concealment image does not have any BASEIMAGE file, which means
that the image will be built from scratch. In this case, there are a few
fixes that need to be made in the image-build.sh script.
We are planning to test and support the 20H2 release of Windows, thus,
we need to build test images for it as well. The busybox image already
has a BASEIMAGE entry for it, but we also need to add it to the image-util.sh's
windows_os_versions, so the OS Version can be properly included in the
manifest list.
Dockerhub will introduce rate limiting in November, and a lot of E2E tests
rely on the busybox image. This could cause jobs to fail.
Ideally, we'd have the busybox image mirrored on gcr.io, but that could take
some time. Until then, we can just have the Image Builder mirror the image
for us in the staging registry and use that for tests until this issue is
solved. The busybox image should NOT be promoted out of staging.
During the sig-testing meeting, it was decided that we should do the same
for the other images that are hosted on dockerhub.
Two different versions of httpd and nginx have to be built, and thus, they have
different folders. An ALIAS file was added to httpd-new and nginx-new in order
to keep the same image name.
Before creating and bootstrapping a docker buildx instance, we need to call
register.sh with the -p yes flag. Without this, docker buildx will only
support linux/amd64 and linux/386 platforms, meaning that it will fail when
trying to build images for other architecture types.
Additionally, the builder has to have qemu and its qemu-* binaries installed
in order to properly build the images. The recently created image
gcr.io/k8s-testimages/gcb-docker-gcloud:v20201130-750d12f has those requirements met.
Currently, the Image Builder job is failing as it cannot build images
for other architecture types. This happens because the Image Builder image
does not have any of the expected qemu-* binaries in /usr/bin/ needed in order to
run qemu-binfmt-conf.sh with the -p yes flag, so that flag is removed.
Currently, some of the E2E test images have Windows support, and one of the goals is for most of
them to have Windows support. For that, the Image Builder is currently building those Windows
container images using a few Windows Server nodes (for 1809, 1903, 1909) with Remote Docker
enabled, which are hosted on an Azure subscription dedicated to CNCF.
With this, the Windows nodes dependency is removed entirely, as the images can be also built with
docker buildx. One additional benefit to this is that adding new supported Windows OS versions
to the E2E test images manifest lists becomes a lot easier (we wouldn't have to create a new Windows
Server node that matches that new OS version, assign DNS name, update certificates, etc.), and it
also becomes easier for other people to build their own E2E windows test images.
However, some dependencies are still required to run on a Windows machine. To solve this, we can
just pull helper images: e2eteam/powershell-helper:6.2.7 and e2eteam/busybox-helper:1.29.0. Their
Dockerfiles and a Makefile for them have been included in this commit. If any change is required to
them, then a new image will be built and tagged under a different version, but they are pretty
straightforward and shouldn't require changes.
However, there is a small concern when it comes to the build time: Windows servercore images are
very large (for example, mcr.microsoft.com/windows/servercore:ltsc2019 is 4.99GB uncompressed, and
about 2 GB compressed; those images are already cached on the Windows Server builder nodes, so
this isn't an issue there), and we currently support 1809, 1903, and 1909 (soon to add 2004).
This can lead to build times that are too long.
We have changed the base image to nanoserver (uncompressed size: 250MB), but some images still
require some DLLs or some other dependencies that can be fetched from a servercore image.
A separate job has been defined that would build a scratch windows-servercore-cache image monthly,
and then we can just get those dependencies from this cache, which will be very small.
This would be preferred, as the Windows images update periodically, and those dependencies
could be updated as well.
Using Windows nanoserver container images as a base instead of the current
Windows servercore image will reduce the image size by about 10x.
However, the nanoserver image lacks several things we need:
- netapi32.dll
- powershell
- certain powershell commands
- chocolatey cannot be used
When building the nanoserver images, we are going to use a Windows servercore helper,
in which we are going to install the necessary dependencies, and then copy them over
to our nanoserver image, including necessary DLLs.
Other notable changes include:
- switch from wget to curl (wget was a powershell alias).
- get the DNS suffix list and DNS server list in code.
- reimplement getting file permissions for mounttest.
When trying to build the s390x image, it would fail when running the apk
command with the following error:
ERROR: Unable to open root: Bad address
ERROR: Failed to open apk database: Bad address
This can be fixed by updating the third_party/multiarch/qemu-user-static/register/register.sh
and third_party/multiarch/qemu-user-static/register/qemu-binfmt-conf.sh scripts
and their usage to a newer version [1].
Additionally, the packages nginx-mod-http-lua and nginx-mod-http-lua-upstream
cannot be found in the regular http://dl-cdn.alpinelinux.org/alpine/v3.9/main/s390x/
repository, but we can use an older one [2].
[1] https://github.com/qemu/qemu/blob/master/scripts/qemu-binfmt-conf.sh
[2] http://dl-cdn.alpinelinux.org/alpine/v3.8/main
The build times are a bit high for the image builder (~50 minutes), and they will increase a bit more
when Windows support is added to the other test images. This commit changes the
machineType to N1_HIGHCPU_8.
Reenables Windows test image building. Added DOCKER_CERT_BASE_PATH (default value: $HOME),
which will contain the path where the certificates needed for Remote Docker Connection can
be found.
If a REMOTE_DOCKER_URL was not set for a particular OS version, exclude that image from the
manifest list. This fixes an issue where, if REMOTE_DOCKER_URL was not set for Windows Server 1909,
the Windows images were completely excluded from the manifest list, including those for Windows
Server 1809 and 1903, which could have been built and pushed.
Sets "test-webserver" as the default CMD for kitten and nautilus. Since they are now based on
agnhost, they should be set to run test-webserver to maintain previous behaviour.
Bumps the agnhost version to 2.13, as 2.12 has already been promoted. 2.13 will contain
Windows support.
Adds Windows support for the kitten and nautilus images, so they can be promoted together
with agnhost (they were not previously promoted).
Adds OWNERS files to: agnhost, busybox, kitten, nautilus.
Adds splitOsArch function to image-util.sh, which makes the script DRY-er.
When building a Windows test image, if REMOTE_DOCKER_URL is not set, skip the rest of the
building process for that image, which will save some time (no need to build binaries).
In the current version, due to how make works, when building all the conformance
images (make all-push WHAT=all-conformance), ALL the images are being built first
before being pushed.
This PR will allow images to be built and pushed immediately afterwards, so the first
images that have been successfully built are already pushed and promotable, even if
the task failed on the last image, or it timed out.
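The build-then-push-per-image ordering can be sketched as a simple loop. `build_image` and `push_image` below are stand-ins with hypothetical names; they only echo what the real steps would do.

```shell
#!/bin/sh
# Stand-ins for the real build and push steps (hypothetical names).
build_image() { echo "build $1"; }
push_image()  { echo "push $1"; }

# Build and push each image in turn, so images that built successfully are
# already pushed if a later image fails or the job times out.
build_and_push_all() {
  for image in "$@"; do
    build_image "$image" || return 1
    push_image "$image" || return 1
  done
}
```

Calling `build_and_push_all agnhost busybox` pushes agnhost before the busybox build starts, instead of building everything first.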
In order to build Windows container images for multiple OS versions,
--isolation=hyperv is required. However, not all clouds / nodes support
it or have it enabled by default, which is why we're going to rely on
having multiple nodes to build the Windows images, until this issue
is addressed.
This commit adds support for building test images for multiple
Windows versions, as we have to support both LTS and SAC channels.
With this, the format for Windows images in the BASEIMAGE files is:
OS/ARCH/OS_VERSION
Also adds --isolation=hyperv to the Windows docker build command, making sure
that container images for multiple OS versions can be built using the same
Windows node.
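A minimal sketch of splitting an entry in the OS/ARCH/OS_VERSION format. The function name `parse_platform` is hypothetical and is not the helper used in image-util.sh; the "-" placeholder for a missing OS version is also an assumption.

```shell
#!/bin/sh
# Hypothetical parser for a BASEIMAGE key of the form OS/ARCH[/OS_VERSION].
# Prints "os arch os_version"; os_version is "-" when absent (e.g. linux/amd64).
parse_platform() {
  # IFS is changed only for this read, so the rest of the script is unaffected.
  IFS=/ read -r os arch os_version <<EOF
$1
EOF
  printf '%s %s %s\n' "$os" "$arch" "${os_version:--}"
}
```

For example, `parse_platform windows/amd64/1809` prints `windows amd64 1809`, while `parse_platform linux/arm64` prints `linux arm64 -`.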
Adds Windows support to the test/images/image-util.sh script.
A Windows node with Docker installed is required to build Windows images.
The connection URL to it must be set in the REMOTE_DOCKER_URL env variable.
Additionally, the authentication to the remote docker node is done through
certificates, which must be found in ~/.docker.
By default, the REMOTE_DOCKER_URL env variable is set to "" in the Makefile,
and because of it, the image-util.sh script will skip building and pushing
Windows images.
Added GOOS argument to the go build process in order to be able to build
Windows binaries. Additionally, the OS env variable was added to the images
Makefiles (default value is "linux") in order to maintain default behaviour.
Some images require a different Dockerfile for Windows images, since they
have different ways of installing dependencies. Because of this, if an image
needs to be built for Windows, it will first check for a Dockerfile_windows
file instead of the default one. If there isn't one, it means that the
same Dockerfile can be used for both Windows and Linux.
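The Dockerfile lookup described above can be sketched as follows. The helper name `select_dockerfile` is hypothetical; the real script may structure this differently.

```shell
#!/bin/sh
# Hypothetical helper: print the Dockerfile to use for image directory $1 and OS $2.
# For Windows, Dockerfile_windows is preferred when it exists; in every other
# case the shared Dockerfile is used.
select_dockerfile() {
  image_dir="$1"
  os_name="$2"
  if [ "$os_name" = "windows" ] && [ -f "$image_dir/Dockerfile_windows" ]; then
    printf '%s\n' "$image_dir/Dockerfile_windows"
  else
    printf '%s\n' "$image_dir/Dockerfile"
  fi
}
```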
All Windows images will be based on the image
"mcr.microsoft.com/windows/servercore:ltsc2019". There are a couple of features
that are needed from this image, especially powershell.
Added busybox image for Windows. Most Windows images will be based on it, which
will help reduce the command line differences between Linux and Windows, but
not entirely.
Added Windows support for agnhost image.
Changes the image naming template from:
$REGISTRY/$image-$arch:$TAG
to
$REGISTRY/$image:$TAG-$os_name-$arch
The previous naming template would generate a plethora of image names (the sum of Ai over all
images, where Ai is the number of OS/architectures for the image i and N is the number
of images), while the new naming template will reduce the number of image names to N.
The new template also includes the OS name, as we plan to integrate Windows
images into the manifest lists as well.
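To make the new template concrete, here is a small sketch that composes a reference from its parts. The function name `image_ref` and the example registry are hypothetical.

```shell
#!/bin/sh
# Compose an image reference using the new template:
#   $REGISTRY/$image:$TAG-$os_name-$arch
# Arguments: registry, image, tag, os_name, arch.
image_ref() {
  registry="$1"; image="$2"; tag="$3"; os_name="$4"; arch="$5"
  printf '%s/%s:%s-%s-%s\n' "$registry" "$image" "$tag" "$os_name" "$arch"
}
```

For example, `image_ref example.io/e2e agnhost 2.13 linux arm64` prints `example.io/e2e/agnhost:2.13-linux-arm64`, so every OS and architecture variant lives under the single `agnhost` image name.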
When building images, their REGISTRY can be set to a custom
one, instead of the default "gcr.io/kubernetes-e2e-test-images" or
"us.gcr.io/k8s-artifacts-prod/e2e-test-images".
Some images are based on other images we're already building
(e.g.: kitten, nautilus), but their base images
are set in the default registry name, which can be undesirable.
This commit addresses this issue.
Windows images will require other base images, and thus, we will need
to explicitly specify the OS type a base image is for in order to
avoid confusion or errors.
The manifest list is stateful, which means that the same list will get amended
with each successive image published. That's unintended, and can lead to the
wrong image being pulled from the manifest list.
Resets the manifest list before amending new images into it.
It seems that the Image Promoter is running containers without the -t flag, which causes the error:
the input device is not a TTY
Removing the -it from the docker command in kubernetes/test/images/image-util.sh solves this.
Prior to the Image Centralization part 4 (https://github.com/kubernetes/kubernetes/pull/81170),
a PR merged that enables the Image Promoter to run on the k/k test images.
The Image Promoter currently only builds the Conformance-related images, but the
Image Centralization part 4 centralized some of those images into agnhost, so they
need to be removed from the conformance_images list.
Additionally, https://github.com/kubernetes/kubernetes/pull/81226 proposes mounttest-user
image to be removed, and RunAsUser to be used in tests instead.
The image used by the Image Promoter (gcr.io/k8s-testimages/gcb-docker-gcloud:v20190906-745fed4)
is based on busybox, and thus, the sed binary is actually busybox. image-util.sh calls
kube::util::ensure-gnu-sed several times, which ensures that a GNU sed binary exists
(it checks by grepping GNU in its --help output). Obviously, it won't match the busybox sed
binary. But the sed usage in image-util.sh is fairly simple, and the busybox sed is sufficient.
Bumps image versions for: jessie-dnsutils, nonewprivs, resource-consumer, sample-apiserver. These
images are included in the conformance_images that are being built by the Image Promoter, so
we're bumping them just to make sure we're not breaking anything and causing all the CI jobs to fail.
We're going to bump the image versions used in tests in a subsequent PR. The image version was not
bumped for: agnhost, kitten, nautilus, as they were already bumped by the Image Centralization part 4
PR.
In order for the E2E test images to be automatically built and published
to the staging registry (from which they will be promoted to the regular
E2E test registry), the cloudbuild.yaml file has been added.
The file was added in conformance with [1].
Adds the ability to build all test images:
make -C test/images WHAT=all-images
[1] https://github.com/kubernetes/test-infra/blob/master/config/jobs/image-pushing/README.md
- `GOARM` should not be hardcoded
- `GOARM` needn't be set when the `ARCH` is not `arm`
- make it possible to build the binary within the `agnhost` dir as well
- fix image build failure when the user is root
Signed-off-by: Dave Chen <dave.chen@arm.com>
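The two `GOARM` points above can be sketched as a helper that emits the cross-compilation variables for a given `ARCH`. The function name `go_build_env` and the default value of 7 are assumptions made for this sketch only.

```shell
#!/bin/sh
# Hypothetical helper: print the Go cross-compilation variables for ARCH $1.
# GOARM is included only for arm, and its value can be overridden by the caller
# (the default of 7 is an assumption made for this sketch).
go_build_env() {
  arch="$1"
  if [ "$arch" = "arm" ]; then
    printf '%s\n' "GOARCH=$arch GOARM=${GOARM:-7}"
  else
    printf '%s\n' "GOARCH=$arch"
  fi
}
```

The output could then be passed to the build with something like `env $(go_build_env "$ARCH") go build ./...`, which leaves `GOARM` unset for every architecture other than arm.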