Commit Graph

37 Commits

Author SHA1 Message Date
Fabiano Fidêncio
3edbff730d image-builder: support building addon images without /sbin/init check
Addon images (e.g. the CoCo guest components addon) are not full root
filesystems -- they contain only the binaries and configuration that
get bind-mounted into the real rootfs at boot.  The existing
check_rootfs() validation requires /sbin/init and systemd, which are
not present in addon images.

Add a SKIP_ROOTFS_CHECK environment variable that, when set to "yes",
bypasses the check_rootfs() call.  Forward the variable into the
container environment when using the Docker-based build path so it
works in both direct and containerised invocations.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
Assisted-by: Cursor <cursoragent@cursor.com>
2026-05-11 19:06:41 +02:00
Fabiano Fidêncio
284db5421f image-builder/nvidia: skip DAX header for virtio-blk-pci images
The DAX header (2 MiB of NVDIMM metadata + a duplicate MBR) is
unconditionally prepended to every image by set_dax_header(). NVIDIA
images use virtio-blk-pci with disable_image_nvdimm=true, so the
kernel reads MBR #1 directly and never touches the DAX metadata --
it is dead weight.

Add a SKIP_DAX_HEADER environment variable (default "no") that, when
set to "yes", skips the DAX header entirely:
- Removes the 2 MiB DAX overhead from image size calculations in
  both the erofs and ext4 paths
- Skips the set_dax_header() call, avoiding compilation and
  execution of the nsdax tool
- Passes the variable through to containerised builds

Enable SKIP_DAX_HEADER=yes for both install_image_nvidia_gpu() and
install_image_nvidia_gpu_confidential() in the build pipeline. All
other image builds are unaffected (default remains "no").

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 17:16:23 +02:00
Fabiano Fidêncio
f35661fac2 image-builder: add erofs dm-verity support and lz4hc compression
Add full dm-verity and measured rootfs support to
create_erofs_rootfs_image(), bringing it to parity with the ext4 path.

Unlike ext4, which is a read-write filesystem mounted read-only by
convention, erofs is structurally read-only -- no journal, no write
metadata, no superblock write path.

This is a natural fit for dm-verity: erofs never attempts writes, so
verity never has to reject anything. With ext4, the kernel must skip
journal replay on verity-protected devices, which is a fragile
assumption.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 17:16:23 +02:00
Fabiano Fidêncio
0e24b72e0d image-builder: refactor dm-verity setup into shared functions
Extract build_kernel_verity_params() and setup_verity() from the
inline block inside create_rootfs_image() into top-level functions.

This is a pure refactoring with no behavior change. The verity logic
is moved verbatim, with the only difference being that
build_kernel_verity_params() now takes the image path as an explicit
parameter instead of capturing it from the enclosing scope.

The extracted functions will be reused by create_erofs_rootfs_image()
in a subsequent commit to add dm-verity support for erofs images.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-05-10 17:16:23 +02:00
Fabiano Fidêncio
ea6c77bd5e tools: Fix shellcheck issues in image_builder.sh
Fix shellcheck warnings and notes identified by running
shellcheck --severity=style.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-04-24 08:14:08 +02:00
Manuel Huber
a786582d0b rootfs: deprecate initramfs dm-verity mode
Remove the initramfs folder, its build steps, and use the kernel
based dm-verity enforcement for the handlers which used the
initramfs mode. Also, remove the initramfs verity mode
capability from the shims and their configs.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
a3c4e0b64f rootfs: Introduce kernelinit dm-verity mode
This change introduces the kernelinit dm-verity mode, allowing
initramfs-less dm-verity enforcement against the rootfs image.
For this, the change introduces a new variable with dm-verity
information. This variable will be picked up by shim
configurations in subsequent commits.
This will allow the shims to build the kernel command line
with dm-verity information based on the existing
kernel_parameters configuration knob and a new
kernel_verity_params configuration knob. The latter
specifically provides the relevant dm-verity information.
This new configuration knob avoids merging the verity
parameters into the kernel_params field. Avoiding this, no
cumbersome escape logic is required as we do not need to pass the
dm-mod.create="..." parameter directly in the kernel_parameters,
but only relevant dm-verity parameters in semi-structured manner
(see above). The only place where the final command line is
assembled is in the shims. Further, this is a line easy to comment
out for developers to disable dm-verity enforcement (or for CI
tasks).

This change produces the new kernelinit dm-verity parameters for
the NVIDIA runtime handlers, and modifies the format of how
these parameters are prepared for all handlers. With this, the
parameters are currently no longer provided to the
kernel_params configuration knob for any runtime handler.
This change alone should thus not be used as dm-verity
information will no longer be picked up by the shims.

systemd-analyze on the coco-dev handler shows that using the
kernelinit mode on a local machine, less time is spent in the
kernel phase, slightly speeding up pod start-up. On that machine,
the average of 172.5ms was reduced to 141ms (4 measurements, each
with a basic pod manifest), i.e., the kernel phase duration is
improved by about 18 percent.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Manuel Huber
d37db5f068 rootfs: Restore "gpu: Handle root_hash.txt ..."
This reverts commit 923f97bc66 in
order to re-instantiate the logic from commit
e4a13b9a4a.

The latter commit was previously reverted due to the NVIDIA GPU TEE
handler using an initrd, not an image.

Signed-off-by: Manuel Huber <manuelh@nvidia.com>
2026-02-05 23:04:35 +01:00
Fabiano Fidêncio
33b1f0786e Revert "arm64: Do not use DAX with the rootfs image"
This reverts commit 2acb94ef2d, as we have
a kernel patch approved fixing the issue.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-17 19:15:53 +01:00
Fabiano Fidêncio
2acb94ef2d arm64: Do not use DAX with the rootfs image
Kernel 6.18.x has an issue with DAX, which is not yet fixed upstream:
```
[    0.737679] EXT4-fs (pmem0p1): mounted filesystem 79676804-7c8b-491a-b2a6-9bae3c72af70 ro with ordered data mode. Quota mode: disabled.
[    0.737891] VFS: Mounted root (ext4 filesystem) readonly on device 259:1.
[    0.739119] devtmpfs: mounted
[    0.739476] Freeing unused kernel memory: 1920K
[    0.740156] Run /sbin/init as init process
[    0.740229]   with arguments:
[    0.740286]     /sbin/init
[    0.740321]   with environment:
[    0.740369]     HOME=/
[    0.740400]     TERM=linux
[    0.743162] Unable to handle kernel paging request at virtual address fffffdffbf000008
[    0.743285] Mem abort info:
[    0.743316]   ESR = 0x0000000096000006
[    0.743371]   EC = 0x25: DABT (current EL), IL = 32 bits
[    0.743444]   SET = 0, FnV = 0
[    0.743489]   EA = 0, S1PTW = 0
[    0.743545]   FSC = 0x06: level 2 translation fault
[    0.743610] Data abort info:
[    0.743656]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[    0.743720]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[    0.743785]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[    0.743848] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000b9d17000
[    0.743931] [fffffdffbf000008] pgd=10000000bfa3d403, p4d=10000000bfa3d403, pud=1000000040bfe403, pmd=0000000000000000
[    0.744070] Internal error: Oops: 0000000096000006 [#1]  SMP
[    0.748888] CPU: 0 UID: 0 PID: 1 Comm: init Not tainted 6.18.4 #1 NONE
[    0.749421] pstate: 004000c5 (nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.749969] pc : dax_disassociate_entry.constprop.0+0x20/0x50
[    0.750444] lr : dax_insert_entry+0xcc/0x408
[    0.750802] sp : ffff80008000b9e0
[    0.751083] x29: ffff80008000b9e0 x28: 0000000000000000 x27: 0000000000000000
[    0.751682] x26: 0000000001963d01 x25: ffff0000004f7d90 x24: 0000000000000000
[    0.752264] x23: 0000000000000000 x22: ffff80008000bcc8 x21: 0000000000000011
[    0.752836] x20: ffff80008000ba90 x19: 0000000001963d01 x18: 0000000000000000
[    0.753407] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[    0.753970] x14: ffffbf3154b9ae70 x13: 0000000000000000 x12: ffffbf3154b9ae70
[    0.754548] x11: ffffffffffffffff x10: 0000000000000000 x9 : 0000000000000000
[    0.755122] x8 : 000000000000000d x7 : 000000000000001f x6 : 0000000000000000
[    0.755707] x5 : 0000000000000000 x4 : 0000000000000000 x3 : fffffdffc0000000
[    0.756287] x2 : 0000000000000008 x1 : 0000000040000000 x0 : fffffdffbf000000
[    0.756871] Call trace:
[    0.757107]  dax_disassociate_entry.constprop.0+0x20/0x50 (P)
[    0.757592]  dax_iomap_pte_fault+0x4fc/0x808
[    0.757951]  dax_iomap_fault+0x28/0x30
[    0.758258]  ext4_dax_huge_fault+0x80/0x2dc
[    0.758594]  ext4_dax_fault+0x10/0x3c
[    0.758892]  __do_fault+0x38/0x12c
[    0.759175]  __handle_mm_fault+0x530/0xcf0
[    0.759518]  handle_mm_fault+0xe4/0x230
[    0.759833]  do_page_fault+0x17c/0x4dc
[    0.760144]  do_translation_fault+0x30/0x38
[    0.760483]  do_mem_abort+0x40/0x8c
[    0.760771]  el0_ia+0x4c/0x170
[    0.761032]  el0t_64_sync_handler+0xd8/0xdc
[    0.761371]  el0t_64_sync+0x168/0x16c
[    0.761677] Code: f9453021 f2dfbfe3 cb813080 8b001860 (f9400401)
[    0.762168] ---[ end trace 0000000000000000 ]---
[    0.762550] note: init[1] exited with irqs disabled
[    0.762631] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
```

For now, we limit the rootfs that we ship to ARM64 to not use DAX, in
the future we'll re-enable it as soon as the patch lands on mainstream
kernel.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2026-01-14 11:46:40 +01:00
Fabiano Fidêncio
923f97bc66 rootfs: Temporarily revert "gpu: Handle root_hash.txt correctly"
This reverts commit e4a13b9a4a, as it
caused some issues with the GPU workflows.

Reverting it is better, as it unblocks other PRs.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-12-05 11:47:37 +01:00
Zvonko Kaiser
e4a13b9a4a gpu: Handle root_hash.txt correctly
Updates to the shim-v2 build and the binaries.sh script.
Makeing sure that both variants "confidential" AND
"nvidia-gpu-confidential" are handled.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-12-02 19:56:19 +01:00
Fabiano Fidêncio
776e08dbba build: Add nvidia image rootfs builds
So far we've only been building the initrd for the nvidia rootfs.
However, we're also interested on having the image beind used for a few
use-cases.

Signed-off-by: Fabiano Fidêncio <ffidencio@nvidia.com>
2025-11-27 22:46:07 +01:00
Dan Mihai
65385a5bf9 image: custom guest rootfs image file size alignment
The Guest rootfs image file size is aligned up to 128M boundary,
since commmit 2b0d5b2. This change allows users to use a custom
alignment value - e.g., to align up to 2M, users will be able to
specify IMAGE_SIZE_ALIGNMENT_MB=2 for image_builder.sh.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-06-02 16:15:17 +00:00
Dan Mihai
a49d0fb343 rootfs: delete systemd units/files from rootfs.sh
Move the deletion of unnecessary systemd units and files from
image_builder.sh into rootfs.sh.

The files being deleted can be applicable to other image file formats
too, not just to the rootfs-image format created by image_builder.sh.

Also, image_builder.sh was deleting these files *after* it calculated
the size of the rootfs files, thus missing out on the opportunity to
possibly create a smaller image file.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-01-13 21:28:23 +00:00
Gabriela Cervantes
4cd737d9fd image-builder: Remove unused variable
This PR removes an unused variable in the image builder script.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2024-10-04 15:56:28 +00:00
Zvonko Kaiser
a48c084e13 ci: remove sudo and make sure image is owed by user
The image build needs special handling since we're doing a lot of
privileged operations.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2024-06-03 15:29:06 +00:00
Biao Lu
b816dca3ed image-builder: fix incorrect part start position
The 'part_start' of image and dax_image should exactly specify the
same location, according to the parted documentation, to exactly
specify the location, the units of start and end should use MiB.

https://www.gnu.org/software/parted/manual/parted.html#IEC-binary-units

Fixes: #8435

Signed-off-by: Biao Lu <biao.lu@intel.com>
2023-12-04 17:20:26 +08:00
Manabu Sugimoto
211de08d9e osbuilder: Remove chcon operation for guest SELinux
Remove the `chcon` operation which adds `container_runtime_exec_t` label to
the `kata-agent` binary because the container-selinux package including
the 39f83cc74d
commit has been released officially.
Ref. https://centos.pkgs.org/9-stream/centos-appstream-x86_64/container-selinux-2.221.0-1.el9.noarch.rpm.html

The container-selinux package is installed in a guest rootfs when we create it with `SELinux = yes`,
and `restorecon` sets `container_runtime_exec_t` to the `kata-agent`.

Fixes: #7807

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2023-08-31 16:44:32 +09:00
Jianyong Wu
35d6d86ab5 static-build: enable cross-build for image build
It's too long a time to cross build agent based on docker buildx, thus
we cross build rootfs based on a container with cross compile toolchain
of gcc and rust with musl libc. Then we get fast build just like native
build.

rootfs initrd cross build is disabled as no cross compile tolchain for
rust with musl lib if found for alpine and based on docker buildx takes
too long a time.

Fixes: #6557
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-08-01 23:28:52 +02:00
Wang, Arron
5cb02a8067 image-build: generate root hash as an separate partition for rootfs
Generate rootfs hash data during creating the kata rootfs,
current kata image only have one partition, we add another
partition as hash device to save hash data of rootfs data blocks.

Fixes: #6674

Signed-off-by: Wang, Arron <arron.wang@intel.com>
2023-06-06 12:31:14 +02:00
yaoyinnan
bdf20b5d26 rootfs: support EROFS filesystem
For kata containers, rootfs is used in the read-only way.
EROFS can noticably decrease metadata overhead.

On the basis of supporting the EROFS file system, it supports using the config parameter to switch the file system used by rootfs.

Fixes: #6063

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
2023-02-11 00:44:13 +08:00
Manabu Sugimoto
a75f99d20d osbuilder: Create guest image for SELinux
Create a guest image to support SELinux for containers inside the guest
if `SELINUX=yes` is specified. This works only if the guest rootfs is
CentOS and the init service is systemd, not the agent init. To enable
labeling the guest image on the host, selinuxfs must be mounted on the
host. The kata-agent will be labeled as `container_runtime_exec_t` type.

Fixes: #4812

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-29 13:32:26 +09:00
James O. D. Hunt
5d6d39be48 scripts: Change here document delimiters
Fix the outstanding scripts using non standard shell here document delimiters.

This should have been caught by
https://github.com/kata-containers/tests/pull/3937, but there is a bug
in the checker which is fixed on
https://github.com/kata-containers/tests/pull/4569.

Fixes: #3864.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-03-10 09:23:37 +00:00
bin
3f7cf7ae67 osbuilder: show usage if no options/arguments specified
Now if no options/arguments specified, the shell scripts will return an error:

ERROR: Invalid rootfs directory: ''

This commit will show usage if no options/arguments specified.

Fixes: #3256

Signed-off-by: bin <bin@hyper.sh>
2021-12-13 16:10:55 +08:00
Bin Liu
978b13c9e8 Merge pull request #3235 from Kvasscn/kata_dev_image_builer_help
image_build: add help info for '-f' option and 'BLOCK_SIZE' env.
2021-12-09 22:55:24 +08:00
Snir Sheriber
2ebaaac73d osbuilder: be runtime consistent also with podman build
Use the same runtime used for podman run also for the podman build cmd
Additionally remove "docker" from the docker_run_args variable

Fixes: #3239
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
2021-12-09 11:28:16 +02:00
zhanghj
6b3e4c212c image_build: add help info for '-f' option and 'BLOCK_SIZE' env.
The help information of '-f' option is missing, and same issue
with 'BLOCK_SIZE' env variables, fix it in usage() function.

Fixes: #3231

Signed-off-by: zhanghj <zhanghj.lc@inspur.com>
2021-12-08 17:33:07 +08:00
Binbin Zhang
8ee67aae4f osbuilder: fix missing cpio package when building rootfs-initrd image
1. install cpio package before building rootfs-initrd image
2. add `pipefaili;errexit` check to the scripts

Fixes: #3144

Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
2021-11-29 23:42:44 +08:00
Yujia Qiao
bfcee91164 osbuilder: fix inconsistent calculation of fs size
This patch fixes inconsistent calculations of the rootfs size.
For `du` and `df`, `-B 1MB` is different from `-BM`. The
former is the power of 1000, and the latter is the power of
1024. So comparing them doesn't make sense. The bug may result
in a larger image than needed.

Fixes: #2560

Signed-off-by: Yujia Qiao <rapiz3142@gmail.com>
2021-09-02 16:00:29 +08:00
Jianyong Wu
2b0d5b252e image_build: align image size to 128M for arm64
There is an inconformity between qemu and kernel of memory alignment
check in memory hotplug. Both of qemu and kernel will do the start
address alignment check in memory hotplug. But it's 2M in qemu
while 128M in kernel. It leads to an issue when memory hotplug.

Currently, the kata image is a nvdimm device, which will plug into the VM as
a dimm. If another dimm is pluged, it will reside on top of that nvdimm.
So, the start address of the second dimm may not pass the alginment
check in kernel if the nvdimm size doesn't align with 128M.

There are 3 ways to address this issue I think:
1. fix the alignment size in kernel according to qemu. I think people
in linux kernel community will not accept it.
2. do alignment check in qemu and force the start address of hotplug
in alignment with 128M, which means there maybe holes between memory blocks.
3. obey the rule in user end, which means fix it in kata.

I think the second one is the best, but I can't do that for some reason.
Thus, the last one is the choice here.

Fixes: #1769
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2021-05-03 10:44:30 +08:00
Eric Ernst
49bdbac606 osbuilder: Allow image registry to be customizable
Give the user chance to specify their own registry in event the default
provided are not accessible, desirable.

Fixes: #1393

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2021-02-17 16:49:58 -08:00
Eric Ernst
cb6d2f3c40 osbuilder: alphabetize fields
Let's go ahead and list the usage info / fields in alphabetical order!

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2021-02-10 12:39:10 -08:00
Wainer dos Santos Moschetta
1273e485d8 osbuilder: Fix urls to repositories
Changed the user-visible urls to point to the right Kata Containers
files/repositories.

Fixes #234

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2021-01-26 07:51:20 -05:00
Pavel Mores
2dfb8bc549 rootfs-builder: fix unbootable dracut-based initramfs on Fedora
This is a forward port of Kata 1.x PR's
https://github.com/kata-containers/osbuilder/pull/480 and
https://github.com/kata-containers/osbuilder/pull/494 .

Fixes #646

Signed-off-by: Pavel Mores <pmores@redhat.com>
2020-09-08 20:10:38 +02:00
Julio Montes
f7ff6d3297 image-builder: disable reflink
Disable reflink when using DAX. Reflink is a xfs feature that cannot be
used together with DAX.

fixes kata-containers/osbuilder#456
fixes #577

Signed-off-by: Julio Montes <julio.montes@intel.com>
2020-08-26 09:42:17 -05:00
Salvador Fuentes
715d342519 osbuilder: move code into tools directory
move all osbuilder files into `tools` directory to be able
to merge this into kata-containers repo.

Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
2020-04-29 16:45:00 -05:00