Commit Graph

16527 Commits

Author SHA1 Message Date
Aurélien Bombo
ceb348ad98 gha: Set Zizmor check as non-required
As a consequence of moving away from Advanced Security for Zizmor, it now
checks the entire codebase and will error out on this PR and future ones.

To be reverted once we address all Zizmor findings in a future PR.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-10-21 14:02:54 -05:00
Aurélien Bombo
1dd4e20f25 gha: Run Zizmor without Advanced Security
This does not change the security of the analysis; it only works
around zizmorcore/zizmor-action#43.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-10-21 14:02:54 -05:00
stevenhorsman
7dd298a0aa workflows: Set top-level permissions to empty
The default suggestion for top-level permissions was
`contents: read`, but scorecard notes anything other than empty,
so try updating it and see if there are any issues. I think it's
only needed if we run workflows from other repos.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-10-21 14:02:54 -05:00
stevenhorsman
630f568f5b workflows: Tighten up workflow permissions
Since the previous tightening, a few workflow updates have
gone in that the zizmor job isn't flagging as issues.
Address them to remove potential attack vectors.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-10-21 14:02:54 -05:00
Manuel Huber
1630a6e98f runtime: clh: Use msft/v41.0.139 API YAML
Replace the API definitions file with the proper
definitions from Microsoft's cloud-hypervisor fork

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2025-09-16 10:34:21 -07:00
Aurélien Bombo
195bc72f62 ci: Run Zizmor on pushes to any branch
This runs Zizmor on pushes to any branch, not just main.

This is useful for:

 1. Testing changes in feature branches with the manually-triggered CI.
 2. Forked repos that may use a different name than "main" for their
    default branch.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-09-15 13:08:16 -05:00
Aurélien Bombo
93a9038387 ci: security: Use pull_request instead of pull_request_target
Background:

 * `pull_request` runs on the PR branch code and has access to secrets
   ONLY if the PR is from microsoft/kata-containers (i.e. NOT from an external
   contributor who forked the repo).
 * `pull_request_target` runs on the trusted main branch code by default
   and has access to secrets for any PR.

Reference: https://docs.github.com/en/actions/reference/workflows-and-actions/events-that-trigger-workflows#pull_request

Upstream uses `pull_request_target` (and manually checks out the PR code)
to have access to secrets for PRs from external contributors, however we
don't expect external PRs, hence we can use `pull_request`.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-09-11 08:45:13 -05:00
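The secrets rule described in the commit above can be summarized as a small decision function. This is only a sketch of the rule as the message states it, not GitHub's implementation; `secrets_available` and its parameters are hypothetical names chosen for illustration.

```python
def secrets_available(event: str, pr_from_fork: bool) -> bool:
    """Return True if a workflow run for a PR can read repository secrets.

    Models only the two events discussed in the commit message.
    """
    if event == "pull_request_target":
        # Runs on the trusted base branch code; secrets are available
        # for any PR, including PRs from forks.
        return True
    if event == "pull_request":
        # Runs on the PR branch code; secrets are available only when
        # the PR comes from the repository itself, not from a fork.
        return not pr_from_fork
    raise ValueError(f"unsupported event: {event}")
```

Under this model, switching to `pull_request` only loses secrets for fork PRs, which the commit says are not expected here.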
Aurélien Bombo
5200a8bb95 ci: security: Fix "commit hash does not point to a Git tag"
This fixes all such issues, i.e.:

https://github.com/kata-containers/kata-containers/security/code-scanning/459
https://github.com/kata-containers/kata-containers/security/code-scanning/508
https://github.com/kata-containers/kata-containers/security/code-scanning/510

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-09-11 08:45:13 -05:00
Aurélien Bombo
48a55ce560 security: gha: Use Zizmor's auditor mode
This is the strictest possible setting for Zizmor.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-09-11 08:45:13 -05:00
Saul Paredes
0e929d100b network: preseed default-gateway neighbor
This change mirrors host networking into the guest as before, but now also
includes the default gateway neighbor entry for each interface.

Pods using overlay/synthetic gateways (e.g., 169.254.1.1) can hit a
first-connect race while the guest performs the initial ARP. Preseeding the
gateway neighbor removes that latency and makes early connections (e.g.,
to the API Service) deterministic.

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2025-08-25 11:17:53 -07:00
Aurélien Bombo
6a12c290ef ci: static-checks: Don't hardcode default repo branch
This would cause weird issues for downstreams whose default branch is not
"main".

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-08-15 11:32:13 -05:00
Cameron Baird
aea2a9bbd0 runtime: Set disable_image_nvdimm=true to disable pmem
Re-add DEFDISABLEIMAGENVDIMM=true to package_build.sh to fix a
regression causing us to use pmem.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2025-08-14 10:29:31 -07:00
Cameron Baird
317ebb81de runtime: Enforce that OCI memory limit exceeds 128MB baseline
For our Kata UVM, we know we need at least 128MB of memory to prevent instability in the guest.

Enforce this constraint with a descriptive error to prevent users from destabilizing the UVM with faulty k8s configurations.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>
2025-08-12 09:59:34 -07:00
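The 128MB check described in the commit above could look roughly like the following. This is a sketch, not the runtime's actual code: the function name, the treatment of an unset limit, and the decision to accept exactly 128MB ("at least 128MB") are all assumptions.

```python
MIN_MEMORY_MB = 128  # baseline named in the commit message


def check_memory_limit(limit_mb):
    """Raise ValueError if an explicit memory limit is below the baseline.

    limit_mb is None when no memory limit is set (assumption: an unset
    limit is not rejected by this check).
    """
    if limit_mb is None:
        return
    if limit_mb < MIN_MEMORY_MB:
        raise ValueError(
            f"memory limit {limit_mb}MB is below the {MIN_MEMORY_MB}MB "
            "minimum required for a stable guest"
        )
```

Failing early with a descriptive message is the point of the commit: the user sees why the configuration is rejected instead of debugging an unstable UVM.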
Aurélien Bombo
f58fd1a726 to-squash: github: Remove invalid link from PR template
This should be squashed into d1eb0ac.

This is to avoid the following static-checks error:

2025-08-05T21:39:49.8540588Z .github/pull_request_template.md
2025-08-05T21:39:49.8570049Z ERROR: Invalid URL 'https://nvd.nist.gov/vuln/detail/CVE-YYYY-XXXX' found in the following files:

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-08-08 18:10:25 -05:00
Aurélien Bombo
0350f72af5 to-squash: node-builder: add reference to README.md
This is needed to avoid the following static-checks error:

2025-08-05T21:27:20.0028337Z [static-checks.sh:808] ERROR: Document tools/osbuilder/node-builder/azure-linux/README.md is not referenced

This commit is to be squashed into the node-builder commit.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-08-08 18:10:25 -05:00
Aurélien Bombo
0f5856171c ci: static-checks: add SECURITY.md to exclude list
This adds SECURITY.md to the list of GH-native files that should be excluded by
the reference checker.

Today this is useful for downstreams who already have a SECURITY.md file for
compliance reasons. When Kata onboards that file, this commit will also be
required.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-08-08 18:10:25 -05:00
Aurélien Bombo
a32ac2ba41 ci: static-checks: Auto-detect repo by default
This auto-detects the repo by default (instead of having to specify
KATA_DEV_MODE=true) so that forked repos can leverage the static-checks.yaml CI
check without modification.

An alternative would have been to pass the repo in static-checks.yaml. However,
because of the matrix, this would've changed the check name, which is a pain to
handle in either the gatekeeper or the GH UI.

Example fork failure:
https://github.com/microsoft/kata-containers/actions/runs/16656407213/job/47142421739#step:8:75

I've tested this change to work in a fork.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-08-08 18:10:25 -05:00
Manuel Huber
7e786223d4 docs: node-builder: Remove references to moby-containerd-cc
As we adopted containerd2, we remove references to our prior
forked containerd version.

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2025-08-07 16:29:01 -07:00
Aurélien Bombo
c4e130369a runtime: fix make test
This addresses the following errors from `make test` so that we can require
that upstream CI check:

https://github.com/microsoft/kata-containers/actions/runs/16656407213/job/47142422035?pr=392#step:13:53

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-08-01 14:32:46 -05:00
Aurélien Bombo
8b7154cc8a docs: node-builder: fix static check error
This fixes the below static check error to follow up on the infra fix from
kata-containers/kata-containers#11646:

2025-07-31T19:32:45.0031829Z time="2025-07-31T19:32:44.990004665Z" level=fatal msg="found 2 parse errors:\nfile=\"tools/osbuilder/node-builder/azure-linux/README.md\": duplicate heading: \"Set up environment\" (heading: {Name:Set up environment MDName:Set up environment LinkName:set-up-environment Level:2})\nfile=\"tools/osbuilder/node-builder/azure-linux/README.md\": duplicate heading: \"Install build dependencies\" (heading: {Name:Install build dependencies MDName:Install build dependencies LinkName:install-build-dependencies Level:2})" commit=1d17f56b1aa7a880468b8e25d14467c92dca8eeb name=kata-check-markdown pid=9075 source=check-markdown version=0.0.1

Note: that is likely flagged because having two headings with the same
name, even under different sections, makes it impossible to create a
canonical heading link in Markdown.

This should eventually be squashed into the node-builder commit.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-08-01 13:24:10 -05:00
Aurélien Bombo
4e5f4f3235 node-builder: fix typo in string comparison
This also fixes a shellcheck error and lets us require the
shellcheck-required job:

In ./tools/osbuilder/node-builder/azure-linux/uvm_build.sh line 34:
        if [ -z "${UVM_KERNEL_HEADER_DIR}}" ]; then
                                         ^-- SC2157 (error): Argument to -z is always false due to literal strings.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-07-31 13:13:30 -05:00
Manuel Huber
78a191d779 Revert "runtime: fix error when using the debug console"
This reverts commit 3cfdd53a88.
2025-07-31 10:10:32 -07:00
ms-mahuber
7eda0e8cf4 ci: re-add codeql.yml with proper branches
Enable advanced CodeQL logic by re-adding upstream's
codeql.yml, with branch specifications as the only
modification. This should align fork and upstream
CodeQL task logic.

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2025-07-30 15:11:32 -07:00
Manuel Huber
23eb5c982f ci: Delete codeql.yml
The file is currently being ignored as the CodeQL analysis task
is configured as 'default'. In order to configure this as an
'advanced' task, one needs to push a CodeQL file. However, we
cannot push as this file already exists. As we don't want to
change the file's path, I am temporarily removing this file.

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2025-07-30 14:57:09 -07:00
Sumedh Alok Sharma
5586d27bd7 runtime: relax timeout for CreateVM + BootVM in CLH
This commit brings in the changes from upstream PR 9153,
which relax the timeout for calling CLH's CreateVM+BootVM
APIs. It also increases the timeout to 100s to
handle guest boot with large memory requests.

Signed-off-by: Sumedh Alok Sharma <sumsharma@microsoft.com>
2025-07-24 19:39:35 +00:00
Manuel Huber
cf7d65a6a2 runtime: clh: update cloud-hypervisor API reference
- Change Makefile to point to fork
- Change versions.yaml to point to proper version on fork
- Do not regenerate the binding - the current definitions are invalid
- Definitions will be fixed with upcoming versions such as v41.0.120

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2025-07-24 19:39:35 +00:00
Dan Mihai
baf6963ab0 node-builder: 2MB-aligned guest image size
Build the mariner guest image using IMAGE_SIZE_ALIGNMENT_MB=2.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-07-24 19:39:35 +00:00
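Aligning an image size to a 2MB boundary amounts to rounding up to the next multiple of the alignment. The sketch below shows that generic arithmetic only; how the image builder actually applies IMAGE_SIZE_ALIGNMENT_MB is not shown in the commit, and `align_up` is a hypothetical helper.

```python
def align_up(size_mb: int, alignment_mb: int = 2) -> int:
    """Round size_mb up to the nearest multiple of alignment_mb."""
    if alignment_mb <= 0:
        raise ValueError("alignment must be positive")
    if size_mb < 0:
        raise ValueError("size must be non-negative")
    # Ceiling division using integer arithmetic only.
    return -(-size_mb // alignment_mb) * alignment_mb
```

For example, a 255MB image would be padded to 256MB, while a 256MB image stays unchanged.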
Manuel Huber
9af9844bc7 runtime: Allocate default workload vcpus
- similar to the static_sandbox_default_workload_mem option,
  assign a default number of vcpus to the VM when no limits
  are given, 1 vcpu in this case
- similar to commit c7b8ee9, do not allocate additional vcpus
  when limits are provided

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2025-07-24 19:39:35 +00:00
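One reading of the two vcpu rules in the commit above is sketched below. The names are hypothetical, and the sketch covers only the workload portion of the vcpu count; it says nothing about any base vcpus the VM may already have.

```python
DEFAULT_WORKLOAD_VCPUS = 1  # default named in the commit message


def workload_vcpus(limit_vcpus: int) -> int:
    """Return the vcpus assigned for the workload.

    limit_vcpus is the sum of the containers' CPU limits, with 0
    meaning that no limits were given (assumption).
    """
    if limit_vcpus < 0:
        raise ValueError("vcpu limit cannot be negative")
    if limit_vcpus == 0:
        # No limits given: fall back to the default.
        return DEFAULT_WORKLOAD_VCPUS
    # Limits given: use them as-is, without adding the default on top.
    return limit_vcpus
```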
Dan Mihai
0ec34036bb runtime: improved memory overhead management
After these changes:

1. The value of the K8s runtime class memory overhead:
   - Covers the memory usage from all the Host-side components (mainly
     the Kata Shim and the VMM).
   - Doesn't include the memory usage from any Guest-side components.

2. The value of a pod memory limit specified by the user:
   - Is equal to the memory size of the Pod VM.
   - Includes the memory usage from all the Guest-side components
     (mainly user's workload, the Guest kernel, and the Kata Agent)
   - Doesn't include the memory usage from any Host-side components.

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
2025-07-24 19:39:35 +00:00
Manuel Huber
7bf23ac958 tools: Add initial igvm-builder and node-builder/azure-linux scripting
This branch starts introducing additional scripting to build, deploy
and evaluate the components used in AKS' Pod Sandboxing and
Confidential Containers preview features. This includes the capability
to build the IGVM file and its reference measurement file for remote
attestation.

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

tools: Improve igvm-builder and node-builder/azure-linux scripting

- Support for Mariner 3 builds using OS_VERSION variable
- Improvements to IGVM build process and flow as described in README
- Adoption of using only cloud-hypervisor-cvm on CBL-Mariner

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

tools: Add package-tools-install functionality

- Add script to install kata-containers(-cc)-tools bits
- Minor improvements in README.md
- Minor fix in package_install
- Remove echo outputs in package_build

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

tools: Enable setting IGVM SVN

- Allow setting SVN parameter for IGVM build scripting

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: introduce BUILD_TYPE variable

This lets developers build and deploy Kata in debug mode without having to make
manual edits to the build scripts.

With BUILD_TYPE=debug (default is release):

 * The agent is built in debug mode.
 * The agent is built with a permissive policy (using allow-all.rego).
 * The shim debug config file is used, i.e. we create the symlink
   configuration-clh-snp-debug.toml <- configuration-clh-snp.toml.

For example, building and deploying Kata-CC in debug mode is now as simple as:

   make BUILD_TYPE=debug all-confpods deploy-confpods

Also do note that make still lets you override the other variables even after
setting BUILD_TYPE. For example, you can use the production shim config with
BUILD_TYPE=debug:

   make BUILD_TYPE=debug SHIM_USE_DEBUG_CONFIG=no all-confpods deploy-confpods

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
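
The way BUILD_TYPE sets defaults that explicit variables can still override can be sketched as follows. Only BUILD_TYPE and SHIM_USE_DEBUG_CONFIG are names from the message above; AGENT_DEBUG_BUILD and AGENT_PERMISSIVE_POLICY are hypothetical stand-ins for the agent build mode and policy choice, and the real logic lives in the Makefile and build scripts.

```python
def resolve_build_settings(build_type="release", overrides=None):
    """Derive build settings from BUILD_TYPE, then apply explicit overrides."""
    if build_type not in ("release", "debug"):
        raise ValueError(f"unknown BUILD_TYPE: {build_type}")
    flag = "yes" if build_type == "debug" else "no"
    settings = {
        "AGENT_DEBUG_BUILD": flag,        # hypothetical variable name
        "AGENT_PERMISSIVE_POLICY": flag,  # hypothetical variable name
        "SHIM_USE_DEBUG_CONFIG": flag,
    }
    # Variables set explicitly on the command line win over the
    # defaults implied by BUILD_TYPE.
    settings.update(overrides or {})
    return settings
```

Calling `resolve_build_settings("debug", {"SHIM_USE_DEBUG_CONFIG": "no"})` mirrors the second make invocation above: a debug agent with the production shim config.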

node-builder: introduce SHIM_REDEPLOY_CONFIG

See README: when SHIM_REDEPLOY_CONFIG=no, the shim configuration is NOT
redeployed, so that potential config changes made directly on the host
during development aren't lost.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>

node-builder: Use img for Pod Sandboxing

Switch from UVM initrd to image format

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Adapt README instructions

- Sanitize containerd config snippet
- Set podOverhead for Kata runtime class

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

tools: Adapt AGENT_POLICY_FILE path

- Adapt path in uvm_build.sh script to comply
  with the upstream changes we pulled in

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Use Azure Linux 3 as default path

- update recipe and node-builder scripting
- change default value on rootfs-builder

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Deploy-only for AzL3 VMs

- split deployment sections in node-builder README.md
- install jq, curl dependencies within IGVM script
- add path parameter to UVM install script

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Minor updates to README.md

- no longer install the make package, as it is part of the meta package
- remove superfluous popd
- add note on permissive policy for ConfPods UVM builds

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Updates to README.md

- with the latest 3.2.0.azl4 package on PMC, can remove OS_VERSION parameter
  and use the make deploy calls instead of copying files by hand for variant
  I (now aligned with Variant II)
- with the latest changes on msft-main, set the podOverhead to 600Mi

Signed-off-by: Manuel Huber <mahuber@microsoft.com>

node-builder: Fix SHIM_USE_DEBUG_CONFIG behavior

Using a symlink would create a cycle after calling this script again when
copying the final configuration at line 74 so we just use cp instead.

Also, I moved this block to the end of the file to properly override the final
config file.

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>

node-builder: Build and install debug configuration for pod sandboxing

For ease of debugging, install a configuration-clh-debug.toml for pod
sandboxing as we do in Conf pods.

Signed-off-by: Cameron Baird <cameronbaird@microsoft.com>

runtime: remove clh-snp config file usage in makefile

Not needed to build vanilla kata

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>

package_tools_install.sh: include nsdax.gpl.c

Include nsdax.gpl.c

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2025-07-24 19:39:35 +00:00
Daniel Mihai
3cfdd53a88 runtime: fix error when using the debug console
This fixes the below error when attempting to access the debug console when
debug_console_enabled=true and all 3 enable_debug options are true:

level=error msg="error create pseudo tty" error="open /dev/ptmx: operation not
permitted"

Signed-off-by: Aurelien Bombo <abombo@microsoft.com>
2025-07-24 19:39:35 +00:00
Christopher Co
7ce4a34ce2 github: copy CODEOWNERS from cc-msft-prototypes
This adds our team as reviewers for PRs automatically again.

Signed-off-by: Chris Co <chrco@microsoft.com>
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
2025-07-24 19:39:35 +00:00
ms-mahuber
d1eb0ac37e docs: add pull_request_template.md
Add pull_request_template.md

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2025-07-24 19:39:35 +00:00
microsoft-github-policy-service[bot]
05f705fe94 docs: add Microsoft mandatory file
Add Microsoft mandatory file SECURITY.md

Signed-off-by: Saul Paredes <saulparedes@microsoft.com>
2025-07-24 19:39:35 +00:00
Aurélien Bombo
8815954587 runtime: Resolve high UVM memory footprint
Bug: https://microsoft.visualstudio.com/OS/_workitems/edit/43668151

Rationale: This is a temporary solution for optimizing memory usage for
the current mechanism of requesting resources through pod Limit
annotations:
- if no Limits are specified and hence WorkloadMemMB is 0, set a default
  value 'StaticWorkloadDefaultMem' to allocate a default amount of
  memory for use for containers in the sandbox in addition to the base
  memory
- if Limits are specified, the base memory and the sum of Limits are
  allocated. The end user needs to be aware of the minimum memory
  requirements for their pods, otherwise the pod will be stuck in the
  ContainerCreating state

Testing: Manual testing, creating pods with Limits and without limits,
and with two containers where each container has a limit, tested with
integration in a SPEC file where the config variables were set via
environment variables via the make command

Adapted by @mfrw from 3.1.0 to apply to 3.2.0

Signed-off-by: Muhammad Falak R Wani <mwani@microsoft.com>
Signed-off-by: Manuel Huber <mahuber@microsoft.com>
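
The sizing rule described in the rationale above can be sketched as a single function. The names are illustrative (only WorkloadMemMB and StaticWorkloadDefaultMem appear in the message), and this is a sketch of the rule as stated, not the runtime's code.

```python
def sandbox_memory_mb(base_mb, container_limits_mb, static_workload_default_mb):
    """Return the memory to allocate for the sandbox VM, in MB.

    container_limits_mb holds one entry per container that sets a
    memory limit; an empty list means no limits were specified.
    """
    workload_mb = sum(container_limits_mb)
    if workload_mb == 0:
        # No limits specified: use the static default for the workload.
        workload_mb = static_workload_default_mb
    # The base memory is always allocated in addition to the workload.
    return base_mb + workload_mb
```

With two containers that each set a limit, the VM gets the base memory plus both limits, which is why limits that are too small can leave a pod stuck in ContainerCreating.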

runtime: Remove unused VMM options for mem alloc

- We only ever tested these fork changes with CLH+MSHV
- Remove these options as we don't use QEMU/FC

Signed-off-by: Manuel Huber <mahuber@microsoft.com>
2025-07-24 19:39:35 +00:00
Fabiano Fidêncio
acae4480ac Merge pull request #11604 from fidencio/release/3.19.1
release: Bump version to 3.19.1
3.19.1
2025-07-22 09:00:15 +02:00
Fabiano Fidêncio
0220b4d661 release: Bump version to 3.19.1
A few moderate security vulnerability fixes were missed as part
of the 3.19.0 release.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-21 20:09:21 +02:00
Steve Horsman
09efcfbd86 Merge pull request #11606 from kata-containers/dependabot/cargo/src/tools/genpolicy/zerocopy-0.6.6
build(deps): bump zerocopy from 0.6.1 to 0.6.6 in /src/tools/genpolicy
2025-07-21 18:58:56 +01:00
Steve Horsman
9f04d8e121 Merge pull request #11605 from kata-containers/dependabot/cargo/src/tools/kata-ctl/unsafe-libyaml-0.2.11
build(deps): bump unsafe-libyaml from 0.2.9 to 0.2.11 in /src/tools/kata-ctl
2025-07-21 18:50:01 +01:00
dependabot[bot]
a9c8377073 build(deps): bump zerocopy from 0.6.1 to 0.6.6 in /src/tools/genpolicy
---
updated-dependencies:
- dependency-name: zerocopy
  dependency-version: 0.6.6
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-21 12:50:38 +00:00
dependabot[bot]
0b4c434ece build(deps): bump unsafe-libyaml in /src/tools/kata-ctl
Bumps [unsafe-libyaml](https://github.com/dtolnay/unsafe-libyaml) from 0.2.9 to 0.2.11.
- [Release notes](https://github.com/dtolnay/unsafe-libyaml/releases)
- [Commits](https://github.com/dtolnay/unsafe-libyaml/compare/0.2.9...0.2.11)

---
updated-dependencies:
- dependency-name: unsafe-libyaml
  dependency-version: 0.2.11
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-21 12:46:27 +00:00
Fabiano Fidêncio
35629d0690 Merge pull request #11603 from stevenhorsman/security-updates-21-jul
dependencies: More crate bumps to resolve security issues
2025-07-21 14:33:07 +02:00
stevenhorsman
162ba19b85 agent-ctl: Bump rustls
Bump rustls to >=0.23.18 to remediate RUSTSEC-2024-0399

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-21 10:41:59 +01:00
stevenhorsman
42339e9cdf dragonball: Update url crate
Update url to 2.5.4 to bump idna to 1.0.3 and remediate
RUSTSEC-2024-0421

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-21 10:35:05 +01:00
stevenhorsman
1795361589 runk: Update rustjail
Update the rustjail crate to pull in the latest security fixes

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-21 10:31:18 +01:00
stevenhorsman
28929f5b3e runtime: Bump prometheus
Bump this crate to remove the old version of protobuf
and remediate RUSTSEC-2024-0437

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-21 10:29:57 +01:00
stevenhorsman
e66aa1ef8c runtime: Bump prometheus and ttrpc-codegen
Bump these crates to remove the old version of protobuf
and remediate RUSTSEC-2024-0437

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2025-07-21 10:29:39 +01:00
Fabiano Fidêncio
d60513ece9 Merge pull request #11597 from kata-containers/topic/fix-release-static-tarball-content
release: Copy the VERSION file to the tarball
3.19.0
2025-07-20 21:06:40 +02:00
Fabiano Fidêncio
55aae75ed7 shellcheck: Fix issues on kata-deploy-merge-builds.sh
As we're already touching the file, let's get those fixed.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-20 09:33:50 +02:00
Fabiano Fidêncio
aaeb3b3221 release: Copy the VERSION file to the tarball
For the release itself, let's simply copy the VERSION file to the
tarball.

To do so, we had to change the logic that merges the build, as at that
point the tag is not yet pushed to the repo.

Signed-off-by: Fabiano Fidêncio <fidencio@northflank.com>
2025-07-20 00:06:14 +02:00