v0.7.0 of guest-components has been released, so let's
use the new tag for the attestation-agent
Fixes: #7400
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Allow triggering this action manually, since we noticed it being skipped on
push exactly when we needed it the most.
Fixes: #7353
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
We need to do proper sandbox sizing when we're doing cold-plug introduce CDI,
the de-facto standard for enabling devices in containers. containerd
will pass-through annotations for accumulated CPU,Memory and now CDI
devices. With that information sandbox sizing can be derived correctly.
Fixes: #7331
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
The k8s.gcr.io is deprecated for a while now and has been redirected to
registry.k8s.io. However on some bare-metal machines in our testing
pools that redirection is not working, so let's just replace the
registries.
Fixes#6461
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Update image-rs, which is part of the guest-components repo, to the commit that
will become the v0.7.0 tag.
Fixes: #7353
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
DEFSERVICEOFFLOAD controls whether images are pulled inside
the guest. This should always be set for CoCo, not just
when we use MEASURED_ROOTFS.
Fixes: #7350
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Ignore cyclomatic complexity failure. I have fixed this in my PR waiting
to forward port remote-hypervisor support into main
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Bump kernel version to reflect that they are changes
- We've some how gone out of sync with main, so just add a +
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This is a very nice suggestion from Steve Horsman, as with that we can
manually trigger the workflow anytime we need to test it, instead of
waiting for a full day for it to be retriggered via the `schedule`
event.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Passing the commit hash as the "pr-number" has shown problematic as it
would make the AKS cluster name longer than what's accepted by AKS.
One easy way to solve this is just passing "nightly" as the PR number,
as that's only used to create the cluster.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will help folks to monitor the history of the failing tests, as
we've done in Jenkins with the "Green Effort CI".
Fixes: #7279
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We should not go through the trouble of running all our tests on AKS /
Azure / baremetal machines in case a PR only changes our documentation.
Fixes: #7258
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We've noticed this caused regressions with the k8s-oom tests, and then
decided to take a step back and do this in the same way it was done
before 67972ec48a.
Moreover, this step back is also more reasonable in terms of the
controlling logic.
And by doing this we can re-enable the k8s-oom.bats tests, which is done
as part of this PR.
Fixes: #7271
Depends-on: github.com/kata-containers/tests#5705
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
As we need to pass down the commit sha to the jobs that will be
triggered from the `push` event, we must be careful on what exactly
we're using there.
At first we were using ${{ github.ref }}, but this turns out to be the
**branch name**, rather than the commit hash. In order to actually get
the commit hash, Let's use ${{ github.sha }} instead.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we need to pass down the commit sha to the jobs that will be
triggered from the `schedule` event, we must be careful on what exactly
we're using there.
At first we were using ${{ github.ref }}, but this turns out to be the
**branch name**, rather than the commit hash. In order to actually get
the commit hash, Let's use ${{ github.sha }} instead, as described by
https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's take the same approach of the go runtime, instead, and allocate
the maximum allowed number of vcpus instead.
Fixes: #7270
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 25d2fb0fde.
The reason we're reverting the commit is because it to check whether
it's the cause for the regression on devmapper tests.
Fixes: #7253
Depends-on: github.com/kata-containers/tests#5705
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
On cc3993d860 we introduced a regression,
where we started passing inputs.commit-hash, instead of
github.event.pull_request.head.sha. However, we have been setting
commit-hash to github.event.pull_request.sha, meaning that we're mssing
a `.head.` there.
github.event.pull_request.sha is empty for the pull_request_target
event, leading the CI to pull the content from `main` instead of the
content from the PR.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Remove the logic that made the kata-remote containerd config not support
io.katacontainers annotations
Fixes: #7265
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The latter workflow is breaking as it doesn't recognise ${GITHUB_REF},
the former would most likely break as well, but it didn't get triggered
yet.
The error we're facing is:
```
Determining the checkout info
/usr/bin/git branch --list --remote origin/${GITHUB_REF}
/usr/bin/git tag --list ${GITHUB_REF}
Error: A branch or tag with the name '${GITHUB_REF}' could not be found
```
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We have to do this, otherwise we cannot log into azure.
This is a regression introduced by
106e305717.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead of passing "-${{ inputs.tag }}-amd64", we must only pass
"-${{ inputs.tag }}".
This is a regression introduced by
106e305717.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
- Add annotation enablement for machine_type, default_memory and
default_vcpus
- Remove note that says that cpu and memory settings are ignored.
Fixes: #7256
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
de83cd9de7 tried to solve an issue, but it
clearly seems that I'm using env wrongly, as what ended up being passed
as input was "$VAR", instead of the content of the VAR variable.
As we can simply avoid using those here, let's do it and save us a
headache.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
It has to have steps declared, and we need to make it a dependency for
the nightly kata-containers-ci-on-push job.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Otherwise we'll get the following error from the workflow:
```
The workflow is not valid. .github/workflows/ci-on-push.yaml (Line: 24,
Col: 20): Unrecognized named-value: 'env'. Located at position 1 within
expression: env.COMMIT_HASH .github/workflows/ci-on-push.yaml (Line: 25,
Col: 18): Unrecognized named-value: 'env'. Located at position 1 within
expression: env.PR_NUMBER
```
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR fixes the call to check_metrics function as KATA_HYPERVISOR
is not needed to be passed.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
The idea is to mimic what's been done with Jenkins and the "Green CI"
effort, but now using our GHA and the GHA infrastructure.
Fixes: #7247
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure we run our tests in a specific namespace, as in case of
any kind of issue, we will just get rid of the namespace itself, which
will take care of cleaning up any leftover from failing tests.
One important thing to mention is why we can get rid of the `namespace:
${namespace}` on the tests that are already using it, and let's do it in
parts:
* namespace: default
We can easily get rid of this as that's the default namespace where
pods are created, so it was a no-op so far.
* namespace: test-quota-ns
My understanding is that we'd need this in order to get a clean
namespace where we'd be setting a quota for. Doing this in the
namespace that's only used for tests should **not** cause any
side-effect on the tests, as we're running those in serial and there's
no other pods running on the `kata-containers-k8s-tests` namespace
Last but not least, we're not dynamically creating namespaces as the
tests are not running in parallel, **never**, not in the case of having
2 tests being ran at same time, neither in the case of having 2 jobs
being scheduled to the same machine.
Fixes: #6864
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is based on the `ci-on-push.yaml` file, and it's called from ther
The reason to split on a new file is that we can easily introduce a
`ci-nightly.yaml` file and re-use the `ci.yaml` file there as well.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Those components are being built twice, one as part of the normal matrix
assets, and the second one as part of the include.
Fixes: #7235
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's ensure we're not relying, on any of the called workflows, on event
specific information.
Right now, the two information we've been relying on are:
* PR number, coming from github.event.pull_request.number
* Commit hash, coming from github.event.pull_request.head.sha
As we want to, in the future, add nightly jobs, which will be triggered
by a different event (thus, having different fields populated), we
should ensure that those are not used unless it's in the "top action"
that's trigerred by the event.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Use the 'function' keyword to prevent bash aliases from colliding
with other function's name.
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This PR enables storing metrics workflow artifacts in two
separated flavours: clh and qemu.
Fixes: #7239
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
TIL that when using "include" for a matrix we must duplicate the value
we're overriding for each element we want that to happen.
Fixes: #7235
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR adds the function name before the function to have uniformity
across all the test.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Since the measured rootfs work has been merged to main, and then
brought in to the CCv0 via the weekly merge, we have introduced a few
regressions related to how we build it / use it.
This PR attempts to make sure the artefacts are properly built, using
GitHub Actions, so the feature can be used with the operator.
Fixes: #7235
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
SNP's QEMU has changed its name some time ago and, due to that, we have
been leaving the new binary behind during the uninstall process, which
lead to the Operator hanging when uninstalling.
Fixes: #7233
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR adds blogbench and webtooling metrics checks to this repo.
The function running the test intentionally returns zero, so
the test will be enabled in another PR once the workflow is
green.
Fixes: #7069
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This PR usses double quotes in all the variables as well as general fixes
to the memory usage script in order to have uniformity.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Non AKS k8s tests (SEV/SNP/TDX) don't currently set KATA_HOST_OS, so provide a
default empty value for the variable so that those tests can run.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
as OSSKU value, to get rid of this warning when creating the AKS cluster:
WARNING: The osSKU "AzureLinux" should be used going forward instead of
"CBLMariner" or "Mariner". The osSKUs "CBLMariner" and "Mariner" will
eventually be deprecated.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
We only need to install in run_tests() so that the yq install is picked up by
kubernets/setup.sh as well. We also need to either use (sudo &&
INSTALL_IN_GOPATH=false) || (INSTALL_IN_GOPATH=true).
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This PR adds the checkmetrics ci worker file for cloud hypervisor in
order to check the boot times limit.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds double quotes in all variables to have uniformity across
all the gha-run.sh script.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds checkmetrics installation for gha-run.sh in order to compare
results limits as part of the metrics CI.
Fixes#7198
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
cargo/rust is installed in one step, we need to write the PATH update to
GITHUBENV so that it becomes visible in the next steps.
Fixes: #7221
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
This simply displays the asset name first in GH's UI, so that the
release name (always "test") is truncated rather than the asset name.
Makes things slightly easier to read.
e.g.
build-asset (cloud-hypervisor-glibc, te...
instead of
build-asset (test, cloud-hypervisor-gli...
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This allows setting `USE_CACHE=no` to test building e2e during
developmet without having to comment code blocks and so forth.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This enables building CLH with glibc and the mshv feature as required
for Mariner. At test time, it also configures Kata to use that CLH
flavor when running Mariner.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Mariner ships a bleeding-edge kernel that might be ahead of upstream, so
we use that to guarantee compatibility with the host.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
* Adds a new `rootfs-initrd-mariner` build target.
* Sets the custom initrd path via annotation in `setup.sh` at test
time.
* Adapts versions.yaml to specify a `cbl-mariner` initrd variant.
* Introduces env variable `HOST_OS` at deploy time to enable using a
custom initrd.
* Refactors the image builder so that its caller specifies the desired
guest OS.
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Currently, even when using devmapper, if the VMM supports virtio-fs /
virtio-9p, that's used to share a few files between the host and the
guest.
This *needed*, as we need to share with the guest contents like secrets,
certificates, and configurations, via Kubernetes objects like configMaps
or secrets, and those are rotated and must be updated into the guest
whenever the rotation happens.
However, there are still use-cases users can live with just copying
those files into the guest at the pod creation time, and for those
there's absolutely no need to have a shared filesystem process running
with no extra obvious benefit, consuming memory and even increasing the
attack surface used by Kata Containers.
For the case mentioned above, we should allow users, making it very
clear which limitations it'll bring, to run Kata Containers with
devmapper without actually having to use a shared file system, which is
already the approach taken when using Firecracker as the VMM.
Fixes: #7207
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR adds memory foot print metrics to tests/metrics/density
folder.
Intentionally, each test exits w/ zero in all test cases to ensure
that tests would be green when added, and will be enabled in a
subsequent PR.
A workflow matrix was added to define hypervisor variation on
each job, in order to run them sequentially.
The launch-times test was updated to make use of the matrix
environment variables.
Fixes: #7066
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
There is nothing in them that requires them to be macros. Converting
them to functions allows for better error messages.
Fixes: #7201
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
There is nothing in it that requires it to be a macro. Converting it to
a function allows for better error messages.
Fixes: #7201
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Having a function allows for better error messages from the type checker
and it makes it clearer to callers what can happen. For example:
is_allowed!(req);
Gives no indication that it may result in an early return, and no simple
way for callers to modify the behaviour. It also makes it look like
ownership of `req` is being transferred.
On the other hand,
is_allowed(&req)?;
Indicates that `req` is being borrowed (immutably) and may fail. The
question mark indicates that the caller wants an early return on
failure.
Fixes: #7201
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Since it is never modified, it doesn't really need a lock of any kind.
Removing the `RwLock` wrapper allows us to remove all `.read().await`
calls when accessing it.
Additionally, `AGENT_CONFIG` already has a static lifetime, so there is
no need to wrap it in a ref-counted heap allocation.
Fixes: #5409
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
This PR adds the word function before the function names in order to have
uniformity across the script as some are using this and some are not.
Fixes#7196
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Vfio support introduce build error on AArch64. Remove arch related
annotation can avoid this error.
Fixes: #7187
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
The failure mainly caused by the encoded volume path and
the mount/src. As the src will be validated with stat,but
it's not a full path and encoded, which causes the stat
mount source failed.
Fixes: #7186
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
This PR adds link to the unreference docs in the cmd path to make
them more discoverable.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds checkmetrics makefile which is used to process the
metrics json results files.
Fixes#7172
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds time tests documentation reference in the general README
for kata metrics.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Unlike the previous usage which requires creating
/dev/xxx by mknod on the host, the new approach will
fully utilize the DirectVolume-related usage method,
and pass the spdk controller to vmm.
And a user guide about using the spdk volume when run
a kata-containers. it can be found in docs/how-to.
Fixes: #6526
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
This PR adds the metrics documentation as a general reference in the
main README for kata containers.
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds the checkmetrics scripts that will be used for the kata metrics CI.
Fixes#7160
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
We have GH configured so that manual approval is required for CI runs
triggered by outside contributors. However, because CI is triggered by
the `pull_request_target` event, this setting isn't being honored
(see [1]). This means that an attacker could trivially extracts secrets
by submitting a PR.
This change aims to mititgate this issue by preventing PRs from
triggering CI unless the `ok-to-test` label is set.
Note: For further context, we use the `pull_request_target` event and
manually check out the PR branch because it is the only way to both
access secrets and test incoming code changes.
Fixes: #7163
[1]: https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
In order to support different pod VM instance type via
remote hypervisor implementation (cloud-api-adaptor),
we need to pass machine_type, default_vcpus
and default_memory annotations to cloud-api-adaptor.
The cloud-api-adaptor then uses these annotations to spin
up the appropriate cloud instance.
Reference PR for cloud-api-adaptor
https://github.com/confidential-containers/cloud-api-adaptor/pull/1088Fixes: #7140
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
When running on a VM, the kernel parameter "unrestricted_guest" for
kernel module "kvm_intel" is not required. So, return success when running
on a VM without checking value of this kernel parameter.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Implement functionality to add to the env output if the host is capable
of running a VM.
Fixes: #6727
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
This PR removes an unrecognized value located in one of the yamls for the
gha in order to make it work the CI again.
Fixes#7149
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR replaces single spaces for tabs in order to fix the indentation
in the init.sh script.
Fixes#7147
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR installs kata static tarball on metrics runner
and run launch-times tests.
Fixes: #7049
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
The common.sh script includes helper functions used in
our metrics tests, so we are gradually adding more
metrics used in kata.
Fixes: #7108
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This test measures the duration of a workload that starts, and then
immediately stops the contianer. Also measures the workload period,
the time to quit period, and the time to kernel period.
Fixes: #7049
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
A new choice of using vfio devic based volume for kata-containers.
With the help of kata-ctl direct-volume, users are able to add a
specified device which is BDF or IOMMU group ID.
To help users to use it smoothly, A doc about howto added in
docs/how-to/how-to-run-kata-containers-with-kinds-of-Block-Volumes.
Fixes: #6525
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Limitations:
As no ready rust vmm's vfio manager is ready, it only supports
part of vfio in runtime-rs. And the left part is to call vmm
interfaces related to vfio add/remove.
So when vmm/vfio manager ready, a new PR will be pushed to
narrow the gap.
Fixes: #6525
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
This PR fixes the format for the run launchtimes metrics yaml which
is causing to the workflow to fail.
Fixes#7130
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR adds the json script which allow us to save the metrics results
into a json file which will be used in the kata containers metrics.
Fixes#7128
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This is to set a default value for `AA_KBC` for the make target `cc_rootfs_initrd_tarball`.
Fixes: #7121
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
This will help to not have to build those on every CI run, and rather
take advantage of the cached image.
Fixes: #7084
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c720869eef)
Let's add the needed infra for only building and pushing the initramfs
builder image to the Kata Containers' quay.io registry.
Fixes: #7084
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 111ad87828)
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder for the initramds.
This will save us some CI time.
Fixes: #7084
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ebf6c83839)
The `-o` option is the legacy way to configure virtiofsd, inherited
from the C implementation. The rust implementation honours it for
compatibility but it logs deprecation warnings.
Let's use the replacement options in the go shim code. Also drop
references to `-o` from the configuration TOML file.
Fixes#7111
Signed-off-by: Greg Kurz <groug@kaod.org>
The C implementation of virtiofsd had some kind of limited support
for remote POSIX locks that was causing some workflows to fail with
kata. Commit 432f9bea6e hard coded `-o no_posix_lock` in order
to enforce guest local POSIX locks and avoid the issues.
We've switched to the rust implementation of virtiofsd since then,
but it emits a warning about `-o` being deprecated.
According to https://gitlab.com/virtio-fs/virtiofsd/-/issues/53 :
The C implementation of the daemon has limited support for
remote POSIX locks, restricted exclusively to non-blocking
operations. We tried to implement the same level of
functionality in #2, but we finally decided against it because,
in practice most applications will fail if non-blocking
operations aren't supported.
Implementing support for non-blocking isn't trivial and will
probably require extending the kernel interface before we can
even start working on the daemon side.
There is thus no justification to pass `-o no_posix_lock` anymore.
Signed-off-by: Greg Kurz <groug@kaod.org>
The rust implementation of virtiofsd always runs foreground and
spits a deprecation warning when `-f` is passed.
Signed-off-by: Greg Kurz <groug@kaod.org>
This PR adds the test lib common script that is going to be used
for kata containers metrics.
Fixes#7113
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
The run-launchtimes-metrics workflow needs to get the commit ID
for the last commit to the head branch of the PR.
Fixes: #7116
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
If we override the cold, hot plug with an annotation
we need to reset the other plugging mechanism to NoPort
otherwise both will be enabled.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
In Virt the vhost-user-block is an PCIe device so
we need to make sure to consider it as well. We're keeping
track of vhost-user-block devices and deduce the correct
amount of PCIe root ports.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
This gh-workflow prints a simple msg, but is the base for future
PRs that will gradually add the jobs corresponding to the kata
metrics test.
Fixes: #7100
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
This PR updates the developer guide at the connect to the debug console
section.
Fixes#7094
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Now it is possible to configure the PCIe topology via annotations
and addded a simple test, checking for Invalid and RootPort
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Removed the configuration of PCIeRootPort and PCIeSwitchPort, those
values can be deduced in createPCIeTopology
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Refactor the bus assignment so that the call to GetAllVFIODevicesFromIOMMUGroup
can be used by any module without affecting the topology.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
The hypervisor_state file was the wrong location for the PCIe Port
settings, moved everything under device umbrella, where it can be
consumed more easily and we do not get into circular deps.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
For the GPU CC use case we need to set several crypto algorithms.
The driver relies on them in the CC case.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Use now the sev.conf rather then the snp.conf.
Devices can be prestend in two different way in the
container (1) as vfio devices /dev/vfio/<num>
(2) the device is managed by whataever driver in
the VM kernel claims it.
Fixes: #6844
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
This fixes the builds of `cloud-hypervisor-glibc` and
`rootfs-initrd-mariner` to properly create the `build/` directory.
Fixes: #7098
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This PR updates the firecracker version to 1.3.3 which includes the following
changes
Fixed passing through cache information from host in CPUID leaf 0x80000006.
A race condition that has been identified between the API thread and the VMM
thread due to a misconfiguration of the api_event_fd.
Fixes#7089
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Main merge back to CCv0 caused snp qemu build to move from install_qemu to install_qemu_experimental.
Thus, reflecting this change into the qemu snp command.
Fixes: #7059
Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
Qemu for SNP is experimental. Thus, when building QEMU for SNP we need to create a builder that builds experimental qemu for CC.
Fixes: #7059
Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
Kubernetes and Containerd will help calculate the Sandbox Size and pass it to
Kata Containers through annotations.
In order to accommodate this favorable change and be compatible with the past,
we have implemented the handling of the number of vCPUs in runtime-rs. This is
This is slightly different from the original runtime-go design.
This doc introduce how we handle vCPU size in runtime-rs.
Fixes: #5030
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
In this commit, we refactored the logic of static resource management.
We defined the sandbox size calculated from PodSandbox's annotation and
SingleContainer's spec as initial size, which will always be the sandbox
size when booting the VM.
The configuration static_sandbox_resource_mgmt controls whether we will
modify the sandbox size in the following container operation.
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Some vmms, such as dragonball, will actively help us
perform online cpu operations when doing cpu hotplug.
Under the old onlineCpuMem interface, it is difficult
to adapt to this situation.
So we modify the semantics of nb_cpus in onlineCpuMemRequest.
In the original semantics, nb_cpus represents the number of
newly added CPUs that need to be online. The modified
semantics become that the number of online CPUs in the guest
needs to be guaranteed.
Fixes: #5030
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
The declaration of the cpu number in the cpuset is greater
than the actual number of vcpus, which will cause an error when
updating the cgroup in the guest.
This problem is difficult to solve, so we temporarily clean up
the cpuset in the container spec before passing in the agent.
Fixes: #5030
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Updating vCPU resources and memory resources of the sandbox and
updating cgroups on the host will always happening together, and
they are all updated based on the linux resources declarations of
all the containers.
So we merge update_cgroups into the update_linux_resources, so we
can better manage the resources allocated to one pod in the host.
Fixes: #5030
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Support vcpu resizing on runtime side:
1. Calculate vcpu numbers in resource_manager using all the containers'
linux_resources in the spec.
2. Call the hypervisor(vmm) to do the vcpu resize.
3. Call the agent to online vcpus.
Fixes: #5030
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Nobody has volunteered to maintain the (currently broken) snap build, so
remove it.
Fixes: #6769.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
After we support memory resize in Dragonball, we need to update
Cargo.lock in runtime-rs.
Fixes: #6719
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Adding vhost and vhost-net to the kernel modules. These do not require
any kernel module parameters to be checked. Currently, kernel params is
a required field. Make this as optional. Could make this as <Option>,
but making this a slice instead, as a module could have multiple kernel
params. Refactor the function that checks are for kernel modules into
two with one specifically checking if the module is loaded and other
checking for module parameters.
Refactor some of the tests to take into account these changes.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
This adds the glibc flavor of CLH to the list of assets as preparation
for #6839. Mariner Kata is only tested with glibc.
Fixes: #7026
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
We introduce virtio-balloon device to support memory resize.
virtio-balloon device could reclaim memory from guest to host.
Fixes: #6719
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
We introduce virtio-mem device to support memory resize. virtio-mem
device could hot-plug more memory blocks to guest and could also
hot-unplug them from guest.
Fixes: #6719
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
As block/direct volume use similar steps of device adding,
so making full use of block volume code is a better way to
handle direct volume.
the only different point is that direct volume will use
DirectVolume and get_volume_mount_info to parse mountinfo.json
from the direct volume path. That's to say, direct volume needs
the help of `kata-ctl direct-volume ...`.
Details seen at Advanced Topics:
[How to run Kata Containers with kinds of Block Volumes]
docs/how-to/how-to-run-kata-containers-with-kinds-of-Block-Volumes.md
Fixes: #5656
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
In order to support virtio-mem and virtio-balloon devices, we need to
extend DeviceOpContext with VmConfigInfo and InstanceInfo.
Fixes: #6719
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
The key aspects of the DM implementation refactoring as below:
1. reduce duplicated code
Many scenarios have similar steps when adding devices. so to reduce
duplicated code, we should create a common method abstracted and use
it in various scenarios.
do_handle_device:
(1) new_device with DeviceConfig and return device_id;
(2) try_add_device with device_id and do really add device;
(3) return device info of device's info;
2. return full info of Device Trait get_device_info
replace the original type DeviceConfig with full info DeviceType.
3. refactor find_device method.
Fixes: #5656
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
Qemu entry for SNP was changed in the versions.yaml resulting into the incorrect qemu build for SNP.
Fixes: #7059
Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
After we have a guest kernel with builtin initramfs which
provide the rootfs measurement capability and Kata rootfs
image with hash device, we need set related root hash value
and measure config to the kernel params in kata configuration file.
Fixes: #6674
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Integrate initramfs into guest kernel as one binary,
which will be measured by the firmware together.
Fixes: #6674
Signed-off-by: Wang, Arron <arron.wang@intel.com>
The init.sh in initramfs will parse the verity scheme,
roothash, root device and setup the root device accordingly.
Fixes: #6674
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Generate rootfs hash data during creating the kata rootfs,
current kata image only have one partition, we add another
partition as hash device to save hash data of rootfs data blocks.
Fixes: #6674
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Add required kernel config for dm-crypt/dm-integrity/dm-verity
and related crypto config.
Add userspace command line tools for disk encryption support
and ext4 file system utilities.
Fixes: #6674
Signed-off-by: Arron Wang <arron.wang@intel.com>
This PR updates the link to the correspondent Developer Guide at the
enabling full containerd debug that we have for kata 2.0 documentation.
Fixes#7034
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
When the version of libc is upgraded to 0.2.145, older getrandom could not adapt
to new API, and this will make agent-ctl fail to compile.
We upgrade the version of `rand`, so the low version of getrandom will no longer
need.
Fixes: #7032
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
Fixes: #5401, #6654
- Switch kata-ctl from eprintln!()/println!() to structured logging via
the logging library which uses slog.
- Adds a new create_term_logger() library call which enables printing
log messages to the terminal via a less verbose / more human readable
terminal format with colors.
- Adds --log-level argument to select the minimum log level of printed messages.
- Adds --json-logging argument to switch to logging in JSON format.
Co-authored-by: Byron Marohn <byron.marohn@intel.com>
Co-authored-by: Luke Phillips <lucas.phillips@intel.com>
Signed-off-by: Jayant Singh <jayant.singh@intel.com>
Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Luke Phillips <lucas.phillips@intel.com>
Signed-off-by: Kelby Madal-Hellmuth <kelby.madal-hellmuth@intel.com>
Signed-off-by: Liz Lawrens <liz.lawrens@intel.com>
Github Actions reads and runs workflow files from the main branch,
rather than from the PR branch. This means that PRs that modify workflow
files aren't being tested with the updated workflows coming from the PR,
but rather with the old workflows from the main branch. AFAIK, this
behavior isn't avoidable for workflow files (but is for other scripts).
This makes it very hard to reliably test workflow changes before they're
actually merged into main and leads to issues that we have to hotifx
(see #6983, #6995).
This PR aims to mitigate that by extracting the commands used in
workflows to a separate script file. The way our CI is set up, those
script files are read from the PR branch and thus changes would be
reflected in the CI checks.
Fixes: #6971
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
In preparation for CoCo 0.6.0 release, updated td-shim to commit
3252047213b2c580c21bdc52f67e8515ca1e374a
Fixes#7022
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
In preparation for CoCo 0.6.0 release, updated attestation-agent to
commit aa1d3c510350cd2f2668aca374abba19e2b73b3f
Fixes#7022
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
In hypervisors that do not support virtiofs we have to copy files in
the VM sandbox to properly setup the network (resolv.conf, hosts, and hostname).
To do that, we construct the volume as before, with the addition of an extra
variable that designates the path where the file will reside in the sandbox.
In this case, we issue a `copy_file` agent request *and* we patch the spec
to account for this change.
Fixes: #6978
Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
Signed-off-by: George Pyrros <gpyrros@nubificus.co.uk>
When dragonball update dbs-boot crate in commit
64c764c147, the Cargo.lock in runtime-rs
should also be updated.
Fixes: #6969
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
Full SHA is 40 characters, while AKS cluster name has a limit of 63. Trim the
SHA to 12 characters, which is widely considered to be unique enough and is
short enough to be used in the cluster name
Fixes: #7010
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Let's start adding the status of our jobs as part of our main page, so
folks monitoring those can easily check whether they're okay, or if
someone has to be pinged about those.
Fixes: #7008
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Content of commit
Update Cargo.toml of kata-agent
Change the features to use new naming convention
Run make vendor, to fix the static checks
Update image-rs, step4 of release checklist
Fixes: #6635
Signed-off-by: Jordan Jackson <jordan.jackson@ibm.com>
We added that to create the cluster name, but I forgot to add that to
the part we get the k8s config file, or to the part where we delete the
AKS cluster.
Fixes: #6999
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We need to do so, otherwise we'll create two clusters for testing Cloud
Hypervisor with exactly the same name, one using Ubuntu, and one using
Mariner.
Fixes: #6999
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The string representing the architecture aarch64 and x86_64 need to be changed to arm64 and amd64 for the release.
Fixes: #6986
Signed-off-by: SinghWang <wangxin_0611@126.com>
There were recent changes for the tdx kernel in the version.yaml that are
not currently accounted for in the build-kernel.sh script.
Attempts to setup a tdx kernel to build local changes seemed to not download
the tdx kernel. Instead the mainline kernel is downloaded which has no
tdx-related changes.
The version.yaml has a new entry for tdx kernel. Use that instead for
setting up and downloading the tdx kernel.
Fixes: #6984
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
While the Mariner Kata host is in preview, we need the `aks-preview`
extension to enable the `--workload-runtime KataMshvVmIsolation` flag.
Fixes: #6994
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
This PR is to make an environment variable `BUILDER_REGISTRY` configurable
so that those who want to use their own registry for build can set up
the registry.
Fixes: #6988
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
The vcpu hotplug/hotunplug feature is implemented with upcall. This commit
add three patches to support the feature on aarch64. Patches:
> 0005: add support of upcall on aarch64
> 0006: skip activate offline cpus' MSI interrupt
> 0007: set the correct boot cpu number
Fixes: #6010
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
This commit implements the vcpu_boot_onlined vector in get_fdt_vm_info.
"boot_enabled" means whether this vcpu should be onlined at first boot.
It will be used by fdt, which write an attribute called boot_enabled,
and will be handled by guest kernel to pass the correct cpu number to
function "bringup_nonboot_cpus".
Fixes: #6010
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
This commit add support of resize_vcpu on aarch64. As kvm will check
whether vgic is initialized when calling KVM_CREATE_VCPU ioctl, all the
vcpu fds should be created before vm is booted.
To support resizing vcpu scenario, we use max_vcpu_count for
create_vcpus and setup_interrupt_controller interfaces. The
SetVmConfiguration API will ensure max_vcpu_count >= boot_vcpu_count.
Fixes: #6010
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
dbs-boot-v0.4.0 refectors the create_fdt interface. It simplifies the
parameters needed to be passed and abstracts them into three structs.
By the way, it also reserves some interfaces for future feature: numa
passthrough and cache passthrough.
Fixes: #6969
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
Rewrite the comment of Vm::init_microvm method for aarch64.
Fixes cargo test warnings on aarch64.
Fixes: #6969
Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
Pod annotations (io.katacontainers.*) are not meaningful
for the remote hypervisor. This patch disables pod annotations
in the kata-remote settings of the containerd configuration.
Fixes: #6345
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
Move the get_volume_mount_info to kata-types/src/mount.rs.
If so, it becomes a common method of DirectVolumeMountInfo
and reduces duplicated code.
Fixes: #6701
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
When run a exec process in backgroud without tty, the
exec will hang and didn't terminated.
For example:
crictl -i <container id> sh -c 'nohup tail -f /dev/null &'
Fixes: #4747
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
The current testing setup only supports running Kata on top of an Ubuntu
host. This adds Mariner to the matrix of testable hosts for k8s
tests, with Cloud Hypervisor as a VMM.
As preparation for the upcoming PR that will change only the actual test
code (rather than workflow YAMLs), this also introduces a new file
`setup.sh` that will be used to set host-specific parameters at test
run-time.
Fixes: #6961
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
sandbox_bind_mounts supports kinds of mount patterns, for example:
(1) "/path/to", default readonly mode.
(2) "/path/to:ro", same as (1).
(3) "/path/to:rw", readwrite mode.
Both support configuration and annotation:
(1)[runtime]
sandbox_bind_mounts=["/path/to", "/path/to:rw", "/mnt/to:ro"]
(2) annotation will alse be supported, restricted as below:
io.katacontainers.config.runtime.sandbox_bind_mounts
= "/path/to /path/to:rw /mnt/to:ro"
Fixes: #6597
Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
We're still facing issues related to the time taken to deploy the
kata-deplot daemonset and starting to run the tests.
Ideally, we should solve this with a readiness probe, and that's the
approach we want to take in the future. However, for now, let's just
make sure those tests are not on the way of the community.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We've seen tests being aborted close to the end of the run due to the
timeout. Let's increase it, avoiding to hit such cases again..
Fixes: #6964
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR removes unwanted white spaces in order to fix the format
of the kata-deploy-binaries script.
Fixes: #6962
Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
Instead of setting:
```
firmware = "/path/to/OVMF.fd"
firmware_volume = "/path/to/OVMF_VARS.fd"
```
We should either be setting:
```
firmware = "/path/to/OVMF.fd"
```
Or:
```
firmware = "/path/to/OVMF_CODE.fd"
firmware_volume = "/path/to/OVMF_VARS.fd"
```
I'm taking the approach to setting up the latter, as that's what's been
tested as part of our TDX CI.
Fixes: #4926
This patch is the same as #4927, but it ended up reverted somewhere in
the CCv0 -> main process, or in the attempts to fix TDX after that.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're currently backing up and restoring all the possible shim files,
but the default one ("containerd-shim-kata-v2").
Let's ensure this is also backed up and restored.
Fixes: #6957
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR fixes the indentation on the kata deploy merge script
that instead of single spaces uses a tap.
Fixes#6925
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
There is a race condition when virtiofsd is killed without finishing all
the clients. Because of that, when a pod is stopped, QEMU detects
virtiofsd is gone, which is legitimate.
Sending a SIGTERM first before killing could introduce some latency
during the shutdown.
Fixes#6757.
Signed-off-by: Beraldo Leal <bleal@redhat.com>
We previously were doing:
* Create a new image on kata-deploy-ci using the commit hash of the
latest tag
* This was used to test on AKS, which is no longer needed as we test
on AKS on every PR
* Create a new image on kata-deploy using the release tag and "latest"
or "stable", by tagging the kata-deploy-ci image accordingly
As part of cfe63527c5, we broke the
workflow described above, as in the first step we would save the PKG_SHA
to be used in the second step, but that part ended up being removed.
Anyways, this back and forth is not needed anymore and we can simplify
the process by doing:
* Create a new image on kata-deploy, using:
- The tag received as ref from the event that triggered this worklow
- "latest" or "stable" tag, depending on whether it's a stable release
or not
Fixes: #6946
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The rebase from `main` to `CCv0` ended up overwriting the image path
that should be used for QEMU, in the CCv0 branch.
Fixes: #6932
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
For some bizarre reason, the login-action will simply fail to
authenticate to docker.io in it's specified as a registry. The way to
proceed, instead, is to *not* specify any registry as it'd be used by
default.
Fixes: #6943
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR updates the nydus version to 2.2.1. This change includes:
nydus-image: fix a underflow issue in get_compressed_size()
backport fix/feature to stable 2.2
[backport] contrib: upgrade runc to v1.1.5
service: add README for nydus-service
nydus: fix a possible panic caused by SubCmdArgs::is_present
Backports two bugfixes from master into stable/v2.2
[backport stable/v2.2] action: upgrade golangci-lint to v1.51.2
[backport] action: fix smoke test for branch pattern
Fixes#6938
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
`docker/login-action@v3` does *not* exist and `docker/login-action@v2`
should be used instead.
Fixes: #6934
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We've a discrepancy on what's set along the scripts used to build the
Kata Cotainers artefacts locally.
Some of those were missing a way to easily debug them in case of a
failure happens, but one specific one (build-and-upload-payload.sh)
could actually silently fail.
All of those have been changed as part of this commut.
Fixes: #6908
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ae24dc73c1)
We're facing some issues to download / use the public key provided by
google for installing kubernetes as part of the kata-deploy image.
```
The following signatures couldn't be verified because the public key is
not available: NO_PUBKEY B53DC80D13EDEF05
Reading package lists... Done
W: GPG error: https://packages.cloud.google.com/apt kubernetes-xenial
InRelease: The following signatures couldn't be verified because the
public key is not available: NO_PUBKEY B53DC80D13EDEF05 E: The
repository 'https://apt.kubernetes.io kubernetes-xenial InRelease' is
not signed.
N: Updating from such a repository can't be done securely, and is
therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user
configuration details.
```
Let's work this around following the suggestion made by @dims, at:
https://github.com/kubernetes/k8s.io/pull/4837#issuecomment-1446426585
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 636539bf0c)
This commit should've been part of the series that reverted a bunch of
TDX changes that are not compatible with the TDX stack we're using in
the Jenkins CI machine.
The change made here is in order to match what's been undone here:
c29e5036a6Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
I cannot easily pin-point which commit dropped it, but my gut feeling is
that it's the result of an erroneous conflict resolution when merging
content from main to the CCv0 branch.
Regardless of when / why it happened, as the root_hash logic ended up
being dropped, workflows that depend on that are now failing.
With everything said in mind, let's bring the logic back.
Fixes: #6901
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 800ee5cd88.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 3018c9ad51.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
If a hypervisor debug console is enabled and sandbox_cgroup_only is set,
the hypervisor can fail to open /dev/ptmx, which prevents the sandbox
from launching.
This is caused by the absence of a device cgroup entry to allow access
to /dev/ptmx. When sandbox_cgroup_only is not set, the hypervisor
inherits the default unrestrcited device cgroup, but with it enabled it
runs into allow / deny list restrictions.
Fix by adding an allowlist entry for /dev/ptmx when debug is enabled,
sandbox_cgroup_only is true, and no /dev/ptmx is already in the list of
devices.
Fixes: #6870
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
This reverts commit 20ab2c2420.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit f33345c311.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This partially reverts commit 054174d3e6
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 25b3cdd38c.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit ed145365ec.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 3c5ffb0c85.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit 3e15800199.
As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.
The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.
This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)
Fixes: #6884
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
When this option is enabled the runtime will attempt to determine the
appropriate sandbox size (memory, CPU) before booting the virtual
machine.
As TEEs do not support memory and CPU hotplug, this approach must be
used.
Fixes: #6818
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
When updating an interface, there's maybe an existed
interface whose name would be the same with the updated
required name, thus it would update failed with interface
name existed error. Thus we should rename the existed interface
with an temporary name and swap it with the previouse interface
name last.
Fixes: #6842
Signed-off-by: fupan <fupan.lfp@antgroup.com>
`uname -m` produces `x86_64`, but container image convention
is to use `amd64`, so update this in the tag
Fixes: #6820
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
(cherry picked from commit 2856d3f23d)
- Create multi-arch manifests for the ci and release runtime-payload
that are tagged with the commit, for use in the operator
Fixes: #6814
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Add configurable context timeout for StopVM in remote hypervisor
similar to StartVM
Fixes: #6730
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
`image_policy_file` is the URI of the image security policy file which
is used to control the allowlist and denylist of the containers to be
run inside the guest.
`image_registry_auth_file` is the URI of the auth file which is needed
to access an private registry.
`simple_signing_sigstore_config` is a configuration file that shows
where the signature files are when simple signing is enabled.
All the config items support both `kbs://..` scheme, `file://..` scheme,
or a direct absolute path `/...`, which means either the file is to be
fetched from a remote KBS that `aa_kbc_params` points to, or from the
local filesystem of the guest.
Fixes: #6640
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
In preparation for CoCo 0.5 release, updated td-shim to
commit 10568bab569bc40034cc973f26fbb0a768dcc3e3
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
In preparation for CoCo 0.5 release, updated attestation-agent to
commit c939d211fe5ac497715008e36161aff20cabb6e6
Fixes#6650
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
This patch updates the template configuration file for
the remote hypervisor to set static_sandbox_resource_mgmt
to be true. The remote hypervisor uses the peer pod config
to determine the sandbox size, so requires this to be set to
true by default.
Fixes: #6616
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
Adding SNP components needed to the x86 payload push and release payloads.
QEMU is needed in both the after-push payload and release payload, while OVMF is only
missing from the release workflow.
Fixes: #6600
Signed-Off-By: Alex Carter <AlexCarter@ibm.com>
This is to add an artifact named `cc-rootfs-initrd` to a payload image
because it is identified that the artifact is required to run a cc-operator
e2e test.
Fixes: #6544
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
attestation-agent depends on tdx-attest-rs when cc_kbc is enabled, which
depends on libtdx-attest.so. Include the dev package in build container,
and the runtime package in the built rootfs.
The build of tdx-attest-sys (which is a dep of tdx-attest-rs) uses
bindgen, which requires libclang so install that in the build container
as well.
We specify the tdx stack DCAP v1.15
Fixes: #6519
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
For remote hypervisor, the configmap, secrets, downward-api or project-volumes are
copied from host to guest. This patch watches for changes to the host files
and copies the changes to the guest.
Note that configmap updates takes significantly longer than updates via downward-api.
This is similar across runc and Kata runtimes.
Fixes: #6341
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: Julien Ropé <jrope@redhat.com>
`ttrpc=true` parameter tells the Makefile of attestation-agent
to build the attestation-agent with ttrpc support
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
This commit brings ttrpc of image-rs. It will use the
lightweight underlying ttrpc to interact between kata-agent
and attestation-agent.
Also, this PR brings a patch for `oci-distribution`,
because two dependencies of `image-rs` depends on different
versions of `oci-distribution`, which will cause that
`image-rs` can not be built. We need a specified version of
`oci-distribution` to unify.
Fixes#6219
Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
This is to add a new element `qemu-se` to the shims for a new runtime
class `kata-qemu-se`.
Fixes: #6549
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
SNP needs two builds of ovmf: the AmdSev build and the normal x86_64 build.
Adds target for vanilla ovmf build for snp
Adding another make target / kata-deploy function, and fixing the ovmf builder so these builds dont overlap.
Fixes: #5849
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
This is a preliminary work to establish an e2e test for a new runtime
class kata-qemu-se (IBM secure execution).
Fixes: #6544
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
DCAP has upgraded to 1.16, which is not compatible with the host OS used
as part of our CI (2022ww44). Let's ensure DCAP 1.15 is used instead.
Fixes: #6529
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Update the build to use the attestation-agent makefile to build it, so
we can pick up the enhancements there
Fixes: #6253
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
We've been seeing the 'sudo make test' job occasionally run out of space in
/tmp, which is part of the root filesystem. Removing dotnet and
`AGENT_TOOLSDIRECTORY` frees around 10GB of space and in my tests the job still
has 13GB of space left after running.
Fixes: #6401
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
{{ runner.workspace }}/kata-containers and {{ github.workspace }} resolve to
the same value, but they're being used multiple times in the workflow. Remove
multiple definitions and define the GOPATH var at job level once.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The last remaining user of the TRAVIS variable in this repo is
tools/osbuilder/tests and it is only used to skip spinning up VMs. Travis
didn't support virtualization and the same is true for github actions hosted
runners. Replace the variable with KVM_MISSING and determine availability of
/dev/kvm at runtime.
TRAVIS is also used by '.ci/setup.sh' in kata-containers/tests to reduce the
set of dependencies that gets installed, but this is also in the process of
being removed.
Fixes: #3544
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
These variables are unused since we don't use travis CI. This also allows to
remove two steps:
- 'Setup GOPATH' only printed variables
- 'Setup travis reference' modified some shell local variables that don't have
any influence on the rest of the steps
The TRAVIS var is still used by tools/osbuilder/tests to determine if
virtualization is available.
Fixes: #3544
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2b41dbe broke all the cached jobs as it changed the permission of the
cache_components.sh file from 755 to 644.
Fixes: #6497
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
`make kata-tarball` needs a lot of disk space and github action runners don't
have that much of it. Remove unneeded tools from the runner, which frees
another ~10GB of space.
Fixes: #6490
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The various tarballs are unpacked into a temporary directory, and then that
directory is compressed into kata-static.tar.xz. After we have the tarball,
there is no reason to keep the temporary directory. Dispose of it as the last
step.
Fixes: #6490
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
We need to ensure `kata_config_version` is taken into account when:
* consuming a cached kernel, otherwise we may introduce changes to a
kernel that will never be validated as part of the PR
* caching the kernel, otherwise we won't update the artefacts if just a
config is changed
Fixes: #6485
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
If cid is empty, we will use image name as default cid, to
support multiple containers with same image, we need append
unique id to the image name.
Fixes: #6456
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Reverting the container image names to pick up the lib.sh methods.
Signed-off-by: Georgina Kinge georgina.kinge@ibm.com
Co-authored-by: Steve Horsman <steven@uk.ibm.com>
This will help us in several ways:
* The first one is not using an image that's close to be EOLed, and
which doesn't officially provide multi-arch images.
* The second is getting closer to what's been already done on main.
* The third is simplifying the logic to build the payload image.
Fixes: #6446
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is to prepare a secure image tarball to run a confidential
container for IBM Z SE(TEE).
Fixes: #6206
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
/opt/confidential-containers/runtime-rs needs to be cleaned up, otherwise
containerd post-uninstall script fails due to weird logic in `rmdir
--ignore-fail-on-non-empty`.
Fixes: #6396
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The latest ubuntu runners already have docker installed and trying to
install it manually will cause the following issue:
```
Run curl -fsSL https://test.docker.com/ -o test-docker.sh
Warning: the "docker" command appears to already exist on this system.
If you already have Docker installed, this script can cause trouble, which is
why we're displaying this warning and provide the opportunity to cancel the
installation.
If you installed the current Docker package using this script and are using it
again to update Docker, you can safely ignore this message.
You may press Ctrl+C now to abort this script.
+ sleep 20
+ sudo -E sh -c apt-get update -qq >/dev/null
E: The repository 'https://packages.microsoft.com/ubuntu/22.04/prod jammy Release' is no longer signed.
```
Fixes: #6390
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
In online-kbs attestation the guest is given the location of the
keybroker server to connect after launch. This patch appends the
IP:Port of the online-kbs to the kernel params of the guest.
Patch also simplifies the kbs config into "mode" = offline/online,
and updates SEV config variable names and default values
Fixes: #5661#5715
Signed-off-by: Jim Cadden <jcadden@ibm.com>
This is the latest td-shim commit and the latest known working
attestation-agent commit.
Fixes: #6366
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Features were renamed, so switch both arches to the katacc* feature.
Testing showed that "signature-simple" feature in image-rs is needed on
x86_64, so add that too. This image-rs commit does not include the
latest ocicrypt-rs and attestation-agent code itself.
Fixes: #6366
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Renaming the output binary from the AmdSevPkg from OVMF.fd to AMDSEV.fd so it does not conflict with the base x86_64 build.
Changing install name in ovmf static builder and the location in the sev config file.
Fixes: #6337
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
In order to disable or enable some features when running tdx vms, we
need to add is_tdx_enabled() function to identify whether the VM
confidiential type is TDX.
fixes: #6276
Signed-off-by: fengshifang <fengshifang@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
It is just to place a missing stage for permission adjustment in the
cc-payload-after-push-s390x workflow.
Fixes#6278
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
A Makefile target `install` should be included in the conditional branch
as default and test.
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
(cherry picked from commit 4139d68d51)
Nothing much to add, let's just make the message more clear.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c071355359)
As done for s390x, let's just skip the runtime-rs build for Power.
Fixes: #6142
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 4e2db96ef7)
This is to make a docker version to v20.10 in docker upstream image ubuntu:20.04 for s390x and ppc64le.
Fixes: #6211
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
Cherry-picked: f49b89b
A known bug in qemu 7.2.0 causes a problem handling the kernel hashes argument and causes SEV container launching to fail.
Fixes: #6189
Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
Now that TDX work will start coming for runtime-rs, let's also take it
into consideration when caching the shim-v2 tarball.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Removed the qemu paramter 'policy' (and also dh-cert-file, session-file, kernel-hashes=on)
for SNP container.
Fixes: #5795
Signed-off-by: Niteesh Dubey <niteesh@linux.ibm.com>
- Remove umoci entry from versions
- Update the usage of skopeo to control the tooling we use to build
the pause image
Fixes: #
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Now we don't need to have skopeo and umoci in the rootfs
remove the code that optionally builds and installs them
Fixes: #3970
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Remove the container_policy_file config parameter as it was only used
by the skopeo code path
Fixes: #3970
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
These changed will be consumed by SEV firmware caching job in the CI. This will help in reducing the CI runtime.
Fixes: #6119
Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
The attestation-agent had its v0.3.0 release earlier Today, following
the Confidential Containers v0.3.0 release process.
Let's bump it on our side, as we've already tested the version that
became this release.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
image-rs has released its v0.3.0 release earlier Today, following the
v0.3.0 Confidential Containers release process.
The v0.3.0 is based on exactly the same commit we've been using already,
so no changes are expected for us.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
TD-Shim has released its v0.3.0 release earlier Today, following the
Confidential Containers v0.3.0 release.
Let's update it here. We need to also bump the toolchain to using the
nightly-2022-11-15
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
With `disable_netns=true`, we should never scan the sandbox netns which
is the host netns in such case.
Fixes: #6021
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Cherry-picked: 12fd6ff
As the toolchain is installed in the image itself, we *must* take the
toolchain into consideration when deciding whether to use a cached image
or building a new one.
Fixes: #6033
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Bug fix for #5651. Faulty bash syntax let a initrd build complete, but not copy the kernel module.
This change fixes the if logic to work as an 'or' as intended.
Fixes: #6024
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
When `HOST_ARCH` != `ARCH` unset `CC`
Specifying a foreign CC is incompatible with building libgit2. Thus after the RUSTFLAGS linker
has been set we can safely unset CC to avoid passing this value through the build.
Fixes: #5890
Signed-off-by: James Tumber <james.tumber@ibm.com>
Cherry-picked: 087515a
Adds AA_KBC option in rootfs builder to specify online_sev_kbc into the initrd.
Guid and secret type for sev updated in shim makefile to generate default config
KBC URI will be specified via kernel_params
Also changing the default option for sev in the local build scipts
Making sure sev guest kernel module is copied into the initrd. Will also eventually be needed for SNP
Fixes: #5650
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
The `sev_guest_policy` configuration field distinguishes between SEV and
SEV-ES guests (according to standard AMD SEV policy values).
Modify the kata runtime to detect SEV-ES guests and calculate calculate
the expected launch digest taking into account the number of VCPUs and
their CPU signature (model/family/stepping).
Fixes: #5471
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Switching sev build of ovmf to the cc fork until patches are upstreamed.
Adding build for dependencies
Fixes: #5892
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
the switch to cases lets AA_KBC to be parsed correctly.
There will be an addition to the offline_sev_kbc case to do the same for online_sev_kbc
There will also be an addition for SNP
Fixes: #5909
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
This is to enable quay.io/confidential-containers/runtime-payload for
s390x on top of amd64.
Fixes: #5894
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
As done for different components, let's also use a cached version of the
shim-v2 whenever it's possible.
Fixes: #5838
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
In order to cache the shim-v2 we're considering the the cached component
can be used if:
* There were no changes in the runtime directory
* There were no changes in the golang version used
* There were no changes in the rust version used
* We don't build the rust agent, but better be prepared for the future
* There were no changes in the following files that are provided by the
rootfs builds:
* root_hash_vanilla.txt
* root_hash_tdx.txt
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As done for different components, let's also use a cached version of
the rootfs whenever it's possible.
Fixes: #5433
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
e1f075dc60 reworked the action so the
shim-v2 was split out of the matrix build. With that done I ended up
not realising I'd need to log into the quay.io as one step of the
build-asset-cc-shim-v2 job.
Fixes: #5885
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is the most complex part to cache, as the cached component can be
only used if:
* There were no changes in the agent
* There were no changes in the libs (used by the agent)
* There were no changes in the rootfs build scripts
* There is no change in the version of the following components:
* attestation-agent (part of the rootfs)
* gperf (used to build libseccomp)
* libseccomp (used to build the agent)
* pause image (part of the rootfs)
* skopeo (part of the rootfs)
* umoci (part of the rootfs)
* rust (used to build the kata-containers and attestation agents)
We're relying on the last commit merged on places related to the rootfs
generation and using that as the rootfs version and that should be good
enough for what we need.
Apart from everything already mentioned, we've also added the ability to
cache the `root_hash_vanilla.txt` and `root_hash_tdx.txt` files, as
those are needed for when building the shim-v2, in order to have
measured boot working there.
It's important to note that we've added the ability to cache *both*
files, and I've taken that path as the shim-v2 cache work (which will
come soon) relies on both files.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will help us, in the future, to debug any possible issue related to
the measured rootfs arguments passed to the shim during the build time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The ability to do a measured boot has been overlooked when releasing the
payload consumed by the Confidential Containers project, and this
happened as we depend, at the shim-v2 build time, of a `root_hash_*.txt`
generated in the `tools/osbuilder/` directory, which is then used to add
a specific parameter to the `kernel_params` in the Kata Containers
configuration files.
With everything said above, the best way we can ensure this is done is
by saving those files during the rootfs build, download them during the
shim-v2 build (which *must* happen only after the rootfs builds happen),
and correctly use them there.
Fixes: #5847
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As Cloud Hypervisor and QEMU are using different rootfs images (the
former with `offline_fs_kbc` as aa_kbc, and the latter with `eaa_kbc`),
we need to differentiate the kernel parameters passed to each one of
those, as the `root_hash.txt` file used for measured boot will differ
according to the rootfs used.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
By doing this we can ensure that when building different rootfs-images
we won't end up overring the `root_hash.txt` file.
Plus, this will help us later in this series to pass the correct
argument to be used with the respective image.
Nothing's been done for SEV as it uses a initrd instead of an image.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Adding the `root_hash.txt` to the final tarball doesn't bring any
benefit to the project, as the file dependency is for building the
shim-v2 and passing the correct measurement for the kernel command line.
It's important to mention that when building shim-v2, it doesn't look
for the file in `/opt/confidential-containers/share/kata-containers`,
bur rather in the `${repo_root_dir}/tools/osbuilder/`, as shown here:
ac3683e26e/tools/packaging/kata-deploy/local-build/kata-deploy-binaries.sh (L228-L232)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
It turns out that there's more work needed to be done on the Cloud
Hypervisor side so we can fully support EAA_KBC with it.
For now, let's remove the configuration as the tests are not currently
passing when using it, and stick to the `offline_fs_kbc` and its
specific image for the Cloud Hypervisor + TDX case.
Fixes: #5862
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The `qemu-tdx` configuration is tied to using `offline_fs_kbc` as the
aa_kbc, which is something we're moving away from.
With this in mind, let's rename the `qemu-tdx-eaa-kbc` to `qemu-tdx` and
decrease the amount of the way too many configurations that we ship.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is to differentiate an artifact name between amd64 and s390x and add a
virtiofsd target for s390x.
Fixes: #5851
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
As already done for install_cc_kernel(), let's ensure we export
KATA_BUILD_CC=yes as part of the install_cc_tee_kernel.
This is used to generate the hash of the devices in the initramfs.
Fixes: #5845
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's move the info about building initramfs to *after* trying to
install the cached components.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This PR allow us to use the virtiofsd cache tarball instead of
building it from source.
Fixes#5356
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Exclude the image-rs cosign feature when the build target
is the s390x architecture.
Change Cargo to use workspace resolver 2 so that conditional
include for the image-rs crate is resolved correctly for different
targets.
Update cargo lock.
Fixes: #5582
Signed-off-by: Matthew Arnold <mattarno@uk.ibm.com>
It seems that the Kata Containers jenkins may be very slow to reach from
behind the firewall, causing TDX machine to fail downloading some of the
cached artefacts.
With this in mind, let's switch to using wget for this specific case.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's avoid getting into a dir and risking not being able to leave that
dir in case something fails.
Instead, let's just stay in the current dir and move the final tarball
to the exoected directory in case all the checks go as expected.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
With the cached version we're concatenating the td-shim version with the
toolchain version used to build the project.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a documentation about the environment variables that can be
used with the `-k` and `-q` options.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The `qemu_script_dir` is a leftover from before the rework on how we
cache the components.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we're already doing for some components, let's also add support for
caching firmwares. TD-Shim and TDVF are the ones supported for now.
Fixes: #5360, #5361
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We should use {} instead of () when passing the component version to the
install_cached_component() function.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
asserts -> assets
stastic -> static
Those were not caught during the first merge of the series as we didn't
have CI jobs testing for the TEE artefacts.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're currently building Cloud Hypervusor with thE TDX feature
regardless of using with TDX or not.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The name of the asset was wrong, "cloud-hypervisor" instead of
"hypervisor.cloud_hypervsior", generating an empty "latest" file.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The final cloud hypervisor tarball name is either
kata-static-cc-cloud-hypervisor.tar.xz or
kata-static-cc-tdx-cloud-hypervisor.tar.xz, meaning it uses
"cloud-hypervisor" instead of "clh" in the name.
Fixes: #5816
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's check for the cached version of the components as part of the
kata-deploy-binaries.sh as here we already have the needed info for
checking whether a component is cached or not, and to use it without
depending on changes made on each one of the builder scripts.
Fixes: #5816
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead of caching files generated during the component build, let's
cache the final tarball generated for each component.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's do this as the component name will be re-used later on, when we
start checking whether a cached component needs to be rebuilt or not.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is used in several parts of the code, and can have a single
declaration as part of the `lib.sh` file, which is already imported by
all the places where it's used.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're going to use this function from different places, so we better
move it to lib.sh and avoid rewriting it.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
If you're directly using the output of this function, the info message
will show up as part of the string, and that's not what we want.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Change the if statement to check if the CWD is set to /
Add unit tests for the correct merging of working directory
in the container and image process
Note: there is an outstanding question about one test case
Format code
Fixes: #5721
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Jordan Jackson <jordan.jackson@ibm.com>
This includes contructing VMSA pages, parsing OVMF footer table to fetch
the AP reset EIP address, and allowing different vcpu types.
Fixes: #5471
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Now that we're caching the kernel, we're relying on the kernel version
being exported. This is already done for the CC kernel, but not for the
TEE specific ones.
Fixes: #5770
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Loop through the images enviroment variables, checking if it exists
inside the target. If it does then do not append it.
Add unit tests for correctly merging the env variables of the pod yaml
and image itself in the container and image process
Format code
Fixes: #5730
Signed-off-by: Jordan Jackson <jordan.jackson@ibm.com>
As we bumped containerd dependency to v1.6.8, let's also do the
re-vendor of its code on the runtime side.
Fixes: #5745
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
If the attestation-agent is used then enable image_client_auth
to enable the attempt to get registry credentials for the pull
Fixes: #5652
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The incorrect name causes `make cc-payload` to fail, as
`cc-tdx-rootfs-tarball` is a non existent target.
Fixes: #5628
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This is just to keep the support for s390x without the cosign
verification while looking for a solution for #5582.
Fixes: #5599
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
image-rs tagged its v0.2.0 release, let's bump it here as we're about to
release the payload for the v0.2.0 Confidential Containers release.
Fixes: #5593
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's bump the td-shim to its `v0.2.0` release.
Together with the bump, let's also adapt its build scripts so we're able
to build the `v0.2.0` as part of our infra.
Fixes: #5593
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The attestation-agent v0.2.0 has been released, let's bump it here and
ensure we use the new release as part of what will become the payload
for the Confidential Containers v0.2.0 release.
Fixes: #5593
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
It appears that _either_ the GitHub workflow runners have changed their
environment, or the Ubuntu archive has changed package dependencies,
resulting in the following error when building the snap:
```
Installing build dependencies: bc bison build-essential cpio curl docker.io ...
:
The following packages have unmet dependencies:
docker.io : Depends: containerd (>= 1.2.6-0ubuntu1~)
E: Unable to correct problems, you have held broken packages.
```
This PR uses the simplest solution: install the `containerd` and `runc`
packages. However, we might want to investigate alternative solutions in
the future given that the docker and containerd packages seem to have
gone wild in the Ubuntu GitHub workflow runner environment. If you
include the official docker repo (which the snap uses), a _subset_ of
the related packages is now:
- `containerd`
- `containerd.io`
- `docker-ce`
- `docker.io`
- `moby-containerd`
- `moby-engine`
- `moby-runc`
- `runc`
Fixes: #5545.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
(cherry picked from commit 990e6359b7)
Rather than hard-coding the package manager into the docker part,
use the `build-packages` section to specify the parts package
dependencies in a distro agnostic manner.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
(cherry picked from commit ca69a9ad6d)
Although introducing an awful amount of code duplication, let's
parallelise the static checks in order to reduce its time and the space
used in the VMs running those.
While I understand there may be ways to make the whole setup less
repetitive and error prone, I'm taking the approach of:
* Make it work
* Make it right
* Make it fast
So, it's clear that I'm only attempting to make it work, and I'd
appreciate community help in order to improve the situation here. But,
for now, this is a stopgap solution.
JFYI, the time needed for run the tests on the `main` branch went down
from ~110 minutes to ~60 minutes. Plus, we're not running those on a
single VM anymore, which decreases the change to hit the space limit.
Reference: https://github.com/kata-containers/kata-containers/actions/runs/3393468605/jobs/5640842041
Ideally, each one of the following tests should be also split into
smaller tests, each test for one component, for instance.
* static-checks
* compiler-checks
* unit-tests
* unit-tests-as-root
Fixes: #5585
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 40d514aa2c)
This commit adds the `kernel-hashes=on` flag to the QEMU command line
for all SEV guests (previously, this was only enabled for SEV guests
with `guest_pre_attestation=on`. This change allows the AmdSev firmware
to be used for both encrypted and non-encrypted container images.
**Note:** This change makes the AmdSev OVMF build a requirement for all
SEV guests. The standard host OVMF package will no longer work.
Fixes#5307.
Signed-off-by: Jim Cadden <jcadden@ibm.com>
Let's ensure we add the option for the user, at build time, to set the
AGENT_AA_KBC_PARAMS passed to the agent, via the kernel command line.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we're switching TDX to using EAA KBC instead of OfflineFS KBC, let's
add the configuration files needed for testing this before we fully
switch TDX to using such an image.
Fixes: #5563
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The specific TDX image relies on having EAA KBC, instead of using the
default `offline_fs_kbc`.
This image is, with this commit, built and distributed, but not yet used
by TDX specific configurations, which will be done in a follow-up
commit.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead of removing the non-needed packages under `/usr/share` and then
installing new components, let's make sure we do the removal at the end
of our script.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's do that instead of updating and installing the
`software-properties-common` package, as it reduces the final size of
the image.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
First of all, EAA KBC is only used with TDX, thus we can safely assume
that eaa_kbc means TDX, at least for now.
A `/etc/tdx-attest.conf` file, with the data "port=4050" is needed as
that's the default configuration for the Quote Generation Service (QGS)
which is present on the guest side.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will become very handy by the moment we start building different
images targetting different TEEs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Replace hard-coded aa_kbc_param check to set the image_client's
security_validate, with reading the setting from the agent config
Fixes: #4888
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Launching a pod with measured boot enabled seems to be taking longer
than expected with Cloud Hypervisor, which leads to hitting a timeout
limit.
Let's double those timeout limits for now.
Fixes: #5576
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Add agent.enable_signature_verification=false to the kernel_params
default config to get backwards compatibility in config.
Note the the agent config will default this setting to true for security
reasons if it's unset
Fixes: #4888
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add a new agent config parameter enable_signature_verification which
defaults to true for security reasons
- Add unit tests to check parsing and defaults
Fixes: #4888
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Allows parameters in the agent config file to be overwritten
by the kernel commandline. Does not change trust model since
the commandline is measured.
Makes sure to set endpoints_allowed correctly.
Fixes: #5173
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This PR implements the use of a cached cc qemu tarball to speed up
the CI and avoid building the cc qemu tarball when it is not
necessary.
Fixes#5363
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
As we discussed in #5178, user need set aa_kbc_params config without
modify kata guest image, since kernel params is also measured in TEE
boot flow, we make aa_kbc_params can be parsed through kernel cmdline.
Fixes: #5178
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Inclavare released a rats-tls-tdx package, which we depend on for using
verdictd.
Let's install it when using EAA_KBC, as already done for the rats-tls
package.
One thin to note here is that rats-tls-tdx depends on libtdx-attest,
which depends on libprotobuf-c1, thus we had to add the intel-sgx repo
together with enabling the universe channel.
Fixes: #5543
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're currently using Ubuntu 20.04 as the base for the Ubuntu rootfs,
meaning that right now there's no issue with the approach currently
taken. However, if we do a bump of an Ubuntu version, we could face
issues as the rats-tls package is only provided for Ubuntu 20.04.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
To support cosign signature verification.
Fix build warning in signal.rs:
error: unused `tokio::sync::MutexGuard` that must be used
--> src/signal.rs:27:9
|
27 | rustjail::container::WAIT_PID_LOCKER.lock().await;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `-D unused-must-use` implied by `-D warnings`
= note: if unused the Mutex will immediately unlock
Fixes: #5541
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Add tag entry to the attestation agent entry of the versions file.
Checkout tag commit after cloning AA in rootfs builder.
Fixes: #5373
Fixes: kata-containers#5373
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
1. Add unit tests for pkg/sev
2. Allow CalculateLaunchDigest to calculate launch digest without direct
booted kernel (and, therefore, without initrd and kernel cmdline).
This mode is currently not used in kata.
Fixes: #5456
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
This, combined with the effort of caching builder images *and* only
performing the build itself inside the builder images, is the very first
step for reproducible builds for the project.
Reproducible builds are quite important when we talk about Confidential
Containers, as users may want to verify the content used / provided by
the CSPs, and this is the first step towards that direction.
Fixes: #5517
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We need to do that in order to avoid trying to use the image in an
architecture which is not yet supported (such as trying to use the x6_64
image on a s390x machine)
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This will help to not have to build those on every CI run, and rather
take advantage of the cached image.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit c1aac0cdea.
The reason this has to be reverted is because we cannot cache an image
that has a specific user, uid, gid, docker_host_id, and expect that to
work equally on different machines. Unfortunately, this is one of the
images that cannot be cached at all.
This reverts commit fe8b246ae4.
The reason this has to be reverted is because we cannot cache an image
that has a specific user, uid, gid, docker_host_id, and expect that to
work equally on different machines. Unfortunately, this is one of the
images that cannot be cached at all.
The ${file} path is an absolute path, as /home/fidencio/..., while the
result of the `git status --porcelain` is a path relative to the
${repo_root_dir}. Because of this, the logic to adding `-dirty` to the
image name would never work.
Let's fix this by removing the ${repo_root_dir} from the ${file} when
grepping for it.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The init.sh in initramfs will parse the verity scheme,
roothash, root device and setup the root device accordingly.
Fixes: #5135
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Let's take advantge of an existing action that publishes the payload
after each pull request, to also publish the "builder images" used to
build each one of the artefacts.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the initramfs
builder image to the Kata Containers' quay.io registry.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder for the initramds.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Now that the QEMU builder image provides only the environment used for
building QEMU, let's ensure it doesn't get removed.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the QEMU
builder image to the Kata Containers' quay.io registry.
Fixes: #5481
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existsing image, instead of building our
own, to be used as a builder image for QEMU.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Differently than every single other bit that's part of our repo, QEMU
has been using a single Dockerfile that prepares an environment where
the project can be built, but *also* building the project as part of
that very same Dockerfile.
This is a problem, for several different reasons, including:
* It's very hard to have a reproducible build if you don't have an
archived image of the builder
* One cannot cache / ipload the image of the builder, as that contains
already a specific version of QEMU
* Every single CI run we end up building the builder image, which
includes building dependencies (such as liburing)
Let's split the logic into a new build script, and pass the build script
to be executed inside the builder image, which will be only responsible
for providing an environment where QEMU can be built.
Fixes: #5464
Backports: #5465
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the virtiofsd
builder image to the Kata Containers' quay.io registry.
Fixes: #5480
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the virtiofsd.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's ensure we're building virtiofsd with a specific toolchain that's
known to not cause any issues, instead of always using the latest one.
On each bump of the virtiofsd, we'll make sure to adjust this according
to what's been used by the virtiofsd community.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the td-shim
builder image to the Kata Containers' quay.io registry.
Fixes: #5479
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the td-shim.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the shim-v2
builder image to the Kata Containers' quay.io registry.
Fixes: #5478
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's try to pull a pre-existing image, instead of building our own, to
be used as a builder for the shim-v2.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for building and pushing the OVMF builder
image to the Kata Containers' quay.io registry.
Fixes: #5477
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of buildinf our
own, to be used as a builder image for OVMF.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the kernel
builder image to the Kata Containers' quay.io registry.
Fixes: #5476
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the kernel.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the needed infra for only building and pushing the image used
to build the kata-deploy artefacts to the Kata Containers' quay.io
registry.
Fixes: #5475
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's first try to pull a pre-existing image, instead of building our
own, to be used as a builder image for the kata-deploy artefacts.
This will save us some CI time.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This function will push a specific tag to a registry, whenever the
PUSH_TO_REGISTRY environment variable is set, otherwise it's a no-op.
This will be used in the future to avoid replicating that logic in every
builder used by the kata-deploy scripts.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a function to get the hash of the last commit modifying a
specific file.
This will help to avoid writing `git rev-list ...` into every single
build script used by the kata-deploy.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
CC_BUILD_REGISTRY, which points to quay.io/kata-containers/cc-builder,
will be used for storing the builder images used to build the artefacts
via the kata-deploy scripts.
The plan is to tag, whenever it's possible and makes sense, images like:
* ${CC_BUILDER_REGISTRY}:kernel-${sha}
* ${CC_BUILDER_REGISTRY}:qemu-${sha}
* ${CC_BUILDER_REGISTRY}:ovmf-${sha}
* ${CC_BUILDER_REGISTRY}:shim-v2-${go-toolchain}-{rust-toolchain}-${sha}
* ${CC_BUILDER_REGISTRY}:td-shim-${toolchain}-${sha}
* ${CC_BUILDER_REGISTRY}:virtiofsd-${toolchain}-${sha}
Where ${sha} is the sha of the last commit modifying the Dockerfile used
by the builder.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The init.sh in initramfs will parse the verity scheme,
roothash, root device and setup the root device accordingly.
Fixes: #5135
Signed-off-by: Wang, Arron <arron.wang@intel.com>
As the CCv0 effort is not using the runtime-rs, let's add a mechanism to
avoid building it.
The easiest way to do so, is to simply *not* build the runtime-rs if the
RUST_VERSION is not provided, and then not providing the RUST_VERSION as
part of the cc-shim-v2-tarball target.
Fixes: #5462
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
There was a typo in the registry name, which should be
quay.io/confidential-containers/runtime-payload-ci instead of
quay.io/repository/confidential-containers/runtime-payload-ci
Fixes: #5469
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's create a GitHub action to automate the Kata Containers payload
generation for the Confidential Containers project.
This GitHub action builds the artefacts (in parallel), merges them into
a single tarball, generates the payload with the resulting tarball, and
uploads the payload to the Confidential Containers quay.io.
It expects the tags to be used to be in the `CC-x.y.z` format, with x,
y, and z being numbers.
Fixes: #5330
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's have a GitHub action to publish the Kata Containers payload, after
every push to the CCv0 branch, to the Confidential Containers
`runtime-payload-ci` registry.
The intention of this action is to allow developers to test new
features, and easily bisect breakages that could've happened during the
development process. Ideally we'd have a CI/CD pipeline where every
single change would be tested with the operator, but we're not yet
there. In any case, this work would still be needed. :-)
It's very important to mention that this should be carefully considered
on whether it should or should not be merged back to `main`, as the flow
of PRs there is way higher than what we currently have as part of the
CCv0 branch.
Fixes: #5460
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's modify the script so we allow passing an extra tag, which will be
used as part of the Kata Containers pyload for Confidential Containers
CI GitHub action.
With this we can pass a `latest` tag, which will make things easier for
the integration on the operator side.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make the registry an optional argument to be passed to the
`kata-deploy-build-and-upload-payload.sh` script, defaulting to the
official Confidential Containers payload registry.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instad of bailing out whenever the docker group doesn't exist, just
consider podman is being used, and set the docker_gid to the user's gid.
Also, let's ensure to pass `--privileged` to the container, so
`/run/podman/podman.socket` (which is what `/var/run/docker.sock` points
to) can be passed to the container.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We need to pass the RUST_VERSION, in the same way done for GO_VERSION,
as nowadays both the go and the rust runtime are built.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's build virtiofsd using the kata-deploy build scripts, which
simplifies and unifies the way we build our components.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 0bc5baafb9)
Let's have the docker installation / configuration as part of its own
task, which can be set as a dependency of other tasks whcih may or may
not depend on docker.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit cb4ef4734f)
When moving to building the CI artefacts using the kata-deploy scripts,
we've noticed that the build would fail on any machine where the tarball
wasn't officially provided.
This happens as rust is missing from the 1st layer container. However,
it's a very common practice to leave the 1st layer container with the
minimum possible dependencies and install whatever is needed for
building a specific component in a 2nd layer container, which virtiofsd
never had.
In this commit we introduce the second layer containers (yes,
comtainers), one for building virtiofsd using musl, and one for building
virtiofsd using glibc. The reason for taking this approach was to
actually simplify the scripts and avoid building the dependencies
(libseccomp, libcap-ng) using musl libc.
Fixes: #5425
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 7e5941c578)
The previously used repo will be removed by Intel, as done with the one
used for TDX kernel. The TDX team has already worked on providing the
patches that were hosted atop of the QEMU commit with the following hash
4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0 as a tarball in the
https://github.com/intel/tdx-tools repo, see
https://github.com/intel/tdx-tools/pull/162.
On the Kata Containers side, in order to simplify the process and to
avoid adding hundreds of patches to our repo, we've revived the
https://github.com/kata-containers/qemu repo, and created a branch and a
tag with those hundreds of patches atop of the QEMU commit hash
4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0. The branch is called
4c127fdbe81d66e7cafed90908d0fd1f6f2a6cd0-plus-TDX-v3.1 and the tag is
called TDX-v3.1.
Knowing the whole background, let's switch the repo we're getting the
TDX QEMU from.
Fixes: #5419
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 35d52d30fd)
The previously used repo has been removed by Intel. As this happened,
the TDX team worked on providing the patches that were hosted atop of
the v5.15 kernel as a tarball present in the
https://github.com/intel/tdx-tools repos, see
https://github.com/intel/tdx-tools/pull/161.
On the Kata Containers side, in order to simplify the process and to
avoid adding ~1400 kernel patches to our repo, we've revived the
https://github.com/kata-containers/linux repo, and created a branch and
a tag with those ~1400 patches atop of the v5.15. The branch is called
v5.15-plus-TDX, and the tag is called 5.15-plus-TDX (in order to avoid
having to change how the kernel builder script deals with versioning).
Knowing the whole background, let's switch the repo we're getting the
TDX kernel from.
Fixes: #5326
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 9eb73d543a)
Prior to this patch, we were missing a call to `verify_cid` when the cid
was derived from the image path, which meant that the host could specify
something like "prefix/..", and we would use ".." as the cid. Paths
derived from this (e.g., `bundle_path`) would not be at the intended
tree.
This patch factors the code out of `pull_image` so that it can be more
easily tested. Tests are added for a number of cases.
Fixes#5421
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
This also slightly improves readability by decluttering the function
declaration and call site.
Fixes#5405
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Based on https://gitlab.com/cryptsetup/cryptsetup/-/issues/525
1. When --no-wipe is used, the device will have invalid checksums
2. mkfs.ext4 would fail on an un-wiped device due to reads of pages with
invalid checksums
3. To make mkfs.ext4 work
- Perform a dry run to figure out which sectors (pages) mkfs.ext4 will
write to.
- Perform directe writes to these pages to ensure that they will have
valid checksums
- Invoke mkfs.ext4 again to perform initialization
4 Use lazy_journal_init option with mkfs.ext4 to lazily initialize the journal.
According to the man pages,
"This speeds up file system initialization noticeably, but carries some small
risk if the system crashes before the journal has been overwritten entirely
one time."
Since the storage is ephemeral, not expected to survive a system crash/power cycle,
it is safe to use lazy_journal_init.
Fixes#5329
Signed-off-by: Anand Krishnamoorthi <anakrish@microsoft.com>
In order to ensure that the proxy configuration is passed to the 2nd
layer container, let's ensure the $HOME/.docker/config.json file is
exposed inside the 1st layer container.
For some reason which I still don't fully understand exporting
https_proxy / http_proxy / no_proxy was not enough to get those
variables exported to the 2nd layer container.
In this commit we're creating a "$HOME/.docker" directory, and removing
it after the build, in case it doesn't exist yet. The reason we do this
is to avoid docker not running in case "$HOME/.docker" doesn't exist.
This was not tested with podman, but if there's an issue with podman,
the issue was already there beforehand and should be treated as a
different problem than the one addressed in this commit.
Fixes: #5077
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 4da743f90b)
before setting a limit, otherwise paths may not be found.
guest supporting different hugepage size is more likely with peer-pods where
podvm may use different flavor.
Fixes: #5191
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
Umoci is not longer required if we have the attestation-agent, so don't
override the user input
Fixes: #5237
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
To use depmod in the rootfs builder, the docker environment will require kmod.
Fixes: kata-containers#5125
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
Using depmod when adding kernel modules to get dependencies.
Needed for the efi secret module for sev.
Fixes: #5125
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
If we are using the offline_fs_kbc and have created a resource json
then switch security_validate on the image_client to enable
the signature verification feature for image-rs
Fixes: #4581
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Update the agent build to get around the nix & glibc linker problems
by running the libseccomp installation first
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Update the doc and scripts to reflect that skopeo isn't mandatory
for signature verification any longer
- Update the script to default the aa_kbc to offline_fs_kbc
Fixes: #4581
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Adds default config file.
Adds case in rootfs.sh to copy config.
Fixes kata-containers#5023
Fixes: #5023
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
After we have a guest kernel with builtin initramfs which
provide the rootfs measurement capability and Kata rootfs
image with hash device, we need set related root hash value
and measure config to the kernel params in kata configuration file.
Fixes: #5168
Signed-off-by: Wang, Arron <arron.wang@intel.com>
The sev initrd target had been changed to "cc-sev-rootfs-initrd".
This was good discussion as part of #5120.
I failed to rename it from "cc-sev-initrd-image" in kata-deploy-binaries.
The script will fail for a bad build target.
Fixes: #5166
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
Adds the efi_secret kernel module to the sev initrd.
Adds a rootfs flag for kernel module based on the AA_KBC.
Finding the kernel module in the local build based on kernel version and kernel config version.
Moved kernel config version checking function from kernel builder to lib script.
Fixes: #5118
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
Adds a make target, and a function in the kata-deploy-binaries script.
In the spirit of avoiding code duplication, making the cc-initrd function more generic.
Fixes: #5118
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
Adds a new make target for an sev kernel which can be built and put into payload bundles for the operator.
Currently not including this sev kernel target in the cc payload bundle.
Unfortunately having to breakflow from using the generic cc_tee_kernel functions in either the kata-deploy-binaries or build-kernel.
Largely based on using an upstreamed kernel release, meaning the url is the defaul cdn, and e.g. we use version rather than tag.
The upside of this is that we can use the sha sum checking functionality from the generic get_kernel function.
CC label in title removed for commit message check.
Fixes: #5037
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
Currently leaving the cc-sev-ovmf-tarball target out of the cc payload.
I was not sure where discussion had landed on the number of payload bundles.
e.g. could be included in a cc bundle along with tdx support or create an SEV bundle.
Fixes: kata-containers#5025
Fixes: #5025
Signed-off-by: Alex Carter <Alex.Carter@ibm.com>
Integrate initramfs into guest kernel as one binary,
which will be measured by the firmware together.
Fixes: #5148
Signed-off-by: Wang, Arron <arron.wang@intel.com>
This patch allows copying of directories and symlinks when
static file copying is used between host and guest. This change is
necessary to support recursive file copying between shim and agent.
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
This patch adds the support of the remote hypervisor type.
Shim opens a Unix domain socket specified in the config file,
and sends TTPRC requests to a external process to control
sandbox VMs.
Fixes#4482
Co-authored-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
After adding an SEV QEMU config file (#4850), need to configure containerd to select this when appropriate based on a new runtimeclass.
Adds to the configuration of containerd so the correct config is selected.
Fixes: #4851
Signed-Off-By: Alex Carter <alex.carter@ibm.com>
Let's remove the whole content from:
* /opt/confidential-containers/libexec
* /opt/confidential-containers/share
And then manually remove the binaries under bin directory` as the
pre-install hook will drop binaries there.
Finally, let's call a `rmdir -p /opt/confidential-containers/bin` which
should take care of the cleanup in case no pre-install hook is used, and
let's make sure we pass `--ignore-fail-on-non-empty` so we don't fail
when using a pre-install hook.
Fixes: #5128
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
For Confidential Containers the file is present at
`/opt/confidential-containers` instead of `/opt/kata`.
Fixes: #5119
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Every now and then we've been hitting issues with parallel builds. in
order to not rely on lucky for the first release, let's do a serial
build of the payload image.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add the documentation on how to generate the Kata Containers
payload, based in the CCv0 branch, that's consumed by the Confidential
Containers Operator.
Fixes: #5041
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The `image` target is only used by and only present in the `CCv0`
branch, and it's name is misleading. :-)
Let's rename it (and the scripts used by it) to mention payload rather
than image, and to actually build the cc related tarballs instead of the
"vanilla" Kata Containers tarballs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's adjust the `kata-deploy-build-and-upload-image.sh` to build the
image following the `kata-containers-${commit}` tag pattern, and to push
it to the quay.io/confidential-containers/runtime-payload repo.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's try to remove the /opt/confidential-containers directory. If it's
not empty, let's not bother force removing it, as the pre-install script
also drops files to the very same directory.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're currently backing up and restoring all the possible shim files,
but the default one ("containerd-shim-kata-v2").
Let's ensure this is also backed up and restored.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead of passing a `KATA_CONF_FILE` environament variable, let's rely
on the configured (in the container engine) config path, as both
containerd and CRI-O support it, and we're using this for both of them.
This is a "backport" of f7ccf92dc8, from
the original `kata-deploy.sh` to the one used for Confidential
Containers.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As containerd is the only supported container engine, let's simplify the
script and, at the same time, make it clear that other container engines
are not supported yet.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's create the QEMU build image based on the version of QEMU used, so
if we happen to have a parallel build we ensure different images are
being used.
Also, let's ensure the image gets remove after the build.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
In the commit 54d6d01754 we ended up
removing the BUILD_SUFFIX argument passed to QEMU as it only seemed to
be used to generate the HYPERVISOR_NAME and PKGVERSION, which were added
as arguments to the dockerfile.
However, it turns out BUILD_SUFFIX is used by the `qemu-build-post.sh`
script, so it can rename the QEMU binary accordingly.
Let's just bring it back.
Fixes: #5078
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 373dac2dbb)
Patches failing without the no_patches.txt file for SPR-BKC-QEMU-v2.5.
Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
(cherry picked from commit 59e3850bfd)
Dockerfile cannot decipher multiple conditional statements in the main RUN call.
Cannot segregate statements in Dockerfile with '{}' braces without wrapping entire statement in 'bash -c' statement.
Dockerfile does not support setting variables by bash command.
Must set HYPERVISOR_NAME and PKGVERSION from parent script: build-base-qemu.sh
Fixes: #5078
Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
(cherry picked from commit 54d6d01754)
4cf502fb20 added the ability to build
TD-Shim, but forgot to have it added as part of the cc-tarball target.
Fixes: #5042
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Added default sev kata config template.
Added required default variables in Makefile.
Fixes#5012Fixes#5008
Signed-Off-By: Ryan Savino <ryan.savino@amd.com>
The agent configuration file, which is part of the docs, is used by the
confidential containers CIs and, right now, cannot be run behind a
firewall, which is exactly how the TDX CIs are reunning, as https_proxy
is not set there.
Fixes: #5020
Depends-on: github.com/kata-containers/tests#5080
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As the previous commit added a new runtime class to be used with TDX,
let's make sure this gets shipped and configured as part of the
kata-deploy-cc script, which is used by the Confidential Containers
Operator.
This commit also cleans up all the extra artefacts that will be
installed in order to run the CLH TDX workloads.
Fixes: #4833
Depends-on: github.com/kata-containers/tests#5070
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a new configuration file for using a cloud hypervisor (and all
the needed artefacts) that are TDX capable.
This PR extends the Makefile in order to provide variables to be set
during the build time that are needed for the proper configuration of
the VMM, such as:
* Specific kernel parameters to be used with TDX
* Specific kernel features to be used when using TDX
* Artefacts path for the artefacts built to be used with TDX
* Kernel
* TD-Shim
The reason we don't hack into the current Cloud Hypervisor configuration
file is because we want to ship both configurations, with for the
non-TEE use case and one for the TDX use case.
It's important to note that the Cloud Hypervisor used upstream is
already built with TDX support.
Fixes: #4831
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
TDX kernel is based on a kernel version which doesn't have the
CONFIG_SPECULATION_MITIGATIONS option.
Having this in the allow list for missing configs avoids a breakage in
the TDX CI.
Fixes: #4998
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
With the current TDX kernel used with Kata Containers, `tdx_guest` is
not needed, as TDX_GUEST is now a kernel configuration.
With this in mind, let's just drop the kernel parameter.
Fixes: #4981
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As right now the TDX guest kernel doesn't support "serial" console,
let's switch to using HVC in this case.
Fixes: #4980
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The runtime will crash when trying to resize memory when memory hotplug
is not allowed.
This happens because we cannot simply set the hotplug amount to zero,
leading is to not set memory hotplug at all, and later then trying to
access the value of a nil pointer.
Fixes: #4979
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
While doing tests using `ctr`, I've noticed that I've been hitting those
timeouts more frequently than expected.
Till we find the root cause of the issue (which is *not* in the Kata
Containers), let's increase the timeouts when dealing with a
Confidential Guest.
Fixes: #4978
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
When booting the TDX kernel with `tdx_disable_filter`, as it's been done
for QEMU, VirtioFS can work without any issues.
Whether this will be part of the upstream kernel or not is a different
story, but it easily could make it there as Cloud Hypervisor relies on
the VIRTIO_F_IOMMU_PLATFORM feature, which forces the guest to use the
DMA API, making these devices compatible with TDX.
See Sebastien Boeuf's explanation of this in the
3c973fa7ce208e7113f69424b7574b83f584885d commit:
"""
By using DMA API, the guest triggers the TDX codepath to share some of
the guest memory, in particular the virtqueues and associated buffers so
that the VMM and vhost-user backends/processes can access this memory.
"""
Fixes: #4977
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The `force_tdx_guest` kernel parameter was only needed in the early
development stages of the TDX kernel driver. We can safely drop it with
the kernel version we've been currently using.
Fixes: #4985
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Generate rootfs hash data during creating the kata rootfs,
current kata image only have one partition, we add another
partition as hash device to save hash data of rootfs data blocks.
Fixes: #4966
Signed-off-by: Wang, Arron <arron.wang@intel.com>
The local-build script should honor the value of SKOPEO exported in the
environment so that it will be able to build the image without skopeo
inside. This remove the hard-coded "SKOPEO=yes".
Fixes#4959
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Initialize the trusted stroage when the device is defined
as "/dev/trusted_store" with shell script as first step.
Fixes: #4882
Signed-off-by: Wang, Arron <arron.wang@intel.com>
After enable data integrity for trusted storage, the initialize
time will take three times more and IO performance will drop more than
30%, the default value will be NOT enabled but add this config to
allow the user to enable if they care more strict security.
Fixes: #4882
Signed-off-by: Wang, Arron <arron.wang@intel.com>
By default the pause image and runtime config will provided
by host side, this may have potential security risks when the
host config a malicious pause image, then we will use the pause
image packaged in the rootfs.
Fixes: #4882
Signed-off-by: Wang, Arron <arron.wang@intel.com>
With the runtime-rs changes the agent services need to be asynchronous,
so attempt to update the image_service to match this
Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Instead of setting:
```
firmware = "/path/to/OVMF.fd"
firmware_volume = "/path/to/OVMF_VARS.fd"
```
We should either be setting:
```
firmware = "/path/to/OVMF.fd"
```
Or:
```
firmware = "/path/to/OVMF_CODE.fd"
firmware_volume = "/path/to/OVMF_VARS.fd"
```
I'm taking the approach to setting up the latter, as that's what's been
tested as part of our TDX CI.
Fixes: #4926
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Add required kernel config for dm-crypt/dm-integrity/dm-verity
and related crypto config.
Add userspace command line tools for disk encryption support
and ext4 file system utilities.
Fixes: #4761
Signed-off-by: Arron Wang <arron.wang@intel.com>
For CoCo stack, the pause image is managed by host side,
then it may configure a malicious pause image, we need package
a pause image inside the rootfs and don't the pause image from host.
Fixes: #4768
Signed-off-by: Wang, Arron <arron.wang@intel.com>
Pick up a new verison of image-rs as the pinned version depended on a
version of ocicrypt-rs that doesn't build anymore
Fixes: #4867
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
AMD SEV pre-attestation is handled by the runtime before the guest is
launched. Guest VM is started paused and the runtime communicates with a
remote keybroker service (e.g., simple-kbs) to validate the attestation
measurement and to receive launch secret. Upon validation, the launch
secret is injected into guest memory and the VM is started.
Fixes: #4280
Signed-off-by: Jim Cadden <jcadden@ibm.com>
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Let's add the QEMU TDX targets to be generated together with the cc
targets, when calling `make cc-tarball`.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As the previous commit added a new runtime class to be used with TDX,
let's make sure this gets shipped and configured as part of the
kata-deploy-cc script, which is used by the Confidential Containers
Operator.
This commit also cleans up all the extra artefacts that will be
installed in order to run the QEMU TDX workloads.
Fixes: #4832
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a new configuration file for using a QEMU (and all the needed
artefacts) that are TDX capable.
This PR extends the Makefile in order to provide variables to be set
during the build time that are needed for the proper configuration of
the VMM, such as:
* Specific kernel parameters to be used with TDX
* Specific kernel features to be used when using TDX
* Artefacts path for the artefacts built to be used with TDX
* QEMU
* Kernel
* TDVF
The reason we don't hack into the current QEMU configuration file is
because we want to ship both configurations, with for the non-TEE use
case and one for the TDX use case.
Fixes: #4830
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building TD-shim, a firmware used with
Cloud Hypervisor to start TDX capable VMs for CC.
Fixes: #4780
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building a TDVF, a firmware used with QEMU
to start TDX capable VMs for CC.
Fixes: #4625
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's create the td-shim tarball in the directory where the script was
called from, instead of doing it in the $DESTDIR.
This aligns with the logic being used for creating / extracting the
tarball content, which is already in use by the kata-deploy local build
scripts.
Fixes: #4809
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's create the OVMF tarball in the directory where the script was
called from, instead of doing it in the $DESTDIR.
This aligns with the logic being used for creating / extracting the
tarball content, which is already in use by the kata-deploy local build
scripts.
Fixes: #4808
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Simple-kbs keybroker protocol is used by runtime for SEV(-ES)
pre-attestation. Includes protobuf module.
Fixes: #4280
Signed-off-by: Jim Cadden <jcadden@ibm.com>
We don't need skopeo to get the encrypted container image
scenario working, so remove that instruction from the doc
Fixes: #4587
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
We're adding a new target for building a TDX capable Cloud Hypervisor
for CC.
As the current version of Cloud Hypervisor is already built with TDX
support, we just rely on calling the same `install_cc_clh()` function,
as done for the non-tee `cc` target.
Fixes: #4659
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building a TDX capable QEMU for CC.
This commit, differently than b307531c29,
introduces support for building the artefacts that are TEE specific.
Fixes: #4623
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Currently $BUILD_DIR will be used to create a directory as:
/opt/kata/share/kata-qemu${BUILD_DIR}
It means that when passing a BUILD_DIR, like "foo", a name would be
built like /opt/kata/share/kata-qemufoo
We should, instead, be building it as /opt/kata/share/kata-qemu-foo.
Fixes: #4638
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead of always naming the binary as "-experimental", let's take
advantage of the $BUILD_SUFFIX that's already passed and correctly name
the binary according to it.
Fixes: #4638
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building a TDX capable kernel for CC.
This commit, differently than c4cc16efcd,
introduces support for building the artefacts that are TEE specific.
Fixes: #4622
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Passing the URL to be used to download the kernel tarball is useful in
various scenarios, mainly when doing a downstream build, thus let's add
this new option.
This new option also works around a known issue of the Dockerfile used
to build the kernel not having `yq` installed.
Fixes: #4629
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
There's no need to have the entire function for building SEV / TDX
duplicated.
Let's remove those functions and create a `get_tee_kernel()` which takes
the TEE as the argument.
Fixes: #4627
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Although I don't like the duplication introduced here, it's (at least
for now) way cleaner to have a specific daemonset for the Confidential
Containers effort.
As soon as we have all the bits and pieces upstreamed (kernel, QEMU, and
specific dependencies for each one of the TEEs), we'll be easily able to
get rid of this one. However, for now, focusing on this different set
of files will make our lives easier.
This new daemonset includes the configurations needed for containerd in
order to use the `cc` specific `cri_handler`, which is not and will not
be upstream on the containerd side.
Note, CRI-O is **not** supported for now.
Fixes: #4620
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead of hacking the original `kata-deploy.sh` script, let's add a
totally new folder where we'll be adding content that's CC related.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We've added a bunch of new options related to Confidential Containers
builds as part of the kata-deploy-binaries.sh. Let's make sure those
are displayed to the users of the script when it's called with --help.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Similar to what we have with the `all` option, let's also add a `cc`
one, allowing others to easily call the script and build all the `cc`
related components.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Quite similar to the `kata-tarball` target, let's add a `cc-tarball`
target so we can build all the CC related tarballs in a single command,
with all the tarballs being merged together in the end.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Quite similar to the `all-parallel` target, let's add a `cc-parallel`
target so we can build all the CC related tarballs in parallel.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Quite similar to the `all` target, let's add a `cc` target so we can
build all the CC related tarballs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building virtiofsd for CC, but it's
important to note that the only difference between this one and the
"vanilla" build is the installation path.
Moreover, virtiofsd will **NOT** be used by the CC effort, but as the
very first release target doesn't include TEE support, let's not force
those who want to give it a try to setup devicemapper.
Fixes: #4569
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building QEMU for CC, but it's important
to note that the only difference between this one and the "vanilla"
build is the installation path.
The reason we're taking this approach is because the first release
target for CC doesn't include TEE support.
We had to also include a new builder for QEMU, a specific one for CC, as
for now that's the easiest way to override the prefix in a way that
we'll be easily able to expand the script to support TEE capable builds
in the very near future.
Fixes: #4568
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building the Kernel for CC, but it's
important to note that the only difference between this one and the
"vanilla" build is the installation path.
The reason we're taking this approach is because the first release
target for CC doesn't include TEE support.
Fixes: #4567
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
We're adding a new target for building Cloud Hypervisor for CC, but it's
important to note that the only difference between this one and the
"vanilla" build is the installation path.
The reasons we're taking this approach are:
* Cloud Hypervisor, for the `main` and `stable` branches, is already
built with TDX support.
* The first target for the CC release doesn't include TEE support.
Fixes: #4566
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a new build target for our local-build scripts, cc-shim-v2,
and use it to build Kata Containers properly configured for the CC
use-case.
Fixes: #4564
Depends-on: github.com/kata-containers/tests#4895
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a new "REMOVE_VMM_CONFIGS" environment variable that can be
passsed to the script responsible for building Kata Containers.
Right now this is not useful for the `main` or `stable` branch, but for
the CC release we only have been working and testing with QEMU and Cloud
Hypervisor.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
For the CC build we need to enable such a flag, and the cleaner way to
do so is exposing it in the Makefile and, later on, making sure its
correct value to the build script.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
While this has never been needed for the `main` and `stable` releases,
for the coming CC release we need to pass a few extra options when
building the shim.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's add a new build target for our local-build scripts,
cc-rootfs-image-tarball, and use it to build an image that has skopeo
and umoci embedded in, and that using the offline_fs_kbc as the
attenstation agent KBC.
Fixes: #4557
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
SKOPEO, UMOCI, and AA_KBC have been unset so far as we have not been
generating rootfs images that would be used for CC as part of our
workflow.
Now, as we're targetting the first release of the operator with the CCv0
branch, let's stop unsetting those and start taking advantage of our
tools to help us building a CC capable image.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As we're still depending on components that are only being tested on
Ubuntu, let's make sure the VM image distributed is exactly the same
we've been testing.
Fixes: #4551
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Refactored ccv0.sh to utilise new automated tests for pulling encrypted images and creating a pod.
Fixes: #4512
Depends-on: github.com/kata-containers/tests#4866
Co-authored-by: Megan Wright <Megan.Wright@ibm.com>
Signed-off-by: Georgina Kinge <georgina.kinge@ibm.com>
Let's pin a specific version of image-rs, one that pins a specific
version of ocicrypt-rs on their side, and ensure we don't fall into
issues by consuming the content from main on those repos, and also
helping to ensure reproducible builds from our side.
Fixes: #4517
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's update the Cargo.lock file to bring in all the new dependencies
and to decrease the diff after pinning a specific version of image-rs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The error shown below was caught during a dependency bump in the CCv0
branch, but we better fix it here first.
```
error: this boolean expression can be simplified
--> src/random.rs:85:21
|
85 | assert!(!ret.is_ok());
| ^^^^^^^^^^^^ help: try: `ret.is_err()`
|
= note: `-D clippy::nonminimal-bool` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#nonminimal_bool
error: this boolean expression can be simplified
--> src/random.rs:93:17
|
93 | assert!(!ret.is_ok());
| ^^^^^^^^^^^^ help: try: `ret.is_err()`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#nonminimal_bool
```
Fixes: #4523
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The error shown below was caught during a dependency bump in the CCv0
branch, but we better fix it here first.
```
error: use of `ok_or` followed by a function call
--> src/netlink.rs:526:14
|
526 | .ok_or(anyhow!(nix::Error::EINVAL))?;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try this: `ok_or_else(|| anyhow!(nix::Error::EINVAL))`
|
= note: `-D clippy::or-fun-call` implied by `-D warnings`
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#or_fun_call
error: use of `ok_or` followed by a function call
--> src/netlink.rs:615:49
|
615 | let v = u8::from_str_radix(split.next().ok_or(anyhow!(nix::Error::EINVAL))?, 16)?;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: try this: `ok_or_else(|| anyhow!(nix::Error::EINVAL))`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#or_fun_call
```
Fixes: #4523
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
- Update `ccv0.sh` to use the new lib method which updates the CC pod config yaml
to add a a unique id
for compatibility with crictl 1.24.0+
Fixes: #4867
Depends-on: github.com/kata-containers/tests#4867
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Ensure that our documented crictl pod config file contents have
uid and namespace fields for compatibility with crictl 1.24+
Fixes: #4513
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Encrypted image support with offline_fs_kbc mode
of the attesation-agent, currently required skopeo
so update the doc to clarify this
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Solve `fatal: unsafe repository` ownership error by using `lib.sh`
code to check out the kata-containers repo
- Update `~/rustup` and repo directory ownership to `${USER}`
in order to allow subsequent build steps to work as a non-root
user
Fixes: #4241
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Based on @fidencio's opoinon,
On Arm: static build virtiofsd using musl lib;
on ppc64 & s390: static build virtiofsd using gnu lib;
Fixes: #4258
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
The current implementation of walking the
disks to match with the requested volume path
in agent doesn't work because the volume path
provided by the shim to the agent is the mount
path within the guest and not the device name.
The current logic is trying to match the
device name to the volume path which will never
match.
This change will simplify the
get_volume_capacity_stats and
get_volume_inode_stats to just call statfs and
get the bytes and inodes usage of the volume
path directly.
Fixes: #4297
Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
Today the shim does a translation when doing
direct-volume stats where it takes the source and
returns the mount path within the guest.
The source for a direct-assigned volume is actually
the device path on the host and not the publish
volume path.
This change will perform a lookup of the mount info
during direct-volume stats to ensure that the
device path is provided to the shim for querying
the volume stats.
Fixes: #4297
Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
The go default http mux AFAIK doesn’t support pattern
routing so right now client is padding the url
for direct-volume stats with a subpath of the volume
path and this will always result in 404 not found returned
by the shim.
This change will update the shim to take the volume
path as a GET query parameter instead of a subpath.
If the parameter is missing or empty, then return
400 BadRequest to the client.
Fixes: #4297
Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
The action function expects a function that returns error
but the current direct-volume stats Action returns
(string, error) which is invalid.
This change fixes the format and print out the stats from
the command instead.
Fixes: #4293
Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
This PR removes the clear containers reference as this is not longer
being used and is deprecated at the rootfs builder README.
Fixes#4278
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
Let's reword the sentence so it's easier for someone who's not a native
nor familiar with the project to understand.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Right now the script only support QEMU, but there's not a reason to do
that, mainly considering we already have the tests parity in the CIs
between QEMU and Clouud Hypervisor.
With this in mind, let's expand this script to also using Cloud
Hypervisor.
Whether this script should use QEMU or Cloud Hypervisor is defined
according to the KATA_HYPERVISOR environment variable.
Fixes: #4038
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Instead, rely on the conntainerd-shim-kata-v2 process, as this makes
this script VMM agnostic.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This configuration option is valid for all the hypervisor that are going
to be used with the confidential containers effort, thus exposing the
configuration option for Cloud Hypervisor as well.
Fixes: #4022
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 98750d792b)
Refactor image verification documentation to be more user
focussed, using crictl rather than agent-ctl and re-using the
integration test config files
Fixes: #3958
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Use `multistrap` for building Ubuntu rootfs. Adds support for building
for foreign architectures using the `ARCH` environment variable
(including umoci).
In the process, the Ubuntu rootfs workflow is vastly simplified.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Containerd can support set a proxy when downloading images with a environment variable.
For CC stack, image download is offload to the kata agent, we need support similar feature.
Current we add https_proxy and no_proxy, http_proxy is added since it is insecure.
Fixes#3956
Signed-off-by: Arron Wang <arron.wang@intel.com>
Requires setting ARCH and CC.
- Add CC linker option for building agent.
- Set host for building libseccomp.
Fixes: #3681
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
- Add a doc comment
- Pass to build container, e.g. to build x86_64 with glibc (would
always use musl)
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Remove a lot of cruft of musl installations -- we needed those for the
Go agent, but Rustup just takes care of everything. aarch64 on
Debian-based & Alpine is an exception -- create a symlink
`aarch64-linux-musl-gcc` to `musl-tools`'s `musl-gcc` or `gcc` on
Alpine. This is unified -- arch-specific Dockerfiles are removed.
Furthermore, we should keep it in Ubuntu for supporting the offline SEV
KBC. We also keep it in Clear Linux, as that runs our internal checks,
but it is e.g. not shipped in CentOS Stream 9.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Hadolint DL3019. If you're wondering why this is in this PR, that's
because I touch the file later, and we're only triggering the lints for
changed files.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
- We've updated the CC logging scripts to log to the journal
rather than a socket, so remove socat scripts and instructions
to reflect this
Fixes: #3928
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
for config, as per suggestion from @jodh-intel in #3243.
- Uses the pre-established `kata-containers` folder which we can also
use for more
- Makes it clear the agent is used
Also, use curl instead of wget for uniformity.
Fixes: #3920
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
All scripts should use `EOF` as the shell here document delimiter as
this is checked by the static checker.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Added new `kata-manager` options to control the self-test behaviour. By
default, after installation the manager will run a test to ensure a Kata
Containers container can be created. New options allow:
- The self test to be disabled.
- Only the self test to be run (no installation).
These features allow changes to be made to the installed system before
the self test is run.
Fixes: #3851.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Make the `kata-manager` create a `containerd` link to ensure the
downloaded containerd systemd service file can find the daemon when
using the GitHub packaged version of containerd.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add test coverage for get_memory_info function in src/rpc.rs. Includes
some minor refactoring of the function.
Fixes#3837
Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>
Change the secret used by the GitHub Action that adds the PR size
label to one with the correct set of privileges.
Fixes: #3856.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
As 2.4.0-rc0 has been released, let's switch the kata-deploy / kata-cleanup
tags back to "latest", and re-add the kata-deploy-stable and the
kata-cleanup-stable files.
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
- Enhancement: fix comments/logs and delete not used function
- storage: make k8s emptyDir volume creation location configurable
- Implement direct-assigned volume
- Bump containerd to 1.6.1
- experimentally enable vcpu hotplug and virtio-mem on arm64 in kernel part
- versions: Upgrade to Cloud Hypervisor v22.0
- katatestutils: remove distro constraints
- Minor fixes for the `disable_block_device_use` comments
- clh: stop virtofsd if clh fails to boot up the vm
- clh: tdx: Don't use sharedFS with Confidential Guests
- runtime: Build golang components with extra security options
- snap: Use git clone depth 1 for QEMU and dependencies
- snap: Don't build cloud-hypevisor on ppc64le
- build: always reset ARCH after getting it
- virtcontainers: remove temp dir created for vsock in test code
- docs: Add unit testing presentation
- virtcontainers: Use available s390x hugepages
- Update QEMU >= 6.1.0 in configure-hypervisor.sh
- Fix monitor listen address
- snap: clh: Re-use kata-deploy script here
- osbuilder: Add CentOS Stream rootfs
- runtime: Gofmt fixes
- Update `confidential_guest` comments
- cleanup runtime pkgs for Darwin build, add basic Darwin build/unit test
- docs: Update Readme document
- runtime: use Cmd.StdoutPipe instead of self-created pipe
- docs: Developer-Guide build a custom Kata agent with musl
- kata-agent: Fix mismatching error of cgroup and mountinfo.
- runtime, config: make selinux configurable
- Fix unbound variable / typo on error mesage
- clh: Add TDX support
- virtcontainers: Do not add a virtio-rng-ccw device
- kata-monitor: fix collecting metrics for sandboxes not started through CRI
- runtime: fix package declaration for ppc64le
- Make the hypervisor framework not Linux specific
- kata-deploy: Simplify Dockerfile and support s390x
- Support nerdctl OCI hooks
- shim: log events for CRI-O
- docs: Update contributing link
- kata-deploy: Use (kata with) qemu as the default shim-v2 binary
- kata-monitor: simplify sandbox cache management and attach kubernetes POD metadata to metrics
- nydus: add lazyload support for kata with clh
- kernel: remove SYS_SUPPORTS_HUGETLBFS from powerpc fragments
- packaging: Use `patch` for applying patches
- virtcontainers: Remove duplicated assert messages in utils test code
- versions: add nydus-snapshotter
- docs: Update limitations document
- packaging: support qemu-tdx
- Kata manager fix install
- versions: Linux 5.15.x
- trace-forwarder/agent-ctl: run cargo fmt/clippy in make check
- docs: Improve top-level README
- runtime: use github.com/mdlayher/vsock@v1.1.0
- tools: Build cloud-hypervisor with "--features tdx"
- virtiofsd: Use "-o announce_submounts"
- feature: hugepages support
- tools: clh: Allow to set when to build from sources and the build flags passed down to cargo
- docs: Remove docker run and shared memory from limitations
- versions: Udpate Cloud Hypervisor to 55479a64d237
- kernel: add missing config fragment for TDx
- runtime: The index variable is initialized multiple times in for
- scripts: fix a typo while to check build_type
- versions: bump CRI-O to its 1.23 release
- feature(nydusd): add nydusd support to introduce lazyload ability
- docs: Fix relative links in Markdown
- kernel: support TDx
- device: Actually update PCIDEVICE_ environment variables for the guest
- docs: Update link to EFK stack docs
- runtime: support QEMU SGX
- snap: update qemu version to 6.1.0 for arm
- Release process related fixes
- openshift-ci: switch to CentOS Stream
- virtcontainers: Split the rootless package into OS specific parts
- runtime: suppport split firmware
- kata-deploy: for testing, make sure we use the PR branch
- docs: Remove Zun documentation with kata containers
- agent: Fix execute_hook() args error
- workflows: stop checking revert commit
84dff440 release: Adapt kata-deploy for 2.4.0-rc0
b257e0e5 rustjail: delete function signal in BaseContainer
d647b28b agent: delete meaningless FIXME comment
1b34494b runtime: fix invalid comments for pkg/resourcecontrol
afc567a9 storage: make k8s emptyDir creation configurable
e76519af runtime: small refactor to improve readability
7e5f11a5 vendor: Update containerd to 1.6.1
42771fa7 runtime: don't set socket and thread for arm/virt
8828ef41 kernel: add arm experimental kernel build support
8a9007fe config: remove 2 config as they are removed in 5.15
1b6f7401 kernel: add arm experimental patches to support vcpu hotplug and virtio-mem
f905161b runtime: mount direct-assigned block device fs only once
27fb4902 agent: add get volume stats handler in agent
ea51ef1c runtime: forward the stat and resize requests from shimv2 to kata agent
c39281ad runtime: update container creation to work with direct assigned volumes
4e00c237 agent: add grpc interface for stat and resize operations
e9b5a255 runtime: add stat and resize APIs to containerd-shim-v2
6e0090ab runtime: persist direct volume mount info
fa326b4e runtime: augment kata-runtime CLI to support direct-assigned volume
b8844fb8 versions: Upgrade to Cloud Hypervisor v22.0
af804734 clh: stop virtofsd if clh fails to boot up the vm
97951a2d clh: Don't use SharedFS with Confidential Guests
c30b3a9f clh: Adding a volume is not supported without SharedFS
f889f1f9 clh: introduce supportsSharedFS()
54d27ed7 clh: introduce loadVirtiofsDaemon()
ae2221ea clh: introduce stopVirtiofsDaemon()
e8bc26f9 clh: introduce setupVirtiofsDaemon()
413b3b47 clh: introduce createVirtiofsDaemon()
55cd0c89 runtime: Build golang components with extra security options
76e4f6a2 Revert "hypervisors: Confidential Guests do not support Device hotplug"
fa8b9392 config: qemu: Fix disable_block_device_use comments
9615c8bc config: fc: Don't expose disable_block_device_use
c1fb4bb7 snap: Don't build cloud-hypevisor on ppc64le
58913694 snap: Use git clone depth 1 for QEMU and dependencies
b27c7f40 docs: Add unit testing presentation
e64c54a2 monitor: Listen to localhost only by default
e6350d3d monitor: Fix build options
a67b93bb snap: clh: Re-use kata-deploy script here
f31125fe version: Bump cloud-hypervisor to b0324f85571c441f
54d0a672 subsystem: build
edf20766 docs: Update Readme document
eda8ea15 runtime: Gofmt fixes
4afb278f ci: add github action to exercise darwin build, unit tests
e355a718 container: file is not linux specific
b31876ee device-manager: move linux-only test to a linux-only file
6a5c6344 resourcecontrol: SystemdCgroup check is not necessarily linux specific
cc58cf69 resourcecontrol: convert stats dev_t to unit64types
5be188cc utils: Add darwin stub
ad044919 virtcontainers: Convert stats dev_t to uint64
56751089 katautils: Use a syscall wrapper for the hook JSON state
7d64ae7a runtime: Add a syscall wrapper package
abc681ca katautils: Add Darwin stub for the netNS API
de574662 config: Expand confidential_guest comments
641d475f config: clh: Use "Intel TDX" instead of just "TDX"
0bafa2de config: clh: Mention supported TEEs
81ed269e runtime: use Cmd.StdoutPipe instead of self-created pipe
8edca8bb kata-agent: Fix mismatching error of cgroup and mountinfo.
a9ba7c13 clh: Fix typo on HotplugRemoveDevice
827ab82a tools: clh: Fix unbound variable
082d538c runtime: make selinux configurable
1103f5a4 virtcontainers: Use FilesystemSharer for sharing the containers files
533c1c0e virtcontainers: Keep all filesystem sharing prep code to sandbox.go
61590bbd virtcontainers: Add a Linux implementation for the FilesystemSharer
03fc1cbd virtcontainers: Add a filesystem sharing interface
72434333 clh: Add TDX support
a13b4d5a clh: Add firmware to the config file
a8827e0c hypervisors: Confidential Guests do not support NVDIMM
f50ff9f7 hypervisors: Confidential Guests do not support Memory hotplug
df8ffecd hypervisors: Confidential Guests do not support Device hotplug
28c4c044 hypervisors: Confidential Guests do not support VCPUs hotplug
29ee870d clh: Add confidential_guest to the config file
9621c596 clh: refactor image / initrd configuration set
dcdc412e clh: use common kernel params from the hypervisor code
4c164afb versions: Update Cloud Hypervisor to 5343e09e7b8db
b2a65f90 virtcontainers: Use available s390x hugepages
cb4230e6 runtime: fix package declaration for ppc64le
fec26f8e kata-monitor: trivial: rename symbols & labels
9fd4e551 runtime: Move the resourcecontrol package one layer up
823faee8 virtcontainers: Rename the cgroups package
0d1a7da6 virtcontainers: Rename and clean the cgroup interface
ad10e201 virtcontainers: cgroups: Move non Linux routine to utils.go
d49d0b6f virtcontainers: cgroups: Define a cgroup interface
3ac52e81 kata-monitor: fix updating sandbox cache at startup
160bb621 kata-monitor: bump version to 0.3.0
1a3381b0 docs: Developer-Guide build a custom Kata agent with musl
f6fc1621 shim: log events for CRI-O
1d68a08f docs: Update contributing link
9123fc09 kata-deploy: Simplify Dockerfile and support s390x
11220f05 kata-deploy: Use (kata with) qemu as the default shim-v2 binary
3175aad5 virtiofs-nydus: add lazyload support for kata with clh
94b831eb virtcontainers: remove temp dir created for vsock in test code
8cc1b186 kernel: remove SYS_SUPPORTS_HUGETLBFS from powerpc fragments
5c9d2b41 packaging: Use `patch` for applying patches
5b3fb6f8 kernel: Build SGX as part of the vanilla kernel
2c35d8cb workflows: Stop building the experimental kernel
32e7845d snap: Build vanilla kernel for all arches
27de212f runtime: Always add network endpoints from the pod netns
1cee0a94 virtcontainers: Remove duplicated assert messages in utils test code
6c1d149a docs: Update limitations document
7c4ee6ec packaging/qemu: create no_patches file for qemu-tdx
d47c488b versions: add qemu tdx section
77c29bfd container: Remove VFIO lazy attach handling
7241d618 versions: add nydus-snapshotter
26b3f001 virtcontainers: Split hypervisor into Linux and OS agnostic bits
fa0e9dc6 virtcontainers: Make all Linux VMMs only build on Linux
c91035d0 virtcontainers: Move non QEMU specific constants to hypervisor.go
10ae0591 virtcontainers: Move guest protection definitions to hypervisor.go
b28d0274 virtcontainers: Make max vCPU config less QEMU specific
a5f6df6a govmm: Define the number of supported vCPUs per architecture
a6b40151 tools: clh: Remove unused variables
5816c132 tools: Build cloud-hypervisor with "--features tdx"
e6060cb7 versions: Linux 5.15.x
9818cf71 docs: Improve top-level and runtime README
36c3fc12 agent: support hugepages for containers
81a8baa5 runtime: add hugepages support
7df677c0 runtime: Update calculateSandboxMemory to include Hugepages Limit
948a2b09 tools: clh: Ensure the download binary is executable
72bf5496 agent: handle hook process result
80e8dbf1 agent: valid envs for hooks
4f96e3ea katautils: Pass the nerdctl netns annotation to the OCI hooks
a871a33b katautils: Run the createRuntime hooks
d9dfce14 katautils: Run the preStart hook in the host namespace
6be6d0a3 katautils: Pass the OCI annotations back to the called OCI hooks
493ebc8c utils: Update kata manager docs
34b2e67d utils: Added more kata manager cli options
714c9f56 utils: Improve containerd configuration
c464f326 utils: kata-manager: Force containerd sym link creation
4755d004 utils: Fix unused parameter
601be4e6 utils: Fix containerd installation
ae21fcc7 utils: Fix Kata tar archive check
f4d1e45c utils: Add kata-manager CLI options for kata and containerd
395cff48 docs: Remove docker run and shared memory from limitations
e07545a2 tools: clh: Allow passing down a build flag
55cdef22 tools: clh: Add the possibility to always build from sources
3f87835a utils: Switch kata manager to use getopts
4bd945b6 virtiofsd: Use "-o announce_submounts"
37df1678 build: always reset ARCH after getting it
3a641b56 katatestutils: remove distro constraints
90fd625d versions: Udpate Cloud Hypervisor to 55479a64d237
573a37b3 osbuilder: Add CentOS Stream rootfs
f10642c8 osbuilder: Source .cargo/env before checking Rust
955d359f kernel: add missing config fragment for TDx
734b618c agent-ctl: run cargo fmt/clippy in make check
12c37faf trace-forwarder: add make check for Rust
c1ce67d9 runtime: use github.com/mdlayher/vsock@v1.1.0
42a878e6 runtime: The index variable is initialized multiple times in for
1797b3eb packaging/kernel: build TDX guest kernel
98752529 versions: add url and tag for tdx kernel
bc8464e0 packaging/kernel: add option -s option
2d9f89ae feature(nydusd): add nydusd support to introduse lazyload ability
b19b6938 docs: Fix relative links in Markdown
9590874d device: Update PCIDEVICE_ environment variables for the guest
7b7f426a device: Keep host to VM PCI mapping persistently
0b2bd641 device: Rework update_spec_pci() to update_env_pci()
982f14fa runtime: support QEMU SGX
40aa43f4 docs: Update link to EFK stack docs
54e1faec scripts: fix a typo while to check build_type
07b9d93f virtcontainer: Simplify the sandbox network creation flow
2c7087ff virtcontainers: Make all endpoints Linux only
49d2cde1 virtcontainers: Split network tests into generic and OS specific parts
0269077e virtcontainers: Remove the netlink package dependency from network.go
7fca5792 virtcontainers: Unify Network endpoints management interface
c67109a2 virtcontainers: Remove the Network PostAdd method
e0b26443 virtcontainers: Define a Network interface
5e119e90 virtcontainers: Rename the Network structure fields and methods
b858d0de virtcontainers: Make all Network fields private
49eee79f virtcontainers: Remove the NetworkNamespace structure
844eb619 virtcontainers: Have CreateVM use a Network reference
d7b67a7d virtcontainers: Network API cleanups and simplifications
2edea883 virtcontainers: Make the Network structure manage endpoints
8f48e283 virtcontainers: Expand the Network structure
5ef522f7 runtime: check kvm module `sev` correctly
419d8134 snap: update qemu version to 6.1.0 for arm
00722187 docs: update Release-Process.md
496bc10d tools: check for yq before using it
88a70d32 Revert "workflows: Ensure a label change re-triggers the actions"
a9bebb31 openshift-ci: switch to CentOS Stream
89047901 kata-deploy-push: only run if PR modifying tools path
7ffe9e51 virtcontainers: Do not add a virtio-rng-ccw device
1f29478b runtime: suppport split firmware
24796d2f kata-deploy: for testing, make sure we use the PR branch
1cc1c8d0 docs: Remove images from Zun documentation
5861e52f docs: Remove Zun documentation with kata containers
903a6a45 versions: Bump critools to its 1.23 release
63eb1158 versions: bump CRI-O to its 1.23 release
5083ae65 workflows: stop checking revert commit
14e7f52a virtcontainers: Split the rootless package into OS specific parts
ab447285 kata-monitor: add kubernetes pod metadata labels to metrics
834e199e kata-monitor: drop unused functions
7516a8c5 kata-monitor: rework the sandbox cache sync with the container manager
e78d80ea kata-monitor: silently ignore CHMOD events on the sandboxes fs
e9eb34ce kata-monitor: improve debug logging
4fc4c76b agent: Fix execute_hook() args error
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
kata-deploy files must be adapted to a new release. The cases where it
happens are when the release goes from -> to:
* main -> stable:
* kata-deploy-stable / kata-cleanup-stable: are removed
* stable -> stable:
* kata-deploy / kata-cleanup: bump the release to the new one.
There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
During the release of 2.4.0-rc0 @egernst noticed an incositency in the
way we handle release tags, as release candidates are being taken as
"stable" releases, while both the kata-deploy tests and the release
action consider this as "latest".
Ideally we should have our own tag for "release candidate", but that's
something that could and should be discussed more extensively outside of
the scope of this quick fix.
For now, let's align the code generating the PR for bumping the release
with what we already do as part of the release action and kata-deploy
test, and tag "-rc" as latest, regardless of which branch it's coming
from.
Fixes: #3847
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
allocate_hugepages() writes to the kernel sysfs file to allocate hugepages
in the Kata VM. However, even if the write succeeds, it's not certain that
the kernel will actually be able to allocate as many hugepages as we
requested.
This patch reads back the file after writing it to check if we were able to
allocate all the required hugepages.
fixes#3816
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
allocate_hugepages() constructs the path for the sysfs directory containing
hugepage configuration, then attempts to create this directory if it does
not exist.
This doesn't make sense: sysfs is a view into kernel configuration, if the
kernel has support for the hugepage size, the directory will already be
there, if it doesn't, trying to create it won't help.
For the same reason, attempting to create the "nr_hugepages" file
itself is pointless, so there's no reason to call
OpenOptions::create(true).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Image-rs crate image pull/decrypt/decompression/unpack/mount
features are ready now.
With image-rs pull_image API, the downloaded container image layers
will store at IMAGE_RS_WORK_DIR, and generated bundle dir with rootfs
and config.json will be saved under CONTAINER_BASE/cid directory.
Fixes: #3860
Signed-off-by: Arron Wang <arron.wang@intel.com>
1.Update AA's launch command according to latest implementation
2.Enable get_resource port which will be used by signature verification
Fixes: #3827
Signed-off-by: zhouliang121 <liang.a.zhou@linux.alibaba.com>
- Clippy checks were introduced that cause a warning
for a function with more than 7 arguments.
The image service addition means handle_cmd
has 8 and re-factoring it would take us further
away from main, so ignore for now
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add scripts and documentation to build, configure and test
the ssh-demo encrypted image sample in Kubernetes
Fixes: #3637
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add scripts and documentation to build, configure and test
created a Kata CC unencrypted container using Kubernetes
- Switch test images to quay.io as image_rpc.rs has some
problems with docker.io?
- Update documentation to better fit the kata documentation
requirements and fix typos
- Fixes: #3511
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
General doc enchancements including:
- Change `cd`s for `pushd` and `popd`s
- Remove hard coded architectures
- Tighten up the security where we `chmod 777`
- Add support for not running as source
- Updates so it doesn't do `ctr pull` if the image is on the
local system already
- Doc and Test running as non-root user (covered by #2879)
- Update doc to match image_rpc changes
Fixes: #3549Fixes: #2879
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add scripts and documentation to build, configure and test
created a Kata CC unencrypted container using crictl
- Update documentation to better fit the kata documentation requirements
- Fixes: #3510
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Push the demo image to `quay.io/confidential-containers/runtime-payload`
(which, as opposed to `.../kata-demo`, existed all along).
Fixes: #3345
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Basic config, no debug endpoints, no exec/reseed. Uses the
`$AA_KBC_PARAMS` variable to be used with `envsubst`.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Run `realpath` on `INCLUDE_ROOTFS` so it is not required to provide a
full path. This simplifies the required GitHub Actions workflow, as
GitHub's `env` cannot use shell expansions, as well as the usability
overall.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
With the new rust image pull service skopeo we can parameterise whether to build
and install skopeo and turn it off by default if we don't need
signature verification support
Fixes: #3170
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
With the new rust image pull service skopeo we can parameterise whether to build
and install skopeo and turn it off by default if we don't need
signature verification support
Fixes: #3170
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The initrd build process now supports facultatively installing Skopeo
while still installing Umoci. Mirror this change in the respective
kata-deploy process.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Integrate EAA KBC into ubuntu rootfs image.
Fix build failure if build with AA_KBC=eaa_kbc option.
Fixes: #3167
Signed-off-by: zhouliang121 <liang.a.zhou@linux.alibaba.com>
The CI runner fails to clone the git crates as it probably is confused
about its CARGO_HOME value. That prevents vendoring to succeed as the
runner has nothing to copy over to the vendoring code.
We fix that by temporarily setting CARGO_HOME to tmpfs, only for the
vendoring step. It's hackish.
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
We use tonic to build GRPC client to talk with attestation agent,
and tonic require newer version of rust.
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
':' will have special meaning for umoci during upack, then we
do not use it as part of the image store path
Signed-off-by: Arron Wang <arron.wang@intel.com>
For the initrd build, add makeopts for $SKOPEO_UMOCI and $AA_KBC. Use
the $INCLUDE_ROOTFS variable to specify a directory of files that should
be recursively merged into the guest.
Fixes: #3126
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
As discussed in #2908, Ubuntu is used as a base for CCv0 for building
umoci in the guest. Currently, CCv0 only works with initrd, so this only
applies to initrd.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Some generated merge commit messages are >75 chars
Allow these to not trigger the subject line length failure
Fixes: #3132
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Weekly merge of main branch into CCv0 26th November
Fixes: #3132
Depends-on: github.com/kata-containers/tests#4226
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Pull an encrypted image using the Attestation Agent as
a keyprovider.
Fixes: #3022
Signed-off-by: Tobin Feldman-Fitzthum <tobin@linux.ibm.com>
Co-authored-by: Jakob Naucke <jakob.naucke@ibm.com>
Add the "Merge pull request (kata-containers)?#<x> from" message to the
subsystem check to allow commit check on merges between branches to work
Fixes: #3085
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add new agent configuration policy path parameter
- Update agent pull image to use the policy path if specified and
otherwise fall back to the accept all policy
- Remove the double copy of the image during pulling
- Ensure that temporary directories are always removed
Fixes: #2682
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Document how to test the signature validation with
a number of different scenarios and test images
- Update ccv0.sh to add policy_path to kernel_params
Fixes: #2682
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add hardcoded gpg, signature and polict files
- Modify rootfs.sh to put these in the correct place in the kata image
if skopeo and umoci are being used
Fixes: #2682
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
We do not get a root filesystem path from the agent when creating a
new container for which the container image was not pulled by
containerd. That prevents the agent from creating the container.
To fix that, we populate the container root path with the internal
rootfs path by fetching the containerd added image name annotation and
mapping it back to a path through our image hash map.
Fixes#3009
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
In the confidential computing scenario, there is no Image
information on the host, so skip handling Rootfs at
CreateContainer.
Fixes#3009
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
- Replace containerd to `confidential-containers/containerd` in go.mod
- Use separate ImageService to support PullImage
Fixes#3009
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
The version `v1.6.0-beta.2` released support for shim service,
which is needed for our implementation of ImageService.
Fixes#3009
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Since the `utils::get_option` interface is modified,
PullImage needs to adapt to this modification in CCv0 branch.
Fixes#3044
Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
When the environment variable $SKOPEO_UMOCI is set to "yes", Skopeo and
umoci are built inside the guest build container and installed to the
guest rootfs. The respective build- and runtime dependencies are added.
This respects the (existing) $LIBC variable (gnu/musl) and avoids issues
with glibc mismatches.
This is currently only supported for Ubuntu guests, as the system Golang
packages included in the versions of other distros that we use are too
old to build these packages, and re-enabling installing Golang from
golang.org is cumbersome, given especially that it is unclear how long
we will keep using Skopeo and umoci.
Additionally, when the environment variable $AA_KBC is set,
attestation-agent (with that KBC) is included.
This replaces some logic in ccv0.sh that is removed.
Fixes: #2907
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
- Update CCv0 to use the new confidential containers fork of containerd
- Start using the current-CCv0 branch
Fixes#2947
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Add a new PullImage endpoint to the shim API.
Add new PullImage functions to the virtcontainers files, which allows
the PullImage endpoint on the agent to be called.
Update the containerd vendor files to support new PullImage API changes.
Fixes#2651
Signed-off-by: Dave Hay <david_hay@uk.ibm.com>
Co-authored-by: ashleyrobertson <ashleyro@uk.ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Just in case someone thinks about using kata-deploy directly from this
branch, let's point to the `-cc:v0`image.
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
This is a dirty hack, that we should expand later so we can pass one or
n number of repos where we'll upload our images, and use it as part of
the release scripts.
For now, however, let's just do this quick & dirty hack so we can
present the CCv0 demo using the operator, even knowing that the
kubernetes part of the work is not done yet and that the demo itself
will be done connecting to a node and doing all the shenanigans
manually.
Fixes: #2854
Signed-off-by: Fabiano Fidêncio <fabiano@fidencio.org>
- Update the CCv0 demo script to use ubuntu instead of fedora
- Update the extra packages to reflect the apt vs dnf namings
- Build and add the skopeo binary to the rootfs image
- Minor kubernetes init fix
Fixes#2849
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Revert fedora OS changes made in #2556 as we aren't using it anymore.
- They should be done in main under #2116Fixes#2849
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add skopeo --insecure-policy tag to reflect that ubuntu doesn't
create a default container policy
Fixes#2849
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Add source credentials field to pull_image endpoint
If field is not blank, send to skopeo in image pull command
Add source_creds to agentl-ctl pull command
Fixes: #2653
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Add support for a new source credentials environment variable in the
test script
Add documentation of it into the how-to guide
Fixes#2653
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
- Add a check in setup_bundle to see if the bundle already exists
and if it does then skip the setup.
Fixes: #2617
Co-authored-by: Dave Hay <david_hay@uk.ibm.com>
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Add a pr_id field to the cri-containerd config in versions.yaml
so the CI scripts can use this in the CCv0 builds
Fixes#2576
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This commit add documentation and a script to help people to build, run,
test and demo the CCv0 changes around PullImage on guest.
It is currently limited to the Agent pullimage, but can be expanded
as more code is shared.
Fixes#2574
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
This commit adds the PullImge endpoint to the agent
and the agent-ctl command to test it.
Fixes: #2509
Signed-off-by: Georgina Kinge <georgina.kinge@ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
2021-11-05 14:49:20 +00:00
1121 changed files with 301551 additions and 6468 deletions
# Virtual machine vCPU sizing in Kata Containers 3.0
> Preview:
> [Kubernetes(since 1.23)][1] and [Containerd(since 1.6.0-beta4)][2] will help calculate `Sandbox Size` info and pass it to Kata Containers through annotations.
> In order to adapt to this beneficial change and be compatible with the past, we have implemented the new vCPUs handling way in `runtime-rs`, which is slightly different from the original `runtime-go`'s design.
## When do we need to handle vCPUs size?
vCPUs sizing should be determined by the container workloads. So throughout the life cycle of Kata Containers, there are several points in time when we need to think about how many vCPUs should be at the time. Mainly including the time points of `CreateVM`, `CreateContainer`, `UpdateContainer`, and `DeleteContainer`.
*`CreateVM`: When creating a sandbox, we need to know how many vCPUs to start the VM with.
*`CreateContainer`: When creating a new container in the VM, we may need to hot-plug the vCPUs according to the requirements in container's spec.
*`UpdateContainer`: When receiving the `UpdateContainer` request, we may need to update the vCPU resources according to the new requirements of the container.
*`DeleteContainer`: When a container is removed from the VM, we may need to hot-unplug the vCPUs to reclaim the vCPU resources introduced by the container.
## On what basis do we calculate the number of vCPUs?
When Kata calculate the number of vCPUs, We have three data sources, the `default_vcpus` and `default_maxvcpus` specified in the configuration file (named `TomlConfig` later in the doc), the `io.kubernetes.cri.sandbox-cpu-quota` and `io.kubernetes.cri.sandbox-cpu-period` annotations passed by the upper layer runtime, and the corresponding CPU resource part in the container's spec for the container when `CreateContainer`/`UpdateContainer`/`DeleteContainer` is requested.
Our understanding and priority of these resources are as follows, which will affect how we calculate the number of vCPUs later.
* From `TomlConfig`:
*`default_vcpus`: default number of vCPUs when starting a VM.
*`default_maxvcpus`: maximum number of vCPUs.
* From `Annotation`:
*`InitialSize`: we call the size of the resource passed from the annotations as `InitialSize`. Kubernetes will calculate the sandbox size according to the Pod's statement, which is the `InitialSize` here. This size should be the size we want to prioritize.
* From `Container Spec`:
* The amount of CPU resources that the Container wants to use will be declared through the spec. Including the aforementioned annotations, we mainly consider `cpu quota` and `cpuset` when calculating the number of vCPUs.
*`cpu quota`: `cpu quota` is the most common way to declare the amount of CPU resources. The number of vCPUs introduced by `cpu quota` declared in a container's spec is: `vCPUs = ceiling( quota / period )`.
*`cpuset`: `cpuset` is often used to bind the CPUs that tasks can run on. The number of vCPUs may introduced by `cpuset` declared in a container's spec is the number of CPUs specified in the set that do not overlap with other containers.
## How to calculate and adjust the vCPUs size:
There are two types of vCPUs that we need to consider, one is the number of vCPUs when starting the VM (named `Boot Size` in the doc). The second is the number of vCPUs when `CreateContainer`/`UpdateContainer`/`DeleteContainer` request is received (`Real-time Size` in the doc).
### `Boot Size`
The main considerations are `InitialSize` and `default_vcpus`. There are the following principles:
`InitialSize` has priority over `default_vcpus` declared in `TomlConfig`.
1. When there is such an annotation statement, the originally `default_vcpus` will be modified to the number of vCPUs in the `InitialSize` as the `Boot Size`. (Because not all runtimes support this annotation for the time being, we still keep the `default_cpus` in `TomlConfig`.)
2. When the specs of all containers are aggregated for sandbox size calculation, the method is consistent with the calculation method of `InitialSize` here.
### `Real-time Size`
When we receive an OCI request, it may be for a single container. But what we have to consider is the number of vCPUs for the entire VM. So we will maintain a list. Every time there is a demand for adjustment, the entire list will be traversed to calculate a value for the number of vCPUs. In addition, there are the following principles:
1. Do not cut computing power and try to keep the number of vCPUs specified by `InitialSize`.
* So the number of vCPUs after will not be less than the `Boot Size`.
2.`cpu quota` takes precedence over `cpuset` and the setting history are took into account.
* We think quota describes the CPU time slice that a cgroup can use, and `cpuset` describes the actual CPU number that a cgroup can use. Quota can better describe the size of the CPU time slice that a cgroup actually wants to use. The `cpuset` only describes which CPUs the cgroup can use, but the cgroup can use the specified CPU but consumes a smaller time slice, so the quota takes precedence over the `cpuset`.
* On the one hand, when both `cpu quota` and `cpuset` are specified, we will calculate the number of vCPUs based on `cpu quota` and ignore `cpuset`. On the other hand, if `cpu quota` was used to control the number of vCPUs in the past, and only `cpuset` was updated during `UpdateContainer`, we will not adjust the number of vCPUs at this time.
3.`StaticSandboxResourceMgmt` controls hotplug.
* Some VMMs and kernels of some architectures do not support hotplugging. We can accommodate this situation through `StaticSandboxResourceMgmt`. When `StaticSandboxResourceMgmt = true` is set, we don't make any further attempts to update the number of vCPUs after booting.
- [How to run Kata Containers with `nydus`](how-to-use-virtio-fs-nydus-with-kata.md)
- [How to run Kata Containers with AMD SEV-SNP](how-to-run-kata-containers-with-SNP-VMs.md)
- [How to use EROFS to build rootfs in Kata Containers](how-to-use-erofs-build-rootfs.md)
- [How to run Kata Containers with kinds of Block Volumes](how-to-run-kata-containers-with-kinds-of-Block-Volumes.md)
## Confidential Containers
- [How to use build and test the Confidential Containers `CCv0` proof of concept](how-to-build-and-test-ccv0.md)
- [How to generate a Kata Containers payload for the Confidential Containers Operator](how-to-generate-a-kata-containers-payload-for-the-confidential-containers-operator.md)
which should only show a single `rootfs` directory if the container image was pulled on the guest, not the host
- Looking that `rootfs` directory with
```bash
$ sudo ls -ltr $(sudo find /run/kata-containers/shared/sandboxes/${pod_id}/shared -name rootfs)
```
shows something similar to
```
total 668
-rwxr-xr-x 1 root root 682696 Aug 25 13:58 pause
drwxr-xr-x 2 root root 6 Jan 20 02:01 proc
drwxr-xr-x 2 root root 6 Jan 20 02:01 dev
drwxr-xr-x 2 root root 6 Jan 20 02:01 sys
drwxr-xr-x 2 root root 25 Jan 20 02:01 etc
```
which is clearly the pause container indicating that the `busybox` based container image is not exposed to the host.
### Using a Kata pod sandbox for testing with `agent-ctl` or `ctr shim`
Once you have a kata pod sandbox created as described above, either using
[`crictl`](#using-crictl-for-end-to-end-provisioning-of-a-kata-confidential-containers-pod-with-an-unencrypted-image), or [Kubernetes](#using-kubernetes-for-end-to-end-provisioning-of-a-kata-confidential-containers-pod-with-an-unencrypted-image)
, you can use this to test specific components of the Kata confidential
containers architecture. This can be useful for development and debugging to isolate and test features
that aren't broadly supported end-to-end. Here are some examples:
- In the first terminal run the pull image on guest command against the Kata agent, via the shim (`containerd-shim-kata-v2`).
This can be achieved using the [containerd](https://github.com/containerd/containerd) CLI tool, `ctr`, which can be used to
interact with the shim directly. The command takes the form
`ctr --namespace k8s.io shim --id <sandbox-id> pull-image <image> <new-container-id>` and can been run directly, or through
the `ccv0.sh` script to automatically fill in the variables:
- Optionally, set up some environment variables to set the image and credentials used:
- By default the shim pull image test in `ccv0.sh` will use the `busybox:1.33.1` based test image
`quay.io/kata-containers/confidential-containers:signed` which requires no authentication. To use a different
image, set the `PULL_IMAGE` environment variable e.g.
{"msg":"Command PullImage (1 of 1) returned (Ok(()), false)","level":"INFO","ts":"2021-09-15T08:40:43.828261708-07:00","subsystem":"rpc","pid":"830920","source":"kata-agent-ctl","version":"0.1.0","name":"kata-agent-ctl"}
```
> **Note**: The first time that `~/ccv0.sh agent_pull_image` is run, the `agent-ctl` tool will be built
which may take a few minutes.
- To validate that the image pull was successful, you can open a shell into the Kata guest with:
```bash
$ ~/ccv0.sh open_kata_shell
```
- Check the `/run/kata-containers/` directory to verify that the container image bundle has been created in a directory
named either `01234556789` (for the container id), or the container image name, e.g.
```bash
$ ls -ltr /run/kata-containers/confidential-containers_signed/
```
which should show something like
```
total 72
drwxr-xr-x 10 root root 200 Jan 1 1970 rootfs
-rw-r--r-- 1 root root 2977 Jan 20 16:45 config.json
```
- Leave the Kata shell by running:
```bash
$ exit
```
## Verifying signed images
For this sample demo, we use local attestation to pass through the required
configuration to do container image signature verification. Due to this, the ability to verify images is limited
to a pre-created selection of test images in our test
For pulling images not in this test repository (called an *unprotected* registry below), we fall back to the behaviour
of not enforcing signatures. More documentation on how to customise this to match your own containers through local,
or remote attestation will be available in future.
In our test repository there are three tagged images:
| Test Image | Base Image used | Signature status | GPG key status |
| --- | --- | --- | --- |
| `quay.io/kata-containers/confidential-containers:signed` | `busybox:1.33.1` | [signature](https://github.com/kata-containers/tests/tree/CCv0/integration/confidential/fixtures/quay_verification/x86_64/signatures.tar) embedded in kata rootfs | [public key](https://github.com/kata-containers/tests/tree/CCv0/integration/confidential/fixtures/quay_verification/x86_64/public.gpg) embedded in kata rootfs |
| `quay.io/kata-containers/confidential-containers:unsigned` | `busybox:1.33.1` | not signed | not signed |
| `quay.io/kata-containers/confidential-containers:other_signed` | `nginx:1.21.3` | [signature](https://github.com/kata-containers/tests/tree/CCv0/integration/confidential/fixtures/quay_verification/x86_64/signatures.tar) embedded in kata rootfs | GPG key not kept |
Using a standard unsigned `busybox` image that can be pulled from another, *unprotected*, `quay.io` repository we can
test a few scenarios.
In this sample, with local attestation, we pass in the the public GPG key and signature files, and the [`offline_fs_kbc`
- Again this results in an error message from `crictl`:
`"PullImage from image service failed" err="rpc error: code = Internal desc = Security validate failed: Validate image failed: The signatures do not satisfied! Reject reason: [signature verify failed! There is no pubkey can verify the signature!]" image="quay.io/kata-containers/confidential-containers:other_signed"`
### Using Kubernetes to create a Kata confidential containers pod from the encrypted ssh demo sample image
The [ssh-demo](https://github.com/confidential-containers/documentation/tree/main/demos/ssh-demo) explains how to
demonstrate creating a Kata confidential containers pod from an encrypted image with the runtime created by the
# A new way for Kata Containers to use Kinds of Block Volumes
> **Note:** This guide is only available for runtime-rs with default Hypervisor Dragonball.
> Now, other hypervisors are still ongoing, and it'll be updated when they're ready.
## Background
Currently, there is no widely applicable and convenient method available for users to use some kinds of backend storages, such as File on host based block volume, SPDK based volume or VFIO device based volume for Kata Containers, so we adopt [Proposal: Direct Block Device Assignment](https://github.com/kata-containers/kata-containers/blob/main/docs/design/direct-blk-device-assignment.md) to address it.
## Solution
According to the proposal, it requires to use the `kata-ctl direct-volume` command to add a direct assigned block volume device to the Kata Containers runtime.
And then with the help of method [get_volume_mount_info](https://github.com/kata-containers/kata-containers/blob/099b4b0d0e3db31b9054e7240715f0d7f51f9a1c/src/libs/kata-types/src/mount.rs#L95), get information from JSON file: `(mountinfo.json)` and parse them into structure [Direct Volume Info](https://github.com/kata-containers/kata-containers/blob/099b4b0d0e3db31b9054e7240715f0d7f51f9a1c/src/libs/kata-types/src/mount.rs#L70) which is used to save device-related information.
We only fill the `mountinfo.json`, such as `device` ,`volume_type`, `fs_type`, `metadata` and `options`, which correspond to the fields in [Direct Volume Info](https://github.com/kata-containers/kata-containers/blob/099b4b0d0e3db31b9054e7240715f0d7f51f9a1c/src/libs/kata-types/src/mount.rs#L70), to describe a device.
The JSON file `mountinfo.json` placed in a sub-path `/kubelet/kata-test-vol-001/volume001` which under fixed path `/run/kata-containers/shared/direct-volumes/`.
And the full path looks like: `/run/kata-containers/shared/direct-volumes/kubelet/kata-test-vol-001/volume001`, But for some security reasons. it is
encoded as `/run/kata-containers/shared/direct-volumes/L2t1YmVsZXQva2F0YS10ZXN0LXZvbC0wMDEvdm9sdW1lMDAx`.
Finally, when running a Kata Containers with `ctr run --mount type=X, src=Y, dst=Z,,options=rbind:rw`, the `type=X` should be specified a proprietary type specifically designed for some kind of volume.
Now, supported types:
-`directvol` for direct volume
-`vfiovol` for VFIO device based volume
-`spdkvol` for SPDK/vhost-user based volume
## Setup Device and Run a Kata-Containers
### Direct Block Device Based Volume
#### create raw block based backend storage
> **Tips:** raw block based backend storage MUST be formatted with `mkfs`.
As `/run/kata-containers/shared/direct-volumes/` is a fixed path , we will be able to run a kata pod with `--mount` and set
`src` sub-path. And the `--mount` argument looks like: `--mount type=spdkvol,src=/kubelet/kata-test-vol-001/volume001,dst=/disk001`.
#### Run a Kata container with SPDK vhost-user block device
In the case, `ctr run --mount type=X, src=source, dst=dest`, the X will be set `spdkvol` which is a proprietary type specifically designed for SPDK volumes.
```bash
$ # ctr run with --mount type=spdkvol,src=/kubelet/kata-test-vol-001/volume001,dst=/disk001
| [Using kata-deploy](#kata-deploy-installation) | The preferred way to deploy the Kata Containers distributed binaries on a Kubernetes cluster | **No!** | Best way to give it a try on kata-containers on an already up and running Kubernetes cluster. |
| [Using official distro packages](#official-packages) | Kata packages provided by Linux distributions official repositories | yes | Recommended for most users. |
| ~~[Using snap](#snap-installation)~~ | ~~Easy to install~~ | ~~yes~~ | **Snap is unmaintained!**~~Good alternative to official distro packages.~~ |
| [Automatic](#automatic-installation) | Run a single command to install a full system | **No!** | For those wanting the latest release quickly. |
| [Manual](#manual-installation) | Follow a guide step-by-step to install a working system | **No!** | For those who want the latest release with more control. |
| [Build from source](#build-from-source-installation) | Build the software components manually | **No!** | Power users and developers only. |
@@ -42,27 +41,6 @@ Kata packages are provided by official distribution repositories for:
| [CentOS](centos-installation-guide.md) | 8 |
| [Fedora](fedora-installation-guide.md) | 34 |
### Snap Installation
> **WARNING:**
>
> The Snap package method is **unmaintained** and only provides an old
> version of Kata Containers:
> The [latest Kata Containers snap](https://snapcraft.io/kata-containers)
| [Using kata-deploy](#kata-deploy-installation) | The preferred way to deploy the Kata Containers distributed binaries on a Kubernetes cluster | **No!** | Best way to give it a try on kata-containers on an already up and running Kubernetes cluster. | Yes |
| [Using official distro packages](#official-packages) | Kata packages provided by Linux distributions official repositories | yes | Recommended for most users. | No |
| [Using snap](#snap-installation) | Easy to install | yes | Good alternative to official distro packages. | No |
| [Automatic](#automatic-installation) | Run a single command to install a full system | **No!** | For those wanting the latest release quickly. | No |
| [Manual](#manual-installation) | Follow a guide step-by-step to install a working system | **No!** | For those who want the latest release with more control. | No |
| [Build from source](#build-from-source-installation) | Build the software components manually | **No!** | Power users and developers only. | Yes |
@@ -36,8 +35,6 @@ architectures:
Follow the [`kata-deploy`](../../tools/packaging/kata-deploy/README.md).
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.