When a storage device is used by more than one container, the second
and subsequent instances leak a reference count on the storage device,
and thus leak the storage device itself. The reason is that
add_storages() increases the reference count of an existing storage
device but forgets to add the device to the `mount_list` array, thus
leaking the reference count.
Fixes: #7820
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Today I learned, I must say.
When running a basic script, such as:
```bash
#!/usr/bin/env bash
set -o errexit
set -o pipefail
set -o errtrace
cat junk && echo "hello"
echo "didn't fail"
cat junk
echo "hello"
echo "didn't fail"
```
One will get as a result:
```bash
cat: junk: No such file or directory
didn't fail
cat: junk: No such file or directory
```
Meaning that although there was an error on `cat junk && echo "hello"`,
and the `echo "hello"` part was not executed, an error was not reported
for that failure.
On the second part, though, it just breaks and returns an error as
expected.
Small scripts aside, this is exactly what was happening with the
attestation-agent, where a `make ... && make install ...` was being
called, make was failing but not actually breaking the script.
Let's change the logic and avoid such situations in the future, as it
caused our CI to be broken for quite some time without a simple way to
detect that line in the huge amount of logs left behind.
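The difference between the chained and the split form can be sketched as follows. This is only an illustration, not the actual change: `false` is a stand-in for the failing `make ...`, and `echo installed` is a stand-in for `make install ...`.

```bash
#!/usr/bin/env bash

# `false` stands in for a failing `make ...`;
# `echo installed` stands in for `make install ...`.

# Chained with &&: errexit ignores the failure on the left-hand side,
# so the child script keeps running and exits successfully.
chained_out=$(bash -c 'set -o errexit; false && echo installed; echo "still running"')
chained_status=$?

# Split into separate commands: errexit aborts at the failing command.
split_status=0
split_out=$(bash -c 'set -o errexit; false; echo installed; echo "still running"') || split_status=$?

echo "chained: status=${chained_status} output='${chained_out}'"
echo "split:   status=${split_status} output='${split_out}'"
```

Running each command on its own line lets `errexit` stop the script at the first failure, which is the behaviour we actually want here.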
Here goes a reference to the documentation:
```
-e Exit immediately if a pipeline (which may consist
of a single simple command), a list, or a compound
command (see SHELL GRAMMAR above), exits with a
non-zero status. The shell does not exit if the
command that fails is part of the command list
immediately following a while or until keyword,
part of the test following the if or elif reserved
words, part of any command executed in a && or ||
list except the command following the final && or
||, any command in a pipeline but the last, or if
the command's return value is being inverted with
!. If a compound command other than a subshell
returns a non-zero status because a command failed
while -e was being ignored, the shell does not
exit. A trap on ERR, if set, is executed before
the shell exits. This option applies to the shell
environment and each subshell environment
separately (see COMMAND EXECUTION ENVIRONMENT
above), and may cause subshells to exit before
executing all the commands in the subshell.
If a compound command or shell function executes
in a context where -e is being ignored, none of
the commands executed within the compound command
or function body will be affected by the -e
setting, even if -e is set and a command returns a
failure status. If a compound command or shell
function sets -e while executing in a context
where -e is ignored, that setting will not have
any effect until the compound command or the
command containing the function call completes.
```
This comes from https://www.man7.org/linux/man-pages/man1/bash.1.html
Fixes: #7793
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Bump image-rs and attestation-agent to use the latest guest-components,
which includes the Rust clap version fix.
Fixes: #7580
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
Refine storage related code by:
- remove the STORAGE_HANDLER_LIST
- define type alias
- move code near to its caller
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Introduce StorageDevice and StorageHandlerManager, which will be used
to refine storage device management for kata-agent.
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Simplify the way to manage storage objects, and introduce
StorageStateCommon structures for coming extensions.
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
docker install now creates a group with gid 999 which happens to match what we
need to get docker-in-docker to work. Remove the group first as we don't need
it.
Fixes: #7726
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
(cherry picked from commit 3b881fbc0e)
The directory is a host path mount and cannot be removed from within the
container. What we actually want to remove is whatever is inside that
directory.
Trying to remove the directory itself may raise errors like:
```
rm: cannot remove '/opt/kata/': Device or resource busy
```
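A minimal sketch of removing only the contents of a directory while keeping the directory itself. It assumes a temporary directory as a stand-in for the host path mount, and the file names are made up for the example; the actual cleanup code may be written differently.

```bash
# Hypothetical stand-in for the host path mount (/opt/kata in the error above).
target_dir=$(mktemp -d)
mkdir -p "${target_dir}/bin"
touch "${target_dir}/bin/example" "${target_dir}/.hidden"

# Delete everything inside the directory, dotfiles included,
# but leave the directory itself in place.
find "${target_dir}" -mindepth 1 -delete

remaining=$(find "${target_dir}" -mindepth 1 | wc -l)
echo "entries left in ${target_dir}: ${remaining}"
```

Using `-mindepth 1` keeps `find` from ever trying to delete the mount point, which is the operation that fails with "Device or resource busy".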
Fixes: #7746
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Introduce the KataVirtualVolume structure to encapsulate information
for extra mount options and direct volumes, so that we can build a
common infrastructure to handle these cases.
Fixes: #7699
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
We can simply use `rm -f` all over the place and avoid the container
returning any error.
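As a small illustration of why `rm -f` helps, assuming a path that does not exist (the file name below is made up for the example):

```bash
# Hypothetical path that is guaranteed not to exist.
workdir=$(mktemp -d)
missing="${workdir}/does-not-exist"

# Plain rm fails when the file is missing.
plain_status=0
rm "${missing}" 2>/dev/null || plain_status=$?

# rm -f ignores missing files and exits successfully.
forced_status=0
rm -f "${missing}" || forced_status=$?

echo "rm: ${plain_status}, rm -f: ${forced_status}"
```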
Fixes: #7733
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 5cba38c175)
Move image service related code into image-rpc.rs, to simplify
maintenance.
Fixes: #7633
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Co-authored-by: wllenyj <wllenyj@linux.alibaba.com>
Co-authored-by: jordan9500 <jordan.jackson@ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
The default `kata` runtime class would get created with the `kata`
handler instead of `kata-$KATA_HYPERVISOR`. This made Kata use the wrong
hypervisor and broke CI.
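The intended result can be sketched roughly as below. The function name `render_default_runtime_class` and the `qemu` fallback are assumptions made for the example, not the actual kata-deploy code; only the `kata` name and the `kata-$KATA_HYPERVISOR` handler come from the description above.

```bash
# Assumed default for illustration only; normally provided by the environment.
KATA_HYPERVISOR="${KATA_HYPERVISOR:-qemu}"

# Hypothetical helper: render the default `kata` runtime class so that it
# points at the hypervisor-specific handler instead of the plain `kata` one.
render_default_runtime_class() {
    cat <<EOF
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata-${KATA_HYPERVISOR}
EOF
}

runtime_class_yaml=$(render_default_runtime_class)
echo "${runtime_class_yaml}"
```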
Fixes: #7681
Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
Without this library the builds are failing with the following error:
```
...
error: failed to run custom build command for `devicemapper-sys v0.1.5`
Caused by: process didn't exit successfully:
`/kata-containers/src/agent/target/release/build/devicemapper-sys-d8eae524a127e049/build-script-build`
(exit status: 101) --- stderr thread 'main' panicked at 'Unable to
find libclang: "couldn't find any valid shared libraries matching:
['libclang.so', 'libclang-*.so', 'libclang.so.*', 'libclang-*.so.*'],
set the `LIBCLANG_PATH` environment variable to a path where one of
these files can be found (invalid: [])"',
/root/.cargo/registry/src/github.com-1ecc6299db9ec823/bindgen-0.63.0/./lib.rs:2338:31
```
Fixes: #7580
Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
After image-rs added the image-block-device integrity check using
dm-verity, a new dependency is needed, so install it.
Refer to the following PR for more information:
https://github.com/confidential-containers/guest-components/pull/270
Fixes: #7580
Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
Add k0s support to kata-deploy, in the very same way kata-containers
already supports k3s and rke2.
k0s support requires v1.27.1, which is noted as part of the kata-deploy
documentation, as it's the way to use dynamic configuration on
containerd CRI runtimes.
This support will only be part of the `main` branch, as it's not a bug
fix that can be backported to the `stable-3.2` branch, and this is also
noted as part of the documentation.
Fixes: #7548
Signed-off-by: Steve Fan <29133953+stevefan1999-personal@users.noreply.github.com>
The GHA runners are not exactly powerful, which makes the static-checks
take way too long (almost an hour).
Let's give it a try and move those to Azure instances of the same size
as the ones used as part of our CI, which will probably reduce this time.
Fixes: #7446
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>