Compare commits

...

36 Commits

Author SHA1 Message Date
Fabiano Fidêncio
0ad6f05dee Merge pull request #4024 from bergwolf/2.4.0-branch-bump
# Kata Containers 2.4.0
2022-04-01 13:46:35 +02:00
Peng Tao
4c9c01a124 release: Kata Containers 2.4.0
- stable-2.4 | agent: fix container stop error with signal SIGRTMIN+3
- stable-2.4 | kata-monitor: fix duplicated output when printing usage
- stable-2.4 | runtime: Stop getting OOM events from agent for "ttrpc closed" error
- kata-deploy: fix version bump from -rc to stable
- stable-2.4: release: Include all the rust vendored code into the vendored tarball
- stable-2.4 | tools: release: Do not consider release candidates as stable releases
- agent: Signal the whole process group
- stable-2.4 | docs: Update k8s documentation
- backport main commits to stable 2.4
- stable-2.4: Bump QEMU to 6.2 (bringing SGX support in)
- runtime: Properly handle ESRCH error when signaling container
- stable-2.4 | versions: Upgrade to Cloud Hypervisor v22.1

f2319d69 release: Adapt kata-deploy for 2.4.0
cae48e9c agent: fix container stop error with signal SIGRTMIN+3
342aa95c kata-monitor: fix duplicated output when printing usage
9f75e226 runtime: add logs around sandbox monitor
363fbed8 runtime: stop getting OOM events when ttrpc: closed error
f840de5a workflows,release: Ship *all* the rust vendored code
952cea5f tools: Add a generate_vendor.sh script
cc965fa0 kata-deploy: fix version bump from -rc to stable
f41cc184 tools: release: Do not consider release candidates as stable releases
e059b50f runtime: Add more debug logs for container io stream copy
71ce6f53 agent: Kill all the container processes of the same cgroup
30fc2c86 docs: Update k8s documentation
24028969 virtcontainers: Run mock hook from build tree rather than system bin dir
4e54aa5a doc: fix filename typo
d815393c manager: Add options to change self test behaviour
4111e1a3 manager: Add option to enable component debug
2918be18 manager: Create containerd link
6b31b068 kernel: fix cve-2022-0847
5589b246 doc: update Intel SGX use cases document
1da88dca tools: update QEMU to 6.2
3e2f9223 runtime: Properly handle ESRCH error when signaling container
4c21cb3e versions: Upgrade to Cloud Hypervisor v22.1

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-04-01 06:20:20 +00:00
Peng Tao
f2319d693d release: Adapt kata-deploy for 2.4.0
kata-deploy files must be adapted for a new release. The cases where
this happens are when the release goes from -> to:
* main -> stable:
  * kata-deploy-stable / kata-cleanup-stable: are removed

* stable -> stable:
  * kata-deploy / kata-cleanup: bump the release to the new one.

There are no changes when doing an alpha release, as the files on the
"main" branch always point to the "latest" and "stable" tags.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-04-01 06:20:20 +00:00
Bin Liu
98ccf8f6a1 Merge pull request #4008 from wxx213/stable-2.4
stable-2.4 | agent: fix container stop error with signal SIGRTMIN+3
2022-04-01 11:29:18 +08:00
Wang Xingxing
cae48e9c9b agent: fix container stop error with signal SIGRTMIN+3
The nix::sys::signal::Signal API cannot deal with SIGRTMIN+3, so use
the libc function directly to send the signal.

Fixes: #3990

Signed-off-by: Wang Xingxing <stellarwxx@163.com>
(cherry picked from commit 0d765bd082)
Signed-off-by: Wang Xingxing <stellarwxx@163.com>
2022-03-31 16:49:06 +08:00
snir911
a36103c759 Merge pull request #4003 from fgiudici/kata-monitor_fix_help_backport
stable-2.4 | kata-monitor: fix duplicated output when printing usage
2022-03-30 18:57:17 +03:00
Fabiano Fidêncio
6abbcc551c Merge pull request #3997 from liubin/backport-2.4
stable-2.4 | runtime: Stop getting OOM events from agent for "ttrpc closed" error
2022-03-30 14:08:55 +02:00
Francesco Giudici
342aa95cc8 kata-monitor: fix duplicated output when printing usage
(default: "/run/containerd/containerd.sock") is duplicated when
printing kata-monitor usage:

[root@kubernetes ~]# kata-monitor --help
Usage of kata-monitor:
  -listen-address string
        The address to listen on for HTTP requests. (default ":8090")
  -log-level string
        Log level of logrus(trace/debug/info/warn/error/fatal/panic). (default "info")
  -runtime-endpoint string
        Endpoint of CRI container runtime service. (default: "/run/containerd/containerd.sock") (default "/run/containerd/containerd.sock")

The golang flag package takes care of adding the defaults when printing
usage. Remove the explicit print of the value so that it is not printed
on screen twice.
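
A minimal standalone sketch of the duplication (stdlib only, not the actual kata-monitor source):

```go
package main

import "flag"

func main() {
	// flag.PrintDefaults appends `(default "...")` to every flag's help
	// text on its own, so embedding the default in the usage string (as
	// kata-monitor did) makes it appear twice. Dropping the embedded
	// "(default: ...)" text is the whole fix.
	flag.String("runtime-endpoint", "/run/containerd/containerd.sock",
		`Endpoint of CRI container runtime service. (default: "/run/containerd/containerd.sock")`)
	flag.Usage()
}
```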

Fixes: #3998

Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
(cherry picked from commit a63bbf9793)
2022-03-30 14:02:54 +02:00
bin
9f75e226f1 runtime: add logs around sandbox monitor
For debugging purposes, add some logs.

Fixes: #3815

Signed-off-by: bin <bin@hyper.sh>
2022-03-30 17:11:40 +08:00
bin
363fbed804 runtime: stop getting OOM events when ttrpc: closed error
getOOMEvents is a long-waiting call that retries when it fails. For
cases of agent shutdown, the retry should stop.

Even when the runtime hasn't yet detected that the agent has died, we
can check whether the error is "ttrpc: closed".
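
The shape of the fix, distilled into a runnable sketch (the getOOMEvent stub here stands in for the real sandbox.GetOOMEvent call):

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Stand-in for sandbox.GetOOMEvent: once the agent connection drops,
// the ttrpc transport keeps returning this error on every retry.
func getOOMEvent() (string, error) {
	return "", errors.New("ttrpc: closed")
}

func watchOOMEvents() {
	for {
		containerID, err := getOOMEvent()
		if err != nil {
			// "Dead agent" means the runtime already marked the agent dead;
			// "ttrpc: closed" is what the transport reports when the
			// connection went away before that detection. Retrying is
			// pointless in either case, so stop the loop.
			if err.Error() == "ttrpc: closed" || err.Error() == "Dead agent" {
				fmt.Println("agent has shutdown, stop watching OOM events:", err)
				return
			}
			time.Sleep(time.Second)
			continue
		}
		fmt.Println("OOM event for container", containerID)
	}
}

func main() { watchOOMEvents() }
```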

Fixes: #3815

Signed-off-by: bin <bin@hyper.sh>
2022-03-30 17:11:35 +08:00
Fabiano Fidêncio
54a638317a Merge pull request #3988 from bergwolf/github/kata-deploy
kata-deploy: fix version bump from -rc to stable
2022-03-30 11:01:45 +02:00
Peng Tao
8ce6b12b41 Merge pull request #3993 from fidencio/wip/stable-2.4-release-include-all-rust-vendored-code-to-the-vendored-tarball
stable-2.4: release: Include all the rust vendored code into the vendored tarball
2022-03-30 16:10:47 +08:00
Fabiano Fidêncio
f840de5acb workflows,release: Ship *all* the rust vendored code
Instead of only vendoring the code needed by the agent, let's ensure we
vendor all the needed rust code, and let's do it using the newly
introduced generate_vendor.sh script.

Fixes: #3973

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 3606923ac8)
2022-03-29 23:27:43 +02:00
Fabiano Fidêncio
952cea5f5d tools: Add a generate_vendor.sh script
This script is responsible for generating a tarball with all the rust
vendored code that is needed for fully building kata-containers in a
disconnected environment.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 2eb07455d0)
2022-03-29 23:27:29 +02:00
Peng Tao
cc965fa0cb kata-deploy: fix version bump from -rc to stable
In that case, we should bump from the "latest" tag rather than from
current_version.

Fixes: #3986
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-03-29 03:45:27 +00:00
GabyCT
44b1473d0c Merge pull request #3977 from fidencio/wip/backport-fix-for-3847
stable-2.4 | tools: release: Do not consider release candidates as stable releases
2022-03-28 10:38:47 -06:00
Fupan Li
565efd1bf2 Merge pull request #3975 from bergwolf/github/backport-stable-2.4
agent: Signal the whole process group
2022-03-28 18:26:12 +08:00
Fabiano Fidêncio
f41cc18427 tools: release: Do not consider release candidates as stable releases
During the release of 2.4.0-rc0 @egernst noticed an inconsistency in the
way we handle release tags, as release candidates are being taken as
"stable" releases, while both the kata-deploy tests and the release
action consider them as "latest".

Ideally we should have our own tag for "release candidate", but that's
something that could and should be discussed more extensively outside
the scope of this quick fix.

For now, let's align the code generating the PR for bumping the release
with what we already do as part of the release action and the kata-deploy
test, and tag "-rc" as latest, regardless of which branch it's coming
from.

Fixes: #3847

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 4adf93ef2c)
2022-03-28 11:01:58 +02:00
Feng Wang
e059b50f5c runtime: Add more debug logs for container io stream copy
This can help with debugging container lifecycle issues

Fixes: #3913

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-03-28 16:22:22 +08:00
Feng Wang
71ce6f537f agent: Kill all the container processes of the same cgroup
Otherwise the container process might leak and cause an unclean exit
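
The agent implements this in Rust through its cgroup manager (see the freeze_cgroup/get_pids hunks further down); a minimal Go sketch of the same freeze -> signal -> thaw pattern, assuming a cgroup v1 freezer hierarchy and a hypothetical cgroup path:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
	"syscall"
)

// killCgroup signals every process listed in a cgroup v1 freezer cgroup.
// Freezing first prevents processes from forking between the moment the
// pid list is read and the moment the signal is delivered.
func killCgroup(cgroupPath string, sig syscall.Signal) error {
	state := filepath.Join(cgroupPath, "freezer.state")
	if err := os.WriteFile(state, []byte("FROZEN"), 0o644); err != nil {
		return err
	}
	// Always thaw again, even if signaling fails part-way.
	defer os.WriteFile(state, []byte("THAWED"), 0o644)

	procs, err := os.ReadFile(filepath.Join(cgroupPath, "cgroup.procs"))
	if err != nil {
		return err
	}
	for _, field := range strings.Fields(string(procs)) {
		pid, err := strconv.Atoi(field)
		if err != nil {
			continue
		}
		// ESRCH just means the process exited already; anything else is
		// worth logging, as the agent does.
		if err := syscall.Kill(pid, sig); err != nil && err != syscall.ESRCH {
			fmt.Fprintf(os.Stderr, "signal pid %d failed: %v\n", pid, err)
		}
	}
	return nil
}

func main() {
	// Hypothetical path; the real agent resolves it via its cgroup manager.
	if err := killCgroup("/sys/fs/cgroup/freezer/kata/mycontainer", syscall.SIGKILL); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```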

Fixes: #3913

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-03-28 16:21:51 +08:00
Bin Liu
a2b73b60bd Merge pull request #3960 from cmaf/update-k8s-docs-1-stable-2.4
stable-2.4 | docs: Update k8s documentation
2022-03-25 15:25:25 +08:00
Bin Liu
2ce9ce7b8f Merge pull request #3954 from bergwolf/github/backport-stable-2.4
backport main commits to stable 2.4
2022-03-25 14:45:17 +08:00
Chelsea Mafrica
30fc2c863d docs: Update k8s documentation
Update the documentation with the missing step to untaint the node to
enable scheduling, and update the example to run a pod using the kata
runtime class instead of untrusted workloads, which applies to versions
of CRI-O prior to v1.12.

Fixes #3863

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
(cherry picked from commit 5c434270d1)
2022-03-24 11:22:18 -07:00
David Gibson
24028969c2 virtcontainers: Run mock hook from build tree rather than system bin dir
Unit tests should generally have minimal dependencies on things outside
the build tree.  They *definitely* shouldn't modify system-wide things
outside the build tree.  Currently the runtime "make test" target does
so, though.

Several of the tests in src/runtime/pkg/katautils/hook_test.go require a
sample hook binary.  They expect this hook in
/usr/bin/virtcontainers/bin/test/hook, so the makefile, as root, installs
the test binary to that location.

Go tests automatically run within the package's directory though, so
there's no need to use a system wide path.  We can use a relative path to
the binary build within the tree just as easily.

fixes #3941

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-03-24 12:02:00 +08:00
Garrett Mahin
4e54aa5a7b doc: fix filename typo
Corrects a filename typo in the cluster cleanup section of the
kata-deploy README.md

Fixes: #3869
Signed-off-by: Garrett Mahin <garrett.mahin@gmail.com>
2022-03-24 12:00:17 +08:00
James O. D. Hunt
d815393c3e manager: Add options to change self test behaviour
Added new `kata-manager` options to control the self-test behaviour. By
default, after installation the manager will run a test to ensure a Kata
Containers container can be created. New options allow:

- The self test to be disabled.
- Only the self test to be run (no installation).

These features allow changes to be made to the installed system before
the self test is run.

Fixes: #3851.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-03-24 11:59:48 +08:00
James O. D. Hunt
4111e1a3de manager: Add option to enable component debug
Added a `-d` option to `kata-manager` to enable Kata Containers
and containerd debug.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-03-24 11:59:33 +08:00
James O. D. Hunt
2918be180f manager: Create containerd link
Make the `kata-manager` create a `containerd` link to ensure the
downloaded containerd systemd service file can find the daemon when
using the GitHub packaged version of containerd.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-03-24 11:59:26 +08:00
Julio Montes
6b31b06832 kernel: fix cve-2022-0847
bump guest kernel version to fix cve-2022-0847 "Dirty Pipe"

fixes #3852

Signed-off-by: Julio Montes <julio.montes@intel.com>
2022-03-24 11:58:43 +08:00
Fabiano Fidêncio
53a9cf7dc4 Merge pull request #3927 from fidencio/stable-2.4/qemu-bump
stable-2.4: Bump QEMU to 6.2 (bringing SGX support in)
2022-03-23 07:20:35 +01:00
Julio Montes
5589b246d7 doc: update Intel SGX use cases document
The Installation section is no longer needed because the latest
default Kata kernel supports Intel SGX.
Include QEMU in the list of supported hypervisors.

fixes #3911

Signed-off-by: Julio Montes <julio.montes@intel.com>
(cherry picked from commit 24b29310b2)
2022-03-22 08:36:04 +01:00
Julio Montes
1da88dca4b tools: update QEMU to 6.2
bring in Intel SGX support

Changes that may impact Kata Containers:

Arm:
- The 'virt' machine now supports an emulated ITS
- The 'virt' machine now supports more than 123 CPUs in TCG emulation mode
- The pl031 real-time clock device now supports sending RTC_CHANGE QMP events

PowerPC:
- Improved POWER10 support for the 'powernv' machine
- Initial support for POWER10 DD2.0 CPU added
- Added support for FORM2 PAPR NUMA descriptions in the "pseries" machine type

s390x:
- Improved storage key emulation (e.g. fixed address handling, lazy storage key enablement for TCG, ...)
- New gen16 CPU features are now enabled automatically in the latest machine type

KVM:
- Support for SGX in the virtual machine, using the /dev/sgx_vepc device on the host and the "memory-backend-epc" backend in QEMU.
- New "hv-apicv" CPU property (aliased to "hv-avic") sets the HV_DEPRECATING_AEOI_RECOMMENDED bit in CPUID[0x40000004].EAX.

virtio-mem:
- QEMU now fully supports guest memory dumps with virtio-mem.
- QEMU now cleanly supports precopy migration, postcopy migration and background snapshots with virtio-mem.

fixes #3902

Signed-off-by: Julio Montes <julio.montes@intel.com>
(cherry picked from commit 18d4d7fb1d)
2022-03-22 08:35:45 +01:00
Peng Tao
8cc2231818 Merge pull request #3892 from fengwang666/my_2.4_pr_backport
runtime: Properly handle ESRCH error when signaling container
2022-03-15 10:11:25 +08:00
GabyCT
63c1498f05 Merge pull request #3891 from likebreath/stable-2.4
stable-2.4 | versions: Upgrade to Cloud Hypervisor v22.1
2022-03-14 17:44:09 -06:00
Feng Wang
3e2f9223b0 runtime: Properly handle ESRCH error when signaling container
Currently kata shim v2 doesn't translate the ESRCH error, causing the
container to fail to stop and the shim to leak.
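
A distilled sketch of the swallow logic (the signalProcess stub stands in for the agent RPC; the real change in Container.signalProcess matches on the error text, as the container.go hunk further down shows):

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Stand-in for the agent call: the shim only sees the failure as an
// error string, not as a raw syscall.ESRCH value.
func signalProcess() error {
	return errors.New("ESRCH: No such process")
}

func signalContainer() error {
	err := signalProcess()
	if err != nil && strings.Contains(err.Error(), "ESRCH: No such process") {
		// The process already exited: report success so containerd / CRI-O
		// can finish stopping the container instead of leaking the shim.
		fmt.Println("signal encountered ESRCH, process already finished")
		return nil
	}
	return err
}

func main() {
	if err := signalContainer(); err != nil {
		fmt.Println("stop failed:", err)
	}
}
```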

Fixes: #3874

Signed-off-by: Feng Wang <feng.wang@databricks.com>
(cherry picked from commit aa5ae6b17c)
2022-03-14 13:15:54 -07:00
Bo Chen
4c21cb3eb1 versions: Upgrade to Cloud Hypervisor v22.1
This is a bug fix release. The following issues have been addressed:
1) VFIO ioctl reordering to fix MSI on AMD platforms; 2) Fix virtio-net
control queue.

Details can be found at: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v22.1

Fixes: #3872

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 7a18e32fa7)
2022-03-14 12:34:31 -07:00
28 changed files with 504 additions and 123 deletions

View File

@@ -140,13 +140,10 @@ jobs:
- uses: actions/checkout@v2
- name: generate-and-upload-tarball
run: |
pushd $GITHUB_WORKSPACE/src/agent
cargo vendor >> .cargo/config
popd
tag=$(echo $GITHUB_REF | cut -d/ -f3-)
tarball="kata-containers-$tag-vendor.tar.gz"
pushd $GITHUB_WORKSPACE
tar -cvzf "${tarball}" src/agent/.cargo/config src/agent/vendor
bash -c "tools/packaging/release/generate_vendor.sh ${tarball}"
GITHUB_TOKEN=${{ secrets.GIT_UPLOAD_TOKEN }} hub release edit -m "" -a "${tarball}" "${tag}"
popd

View File

@@ -1 +1 @@
2.4.0-rc0
2.4.0

View File

@@ -104,26 +104,69 @@ $ sudo kubeadm init --ignore-preflight-errors=all --cri-socket /run/containerd/c
$ export KUBECONFIG=/etc/kubernetes/admin.conf
```
You can force Kubelet to use Kata Containers by adding some `untrusted`
annotation to your pod configuration. In our case, this ensures Kata
Containers is the selected runtime to run the described workload.
### Allow pods to run in the master node
`nginx-untrusted.yaml`
```yaml
apiVersion: v1
kind: Pod
By default, the cluster will not schedule pods in the master node. To enable master node scheduling:
```bash
$ sudo -E kubectl taint nodes --all node-role.kubernetes.io/master-
```
### Create runtime class for Kata Containers
Users can use [`RuntimeClass`](https://kubernetes.io/docs/concepts/containers/runtime-class/#runtime-class) to specify a different runtime for Pods.
```bash
$ cat > runtime.yaml <<EOF
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
name: nginx-untrusted
annotations:
io.kubernetes.cri.untrusted-workload: "true"
spec:
containers:
name: kata
handler: kata
EOF
$ sudo -E kubectl apply -f runtime.yaml
```
### Run pod in Kata Containers
If a pod has the `runtimeClassName` set to `kata`, the CRI plugin runs the pod with the
[Kata Containers runtime](../../src/runtime/README.md).
- Create a pod configuration that uses the Kata Containers runtime
```bash
$ cat << EOF | tee nginx-kata.yaml
apiVersion: v1
kind: Pod
metadata:
name: nginx-kata
spec:
runtimeClassName: kata
containers:
- name: nginx
image: nginx
```
Next, you run your pod:
```
$ sudo -E kubectl apply -f nginx-untrusted.yaml
```
EOF
```
- Create the pod
```bash
$ sudo -E kubectl apply -f nginx-kata.yaml
```
- Check pod is running
```bash
$ sudo -E kubectl get pods
```
- Check hypervisor is running
```bash
$ ps aux | grep qemu
```
### Delete created pod
```bash
$ sudo -E kubectl delete -f nginx-kata.yaml
```

View File

@@ -21,20 +21,7 @@ CONFIG_X86_SGX_KVM=y
* [Intel SGX Kubernetes device plugin](https://github.com/intel/intel-device-plugins-for-kubernetes/tree/main/cmd/sgx_plugin#deploying-with-pre-built-images)
> Note: Kata Containers supports creating VM sandboxes with Intel® SGX enabled
> using [cloud-hypervisor](https://github.com/cloud-hypervisor/cloud-hypervisor/) VMM only. QEMU support is waiting to get the
> Intel SGX enabled QEMU upstream release.
## Installation
### Kata Containers Guest Kernel
Follow the instructions to [setup](../../tools/packaging/kernel/README.md#setup-kernel-source-code) and [build](../../tools/packaging/kernel/README.md#build-the-kernel) the experimental guest kernel. Then, install as:
```sh
$ sudo cp kata-linux-experimental-*/vmlinux /opt/kata/share/kata-containers/vmlinux.sgx
$ sudo sed -i 's|vmlinux.container|vmlinux.sgx|g' \
/opt/kata/share/defaults/kata-containers/configuration-clh.toml
```
> using [cloud-hypervisor](https://github.com/cloud-hypervisor/cloud-hypervisor/) and [QEMU](https://www.qemu.org/) VMMs only.
### Kata Containers Configuration
@@ -48,6 +35,8 @@ to the `sandbox` are: `["io.katacontainers.*", "sgx.intel.com/epc"]`.
With the following sample job deployed using `kubectl apply -f`:
> Note: Change the `runtimeClassName` option accordingly, only `kata-clh` and `kata-qemu` support Intel® SGX.
```yaml
apiVersion: batch/v1
kind: Job

View File

@@ -8,8 +8,8 @@ use std::fs::File;
use std::os::unix::io::RawFd;
use tokio::sync::mpsc::Sender;
use nix::errno::Errno;
use nix::fcntl::{fcntl, FcntlArg, OFlag};
use nix::sys::signal::{self, Signal};
use nix::sys::wait::{self, WaitStatus};
use nix::unistd::{self, Pid};
use nix::Result;
@@ -80,7 +80,7 @@ pub struct Process {
pub trait ProcessOperations {
fn pid(&self) -> Pid;
fn wait(&self) -> Result<WaitStatus>;
fn signal(&self, sig: Signal) -> Result<()>;
fn signal(&self, sig: libc::c_int) -> Result<()>;
}
impl ProcessOperations for Process {
@@ -92,8 +92,10 @@ impl ProcessOperations for Process {
wait::waitpid(Some(self.pid()), None)
}
fn signal(&self, sig: Signal) -> Result<()> {
signal::kill(self.pid(), Some(sig))
fn signal(&self, sig: libc::c_int) -> Result<()> {
let res = unsafe { libc::kill(self.pid().into(), sig) };
Errno::result(res).map(drop)
}
}
@@ -281,6 +283,6 @@ mod tests {
// signal to every process in the process
// group of the calling process.
process.pid = 0;
assert!(process.signal(Signal::SIGCONT).is_ok());
assert!(process.signal(libc::SIGCONT).is_ok());
}
}

View File

@@ -19,6 +19,7 @@ use ttrpc::{
};
use anyhow::{anyhow, Context, Result};
use cgroups::freezer::FreezerState;
use oci::{LinuxNamespace, Root, Spec};
use protobuf::{Message, RepeatedField, SingularPtrField};
use protocols::agent::{
@@ -39,9 +40,9 @@ use rustjail::specconv::CreateOpts;
use nix::errno::Errno;
use nix::mount::MsFlags;
use nix::sys::signal::Signal;
use nix::sys::stat;
use nix::unistd::{self, Pid};
use rustjail::cgroups::Manager;
use rustjail::process::ProcessOperations;
use sysinfo::{DiskExt, System, SystemExt};
@@ -69,7 +70,6 @@ use tracing_opentelemetry::OpenTelemetrySpanExt;
use tracing::instrument;
use libc::{self, c_char, c_ushort, pid_t, winsize, TIOCSWINSZ};
use std::convert::TryFrom;
use std::fs;
use std::os::unix::fs::MetadataExt;
use std::os::unix::prelude::PermissionsExt;
@@ -389,7 +389,6 @@ impl AgentService {
let cid = req.container_id.clone();
let eid = req.exec_id.clone();
let s = self.sandbox.clone();
let mut sandbox = s.lock().await;
info!(
sl!(),
@@ -398,27 +397,93 @@ impl AgentService {
"exec-id" => eid.clone(),
);
let p = sandbox.find_container_process(cid.as_str(), eid.as_str())?;
let mut signal = Signal::try_from(req.signal as i32).map_err(|e| {
anyhow!(e).context(format!(
"failed to convert {:?} to signal (container-id: {}, exec-id: {})",
req.signal, cid, eid
))
})?;
// For container initProcess, if it hasn't installed handler for "SIGTERM" signal,
// it will ignore the "SIGTERM" signal sent to it, thus send it "SIGKILL" signal
// instead of "SIGTERM" to terminate it.
if p.init && signal == Signal::SIGTERM && !is_signal_handled(p.pid, req.signal) {
signal = Signal::SIGKILL;
let mut sig: libc::c_int = req.signal as libc::c_int;
{
let mut sandbox = s.lock().await;
let p = sandbox.find_container_process(cid.as_str(), eid.as_str())?;
// For container initProcess, if it hasn't installed handler for "SIGTERM" signal,
// it will ignore the "SIGTERM" signal sent to it, thus send it "SIGKILL" signal
// instead of "SIGTERM" to terminate it.
if p.init && sig == libc::SIGTERM && !is_signal_handled(p.pid, sig as u32) {
sig = libc::SIGKILL;
}
p.signal(sig)?;
}
p.signal(signal)?;
if eid.is_empty() {
// eid is empty, signal all the remaining processes in the container cgroup
info!(
sl!(),
"signal all the remaining processes";
"container-id" => cid.clone(),
"exec-id" => eid.clone(),
);
if let Err(err) = self.freeze_cgroup(&cid, FreezerState::Frozen).await {
warn!(
sl!(),
"freeze cgroup failed";
"container-id" => cid.clone(),
"exec-id" => eid.clone(),
"error" => format!("{:?}", err),
);
}
let pids = self.get_pids(&cid).await?;
for pid in pids.iter() {
let res = unsafe { libc::kill(*pid, sig) };
if let Err(err) = Errno::result(res).map(drop) {
warn!(
sl!(),
"signal failed";
"container-id" => cid.clone(),
"exec-id" => eid.clone(),
"pid" => pid,
"error" => format!("{:?}", err),
);
}
}
if let Err(err) = self.freeze_cgroup(&cid, FreezerState::Thawed).await {
warn!(
sl!(),
"unfreeze cgroup failed";
"container-id" => cid.clone(),
"exec-id" => eid.clone(),
"error" => format!("{:?}", err),
);
}
}
Ok(())
}
async fn freeze_cgroup(&self, cid: &str, state: FreezerState) -> Result<()> {
let s = self.sandbox.clone();
let mut sandbox = s.lock().await;
let ctr = sandbox
.get_container(cid)
.ok_or_else(|| anyhow!("Invalid container id {}", cid))?;
let cm = ctr
.cgroup_manager
.as_ref()
.ok_or_else(|| anyhow!("cgroup manager not exist"))?;
cm.freeze(state)?;
Ok(())
}
async fn get_pids(&self, cid: &str) -> Result<Vec<i32>> {
let s = self.sandbox.clone();
let mut sandbox = s.lock().await;
let ctr = sandbox
.get_container(cid)
.ok_or_else(|| anyhow!("Invalid container id {}", cid))?;
let cm = ctr
.cgroup_manager
.as_ref()
.ok_or_else(|| anyhow!("cgroup manager not exist"))?;
let pids = cm.get_pids()?;
Ok(pids)
}
#[instrument]
async fn do_wait_process(
&self,

View File

@@ -589,12 +589,10 @@ $(GENERATED_FILES): %: %.in $(MAKEFILE_LIST) VERSION .git-commit
generate-config: $(CONFIGS)
test: install-hook go-test
test: hook go-test
install-hook:
hook:
make -C virtcontainers hook
echo "installing mock hook"
sudo -E make -C virtcontainers install
go-test: $(GENERATED_FILES)
go clean -testcache

View File

@@ -21,7 +21,7 @@ import (
const defaultListenAddress = "127.0.0.1:8090"
var monitorListenAddr = flag.String("listen-address", defaultListenAddress, "The address to listen on for HTTP requests.")
var runtimeEndpoint = flag.String("runtime-endpoint", "/run/containerd/containerd.sock", `Endpoint of CRI container runtime service. (default: "/run/containerd/containerd.sock")`)
var runtimeEndpoint = flag.String("runtime-endpoint", "/run/containerd/containerd.sock", "Endpoint of CRI container runtime service.")
var logLevel = flag.String("log-level", "info", "Log level of logrus(trace/debug/info/warn/error/fatal/panic).")
// These values are overridden via ldflags

View File

@@ -776,6 +776,8 @@ func (s *service) Kill(ctx context.Context, r *taskAPI.KillRequest) (_ *ptypes.E
return empty, errors.New("The exec process does not exist")
}
processStatus = execs.status
} else {
r.All = true
}
// According to CRI specs, kubelet will call StopPodSandbox()

View File

@@ -8,12 +8,14 @@ package containerdshim
import (
"context"
"fmt"
"github.com/sirupsen/logrus"
"github.com/containerd/containerd/api/types/task"
"github.com/kata-containers/kata-containers/src/runtime/pkg/katautils"
)
func startContainer(ctx context.Context, s *service, c *container) (retErr error) {
shimLog.WithField("container", c.id).Debug("start container")
defer func() {
if retErr != nil {
// notify the wait goroutine to continue
@@ -78,7 +80,8 @@ func startContainer(ctx context.Context, s *service, c *container) (retErr error
return err
}
c.ttyio = tty
go ioCopy(c.exitIOch, c.stdinCloser, tty, stdin, stdout, stderr)
go ioCopy(shimLog.WithField("container", c.id), c.exitIOch, c.stdinCloser, tty, stdin, stdout, stderr)
} else {
// close the io exit channel, since there is no io for this container,
// otherwise the following wait goroutine will hang on this channel.
@@ -94,6 +97,10 @@ func startContainer(ctx context.Context, s *service, c *container) (retErr error
}
func startExec(ctx context.Context, s *service, containerID, execID string) (e *exec, retErr error) {
shimLog.WithFields(logrus.Fields{
"container": containerID,
"exec": execID,
}).Debug("start container execution")
// start an exec
c, err := s.getContainer(containerID)
if err != nil {
@@ -140,7 +147,10 @@ func startExec(ctx context.Context, s *service, containerID, execID string) (e *
}
execs.ttyio = tty
go ioCopy(execs.exitIOch, execs.stdinCloser, tty, stdin, stdout, stderr)
go ioCopy(shimLog.WithFields(logrus.Fields{
"container": c.id,
"exec": execID,
}), execs.exitIOch, execs.stdinCloser, tty, stdin, stdout, stderr)
go wait(ctx, s, c, execID)

View File

@@ -12,6 +12,7 @@ import (
"syscall"
"github.com/containerd/fifo"
"github.com/sirupsen/logrus"
)
// The buffer size used to specify the buffer for IO streams copy
@@ -86,18 +87,20 @@ func newTtyIO(ctx context.Context, stdin, stdout, stderr string, console bool) (
return ttyIO, nil
}
func ioCopy(exitch, stdinCloser chan struct{}, tty *ttyIO, stdinPipe io.WriteCloser, stdoutPipe, stderrPipe io.Reader) {
func ioCopy(shimLog *logrus.Entry, exitch, stdinCloser chan struct{}, tty *ttyIO, stdinPipe io.WriteCloser, stdoutPipe, stderrPipe io.Reader) {
var wg sync.WaitGroup
if tty.Stdin != nil {
wg.Add(1)
go func() {
shimLog.Debug("stdin io stream copy started")
p := bufPool.Get().(*[]byte)
defer bufPool.Put(p)
io.CopyBuffer(stdinPipe, tty.Stdin, *p)
// notify that we can close process's io safely.
close(stdinCloser)
wg.Done()
shimLog.Debug("stdin io stream copy exited")
}()
}
@@ -105,6 +108,7 @@ func ioCopy(exitch, stdinCloser chan struct{}, tty *ttyIO, stdinPipe io.WriteClo
wg.Add(1)
go func() {
shimLog.Debug("stdout io stream copy started")
p := bufPool.Get().(*[]byte)
defer bufPool.Put(p)
io.CopyBuffer(tty.Stdout, stdoutPipe, *p)
@@ -113,20 +117,24 @@ func ioCopy(exitch, stdinCloser chan struct{}, tty *ttyIO, stdinPipe io.WriteClo
// close stdin to make the other routine stop
tty.Stdin.Close()
}
shimLog.Debug("stdout io stream copy exited")
}()
}
if tty.Stderr != nil && stderrPipe != nil {
wg.Add(1)
go func() {
shimLog.Debug("stderr io stream copy started")
p := bufPool.Get().(*[]byte)
defer bufPool.Put(p)
io.CopyBuffer(tty.Stderr, stderrPipe, *p)
wg.Done()
shimLog.Debug("stderr io stream copy exited")
}()
}
wg.Wait()
tty.close()
close(exitch)
shimLog.Debug("all io stream copy goroutines exited")
}

View File

@@ -7,6 +7,7 @@ package containerdshim
import (
"context"
"github.com/sirupsen/logrus"
"io"
"os"
"path/filepath"
@@ -179,7 +180,7 @@ func TestIoCopy(t *testing.T) {
defer tty.close()
// start the ioCopy threads : copy from src to dst
go ioCopy(exitioch, stdinCloser, tty, dstInW, srcOutR, srcErrR)
go ioCopy(logrus.WithContext(context.Background()), exitioch, stdinCloser, tty, dstInW, srcOutR, srcErrR)
var firstW, secondW, thirdW io.WriteCloser
var firstR, secondR, thirdR io.Reader

View File

@@ -15,7 +15,6 @@ import (
"github.com/containerd/containerd/api/types/task"
"github.com/containerd/containerd/mount"
"github.com/sirupsen/logrus"
"google.golang.org/grpc/codes"
"github.com/kata-containers/kata-containers/src/runtime/pkg/oci"
)
@@ -31,12 +30,17 @@ func wait(ctx context.Context, s *service, c *container, execID string) (int32,
if execID == "" {
//wait until the io closed, then wait the container
<-c.exitIOch
shimLog.WithField("container", c.id).Debug("The container io streams closed")
} else {
execs, err = c.getExec(execID)
if err != nil {
return exitCode255, err
}
<-execs.exitIOch
shimLog.WithFields(logrus.Fields{
"container": c.id,
"exec": execID,
}).Debug("The container process io streams closed")
//This wait could be triggered before exec start which
//will get the exec's id, thus this assignment must after
//the exec exit, to make sure it get the exec's id.
@@ -63,6 +67,7 @@ func wait(ctx context.Context, s *service, c *container, execID string) (int32,
if c.cType.IsSandbox() {
// cancel watcher
if s.monitor != nil {
shimLog.WithField("sandbox", s.sandbox.ID()).Info("cancel watcher")
s.monitor <- nil
}
if err = s.sandbox.Stop(ctx, true); err != nil {
@@ -82,13 +87,17 @@ func wait(ctx context.Context, s *service, c *container, execID string) (int32,
c.exitTime = timeStamp
c.exitCh <- uint32(ret)
shimLog.WithField("container", c.id).Debug("The container status is StatusStopped")
} else {
execs.status = task.StatusStopped
execs.exitCode = ret
execs.exitTime = timeStamp
execs.exitCh <- uint32(ret)
shimLog.WithFields(logrus.Fields{
"container": c.id,
"exec": execID,
}).Debug("The container exec status is StatusStopped")
}
s.mu.Unlock()
@@ -102,6 +111,7 @@ func watchSandbox(ctx context.Context, s *service) {
return
}
err := <-s.monitor
shimLog.WithError(err).WithField("sandbox", s.sandbox.ID()).Info("watchSandbox gets an error or stop signal")
if err == nil {
return
}
@@ -147,13 +157,11 @@ func watchOOMEvents(ctx context.Context, s *service) {
default:
containerID, err := s.sandbox.GetOOMEvent(ctx)
if err != nil {
shimLog.WithError(err).Warn("failed to get OOM event from sandbox")
// If the GetOOMEvent call is not implemented, then the agent is most likely an older version,
// stop attempting to get OOM events.
// for rust agent, the response code is not found
if isGRPCErrorCode(codes.NotFound, err) || err.Error() == "Dead agent" {
if err.Error() == "ttrpc: closed" || err.Error() == "Dead agent" {
shimLog.WithError(err).Warn("agent has shutdown, return from watching of OOM events")
return
}
shimLog.WithError(err).Warn("failed to get OOM event from sandbox")
time.Sleep(defaultCheckInterval)
continue
}

View File

@@ -20,7 +20,7 @@ import (
var testKeyHook = "test-key"
var testContainerIDHook = "test-container-id"
var testControllerIDHook = "test-controller-id"
var testBinHookPath = "/usr/bin/virtcontainers/bin/test/hook"
var testBinHookPath = "../../virtcontainers/hook/mock/hook"
var testBundlePath = "/test/bundle"
func getMockHookBinPath() string {

View File

@@ -12,6 +12,7 @@ import (
"os"
"path/filepath"
"strconv"
"strings"
"syscall"
"time"
@@ -1060,7 +1061,18 @@ func (c *Container) signalProcess(ctx context.Context, processID string, signal
return fmt.Errorf("Container not ready, running or paused, impossible to signal the container")
}
return c.sandbox.agent.signalProcess(ctx, c, processID, signal, all)
// kill(2) method can return ESRCH in certain cases, which is not handled by containerd cri server in container_stop.go.
// CRIO server also doesn't handle ESRCH. So kata runtime will swallow it here.
var err error
if err = c.sandbox.agent.signalProcess(ctx, c, processID, signal, all); err != nil &&
strings.Contains(err.Error(), "ESRCH: No such process") {
c.Logger().WithFields(logrus.Fields{
"container": c.id,
"process-id": processID,
}).Warn("signal encounters ESRCH, process already finished")
return nil
}
return err
}
func (c *Container) winsizeProcess(ctx context.Context, processID string, height, width uint32) error {

View File

@@ -18,6 +18,8 @@ const (
watcherChannelSize = 128
)
var monitorLog = virtLog.WithField("subsystem", "virtcontainers/monitor")
// nolint: govet
type monitor struct {
watchers []chan error
@@ -33,6 +35,9 @@ type monitor struct {
}
func newMonitor(s *Sandbox) *monitor {
// there should only be one monitor for one sandbox,
// so it's safe to let monitorLog as a global variable.
monitorLog = monitorLog.WithField("sandbox", s.ID())
return &monitor{
sandbox: s,
checkInterval: defaultCheckInterval,
@@ -72,6 +77,7 @@ func (m *monitor) newWatcher(ctx context.Context) (chan error, error) {
}
func (m *monitor) notify(ctx context.Context, err error) {
monitorLog.WithError(err).Warn("notify on errors")
m.sandbox.agent.markDead(ctx)
m.Lock()
@@ -85,18 +91,19 @@ func (m *monitor) notify(ctx context.Context, err error) {
// but just in case...
defer func() {
if x := recover(); x != nil {
virtLog.Warnf("watcher closed channel: %v", x)
monitorLog.Warnf("watcher closed channel: %v", x)
}
}()
for _, c := range m.watchers {
monitorLog.WithError(err).Warn("write error to watcher")
// throw away message can not write to channel
// make it not stuck, the first error is useful.
select {
case c <- err:
default:
virtLog.WithField("channel-size", watcherChannelSize).Warnf("watcher channel is full, throw notify message")
monitorLog.WithField("channel-size", watcherChannelSize).Warnf("watcher channel is full, throw notify message")
}
}
}
@@ -104,6 +111,7 @@ func (m *monitor) notify(ctx context.Context, err error) {
func (m *monitor) stop() {
// wait outside of monitor lock for the watcher channel to exit.
defer m.wg.Wait()
monitorLog.Info("stopping monitor")
m.Lock()
defer m.Unlock()
@@ -122,7 +130,7 @@ func (m *monitor) stop() {
// but just in case...
defer func() {
if x := recover(); x != nil {
virtLog.Warnf("watcher closed channel: %v", x)
monitorLog.Warnf("watcher closed channel: %v", x)
}
}()

View File

@@ -321,6 +321,7 @@ func WaitLocalProcess(pid int, timeoutSecs uint, initialSignal syscall.Signal, l
if initialSignal != syscall.Signal(0) {
if err = syscall.Kill(pid, initialSignal); err != nil {
if err == syscall.ESRCH {
logger.WithField("pid", pid).Warnf("kill encounters ESRCH, process already finished")
return nil
}

View File

@@ -143,7 +143,7 @@ $ kubectl -n kube-system wait --timeout=10m --for=delete -l name=kata-deploy pod
After ensuring kata-deploy has been deleted, cleanup the cluster:
```sh
$ kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/tools/packaging/kata-deploy/kata-cleanup/base/kata-cleanup-stabe.yaml
$ kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/tools/packaging/kata-deploy/kata-cleanup/base/kata-cleanup-stable.yaml
```
The cleanup daemon-set will run a single time, cleaning up the node-label, which makes it difficult to check in an automated fashion.

View File

@@ -18,7 +18,7 @@ spec:
katacontainers.io/kata-runtime: cleanup
containers:
- name: kube-kata-cleanup
image: quay.io/kata-containers/kata-deploy:latest
image: quay.io/kata-containers/kata-deploy:2.4.0
imagePullPolicy: Always
command: [ "bash", "-c", "/opt/kata-artifacts/scripts/kata-deploy.sh reset" ]
env:

View File

@@ -16,7 +16,7 @@ spec:
serviceAccountName: kata-label-node
containers:
- name: kube-kata
image: quay.io/kata-containers/kata-deploy:latest
image: quay.io/kata-containers/kata-deploy:2.4.0
imagePullPolicy: Always
lifecycle:
preStop:

View File

@@ -1 +1 @@
89
90

View File

@@ -0,0 +1,81 @@
From 29c4a3363bf287bb9a7b0342b1bc2dba3661c96c Mon Sep 17 00:00:00 2001
From: Fabiano Rosas <farosas@linux.ibm.com>
Date: Fri, 17 Dec 2021 17:57:18 +0100
Subject: [PATCH] Revert "target/ppc: Move SPR_DSISR setting to powerpc_excp"
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This reverts commit 336e91f85332dda0ede4c1d15b87a19a0fb898a2.
It breaks the --disable-tcg build:
../target/ppc/excp_helper.c:463:29: error: implicit declaration of
function cpu_ldl_code [-Werror=implicit-function-declaration]
We should not have TCG code in powerpc_excp because some kvm-only
routines use it indirectly to dispatch interrupts. See
kvm_handle_debug, spapr_mce_req_event and
spapr_do_system_reset_on_cpu.
We can re-introduce the change once we have split the interrupt
injection code between KVM and TCG.
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Message-Id: <20211209173323.2166642-1-farosas@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
target/ppc/excp_helper.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index feb3fd42e2..6ba0840e99 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -464,15 +464,13 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp)
break;
}
case POWERPC_EXCP_ALIGN: /* Alignment exception */
+ /* Get rS/rD and rA from faulting opcode */
/*
- * Get rS/rD and rA from faulting opcode.
- * Note: We will only invoke ALIGN for atomic operations,
- * so all instructions are X-form.
+ * Note: the opcode fields will not be set properly for a
+ * direct store load/store, but nobody cares as nobody
+ * actually uses direct store segments.
*/
- {
- uint32_t insn = cpu_ldl_code(env, env->nip);
- env->spr[SPR_DSISR] |= (insn & 0x03FF0000) >> 16;
- }
+ env->spr[SPR_DSISR] |= (env->error_code & 0x03FF0000) >> 16;
break;
case POWERPC_EXCP_PROGRAM: /* Program exception */
switch (env->error_code & ~0xF) {
@@ -1441,6 +1439,11 @@ void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
int mmu_idx, uintptr_t retaddr)
{
CPUPPCState *env = cs->env_ptr;
+ uint32_t insn;
+
+ /* Restore state and reload the insn we executed, for filling in DSISR. */
+ cpu_restore_state(cs, retaddr, true);
+ insn = cpu_ldl_code(env, env->nip);
switch (env->mmu_model) {
case POWERPC_MMU_SOFT_4xx:
@@ -1456,8 +1459,8 @@ void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr vaddr,
}
cs->exception_index = POWERPC_EXCP_ALIGN;
- env->error_code = 0;
- cpu_loop_exit_restore(cs, retaddr);
+ env->error_code = insn & 0x03FF0000;
+ cpu_loop_exit(cs);
}
#endif /* CONFIG_TCG */
#endif /* !CONFIG_USER_ONLY */
--
GitLab

View File

@@ -0,0 +1,53 @@
#!/usr/bin/env bash
#
# Copyright (c) 2022 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#
set -o errexit
set -o nounset
set -o pipefail
script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
script_name="$(basename "${BASH_SOURCE[0]}")"
# This is very much error prone in case we re-structure our
# repos again, but it's also used in a few other places :-/
repo_dir="${script_dir}/../../.."
function usage() {
cat <<EOF
Usage: ${script_name} tarball-name
This script creates a tarball with all the cargo vendored code
that a distro would need to do a full build of the project in
a disconnected environment, generating a "tarball-name" file.
EOF
}
create_vendor_tarball() {
vendor_dir_list=""
pushd ${repo_dir}
for i in $(find . -name 'Cargo.lock'); do
dir="$(dirname $i)"
pushd "${dir}"
[ -d .cargo ] || mkdir .cargo
cargo vendor >> .cargo/config
vendor_dir_list+=" $dir/vendor $dir/.cargo/config"
echo "${vendor_dir_list}"
popd
done
popd
tar -cvzf ${1} ${vendor_dir_list}
}
main () {
[ $# -ne 1 ] && usage && exit 0
create_vendor_tarball ${1}
}
main "$@"

View File

@@ -68,7 +68,6 @@ generate_kata_deploy_commit() {
kata-deploy files must be adapted to a new release. The cases where it
happens are when the release goes from -> to:
* main -> stable:
* kata-deploy / kata-cleanup: change from \"latest\" to \"rc0\"
* kata-deploy-stable / kata-cleanup-stable: are removed
* stable -> stable:
@@ -161,7 +160,7 @@ bump_repo() {
# +----------------+----------------+
# | from | to |
# -------------------+----------------+----------------+
# kata-deploy | "latest" | "rc0" |
# kata-deploy | "latest" | "latest" |
# -------------------+----------------+----------------+
# kata-deploy-stable | "stable" | REMOVED |
# -------------------+----------------+----------------+
@@ -183,29 +182,34 @@ bump_repo() {
info "Updating kata-deploy / kata-cleanup image tags"
local version_to_replace="${current_version}"
local replacement="${new_version}"
if [ "${target_branch}" == "main" ]; then
local need_commit=false
if [ "${target_branch}" == "main" ];then
if [[ "${new_version}" =~ "rc" ]]; then
## this is the case 2) where we remove te kata-deploy / kata-cleanup stable files
## We are bumping from alpha to RC, should drop kata-deploy-stable yamls.
git rm "${kata_deploy_stable_yaml}"
git rm "${kata_cleanup_stable_yaml}"
else
## this is the case 1) where we just do nothing
replacement="latest"
need_commit=true
fi
version_to_replace="latest"
fi
if [ "${version_to_replace}" != "${replacement}" ]; then
## this covers case 2) and 3), as on both of them we have changes on kata-deploy / kata-cleanup files
sed -i "s#${registry}:${version_to_replace}#${registry}:${new_version}#g" "${kata_deploy_yaml}"
sed -i "s#${registry}:${version_to_replace}#${registry}:${new_version}#g" "${kata_cleanup_yaml}"
elif [ "${new_version}" != *"rc"* ]; then
## We are on a stable branch and creating new stable releases.
## Need to change kata-deploy / kata-cleanup to use the stable tags.
if [[ "${version_to_replace}" =~ "rc" ]]; then
## Coming from "rcX" so from the latest tag.
version_to_replace="latest"
fi
sed -i "s#${registry}:${version_to_replace}#${registry}:${replacement}#g" "${kata_deploy_yaml}"
sed -i "s#${registry}:${version_to_replace}#${registry}:${replacement}#g" "${kata_cleanup_yaml}"
git diff
git add "${kata_deploy_yaml}"
git add "${kata_cleanup_yaml}"
need_commit=true
fi
if [ "${need_commit}" == "true" ]; then
info "Creating the commit with the kata-deploy changes"
local commit_msg="$(generate_kata_deploy_commit $new_version)"
git commit -s -m "${commit_msg}"

View File

@@ -250,7 +250,6 @@ generate_qemu_options() {
qemu_options+=(size:--disable-auth-pam)
# Disable unused filesystem support
[ "$arch" == x86_64 ] && qemu_options+=(size:--disable-fdt)
qemu_options+=(size:--disable-glusterfs)
qemu_options+=(size:--disable-libiscsi)
qemu_options+=(size:--disable-libnfs)
@@ -303,7 +302,6 @@ generate_qemu_options() {
;;
esac
qemu_options+=(size:--disable-qom-cast-debug)
qemu_options+=(size:--disable-tcmalloc)
# Disable libudev since it is only needed for qemu-pr-helper and USB,
# none of which are used with Kata

View File

@@ -208,11 +208,14 @@ Description: Install $kata_project [1] (and optionally $containerd_project [2])
Options:
-c <version> : Specify containerd version.
-d : Enable debug for all components.
-f : Force installation (use with care).
-h : Show this help statement.
-k <version> : Specify Kata Containers version.
-o : Only install Kata Containers.
-r : Don't cleanup on failure (retain files).
-t : Disable self test (don't try to create a container after install).
-T : Only run self test (do not install anything).
Notes:
@@ -402,13 +405,21 @@ install_containerd()
sudo tar -C /usr/local -xvf "${file}"
sudo ln -sf /usr/local/bin/ctr "${link_dir}"
for file in \
/usr/local/bin/containerd \
/usr/local/bin/ctr
do
sudo ln -sf "$file" "${link_dir}"
done
info "$project installed\n"
}
configure_containerd()
{
local enable_debug="${1:-}"
[ -z "$enable_debug" ] && die "no enable debug value"
local project="$containerd_project"
info "Configuring $project"
@@ -460,26 +471,55 @@ configure_containerd()
info "Backed up $cfg to $original"
}
local modified="false"
# Add the Kata Containers configuration details:
local comment_text
comment_text=$(printf "%s: Added by %s\n" \
"$(date -Iseconds)" \
"$script_name")
sudo grep -q "$kata_runtime_type" "$cfg" || {
cat <<-EOT | sudo tee -a "$cfg"
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "${kata_runtime_name}"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.${kata_runtime_name}]
runtime_type = "${kata_runtime_type}"
EOT
# $comment_text
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "${kata_runtime_name}"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.${kata_runtime_name}]
runtime_type = "${kata_runtime_type}"
EOT
info "Modified $cfg"
modified="true"
}
if [ "$enable_debug" = "true" ]
then
local debug_enabled
debug_enabled=$(awk -v RS='' '/\[debug\]/' "$cfg" |\
grep -E "^\s*\<level\>\s*=\s*.*\<debug\>" || true)
[ -n "$debug_enabled" ] || {
cat <<-EOT | sudo tee -a "$cfg"
# $comment_text
[debug]
level = "debug"
EOT
}
modified="true"
fi
[ "$modified" = "true" ] && info "Modified $cfg"
sudo systemctl enable containerd
sudo systemctl start containerd
info "Configured $project\n"
local msg="disabled"
[ "$enable_debug" = "true" ] && msg="enabled"
info "Configured $project (debug $msg)\n"
}
install_kata()
@@ -540,11 +580,48 @@ install_kata()
info "$project installed\n"
}
configure_kata()
{
local enable_debug="${1:-}"
[ -z "$enable_debug" ] && die "no enable debug value"
[ "$enable_debug" = "false" ] && \
info "Using default $kata_project configuration" && \
return 0
local config_file='configuration.toml'
local kata_dir='/etc/kata-containers'
sudo mkdir -p "$kata_dir"
local cfg_from
local cfg_to
cfg_from="${kata_install_dir}/share/defaults/kata-containers/${config_file}"
cfg_to="${kata_dir}/${config_file}"
[ -e "$cfg_from" ] || die "cannot find $kata_project configuration file"
sudo install -o root -g root -m 0644 "$cfg_from" "$cfg_to"
sudo sed -i \
-e 's/^# *\(enable_debug\).*=.*$/\1 = true/g' \
-e 's/^kernel_params = "\(.*\)"/kernel_params = "\1 agent.log=debug initcall_debug"/g' \
"$cfg_to"
info "Configured $kata_project for full debug (delete $cfg_to to use pristine $kata_project configuration)"
}
handle_kata()
{
local version="${1:-}"
install_kata "$version"
local enable_debug="${2:-}"
[ -z "$enable_debug" ] && die "no enable debug value"
install_kata "$version" "$enable_debug"
configure_kata "$enable_debug"
kata-runtime --version
}
@@ -556,6 +633,9 @@ handle_containerd()
local force="${2:-}"
[ -z "$force" ] && die "need force value"
local enable_debug="${3:-}"
[ -z "$enable_debug" ] && die "no enable debug value"
local ret
if [ "$force" = "true" ]
@@ -572,7 +652,7 @@ handle_containerd()
fi
fi
configure_containerd
configure_containerd "$enable_debug"
containerd --version
}
@@ -617,20 +697,32 @@ handle_installation()
local only_kata="${3:-}"
[ -z "$only_kata" ] && die "no only Kata value"
local enable_debug="${4:-}"
[ -z "$enable_debug" ] && die "no enable debug value"
local disable_test="${5:-}"
[ -z "$disable_test" ] && die "no disable test value"
local only_run_test="${6:-}"
[ -z "$only_run_test" ] && die "no only run test value"
# These params can be blank
local kata_version="${4:-}"
local containerd_version="${5:-}"
local kata_version="${7:-}"
local containerd_version="${8:-}"
[ "$only_run_test" = "true" ] && test_installation && return 0
setup "$cleanup" "$force"
handle_kata "$kata_version"
handle_kata "$kata_version" "$enable_debug"
[ "$only_kata" = "false" ] && \
handle_containerd \
"$containerd_version" \
"$force"
"$force" \
"$enable_debug"
test_installation
[ "$disable_test" = "false" ] && test_installation
if [ "$only_kata" = "true" ]
then
@@ -647,21 +739,27 @@ handle_args()
local cleanup="true"
local force="false"
local only_kata="false"
local disable_test="false"
local only_run_test="false"
local enable_debug="false"
local opt
local kata_version=""
local containerd_version=""
while getopts "c:fhk:or" opt "$@"
while getopts "c:dfhk:ortT" opt "$@"
do
case "$opt" in
c) containerd_version="$OPTARG" ;;
d) enable_debug="true" ;;
f) force="true" ;;
h) usage; exit 0 ;;
k) kata_version="$OPTARG" ;;
o) only_kata="true" ;;
r) cleanup="false" ;;
t) disable_test="true" ;;
T) only_run_test="true" ;;
esac
done
@@ -674,6 +772,9 @@ handle_args()
"$cleanup" \
"$force" \
"$only_kata" \
"$enable_debug" \
"$disable_test" \
"$only_run_test" \
"$kata_version" \
"$containerd_version"
}

View File

@@ -75,7 +75,7 @@ assets:
url: "https://github.com/cloud-hypervisor/cloud-hypervisor"
uscan-url: >-
https://github.com/cloud-hypervisor/cloud-hypervisor/tags.*/v?(\d\S+)\.tar\.gz
version: "v22.0"
version: "v22.1"
firecracker:
description: "Firecracker micro-VMM"
@@ -88,8 +88,8 @@ assets:
qemu:
description: "VMM that uses KVM"
url: "https://github.com/qemu/qemu"
version: "v6.1.0"
tag: "v6.1.0"
version: "v6.2.0"
tag: "v6.2.0"
# Do not include any non-full release versions
# Break the line *without CR or space being appended*, to appease
# yamllint, and note the deliberate ' ' at the end of the expression.
@@ -153,7 +153,7 @@ assets:
kernel:
description: "Linux kernel optimised for virtual machines"
url: "https://cdn.kernel.org/pub/linux/kernel/v5.x/"
version: "v5.15.23"
version: "v5.15.26"
tdx:
description: "Linux kernel that supports TDX"
url: "https://github.com/intel/tdx/archive/refs/tags"