allocate_hugepages() writes to the kernel sysfs file to allocate hugepages
in the Kata VM. However, even if the write succeeds, it's not certain that
the kernel will actually be able to allocate as many hugepages as we
requested.
This patch reads back the file after writing it to check if we were able to
allocate all the required hugepages.
fixes#3816
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
allocate_hugepages() constructs the path for the sysfs directory containing
hugepage configuration, then attempts to create this directory if it does
not exist.
This doesn't make sense: sysfs is a view into kernel configuration, if the
kernel has support for the hugepage size, the directory will already be
there, if it doesn't, trying to create it won't help.
For the same reason, attempting to create the "nr_hugepages" file
itself is pointless, so there's no reason to call
OpenOptions::create(true).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The content about systemd in "/proc/self/cgroup" is as:
1:name=systemd:/kubepods/pod1815643d-3789-4e4e-aaf4-00de024912e1/0e15a65bd5f7b30a0b818d90706212354d8b3f0998a1495473c3be9a24706ccf
and in "/prol/self/mountinfo" is as:
30 29 0:26 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:6 - cgroup cgroup rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
The keys extracted from the two files are the same as "name=systemd". So no need to rename the key to "systemd".
Fixes: #3385
Signed-off-by: sailorvii <challengingway@hotmail.com>
Mount hugepage directories and configure the requested number of hugepages
dynamically by writing to sysfs files
Port from:
78b307b5bdFixes: #3342
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: bin <bin@hyper.sh>
Current hook process is handled by just calling
unwrap() on it, sometime it will cause panic.
By handling all Result type and check the error can
avoid panic.
Fixes: #3649
Signed-off-by: bin <bin@hyper.sh>
Envs contain null-byte will cause running hooks to panic,
this commit will filter envs and only pass valid envs to hooks.
Fixes: #3667
Signed-off-by: bin <bin@hyper.sh>
Pulling image is the most time-consuming step in the container lifecycle. This PR
introduse nydus to kata container, it can lazily pull image when container start. So it
can speed up kata container create and start.
Fixes#2724
Signed-off-by: luodaowen.backend <luodaowen.backend@bytedance.com>
In commit 78dff468bf1 we introduced logic to rewrite PCIDEVICE_ environment
variables for the container so that they contain correct addresses for the
Kata VM rather than for the host. Unfortunately, we never actually invoked
the function to do this.
It turns out we need to do this not only at container creation time, but
also for environment variables supplied to processes exec-ed into the
container after creation (e.g. with crictl exec). Add calls to make both
those updates.
fixes#3634
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
add_devices() generates a mapping of host to guest PCI addresses which is
used to update some environment variables for the workload. Currently it
just does this locally, but it turns out we're going to need the same map
again in order to correct environment variables for processes exec-ed into
the existing container.
Move the map to the sandbox structure so we can keep it around for those
later uses.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This function updates PCIDEVICE_ environment variables (such as those
supplied by the Kubernetes SR-IOV plugin) in the OCI spec to be correct
for the Kata VM, rather than for the host.
We neglected to actually call this function, however, and it turns out that
when we do, we need to do things slightly different. We actually need to
adjust envionment variables both in the OCI spec when creating a container
and also in the variables supplied for exec-ing a new process within an
existing container.
Adjust the function so that it can be used for both these cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
1. The hook.args[0] is the hook binary name which shouldn't be included
in the Command.args.
2. Add new unit tests
Fixes: #2610
Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
After the protocols are moved to upper libs (PR3355),
the runtime protocol generation is broken. This fixes it.
Fixes: #3414
Signed-off-by: Feng Wang <feng.wang@databricks.com>
move the protocols to upper libs thus it can
be shared between agent and other rust runtime.
Depends-on: github.com/kata-containers/tests#4306
Fixes: #3348
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
Move the oci crate to upper libs thus it can be
shared between agent and other rust runtimes.
Fixes: #3348
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
The tokio's spawn will only create an future async task
instead of a new real thread, thus executing unshare to
create a new namespace in tokio's async task would make
the agent process to join in the new created namespace,
which isn't expected.
Thus, we'd better to to the unshare in a real thread to
prevent moving the agent process into a new namespace.
Fixes: #3369
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
For calls from shim to agent, the return error will be processed like this:
match self.do_start_container(req).await {
Err(e) => Err(ttrpc_error(ttrpc::Code::INTERNAL, e.to_string())),
Ok(_) => Ok(Empty::new()),
}
The e.to_string() return only a part of the error(for example set by context()),
this may lead lack of information.
The `format!("{:?}", err)` will return more info.
Fixes: #3353
Signed-off-by: bin <bin@hyper.sh>
We already have a `mount_path` local Path variable which holds the mount
point.
Use it instead of creating a new `mount_point` variable with identical
type and content.
Fixes: #3332
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Make it clear when reading the table in the agent's "Change the agent
API" documentation that the commands in the "Generation method" column
should be run in the agent repo.
Fixes: #3317.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Move the architecture document into a new `docs/design/architecture/` directory
in preparation for splitting it into more manageable pieces.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
In function `update_target`, if the updated source is a directory,
we should create the corresponding directory.
Fixes: #3140
Signed-off-by: bin <bin@hyper.sh>
Wrap `nix` `Error`'s in an `anyhow` error for consistency with the way
`rustjail` handles errors.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Replace `Result` values that use a "bare" `nix` `Error` like this:
```rust
return Err(nix::Error::EINVAL.into());
```
... to the following which wraps the nix` error in an `anyhow` call for
consistency with the other errors returned by `rustjail`:
```rust
return Err(anyhow!(nix::Error::EINVAL));
```
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Remove a bare `return` from a test function. This looks wrong but isn't
because the callers are all tests that just wait for a state change
caused by this test function.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Running `cargo audit` showed that the `nix` package for the agent and
the `rustjail` and `vsock-exporter` local crates need to be updated to
resolve rust security issue
[RUSTSEC-2021-0119](https://rustsec.org/advisories/RUSTSEC-2021-0119).
Hence, bumped `nix` to the latest version (which required changes to
work with the new, simpler `errno` handling).
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Run `cargo update` to update to the latest crate dependency versions.
The agent is an application so this includes expanding the partially
specified semvers to full semver values for the following crates,
which makes those crates consistent with the other agent dependencies:
- `futures`
- `regex`
- `scan_fmt`
- `tokio`
Fixes: #3124.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Remove the format specifier in the `"failed to get VFIO group"` error
returned by `vfio_device_handler()`.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>