Modify some path variables to be functions that return the path
with the rootless directory prefix if running rootlessly.
Fixes: #1827
Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
Fixes: #2023
We can get OCI spec config from bundle instead of annotations, so this
field isn't necessary.
Signed-off-by: Wei Zhang <weizhang555.zw@gmail.com>
The container store should be deleted when new/create is failed if the
store is newly created.
Fixes: #2013
Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>
When a new sandbox is created, join to its cgroup path
this will create all proxy, shim, etc in the sandbox cgroup.
Fixes: #1879
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
When create container failed, it should delete the container
config from sandbox, otherwise, the following new creating container
would get a wrong resources caculating which would contain the previous
failed container resources such as memory and cpu.
Fixes: #1997
Signed-off-by: lifupan <lifupan@gmail.com>
When force is true, ignore any guest related errors. This can
be used to stop a sandbox when hypervisor process is dead.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
It requires root to manipulate netns and otherwise fails
like below:
=== RUN TestStartNetworkMonitor
--- FAIL: TestStartNetworkMonitor (0.00s)
Error Trace: sandbox_test.go:1481
Error: Expected nil, but got: &errors.errorString{s:"Error switching to ns /proc/6648/task/6651/ns/net: operation not permitted"}
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Convert virtcontainers tests to testify/assert to make the virtcontainers
tests more readable.
fixes#156
Signed-off-by: Julio Montes <julio.montes@intel.com>
Fixed `TestSandboxCreationFromConfigRollbackFromCreateSandbox` which
requires that the hypervisor does not exist. Unfortunately, it does
exist (as a fake test binary), but isn't executable meaning although the
test failed (since an error is expected), rather than the expected
`ENOENT` error, the test was logging a message similar to the following
since the fake hypervisor exists with non-executable permissions:
```
Unable to launch /tmp/vc-tmp-526112270/hypervisor: fork/exec /tmp/vc-tmp-526112270/hypervisor: permission denied
```
Fixes: #1835.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
When using experimental feature "newstore", we save and load devices
information from `persist.json` instead of `devices.json`, in such case,
file `devices.json` isn't needed anymore, so remove it.
Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
This reverts commit 196661bc0d.
Reverting because cri-o with devicemapper started
to fail after this commit was merged.
Fixes: #1574.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
We can use the same data structure to describe both of them.
So that we can handle them similarly.
Fixes: #1566
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Fixes: #1422
Detect failing test case:
```
....
=== RUN TestEnterContainerFailingContNotStarted
--- PASS: TestEnterContainerFailingContNotStarted (0.01s)
=== RUN TestEnterContainer
--- FAIL: TestEnterContainer (0.00s)
Error Trace: sandbox_test.go:1154
Error: Expected value not to be nil.
Messages: Entering non-running container should fail
Error Trace: sandbox_test.go:1157
Error: Expected nil, but got: &errors.errorString{s:"Can not
move from running to running"}
Messages: Failed to start sandbox: Can not move from running to
running
FAIL
```
`TestEnterContainerFailingContNotStarted` calls `cleanUp` at function
begging but it doesn't clean its garbage after it ends.
`TestEnterContainer` only call `cleanUp` in the end but it doesn't do
cleanUp in the begging, that gives first test case a chance to impact
latter one.
This commit modifies all the test cases, let them all do the cleanUp()
in the end.
The policy here is: "everyone needs to take their garbage away when they
leave" :)
Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
Fixes#1226
Add new flag "experimental" for supporting underworking features.
Some features are under developing which are not ready for release,
there're also some features which will break compatibility which is not
suitable to be merged into a kata minor release(x version in x.y.z)
For getting these features above merged earlier for more testing, we can
mark them as "experimental" features, and move them to formal features
when they are ready.
Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
Make cpu and memory calculation in a different function
this help to reduce the function complexity and easy unit test.
Fixes: #1296
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
All containers run in different cgroups even the sandbox, with this new
implementation the sandbox cpu cgroup wil be equal to the sum of all its
containers and the hypervisor process will be placed there impacting to the
containers running in the sandbox (VM). The default number of vcpus is
used when the sandbox has no constraints. For example, if default_vcpus
is 2, then quota will be 200000 and period 100000.
**c-ray test**
http://www.futuretech.blinkenlights.nl/c-ray.html
```
+=============================================+
| | 6 threads 6cpus | 1 thread 1 cpu |
+=============================================+
| current | 40 seconds | 122 seconds |
+==============================================
| new | 37 seconds | 124 seconds |
+==============================================
```
current = current cgroups implementation
new = new cgroups implementation
**workload**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: c-ray
annotations:
io.kubernetes.cri.untrusted-workload: "true"
spec:
restartPolicy: Never
containers:
- name: c-ray-1
image: docker.io/devimc/c-ray:latest
imagePullPolicy: IfNotPresent
args: ["-t", "6", "-s", "1600x1200", "-r", "8", "-i",
"/c-ray-1.1/sphfract", "-o", "/tmp/output.ppm"]
resources:
limits:
cpu: 6
- name: c-ray-2
image: docker.io/devimc/c-ray:latest
imagePullPolicy: IfNotPresent
args: ["-t", "1", "-s", "1600x1200", "-r", "8", "-i",
"/c-ray-1.1/sphfract", "-o", "/tmp/output.ppm"]
resources:
limits:
cpu: 1
```
fixes#1153
Signed-off-by: Julio Montes <julio.montes@intel.com>
We convert the whole virtcontainers code to use the store package
instead of the resource_storage one. The resource_storage removal will
happen in a separate change for a more logical split.
This change is fairly big but mostly does not change the code logic.
What really changes is when we create a store for a container or a
sandbox. We now need to explictly do so instead of just assigning a
filesystem{} instance. Other than that, the logic is kept intact.
Fixes: #1099
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
There's only one real implementer of the network interface and no real
need to implement anything else. We can just go ahead and remove this
abstraction.
Fixes: #1179
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
In order to move the hypervisor implementations into their own package,
we need to put the asset type into the types package and break the
hypervisor->asset->virtcontainers->hypervisor cyclic dependency.
Fixes: #1119
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We can now remove all the sandbox shared types and convert the rest of
the code to using the new internal types package.
This commit includes virtcontainers, cli and containerd-shim changes in
one atomic change in order to not break bisect'ibility.
Fixes: #1095
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
cri containerd calls kill on stopped sandbox and if we
fail the call, it can cause `cri stopp` command to fail
too.
Fixes: #1084
Signed-off-by: Peng Tao <bergwolf@gmail.com>
In case we use an hypervisor that cannot support filesystem sharing,
we copy files over to the VM rootfs through the gRPC protocol. This
is a nice workaround, but it only works with regular files, which
means no device file, no socket file, no directory, etc... can be
sent this way.
This is a limitation that we accept here, by simply ignoring those
non-regular files.
Fixes#1068
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Start adding support for virtio-mmio devices starting with block.
The devices show within the vm as vda, vdb,... based on order of
insertion and such within the VM resemble virtio-blk devices.
They need to be explicitly differentiated to ensure that the
agent logic within the VM can discover and mount them appropropriately.
The agent uses PCI location to discover them for virtio-blk.
For virtio-mmio we need to use the predicted device name for now.
Note: Kata used a disk for the VM rootfs in the case of Firecracker.
(Instead of initrd or virtual-nvdimm). The Kata code today does not
handle this case properly.
For now as Firecracker is the only Hypervisor in Kata that
uses virtio-mmio directly offset the drive index to comprehend
this.
Longer term we should track if the rootfs is setup as a block
device explicitly.
Fixes: #1046
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
We can use the background context when creating test sandboxes from the
sanbox unit tests. This shuts the "trace called before context set"
erros down.
Fixes: #1048
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
- Container only is responsable of namespaces and cgroups
inside the VM.
- Sandbox will manage VM resources.
The resouces has to be re-calculated and updated:
- Create new Container: If a new container is created the cpus and memory
may be updated.
- Container update: The update call will change the cgroups of a container.
the sandbox would need to resize the cpus and VM depending the update.
To manage the resources from sandbox the hypervisor interaface adds two methods.
- resizeMemory().
This function will be used by the sandbox to request
increase or decrease the VM memory.
- resizeCPUs()
vcpus are requested to the hypervisor based
on the sum of all the containers in the sandbox.
The CPUs calculations use the container cgroup information all the time.
This should allow do better calculations.
For example.
2 containers in a pod.
container 1 cpus = .5
container 2 cpus = .5
Now:
Sandbox requested vcpus 1
Before:
Sandbox requested vcpus 2
When a update request is done only some atributes have
information. If cpu and quota are nil or 0 we dont update them.
If we would updated them the sandbox calculations would remove already
removed vcpus.
This commit also moves the sandbox resource update call at container.update()
just before the container cgroups information is updated.
Fixes: #833
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
In order to support use cases such as containerd-shim-v2 where we
would have a long running process holding the sandbox pointer, there
would be no reason to call into the stateless function StartSandbox(),
which would recreate a new sandbox pointer and the corresponding ones
for containers.
Fixes#903
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
As we want to call the OCI hook from the CLI, we need a way for the
CLI to figure out what is the network namespace used by the sandbox.
This is needed particularly because virtcontainers creates the netns
if none was provided.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Add additional `context.Context` parameters and `struct` fields to allow
trace spans to be created by the `virtcontainers` internal functions,
objects and sub-packages.
Note that not every function is traced; we can add more traces as
desired.
Fixes#566.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Fixes#50 .
Add new interface sandbox.AddDevice, then for Frakti use case, a device
can be attached to sandbox before container is created.
Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
This PR got merged while it had some issues with some shim processes
being left behind after k8s testing. And because those issues were
real issues introduced by this PR (not some random failures), now
the master branch is broken and new pull requests cannot get the
CI passing. That's the reason why this commit revert the changes
introduced by this PR so that we can fix the master branch.
Fixes#529
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
ContainerID is supposed to be unique within a sandbox. It is better to use
a map to describe containers of a sandbox.
Fixes: #502
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Add SetFactory to allow virtcontainers consumers to set a vm factory.
And use it to create new VMs whenever the factory is set.
Signed-off-by: Peng Tao <bergwolf@gmail.com>
CreateDevice() is only used by `NewDevices()` so we can make it private and
there's no need to export it.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Fixes#50
This is done for decoupling device management part from other parts.
It seperate device.go to several dirs and files:
```
virtcontainers/device
├── api
│ └── interface.go
├── config
│ └── config.go
├── drivers
│ ├── block.go
│ ├── generic.go
│ ├── utils.go
│ ├── vfio.go
│ ├── vhost_user_blk.go
│ ├── vhost_user.go
│ ├── vhost_user_net.go
│ └── vhost_user_scsi.go
└── manager
├── manager.go
└── utils.go
```
* `api` contains interface definition of device management, so upper level caller
should import and use the interface, and lower level should implement the interface.
it's bridge to device drivers and callers.
* `config` contains structed exported data.
* `drivers` contains specific device drivers including block, vfio and vhost user
devices.
* `manager` exposes an external management package with a `DeviceManager`.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>