The update command is used to update a container's resources at runtime.
All constraints are applied inside the VM to each container's cgroup.
For now, only CPU constraints are fully supported: vCPUs are hot added
or removed depending on the new constraint.
Fixes #189
Signed-off-by: Julio Montes <julio.montes@intel.com>
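For illustration, here is a minimal Go sketch of the vCPU computation such an update might perform (hypothetical names; not the actual virtcontainers code):
```go
package main

import "fmt"

// vcpuDelta is a hypothetical sketch: given the current vCPU count and a new
// CFS quota/period constraint, return how many vCPUs to hot add (positive)
// or hot remove (negative).
func vcpuDelta(current uint32, quota int64, period uint64) (int, error) {
	if period == 0 {
		return 0, fmt.Errorf("invalid CPU period: 0")
	}
	if quota <= 0 {
		return 0, nil // no quota set: leave the vCPU count alone
	}
	// Round quota/period up to a whole number of vCPUs.
	wanted := (quota + int64(period) - 1) / int64(period)
	return int(wanted) - int(current), nil
}

func main() {
	// e.g. quota=250000, period=100000 needs 3 vCPUs.
	d, _ := vcpuDelta(1, 250000, 100000)
	fmt.Println(d) // 2: hot add two vCPUs
}
```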
* Move the makeNameID() func to the virtcontainers/utils file, as it's a
generic function for making a name and ID.
* Move bindDevicetoVFIO() and bindDevicetoHost() to vfio driver package.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
CreateDevice() is only used by `NewDevices()`, so we can make it private;
there's no need to export it.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Fixes #50
This is done to decouple the device management part from other parts.
It separates device.go into several directories and files:
```
virtcontainers/device
├── api
│ └── interface.go
├── config
│ └── config.go
├── drivers
│ ├── block.go
│ ├── generic.go
│ ├── utils.go
│ ├── vfio.go
│ ├── vhost_user_blk.go
│ ├── vhost_user.go
│ ├── vhost_user_net.go
│ └── vhost_user_scsi.go
└── manager
├── manager.go
└── utils.go
```
* `api` contains the interface definition of device management: upper-level
callers should import and use the interface, and lower levels should implement
it. It is the bridge between device drivers and callers; a rough sketch follows
this list.
* `config` contains structured exported data.
* `drivers` contains the specific device drivers, including block, VFIO and
vhost-user devices.
* `manager` exposes an external management package with a `DeviceManager`.
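A rough sketch of the shape the `api` interface might take (hypothetical signatures, for illustration only):
```go
// Package api sketch: the interface upper-level callers program against and
// the drivers package implements. Names here are illustrative.
package api

// DeviceReceiver is whatever a device attaches itself to, e.g. a hypervisor.
type DeviceReceiver interface {
	HotplugAddDevice(Device) error
	HotplugRemoveDevice(Device) error
}

// Device is implemented by every concrete driver (block, vfio, vhost-user...).
type Device interface {
	Attach(DeviceReceiver) error
	Detach(DeviceReceiver) error
	DeviceID() string
}

// DeviceManager is the external handle exposed by the manager package.
type DeviceManager interface {
	NewDevice(path string) (Device, error)
	AttachDevice(id string, dr DeviceReceiver) error
	DetachDevice(id string, dr DeviceReceiver) error
}
```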
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Store the PCI address of the rootfs in case the rootfs is block
based and passed using virtio-block.
This helps us get rid of predicting the device name for the block
device inside the container. The agent will determine the device
node name using the PCI address.
Fixes #266
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Store the PCI address for a block device when hotplugging it via
virtio-blk. This address is passed to the kata agent in the
device "Id" field. The agent within the guest can then use it
to identify the PCI slot in the guest and create the device node
based on it.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
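For illustration, the guest side could resolve the PCI address to a device node by walking sysfs rather than predicting the name; a hedged sketch (the actual agent logic may differ):
```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// deviceNameFromPCIAddr is a sketch: given a guest PCI address such as
// "0000:00:02.0", find the virtio-blk node name (e.g. "vda") via sysfs.
func deviceNameFromPCIAddr(pciAddr string) (string, error) {
	pattern := filepath.Join("/sys/bus/pci/devices", pciAddr, "virtio*", "block", "*")
	matches, err := filepath.Glob(pattern)
	if err != nil || len(matches) == 0 {
		return "", fmt.Errorf("no block device at PCI address %s: %v", pciAddr, err)
	}
	return filepath.Base(matches[0]), nil
}

func main() {
	name, err := deviceNameFromPCIAddr("0000:00:02.0") // illustrative address
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("/dev/" + name)
}
```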
We need to store the bridge address in the state in order to use it
for assigning addresses to devices attached to the bridge.
So we need to make sure that the bridge pointer is assigned
the address.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Introduce a new field in Drive to store the PCI address when the drive is
attached using virtio-blk.
Assign the PCI address in the format bridge-addr/device-addr.
Since we need to assign the address while hotplugging, pass the Drive
by address.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
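A minimal sketch of the new field and why the Drive is passed by address (hypothetical struct layout mirroring the description):
```go
// Package sketch: illustrative types only, not the real virtcontainers Drive.
package sketch

// Drive carries the new PCIAddr field in the "bridge-addr/device-addr" format.
type Drive struct {
	File    string // host path backing the drive
	ID      string
	PCIAddr string // assigned during hotplug, e.g. "02/03"
}

// hotplugDrive takes a *Drive so the address assigned here is visible to the
// caller once hotplugging returns.
func hotplugDrive(drive *Drive, bridgeAddr, deviceAddr string) {
	drive.PCIAddr = bridgeAddr + "/" + deviceAddr
}
```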
Change the function to return the bridge itself that the
device is attached to. This will allow the bridge address to be used
to determine the PCI slot of the device within the guest.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
"make install" fails on a clean working directory:
$ make install
install: cannot stat ‘data/kata-collect-data.sh’: No such file or directory
This happens because install and install-scripts do not depend on the
runtime. Make doesn't know it needs to build the runtime before it can
be installed.
Add the missing dependencies to the install targets so that "make
install" works on a clean working directory and rebuilds when source
files have been modified.
Note that SCRIPTS contains the generated kata-collect-data.sh script.
That file needs to be generated before it can be installed, so make
SCRIPTS a dependency of install-scripts.
Fixes: #283
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
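A hedged sketch of the dependency fix (target and variable names assumed from the message, not copied from the real Makefile):
```make
# Sketch: install targets now depend on what they install, so a clean tree
# builds the runtime and generates the scripts before installing them.
install: install-runtime install-scripts

install-runtime: $(TARGET)
	install -D $(TARGET) $(DESTDIR)$(BINDIR)/$(TARGET)

# SCRIPTS contains generated files such as data/kata-collect-data.sh, so it
# must be a prerequisite here.
install-scripts: $(SCRIPTS)
	install -m 755 $(SCRIPTS) $(DESTDIR)$(BINDIR)/
```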
CI complains about cyclomatic complexity in sendReq.
warning: cyclomatic complexity 16 of function (*kataAgent).sendReq() is
high (> 15) (gocyclo)
Refactor it a bit to avoid this error. I'm not a big fan of the new code,
but it is done this way because Go does not support generics.
Signed-off-by: Peng Tao <bergwolf@gmail.com>
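One common way to flatten a big type switch when the language has no generics is a dispatch table keyed on the request type; a hypothetical sketch, not the actual kata-agent client code:
```go
package main

import (
	"fmt"
	"reflect"
)

// handler processes one request kind; the table replaces a long type switch
// that would push a function over the gocyclo threshold.
type handler func(req interface{}) (interface{}, error)

var reqHandlers = map[reflect.Type]handler{}

func registerHandler(req interface{}, h handler) {
	reqHandlers[reflect.TypeOf(req)] = h
}

func sendReq(req interface{}) (interface{}, error) {
	h, ok := reqHandlers[reflect.TypeOf(req)]
	if !ok {
		return nil, fmt.Errorf("unknown request type %T", req)
	}
	return h(req)
}

type createContainerReq struct{ id string }

func main() {
	registerHandler(&createContainerReq{}, func(req interface{}) (interface{}, error) {
		return "created " + req.(*createContainerReq).id, nil
	})
	resp, _ := sendReq(&createContainerReq{id: "c1"})
	fmt.Println(resp) // created c1
}
```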
Currently we sometimes pass the sandbox as a pointer and other times by
value. As a result, the view of the sandbox across virtcontainers may not
be consistent, and it costs an extra memory copy each time we pass it by
value. Fix this by ensuring the sandbox is always passed by pointer.
Fixes: #262
Signed-off-by: Peng Tao <bergwolf@gmail.com>
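A tiny illustration of the difference:
```go
package main

import "fmt"

type Sandbox struct{ state string }

func stopByValue(s Sandbox)    { s.state = "stopped" } // mutates a copy only
func stopByPointer(s *Sandbox) { s.state = "stopped" } // mutates the shared view

func main() {
	s := Sandbox{state: "running"}
	stopByValue(s)
	fmt.Println(s.state) // running: the callee saw its own copy
	stopByPointer(&s)
	fmt.Println(s.state) // stopped
}
```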
This commit improves the time spent retrieving the sandbox ID related
to a container ID.
It works by maintaining a specific mapping between container IDs and
sandbox IDs, which allows the CLI to retrieve the sandbox ID for a
container ID directly. This lowers the complexity from O(n²) to O(1),
because we no longer need to call into ListPod(), which parsed all the
pods and all the containers on the system every time the CLI needed
this mapping.
This commit also updates the unit tests accordingly. Most of them are
affected since they all relied on ListPod() before.
Fixes #212
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
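One possible shape for such a mapping is a file per container ID whose content is the sandbox ID; a hedged sketch (the path and helper names are illustrative):
```go
// Package sketch: O(1) container-ID-to-sandbox-ID lookup, illustrative only.
package sketch

import (
	"fmt"
	"io/ioutil"
	"os"
	"path/filepath"
	"strings"
)

// mapDir is a hypothetical location for the mapping files.
const mapDir = "/var/run/kata-containers/cid-to-sid"

// storeMapping records the sandbox ID for a container when it is created.
func storeMapping(containerID, sandboxID string) error {
	if err := os.MkdirAll(mapDir, 0700); err != nil {
		return err
	}
	return ioutil.WriteFile(filepath.Join(mapDir, containerID), []byte(sandboxID), 0600)
}

// sandboxIDForContainer reads the mapping back: one file read instead of
// listing every pod and every container on the system.
func sandboxIDForContainer(containerID string) (string, error) {
	b, err := ioutil.ReadFile(filepath.Join(mapDir, containerID))
	if err != nil {
		return "", fmt.Errorf("no sandbox mapping for %s: %v", containerID, err)
	}
	return strings.TrimSpace(string(b)), nil
}
```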
Here is an interesting case I have been debugging. I was trying to
understand why a "kubeadm reset" was not working for kata-runtime
compared to runc. In this case, the only pod started with Kata is
the kube-dns pod. For some reason, when this pod is stopped and
removed, its containers receive several signals, two of them being
SIGTERM signals, which seems to be the way to properly stop them, but
the third container receives a SIGCONT. Obviously, nothing happens in
this case, but apparently CRI-O considers this to be the end of the
container, and after a few seconds it kills the container process
(the shim, in the Kata case). Because it uses SIGKILL, the signal does
not get forwarded to the agent, since the shim itself is killed right
away. After this happens, CRI-O calls into "kata-runtime state", we
detect the shim is not running anymore, and we try to stop the
container. The code will eventually call into agent.RemoveContainer(),
but this will fail and return an error because, inside the agent, the
container is still running.
The approach to solve this issue here is to send a SIGKILL signal to
the container after the shim has been waited for. This call does not
check the returned error because in most cases, the regular use cases,
it will fail: the shim no longer being there usually means the
container inside the VM has already terminated.
And in case the shim was killed without being able to forward the
signal (as described in the first paragraph), the SIGKILL will go
through and allow the following call to agent.stopContainer() to
proceed to the removal of the container inside the agent.
Fixes #274
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
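A hedged sketch of the resulting flow (type and function names are stand-ins, not the real virtcontainers API):
```go
// Package sketch: illustrative stand-ins for the real virtcontainers types.
package sketch

import "syscall"

type agent interface {
	signalProcess(containerID string, sig syscall.Signal) error
	removeContainer(containerID string) error
}

type container struct {
	id      string
	shimPID int
}

// stopContainer: once the shim has been waited for, send SIGKILL and ignore
// the error. In the regular case the kill fails because the container in the
// VM already terminated; in the CRI-O case above, the shim was SIGKILLed
// before forwarding the signal, so this kill lets the removal succeed.
func stopContainer(a agent, c container, waitShim func(pid int) error) error {
	if err := waitShim(c.shimPID); err != nil {
		return err
	}
	_ = a.signalProcess(c.id, syscall.SIGKILL) // best effort, error ignored
	return a.removeContainer(c.id)
}
```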
We currently just send the pid in the state. While OCI specifies
a few other fields as well, this commit just adds the bundle path
and the container ID to the state. This should fix the errors seen
with hooks that rely on the bundle path.
Other fields, like the running "state" string, have been left out, as
this would require sending the strings that OCI recognises. Hooks have
been implemented in virtcontainers, and sending the state string would
require calling into OCI-specific code from virtcontainers.
The bundle path again is OCI specific, but it can be accessed
using annotations. Hooks really need to be moved to the CLI as they
are OCI specific. This, however, needs network hotplug to be implemented
first, so that the hooks can be called from the CLI after the
VM has been created.
Fixes #271
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
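For illustration, populating the extra fields could look like this, assuming the runtime-spec specs-go package (the exact wiring in virtcontainers may differ):
```go
package main

import (
	"encoding/json"
	"fmt"

	specs "github.com/opencontainers/runtime-spec/specs-go"
)

// hookState builds the state passed to a hook: the pid plus the bundle path
// and container ID; the "status" string is deliberately left out, as above.
func hookState(pid int, containerID, bundlePath string) ([]byte, error) {
	s := specs.State{
		Pid:    pid,
		ID:     containerID,
		Bundle: bundlePath,
	}
	return json.Marshal(s)
}

func main() {
	b, _ := hookState(4242, "container-id", "/run/bundle") // illustrative values
	fmt.Println(string(b))
}
```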
Now that our CI has moved to Go 1.10, we need to update one file
that is not formatted the way the new gofmt (1.10) expects.
Fixes #249
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Rework the signal handling code so that if debug is enabled and a
`SIGUSR1` signal is received, a backtrace is written to the system log
and the runtime continues to run.
Added some basic tests for the signal handling code.
Fixes #241.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
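A minimal sketch of the behaviour (illustrative, not the exact runtime code):
```go
package main

import (
	"log/syslog"
	"os"
	"os/signal"
	"runtime"
	"syscall"
)

// handleSignals: when debug is on, dump a full goroutine backtrace to the
// system log on SIGUSR1 and keep running.
func handleSignals(debug bool) {
	if !debug {
		return
	}
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGUSR1)
	go func() {
		logger, err := syslog.New(syslog.LOG_ERR, "kata-runtime")
		if err != nil {
			return
		}
		for range ch {
			buf := make([]byte, 1<<16)
			n := runtime.Stack(buf, true) // true: all goroutines
			logger.Err(string(buf[:n]))
		}
	}()
}

func main() {
	handleSignals(true)
	select {} // keep running; send SIGUSR1 to log a backtrace
}
```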
In the same way that a caller of "kata-runtime kill 12345" expects
container 12345 to be killed, the same call on a container
representing a sandbox should actually kill the sandbox, meaning
the sandbox is stopped after the container has been killed.
This way, the caller knows the VM is stopped once kill returns.
This is an issue raised by OpenShift and Kubernetes tests. They
call into delete long after the call to kill has been submitted,
and in the meantime they kill all processes related to the container,
meaning they kill the VM before we can do it ourselves. In this
case, the delete responsible for stopping the VM comes too late, and
it returns an error when trying to destroy the sandbox, since
communicating with the agent fails because the VM is no longer there.
This commit addresses this issue by letting "kill" call into
StopSandbox() if the command relates to a sandbox instead of
a simple container.
Fixes #246
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
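Sketch of the resulting kill path (hypothetical names for the virtcontainers entry points):
```go
// Package sketch: illustrative only.
package sketch

import "syscall"

type vcImpl interface {
	KillContainer(sandboxID, containerID string, sig syscall.Signal) error
	StopSandbox(sandboxID string) error
}

// kill: after signalling, also stop the sandbox when the container being
// killed is the one representing the sandbox, so the VM is gone on return.
func kill(vci vcImpl, sandboxID, containerID string, sig syscall.Signal) error {
	if err := vci.KillContainer(sandboxID, containerID, sig); err != nil {
		return err
	}
	if containerID != sandboxID {
		return nil
	}
	return vci.StopSandbox(sandboxID)
}
```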
The way delete works, it was always trying to stop the sandbox, even
when the force flag was not enabled. Because we want to be able to stop
the sandbox from a kill command, a sandbox stop might be called twice,
and we don't want the second stop to fail, as that would make the
delete command fail.
That's why this commit checks the sandbox status before trying to
stop the sandbox.
Fixes #246
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
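A minimal sketch of the guard (illustrative types):
```go
// Package sketch: only attempt the stop when the sandbox is still running,
// so a stop already performed by kill never makes delete fail.
package sketch

type sandboxState string

const stateRunning sandboxState = "running"

func maybeStopSandbox(state sandboxState, stop func() error) error {
	if state != stateRunning {
		return nil // already stopped, e.g. by a prior kill
	}
	return stop()
}
```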
Change from go 1.10 to go 1.9.2.
Our static checks and unit tests fail when using
go 1.10. Since we use go 1.9.2 to test in our CI,
reflect this version in versions.yaml
By doing this, we will be able to remove the hardcoded version
from the Jenkins scripts and instead install golang using
`.ci/install_go.sh` from the tests repository. And when moving
to go 1.10 in a PR, the CI will verify that the static checks
and unit tests pass correctly.
Fixes: #254.
Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>