kata-containers

mirror of https://github.com/kata-containers/kata-containers.git synced 2025-10-22 20:39:41 +00:00

Author	SHA1	Message	Date
Zichang Lin	36306e283c	sandbox/virtcontainers: modify tests related to memory hotplug. Signed-off-by: Clare Chen <clare.chenhui@huawei.com> Signed-off-by: Zichang Lin <linzichang@huawei.com>	2018-10-17 23:01:13 -04:00
Clare Chen	14f480af8f	sandbox/virtcontainers: combine addResources and updateResources addResources is just a special case of updateResources. Combine the shared codes so that we do not maintain the two pieces of identical code. Signed-off-by: Clare Chen <clare.chenhui@huawei.com>	2018-10-15 10:39:08 +08:00
Zichang Lin	8e2ee686bd	sandbox/virtcontainers: memory resource hotplug when creating a container. When creating a sandbox, we set up a sandbox with 2048M of base memory, and then hotplug the memory needed for every new container. We also change the unit of c.config.Resources.Mem from MiB to bytes so that memory sizes in the range 4095B < memory < 1MiB are not lost. Depends-on:github.com/kata-containers/tests#813 Fixes #400 Signed-off-by: Clare Chen <clare.chenhui@huawei.com> Signed-off-by: Zichang Lin <linzichang@huawei.com>	2018-10-15 10:37:29 +08:00
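The unit change described in the commit above can be illustrated with a minimal sketch. It assumes that the loss came from integer truncation when a byte count is converted to whole MiB; the helper name `bytesToMiB` and the constant `mib` are hypothetical and are not taken from the kata-containers code.

```go
package main

import "fmt"

// mib is the number of bytes in one MiB.
const mib = 1 << 20

// bytesToMiB is a hypothetical helper: integer division drops any
// remainder, so a request smaller than 1 MiB becomes 0 MiB.
func bytesToMiB(b uint64) uint64 {
	return b / mib
}

func main() {
	// A 512 KiB request: the MiB view loses it, the byte view keeps it.
	req := uint64(512 * 1024)
	fmt.Println("as MiB:", bytesToMiB(req), "as bytes:", req)
}
```

Keeping the value in bytes until the last possible moment avoids this truncation.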
Jose Carlos Venegas Munoz	4697cf3c79	memory: update: Update state using the memory removed. If the memory is reduced, its cgroup in the VM is updated properly, but the runtime assumed that the memory was also removed from the VM. When memory is added again later, more is added than needed. Fixes: #801 Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>	2018-10-02 14:38:21 -05:00
Julio Montes	9e606b3da8	virtcontainers: revert "fix shared dir resource remaining" This reverts commit `8a6d383715`. Don't remove all directories in the shared directory because `docker cp` re-mounts all the mount points specified in the config.json causing serious problems in the host. fixes #777 Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-09-24 12:15:09 -05:00
Clare Chen	12a0354084	sandbox: get and store guest details. Get and store guest details after sandbox is completely created. And get memory block size from sandbox state file when check hotplug memory valid. Signed-off-by: Clare Chen <clare.chenhui@huawei.com> Signed-off-by: Zichang Lin <linzichang@huawei.com>	2018-09-17 07:00:46 -04:00
Clare Chen	13bf7d1bbc	virtcontainers: hotplug memory with kata-runtime update command Add support for using update command to hotplug memory to vm. Connect kata-runtime update interface with hypervisor memory hotplug feature. Fixes #625 Signed-off-by: Clare Chen <clare.chenhui@huawei.com>	2018-09-17 05:02:18 -04:00
James O. D. Hunt	ed1e343b93	Merge pull request #655 from WeiZhang555/add-ref-counter-for-devices Add ref counter for devices	2018-09-06 09:51:07 +01:00
fupan	a5478b93e0	virtcontainers: wait until process exited before RemoveContainer RemoveContainer is called right after SignalProcess(SIGKILL), so the container process might still be running and container Destroy() will fail; thus it's better to wait for this process to exit before issuing RemoveContainer. Fixes: #690 Signed-off-by: fupan <lifupan@gmail.com>	2018-09-03 12:18:12 +08:00
Wei Zhang	c518b1ef00	device: use devicemanager to manage rootfs block Fixes #635 When the container rootfs is block based in the devicemapper use case, we can re-use the sandbox device manager to manage rootfs block plug/unplug. We don't need a detailed description of the block device in the container state file; instead we only need a Block index referencing the sandbox device. Remove `HotpluggedDrive` and `RootfsPCIAddr` from the state file because they are not necessary any more. Signed-off-by: Wei Zhang <zhangwei555@huawei.com>	2018-08-31 19:30:08 +08:00
Wei Zhang	affd6e3216	devices: add reference count for devices. Fixes #635 Remove `Hotplugged bool` field from device and add two new fields instead: * `RefCount`: how many references to this device. One device can be referenced(`NewDevice()`) many times by same/different container(s), two devices are regarded identical if they have same hostPath * `AttachCount`: how many times this device has been attached. A device can only be hotplugged once to the qemu, every new Attach command will add the AttachCount, and real `Detach` will be done only when `AttachCount == 0` Signed-off-by: Wei Zhang <zhangwei555@huawei.com>	2018-08-31 09:53:01 +08:00
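The two counters described in the commit above can be sketched as follows. This is a minimal illustration of the stated semantics only (identical `hostPath` means the same device, hotplug happens once, the real detach happens when `AttachCount` drops back to 0); the `device` and `manager` types and their methods are hypothetical and are not the virtcontainers API.

```go
package main

import (
	"errors"
	"fmt"
)

// device is a hypothetical stand-in for a sandbox-level device record.
type device struct {
	hostPath    string
	refCount    uint // how many times the device has been referenced
	attachCount uint // how many times the device has been attached
	plugged     bool // whether the device is currently hotplugged
}

// manager keeps one device per host path.
type manager struct {
	byHostPath map[string]*device
}

func newManager() *manager {
	return &manager{byHostPath: make(map[string]*device)}
}

// newDevice returns the existing device for hostPath if there is one,
// and bumps its reference count either way.
func (m *manager) newDevice(hostPath string) *device {
	d, ok := m.byHostPath[hostPath]
	if !ok {
		d = &device{hostPath: hostPath}
		m.byHostPath[hostPath] = d
	}
	d.refCount++
	return d
}

// attach hotplugs only on the first attach; later calls just count.
func (d *device) attach() {
	if d.attachCount == 0 {
		d.plugged = true // real code would hotplug into the hypervisor here
	}
	d.attachCount++
}

// detach unplugs only when the last attachment goes away.
func (d *device) detach() error {
	if d.attachCount == 0 {
		return errors.New("device is not attached")
	}
	d.attachCount--
	if d.attachCount == 0 {
		d.plugged = false // real code would hot-unplug here
	}
	return nil
}

func main() {
	m := newManager()
	a := m.newDevice("/dev/sda")
	b := m.newDevice("/dev/sda") // same hostPath, so the same device
	a.attach()
	b.attach()
	_ = a.detach() // one attachment remains, device stays plugged
	fmt.Println(a == b, a.refCount, a.attachCount, a.plugged)
}
```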
James O. D. Hunt	d0679a6fd1	tracing: Add tracing support to virtcontainers Add additional `context.Context` parameters and `struct` fields to allow trace spans to be created by the `virtcontainers` internal functions, objects and sub-packages. Note that not every function is traced; we can add more traces as desired. Fixes #566. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2018-08-22 08:24:58 +01:00
Wei Zhang	6e6be98b15	devices: add interface "sandbox.AddDevice" Fixes #50 . Add new interface sandbox.AddDevice, then for Frakti use case, a device can be attached to sandbox before container is created. Signed-off-by: Wei Zhang <zhangwei555@huawei.com>	2018-08-15 15:24:12 +08:00
Sebastien Boeuf	16600efc1d	Merge pull request #531 from WeiZhang555/bugfix re-add: refactor device manager	2018-08-02 07:32:02 -07:00
Wei Zhang	b7464899ec	devices: address some comments Address some review comments: * remove unnecessary rollback logics * add vfio hot unplug handling. Signed-off-by: Wei Zhang <zhangwei555@huawei.com>	2018-07-31 10:05:56 +08:00
Wei Zhang	f905c16f21	device-manager: refactor device manager Fixes #50 This commit introduces a big logic change: * host devices to be attached or appended are now sandbox level resources; a device should bind to the sandbox/hypervisor first, then a container can reference it via the device's unique ID. * attaching or detaching a device should go through the device manager interface instead of the device interface. * allocate device IDs in a global device mapper to guarantee every device has a unique device ID and there won't be any ID collision. With this change, there will be some changes to the on-disk data format for sandbox and container, and these changes also break backward compatibility. New persist data format: * every sandbox will get a new "devices.json" file under "/run/vc/sbs/<sid>/" which saves detailed device information; this also conforms to the concept that a device should be a sandbox level resource. * every container uses a "devices.json" file but with a new data format: ``` [ { "ID": "b80d4736e70a471f", "ContainerPath": "/dev/zero" }, { "ID": "6765a06e0aa0897d", "ContainerPath": "/dev/null" } ] ``` `ID` should reference a device in a sandbox, `ContainerPath` indicates the device path inside a container. Signed-off-by: Zhang Wei <zhangwei555@huawei.com>	2018-07-31 10:03:57 +08:00
Wei Zhang	1194154309	devices: use device manager to manage all devices Fixes #50 Previously the devices were created with the device manager and later attached to the hypervisor with "device.Attach()". This could work, but there's no way to remember the reference count for every device, which means if we plug one device into the hypervisor twice, it's truly inserted twice, when actually we only need to insert it once but use it in many places. Using the device manager as a consolidated entrypoint of device management gives us a way to handle many "references" to a single device, because it can save all devices and remember their use counts. Signed-off-by: Wei Zhang <zhangwei555@huawei.com>	2018-07-31 09:59:29 +08:00
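The per-container "devices.json" format quoted in the commit above can be read with a few lines of Go. This is a minimal sketch under the assumption that the file contains exactly the two fields shown; the type name `containerDevice` and the function `parseContainerDevices` are hypothetical and do not come from the kata-containers code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// containerDevice mirrors one entry of the per-container devices.json
// shown in the commit message. The type name is an assumption.
type containerDevice struct {
	ID            string `json:"ID"`            // references a sandbox-level device
	ContainerPath string `json:"ContainerPath"` // device path inside the container
}

// parseContainerDevices decodes the JSON array into a slice of entries.
func parseContainerDevices(data []byte) ([]containerDevice, error) {
	var devs []containerDevice
	if err := json.Unmarshal(data, &devs); err != nil {
		return nil, err
	}
	return devs, nil
}

// sample is the example payload from the commit message.
const sample = `[
  { "ID": "b80d4736e70a471f", "ContainerPath": "/dev/zero" },
  { "ID": "6765a06e0aa0897d", "ContainerPath": "/dev/null" }
]`

func main() {
	devs, err := parseContainerDevices([]byte(sample))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, d := range devs {
		fmt.Printf("%s -> %s\n", d.ID, d.ContainerPath)
	}
}
```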
James O. D. Hunt	763a1b6265	logging: Remove unnecessary fields and use standard names Ensure the entire codebase uses `"sandbox"` and `"container"` log fields for the sandboxID and containerID respectively. Simplify code where fields can be dropped. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2018-07-30 15:32:41 +01:00
Sebastien Boeuf	927487c142	revert: "virtcontainers: support pre-add storage for frakti" This PR got merged while it had some issues with some shim processes being left behind after k8s testing. And because those issues were real issues introduced by this PR (not some random failures), now the master branch is broken and new pull requests cannot get the CI passing. That's the reason why this commit reverts the changes introduced by this PR so that we can fix the master branch. Fixes #529 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2018-07-27 09:39:56 -07:00
Wei Zhang	8391b20805	devices: address some comments Address some review comments: * remove unnecessary rollback logics * add vfio hot unplug handling. Signed-off-by: Wei Zhang <zhangwei555@huawei.com>	2018-07-26 14:15:52 +08:00
Wei Zhang	7f5989f06c	device-manager: refactor device manager Fixes #50 This commit introduces a big logic change: * host devices to be attached or appended are now sandbox level resources; a device should bind to the sandbox/hypervisor first, then a container can reference it via the device's unique ID. * attaching or detaching a device should go through the device manager interface instead of the device interface. * allocate device IDs in a global device mapper to guarantee every device has a unique device ID and there won't be any ID collision. With this change, there will be some changes to the on-disk data format for sandbox and container, and these changes also break backward compatibility. New persist data format: * every sandbox will get a new "devices.json" file under "/run/vc/sbs/<sid>/" which saves detailed device information; this also conforms to the concept that a device should be a sandbox level resource. * every container uses a "devices.json" file but with a new data format: ``` [ { "ID": "b80d4736e70a471f", "ContainerPath": "/dev/zero" }, { "ID": "6765a06e0aa0897d", "ContainerPath": "/dev/null" } ] ``` `ID` should reference a device in a sandbox, `ContainerPath` indicates the device path inside a container. Signed-off-by: Zhang Wei <zhangwei555@huawei.com>	2018-07-26 14:09:53 +08:00
Wei Zhang	2885eb0532	devices: use device manager to manage all devices Fixes #50 Previously the devices were created with the device manager and later attached to the hypervisor with "device.Attach()". This could work, but there's no way to remember the reference count for every device, which means if we plug one device into the hypervisor twice, it's truly inserted twice, when actually we only need to insert it once but use it in many places. Using the device manager as a consolidated entrypoint of device management gives us a way to handle many "references" to a single device, because it can save all devices and remember their use counts. Signed-off-by: Wei Zhang <zhangwei555@huawei.com>	2018-07-26 11:33:28 +08:00
c00416947	8a6d383715	virtcontainers : fix shared dir resource remaining Before this patch, the shared dir remained after the sandbox had already been removed, especially in kata-agent mode. Clean up shared dirs after all mounts are unmounted. Fixes: #291 Signed-off-by: Haomin <caihaomin@huawei.com>	2018-06-19 20:32:07 +08:00
Julio Montes	b99cadb553	virtcontainers: add pause and resume container to the API Pause and resume container functions allow us to pause/resume just a specific container rather than the whole sandbox; that way different containers can be paused or running in the same sandbox. Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-05-31 09:38:13 -05:00
Archana Shinde	d885782df1	namespace: Check if pid namespaces need to be shared k8s provides a configuration for sharing PID namespace among containers. In case of crio and cri plugin, an infra container is started first. All following containers are supposed to share the pid namespace of this container. In case a non-empty pid namespace path is provided for a container, we check for the above condition while creating a container and pass this out to the kata agent in the CreateContainer request as SandboxPidNs flag. We clear out the PID namespaces in the configuration passed to the kata agent. Fixes #343 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-05-30 13:34:24 -07:00
Peng Tao	be82c7fc6f	Merge pull request #299 from jshachm/implement-events-command cli :Implement events command	2018-05-18 15:35:52 +08:00
c00416947	1205e347f2	cli: implement events command The events CLI displays container events such as CPU, memory, and IO usage statistics. For now, OOM notifications and Intel RDT are not fully supported. Fixes: #186 Signed-off-by: Haomin <caihaomin@huawei.com>	2018-05-18 09:17:49 +08:00
Julio Montes	4527a8066a	virtcontainers/qemu: honour CPU constraints Don't fail if a new container with a CPU constraint was added to a POD and no more vCPUs are available; instead apply the constraint and let the kernel balance the resources. Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-05-14 17:33:31 -05:00
Julio Montes	81f376920e	cli: implement update command Update command is used to update container's resources at run time. All constraints are applied inside the VM to each container cgroup. By now only CPU constraints are fully supported, vCPU are hot added or removed depending of the new constraint. fixes #189 Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-05-08 07:26:38 -05:00
Zhang Wei	f4a453b86c	virtcontainers: address some comments * Move makeNameID() func to virtcontainers/utils file as it's a generic function for making name and ID. * Move bindDevicetoVFIO() and bindDevicetoHost() to vfio driver package. Signed-off-by: Zhang Wei <zhangwei555@huawei.com>	2018-05-08 10:24:26 +08:00
Zhang Wei	366558ad5b	virtcontainers: refactor device.go to device manager Fixes #50 This is done to decouple the device management part from other parts. It separates device.go into several dirs and files: ``` virtcontainers/device ├── api │ └── interface.go ├── config │ └── config.go ├── drivers │ ├── block.go │ ├── generic.go │ ├── utils.go │ ├── vfio.go │ ├── vhost_user_blk.go │ ├── vhost_user.go │ ├── vhost_user_net.go │ └── vhost_user_scsi.go └── manager ├── manager.go └── utils.go ``` * `api` contains the interface definition of device management, so upper level callers should import and use the interface, and the lower level should implement the interface. It is the bridge between device drivers and callers. * `config` contains structured exported data. * `drivers` contains specific device drivers including block, vfio and vhost user devices. * `manager` exposes an external management package with a `DeviceManager`. Signed-off-by: Zhang Wei <zhangwei555@huawei.com>	2018-05-08 10:24:26 +08:00
Peng Tao	1bb6ab9e22	api: add sandbox iostream API It returns stdin, stdout and stderr stream of the specified process in the container. Fixes: #258 Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-05-04 15:38:32 +08:00
Peng Tao	bf4ef4324e	API: add sandbox winsizeprocess api It sends tty resize request to the agent to resize a process's tty window. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-05-04 15:38:32 +08:00
Peng Tao	55dc0b2995	API: add sandbox signalprocess api It sends the signal to a process of a container, or all processes inside a container. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-05-04 15:38:32 +08:00
Peng Tao	45970ba796	API: add sandbox waitprocess api It waits for a process inside the container of a sandbox. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-05-04 15:38:32 +08:00
Archana Shinde	717bc4cd26	virtcontainers: Pass the PCI address for block based rootfs Store the PCI address of rootfs in case the rootfs is block based and passed using virtio-block. This helps us get rid of predicting the device name inside the container for the block device. The agent will determine the device node name using the PCI address. Fixes #266 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-05-03 10:59:09 -07:00
Archana Shinde	718dbd2a71	device: Assign pci address for block devices Introduce a new field in Drive to store the PCI address if the drive is attached using virtio-blk. Assign PCI address in the format bridge-addr/device-addr. Since we need to assign the address while hotplugging, pass Drive by address. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-05-03 10:59:09 -07:00
Peng Tao	5fb4768f83	virtcontainers: always pass sandbox as a pointer Currently we sometimes pass it as a pointer and other times not. As a result, the view of sandbox across virtcontainers may not be the same and it costs extra memory copy each time we pass it by value. Fix it by ensuring sandbox is always passed by pointers. Fixes: #262 Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-05-01 20:50:07 +08:00
Sebastien Boeuf	789dbca6d6	virtcontainers: Properly remove the container when shim gets killed Here is an interesting case I have been debugging. I was trying to understand why a "kubeadm reset" was not working for kata-runtime compared to runc. In this case, the only pod started with Kata is the kube-dns pod. For some reasons, when this pod is stopped and removed, its containers receive some signals, 2 of them being SIGTERM signals, which seems the way to properly stop them, but the third container receives a SIGCONT. Obviously, nothing happens in this case, but apparently CRI-O considers this should be the end of the container and after a few seconds, it kills the container process (being the shim in Kata case). Because it is using a SIGKILL, the signal does not get forwarded to the agent because the shim itself is killed right away. After this happened, CRI-O calls into "kata-runtime state", we detect the shim is not running anymore and we try to stop the container. The code will eventually call into agent.RemoveContainer(), but this will fail and return an error because inside the agent, the container is still running. The approach to solve this issue here is to send a SIGKILL signal to the container after the shim has been waited for. This call does not check for the error returned because most of the cases, regular use cases, will end up returning an error because the shim itself not being there actually represents the container inside the VM has already terminated. And in case the shim has been killed without the possibility to forward the signal (like described in first paragraph), the SIGKILL will work and will allow the following call to agent.stopContainer() to proceed to the removal of the container inside the agent. Fixes #274 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2018-04-27 18:36:27 -07:00
Archana Shinde	71c7a9c13e	virtcontainers: Handle regular files in /dev The k8s test creates a log file in /dev under /dev/termination-log, which is not the right place to create logs, but we need to handle this. With this commit, we handle regular files under /dev by passing them as 9p shares. All other special files including device files and directories are not passed as 9p shares as these are specific to the host. Any operations on these in the guest would fail anyways. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-04-19 10:59:26 -07:00
Archana Shinde	10c596a4ff	dev: Revert "Don't ignore container mounts based on their path" This reverts commit `08909b2213`. We should not be passing any bind-mounts from /dev, /sys and /proc. Mounting these from the host inside the container does not make sense as these files are relevant to the host OS. Fixes #219 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-04-19 10:46:10 -07:00
Graham whaley	d6c3ec864b	license: SPDX: update all vc files to use SPDX style When imported, the vc files carried in the 'full style' apache license text, but the standard for kata is to use SPDX style. Update the relevant files to SPDX. Fixes: #227 Signed-off-by: Graham whaley <graham.whaley@intel.com>	2018-04-18 13:43:15 +01:00
Julio Montes	8c9c7ddef8	virtcontainers: agent: fix CPU hot plug race condition Communicate to the agent the number of vCPUs that were hot added, allowing the agent to wait for the creation of all vCPUs. fixes #90 Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-04-13 07:05:23 -05:00
Peng Tao	6107694930	runtime: rename pod to sandbox As agreed in [the kata containers API design](https://github.com/kata-containers/documentation/blob/master/design/kata-api-design.md), we need to rename pod notion to sandbox. The patch is a bit big but the actual change is done through the script: ``` sed -i -e 's/pod/sandbox/g' -e 's/Pod/Sandbox/g' -e 's/POD/SB/g' ``` The only exceptions are `pod_sandbox` and `pod_container` annotations, since we already pushed them to cri shims, we have to use them unchanged. Fixes: #199 Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-04-13 09:32:51 +08:00
Archana Shinde	ed1078c800	volumes: Attach volumes that are block device files as block devices Check if a volume passed to the container with -v is a block device file, and if so pass the block device by hotplugging it to the VM instead of passing this as a 9pfs volume. This would give us better performance. Add block device associated with a volume to the list of container devices, so that it is detached with all other devices when the container is stopped with detachDevices() Fixes #137 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-04-11 12:24:12 -07:00
Archana Shinde	e96d3ef0d3	virtcontainers: Do not pass /dev/shm as 9p mount All bind mounts are now passed to the guest with 9p. We need to exclude /dev/shm, as this is passed as a bind mount in the spec. We handle /dev/shm in the guest by allocating memory for it on the guest side. Passing /dev/shm as a 9p mount was causing it to be mounted twice. Fixes #190 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-04-10 10:46:35 -07:00
Archana Shinde	50fd76eb9a	virtcontainers: block: Factorize checks for evaluating block support Factorize configuration and hardware support for hotplugging block devices into a single function and use that. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-04-09 15:49:17 -05:00
Sebastien Boeuf	1404928c05	virtcontainers: Fix container creation rollback The rollback does not work as expected because the error has to be checked from the defer itself. Fixes #178 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2018-04-03 16:13:45 -07:00
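The point made in the commit above, that the error has to be checked from the defer itself, can be sketched with a named return value. This is a minimal illustration of the general Go pattern, not the actual virtcontainers code; `createContainer`, its parameters, and the `rolledBack` flag are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// createContainer is a hypothetical stand-in. The named return value
// err lets the deferred function see the error that is finally
// returned, whichever return statement produced it.
func createContainer(fail bool, rolledBack *bool) (err error) {
	defer func() {
		// Checked inside the defer, at function exit, so the rollback
		// runs only when the function really ends with an error.
		if err != nil {
			*rolledBack = true // real code would undo mounts and hotplugs here
		}
	}()

	if fail {
		return errors.New("create failed")
	}
	return nil
}

func main() {
	var rolledBack bool
	err := createContainer(true, &rolledBack)
	fmt.Println("error:", err, "rolled back:", rolledBack)
}
```

If the check were done when the defer is registered rather than inside the deferred function, `err` would still be nil at that point and the rollback would never run.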
Sebastien Boeuf	788664809f	virtcontainers: container: Rollback when createContainer fails In case the container creation fails, we need a proper rollback regarding the mounts and hotplugs previously performed. This patch also rework the hotplugDrive() function in order to prevent createContainer() function complexity to exceed 15. Fixes #135 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2018-04-03 09:28:34 -07:00
Sebastien Boeuf	aa469f4573	exec: Allow to exec a process on a ready container If a container is not running, but created/ready instead, this means a container process exists and that we can actually exec another process inside this container. The container does not have to be in running state. Fixes #120 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2018-03-29 08:40:44 -07:00


55 Commits