Commit Graph

2150 Commits

Author SHA1 Message Date
Wei Zhang
1194154309 devices: use device manager to manage all devices
Fixes #50

Previously the devices are created with device manager and laterly
attached to hypervisor with "device.Attach()", this could work, but
there's no way to remember the reference count for every device, which
means if we plug one device to hypervisor twice, it's truly inserted
twice, but actually we only need to insert once but use it in many
places.

Use device manager as a consolidated entrypoint of device management can
give us a way to handle many "references" to single device, because it
can save all devices and remember it's use count.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-31 09:59:29 +08:00
James O. D. Hunt
763a1b6265 logging: Remove unnecessary fields and use standard names
Ensure the entire codebase uses `"sandbox"` and `"container"` log
fields for the sandboxID and containerID respectively.

Simplify code where fields can be dropped.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-07-30 15:32:41 +01:00
James O. D. Hunt
3323c087c5 logging: Add cid logging to update command
PR #468 neglected to update the `update` command.

Fixes #519.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-07-30 15:32:41 +01:00
James O. D. Hunt
3d5ed6669c logging: Improve cid+sid logging
Refine the changes made on #468 by adding the containerID log field as
soon as possible (before *any* virtcontainers calls). This requires
that `setExternalLoggers()` be called more times, but it's essential to
ensure the correct log fields are available as early as possible.

Partially fixes #519.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-07-30 15:32:41 +01:00
James O. D. Hunt
58448bbcb8 logging: Allow SetLogger to be called multiple times
Now that the `SetLogger()` functions accept a `logrus.Entry`, they can
access the fields that have already been set for the logger and
re-apply them if `SetLogger()` is called multiple times.

This fixes a bug whereby the logger functions -- which are necessarily
called multiple times [1] -- previously ended up applying any new fields
the specified logger contained, but erroneously removing any additional
fields added since `SetLogger()` was last called.

Partially fixes #519.

--
[1] - https://github.com/kata-containers/runtime/pull/468

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-07-30 15:32:41 +01:00
James O. D. Hunt
029e7ca680 api: Change logger functions to accept a log entry
Rather than accepting a `logrus.FieldLogger` interface type, change all
the `SetLogger()` functions to accept a `logrus.Entry`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-07-30 15:32:41 +01:00
James O. D. Hunt
dfb758a82d logging: Remove duplicate arch field in vc
As of #521, the runtime now adds the `arch` log field so
`virtcontainers` doesn't need to set it too.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-07-30 15:32:41 +01:00
James O. D. Hunt
acdd0b8e68 logging: Split logging source into two fields
Don't use slash-delimited values in log fields - create two separate
log fields (`source` and `subsystem`) for clarity.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-07-30 15:32:41 +01:00
James O. D. Hunt
c59394d3ed network: Make better use of log fields
Add key log information as log fields rather than free-format text.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-07-30 15:32:41 +01:00
James O. D. Hunt
a0be57f64f network: Always call network logger function
Rather than using the virtcontainers logger, always call the network
logger function.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-07-30 15:32:40 +01:00
Eric Ernst
f4a7712795
Merge pull request #530 from kata-containers/revert-301-pre-addstorage-based-devmanager
revert: "virtcontainers: support pre-add storage for frakti"
2018-07-27 13:02:57 -07:00
Sebastien Boeuf
927487c142 revert: "virtcontainers: support pre-add storage for frakti"
This PR got merged while it had some issues with some shim processes
being left behind after k8s testing. And because those issues were
real issues introduced by this PR (not some random failures), now
the master branch is broken and new pull requests cannot get the
CI passing. That's the reason why this commit revert the changes
introduced by this PR so that we can fix the master branch.

Fixes #529

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-07-27 09:39:56 -07:00
Peng Tao
cfbc974fec
Merge pull request #521 from bergwolf/log
factory: add SetLogger API
2018-07-27 15:52:24 +08:00
zhangwei_cs
2c3215c018
Merge pull request #301 from WeiZhang555/pre-addstorage-based-devmanager
virtcontainers: support pre-add storage for frakti
2018-07-27 15:19:30 +08:00
z00280905
b3015dda26 devices: fix typo
Fix typo.

Signed-off-by: z00280905 <zhangwei555@huawei.com>
2018-07-27 09:33:50 +08:00
Eric Ernst
2a670ce022
Merge pull request #522 from chavafg/topic/update-docker-version
versions: Update docker-ce to 18.06
2018-07-26 16:15:50 -07:00
Salvador Fuentes
da77124898 versions: Update docker-ce to 18.06
Docker 18.06 was released last week, update our
supported docker to this new version.

Fixes: #510

Signed-off-by: Salvador Fuentes <salvador.fuentes@intel.com>
2018-07-26 10:52:43 -05:00
Sebastien Boeuf
c5075d08ed
Merge pull request #517 from jcvenegas/issue-516-timeout-centos
agent: Increase timeout for check request.
2018-07-26 06:59:37 -07:00
Peng Tao
9a497fedf5 factory: add SetLogger API
So that we actually use the same logger as other packages when being
invoked by CLI.

Fixes: #520

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-07-26 20:56:31 +08:00
James O. D. Hunt
daa65a5526
Merge pull request #514 from gkennedy12/work
cli: add AMD support to kata-check
2018-07-26 13:28:14 +01:00
Wei Zhang
198a0695ab devices: add some test cases
Add test cases for device manager reworks.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-26 14:15:52 +08:00
Wei Zhang
8391b20805 devices: address some comments
Address some review comments:
* remove unnecessary rollback logics
* add vfio hot unplug handling.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-26 14:15:52 +08:00
Zhang Wei
04f4f528f7 devices: rename VFIODrive to VFIODev
Rename VFIODrive to VFIODev, also rename device interface "GetDeviceDrive()" to
"GetDeviceInfo()".

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
2018-07-26 14:15:52 +08:00
Zhang Wei
daf5abce2d devices: remove unused functions
cleanup: remove ununsed device interface function "GetDeviceInfo()"

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
2018-07-26 14:14:02 +08:00
Wei Zhang
1b062b3db4 unit-tests: fix unit tests
Fix #50

Fix unit tests

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-26 14:14:02 +08:00
Wei Zhang
7f5989f06c device-manager: refactor device manger
Fixes #50

This commit imports a big logic change:
* host device to be attached or appended now is sandbox level resources,
one device should bind to sandbox/hypervisor first, then container could
reference it via device's unique ID.
* attach or detach device should go through the device manager interface
instead of the device interface.
* allocate device ID in global device mapper to guarantee every device
has a uniq device ID and there won't be any ID collision.

With this change, there will some changes on data format on disk for sandbox
and container, these changes also make a breakage of backward compatibility.

New persist data format:
* every sandbox will get a new "devices.json" file under "/run/vc/sbs/<sid>/"
which saves detailed device information, this also conforms to the concept that
device should be sandbox level resource.
* every container uses a "devices.json" file but with new data format:
```
[
  {
    "ID": "b80d4736e70a471f",
    "ContainerPath": "/dev/zero"
  },
  {
    "ID": "6765a06e0aa0897d",
    "ContainerPath": "/dev/null"
  }
]
```
`ID` should reference to a device in a sandbox, `ContainerPath` indicates device
path inside a container.

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
2018-07-26 14:09:53 +08:00
Wei Zhang
c08a26397e devices: don't use drivers package directly.
Instead of using drivers.XXXDevice directly, we should use exported
struct from device structure. package drivers should be internal struct
and other package should avoid read it's struct content directly.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-26 14:09:53 +08:00
Wei Zhang
b54df7e127 devices: remove interface VhostUserDevice
The interface "VhostUserDevice" has duplicate functions and fields with
Device, so we can merge them into one interface and manage them with one
group of interfaces.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-26 11:33:28 +08:00
Wei Zhang
2885eb0532 devices: use device manager to manage all devices
Fixes #50

Previously the devices are created with device manager and laterly
attached to hypervisor with "device.Attach()", this could work, but
there's no way to remember the reference count for every device, which
means if we plug one device to hypervisor twice, it's truly inserted
twice, but actually we only need to insert once but use it in many
places.

Use device manager as a consolidated entrypoint of device management can
give us a way to handle many "references" to single device, because it
can save all devices and remember it's use count.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-26 11:33:28 +08:00
Jose Carlos Venegas Munoz
5fc7219315 agent: check: Increase timeout check request.
In some slow enviroments the agent is taking more than 5 seconds
to start to serve grpc request.

This was reproducible in a Centos VM with 4 cpus running 8 pods in
parallel.

Fixes: #516

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2018-07-25 21:51:32 -05:00
Jose Carlos Venegas Munoz
12e1911aab kata-agent: Improve error message.
If the grpc connection check fails we only return the grpc error.
To make more clear what failed add more information to the error.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2018-07-25 20:25:23 -05:00
George Kennedy
4326ea874a cli: add AMD support to kata-check
Added support for identifying AMD CPUs in the `kata-check` CLI command.

Signed-off-by: George Kennedy <george.kennedy@oracle.com>

Fixes #476.
2018-07-25 12:05:47 -04:00
James O. D. Hunt
67b5841153
Merge pull request #512 from sboeuf/disable_codecov_patch
codecov: Explicitly disable codecov/patch coverage
2018-07-25 11:12:56 +01:00
Sebastien Boeuf
0e5f6b27e9 codecov: Explicitly disable codecov/patch coverage
Because codecov coverage regarding the patch is very inconsistent,
this commit introduces codecov.yml config file in order to disable
this check.

Fixes #511

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-07-24 11:20:07 -07:00
Eric Ernst
cd133dc9cb
Merge pull request #509 from lifupan/kata-integration
virtconainers: rollback the NetNs when createNetwork failed
2018-07-24 08:13:16 -07:00
Eric Ernst
20066270b9
Merge pull request #503 from bergwolf/container
sandbox: change container slice to a map
2018-07-24 08:10:39 -07:00
Haomin Tsai
8939fd802f
Merge pull request #351 from woshijpf/fix-no-kata-agent
virtcontainers: process the case that kata-agent doesn't start in VM
2018-07-24 19:47:08 +08:00
flyflypeng
2993cb3dd4 virtcontainers: fix kata-agent fail to start
If kata-agent doesn't start in VM, we need to do some rollback
operations to release related resources.

add grpc check() to check kata-agent is running or not

Fixes: #297

Signed-off-by: flyflypeng <jiangpengfei9@huawei.com>
2018-07-25 00:54:33 +08:00
flyflypeng
7103c4f14a virtcontainers: add qemu process rollback
If some errors occur after qemu process start, then we need to
rollback to kill qemu process

Fixes: #297

Signed-off-by: flyflypeng <jiangpengfei9@huawei.com>
2018-07-24 21:36:57 +08:00
flyflypeng
c2651a85a8 virtcontainers: add kata-proxy rollback
If some errors occur after kata-proxy start, we need to
rollback to kill kata-proxy process

Fixes: #297

Signed-off-by: flyflypeng <jiangpengfei9@huawei.com>
2018-07-24 21:36:57 +08:00
flyflypeng
daebbd1e93 virtcontainers: add rollback to remove sandbox network
If error occurs after sandbox network created successfully, we need to rollback
to remove the created sandbox network

Fixes: #297

Signed-off-by: flyflypeng <jiangpengfei9@huawei.com>
2018-07-24 21:34:58 +08:00
Peng Tao
99954d5025
Merge pull request #501 from bergwolf/qemu
virtcontainers: keep qmp connection whenever possible
2018-07-24 13:23:32 +08:00
Peng Tao
f9d50723b9 sandbox: change container slice to a map
ContainerID is supposed to be unique within a sandbox. It is better to use
a map to describe containers of a sandbox.

Fixes: #502

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-07-24 12:17:02 +08:00
fupan
c6fda444b7 virtconainers: rollback the NetNs when createNetwork failed
When createNetwork failed, cleanup the NetNs if it created.

Fixes: #508

Signed-off-by: fupan <lifupan@gmail.com>
2018-07-24 12:09:13 +08:00
Peng Tao
b244410443
Merge pull request #505 from bergwolf/create_factory
cli: create vm factory if failed to load existing one
2018-07-24 10:43:41 +08:00
Peng Tao
8bdceb92be
Merge pull request #496 from grahamwhaley/20180713_clean_tests
Ensure tests clean their tempfiles
2018-07-24 10:38:05 +08:00
Graham Whaley
50b445cf35 cli: tests: Clarify who cleans up tmpdir
Add a comment to clarify that the caller of
testRunContainerSetup() cleans up the tmpdir.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2018-07-23 17:32:52 +01:00
Graham Whaley
73c8286c7e cli: tests: remove the tmpdir to the config.json
We were defer removing the temporary config.json files
but not the tmpdir path we had created to store them in.
Expose that path out so we can defer removeall it.

Fixes: #480

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2018-07-23 17:32:52 +01:00
Graham Whaley
d6d38dae13 cli: update_test: defer remove tmpfile
Ensure we remove the tmpfile used for testing.

Signed-off-by: Graham Whaley <graham.whaley@intel.com>
2018-07-23 17:32:52 +01:00
Peng Tao
d69fbcf17f sandbox: add stateful sandbox config
When enabled, do not release in memory sandbox resources in VC APIs,
and callers are expected to call sandbox.Release() to release the in
memory resources.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-07-23 09:54:02 +08:00