Commit Graph

7513 Commits

Author SHA1 Message Date
Fabiano Fidêncio
76540dbdd1 tools: Automatically revert kata-deploy changes
When branching the "stable-x.y" branch, we need to do some quite
specific changes to kata-deploy / kata-cleanup files, such as:
* changing the tags from "latest" to "stable-x.y".
* removing the kata-deploy / kata-cleanup stable files.

However, after the branching is done, we need to get the `main` repo to
its original state, with the kata-deploy / kata-cleanup using the
"latest" tag, and with the stable files present there, and this commit
ensures that, during the release process, a new PR is automatically
created with these changes.

Fixes: #3069

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2021-11-24 22:07:53 +01:00
Fabiano Fidêncio
36d73c96c8 tools: Do the kata-deploy changes on its own commit
Rather than doing the kata-deploy changes as part of the release bump
commit, let's split those on its own changes, as it will both make the
life of the reviewer less confusing and also allows us to start
preparing the field for a possible automated revert of these changes,
whenever it becomes needed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2021-11-24 22:07:52 +01:00
Fabiano Fidêncio
c8e22daf67 tools: Use vars for the registry in the update repo script
Similarly to what was done for the yaml files, let's use a var for
representing the registry where our images will be pushed to and avoid
repetition and too long lines.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2021-11-24 22:07:03 +01:00
Fabiano Fidêncio
ac958a3073 tools: Use vars for the yaml files used in the update repo script
Instead of always writing the full path of some files, let's just create
some vars and avoid both repetition (which is quite error prone) and too
long lines (which makes the file not so easy to read).

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2021-11-24 22:06:45 +01:00
Fabiano Fidêncio
edca829242 tools: Rewrite the logic around kata-deploy changes
We can simplify the code a little bit, as at least now we group common
operationr together.  Hopefully this will improve the maintainability
and the readability of the code.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2021-11-24 22:05:35 +01:00
Fabiano Fidêncio
31f6c2c2ea tools: Update comments about the kata-deploy yaml changes
The comments were mentioning kata-deploy-base files while it really
should mention kata-deploy-stable files.

While here, I've also added a missing '"' to one of the tags.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2021-11-24 21:17:40 +01:00
Binbin Zhang
75bb340137 shimv2/service: fix defer funtions never run with os.Exit()
os.Exit() will terminate program immediately, the defer functions
won't be executed, so we add defer functions again before os.Exit().
Refer to https://pkg.go.dev/os#Exit

Fixes: #3059

Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
2021-11-24 15:59:59 +01:00
James O. D. Hunt
bd3217daeb agent: Remove redundant returns
Remove an unnecessary `return` statement identified by clippy.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-11-24 11:43:49 +00:00
James O. D. Hunt
adab64349c agent: Remove some unwrap and expect calls
Replace some `unwrap()` and `expect()` calls with code to return the
error to the caller.

Fixes: #3011.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-11-24 11:43:49 +00:00
James O. D. Hunt
351cef7b6a agent: Remove unwrap from verify_cid()
Improved the `verify_cid()` function that validates container ID's by
removing the need for an `unwrap()`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-11-24 11:43:49 +00:00
James O. D. Hunt
a7d1c70c4b agent: Improve baremount
Change `baremount()` to accept `Path` values rather than string values
since:

- `Path` is more natural given the function deals with paths.
- This minimises the caller having to convert between string and `Path`
  types, which simplifies the surrounding code.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-11-24 11:43:49 +00:00
James O. D. Hunt
09abcd4dc6 agent-ctl: Remove some unwrap and expect calls
Replace some `unwrap()` and `expect()` calls with code to return the
error to the caller.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-11-24 11:43:49 +00:00
James O. D. Hunt
35db75baa1 agent-ctl: Remove redundant returns
Remove a number of redundant `return`'s.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-11-24 11:43:49 +00:00
James O. D. Hunt
46e459584d agent-ctl: Simplify main
Make the `main()` function simpler.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-11-24 11:43:49 +00:00
James O. D. Hunt
c7349d0bf1 agent-ctl: Simplify error handling
Replace `ok_or().map_err()` combinations with the simpler `ok_or_else()`
construct.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-11-24 11:43:49 +00:00
bin
ddc68131df runtime: delete netmon
Netmon is not used anymore.

Fixes: #3112

Signed-off-by: bin <bin@hyper.sh>
2021-11-24 15:08:18 +08:00
Carlos Venegas
ac058b3897
Merge pull request #3105 from YchauWang/wyc-agent-make-02
agent: fixed the `make optimize` bug
2021-11-23 13:17:05 -06:00
Fabiano Fidêncio
181f876fdb
Merge pull request #3098 from fidencio/wip/move_kata-deploy-install-instruction_to_docs
docs: make kata-deploy more visible
2021-11-23 18:32:42 +01:00
João Vanzuita
705687dc42 docs: Add kata-deploy as part of the install docs
This PR links the kata-deloy installation instructions to the
docs/install folder.

Fixes: #2450

Signed-off-by: João Vanzuita <joao.vanzuita@de.bosch.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2021-11-23 13:57:22 +01:00
Fabiano Fidêncio
acece84906 docs: Use the default notation for "Note" on install README
Let's use the default GitHub notation for notes in documentation, as
describe here:
https://github.com/kata-containers/kata-containers/blob/main/docs/Documentation-Requir

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Suggested-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-11-23 13:27:35 +01:00
Fabiano Fidêncio
143fb27802 kata-deploy: Use the default notation for "Note"
Let's use the default GitHub notation for notes in documentation, as
describe here:
https://github.com/kata-containers/kata-containers/blob/main/docs/Documentation-Requirements.md#notes

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Suggested-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-11-23 13:24:42 +01:00
Fabiano Fidêncio
45d76407aa kata-deploy: Don't mention arch specific binaries in the README
Although the binary name of the shipped binary is `qemu-system-x86_64`,
and we only ship kata-deploy for `x86_64`, we better leaving the
architecture specific name out of our README file.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2021-11-23 13:21:37 +01:00
wangyongchao.bj
0c6c0735ec agent: fixed the make optimize bug
The unrecognized option: 'deny-warnings' args caused `make optimize` failed.
Fixed the Makefile of the agent project, make sure the `make optimize` command
execute correctly. This PR modify the rustc args from '--deny-warnings' to
'--deny warnings'.

Fixes: #3104

Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
2021-11-23 09:44:05 +08:00
Fabiano Fidêncio
0ae77e1232
Merge pull request #3102 from fidencio/wip/add-back-wrongly-removed-check-for-test-kata-deploy
workflows: Add back the checks for running test-kata-deploy
2021-11-22 22:36:03 +01:00
Fabiano Fidêncio
a7c08aa4b6 workflows: Add back the checks for running test-kata-deploy
Commit 3c9ae7f made /test_kata_deploy run
against HEAD, but it also mistakenly removed all the checks that ensure
/test_kata_deploy only runs when explicitly called.

Mea culpa on this, and let's add the tests back.

Fixes: #3101

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2021-11-22 18:33:10 +01:00
Carlos Venegas
3be15aed1c
Merge pull request #3071 from fidencio/wip/test-kata-deploy-should-use-the-latest-builds
kata-deploy: Ensure we test HEAD with `/test_kata_deploy`
2021-11-22 10:48:35 -06:00
bin
ce0693d6dc agent: clear cargo test warnings
Function parameters in test config is not used. This
commit will add under score before variable name
in test config.

Fixes: #3091

Signed-off-by: bin <bin@hyper.sh>
2021-11-22 20:45:46 +08:00
Tim Zhang
cad279b37d
Merge pull request #3055 from liubin/fix/3054-update-spdk-doc
docs: update using-SPDK-vhostuser-and-kata.md
2021-11-22 15:47:02 +08:00
David Gibson
1b28d7180f
Merge pull request #2927 from dgibson/vfio-env-mangling
Update k8s SR-IOV plugin environment variables to work properly with Kata
2021-11-22 13:44:19 +11:00
Eric Ernst
a0919b0865
Merge pull request #2998 from egernst/fix-symlinks
watchers: don't dereference symlinks when copying files
2021-11-19 12:43:22 -08:00
Eric Ernst
ce92cadc7d vc: hypervisor: remove setSandbox
The hypervisor interface implementation should not know a thing about
sandboxes.

Fixes: #2882

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-11-19 12:20:41 -08:00
Eric Ernst
2227c46c25 vc: hypervisor: use our own logger
This'll end up moving to hypervisors pkg, but let's stop using virtLog,
instead introduce hvLogger.

Fixes: #2884

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-11-19 12:20:41 -08:00
Eric Ernst
4c2883f7e2 vc: hypervisor: remove dependency on persist API
Today the hypervisor code in vc relies on persist pkg for two things:
1. To get the VM/run store path on the host filesystem,
2. For type definition of the Load/Save functions of the hypervisor
   interface.

For (1), we can simply remove the store interface from the hypervisor
config and replace it with just the path, since this is all we really
need. When we create a NewHypervisor structure, outside of the
hypervisor, we can populate this path.

For (2), rather than have the persist pkg define the structure, let's
let the hypervisor code (soon to be pkg) define the structure. persist
API already needs to call into hypervisor anyway; let's allow us to
define the structure.

We'll probably want to look at following similar pattern for other parts
of vc that we want to make independent of the persist API.

In doing this, we started an initial hypervisors pkg, to hold these
types (avoid a circular dependency between virtcontainers and persist
pkg). Next step will be to remove all other dependencies and move the
hypervisor specific code into this pkg, and out of virtcontaienrs.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-11-19 12:20:41 -08:00
Eric Ernst
34f23de512 vc: hypervisor: Remove need to get shared address from sandbox
Add shared path as part of the hypervisor config

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-11-19 12:20:41 -08:00
Eric Ernst
c28e5a7807 acrn: remove dependency on sandbox, persistapi datatypes
Today, acrn relies on sandbox level information, as well as a store
provided by common parts of the hypervisor. As we cleanup the
abstractions within our runtime, we need to ensure that there aren't
cross dependencies between the sandbox, the persistence logic and the
hypervisor.

Ensure that ACRN still compiles, but remove the setSandbox usage as
well as persist driver setup.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-11-19 12:20:41 -08:00
Eric Ernst
a0e0e18639 hypervisors: introduce pkg to unbreak vc/persist dependency
Initial hypervisors pkg, with just basic state types defined.

Fixes: #2883

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-11-19 12:20:41 -08:00
Eric Ernst
b5dfcf2653 watcher: tests: ensure there is 20ms delay between fs writes
We noticed s390x test failures on several of the watcher unit tests.

Discovered that on s390 in particular, if we update a file in quick
sucecssion, the time stampe on the file would not be unique between the
writes. Through testing, we observe that a 20 millisecond delay is very
reliable for being able to observe the timestamp update. Let's ensure we
have this delay between writes for our tests so our tests are more
reliable.

In "the real world" we'll be polling for changes every 2 seconds, and
frequency of filesystem updates will be on order of minutes and days,
rather that microseconds.

Fixes: #2946

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-11-19 11:33:36 -08:00
Fabiano Fidêncio
d08bcde7aa
Merge pull request #3068 from fidencio/wip/kata-deploy-re-add-latest-and-stable-tags
kata-deploy: Add back stable & latest tags
2021-11-19 15:58:55 +01:00
David Gibson
78dff468bf agent/device: Adjust PCIDEVICE_* container environment variables for VM
The k8s SR-IOV plugin, when it assigns a VFIO device to a container, adds
an variable of the form PCIDEVICE_<identifier> to the container's
environment, so that the payload knows which device is which.  The contents
of the variable gives the PCI address of the device to use.

Kata allows VFIO devices to be passed in to a Kata container, however it
runs within a VM which has a different PCI topology.  In order for the
payload to find the right device, the environment variables therefore need
to be converted to list the guest PCI addresses instead of the host PCI
addresses.

fixes #2897

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-19 17:44:05 +11:00
David Gibson
4530e7df29 agent/device: Use simpler structure in update_spec_devices()
update_spec_devices() takes a bunch of updates for the device entries in
the OCI spec and applies them, adjusting things in both the linux.devices
and linux.resources.devices sections of the spec.

It's important that each entry in the spec only be updated once.  Currently
we ensure this by first creating an index of where the entries are, then
consulting that as we apply each update, so that earlier updates don't
cause us to incorrectly detect an entry as being relevant to a later
update.  This method works, but it's quite awkward.

This inverts the loop structure in update_spec_devices() to make this
clearer.  Instead of stepping through each update and finding the relevant
entries in the spec to change, we step through each entry in the spec and
find the relevant update.  This makes it structurally clear that we're only
updating each entry once.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-19 17:21:11 +11:00
Tim Zhang
653b461dc2
Merge pull request #3064 from lifupan/main
agent: fix the issue of missing create a new session for container
2021-11-19 11:28:54 +08:00
David Gibson
b60622786d agent/device: Correct misleading comment on test case
We have a test case commented as testing the case where linux.devices is
empty in the OCI spec.  While it's true that linux.devices is empth in this
example, the reason it fails isn't specifically because it's empty but
because it doesn't contain a device for the update we're trying to apply.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-19 14:25:04 +11:00
David Gibson
89ff700038 agent/device: Remove unnecessary check for empty container_path
update_spec_devices() explicitly checks for being called with an empty
container path and fails.  We have a unit test to verify this behaviour.

But while an empty container_path probably does mean something has gone
wrong elsewhere, that's also true of any number of other bad paths.  Having
an empty string here doesn't prevent what we're doing in this function
making sense - we can compare it to the strings in the OCI spec perfectly
well (though more likely we simply won't find it there).

So, there's no real reason to check this one particular odd case.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-19 14:25:03 +11:00
David Gibson
c855a312f0 agent/device: Make DevIndex local to update_spec_devices()
The DevIndex data structure keeps track of devices in the OCI
specification.  We used to carry it around to quite a lot of
functions, but it's now used only within update_spec_devices().  That
means we can simplify things a bit by just open coding the maps we
need, rather than declaring a special type.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-19 14:24:47 +11:00
David Gibson
084538d334 agent/device: Change update_spec_device to handle multiple devices at once
update_spec_device() adjusts the OCI spec for device differences
between the host and guest.  It is called repeatedly for each device
we need to alter.  These calls are now all in a single loop in
add_devices(), so it makes more sense to move the loop into a renamed
update_spec_devices() and process all the fixups in one call.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-19 14:23:58 +11:00
David Gibson
d6a3ebc496 agent/device: Obtain guest major/minor numbers when creating DevNumUpdate
Currently the DevNumUpdate structure is created with a path to a
device node in the VM, which is then used by update_spec_device().
However the only piece of information that update_spec_device()
actually needs is the VM side major and minor numbers for the device.
We can determine those when we create the DevNumUpdate structure.
This means we detect errors earlier and as a bonus we don't need to
make a copy of the vm path string.

Since that change requires updating 2 of the log statements, we take the
opportunity to update all the log statements to structured style.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-19 14:23:36 +11:00
David Gibson
f4982130e1 agent/device: Check for conflicting device updates
For each device in the OCI spec we need to update it to reflect the guest
rather than the host.  We do this with additional device information
provided by the runtime.  There should only be one update for each device
though, if there are multiple, something has gone horribly wrong.

Detect and report this situation, for safety.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-19 14:23:34 +11:00
David Gibson
f10e8c8165 agent/device: Batch changes to the OCI specification
As we process container devices in the agent, we repeatedly call
update_spec_device() to adjust the OCI spec as necessary for differences
between the host and the VM.  This means that for the whole of a pretty
complex call graph, the spec is in a partially-updated state - neither
fully as it was on the host, not fully as it will be for the container
within the VM.

Worse, it's not discernable from the contents itself which parts of the
spec have already been updated and which have not.  We used to have real
bugs because of this, until the DevIndex structure was introduced, but that
means a whole, fairly complex, parallel data structure needs to be passed
around this call graph just to keep track of the state we're in.

Start simplifying this by having the device handler functions not directly
update the spec, but instead return an update structure describing the
change they need.  Once all the devices are added, add_devices() will
process all the updates as a batch.

Note that collecting the updates in a HashMap, rather than a simple Vec
doesn't make a lot of sense in the current code, but will reduce churn
in future changes which make use of it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-19 14:23:15 +11:00
David Gibson
46a4020e9e agent/device: Types to represent update for a device in the OCI spec
Currently update_spec_device() takes parameters 'vm_path' and 'final_path'
to give it the information it needs to update a single device in the OCI
spec for the guest.  This bundles these parameters into a single structure
type describing the updates to a single device.  This doesn't accomplish
much immediately, but will allow a number of further cleanups.

At the same time we change the representation of vm_path from a Unicode
string to a std::path::Path, which is a bit more natural since we are
performing file operations on it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-19 12:27:52 +11:00
David Gibson
e7beed5430 agent/device: Remove unneeded clone() from several device handlers
virtio_blk_device_handler(), virtio_blk_ccw_device_handler() and
virtio_scsi_device_handler() all take a clone of their 'device' parameter.
They appear to do this in order to get a mutable copy in which they can
update the vm_path field.

However, the copy is dropped at the end of the function, so the only thing
that's used in it is the vm_path field passed to update_spec_device()
afterwards.

We can avoid the clone by just using a local variable for the vm_path.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-19 12:27:52 +11:00