Commit Graph

11469 Commits

Author SHA1 Message Date
Fabiano Fidêncio
67fed26f18 runtime: Use TDX image with in the qemu-tdx config
Let's make sure we use the TDX image as part of the QEMU TDX
configuration, which will help us to have the policies tested here.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-09-01 14:28:08 +00:00
Fabiano Fidêncio
ef70aeb6b8
Merge pull request #7817 from fidencio/topic/update-alpine-to-its-latest-release
versions: Update alpine to its 3.18 version
2023-09-01 14:51:58 +02:00
Jeremi Piotrowski
bde06758b1
Merge pull request #7761 from jepio/iocopy-fix-race
runtime: Fix data race in ioCopy
2023-09-01 09:30:54 +02:00
Fabiano Fidêncio
82cd14ba39 versions: Update alpine to its 3.18 version
3.15 will be out of life in 2 months from now.

Fixes: #7816

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-31 23:02:54 +02:00
GabyCT
d75c7b5f9c
Merge pull request #7813 from GabyCT/topic/genreport
metrics: Add grabdata script for metrics report
2023-08-31 13:33:38 -06:00
Gabriela Cervantes
6668825752 metrics: Add grabdata script for metrics report
This PR adds the grabdata script so it can be used for the metrics report
for kata metrics.

Fixes #7812

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-31 16:17:29 +00:00
Jeremi Piotrowski
d7612440b8
Merge pull request #7789 from beraldoleal/tests/amd
Fixes tests on AMD machines
2023-08-31 11:23:51 +02:00
Jeremi Piotrowski
c2ba29c15b runtime: Fix data race in ioCopy
IoCopy is a tricky function (I don't claim to fully understand its contract),
but here is what I see: The goroutine that runs it spawns 3 goroutines - one
for each stream to handle (stdin/stdout/stderr). The goroutine then waits for
the stream goroutines to exit. The idea is that when the process exits and is
closed, the stdout goroutine will be unblocked and close stdin - this should
unblock the stdin goroutine. The stderr goroutine will exit at the same time as
the stdout goroutine. The iocopy routine then closes all tty.io streams.

The problem is that the stdout goroutine decrements the WaitGroup before
closing the stdin stream, which causes the iocopy goroutine to race to close
the streams. Move the wg.Done() of the stdout routine past the close so that
*this* race becomes impossible. I can't guarantee that this doesn't affect some
unspecified behavior.

Fixes: #5031
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-08-31 10:17:38 +02:00
GabyCT
b467f2ef68
Merge pull request #7772 from GabyCT/topic/fiolimit
metrics: Enable FIO limits for kata metrics
2023-08-30 14:49:04 -06:00
Fabiano Fidêncio
aa2b51a831
Merge pull request #7783 from GabyCT/topic/makereport
metrics: Add metrics report script
2023-08-30 17:11:39 +02:00
Gabriela Cervantes
a7b59a5bf9 metrics: Add limit for 90 percentile for qemu value
This PR adds the limit for 90 percentile for qemu value for
FIO kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 13:53:38 +00:00
Gabriela Cervantes
99db6568e9 metrics: Add limit for write 90 percentile value for clh
This PR adds the limit for write 90 percentile value for clh for
FIO metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 13:53:38 +00:00
Gabriela Cervantes
6e06392c55 metrics: Enable FIO limits for kata metrics
This PR enables the FIO limits for kata metrics.

Fixes #7771

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-30 13:53:38 +00:00
David Esparza
924d06a7f5
Merge pull request #7787 from GabyCT/topic/fixmemoryinsidelimit
metrics: Fix memory inside limits for kata metrics
2023-08-30 07:45:17 -06:00
Beraldo Leal
00e7ffd988 tests: check vmx only on Intel machines
When running on amd machines, those tests will fail because there is no
vmx flag. Following other tests that checks for cpuType, let's adapt
them to restrict vmx only on Intel machines.

Fixes #7788.
Related #5066

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-08-29 20:04:31 -04:00
Gabriela Cervantes
c8dd3c0737 metrics: Fix memory footprint qemu limit
This PR fixes the memory footprint qemu limit for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 22:51:21 +00:00
Gabriela Cervantes
8877ec62fb metrics: Fix memory inside limits for kata metrics
This PR fixes the memory inside limit for clh for kata metrics due
to the recent changes that we had in the script which impacted
in the performance measurement.

Fixes #7786

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 21:38:18 +00:00
Beraldo Leal
80146f2078 tests: Fixes cpuType check on AMD machines
cpuType is not initialized yet. gets 0 (Intel) by default, failing on
AMD machines.

Fixes #7785

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-08-29 17:04:07 -04:00
Gabriela Cervantes
7e364716dd metrics: Add test setup details to metrics report
This PR adds test setup details to metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:56:53 +00:00
Gabriela Cervantes
17dc1b9760 metrics: Add boot lifecycle times to metrics report
This PR adds the boot lifecycle times to metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:55:44 +00:00
Gabriela Cervantes
3b0d6538f2 metrics: Add memory inside container to metrics report
This PR adds memory inside container to metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:53:17 +00:00
Gabriela Cervantes
79fbb9d243 metrics: Add scaling system footprint in metrics report
This PR adds scaling system footprint in metrics report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:51:27 +00:00
Gabriela Cervantes
8e6d4e6f3d metrics: Add metrics reportgen
This PR adds metrics reportgen for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:45:36 +00:00
Gabriela Cervantes
139ffd4f75 metrics: Add report file titles
This PR adds report file titles for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 17:43:06 +00:00
GabyCT
8f2dae7b53
Merge pull request #7775 from dborquez/fix_memory_usage_parsing_results
metrics: fix parsing issue on memory-usage test
2023-08-29 11:26:13 -06:00
Gabriela Cervantes
878d1a2e7d metrics: Generate PNGs alongside the PDF report
This PR generates the PNGs for the kata metrics PDF report.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:50:32 +00:00
Gabriela Cervantes
fce2487971 metrics: Add metrics report R files
This PR adds the metrics report R files.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:45:22 +00:00
Gabriela Cervantes
08812074d1 metrics: Add report dockerfile
This PR adds the report dockerfile for kata metrics.

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:28:32 +00:00
Gabriela Cervantes
69781fc027 metrics: Add metrics report script
This PR adds metrics report script for kata metrics.

Fixes #7782

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-08-29 16:25:14 +00:00
Chao Wu
e4fb20c74a
Merge pull request #7585 from lifupan/main
dragonball: vsock add fifo/pipe stream support for passed fd hybridSt…
2023-08-29 23:39:21 +08:00
Fabiano Fidêncio
50e51bcafe
Merge pull request #7185 from UnmeshDeodhar/add-cc-sev-test
tests: Add confidential test
2023-08-29 15:32:25 +02:00
Fabiano Fidêncio
e286e842c1 tests: Expand confidential test to support TDX
Let's expand the confidential test to also support TDX.

The main difference on the test, though, is that we're not grepping for
a string in the `dmesg` output, but rather relying on `cpuid` to detect
a TDX guest.

Fixes: #7184

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-29 14:10:47 +02:00
Unmesh Deodhar
e31f099be1 tests: Expand confidential test to support SNP
Let's expand the confidential test to also support SNP.

Fixes: #7184

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
2023-08-29 14:10:47 +02:00
Unmesh Deodhar
c3b9d4945e tests: Add confidential test for SEV
Add a test case for the launch of unencrypted confidential
container, verifying that we are running inside a TEE.

Right now the test only works with SEV, but it'll be expanded in the
coming commits, as part of this very same series.

Fixes: #7184

Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-29 14:10:34 +02:00
David Esparza
538c965c2b
metrics: fix parsing issue on memory-usage test
This PR fixes an issues in the parsing results stage,
by collecting just the n-results from the n-running
containers, discarding irrelevant data.

Fixes: #7774

Signed-off-by: David Esparza <david.esparza.borquez@intel.com>
2023-08-28 23:39:46 -06:00
Fabiano Fidêncio
708b0a3052
Merge pull request #7768 from fidencio/topic/update-tdx-to-the-6.2-kernel-based-stack
tdx: Update the components needed for using the 6.2 kernel stack
2023-08-28 19:27:15 +02:00
Fabiano Fidêncio
3818bf3311 local-build: Remove $HOME/.docker/buildx/activity/default
The file can be removed between builds without causing any issue, and
leaving it around has been causing us some headache due to:
```
ERROR: open /home/runner/.docker/buildx/activity/default: permission denied
```

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:41:36 +02:00
Fabiano Fidêncio
d1b54ede29 qemu: tdx: Workaround SMP issue with TDX 1.5
`...,sockets=1,cores=numvcpus,threads=1,...` must be used.

Fixes: #7770

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:41:36 +02:00
Archana Shinde
1e34220c41 qemu: tdx: Adapt to the TDX 1.5 stack
QEMU for TDX 1.5 makes use of private memory map/unmap.
Make changes to govmm to support this. Support for private backing fd
for memory is added as knob to the qemu config.

Userspace's map/unmap operations are done by fallocate() ioctl on the
backing store fd.
Reference:
https://lore.kernel.org/linux-mm/20220519153713.819591-1-chao.p.peng@linux.intel.com/

Fixes: #7770

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:41:36 +02:00
Fabiano Fidêncio
8115a0522d versions: tdx: Update Kernel to 6.2 + TDX
This is the version that's been used and tested inside Intel, and it
matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15.

Fixes: #7770

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:11:34 +02:00
Fabiano Fidêncio
ec18180f34 versions: tdx: Update TDVF to the "edk2-stable202302"
This is the version that's been used and tested inside Intel, and it
matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15.

Fixes: #7770

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:11:34 +02:00
Fabiano Fidêncio
9803b24286 versions: tdx: Update QEMU to v7.2 + TDX v1.10
This is the version that's been used and tested inside Intel, and it
matches with https://github.com/intel/tdx-tools/releases/tag/2023ww15.

Fixes: #7770

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-28 13:11:27 +02:00
Fabiano Fidêncio
02a08c956b
Merge pull request #7754 from microsoft/danmihai1/pod-quota-deployment
tests: delete k8s deployment at the test's end
2023-08-27 17:52:00 +02:00
Fabiano Fidêncio
98037ced52
Merge pull request #7755 from microsoft/danmihai1/unique-test-name
tests: use unique test name
2023-08-27 17:27:40 +02:00
Zhongtao Hu
f0440a9cfe
Merge pull request #7742 from frezcirno/fix-log-forwarder-loop
runtime-rs: check peer close in log_forwarder
2023-08-26 10:44:09 +08:00
Fabiano Fidêncio
16a610d788
Merge pull request #7758 from fidencio/topic/gha-avoid-fail-fast-till-everything-is-ultra-stable
gha: Avoid "fail-fast" in tests that are known to be flaky
2023-08-25 16:49:26 +02:00
Jiang Liu
91db888d83
Merge pull request #7602 from jiangliu/agent-storage
Refine storage device management for kata-agent
2023-08-25 22:20:18 +08:00
Zixuan Tan
dffc16e5b3 runtime-rs: check peer close in log_forwarder
The log_forwarder task does not check if the peer has closed, causing a
meaningless loop during the period of “kata vm exit”, when the peer
closed, and “ShutdownContainer RPC received” that aborts the log forwarder.

This patch fixes the problem.

Fixes: #7741

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2023-08-25 19:00:07 +08:00
Jiang Liu
aaa5ab1264 agent: simplify storage device by removing StorageDeviceObject
Simplify storage device implementation by removing StorageDeviceObject.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-08-25 17:23:16 +08:00
Fabiano Fidêncio
fb49d5d7ce gha: Avoid "fail-fast" in tests that are known to be flaky
Otherwise we'll have to re-run all the tests due to a flaky behaviour in
one of the parts.

Fixes: #7757

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-25 10:00:17 +02:00