Commit Graph

23 Commits

Author SHA1 Message Date
Shiqing Gao
80b1edabf5 dm: block_if: support bypassing BST_BLOCK logic
With current implementation, in blockif_dequeue/blockif_complete,
if the current request is consecutive to any request in penq or busyq,
current request's status is set to BST_BLOCK. Then, this request is blocked
until the prior request, which blocks it, is completed.
It indicates that consecutive requests are executed sequentially.

This patch adds a flag `no_bst_block` to bypass such logic because:
1. the benefit of this logic is not noticeable;
2. there is a chance that a request is enqueued in block_if_queue but
   not dequeued when this logic is triggered along with the io_uring mechanism;

Example to use this flag:
`add_virtual_device                     5 virtio-blk /dev/nvme1n1,no_bst_block`

Note:
When io_uring is enabled, the BST_BLOCK logic would be bypassed.

Tracked-On: #8612

Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2024-06-05 15:23:33 +08:00
Shiqing Gao
f92b0f43e6 dm: block_if: io_uring: flush the modified in-core data on demand
When `io_uring` is used, `blockif_flush_cache` is missing when an WRITE
operation is completed. `blockif_flush_cache` would flush the modified
in-core data to the disk device according to the setting of the cache mode.

Tracked-On: #8612

Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2024-06-05 15:23:33 +08:00
Shiqing Gao
14c20fa31c dm: block_if: support misaligned request when O_DIRECT is used
Use of O_DIRECT flag could be a performance option.
But this flag may impose alignment restrictions on the length
and address of user-space buffers and the file offset of I/Os.

To support the use of O_DIRECT flag in block_if, this patch adds the support
to handle the misaligned request.
 - When O_DIRECT flag is used (`nocache` is specified in acrn-dm parameters),
    * if the original I/O request is aligned,
      the original I/O request is submitted directly.
    * if the original I/O request is not aligned (either due to the buffer
        address/length misalignment, or the offset misalignment),
      the misaligned request is converted to an aligned request before
      submission.

 - When O_DIRECT flag is not used,
   the original I/O request is submitted directly.

v1 -> v2:
 * cleanup the free() logic in `blockif_init_bounced_write`

Tracked-On: #8612

Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2024-06-05 15:23:33 +08:00
Shiqing Gao
e2da306755 dm: block_if: support bypassing the Service VM's page cache
This patch adds an acrn-dm option `nocache` to bypass the Service VM's
page cache.
 - By default, the Service VM's page cache is utilized.
 - If `nocache` is specified in acrn-dm parameters, the Service VM's page cache
   would be bypassed (opening the file/block with O_DIRECT flag).

Example to bypass the Service VM's page cache:
`add_virtual_device    5 virtio-blk iothread,mq=2,/dev/nvme2n1,writeback,nocache`

Tracked-On: #8612

Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2024-06-05 15:23:33 +08:00
Shiqing Gao
7e6a239646 dm: improve the flexibility of the iothread support
Prior to this patch, one single iothread instance is created and initialized
in the `main` function. This single iothread monitors all the registered fds
and handles all the corresponding requests. It leads to the limited flexibility
of the iothread support.

To improve the flexibility of the iothread support, this patch does:
- add the support of multiple iothread instances.
  `iothread_create` is introduced to create a certain number of iothread
  instances. It shall be called at first by each virtual device owner (such as
  virtio-blk BE) on initialization phase. Then, `iothread_add` can be called
  to add the to be monitored fd to the specified iothread.

- update virtio-blk BE to let the acrn-dm option `iothread` accept a number
  as the number of iothread instances to be created.
  If `iothread` is contained in the parameters, but the number is not specified,
  one iothread instance would be created by default.
  Examples to specify the number of iothread instances:
  1. Create 2 iothread instances
  `add_virtual_device    9 virtio-blk iothread=2,mq=2,/dev/nvme1n1,writeback,aio=io_uring`
  2. Create 1 iothread instances (by default)
  `add_virtual_device    9 virtio-blk iothread,mq=2,/dev/nvme1n1,writeback,aio=io_uring`

- update virtio-blk BE to separate the request handling of different virtqueues
  to different iothreads.
  The request from one or more virtqueues can be handled in one iothread.
  The mapping between virtqueues and iothreads is based on round robin.

v1 -> v2:
 * add a mutex to protect the free ioctx slot allocation

Tracked-On: #8612

Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2024-06-05 15:23:33 +08:00
Shiqing Gao
fed8ce513c dm: block_if: add the io_uring support
io_uring is a high-performance asynchronous I/O framework, primarily designed
to improve the efficiency of input and output (I/O) operations in user-space
applications.
This patch enables io_uring in block_if module. It utilizes the interfaces
provided by the user-space library `liburing` to interact with io_uring
in kernel-space.

To build the acrn-dm with io_uring support, `liburing-dev` package needs to be
installed. For example, it can be installed like below in Ubuntu 22.04.
        sudo apt install liburing-dev

In order to support both the thread pool mechanism and the io_uring mechanism,
an acrn-dm option `aio` is introduced. By default, thread pool mechanism is
selected.
- Example to use io_uring:
  `add_virtual_device    9 virtio-blk iothread,mq=2,/dev/nvme1n1,writeback,aio=io_uring`
- Example to use thread pool:
  `add_virtual_device    9 virtio-blk iothread,mq=2,/dev/nvme1n1,writeback,aio=threads`
- Example to use thread pool (by default):
  `add_virtual_device    9 virtio-blk iothread,mq=2,/dev/nvme1n1,writeback`

v2 -> v3:
 * Update iothread_handler
    - Use the unified eventfd interfaces to read the counter value of the
      ioeventfd.
    - Remove the while loop to read the ioeventfd. It is not necessary
      because one read would reset the counter value to 0.
 * Update iou_submit_sqe to return an error code
   The caller of iou_submit_sqe shall check the return value.
   If there is NO available submission queue entry in the submission queue,
   need to break the while loop. Request can only be submitted when SQE is
   available.

v1 -> v2:
 * move the logic of reading out ioeventfd from iothread.c to virtio.c, because
   it is specific to the virtqueue handling.

Tracked-On: #8612

Signed-off-by: Shiqing Gao <shiqing.gao@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2024-06-05 15:23:33 +08:00
Jian Jun Chen
edb392e7ed dm: block_if: add multiple queues support
block_if is the backend of ahci and virtio-blk. Only one queue is
supported by block_if now. Several worker threads are created as
the thread pool for the queue. One BIG mutex is used for the queue
and thread operation. With this patch block_if can support multiple
queues and each queue is backed by several worker threads. blockif_req
can be submited/enqueued into one specified queue. By spliting into
several queues contention from the BIG mutex can be relieved/eliminated.
This is used to support virtio-blk multiple queues feature.

Tracked-On: #8612

Signed-off-by: Jian Jun Chen <jian.jun.chen@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2024-06-05 15:23:33 +08:00
Sun Peng
fd671047ed dm: blockif: Convert print output to acrn-dm logger
Unifies the logs to pr_* interfaces instead of printf for better log management.

Tracked-On: #5267
Signed-off-by: Sun Peng <peng.p.sun@intel.com>
Reviewed-by: Chi Mingqiang <mingqiang.chi@intel.com>
Acked-by: Wang, Yu1 <yu1.wang@intel.com>
2020-09-10 09:33:25 +08:00
Mingqiang Chi
5267a9775c dm:replace perror with pr_err
use acrn-dm logger function instread of perror,
this helps the stability testing log capture.

Tracked-On: #4098
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
2020-01-08 13:37:57 +08:00
Mingqiang Chi
a59205f6a2 dm:use acrn-dm logger function instread of fprintf
use acrn-dm logger function instread of fprintf,
this helps the stability testing log capture.

Tracked-On: #4098
Signed-off-by: Mingqiang Chi <mingqiang.chi@intel.com>
Reviewed-by: Cao Minggui <minggui.cao@intel.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
2019-11-14 15:34:04 +08:00
Tianhua Sun
b96a3555bc dm: fix some possible memory leak
free memory allocated by strdup()

Tracked-On: #3395
Signed-off-by: Tianhua Sun <tianhuax.s.sun@intel.com>
Reviewed-by: Yonghua Huang <yonghua.huang@intel.com>
2019-07-12 09:41:15 +08:00
Conghui Chen
4145b8af6e dm: block: clean up assert
This patch is to clean up assert for block interface.
'magic' is removed from block structure, as the user should make sure
the block device is created and not closed when access to it.

Tracked-On: #3252
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
2019-06-20 08:56:29 +08:00
Peter Fang
18aebc0196 dm: safely use pthread_cond_broadcast()
Use pthread_cond_broadcast() while holding the mutex to guarantee the
signaling of its condition variable.

Tracked-On: #2763
Signed-off-by: Peter Fang <peter.fang@intel.com>
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
2019-04-04 00:27:10 +08:00
Minggui Cao
5397200118 DM: fix memory leak
1. free memory allocated by strdup in blockif_open
2. free msix.table when its pci device deinit

Tracked-On: #2704
Signed-off-by: Minggui Cao <minggui.cao@intel.com>
Acked-by: Yin Fengwei <fengwei.yin@intel.com>
2019-03-14 15:17:30 +08:00
Conghui Chen
ca925f0dd4 dm: storage: change DISCARD to synchronous mode
For virtio-blk, when the backend is a regular file, the discard
and
is implemented by fallocate(), but this function will not wait for
the discard command handled by disk.
So, add fdatasync to make sure the DISCARD is executed
synchronously.

Tracked-On: #2395
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Acked-by: Wang Yu <yu1.wang@intel.com>
2019-01-23 12:56:47 +08:00
Conghui Chen
2ddd24e022 dm: storage: support discard command
Support DISCARD command is meaningful when eMMC usage is high or
there are lots of remove operations. For example, when Guest
Android is running, there will be lots of files being created and
removed. However, virtio-blk BE does not support DISCARD command,
data remove operation in UOS will not trigger erase in eMMC. After
period of time, the eMMC will be consumed out, and erase must be
done by eMMC firmware before writing any new data. This causes the
eMMC performance decrease in the whole system (SOS and UOS).
To solve the problem, DISCARD should be supported in virtio-blk BE.

Tracked-On: #2011
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Yu Wang <yu1.wang@intel.com>
2018-12-18 13:21:07 +08:00
Conghui Chen
f71370ad81 dm: storage: rename delete to discard
To keep consistent with kernal code, change delete to discard.

Tracked-On: #2011
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Shuo A Liu <shuo.a.liu@intel.com>
Acked-by: Yu Wang <yu1.wang@intel.com>
2018-12-18 13:21:07 +08:00
Conghui Chen
21458bddff dm: storage: banned functions replace
1. replace sscanf with string API.
2. replace sprintf with snprintf
3. replace strlen with strnlen

Tracked-on: #1496
Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Shuo Liu <shuo.a.liu@intel.com>
Acked-by: Yin Fengwei <fengwei.yin@intel.com>
2018-10-17 16:22:00 +08:00
Peter Fang
7101ce87a7 dm: storage: remove GEOM support
There is no GEOM framework in Linux. Remove dead code.

v1 -> v2:
- complete removal of the entire GEOM logic

Tracked-On: #1422
Signed-off-by: Peter Fang <peter.fang@intel.com>
Acked-by: Yu Wang <yu1.wang@intel.com>
2018-10-15 22:20:04 +08:00
Conghui Chen
49322ac002 dm: storage: support cache mode toggling
1. support "writeback" and "writethru" mode toggling for virtio-blk
conditionally. When starting DM with "writethru" parameter in
virtio-blk, guest OS could not toggle cache mode. When starting DM
with "writeback" parameter in virtio-blk, guest OS could toggle
cache mode.

    ------------------------------
    DM cmdline  | toggle support
    ------------+-----------------
    writeback   | yes
    writethru   | no
    ------------------------------

2. To toggle cache mode, run below command in guest OS:

    echo "write back" > /sys/devices/xxx/vdx/cache_type
OR
    echo "write through" > /sys/devices/xxx/vdx/cache_type

Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Shuo Liu <shuo.a.liu@intel.com>
Acked-by: Yin Fengwei <fengwei.yin@intel.com>
2018-08-10 10:33:21 +08:00
Conghui Chen
2592ea8fa4 dm: storage: support writethru and writeback mode
In writethru mode, guest storage write are reported completed only
when the data has been written to physical storage.

In writeback mode, guest storage write are reported completed when
data is placed in SOS page cache. Needs to be flushed to the
physical storage.

USAGE:
    -s x,virtio-blk,<filepath>,writeback
    -s x,virtio-blk,<filepath>,writethru

The default mode is *writethru*

Signed-off-by: Conghui Chen <conghui.chen@intel.com>
Reviewed-by: Shuo Liu <shuo.a.liu@intel.com>
Acked-by: Yin Fengwei <fengwei.yin@intel.com>
2018-08-10 10:33:21 +08:00
Zide Chen
df4ab92e81 DM: cleanup for header inclusions
used https://gitlab.com/esr/deheader to detect and remove unnecessary
header file inclusions

Signed-off-by: Zide Chen <zide.chen@intel.com>
2018-06-07 14:35:30 +08:00
Yu Wang
8c06b69622 dm: Reorganize ACRN DM directory.
The current dm, all non-pci and non-acpi related files are put into
hw/platform directory. This is actually disturbed the meaning of
*platform*. The platform devices are mean of board and SoC specific
non-PCI devices, like usb devices, etc.

This patch refines the ACRN dm directory architecture.

For some common device logic files, likes block_if.c/uart_core.c or
usb_core.c. They will move to hw/ directly.

For platform architecture depended files, create arch/ under root dir.
And create sub-dir arch/x86 for x86 architecture, will create more
architectures in future. The pm.c will move to this new dir.

The hw/acpi will be moved to hw/platform/acpi due to acpi also be
considered as part of platform.

Signed-off-by: Yu Wang <yu1.wang@intel.com>
2018-05-15 17:25:58 +08:00