doc: reorganize HLD docs
Reorganize the high-level design docs to align with a work-in-progress
HLD document. Migrate the previous web content (and images) into the new
organization. From here we'll continue adding new design chapters as
they're reviewed and edited.

Signed-off-by: David B. Kinder <david.b.kinder@intel.com>
doc/developer-guides/hld/acpi-virt.rst (new file, 359 lines)
@@ -0,0 +1,359 @@
.. _acpi-virt-HLD:

ACPI Virtualization high-level design
#####################################

ACPI introduction
*****************

Advanced Configuration and Power Interface (ACPI) provides an open
standard that operating systems can use to discover and configure
computer hardware components and to perform power management, for
example, by monitoring status and putting unused components to sleep.

Functions implemented by ACPI include:

- System/Device/Processor power management
- Device/Processor performance management
- Configuration / Plug and Play
- System events
- Battery management
- Thermal management

ACPI enumerates and lists the different DMA engines in the platform, and
the device scope relationships between PCI devices and the DMA engine
that controls them. All these critical functions depend on ACPI tables.
Here's an example on an Apollo Lake platform (APL) with Linux installed:

.. code-block:: none

   root@:Dom0 ~ $ ls /sys/firmware/acpi/tables/
   APIC data DMAR DSDT dynamic FACP FACS HPET MCFG NHLT TPM2

These tables provide different information and functions:

- Advanced Programmable Interrupt Controller (APIC) for Symmetric
  Multiprocessor systems (SMP),
- DMA remapping (DMAR) for Intel |reg| Virtualization Technology for
  Directed I/O (VT-d),
- Non-HD Audio Link Table (NHLT) for supporting audio devices,
- and Differentiated System Description Table (DSDT) for system
  configuration info. DSDT is a major ACPI table used to describe what
  peripherals the machine has, along with information on PCI IRQ
  mappings and power management.

Most of the ACPI functionality is provided in ACPI Machine Language
(AML) bytecode stored in the ACPI tables. To make use of these tables,
Linux implements an interpreter for the AML bytecode. At BIOS
development time, the AML bytecode is compiled from the ASL (ACPI Source
Language) code. The ``iasl`` command is used to disassemble the ACPI table
and display its contents:

.. code-block:: none

   root@:Dom0 ~ $ cp /sys/firmware/acpi/tables/DMAR .
   root@:Dom0 ~ $ iasl -d DMAR

   Intel ACPI Component Architecture
   ASL+ Optimizing Compiler/Disassembler version 20170728
   Copyright (c) 2000 - 2017 Intel Corporation
   Input file DMAR, Length 0xB0 (176) bytes
   ACPI: DMAR 0x0000000000000000 0000B0 (v01 INTEL BDW 00000001 INTL 00000001)
   Acpi Data Table [DMAR] decoded
   Formatted output: DMAR.dsl - 5286 bytes

   root@:Dom0 ~ $ cat DMAR.dsl
   [000h 0000 4] Signature : "DMAR" [DMA Remapping table]
   [004h 0004 4] Table Length : 000000B0
   [008h 0008 1] Revision : 01
   ...
   [030h 0048 2] Subtable Type : 0000 [Hardware Unit Definition]
   [032h 0050 2] Length : 0018
   [034h 0052 1] Flags : 00
   [035h 0053 1] Reserved : 00
   [036h 0054 2] PCI Segment Number : 0000
   [038h 0056 8] Register Base Address : 00000000FED64000

From the displayed ASL, we can see some generic table fields, such as
the version information, and one VT-d remapping engine description with
0xFED64000 as its base address.

We can modify DMAR.dsl and assemble it back to AML:

.. code-block:: none

   root@:Dom0 ~ $ iasl DMAR.dsl
   Intel ACPI Component Architecture
   ASL+ Optimizing Compiler/Disassembler version 20170728
   Copyright (c) 2000 - 2017 Intel Corporation
   Table Input: DMAR.dsl - 113 lines, 5286 bytes, 72 fields
   Binary Output: DMAR.aml - 176 bytes
   Compilation complete. 0 Errors, 0 Warnings, 0 Remarks

We can see the new AML file ``DMAR.aml`` is created.

There are many ACPI tables in the system, linked together via table
pointers. In any ACPI-compatible system, the OS can enumerate all
needed tables starting from the Root System Description Pointer (RSDP)
provided at a known place in the system low address space, which points
to an XSDT (Extended System Description Table). The following picture
shows a typical ACPI table layout on an Intel APL platform:

.. figure:: images/acpi-image1.png
   :width: 700px
   :align: center
   :name: acpi-layout

   Typical ACPI table layout in an Intel APL platform

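To make this enumeration chain concrete, here is a minimal sketch of the
RSDP structure as defined by the public ACPI specification (illustrative
reference only, not ACRN code); an OS locates this structure in low
memory and follows its ``xsdt_address`` field to the XSDT, which in turn
points to the remaining tables:

.. code-block:: c

   /* Minimal sketch of the ACPI 2.0+ Root System Description Pointer,
    * following the public ACPI specification. */
   #include <stdint.h>

   struct acpi_rsdp {
       char     signature[8];       /* "RSD PTR " */
       uint8_t  checksum;           /* covers the first 20 bytes */
       char     oem_id[6];
       uint8_t  revision;           /* 2 or higher for ACPI 2.0+ */
       uint32_t rsdt_address;       /* 32-bit physical address of the RSDT */
       uint32_t length;             /* length of this structure */
       uint64_t xsdt_address;       /* 64-bit physical address of the XSDT */
       uint8_t  extended_checksum;  /* covers the entire structure */
       uint8_t  reserved[3];
   } __attribute__((packed));

Each table reachable from the XSDT begins with the same standard header
(signature, length, revision, checksum, and OEM fields), which is what
``iasl`` decodes in the DMAR example above.
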
ACPI virtualization
*******************

Most modern OSes require ACPI, so ACRN provides ACPI virtualization to
emulate an ACPI-capable virtual platform for the guest OS. To achieve
this, there are two options, depending on how physical devices and ACPI
resources are abstracted: Partitioning and Emulation.

Partitioning
============

One option is to assign and partition physical devices and ACPI
resources among all guest OSes. That means each guest OS owns specific
devices through passthrough, as shown below:

+--------------------------+--------------------------+--------------------------+
| PCI Devices              | VM0 (Cluster VM)         | VM1 (IVI VM)             |
+--------------------------+--------------------------+--------------------------+
| I2C                      | I2C3, I2C0               | I2C1, I2C2, I2C4, I2C5,  |
|                          |                          | I2C6, I2C7               |
+--------------------------+--------------------------+--------------------------+
| SPI                      | SPI1                     | SPI0, SPI2               |
+--------------------------+--------------------------+--------------------------+
| USB                      |                          | USB-Host (xHCI) and      |
|                          |                          | USB-Device (xDCI)        |
+--------------------------+--------------------------+--------------------------+
| SDIO                     |                          | SDIO                     |
+--------------------------+--------------------------+--------------------------+
| IPU                      |                          | IPU                      |
+--------------------------+--------------------------+--------------------------+
| Ethernet                 | Ethernet                 |                          |
+--------------------------+--------------------------+--------------------------+
| WIFI                     |                          | WIFI                     |
+--------------------------+--------------------------+--------------------------+
| Bluetooth                |                          | Bluetooth                |
+--------------------------+--------------------------+--------------------------+
| Audio                    |                          | Audio                    |
+--------------------------+--------------------------+--------------------------+
| GPIO                     | GPIO                     |                          |
+--------------------------+--------------------------+--------------------------+
| UART                     | UART                     |                          |
+--------------------------+--------------------------+--------------------------+

In an early ACRN development phase, partitioning was used for
simplicity. To implement partitioning, we need to hack the PCI logic to
make different VMs see a different subset of devices, and create one
copy of the ACPI tables for each of them, as shown in the following
picture:

.. figure:: images/acpi-image3.png
   :width: 900px
   :align: center

For each VM, its ACPI tables are standalone copies, not related to those
of other VMs. The OpRegion also needs to be copied for each VM.

For each table, we make modifications, based on the physical table, to
reflect the devices assigned to a particular VM. In the picture below,
we keep SP2(0:19.1) for VM0, and SP1(0:19.0)/SP3(0:19.2) for VM1. Any
time the partition policy changes, we need to modify both sets of tables
again, including disassembling, modifying, and reassembling them, which
is tricky and bug-prone.

.. figure:: images/acpi-image2.png
   :width: 900px
   :align: center

Emulation
=========

A second option is for the SOS (VM0) to "own" all devices and emulate a
set of virtual devices for each UOS (VM1). This is the most popular
model for virtualization, as shown below. ACRN currently uses device
emulation plus some device passthrough for the UOS.

.. figure:: images/acpi-image5.png
   :width: 400px
   :align: center

Regarding ACPI virtualization in ACRN, different policies are used for
different components:

- Hypervisor - ACPI is transparent to the Hypervisor, which has no
  knowledge of ACPI at all.
- SOS - All ACPI resources are physically owned by the SOS, which
  enumerates all ACPI tables and devices.
- UOS - Virtual ACPI resources, exposed by the device model, are owned
  by the UOS.

Source for the ACPI emulation code for the device model is found in
``hw/platform/acpi/acpi.c``.

Each entry in ``basl_ftables`` corresponds to one virtual ACPI table and
includes the following elements:

- wsect - output handler that writes the related ACPI table contents to
  a specific file
- offset - the related ACPI table offset in memory
- valid - dynamically indicates whether this table is needed

.. code-block:: c

   static struct {
       int (*wsect)(FILE *fp, struct vmctx *ctx);
       uint64_t offset;
       bool valid;
   } basl_ftables[] = {
       { basl_fwrite_rsdp, 0,           true  },
       { basl_fwrite_rsdt, RSDT_OFFSET, true  },
       { basl_fwrite_xsdt, XSDT_OFFSET, true  },
       { basl_fwrite_madt, MADT_OFFSET, true  },
       { basl_fwrite_fadt, FADT_OFFSET, true  },
       { basl_fwrite_hpet, HPET_OFFSET, true  },
       { basl_fwrite_mcfg, MCFG_OFFSET, true  },
       { basl_fwrite_facs, FACS_OFFSET, true  },
       { basl_fwrite_nhlt, NHLT_OFFSET, false }, /* valid with audio ptdev */
       { basl_fwrite_dsdt, DSDT_OFFSET, true  }
   };

The main function to create virtual ACPI tables is ``acpi_build``, which
calls ``basl_compile`` for each table and performs the following:

#. create two temp files: infile and outfile
#. with the output handler, write the table contents stream to infile
#. use the ``iasl`` tool to assemble infile into outfile
#. load the outfile contents to the required memory offset

.. code-block:: c

   static int
   basl_compile(struct vmctx *ctx,
                int (*fwrite_section)(FILE *, struct vmctx *),
                uint64_t offset)
   {
       struct basl_fio io[2];
       static char iaslbuf[3*MAXPATHLEN + 10];
       int err;

       err = basl_start(&io[0], &io[1]);
       if (!err) {
           err = (*fwrite_section)(io[0].fp, ctx);

           if (!err) {
               /*
                * iasl sends the results of the compilation to
                * stdout. Shut this down by using the shell to
                * redirect stdout to /dev/null, unless the user
                * has requested verbose output for debugging
                * purposes
                */
               if (basl_verbose_iasl)
                   snprintf(iaslbuf, sizeof(iaslbuf),
                            "%s -p %s %s",
                            ASL_COMPILER,
                            io[1].f_name, io[0].f_name);
               else
                   snprintf(iaslbuf, sizeof(iaslbuf),
                            "/bin/sh -c \"%s -p %s %s\" 1> /dev/null",
                            ASL_COMPILER,
                            io[1].f_name, io[0].f_name);

               err = system(iaslbuf);

               if (!err) {
                   /*
                    * Copy the aml output file into guest
                    * memory at the specified location
                    */
                   err = basl_load(ctx, io[1].fd, offset);
               } else
                   err = -1;
           }
           basl_end(&io[0], &io[1]);
       }
       return err;
   }

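For orientation, the per-table flow that ``acpi_build`` drives can be
pictured with the following sketch. This is a hypothetical illustration,
not the actual ACRN implementation; it only uses the ``basl_ftables``
fields described earlier:

.. code-block:: c

   /* Hypothetical sketch of the per-table flow driven by acpi_build():
    * skip entries not valid for this VM, then compile and load each
    * remaining table at its fixed guest-memory offset. */
   #define BASL_NTABLES (sizeof(basl_ftables) / sizeof(basl_ftables[0]))

   static int build_all_tables(struct vmctx *ctx)
   {
       int i, err = 0;

       for (i = 0; i < BASL_NTABLES && !err; i++) {
           if (!basl_ftables[i].valid)
               continue;   /* e.g. NHLT is skipped without an audio ptdev */

           err = basl_compile(ctx, basl_ftables[i].wsect,
                              basl_ftables[i].offset);
       }
       return err;
   }
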
After processing each entry, the virtual ACPI tables are present in UOS
memory.

For pass-through devices in the UOS, we likely need to add some ACPI
description in the UOS virtual DSDT table. There is one hook
(``passthru_write_dsdt``) in ``hw/pci/passthrough.c`` for this. The
following source code shows calls to different functions that add
different contents for each vendor and device id:

.. code-block:: c

   static void
   passthru_write_dsdt(struct pci_vdev *dev)
   {
       struct passthru_dev *ptdev = (struct passthru_dev *) dev->arg;
       uint32_t vendor = 0, device = 0;

       vendor = read_config(ptdev->phys_dev, PCIR_VENDOR, 2);

       if (vendor != 0x8086)
           return;

       device = read_config(ptdev->phys_dev, PCIR_DEVICE, 2);

       /* Provides ACPI extra info */
       if (device == 0x5aaa)
           /* XDCI @ 00:15.1 to enable ADB */
           write_dsdt_xhci(dev);
       else if (device == 0x5ab4)
           /* HDAC @ 00:17.0 as codec */
           write_dsdt_hdac(dev);
       else if (device == 0x5a98)
           /* HDAS @ 00:e.0 */
           write_dsdt_hdas(dev);
       else if (device == 0x5aac)
           /* i2c @ 00:16.0 for ipu */
           write_dsdt_ipu_i2c(dev);
       else if (device == 0x5abc)
           /* URT1 @ 00:18.0 for bluetooth */
           write_dsdt_urt1(dev);

   }

For instance, ``write_dsdt_urt1`` provides the ACPI contents for the
Bluetooth UART device when it is passed through to the UOS. It provides
the virtual PCI device/function as ``_ADR``, with other descriptions
possible for Bluetooth UART enumeration.

.. code-block:: c

   static void
   write_dsdt_urt1(struct pci_vdev *dev)
   {
       printf("write virt-%x:%x.%x in dsdt for URT1 @ 00:18.0\n",
              dev->bus,
              dev->slot,
              dev->func);
       dsdt_line("Device (URT1)");
       dsdt_line("{");
       dsdt_line("    Name (_ADR, 0x%04X%04X)", dev->slot, dev->func);
       dsdt_line("    Name (_DDN, \"Intel(R) HS-UART Controller #1\")");
       dsdt_line("    Name (_UID, One)");
       dsdt_line("    Name (RBUF, ResourceTemplate ()");
       dsdt_line("    {");
       dsdt_line("    })");
       dsdt_line("    Method (_CRS, 0, NotSerialized)");
       dsdt_line("    {");
       dsdt_line("        Return (RBUF)");
       dsdt_line("    }");
       dsdt_line("}");
   }

This document introduces basic ACPI virtualization. Other topics, such
as power management virtualization, add more requirements for ACPI and
will be discussed in the power management documentation.

doc/developer-guides/hld/hld-APL_GVT-g.rst (new file, 948 lines)
@@ -0,0 +1,948 @@
.. _APL_GVT-g-hld:

GVT-g high-level design
#######################
|
||||
|
||||
Introduction
************
|
||||
|
||||
Purpose of this Document
========================
|
||||
|
||||
This high-level design (HLD) document describes the usage requirements
|
||||
and high level design for Intel® Graphics Virtualization Technology for
|
||||
shared virtual :term:`GPU` technology (:term:`GVT-g`) on Apollo Lake-I
|
||||
SoCs.
|
||||
|
||||
This document describes:
|
||||
|
||||
- The different GPU virtualization techniques
|
||||
- GVT-g mediated pass-through
|
||||
- High level design
|
||||
- Key components
|
||||
- GVT-g new architecture differentiation
|
||||
|
||||
Audience
========
|
||||
|
||||
This document is for developers, validation teams, architects and
|
||||
maintainers of Intel® GVT-g for the Apollo Lake SoCs.
|
||||
|
||||
The reader should have some familiarity with the basic concepts of
|
||||
system virtualization and Intel® processor graphics.
|
||||
|
||||
Reference Documents
===================
|
||||
|
||||
The following documents were used as references for this specification:
|
||||
|
||||
- Paper in USENIX ATC '14 - *Full GPU Virtualization Solution with
|
||||
Mediated Pass-Through* - https://www.usenix.org/node/183932
|
||||
|
||||
- Hardware Specification - PRMs -
|
||||
https://01.org/linuxgraphics/documentation/hardware-specification-prms
|
||||
|
||||
Background
**********
|
||||
|
||||
Intel® GVT-g is an enabling technology in emerging graphics
|
||||
virtualization scenarios. It adopts a full GPU virtualization approach
|
||||
based on mediated pass-through technology, to achieve good performance,
|
||||
scalability and secure isolation among Virtual Machines (VMs). A virtual
|
||||
GPU (vGPU), with full GPU features, is presented to each VM so that a
|
||||
native graphics driver can run directly inside a VM.
|
||||
|
||||
Intel® GVT-g technology for Apollo Lake (APL) has been implemented in
|
||||
open source hypervisors or Virtual Machine Monitors (VMMs):
|
||||
|
||||
- Intel® GVT-g for ACRN, also known as "AcrnGT"
- Intel® GVT-g for KVM, also known as "KVMGT"
- Intel® GVT-g for Xen, also known as "XenGT"
|
||||
|
||||
The core vGPU device model is released under BSD/MIT dual license, so it
|
||||
can be reused in other proprietary hypervisors.
|
||||
|
||||
Intel has a portfolio of graphics virtualization technologies
|
||||
(:term:`GVT-g`, :term:`GVT-d` and :term:`GVT-s`). GVT-d and GVT-s are
|
||||
outside of the scope of this document.
|
||||
|
||||
This HLD applies to the Apollo Lake platform only. Support of other
|
||||
hardware is outside the scope of this HLD.
|
||||
|
||||
Targeted Usages
===============
|
||||
|
||||
The main targeted usage of GVT-g is in automotive applications, such as:
|
||||
|
||||
- An Instrument cluster running in one domain
|
||||
- An In Vehicle Infotainment (IVI) solution running in another domain
|
||||
- Additional domains for specific purposes, such as Rear Seat
|
||||
Entertainment or video camera capturing.
|
||||
|
||||
.. figure:: images/APL_GVT-g-ive-use-case.png
|
||||
:width: 900px
|
||||
:align: center
|
||||
:name: ive-use-case
|
||||
|
||||
IVE Use Case
|
||||
|
||||
Existing Techniques
===================
|
||||
|
||||
A graphics device is no different from any other I/O device, with
|
||||
respect to how the device I/O interface is virtualized. Therefore,
|
||||
existing I/O virtualization techniques can be applied to graphics
|
||||
virtualization. However, none of the existing techniques can meet the
|
||||
general requirement of performance, scalability, and secure isolation
|
||||
simultaneously. In this section, we review the pros and cons of each
|
||||
technique in detail, enabling the audience to understand the rationale
|
||||
behind the entire GVT-g effort.
|
||||
|
||||
Emulation
---------
|
||||
|
||||
A device can be emulated fully in software, including its I/O registers
|
||||
and internal functional blocks. There would be no dependency on the
|
||||
underlying hardware capability, therefore compatibility can be achieved
|
||||
across platforms. However, due to the CPU emulation cost, this technique
|
||||
is usually used for legacy devices, such as a keyboard, mouse, and VGA
|
||||
card. Fully emulating a modern accelerator, such as a GPU, would involve
great complexity and deliver extremely low performance. It may be acceptable
|
||||
for use in a simulation environment, but it is definitely not suitable
|
||||
for production usage.
|
||||
|
||||
API Forwarding
--------------
|
||||
|
||||
API forwarding, or a split driver model, is another widely-used I/O
virtualization technology. It has been used in commercial virtualization
products, for example, VMware*, PCoIP*, and Microsoft* RemoteFx*.
|
||||
It is a natural path when researchers study a new type of
|
||||
I/O virtualization usage, for example, when GPGPU computing in VM was
|
||||
initially proposed. Intel® GVT-s is based on this approach.
|
||||
|
||||
The architecture of API forwarding is shown in :numref:`api-forwarding`:
|
||||
|
||||
.. figure:: images/APL_GVT-g-api-forwarding.png
|
||||
:width: 400px
|
||||
:align: center
|
||||
:name: api-forwarding
|
||||
|
||||
API Forwarding
|
||||
|
||||
A frontend driver is employed to forward high-level API calls (OpenGL,
DirectX, and so on) inside a VM, to a Backend driver in the Hypervisor
|
||||
for acceleration. The Backend may be using a different graphics stack,
|
||||
so API translation between different graphics protocols may be required.
|
||||
The Backend driver allocates a physical GPU resource for each VM,
|
||||
behaving like a normal graphics application in a Hypervisor. Shared
|
||||
memory may be used to reduce memory copying between the host and guest
|
||||
graphic stacks.
|
||||
|
||||
API forwarding can bring hardware acceleration capability into a VM,
|
||||
with other merits such as vendor independence and high density. However, it
|
||||
also suffers from the following intrinsic limitations:
|
||||
|
||||
- Lagging features - Every new API version needs to be specifically
|
||||
handled, so it means slow time-to-market (TTM) to support new standards.
|
||||
For example,
|
||||
only DirectX9 is supported, when DirectX11 is already in the market.
|
||||
Also, there is a big gap in supporting media and compute usages.
|
||||
|
||||
- Compatibility issues - A GPU is very complex, and consequently so are
|
||||
high level graphics APIs. Different protocols are not 100% compatible
|
||||
on every subtle API, so the customer can observe feature/quality loss
|
||||
for specific applications.
|
||||
|
||||
- Maintenance burden - Grows as supported protocols and specific
  versions are added.
|
||||
|
||||
- Performance overhead - Different API forwarding implementations
|
||||
exhibit quite different performance, which gives rise to a need for a
|
||||
fine-grained graphics tuning effort.
|
||||
|
||||
Direct Pass-Through
-------------------
|
||||
|
||||
"Direct pass-through" dedicates the GPU to a single VM, providing full
|
||||
features and good performance, but at the cost of device sharing
|
||||
capability among VMs. Only one VM at a time can use the hardware
|
||||
acceleration capability of the GPU, which is a major limitation of this
|
||||
technique. However, it is still a good approach to enable graphics
|
||||
virtualization usages on Intel server platforms, as an intermediate
|
||||
solution. Intel® GVT-d uses this mechanism.
|
||||
|
||||
.. figure:: images/APL_GVT-g-pass-through.png
|
||||
:width: 400px
|
||||
:align: center
|
||||
:name: gvt-pass-through
|
||||
|
||||
Pass-Through
|
||||
|
||||
SR-IOV
|
||||
------
|
||||
|
||||
Single Root IO Virtualization (SR-IOV) implements I/O virtualization
|
||||
directly on a device. Multiple Virtual Functions (VFs) are implemented,
|
||||
with each VF directly assignable to a VM.
|
||||
|
||||
Mediated Pass-Through
|
||||
*********************
|
||||
|
||||
Intel® GVT-g achieves full GPU virtualization using a "mediated
|
||||
pass-through" technique.
|
||||
|
||||
Concept
|
||||
=======
|
||||
|
||||
Mediated pass-through allows a VM to access performance-critical I/O
|
||||
resources (usually partitioned) directly, without intervention from the
|
||||
hypervisor in most cases. Privileged operations from this VM are
|
||||
trapped-and-emulated to provide secure isolation among VMs.
|
||||
|
||||
.. figure:: images/APL_GVT-g-mediated-pass-through.png
|
||||
:width: 400px
|
||||
:align: center
|
||||
:name: mediated-pass-through
|
||||
|
||||
Mediated Pass-Through
|
||||
|
||||
The Hypervisor must ensure that no vulnerability is exposed when
|
||||
assigning performance-critical resource to each VM. When a
|
||||
performance-critical resource cannot be partitioned, a scheduler must be
|
||||
implemented (either in software or hardware) to allow time-based sharing
|
||||
among multiple VMs. In this case, the device must allow the hypervisor
|
||||
to save and restore the hardware state associated with the shared resource,
|
||||
either through direct I/O register reads and writes (when there is no software
|
||||
invisible state) or through a device-specific context save and restore
|
||||
mechanism (where there is a software invisible state).
|
||||
|
||||
Examples of performance-critical I/O resources include the following:
|
||||
|
||||
.. figure:: images/APL_GVT-g-perf-critical.png
|
||||
:width: 800px
|
||||
:align: center
|
||||
:name: perf-critical
|
||||
|
||||
Performance-Critical I/O Resources
|
||||
|
||||
|
||||
The key to implementing mediated pass-through for a specific device is
|
||||
to define the right policy for various I/O resources.
|
||||
|
||||
Virtualization Policies for GPU Resources
=========================================
|
||||
|
||||
:numref:`graphics-arch` shows how Intel Processor Graphics works at a high level.
|
||||
Software drivers write commands into a command buffer through the CPU.
|
||||
The Render Engine in the GPU fetches these commands and executes them.
|
||||
The Display Engine fetches pixel data from the Frame Buffer and sends
|
||||
them to the external monitors for display.
|
||||
|
||||
.. figure:: images/APL_GVT-g-graphics-arch.png
|
||||
:width: 400px
|
||||
:align: center
|
||||
:name: graphics-arch
|
||||
|
||||
Architecture of Intel Processor Graphics
|
||||
|
||||
This architecture abstraction applies to most modern GPUs, but may
differ in how graphics memory is implemented. Intel Processor Graphics
uses system memory as graphics memory. System memory can be mapped into
multiple virtual address spaces by GPU page tables. A 4 GB global
virtual address space called "global graphics memory", accessible from
both the GPU and CPU, is mapped through a global page table. Local
graphics memory spaces are supported in the form of multiple 4 GB local
virtual address spaces, but are limited to access only by the Render
Engine through local page tables. Global graphics memory is mostly used
for the Frame Buffer and also serves as the Command Buffer. Massive data
accesses are made to local graphics memory when hardware acceleration is
in progress. Other GPUs have a similar page table mechanism accompanying
the on-die memory.
|
||||
|
||||
The CPU programs the GPU through GPU-specific commands, shown in
:numref:`graphics-arch`, using a producer-consumer model. The graphics
driver programs GPU commands into the Command Buffer, including primary
buffer and batch buffer, according to the high-level programming APIs,
such as OpenGL* or DirectX*. Then, the GPU fetches and executes the
commands. The primary buffer (also called the ring buffer) may chain
other batch buffers together; the terms primary buffer and ring buffer
are used interchangeably hereafter. The batch buffer is used to convey
the majority of the commands (up to ~98% of them) per programming model.
A register tuple (head, tail) is used to control the ring buffer. The
CPU submits the commands to the GPU by updating the tail, while the GPU
fetches commands from the head, and then notifies the CPU by updating
the head, after the commands have finished execution. Therefore, when
the GPU has executed all commands from the ring buffer, the head and
tail pointers are the same.

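To make the (head, tail) handshake concrete, here is a minimal,
hypothetical sketch of the producer side. The register offsets, helper
names, and ring size are assumptions for illustration and are not the
actual i915 programming interface:

.. code-block:: c

   /* Hypothetical command submission into a ring buffer: the CPU copies
    * commands behind the tail, then publishes the new tail via MMIO so
    * the GPU can fetch from the head. */
   #include <stdint.h>
   #include <string.h>

   #define RING_SIZE     0x10000u             /* assumed ring size      */
   #define RING_HEAD_REG 0x2034u              /* illustrative offsets   */
   #define RING_TAIL_REG 0x2030u

   extern uint8_t *ring_base;                 /* ring in graphics memory */
   extern uint32_t ring_tail;                 /* CPU-owned write offset  */

   uint32_t mmio_read32(uint32_t reg);        /* assumed MMIO helpers    */
   void     mmio_write32(uint32_t reg, uint32_t val);

   int submit_commands(const void *cmds, uint32_t len)
   {
       uint32_t head = mmio_read32(RING_HEAD_REG);
       uint32_t free = (head - ring_tail - 8) & (RING_SIZE - 1);

       /* This sketch does not handle wrapping around the ring end. */
       if (len > free || ring_tail + len > RING_SIZE)
           return -1;                         /* ring full, try later   */

       memcpy(ring_base + ring_tail, cmds, len); /* pass-through CB write */
       ring_tail = (ring_tail + len) & (RING_SIZE - 1);
       mmio_write32(RING_TAIL_REG, ring_tail);   /* privileged MMIO access */
       return 0;
   }
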
Having introduced the GPU architecture abstraction, it is important for
|
||||
us to understand how real-world graphics applications use the GPU
|
||||
hardware so that we can virtualize it in VMs efficiently. To do so, we
|
||||
characterized, for some representative GPU-intensive 3D workloads (the
|
||||
Phoronix Test Suite), the usages of the four critical interfaces:
|
||||
|
||||
1) the Frame Buffer,
|
||||
2) the Command Buffer,
|
||||
3) the GPU Page Table Entries (PTEs), which carry the GPU page tables, and
|
||||
4) the I/O registers, including Memory-Mapped I/O (MMIO) registers,
|
||||
Port I/O (PIO) registers, and PCI configuration space registers
|
||||
for internal state.
|
||||
|
||||
:numref:`access-patterns` shows the average access frequency of running
|
||||
Phoronix 3D workloads on the four interfaces.
|
||||
|
||||
The Frame Buffer and Command Buffer are the most performance-critical
resources, as shown in :numref:`access-patterns`. When the applications
are being loaded, lots of source vertices and pixels are written by the
CPU, so the Frame Buffer accesses occur in the range of hundreds of
thousands per second. Then at run-time, the CPU programs the GPU through
the commands to render the Frame Buffer, so the Command Buffer accesses
become the largest group, also in the hundreds of thousands per second.
PTE and I/O accesses are minor in both load and run-time phases, ranging
in the tens of thousands per second.
|
||||
|
||||
.. figure:: images/APL_GVT-g-access-patterns.png
|
||||
:width: 400px
|
||||
:align: center
|
||||
:name: access-patterns
|
||||
|
||||
Access Patterns of Running 3D Workloads
|
||||
|
||||
High Level Architecture
***********************
|
||||
|
||||
:numref:`gvt-arch` shows the overall architecture of GVT-g, based on the
|
||||
ACRN hypervisor, with SOS as the privileged VM, and multiple user
|
||||
guests. A GVT-g device model working with the ACRN hypervisor,
|
||||
implements the policies of trap and pass-through. Each guest runs the
|
||||
native graphics driver and can directly access performance-critical
|
||||
resources: the Frame Buffer and Command Buffer, with resource
|
||||
partitioning (as presented later). To protect privileged resources, that
|
||||
is, the I/O registers and PTEs, corresponding accesses from the graphics
|
||||
driver in user VMs are trapped and forwarded to the GVT device model in
|
||||
SOS for emulation. The device model leverages i915 interfaces to access
|
||||
the physical GPU.
|
||||
|
||||
In addition, the device model implements a GPU scheduler that runs
|
||||
concurrently with the CPU scheduler in ACRN to share the physical GPU
|
||||
timeslot among the VMs. GVT-g uses the physical GPU to directly execute
|
||||
all the commands submitted from a VM, so it avoids the complexity of
|
||||
emulating the Render Engine, which is the most complex part of the GPU.
|
||||
In the meantime, the resource pass-through of both the Frame Buffer and
|
||||
Command Buffer minimizes the hypervisor's intervention of CPU accesses,
|
||||
while the GPU scheduler guarantees every VM a quantum time-slice for
|
||||
direct GPU execution. With that, GVT-g can achieve near-native
|
||||
performance for a VM workload.
|
||||
|
||||
In :numref:`gvt-arch`, the yellow GVT device model works as a client on
|
||||
top of an i915 driver in the SOS. It has a generic Mediated Pass-Through
|
||||
(MPT) interface, compatible with all types of hypervisors. For ACRN,
|
||||
some extra development work is needed for such MPT interfaces. For
|
||||
example, we need some changes in ACRN-DM to make ACRN compatible with
|
||||
the MPT framework. The vGPU lifecycle is the same as the lifecycle of
|
||||
the guest VM creation through ACRN-DM. They interact through sysfs,
|
||||
exposed by the GVT device model.
|
||||
|
||||
.. figure:: images/APL_GVT-g-arch.png
|
||||
:width: 600px
|
||||
:align: center
|
||||
:name: gvt-arch
|
||||
|
||||
AcrnGT High-level Architecture
|
||||
|
||||
Key Techniques
**************
|
||||
|
||||
vGPU Device Model
=================
|
||||
|
||||
The vGPU Device model is the main component because it constructs the
|
||||
vGPU instance for each guest to satisfy every GPU request from the guest
|
||||
and gives the corresponding result back to the guest.
|
||||
|
||||
The vGPU Device Model provides the basic framework to do
trap-and-emulation, including MMIO virtualization, interrupt
virtualization, and display virtualization. It also handles and
processes all the requests internally (such as command scan and shadow),
schedules them in the proper manner, and finally submits them to
the SOS i915 driver.
|
||||
|
||||
.. figure:: images/APL_GVT-g-DM.png
|
||||
:width: 800px
|
||||
:align: center
|
||||
:name: GVT-DM
|
||||
|
||||
GVT-g Device Model
|
||||
|
||||
MMIO Virtualization
-------------------
|
||||
|
||||
Intel Processor Graphics implements two PCI MMIO BARs:
|
||||
|
||||
- **GTTMMADR BAR**: Combines both :term:`GGTT` modification range and Memory
|
||||
Mapped IO range. It is 16 MB on :term:`BDW`, with 2 MB used by MMIO, 6 MB
|
||||
reserved and 8 MB allocated to GGTT. GGTT starts from
|
||||
:term:`GTTMMADR` + 8 MB. In this section, we focus on virtualization of
|
||||
the MMIO range, discussing GGTT virtualization later.
|
||||
|
||||
- **GMADR BAR**: As the PCI aperture is used by the CPU to access tiled
|
||||
graphics memory, GVT-g partitions this aperture range among VMs for
|
||||
performance reasons.
|
||||
|
||||
A 2 MB virtual MMIO structure is allocated per vGPU instance.
|
||||
|
||||
All the virtual MMIO registers are emulated as simple in-memory
read-write; that is, the guest driver will read back the same value that
was programmed earlier. A common emulation handler (for example,
``intel_gvt_emulate_read/write``) is enough to handle such general
emulation requirements. However, some registers need to be emulated with
specific logic, for example, registers affected by changes in other
state, or registers needing additional auditing or translation when the
virtual register is updated. Therefore, a specific emulation handler
must be installed for those special registers.

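One way to picture the split between the common handler and
register-specific handlers is sketched below. The types, offsets, and
function names are assumptions for illustration, not the actual GVT
code:

.. code-block:: c

   /* Hypothetical per-register MMIO dispatch: most registers use the
    * default in-memory emulation; special registers install their own
    * write handlers. */
   #include <stddef.h>
   #include <stdint.h>

   struct vgpu { uint8_t mmio[2 * 1024 * 1024]; };  /* 2 MB virtual MMIO */

   typedef int (*mmio_write_fn)(struct vgpu *vgpu, uint32_t off,
                                uint32_t val);

   /* Default behavior: remember the value so the guest reads it back. */
   static int default_mmio_write(struct vgpu *vgpu, uint32_t off,
                                 uint32_t val)
   {
       *(uint32_t *)(vgpu->mmio + off) = val;
       return 0;
   }

   /* A register needing extra audit/translation before it is reflected. */
   static int special_mmio_write(struct vgpu *vgpu, uint32_t off,
                                 uint32_t val)
   {
       /* ... audit or translate val here ... */
       return default_mmio_write(vgpu, off, val);
   }

   static const struct {
       uint32_t      offset;
       mmio_write_fn write;             /* NULL means "use the default" */
   } mmio_handlers[] = {
       { 0x2030, NULL },                /* plain in-memory emulation    */
       { 0x2230, special_mmio_write },  /* register with specific logic */
   };

   static int emulate_mmio_write(struct vgpu *vgpu, uint32_t off,
                                 uint32_t val)
   {
       size_t i;

       for (i = 0; i < sizeof(mmio_handlers) / sizeof(mmio_handlers[0]); i++)
           if (mmio_handlers[i].offset == off && mmio_handlers[i].write)
               return mmio_handlers[i].write(vgpu, off, val);

       return default_mmio_write(vgpu, off, val);  /* common handler path */
   }
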
The graphics driver may have assumptions about the initial device state,
which is the state left behind when the BIOS hands control over to the
OS. To meet the driver's expectation, we need to provide an initial vGPU
state that matches what a driver would observe on a pGPU. So the host
graphics driver is expected to generate a snapshot of the physical GPU
state, which it does before the guest driver's initialization. This
snapshot is used as the initial vGPU state by the device model.
|
||||
|
||||
PCI Configuration Space Virtualization
--------------------------------------
|
||||
|
||||
PCI configuration space also needs to be virtualized in the device
|
||||
model. Different implementations may choose to implement the logic
|
||||
within the vGPU device model or in default system device model (for
|
||||
example, ACRN-DM). GVT-g emulates the logic in the device model.
|
||||
|
||||
Some information is vital for the vGPU device model, including:
|
||||
Guest PCI BAR, Guest PCI MSI, and Base of ACPI OpRegion.
|
||||
|
||||
Legacy VGA Port I/O Virtualization
----------------------------------
|
||||
|
||||
Legacy VGA is not supported in the vGPU device model. We rely on the
|
||||
default device model (for example, :term:`QEMU`) to provide legacy VGA
|
||||
emulation, which means either ISA VGA emulation or
|
||||
PCI VGA emulation.
|
||||
|
||||
Interrupt Virtualization
------------------------
|
||||
|
||||
The GVT device model does not touch the hardware interrupt in the new
|
||||
architecture, since it is hard to combine the interrupt controlling
|
||||
logic between the virtual device model and the host driver. To prevent
|
||||
architectural changes in the host driver, the host GPU interrupt does
|
||||
not go to the virtual device model and the virtual device model has to
|
||||
handle the GPU interrupt virtualization by itself. Virtual GPU
|
||||
interrupts are categorized into three types:
|
||||
|
||||
- Periodic GPU interrupts are emulated by timers. However, a notable
|
||||
exception to this is the VBlank interrupt. Due to the demands of user
|
||||
space compositors, such as Wayland, which requires a flip done event
|
||||
to be synchronized with a VBlank, this interrupt is forwarded from
|
||||
SOS to UOS when SOS receives it from the hardware.
|
||||
|
||||
- Event-based GPU interrupts are emulated by the emulation logic. For
|
||||
example, AUX Channel Interrupt.
|
||||
|
||||
- GPU command interrupts are emulated by a command parser and workload
|
||||
dispatcher. The command parser marks out which GPU command interrupts
|
||||
are generated during the command execution and the workload
|
||||
dispatcher injects those interrupts into the VM after the workload is
|
||||
finished.
|
||||
|
||||
.. figure:: images/APL_GVT-g-interrupt-virt.png
|
||||
:width: 400px
|
||||
:align: center
|
||||
:name: interrupt-virt
|
||||
|
||||
Interrupt Virtualization
|
||||
|
||||
Workload Scheduler
------------------
|
||||
|
||||
The scheduling policy and workload scheduler are decoupled for
scalability reasons. For example, a future QoS enhancement will only
impact the scheduling policy, while any i915 interface change or HW
submission interface change (from execlist to :term:`GuC`) will only
need workload scheduler updates.
|
||||
|
||||
The scheduling policy framework is the core of the vGPU workload
|
||||
scheduling system. It controls all of the scheduling actions and
|
||||
provides the developer with a generic framework for easy development of
|
||||
scheduling policies. The scheduling policy framework controls the work
|
||||
scheduling process without caring about how the workload is dispatched
|
||||
or completed. All the detailed workload dispatching is hidden in the
|
||||
workload scheduler, which is the actual executer of a vGPU workload.
|
||||
|
||||
The workload scheduler handles everything about one vGPU workload. Each
|
||||
hardware ring is backed by one workload scheduler kernel thread. The
|
||||
workload scheduler picks the workload from current vGPU workload queue
|
||||
and communicates with the virtual HW submission interface to emulate the
|
||||
"schedule-in" status for the vGPU. It performs context shadow, Command
|
||||
Buffer scan and shadow, PPGTT page table pin/unpin/out-of-sync, before
|
||||
submitting this workload to the host i915 driver. When the vGPU workload
|
||||
is completed, the workload scheduler asks the virtual HW submission
|
||||
interface to emulate the "schedule-out" status for the vGPU. The VM
|
||||
graphics driver then knows that a GPU workload is finished.
|
||||
|
||||
.. figure:: images/APL_GVT-g-scheduling.png
   :width: 500px
   :align: center
   :name: scheduling

   GVT-g Scheduling Framework

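The per-ring flow described above can be summarized with the following
hypothetical sketch; all types and helper names are illustrative, not
the real GVT-g implementation:

.. code-block:: c

   /* Hypothetical per-ring scheduler thread following the steps
    * described above. */
   struct workload;     /* one vGPU workload picked from a per-ring queue */

   struct workload *pick_from_render_owner(int ring_id);  /* GVT scheduler   */
   void emulate_schedule_in(struct workload *w);          /* virtual HW port */
   int  shadow_context_and_commands(struct workload *w);  /* scan + shadow   */
   int  submit_to_host_i915(struct workload *w);
   void wait_for_completion(struct workload *w);
   void emulate_schedule_out(struct workload *w);         /* copy state back */

   static void workload_thread(int ring_id)
   {
       for (;;) {
           /* Only the current render owner's queue is serviced. */
           struct workload *w = pick_from_render_owner(ring_id);

           if (!w)
               continue;                      /* nothing to run yet         */

           emulate_schedule_in(w);            /* vGPU sees "running"        */
           if (shadow_context_and_commands(w) == 0 &&
               submit_to_host_i915(w) == 0)
               wait_for_completion(w);        /* GPU executes shadowed work */

           emulate_schedule_out(w);           /* send interrupts to the VM  */
       }
   }
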
Workload Submission Path
------------------------
|
||||
|
||||
Software submits the workload using the legacy ring buffer mode on Intel
|
||||
Processor Graphics before Broadwell, which is no longer supported by the
|
||||
GVT-g virtual device model. A new HW submission interface named
|
||||
"Execlist" is introduced since Broadwell. With the new HW submission
|
||||
interface, software can achieve better programmability and easier
|
||||
context management. In Intel GVT-g, the vGPU submits the workload
|
||||
through the virtual HW submission interface. Each workload in submission
|
||||
will be represented as an ``intel_vgpu_workload`` data structure, a vGPU
|
||||
workload, which will be put on a per-vGPU and per-engine workload queue
|
||||
later after performing a few basic checks and verifications.
|
||||
|
||||
.. figure:: images/APL_GVT-g-workload.png
|
||||
:width: 800px
|
||||
:align: center
|
||||
:name: workload
|
||||
|
||||
GVT-g Workload Submission
|
||||
|
||||
|
||||
Display Virtualization
----------------------
|
||||
|
||||
GVT-g reuses the i915 graphics driver in the SOS to initialize the Display
|
||||
Engine, and then manages the Display Engine to show different VM frame
|
||||
buffers. When two vGPUs have the same resolution, only the frame buffer
|
||||
locations are switched.
|
||||
|
||||
.. figure:: images/APL_GVT-g-display-virt.png
|
||||
:width: 800px
|
||||
:align: center
|
||||
:name: display-virt
|
||||
|
||||
Display Virtualization
|
||||
|
||||
Direct Display Model
--------------------
|
||||
|
||||
.. figure:: images/APL_GVT-g-direct-display.png
|
||||
:width: 600px
|
||||
:align: center
|
||||
:name: direct-display
|
||||
|
||||
Direct Display Model
|
||||
|
||||
A typical automotive use case is where there are two displays in the car
|
||||
and each one needs to show one domain's content, with the two domains
|
||||
being the Instrument cluster and the In Vehicle Infotainment (IVI). As
|
||||
shown in :numref:`direct-display`, this can be accomplished through the direct
|
||||
display model of GVT-g, where the SOS and UOS are each assigned all HW
|
||||
planes of two different pipes. GVT-g has a concept of display owner on a
|
||||
per HW plane basis. If it determines that a particular domain is the
|
||||
owner of a HW plane, then it allows the domain's MMIO register write to
|
||||
flip a frame buffer to that plane to go through to the HW. Otherwise,
|
||||
such writes are blocked by the GVT-g.
|
||||
|
||||
Indirect Display Model
----------------------
|
||||
|
||||
.. figure:: images/APL_GVT-g-indirect-display.png
|
||||
:width: 600px
|
||||
:align: center
|
||||
:name: indirect-display
|
||||
|
||||
Indirect Display Model
|
||||
|
||||
For security or fastboot reasons, it may be determined that the UOS is
|
||||
either not allowed to display its content directly on the HW or it may
|
||||
be too late before it boots up and displays its content. In such a
|
||||
scenario, the responsibility of displaying content on all displays lies
|
||||
with the SOS. One of the use cases that can be realized is to display the
|
||||
entire frame buffer of the UOS on a secondary display. GVT-g allows for this
|
||||
model by first trapping all MMIO writes by the UOS to the HW. A proxy
|
||||
application can then capture the address in GGTT where the UOS has written
|
||||
its frame buffer and using the help of the Hypervisor and the SOS's i915
|
||||
driver, can convert the Guest Physical Addresses (GPAs) into Host
|
||||
Physical Addresses (HPAs) before making a texture source or EGL image
|
||||
out of the frame buffer and then either post processing it further or
|
||||
simply displaying it on a HW plane of the secondary display.
|
||||
|
||||
GGTT-Based Surface Sharing
--------------------------
|
||||
|
||||
One of the major automotive use cases is called "surface sharing". This
use case requires that the SOS accesses an individual surface or a set of
surfaces from the UOS without having to access the entire frame buffer of
the UOS. Unlike the previous two models, where the UOS did not have to do
anything to show its content and therefore a completely unmodified UOS
could continue to run, this model requires changes to the UOS.
|
||||
|
||||
This model can be considered an extension of the indirect display model.
Under the indirect display model, the UOS's frame buffer was temporarily
pinned by the UOS in video memory, accessible through the Global Graphics
Translation Table (GGTT). This GGTT-based surface sharing model takes
this a step further by having the UOS compositor temporarily pin all
application buffers into the GGTT. It then also requires the compositor
to create a metadata table with relevant surface information, such as
width, height, and GGTT offset, and to flip that in lieu of the frame
buffer. In the SOS, the proxy application knows that the GGTT offset has
been flipped, maps it, and through it can access the GGTT offset of an
application that it wants to access. It is worth mentioning that in this
model, UOS applications do not require any changes; only the compositor,
Mesa, and i915 driver had to be modified.

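A rough idea of what one entry of such a metadata table could carry is
sketched below; this layout is an assumption for illustration, not the
format actually used by the UOS compositor:

.. code-block:: c

   /* Hypothetical per-surface metadata entry flipped in lieu of a frame
    * buffer: enough for the SOS proxy application to locate and
    * interpret the pinned surface through its GGTT offset. */
   #include <stdint.h>

   struct shared_surface_meta {
       uint32_t width;        /* surface width in pixels            */
       uint32_t height;       /* surface height in pixels           */
       uint32_t stride;       /* bytes per row (assumed field)      */
       uint32_t format;       /* pixel format code (assumed field)  */
       uint64_t ggtt_offset;  /* where the compositor pinned the BO */
   };
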
This model has a major benefit and a major limitation. The
|
||||
benefit is that since it builds on top of the indirect display model,
|
||||
there are no special drivers necessary for it on either SOS or UOS.
|
||||
Therefore, any Real Time Operating System (RTOS) that uses
this model can simply do so without having to implement a driver, the
infrastructure for which may not be present in its operating system.
|
||||
The limitation of this model is that video memory dedicated for a UOS is
|
||||
generally limited to a couple of hundred MBs. This can easily be
|
||||
exhausted by a few application buffers so the number and size of buffers
|
||||
is limited. Since it is not a highly-scalable model, in general, Intel
|
||||
recommends the Hyper DMA buffer sharing model, described next.
|
||||
|
||||
Hyper DMA Buffer Sharing
------------------------
|
||||
|
||||
.. figure:: images/APL_GVT-g-hyper-dma.png
|
||||
:width: 800px
|
||||
:align: center
|
||||
:name: hyper-dma
|
||||
|
||||
Hyper DMA Buffer Design
|
||||
|
||||
Another approach to surface sharing is Hyper DMA Buffer sharing. This
|
||||
model extends the Linux DMA buffer sharing mechanism where one driver is
|
||||
able to share its pages with another driver within one domain.
|
||||
|
||||
Application buffers are backed by i915 Graphics Execution Manager
Buffer Objects (GEM BOs). As in GGTT surface
|
||||
sharing, this model also requires compositor changes. The compositor of
|
||||
UOS requests i915 to export these application GEM BOs and then passes
|
||||
them on to a special driver called the Hyper DMA Buf exporter whose job
|
||||
is to create a scatter gather list of pages mapped by PDEs and PTEs and
|
||||
export a Hyper DMA Buf ID back to the compositor.
|
||||
|
||||
The compositor then shares this Hyper DMA Buf ID with the SOS's Hyper DMA
Buf importer driver, which then maps the memory represented by this ID in
the SOS. A proxy application in the SOS can then provide this ID
to the SOS i915, which can create its own GEM BO. Finally, the application
|
||||
can use it as an EGL image and do any post processing required before
|
||||
either providing it to the SOS compositor or directly flipping it on a
|
||||
HW plane in the compositor's absence.
|
||||
|
||||
This model is highly scalable and can be used to share up to 4 GB worth
of pages. It is also not limited to sharing graphics buffers; other
buffers, such as those for the IPU, can also be shared this way. However,
it does require that the SOS port the Hyper DMA Buffer importer driver.
Also, the SOS must comprehend and implement the DMA buffer sharing model.
|
||||
|
||||
For detailed information about this model, please refer to the `Linux
|
||||
HYPER_DMABUF Driver High Level Design
|
||||
<https://github.com/downor/linux_hyper_dmabuf/blob/hyper_dmabuf_integration_v4/Documentation/hyper-dmabuf-sharing.txt>`_.
|
||||
|
||||
Plane-Based Domain Ownership
----------------------------
|
||||
|
||||
.. figure:: images/APL_GVT-g-plane-based.png
|
||||
:width: 600px
|
||||
:align: center
|
||||
:name: plane-based
|
||||
|
||||
Plane-Based Domain Ownership
|
||||
|
||||
Yet another mechanism for showing content of both the SOS and UOS on the
|
||||
same physical display is called plane-based domain ownership. Under this
|
||||
model, both the SOS and UOS are provided a set of HW planes that they can
|
||||
flip their contents on to. Since each domain provides its content, there
|
||||
is no need for any extra composition to be done through the SOS. The display
|
||||
controller handles alpha blending contents of different domains on a
|
||||
single pipe. This saves on any complexity on either the SOS or the UOS
|
||||
SW stack.
|
||||
|
||||
It is important to provide only specific planes and have them statically
|
||||
assigned to different Domains. To achieve this, the i915 driver of both
|
||||
domains is provided a command line parameter that specifies the exact
|
||||
planes that this domain has access to. The i915 driver then enumerates
|
||||
only those HW planes and exposes them to its compositor. It is then left
|
||||
to the compositor configuration to use these planes appropriately and
|
||||
show the correct content on them. No other changes are necessary.
|
||||
|
||||
While the biggest benefit of this model is that it is extremely simple
and quick to implement, it also has some drawbacks. First, since each domain
|
||||
is responsible for showing the content on the screen, there is no
|
||||
control of the UOS by the SOS. If the UOS is untrusted, this could
|
||||
potentially cause some unwanted content to be displayed. Also, there is
|
||||
no post processing capability, except that provided by the display
|
||||
controller (for example, scaling, rotation, and so on). So each domain
|
||||
must provide finished buffers with the expectation that alpha blending
|
||||
with another domain will not cause any corruption or unwanted artifacts.
|
||||
|
||||
Graphics Memory Virtualization
==============================
|
||||
|
||||
To achieve near-to-native graphics performance, GVT-g passes through the
|
||||
performance-critical operations, such as Frame Buffer and Command Buffer
|
||||
from the VM. For the global graphics memory space, GVT-g uses graphics
|
||||
memory resource partitioning and an address space ballooning mechanism.
|
||||
For local graphics memory spaces, GVT-g implements per-VM local graphics
|
||||
memory through a render context switch because local graphics memory is
|
||||
only accessible by the GPU.
|
||||
|
||||
Global Graphics Memory
----------------------
|
||||
|
||||
Graphics Memory Resource Partitioning
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
|
||||
GVT-g partitions the global graphics memory among VMs. Splitting the
CPU/GPU scheduling mechanism requires that the global graphics memory of
different VMs can be accessed by the CPU and the GPU simultaneously.
Consequently, GVT-g must, at any time, present each VM with its own
resource, leading to the resource partitioning approach for global
graphics memory, as shown in :numref:`mem-part`.
|
||||
|
||||
.. figure:: images/APL_GVT-g-mem-part.png
|
||||
:width: 800px
|
||||
:align: center
|
||||
:name: mem-part
|
||||
|
||||
Memory Partition and Ballooning
|
||||
|
||||
The performance impact of reduced global graphics memory resource
|
||||
due to memory partitioning is very limited according to various test
|
||||
results.
|
||||
|
||||
Address Space Ballooning
%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
|
||||
The address space ballooning technique is introduced to eliminate the
address translation overhead, shown in :numref:`mem-part`. GVT-g exposes
the partitioning information to the VM graphics driver through the
PVINFO MMIO window. The graphics driver marks the other VMs' regions as
'ballooned' and reserves them so they are never used by its graphics
memory allocator. Under this design, the guest view of the global
graphics memory space is exactly the same as the host view, and the
driver-programmed addresses, using guest physical addresses, can be
directly used by the hardware. Address space ballooning is different
from traditional memory ballooning techniques. Memory ballooning is for
memory usage control, concerning the number of ballooned pages, while
address space ballooning is used to balloon special memory address
ranges.

Another benefit of address space ballooning is that there is no address
translation overhead, as we use the guest Command Buffer for direct GPU
execution.

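A minimal sketch of how a guest driver could act on that partitioning
information is shown below, assuming hypothetical helper names for the
PVINFO read and the allocator reservation; this is not the actual i915
ballooning code:

.. code-block:: c

   /* Hypothetical sketch: the guest driver learns its slice of the 4 GB
    * global graphics memory from the PVINFO window and balloons the
    * rest, so its allocator never touches other VMs' regions. */
   #include <stdint.h>

   #define GGTT_TOTAL_SIZE (4ULL << 30)             /* 4 GB global space */

   uint64_t pvinfo_read64(uint32_t offset);         /* trapped MMIO read */
   void     reserve_range(uint64_t start, uint64_t size); /* mark ballooned */

   void balloon_foreign_ranges(uint32_t base_off, uint32_t size_off)
   {
       uint64_t my_base = pvinfo_read64(base_off);  /* this VM's partition */
       uint64_t my_size = pvinfo_read64(size_off);

       /* Everything below and above our slice belongs to other VMs (or
        * the host) and is reserved away from the allocator. */
       reserve_range(0, my_base);
       reserve_range(my_base + my_size,
                     GGTT_TOTAL_SIZE - (my_base + my_size));
   }
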
Per-VM Local Graphics Memory
----------------------------
|
||||
|
||||
GVT-g allows each VM to use the full local graphics memory spaces of its
|
||||
own, similar to the virtual address spaces on the CPU. The local
|
||||
graphics memory spaces are only visible to the Render Engine in the GPU.
|
||||
Therefore, any valid local graphics memory address, programmed by a VM,
|
||||
can be used directly by the GPU. The GVT-g device model switches the
|
||||
local graphics memory spaces, between VMs, when switching render
|
||||
ownership.
|
||||
|
||||
GPU Page Table Virtualization
=============================
|
||||
|
||||
Shared Shadow GGTT
------------------
|
||||
|
||||
To achieve resource partitioning and address space ballooning, GVT-g
implements a shared shadow global page table for all VMs. Each VM has
its own guest global page table to translate the graphics memory page
number to the Guest memory Page Number (GPN). The shadow global page
table then translates the graphics memory page number to the Host memory
Page Number (HPN).

The shared shadow global page table maintains the translations for all
VMs to support concurrent accesses from the CPU and GPU. Therefore,
GVT-g implements a single, shared shadow global page table by trapping
guest PTE updates, as shown in :numref:`shared-shadow`. The global page
table, in MMIO space, has 1024K PTEs, each pointing to a 4 KB system
memory page, so the global page table overall creates a 4 GB global
graphics memory space. GVT-g audits the guest PTE values according to
the address space ballooning information before updating the shadow PTE
entries.

.. figure:: images/APL_GVT-g-shared-shadow.png
   :width: 600px
   :align: center
   :name: shared-shadow

   Shared Shadow Global Page Table

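The audit-then-shadow flow for a trapped guest PTE write can be sketched
as follows; the types, helpers, and PTE layout are assumptions for
illustration, not the actual GVT-g implementation:

.. code-block:: c

   /* Hypothetical handler for a trapped guest GGTT PTE write. */
   #include <stdbool.h>
   #include <stdint.h>

   struct vgpu;

   bool     index_within_vgpu_partition(struct vgpu *vgpu, uint32_t index);
   uint64_t gpn_to_hpn(struct vgpu *vgpu, uint64_t gpn);   /* host lookup */
   void     write_shadow_ggtt_entry(uint32_t index, uint64_t hpn,
                                    uint64_t flags);

   static int handle_guest_ggtt_write(struct vgpu *vgpu, uint32_t index,
                                      uint64_t guest_pte)
   {
       uint64_t gpn   = guest_pte >> 12;        /* assumed PTE layout */
       uint64_t flags = guest_pte & 0xfff;

       /* Audit: the entry must fall inside this vGPU's slice of the
        * 4 GB global graphics memory space (per the ballooning info). */
       if (!index_within_vgpu_partition(vgpu, index))
           return -1;

       /* Translate GPN to HPN and update the single shared shadow table
        * that the physical GPU actually walks. */
       write_shadow_ggtt_entry(index, gpn_to_hpn(vgpu, gpn), flags);
       return 0;
   }
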
Per-VM Shadow PPGTT
-------------------
|
||||
|
||||
To support local graphics memory access pass-through, GVT-g implements
|
||||
per-VM shadow local page tables. The local graphics memory is only
|
||||
accessible from the Render Engine. The local page tables have two-level
|
||||
paging structures, as shown in :numref:`per-vm-shadow`.
|
||||
|
||||
The first level, Page Directory Entries (PDEs), located in the global
|
||||
page table, points to the second level, Page Table Entries (PTEs) in
|
||||
system memory, so guest accesses to the PDE are trapped and emulated,
|
||||
through the implementation of shared shadow global page table.
|
||||
|
||||
GVT-g also write-protects a list of guest PTE pages for each VM. The
|
||||
GVT-g device model synchronizes the shadow page with the guest page, at
|
||||
the time of write-protection page fault, and switches the shadow local
|
||||
page tables at render context switches.
|
||||
|
||||
.. figure:: images/APL_GVT-g-per-vm-shadow.png
|
||||
:width: 800px
|
||||
:align: center
|
||||
:name: per-vm-shadow
|
||||
|
||||
Per-VM Shadow PPGTT
|
||||
|
||||
Prioritized Rendering and Preemption
====================================
|
||||
|
||||
Different Schedulers and Their Roles
------------------------------------
|
||||
|
||||
.. figure:: images/APL_GVT-g-scheduling-policy.png
|
||||
:width: 800px
|
||||
:align: center
|
||||
:name: scheduling-policy
|
||||
|
||||
Scheduling Policy
|
||||
|
||||
In the system, there are three different schedulers for the GPU:
|
||||
|
||||
- i915 UOS scheduler
|
||||
- Mediator GVT scheduler
|
||||
- i915 SOS scheduler
|
||||
|
||||
Since UOS always uses the host-based command submission (ELSP) model,
|
||||
and it never accesses the GPU or the Graphic Micro Controller (GuC)
|
||||
directly, its scheduler cannot do any preemption by itself.
|
||||
The i915 scheduler does ensure batch buffers are
|
||||
submitted in dependency order, that is, if a compositor had to wait for
|
||||
an application buffer to finish before its workload can be submitted to
|
||||
the GPU, then the i915 scheduler of the UOS ensures that this happens.
|
||||
|
||||
The UOS assumes that by submitting its batch buffers to the Execlist
|
||||
Submission Port (ELSP), the GPU will start working on them. However,
|
||||
the MMIO write to the ELSP is captured by the Hypervisor, which forwards
|
||||
these requests to the GVT module. GVT then creates a shadow context
|
||||
based on this batch buffer and submits the shadow context to the SOS
|
||||
i915 driver.
|
||||
|
||||
However, this submission depends on a second scheduler called the GVT
scheduler. This scheduler is time based and uses a round robin algorithm
to provide a specific time window for each UOS to submit its workload
when it is considered the "render owner". The workloads of the UOSs that
are not render owners during a specific time period end up waiting in
the virtual GPU context until the GVT scheduler makes them render
owners.
|
||||
The GVT shadow context submits only one workload at
|
||||
a time, and once the workload is finished by the GPU, it copies any
|
||||
context state back to DomU and sends the appropriate interrupts before
|
||||
picking up any other workloads from either this UOS or another one. This
|
||||
also implies that this scheduler does not do any preemption of
|
||||
workloads.
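
The round-robin "render owner" behavior described above can be
illustrated with a small, self-contained sketch. The structure, names,
and quantum handling below are hypothetical and greatly simplified; they
do not reflect the real GVT scheduler implementation.

.. code-block:: c

   #include <stdbool.h>
   #include <stdio.h>

   /* Hypothetical, simplified model of a time-based round-robin
    * "render owner" scheduler; all names are illustrative only.
    */
   struct vgpu {
       int id;
       bool has_pending_workload;
   };

   #define NUM_VGPU 4
   static struct vgpu vgpus[NUM_VGPU] = {
       { 0, true }, { 1, false }, { 2, true }, { 3, false },
   };
   static int render_owner;

   /* Stand-in for shadow-context creation plus ELSP submission. */
   static void submit_shadow_context(struct vgpu *v)
   {
       printf("vGPU %d is render owner; submitting one workload\n", v->id);
       v->has_pending_workload = false;
   }

   /* Called on every scheduling quantum: rotate the render owner and
    * submit at most one queued workload for the new owner.  Workloads
    * of non-owners simply stay queued until their turn comes around.
    */
   void gvt_sched_tick(void)
   {
       render_owner = (render_owner + 1) % NUM_VGPU;
       if (vgpus[render_owner].has_pending_workload)
           submit_shadow_context(&vgpus[render_owner]);
   }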

Finally, there is the i915 scheduler in the SOS. This scheduler uses the
GuC or ELSP to do command submission of SOS local content as well as any
content that GVT is submitting to it on behalf of the UOSs. This
scheduler uses GuC or ELSP to preempt workloads. GuC has four different
priority queues, but the SOS i915 driver uses only two of them. One of
them is considered high priority and the other is normal priority, with
a GuC rule being that any command submitted on the high priority queue
would immediately try to preempt any workload submitted on the normal
priority queue. For ELSP submission, the i915 driver submits a preempt
context to preempt the currently running context and then waits for the
GPU engine to become idle.

While the identification of workloads to be preempted is decided by
customizable scheduling policies, once a candidate for preemption is
identified, the i915 scheduler simply submits a preemption request to
the GuC high-priority queue. Based on the HW's ability to preempt (on an
Apollo Lake SoC, a 3D workload is preemptible at the 3D primitive level
with some exceptions), the currently executing workload is saved and
preempted. The GuC informs the driver of the preemption event using an
interrupt. After handling the interrupt, the driver submits the
high-priority workload through the normal priority GuC queue. As such,
the normal priority GuC queue is used for actual execbuf submission most
of the time, with the high-priority GuC queue only being used for the
preemption of lower-priority workloads.

Scheduling policies are customizable and left to customers to change if
they are not satisfied with the built-in i915 driver policy, where all
workloads of the SOS are considered higher priority than those of the
UOS. This policy can be enforced through an SOS i915 kernel command line
parameter, and can replace the default in-order command submission (no
preemption) policy.

AcrnGT
*******

ACRN is a flexible, lightweight reference hypervisor, built with
real-time and safety-criticality in mind, optimized to streamline
embedded development through an open source platform.

AcrnGT is the GVT-g implementation on the ACRN hypervisor. It adapts
the MPT interface of GVT-g onto ACRN by using the kernel APIs provided
by ACRN.

:numref:`full-pic` shows the full architecture of AcrnGT with a Linux Guest
OS and an Android Guest OS.

.. figure:: images/APL_GVT-g-full-pic.png
   :width: 800px
   :align: center
   :name: full-pic

   Full picture of the AcrnGT

AcrnGT in kernel
=================

The AcrnGT module in the SOS kernel acts as an adaptation layer
connecting GVT-g in the i915 driver, the VHM module, and the ACRN-DM
user space application:

- The AcrnGT module implements the MPT interface of GVT-g to provide
  services to it, including setting and unsetting trap areas, setting
  and unsetting write-protected pages, etc.

- It calls the VHM APIs provided by the ACRN VHM module in the SOS
  kernel, to eventually call into the routines provided by the ACRN
  hypervisor through hypercalls.

- It provides user space interfaces through ``sysfs`` to the user space
  ACRN-DM, so that the DM can manage the lifecycle of the virtual GPUs.

AcrnGT in DM
=============

To emulate a PCI device to a Guest, we need an AcrnGT sub-module in the
ACRN-DM. This sub-module is responsible for:

- registering the virtual GPU device to the PCI device tree presented to
  the guest;

- registering the MMIO resources with ACRN-DM so that it can reserve
  resources in the ACPI table;

- managing the lifecycle of the virtual GPU device, such as creation,
  destruction, and resetting according to the state of the virtual
  machine.
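
A hedged sketch of the kind of operations table such a DM sub-module
might register is shown below. The structure and callback names are
illustrative assumptions made for this document only and do not match
the actual ACRN-DM interface.

.. code-block:: c

   /* Illustrative only: a hypothetical operations table for the AcrnGT
    * sub-module in the DM; names do not match the real ACRN-DM code.
    */
   struct acrn_vm;                  /* opaque handle for the guest VM */

   struct gvt_dm_ops {
       /* add the virtual GPU to the PCI device tree seen by the guest
        * and reserve its MMIO ranges in the guest ACPI tables
        */
       int  (*register_vgpu)(struct acrn_vm *vm, int bus, int slot, int func);

       /* lifecycle management, driven by the virtual machine's state */
       int  (*create_instance)(struct acrn_vm *vm);
       void (*reset_instance)(struct acrn_vm *vm);
       void (*destroy_instance)(struct acrn_vm *vm);
   };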
10
doc/developer-guides/hld/hld-devicemodel.rst
Normal file
@@ -0,0 +1,10 @@
.. _hld-devicemodel:

Device Model high-level design
##############################


.. toctree::
   :maxdepth: 1

   ACPI virtualization <acpi-virt>
11
doc/developer-guides/hld/hld-emulated-devices.rst
Normal file
@@ -0,0 +1,11 @@
.. _hld-emulated-devices:

Emulated Devices high-level design
##################################

.. toctree::
   :maxdepth: 1

   GVT-g GPU Virtualization <hld-APL_GVT-g>
   UART virtualization <uart-virt-hld>
   Watchdog virtualization <watchdog-hld>
11
doc/developer-guides/hld/hld-hypervisor.rst
Normal file
@@ -0,0 +1,11 @@
.. _hld-hypervisor:

Hypervisor high-level design
############################


.. toctree::
   :maxdepth: 1

   Memory management <memmgt-hld>
   Interrupt management <interrupt-hld>
4
doc/developer-guides/hld/hld-overview.rst
Normal file
@@ -0,0 +1,4 @@
.. _hld-overview:

Overview
########
4
doc/developer-guides/hld/hld-power-management.rst
Normal file
@@ -0,0 +1,4 @@
.. _hld-power-management:

Power Management high-level design
##################################
1079
doc/developer-guides/hld/hld-security.rst
Normal file
4
doc/developer-guides/hld/hld-trace-log.rst
Normal file
@@ -0,0 +1,4 @@
.. _hld-trace-log:

Tracing and Logging high-level design
#####################################
499
doc/developer-guides/hld/hld-virtio-devices.rst
Normal file
@@ -0,0 +1,499 @@
.. _hld-virtio-devices:
.. _virtio-hld:

Virtio devices high-level design
################################

The ACRN Hypervisor follows the `Virtual I/O Device (virtio)
specification
<http://docs.oasis-open.org/virtio/virtio/v1.0/virtio-v1.0.html>`_ to
realize I/O virtualization for many performance-critical devices
supported in the ACRN project. Adopting the virtio specification lets us
reuse many frontend virtio drivers already available in a Linux-based
User OS, drastically reducing potential development effort for frontend
virtio drivers. To further reduce the development effort of backend
virtio drivers, the hypervisor provides the virtio backend service
(VBS) APIs, which make it very straightforward to implement a virtio
device in the hypervisor.

The virtio APIs can be divided into 3 groups: DM APIs, virtio backend
service (VBS) APIs, and virtqueue (VQ) APIs, as shown in
:numref:`be-interface`.

.. figure:: images/virtio-hld-image0.png
   :width: 900px
   :align: center
   :name: be-interface

   ACRN Virtio Backend Service Interface

- **DM APIs** are exported by the DM, and are mainly used during the
  device initialization phase and runtime. The DM APIs also include
  PCIe emulation APIs because each virtio device is a PCIe device in
  the SOS and UOS.
- **VBS APIs** are mainly exported by the VBS and related modules.
  Generally they are callbacks to be registered into the DM.
- **VQ APIs** are used by a virtio backend device to access and parse
  information from the shared memory between the frontend and backend
  device drivers.

The virtio framework is the para-virtualization specification that ACRN
follows to implement I/O virtualization of performance-critical
devices such as audio, eAVB/TSN, IPU, and CSMU devices. This section
gives an overview of virtio history, motivation, and advantages, and
then highlights key virtio concepts. It then describes ACRN's virtio
architecture and elaborates on the ACRN virtio APIs. Finally, it
introduces the virtio devices currently supported by ACRN.

Virtio introduction
*******************

Virtio is an abstraction layer over devices in a para-virtualized
hypervisor. Virtio was developed by Rusty Russell when he worked at IBM
research to support his lguest hypervisor in 2007, and it quickly became
the de-facto standard for KVM's para-virtualized I/O devices.

Virtio is very popular for virtual I/O devices because it provides a
straightforward, efficient, standard, and extensible mechanism, and
eliminates the need for boutique, per-environment, or per-OS mechanisms.
For example, rather than having a variety of device emulation
mechanisms, virtio provides a common frontend driver framework that
standardizes device interfaces, and increases code reuse across
different virtualization platforms.

Given the advantages of virtio, ACRN also follows the virtio
specification.

Key Concepts
************

To better understand virtio, especially its usage in ACRN, we'll
highlight several key virtio concepts important to ACRN:

Frontend virtio driver (FE)
  Virtio adopts a frontend-backend architecture that enables a simple
  but flexible framework for both frontend and backend virtio drivers.
  The FE driver merely needs to offer services to configure the
  interface, pass messages, produce requests, and kick the backend
  virtio driver. As a result, the FE driver is easy to implement and
  the performance overhead of emulating a device is eliminated.

Backend virtio driver (BE)
  Similar to the FE driver, the BE driver, running either in user-land
  or kernel-land of the host OS, consumes requests from the FE driver
  and sends them to the host native device driver. Once the requests
  are done by the host native device driver, the BE driver notifies the
  FE driver that the request is complete.

  Note: to distinguish the BE driver from the host native device
  driver, the host native device driver is called the "native driver"
  in this document.

Straightforward: virtio devices as standard devices on existing buses
  Instead of creating new device buses from scratch, virtio devices are
  built on existing buses. This gives a straightforward way for both FE
  and BE drivers to interact with each other. For example, the FE
  driver could read/write registers of the device, and the virtual
  device could interrupt the FE driver, on behalf of the BE driver, in
  case something of interest is happening.

  Currently virtio supports the PCI/PCIe bus and the MMIO bus. In ACRN,
  only the PCI/PCIe bus is supported, and all the virtio devices share
  the same vendor ID 0x1AF4.

  Note: For MMIO, the "bus" is a bit of an overstatement since it is
  basically just a few descriptors describing the devices.

Efficient: batching operation is encouraged
  Batching operations and deferred notification are important to
  achieve high-performance I/O, since notification between the FE and
  BE drivers usually involves an expensive exit of the guest. Therefore
  batching operations and notification suppression are highly
  encouraged if possible. This gives an efficient implementation for
  performance-critical devices.

Standard: virtqueue
  All virtio devices share a standard ring buffer and descriptor
  mechanism, called a virtqueue, shown in :numref:`virtqueue`. A
  virtqueue is a queue of scatter-gather buffers. There are three
  important methods on virtqueues:

  - **add_buf** is for adding a request/response buffer in a virtqueue,
  - **get_buf** is for getting a response/request in a virtqueue, and
  - **kick** is for notifying the other side for a virtqueue to consume
    buffers.

  The virtqueues are created in guest physical memory by the FE
  drivers. BE drivers only need to parse the virtqueue structures to
  obtain the requests and process them. How a virtqueue is organized is
  specific to the Guest OS. In the Linux implementation of virtio, the
  virtqueue is implemented as a ring buffer structure called vring.

  In ACRN, the virtqueue APIs can be leveraged directly so that users
  don't need to worry about the details of the virtqueue. (Refer to the
  guest OS for more details about the virtqueue implementation.)

  .. figure:: images/virtio-hld-image2.png
     :width: 900px
     :align: center
     :name: virtqueue

     Virtqueue

Extensible: feature bits
  A simple extensible feature negotiation mechanism exists for each
  virtual device and its driver. Each virtual device could claim its
  device-specific features while the corresponding driver could respond
  to the device with the subset of features the driver understands. The
  feature mechanism enables forward and backward compatibility for the
  virtual device and driver.
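
  As a minimal illustration of this mechanism (the feature-bit names
  below are hypothetical, not taken from the virtio specification),
  negotiation amounts to taking the intersection of what the device
  offers and what the driver understands:

  .. code-block:: c

     #include <stdint.h>
     #include <stdio.h>

     /* Hypothetical feature bits for an imaginary device. */
     #define HYPOTHETICAL_F_EVENT_IDX   (1u << 0)
     #define HYPOTHETICAL_F_INDIRECT    (1u << 1)
     #define HYPOTHETICAL_F_FANCY_MODE  (1u << 2)

     int main(void)
     {
         /* Features the (virtual) device claims to support ... */
         uint32_t device_features = HYPOTHETICAL_F_EVENT_IDX |
                                    HYPOTHETICAL_F_INDIRECT  |
                                    HYPOTHETICAL_F_FANCY_MODE;

         /* ... and features this driver actually understands. */
         uint32_t driver_features = HYPOTHETICAL_F_EVENT_IDX |
                                    HYPOTHETICAL_F_INDIRECT;

         /* The negotiated set is the intersection; both sides must
          * operate using only these bits afterwards.
          */
         uint32_t negotiated = device_features & driver_features;

         printf("negotiated features: 0x%x\n", negotiated);
         return 0;
     }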

Virtio Device Modes
  The virtio specification defines three modes of virtio devices:
  a legacy mode device, a transitional mode device, and a modern mode
  device. A legacy mode device is compliant with virtio specification
  version 0.95, a transitional mode device is compliant with both the
  0.95 and 1.0 spec versions, and a modern mode device is only
  compatible with the version 1.0 specification.

  In ACRN, all the virtio devices are transitional devices, meaning
  that they should be compatible with both the 0.95 and 1.0 versions of
  the virtio specification.

Virtio Device Discovery
  Virtio devices are commonly implemented as PCI/PCIe devices. A
  virtio device using virtio over the PCI/PCIe bus must expose an
  interface to the Guest OS that meets the PCI/PCIe specifications.

  Conventionally, any PCI device with Vendor ID 0x1AF4,
  PCI_VENDOR_ID_REDHAT_QUMRANET, and Device ID 0x1000 through 0x107F
  inclusive is a virtio device. Among the Device IDs, the
  legacy/transitional mode virtio devices occupy the first 64 IDs
  ranging from 0x1000 to 0x103F, while the range 0x1040-0x107F belongs
  to virtio modern devices. In addition, the Subsystem Vendor ID should
  reflect the PCI/PCIe vendor ID of the environment, and the Subsystem
  Device ID indicates which virtio device is supported by the device.
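
  The ID check described above can be written down directly; the helper
  functions below are illustrative (they are not part of any ACRN or
  Linux API), but the ID ranges are the ones listed in this section:

  .. code-block:: c

     #include <stdbool.h>
     #include <stdint.h>

     #define VIRTIO_PCI_VENDOR_ID  0x1AF4u  /* PCI_VENDOR_ID_REDHAT_QUMRANET */

     /* Any device in the 0x1000-0x107F range with the virtio vendor ID */
     bool is_virtio_device(uint16_t vendor_id, uint16_t device_id)
     {
         return (vendor_id == VIRTIO_PCI_VENDOR_ID) &&
                (device_id >= 0x1000u) && (device_id <= 0x107Fu);
     }

     /* Modern (virtio 1.0 only) devices use the upper half of the range */
     bool is_modern_virtio_device(uint16_t vendor_id, uint16_t device_id)
     {
         return (vendor_id == VIRTIO_PCI_VENDOR_ID) &&
                (device_id >= 0x1040u) && (device_id <= 0x107Fu);
     }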

Virtio Frameworks
*****************

This section describes the overall architecture of virtio, and then
introduces the ACRN-specific implementations of the virtio framework.

Architecture
============

Virtio adopts a frontend-backend
architecture, as shown in :numref:`virtio-arch`. Basically, the FE and
BE drivers communicate with each other through shared memory, via the
virtqueues. The FE driver talks to the BE driver in the same way it
would talk to a real PCIe device. The BE driver handles requests
from the FE driver, and notifies the FE driver once the request has been
processed.

.. figure:: images/virtio-hld-image1.png
   :width: 900px
   :align: center
   :name: virtio-arch

   Virtio Architecture

In addition to virtio's frontend-backend architecture, both FE and BE
drivers follow a layered architecture, as shown in
:numref:`virtio-fe-be`. Each
side has three layers: transports, core models, and device types.
All virtio devices share the same virtio infrastructure, including
virtqueues, feature mechanisms, configuration space, and buses.

.. figure:: images/virtio-hld-image4.png
   :width: 900px
   :align: center
   :name: virtio-fe-be

   Virtio Frontend/Backend Layered Architecture

Virtio Framework Considerations
===============================

How to realize the virtio framework is specific to a
hypervisor implementation. In ACRN, the virtio framework implementations
can be classified into two types, virtio backend service in user-land
(VBS-U) and virtio backend service in kernel-land (VBS-K), according to
where the virtio backend service (VBS) is located. Although different in
their BE drivers, both VBS-U and VBS-K share the same FE drivers. The
reason behind the two virtio implementations is to meet the requirement
of supporting a large number of diverse I/O devices in the ACRN project.

When developing a virtio BE device driver, the device owner should
choose carefully between VBS-U and VBS-K. Generally, VBS-U targets
non-performance-critical devices, but enables easy development and
debugging. VBS-K targets performance-critical devices.

The next two sections introduce ACRN's two implementations of the virtio
framework.

User-Land Virtio Framework
==========================

The architecture of the ACRN user-land virtio framework (VBS-U) is shown
in :numref:`virtio-userland`.

The FE driver talks to the BE driver as if it were talking with a PCIe
device. This means that for the "control plane", the FE driver could
poke device registers through PIO or MMIO, and the device will interrupt
the FE driver when something happens. For the "data plane", the
communication between the FE and BE drivers is through shared memory, in
the form of virtqueues.

On the Service OS side where the BE driver is located, there are several
key components in ACRN, including the device model (DM), the virtio and
HV service module (VHM), VBS-U, and the user-level vring service API
helpers.

The DM bridges the FE driver and BE driver since each VBS-U module
emulates a PCIe virtio device. The VHM bridges the DM and the hypervisor
by providing remote memory map APIs and notification APIs. VBS-U
accesses the virtqueue through the user-level vring service API helpers.

.. figure:: images/virtio-hld-image3.png
   :width: 900px
   :align: center
   :name: virtio-userland

   ACRN User-Land Virtio Framework

Kernel-Land Virtio Framework
============================

The architecture of the ACRN kernel-land virtio framework (VBS-K) is
shown in :numref:`virtio-kernelland`.

VBS-K provides acceleration for performance-critical devices emulated by
VBS-U modules by handling the "data plane" of the devices directly in
the kernel. When VBS-K is enabled for a certain device, the kernel-land
vring service API helpers are used to access the virtqueues shared by
the FE driver. Compared to VBS-U, this eliminates the overhead of
copying data back and forth between user-land and kernel-land within the
Service OS, at the cost of extra implementation complexity in the BE
drivers.

Except for the differences mentioned above, VBS-K still relies on VBS-U
for feature negotiations between the FE and BE drivers. This means the
"control plane" of the virtio device still remains in VBS-U. When
feature negotiation is done, which is determined by the FE driver
setting an indicative flag, the VBS-K module is initialized by VBS-U,
after which all request handling is offloaded to VBS-K in the kernel.

The FE driver is not aware of how the BE driver is implemented, either
in the VBS-U or VBS-K model. This saves engineering effort regarding FE
driver development.

.. figure:: images/virtio-hld-image6.png
   :width: 900px
   :align: center
   :name: virtio-kernelland

   ACRN Kernel-Land Virtio Framework

Virtio APIs
***********

This section provides details on the ACRN virtio APIs. As outlined
previously, the ACRN virtio APIs can be divided into three groups: DM
APIs, VBS APIs, and VQ APIs. The following sections will elaborate on
these APIs.

VBS-U Key Data Structures
=========================

The key data structures for VBS-U are listed as follows, and their
relationships are shown in :numref:`VBS-U-data`.

``struct pci_virtio_blk``
  An example virtio device, such as virtio-blk.
``struct virtio_common``
  A common component to any virtio device.
``struct virtio_ops``
  Virtio-specific operation functions for this type of virtio device.
``struct pci_vdev``
  Instance of a virtual PCIe device, and any virtio device is a virtual
  PCIe device.
``struct pci_vdev_ops``
  PCIe device's operation functions for this type of device.
``struct vqueue_info``
  Instance of a virtqueue.

.. figure:: images/virtio-hld-image5.png
   :width: 900px
   :align: center
   :name: VBS-U-data

   VBS-U Key Data Structures

Each virtio device is a PCIe device. In addition, each virtio device
could have zero or more virtqueues, depending on the device type. The
``struct virtio_common`` is a key data structure to be manipulated by
the DM, and the DM finds other key data structures through it. The
``struct virtio_ops`` abstracts a series of virtio callbacks to be
provided by the device owner.
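
The relationships among these structures can be sketched roughly as
follows. The field layout is an illustrative assumption made for this
document; the actual definitions in the ACRN device model differ.

.. code-block:: c

   #include <stdint.h>

   /* Simplified, illustrative sketch of the VBS-U structure
    * relationships; the contents are hypothetical and do not match the
    * real ACRN device model definitions.
    */
   struct vqueue_info;              /* one instance per virtqueue        */
   struct pci_vdev;                 /* the emulated PCIe device instance */

   struct virtio_ops {              /* callbacks from the device owner   */
       const char *name;
       int  (*cfgread)(void *vdev, int offset, int size, uint32_t *val);
       int  (*cfgwrite)(void *vdev, int offset, int size, uint32_t val);
       void (*reset)(void *vdev);
   };

   struct virtio_common {           /* shared by every VBS-U device      */
       struct virtio_ops  *vops;    /* device-specific callbacks         */
       struct pci_vdev    *dev;     /* backing emulated PCIe device      */
       struct vqueue_info *queues;  /* zero or more virtqueues           */
       int                 nvq;
   };

   struct pci_virtio_blk {          /* an example device: virtio-blk     */
       struct virtio_common common; /* common part comes first so the DM
                                     * can recover it from a generic
                                     * device pointer                    */
       /* ... device-specific state (backing file, request ring, ...) */
   };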

VBS-K Key Data Structures
=========================

The key data structures for VBS-K are listed as follows, and their
relationships are shown in :numref:`VBS-K-data`.

``struct vbs_k_rng``
  In-kernel VBS-K component handling the data plane of a VBS-U virtio
  device, for example the virtio random number generator.
``struct vbs_k_dev``
  In-kernel VBS-K component common to all VBS-K modules.
``struct vbs_k_vq``
  In-kernel VBS-K component that works with the kernel vring service
  API helpers.
``struct vbs_k_dev_info``
  Virtio device information to be synchronized from VBS-U to the VBS-K
  kernel module.
``struct vbs_k_vq_info``
  Information about a single virtqueue, to be synchronized from VBS-U
  to the VBS-K kernel module.
``struct vbs_k_vqs_info``
  Information about all virtqueues of a virtio device, to be
  synchronized from VBS-U to the VBS-K kernel module.

.. figure:: images/virtio-hld-image8.png
   :width: 900px
   :align: center
   :name: VBS-K-data

   VBS-K Key Data Structures

In VBS-K, a ``struct vbs_k_xxx`` represents the in-kernel component
handling a virtio device's data plane. It presents a char device for
VBS-U to open and register device status after feature negotiation with
the FE driver.

The device status includes negotiated features, number of virtqueues,
interrupt information, and more. All of this status information is
synchronized from VBS-U to VBS-K. In VBS-U, the ``struct
vbs_k_dev_info`` and ``struct vbs_k_vqs_info`` collect all the
information and notify VBS-K through ioctls. In VBS-K, the ``struct
vbs_k_dev`` and ``struct vbs_k_vq``, which are common to all VBS-K
modules, are the counterparts that preserve the related information.
This information is needed by the kernel-land vring service API helpers.

DM APIs
=======

The DM APIs are exported by the DM, and they should be used when
realizing BE device drivers on ACRN.

[API Material from doxygen comments]

VBS APIs
========

The VBS APIs are exported by VBS-related modules, including VBS, DM, and
SOS kernel modules. They can be classified into the VBS-U and VBS-K APIs
listed as follows.

VBS-U APIs
----------

These APIs provided by VBS-U are callbacks to be registered with the DM,
and the virtio framework within the DM will invoke them appropriately.

[API Material from doxygen comments]

VBS-K APIs
----------

The VBS-K APIs are exported by VBS-K related modules. Users could use
the following APIs to implement their VBS-K modules.

APIs provided by DM
~~~~~~~~~~~~~~~~~~~

[API Material from doxygen comments]

APIs provided by VBS-K modules in service OS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[API Material from doxygen comments]

VQ APIs
=======

The virtqueue APIs, or VQ APIs, are used by a BE device driver to
access the virtqueues shared by the FE driver. The VQ APIs abstract the
details of virtqueues so that users don't need to worry about the data
structures within the virtqueues. In addition, the VQ APIs are designed
to be identical between VBS-U and VBS-K, so that users don't need to
learn different APIs when implementing BE drivers based on VBS-U and
VBS-K.

[API Material from doxygen comments]

Below is an example showing the typical logic of how a BE driver handles
requests from an FE driver.

.. code-block:: c

   static void BE_callback(struct pci_virtio_xxx *pv, struct vqueue_info *vq)
   {
       struct iovec iov;
       uint16_t idx;
       uint32_t len = 0;

       while (vq_has_descs(vq)) {
           /* Fetch the next descriptor chain posted by the FE driver */
           vq_getchain(vq, &idx, &iov, 1, NULL);
           /* handle requests in iov; len is the number of bytes written
            * back to the FE driver, if any
            */
           request_handle_proc();
           /* Release this chain and handle more */
           vq_relchain(vq, idx, len);
       }
       /* Generate interrupt if appropriate. 1 means ring empty */
       vq_endchains(vq, 1);
   }

Supported Virtio Devices
************************

All the BE virtio drivers are implemented using the
ACRN virtio APIs, and the FE drivers reuse the standard Linux FE
virtio drivers. Devices with FE drivers available in the Linux
kernel should use the standard virtio Vendor ID/Device ID and
Subsystem Vendor ID/Subsystem Device ID. For other devices within ACRN,
their temporary IDs are listed in the following table.

.. table:: Virtio Devices without existing FE drivers in Linux
   :align: center
   :name: virtio-device-table

   +--------------+-------------+-------------+-------------+-------------+
   | virtio       | Vendor ID   | Device ID   | Subvendor   | Subdevice   |
   | device       |             |             | ID          | ID          |
   +--------------+-------------+-------------+-------------+-------------+
   | RPMB         | 0x8086      | 0x8601      | 0x8086      | 0xFFFF      |
   +--------------+-------------+-------------+-------------+-------------+
   | HECI         | 0x8086      | 0x8602      | 0x8086      | 0xFFFE      |
   +--------------+-------------+-------------+-------------+-------------+
   | audio        | 0x8086      | 0x8603      | 0x8086      | 0xFFFD      |
   +--------------+-------------+-------------+-------------+-------------+
   | IPU          | 0x8086      | 0x8604      | 0x8086      | 0xFFFC      |
   +--------------+-------------+-------------+-------------+-------------+
   | TSN/AVB      | 0x8086      | 0x8605      | 0x8086      | 0xFFFB      |
   +--------------+-------------+-------------+-------------+-------------+
   | hyper_dmabuf | 0x8086      | 0x8606      | 0x8086      | 0xFFFA      |
   +--------------+-------------+-------------+-------------+-------------+
   | HDCP         | 0x8086      | 0x8607      | 0x8086      | 0xFFF9      |
   +--------------+-------------+-------------+-------------+-------------+
   | COREU        | 0x8086      | 0x8608      | 0x8086      | 0xFFF8      |
   +--------------+-------------+-------------+-------------+-------------+

The following sections introduce the status of the virtio devices
currently supported in ACRN.

.. toctree::
   :maxdepth: 1

   virtio-blk
   virtio-net
   virtio-console
   virtio-rnd
4
doc/developer-guides/hld/hld-vm-management.rst
Normal file
@@ -0,0 +1,4 @@
.. _hld-vm-management:

VM Management high-level design
###############################
4
doc/developer-guides/hld/hld-vsbl.rst
Normal file
@@ -0,0 +1,4 @@
.. _hld-vsbl:

Virtual Slim-Bootloader high-level design
#########################################
BIN
doc/developer-guides/hld/images/APL_GVT-g-DM.png
Normal file
After Width: | Height: | Size: 81 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-access-patterns.png
Normal file
After Width: | Height: | Size: 26 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-api-forwarding.png
Normal file
After Width: | Height: | Size: 72 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-arch.png
Normal file
After Width: | Height: | Size: 60 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-direct-display.png
Normal file
After Width: | Height: | Size: 34 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-display-virt.png
Normal file
After Width: | Height: | Size: 173 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-full-pic.png
Normal file
After Width: | Height: | Size: 201 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-graphics-arch.png
Normal file
After Width: | Height: | Size: 14 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-hyper-dma.png
Normal file
After Width: | Height: | Size: 147 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-indirect-display.png
Normal file
After Width: | Height: | Size: 35 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-interrupt-virt.png
Normal file
After Width: | Height: | Size: 58 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-ive-use-case.png
Normal file
After Width: | Height: | Size: 117 KiB |
After Width: | Height: | Size: 166 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-mem-part.png
Normal file
After Width: | Height: | Size: 101 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-pass-through.png
Normal file
After Width: | Height: | Size: 71 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-per-vm-shadow.png
Normal file
After Width: | Height: | Size: 450 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-perf-critical.png
Normal file
After Width: | Height: | Size: 62 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-plane-based.png
Normal file
After Width: | Height: | Size: 76 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-scheduling-policy.png
Normal file
After Width: | Height: | Size: 75 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-scheduling.png
Normal file
After Width: | Height: | Size: 86 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-shared-shadow.png
Normal file
After Width: | Height: | Size: 34 KiB |
BIN
doc/developer-guides/hld/images/APL_GVT-g-workload.png
Normal file
After Width: | Height: | Size: 32 KiB |
BIN
doc/developer-guides/hld/images/acpi-image1.png
Normal file
After Width: | Height: | Size: 37 KiB |
BIN
doc/developer-guides/hld/images/acpi-image2.png
Normal file
After Width: | Height: | Size: 30 KiB |
BIN
doc/developer-guides/hld/images/acpi-image3.png
Normal file
After Width: | Height: | Size: 36 KiB |
BIN
doc/developer-guides/hld/images/acpi-image5.png
Normal file
After Width: | Height: | Size: 50 KiB |
BIN
doc/developer-guides/hld/images/interrupt-image2.png
Normal file
After Width: | Height: | Size: 20 KiB |
BIN
doc/developer-guides/hld/images/interrupt-image3.png
Normal file
After Width: | Height: | Size: 24 KiB |
BIN
doc/developer-guides/hld/images/interrupt-image4.png
Normal file
After Width: | Height: | Size: 36 KiB |
BIN
doc/developer-guides/hld/images/interrupt-image5.png
Normal file
After Width: | Height: | Size: 121 KiB |
BIN
doc/developer-guides/hld/images/interrupt-image6.png
Normal file
After Width: | Height: | Size: 29 KiB |
BIN
doc/developer-guides/hld/images/interrupt-image7.png
Normal file
After Width: | Height: | Size: 18 KiB |
BIN
doc/developer-guides/hld/images/mem-image1.png
Normal file
After Width: | Height: | Size: 4.7 KiB |
BIN
doc/developer-guides/hld/images/mem-image2.png
Normal file
After Width: | Height: | Size: 23 KiB |
BIN
doc/developer-guides/hld/images/mem-image3.png
Normal file
After Width: | Height: | Size: 16 KiB |
BIN
doc/developer-guides/hld/images/mem-image4.png
Normal file
After Width: | Height: | Size: 11 KiB |
BIN
doc/developer-guides/hld/images/mem-image5.png
Normal file
After Width: | Height: | Size: 35 KiB |
BIN
doc/developer-guides/hld/images/mem-image6.png
Normal file
After Width: | Height: | Size: 45 KiB |
BIN
doc/developer-guides/hld/images/mem-image7.png
Normal file
After Width: | Height: | Size: 43 KiB |
BIN
doc/developer-guides/hld/images/network-virt-arch.png
Normal file
After Width: | Height: | Size: 156 KiB |
BIN
doc/developer-guides/hld/images/network-virt-sos-infrastruct.png
Normal file
After Width: | Height: | Size: 9.0 KiB |
BIN
doc/developer-guides/hld/images/security-image1.png
Normal file
After Width: | Height: | Size: 377 KiB |
BIN
doc/developer-guides/hld/images/security-image10.png
Normal file
After Width: | Height: | Size: 61 KiB |
BIN
doc/developer-guides/hld/images/security-image11.png
Normal file
After Width: | Height: | Size: 21 KiB |
BIN
doc/developer-guides/hld/images/security-image12.png
Normal file
After Width: | Height: | Size: 52 KiB |
BIN
doc/developer-guides/hld/images/security-image13.png
Normal file
After Width: | Height: | Size: 740 KiB |
BIN
doc/developer-guides/hld/images/security-image14.png
Normal file
After Width: | Height: | Size: 11 KiB |
BIN
doc/developer-guides/hld/images/security-image2.png
Normal file
After Width: | Height: | Size: 28 KiB |
BIN
doc/developer-guides/hld/images/security-image3.png
Normal file
After Width: | Height: | Size: 12 KiB |
BIN
doc/developer-guides/hld/images/security-image4.png
Normal file
After Width: | Height: | Size: 50 KiB |
BIN
doc/developer-guides/hld/images/security-image5.png
Normal file
After Width: | Height: | Size: 47 KiB |
BIN
doc/developer-guides/hld/images/security-image6.png
Normal file
After Width: | Height: | Size: 26 KiB |
BIN
doc/developer-guides/hld/images/security-image7.png
Normal file
After Width: | Height: | Size: 46 KiB |
BIN
doc/developer-guides/hld/images/security-image8.png
Normal file
After Width: | Height: | Size: 27 KiB |
BIN
doc/developer-guides/hld/images/security-image9.png
Normal file
After Width: | Height: | Size: 18 KiB |
BIN
doc/developer-guides/hld/images/uart-image1.png
Normal file
After Width: | Height: | Size: 94 KiB |
BIN
doc/developer-guides/hld/images/virtio-blk-image01.png
Normal file
After Width: | Height: | Size: 142 KiB |
BIN
doc/developer-guides/hld/images/virtio-blk-image02.png
Normal file
After Width: | Height: | Size: 45 KiB |
BIN
doc/developer-guides/hld/images/virtio-console-arch.png
Normal file
After Width: | Height: | Size: 156 KiB |
BIN
doc/developer-guides/hld/images/virtio-hld-image0.png
Normal file
After Width: | Height: | Size: 29 KiB |
BIN
doc/developer-guides/hld/images/virtio-hld-image1.png
Normal file
After Width: | Height: | Size: 52 KiB |
BIN
doc/developer-guides/hld/images/virtio-hld-image2.png
Normal file
After Width: | Height: | Size: 66 KiB |
BIN
doc/developer-guides/hld/images/virtio-hld-image3.png
Normal file
After Width: | Height: | Size: 72 KiB |
BIN
doc/developer-guides/hld/images/virtio-hld-image4.png
Normal file
After Width: | Height: | Size: 136 KiB |
BIN
doc/developer-guides/hld/images/virtio-hld-image5.png
Normal file
After Width: | Height: | Size: 44 KiB |
BIN
doc/developer-guides/hld/images/virtio-hld-image6.png
Normal file
After Width: | Height: | Size: 70 KiB |
BIN
doc/developer-guides/hld/images/virtio-hld-image7.png
Normal file
After Width: | Height: | Size: 86 KiB |
BIN
doc/developer-guides/hld/images/virtio-hld-image8.png
Normal file
After Width: | Height: | Size: 51 KiB |
BIN
doc/developer-guides/hld/images/watchdog-image1.png
Normal file
After Width: | Height: | Size: 250 KiB |
BIN
doc/developer-guides/hld/images/watchdog-image2.png
Normal file
After Width: | Height: | Size: 135 KiB |
28
doc/developer-guides/hld/index.rst
Normal file
@@ -0,0 +1,28 @@
.. _hld:

High-Level Design Guides
########################

The ACRN Hypervisor acts as a host with full control of the processor(s)
and the hardware (physical memory, interrupt management and I/O). It
provides the User OS with an abstraction of a virtual platform, allowing
the guest to behave as if it were executing directly on a logical
processor.

These chapters describe the ACRN architecture, high-level design,
background, and motivation for specific areas within the ACRN hypervisor
system.

.. toctree::
   :maxdepth: 1

   Overview <hld-overview>
   Hypervisor <hld-hypervisor>
   Device Model <hld-devicemodel>
   Emulated Devices <hld-emulated-devices>
   Virtio Devices <hld-virtio-devices>
   VM Management <hld-vm-management>
   Power Management <hld-power-management>
   Tracing and Logging <hld-trace-log>
   Virtual Bootloader <hld-vsbl>
   Security <hld-security>
486
doc/developer-guides/hld/interrupt-hld.rst
Normal file
@@ -0,0 +1,486 @@
.. _interrupt-hld:

Interrupt Management high-level design
######################################


Overview
********

This document describes the interrupt management high-level design for
the ACRN hypervisor.

The ACRN hypervisor implements a simple but fully functional framework
to manage interrupts and exceptions, as shown in
:numref:`interrupt-modules-overview`. In its native layer, it configures
the physical PIC, IOAPIC, and LAPIC to support different interrupt
sources from local timer/IPI to external INTx/MSI. In its virtual guest
layer, it emulates a virtual PIC, virtual IOAPIC and virtual LAPIC, and
provides full APIs allowing virtual interrupt injection from emulated or
pass-thru devices.

.. figure:: images/interrupt-image3.png
   :align: center
   :width: 600px
   :name: interrupt-modules-overview

   ACRN Interrupt Modules Overview

In the software modules view shown in :numref:`interrupt-sw-modules`,
the ACRN hypervisor sets up the physical interrupt in its basic
interrupt modules (e.g., IOAPIC/LAPIC/IDT). It dispatches the interrupt
in the hypervisor interrupt flow control layer to the corresponding
handlers, which could be a pre-defined IPI notification, a timer, or a
runtime-registered pass-thru device. The ACRN hypervisor then uses its
VM interfaces, based on the vPIC, vIOAPIC, and vMSI modules, to inject
the necessary virtual interrupt into the specific VM.

.. figure:: images/interrupt-image2.png
   :align: center
   :width: 600px
   :name: interrupt-sw-modules

   ACRN Interrupt SW Modules Overview

Hypervisor Physical Interrupt Management
****************************************

The ACRN hypervisor is responsible for all the physical interrupt
handling. All physical interrupts are first handled in VMX root-mode.
The "external-interrupt exiting" bit in the VM-Execution controls field
is set to support this. The ACRN hypervisor also initializes all the
interrupt-related modules such as the IDT, PIC, IOAPIC, and LAPIC.

Only a few physical interrupts (such as the TSC-Deadline timer and
IOMMU) are fully serviced in the hypervisor. Most interrupts come from
pass-thru devices whose interrupts are remapped to a virtual INTx/MSI
source and injected into the SOS or UOS, according to the pass-thru
device configuration.

The ACRN hypervisor does handle exceptions: any exception coming from
VMX root-mode will lead to the CPU halting. For guest exceptions, the
hypervisor only traps #MC (machine check), prints a warning message, and
injects the exception back into the guest OS.

Physical Interrupt Initialization
=================================

After the ACRN hypervisor gets control from the bootloader, it
initializes all physical interrupt-related modules for all the CPUs. The
ACRN hypervisor creates a framework to manage physical interrupts for
hypervisor-local devices, pass-thru devices, and IPIs between CPUs.

IDT
---

The ACRN hypervisor builds its native Interrupt Descriptor Table (IDT)
during interrupt initialization. For exceptions, it links to the
function ``dispatch_exception``, and for external interrupts it links to
the function ``dispatch_interrupt``. Please refer to ``arch/x86/idt.S``
for more details.

LAPIC
-----

The ACRN hypervisor resets the LAPIC for each CPU, and provides basic
APIs used, for example, by the local timer (TSC Deadline) program and
the IPI notification program. These APIs include ``write_lapic_reg32``,
``send_lapic_eoi``, ``send_startup_ipi``, and ``send_single_ipi``.


.. comment

   Need reference to API doc generated from doxygen comments
   in hypervisor/include/arch/x86/lapic.h

PIC/IOAPIC
----------

The ACRN hypervisor masks all interrupts from the PIC, so all the
legacy interrupts from the PIC (<16) are linked to the IOAPIC, as shown
in :numref:`interrupt-pic-pin`.

ACRN pre-allocates vectors and masks them for these legacy interrupts
in the IOAPIC RTEs. For the others (>= 16), ACRN masks them with vector
0 in the RTE, and the vector is dynamically allocated on demand.

.. figure:: images/interrupt-image5.png
   :align: center
   :width: 600px
   :name: interrupt-pic-pin

   PIC & IOAPIC Pin Connection

Irq Desc
--------

The ACRN hypervisor maintains a global ``irq_desc[]`` array shared among
the CPUs and uses a flat mode to manage the interrupts. The same
vector is linked to the same IRQ number for all CPUs.

.. comment

   Need reference to API doc generated from doxygen comments
   for ``struct irq_desc`` in hypervisor/include/common/irq.h


The ``irq_desc[]`` array is indexed by the IRQ number. An
``irq_handler`` field can be set to a common edge, level, or quick
handler called from ``dispatch_interrupt``. The ``irq_desc`` structure
also contains the ``dev_list`` field to maintain this IRQ's action
handler list.

The global array ``vector_to_irq[]`` is used to manage the vector
resource. This array is initialized with the value ``IRQ_INVALID`` for
all vectors, and is set to a valid IRQ number after the corresponding
vector is registered.
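
The following is a simplified, illustrative view of this bookkeeping;
the real ``struct irq_desc`` in ``hypervisor/include/common/irq.h``
contains additional fields (handler, action list, locks, statistics),
so treat the layout and sizes below as assumptions.

.. code-block:: c

   #include <stdint.h>

   #define NR_IRQS       272          /* illustrative sizes only */
   #define NR_VECTORS    256
   #define IRQ_INVALID   0xFFFFFFFFu

   struct irq_desc {
       uint32_t irq;      /* IRQ number, also its index in irq_desc[] */
       uint32_t vector;   /* vector shared by all CPUs in flat mode   */
       /* irq_handler, dev_list, ... omitted in this sketch           */
   };

   /* global, shared among CPUs */
   static struct irq_desc irq_desc[NR_IRQS];

   /* maps an allocated vector back to its IRQ; IRQ_INVALID until the
    * vector is registered
    */
   static uint32_t vector_to_irq[NR_VECTORS];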

For example, if the local timer registers an interrupt with IRQ number
271 and vector 0xEF, then the arrays mentioned above will be set to::

   irq_desc[271].irq = 271;
   irq_desc[271].vector = 0xEF;
   vector_to_irq[0xEF] = 271;

Physical Interrupt Flow
=======================

When a physical interrupt occurs and the CPU is running under VMX root
mode, the interrupt follows the standard native irq flow: interrupt
gate to irq handler. However, if the CPU is running under VMX
non-root mode, an external interrupt will trigger a VM exit for reason
"external-interrupt". See :numref:`interrupt-handle-flow`.

.. figure:: images/interrupt-image4.png
   :align: center
   :width: 800px
   :name: interrupt-handle-flow

   ACRN Hypervisor Interrupt Handle Flow

After an interrupt happens (in either case noted above), the ACRN
hypervisor jumps to ``dispatch_interrupt``. This function checks
which vector caused this interrupt, and the corresponding ``irq_desc``
structure's ``irq_handler`` is called for service.

There are several irq handlers defined in the ACRN hypervisor, as shown
in :numref:`interrupt-handle-flow`, designed for different uses. For
example, ``quick_handler_nolock`` is used when no critical data needs
protection in the action handlers; the VCPU notification IPI and local
timer are good examples of this use case.

The more complicated ``common_dev_handler_level`` handler is intended
for pass-thru devices with level-triggered interrupts. To avoid
continuously triggering the interrupt, it initially masks the IOAPIC pin
and unmasks it only when the corresponding vIOAPIC pin gets an explicit
EOI ACK from the guest.

All the irq handlers finally call their own action handler list, as
shown here:

.. code-block:: c

   struct dev_handler_node *dev = desc->dev_list;

   while (dev != NULL) {
       if (dev->dev_handler != NULL)
           dev->dev_handler(desc->irq, dev->dev_data);
       dev = dev->next;
   }

The common APIs for registering, updating, and unregistering
interrupt handlers include irq_to_vector, dev_to_irq, dev_to_vector,
pri_register_handler, normal_register_handler,
unregister_handler_common, and update_irq_handler.

.. comment

   Need reference to API doc generated from doxygen comments
   in hypervisor/include/common/irq.h

.. _physical_interrupt_source:

Physical Interrupt Source
=========================

The ACRN hypervisor handles interrupts from many different sources, as
shown in :numref:`interrupt-source`:


.. list-table:: Physical Interrupt Source
   :widths: 15 10 60
   :header-rows: 1
   :name: interrupt-source

   * - Interrupt Source
     - Vector
     - Description
   * - TSC Deadline Timer
     - 0xEF
     - The TSC deadline timer implements the timer framework in
       the hypervisor based on the LAPIC TSC deadline. This interrupt's
       target is specific to the CPU to which the LAPIC belongs.
   * - CPU Startup IPI
     - N/A
     - The BSP needs to trigger an INIT-SIPI sequence to wake up the
       APs. This interrupt's target is specified by the BSP calling
       ``start_cpus()``.
   * - VCPU Notify IPI
     - 0xF0
     - When the hypervisor needs to kick the VCPU out of VMX non-root
       mode to do requests such as virtual interrupt injection, EPT
       flush, etc. This interrupt's target is specified by the function
       ``send_single_ipi()``.
   * - IOMMU MSI
     - dynamic
     - The IOMMU device supports an MSI interrupt. The vtd device driver
       in the hypervisor registers an interrupt to handle DMAR faults.
       This interrupt's target is specified by the vtd device driver.
   * - PTdev INTx
     - dynamic
     - All native devices are owned by the guest (SOS or UOS), taking
       advantage of the pass-thru method. Each pass-thru device
       connected with the IOAPIC/PIC (PTdev INTx) will register an
       interrupt when its attached interrupt controller pin first gets
       unmasked. This interrupt's target is defined by an RTE entry in
       the IOAPIC.
   * - PTdev MSI
     - dynamic
     - All native devices are owned by the guest (SOS or UOS), taking
       advantage of the pass-thru method. Each pass-thru device with
       MSI enabled (PTdev MSI) will register an interrupt when the SOS
       does an explicit hypercall. This interrupt's target is defined
       by an MSI address entry.

Softirq
=======

The ACRN hypervisor implements a simple bottom-half softirq to execute
the interrupt handler, as shown in :numref:`interrupt-handle-flow`.
The softirq is executed when interrupts are enabled. Several APIs for
softirq are defined, including enable_softirq, disable_softirq,
raise_softirq, and exec_softirq.

.. comment

   Need reference to API doc generated from doxygen comments
   in hypervisor/include/common/softirq.h

Physical Exception Handling
===========================

As mentioned earlier, the ACRN hypervisor does not handle any
physical exceptions. The VMX root mode code path should guarantee no
exceptions are triggered while the hypervisor is running.

Guest Virtual Interrupt Management
**********************************

The previous sections describe physical interrupt management in the ACRN
hypervisor. After a physical interrupt happens, a registered action
handler is executed. Usually, the action handler represents a service
for virtual interrupt injection. For example, if an interrupt is
triggered from a pass-thru device, the appropriate virtual interrupt
should be injected into its guest VM.

The virtual interrupt injection could also come from an emulated device.
The I/O mediator in the Service OS (SOS) could trigger an interrupt
through a hypercall, and the virtual interrupt injection is then done in
the hypervisor.

The following sections give an introduction to ACRN guest virtual
interrupt management, including VCPU requests for virtual interrupt
kick-off, the vPIC/vIOAPIC/vLAPIC virtual interrupt injection
interfaces, physical-to-virtual interrupt mapping for pass-thru devices,
and the process of VMX interrupt/exception injection.

VCPU Request
============

As mentioned in `physical_interrupt_source`_, physical vector 0xF0 is
used to kick the VCPU out of its VMX non-root mode, and make a request
for virtual interrupt injection or other requests such as an EPT flush.

The request-make API (``vcpu_make_request``) and its event IDs support
virtual interrupt injection.

.. comment

   Need reference to API doc generated from doxygen comments
   in hypervisor/include/common/irq.h

There are requests for exception injection (ACRN_REQUEST_EXCP), a vLAPIC
event (ACRN_REQUEST_EVENT), an external interrupt from the vPIC
(ACRN_REQUEST_EXTINT), and a non-maskable interrupt (ACRN_REQUEST_NMI).

The ``vcpu_make_request`` call is necessary for a virtual interrupt
injection. If the target VCPU is running under VMX non-root mode, an IPI
is sent to kick it out, resulting in an external-interrupt VM-Exit. The
flow shown in :numref:`interrupt-handle-flow` is then executed to
complete the injection of the virtual interrupt.

There are some cases that do not need an IPI when making a
request because the CPU making the request is the target VCPU. For
example, the #GP exception request always happens on the current CPU
when an invalid emulation happens. An external interrupt for a pass-thru
device always happens on the VCPUs the device belongs to, so after it
triggers an external-interrupt VM-Exit, the current CPU is also the
target VCPU.

Virtual PIC
===========

The ACRN hypervisor emulates a vPIC for each VM, based on the IO ranges
0x20-0x21, 0xa0-0xa1, and 0x4d0-0x4d1.

If an interrupt source from the vPIC needs to inject an interrupt,
the vpic_assert_irq, vpic_deassert_irq, or vpic_pulse_irq functions can
be called to make a request for ACRN_REQUEST_EXTINT or
ACRN_REQUEST_EVENT:

.. comment

   Need reference to API doc generated from doxygen comments
   in hypervisor/include/common/vpic.h

The vpic_pending_intr and vpic_intr_accepted APIs are used to query the
vector being injected and ACK the service, by moving the interrupt from
the request register (IRR) to the in-service register (ISR).


Virtual IOAPIC
==============

The ACRN hypervisor emulates a vIOAPIC for each VM, based at the MMIO
address VIOAPIC_BASE.

If an interrupt source from the vIOAPIC needs to inject an interrupt,
the vioapic_assert_irq, vioapic_deassert_irq, and vioapic_pulse_irq APIs
are used to make a request for ACRN_REQUEST_EVENT.

As the vIOAPIC is always associated with a vLAPIC, the virtual interrupt
injection from the vIOAPIC will finally trigger a request for a vLAPIC
event.

Virtual LAPIC
=============

The ACRN hypervisor emulates a vLAPIC for each VCPU, based at the MMIO
address DEFAULT_APIC_BASE.

If an interrupt source from the vLAPIC needs to inject an interrupt
(e.g., from an LVT source such as the LAPIC timer, from the vIOAPIC for
a pass-thru device interrupt, or from an emulated device for an MSI),
the vlapic_intr_level, vlapic_intr_edge, vlapic_set_local_intr,
vlapic_intr_msi, or vlapic_deliver_intr APIs need to be called,
resulting in a request for ACRN_REQUEST_EVENT.

.. comment

   Need reference to API doc generated from doxygen comments
   in hypervisor/include/common/vlapic.h


The vlapic_pending_intr and vlapic_intr_accepted APIs are used to query
the vector that needs to be injected and ACK the service, moving the
interrupt from the request register (IRR) to the in-service register
(ISR).

By default, the ACRN hypervisor enables vAPIC to improve the performance
of the vLAPIC emulation.
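
The following self-contained sketch illustrates how an action handler
might combine the vLAPIC APIs with ``vcpu_make_request`` to inject a
virtual MSI. The stub types and function bodies exist only so the
example compiles on its own, and the parameter lists are assumptions;
the real ACRN signatures differ.

.. code-block:: c

   #include <stdint.h>
   #include <stdio.h>

   /* Stand-ins for hypervisor types and APIs (illustrative only). */
   struct vm   { int id; };
   struct vcpu { int id; struct vm *vm; };

   #define ACRN_REQUEST_EVENT 1

   static void vlapic_intr_msi(struct vm *vm, uint64_t addr, uint32_t data)
   {
       printf("VM%d: queue virtual MSI addr=0x%llx data=0x%x in vLAPIC\n",
              vm->id, (unsigned long long)addr, data);
   }

   static void vcpu_make_request(struct vcpu *vcpu, int req)
   {
       printf("VCPU%d: request %d (notify IPI if in non-root mode)\n",
              vcpu->id, req);
   }

   /* Hypothetical action handler for a pass-thru device using MSI:
    * queue the virtual interrupt, then ask the target VCPU to inject
    * the pending vLAPIC event on its next VM-Entry.
    */
   void ptdev_msi_action(struct vcpu *vcpu, uint64_t msi_addr,
                         uint32_t msi_data)
   {
       vlapic_intr_msi(vcpu->vm, msi_addr, msi_data);
       vcpu_make_request(vcpu, ACRN_REQUEST_EVENT);
   }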

Virtual Exception
=================

When doing emulation, an exception may need to be triggered in the
hypervisor: for example, the guest may access an invalid vMSR register
so the hypervisor needs to inject a #GP, or during instruction emulation
an instruction fetch may access a non-existent page at ``rip_gva``, so a
#PF must be injected.

The ACRN hypervisor implements virtual exception injection using the
vcpu_queue_exception, vcpu_inject_gp, and vcpu_inject_pf APIs.

.. comment

   Need reference to API doc generated from doxygen comments
   in hypervisor/include/common/irq.h

The ACRN hypervisor uses the vcpu_inject_gp/vcpu_inject_pf functions to
queue exception requests, and follows the `Intel Software
Developer Manual, Vol 3 <SDM vol3_>`_, Section 6.15, Table 6-5, which
lists the conditions for generating a double fault.

.. _SDM vol3: https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html

Interrupt Mapping for a Pass-thru Device
========================================

A VM can control a PCI device directly through pass-thru device
assignment. The pass-thru entry is the major info object, and it
describes (a simplified sketch follows this list):

- a physical interrupt source, which could be an MSI/MSI-X entry, PIC
  pins, or IOAPIC pins
- pass-thru remapping information between the physical and virtual
  interrupt source; for MSI/MSI-X it is identified by a PCI device's
  BDF, and for PIC/IOAPIC it is identified by the pin number.
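
A minimal sketch of the information such an entry might record is shown
below; the structure and field names are hypothetical and do not match
the actual ACRN pass-thru device code.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Illustrative only: what a pass-thru device entry needs to track. */
   enum ptdev_intr_type {
       PTDEV_INTR_MSI,      /* identified by the PCI device's BDF    */
       PTDEV_INTR_INTX,     /* identified by a PIC/IOAPIC pin number */
   };

   struct ptdev_entry {
       enum ptdev_intr_type type;

       /* physical side */
       uint16_t phys_bdf;   /* for MSI/MSI-X                         */
       uint8_t  phys_pin;   /* for INTx                              */

       /* virtual side: where the interrupt appears in the guest     */
       uint16_t virt_bdf;
       uint8_t  virt_pin;

       bool     active;     /* registered when first unmasked/enabled */
   };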

.. figure:: images/interrupt-image7.png
   :align: center
   :width: 600px
   :name: interrupt-pass-thru

   Pass-thru Device Entry Assignment

As shown in :numref:`interrupt-pass-thru` above, a UOS gets its
pass-thru device entries assigned by the DM, and their entry info is
filled from:

- vPIC/vIOAPIC interrupt mask/unmask
- an MSI IOReq from the UOS followed by an MSI hypercall from the SOS

The SOS adds its pass-thru device entries at runtime and fills in their
info from:

- vPIC/vIOAPIC interrupt mask/unmask
- an MSI hypercall from the SOS

During the pass-thru device entry info filling, the hypervisor builds
the native IOAPIC RTE/MSI entry based on the vIOAPIC/vPIC/vMSI
configuration, and registers the physical interrupt handler for it. Then
with the pass-thru device entry as the handler's private data, the
physical interrupt can be linked to a virtual pin of a guest's
vPIC/vIOAPIC or a virtual vector of a guest's vMSI. The handler then
injects the corresponding virtual interrupt into the guest, using the
vPIC/vIOAPIC/vLAPIC APIs described earlier.

Interrupt Storm Mitigation
==========================

When the Device Model (DM) launches a User OS (UOS), the ACRN hypervisor
remaps the interrupts for this UOS's pass-through devices. When
an interrupt occurs for a pass-through device, the CPU core assigned
to that UOS gets trapped into the hypervisor. The benefit of such a
mechanism is that, should an interrupt storm happen in a particular UOS,
it will have only a minimal effect on the performance of the Service OS.

Interrupt/Exception Injection Process
=====================================

As shown in :numref:`interrupt-handle-flow`, the ACRN hypervisor injects
a virtual interrupt/exception into the guest before VM-Entry.

This is done by updating the VMX_ENTRY_INT_INFO_FIELD of the VCPU's
VMCS. As there is only one such field, interrupt/exception injections
must follow a priority rule and are handled one by one.

:numref:`interrupt-injection` below shows the rules for injecting
virtual interrupts/exceptions one by one. If a higher-priority
interrupt/exception has already been injected, the next pending
interrupt/exception enables an interrupt window, and the next
injection is done on the following VM-Exit triggered by that interrupt
window.

.. figure:: images/interrupt-image6.png
   :align: center
   :width: 600px
   :name: interrupt-injection

   ACRN Hypervisor Interrupt/Exception Injection Process
248
doc/developer-guides/hld/memmgt-hld.rst
Normal file
@@ -0,0 +1,248 @@
|
||||
.. _memmgt-hld:
|
||||
|
||||
Memory Management high-level design
|
||||
###################################
|
||||
|
||||
This document describes memory management for the ACRN hypervisor.
|
||||
|
||||
Overview
|
||||
********
|
||||
|
||||
In the ACRN hypervisor system, there are few different memory spaces to
|
||||
consider. From the hypervisor's point of view there are:
|
||||
|
||||
- Host Physical Address (HPA): the native physical address space, and
|
||||
- Host Virtual Address (HVA): the native virtual address space based on
|
||||
a MMU. A page table is used to do the translation between HPA and HVA
|
||||
spaces.
|
||||
|
||||
And from the Guest OS running on a hypervisor there are:
|
||||
|
||||
- Guest Physical Address (GPA): the guest physical address space from a
|
||||
virtual machine. GPA to HPA transition is usually based on a
|
||||
MMU-like hardware module (EPT in X86), and associated with a page
|
||||
table
|
||||
- Guest Virtual Address (GVA): the guest virtual address space from a
|
||||
virtual machine based on a vMMU
|
||||
|
||||
.. figure:: images/mem-image2.png
|
||||
:align: center
|
||||
:width: 900px
|
||||
:name: mem-overview
|
||||
|
||||
ACRN Memory Mapping Overview
|
||||
|
||||
:numref:`mem-overview` provides an overview of the ACRN system memory
|
||||
mapping, showing:
|
||||
|
||||
- GVA to GPA mapping based on vMMU on a VCPU in a VM
|
||||
- GPA to HPA mapping based on EPT for a VM in the hypervisor
|
||||
- HVA to HPA mapping based on MMU in the hypervisor
|
||||
|
||||
This document illustrates the memory management infrastructure for the
|
||||
ACRN hypervisor and how it handles the different memory space views
|
||||
inside the hypervisor and from a VM:
|
||||
|
||||
- How ACRN hypervisor manages host memory (HPA/HVA)
|
||||
- How ACRN hypervisor manages SOS guest memory (HPA/GPA)
|
||||
- How ACRN hypervisor & SOS DM manage UOS guest memory (HPA/GPA)
|
||||
|
||||
Hypervisor Memory Management
|
||||
****************************
|
||||
|
||||
The ACRN hypervisor is the primary owner to manage system
|
||||
memory. Typically the boot firmware (e.g., EFI) passes the platform physical
|
||||
memory layout - E820 table to the hypervisor. The ACRN hypervisor does its memory
|
||||
management based on this table.
|
||||
|
||||
Physical Memory Layout - E820
|
||||
=============================
|
||||
|
||||
The boot firmware (e.g., EFI) passes the E820 table through a multiboot protocol.
|
||||
This table contains the original memory layout for the platform.
|
||||
|
||||
.. figure:: images/mem-image1.png
|
||||
:align: center
|
||||
:width: 900px
|
||||
:name: mem-layout
|
||||
|
||||
Physical Memory Layout Example
|
||||
|
||||
:numref:`mem-layout` is an example of the physical memory layout based on a simple
|
||||
platform E820 table. The following sections demonstrate different memory
|
||||
space management by referencing it.
|
||||
|
||||
Physical to Virtual Mapping
|
||||
===========================
|
||||
|
||||
ACRN hypervisor is running under paging mode, so after receiving
|
||||
the platform E820 table, ACRN hypervisor creates its MMU page table
|
||||
based on it. This is done by the function init_paging() for all
|
||||
physical CPUs.
|
||||
|
||||
The memory mapping policy here is:
|
||||
|
||||
- Identical mapping for each physical CPU (ACRN hypervisor's memory
|
||||
could be relocatable in a future implementation)
|
||||
- Map all memory regions with UNCACHED type
|
||||
- Remap RAM regions to WRITE-BACK type
|
||||
|
||||
.. figure:: images/mem-image4.png
|
||||
:align: center
|
||||
:width: 900px
|
||||
:name: vm-layout
|
||||
|
||||
Hypervisor Virtual Memory Layout
|
||||
|
||||
:numref:`vm-layout` shows:
|
||||
|
||||
- Hypervisor can access all of system memory
|
||||
- Hypervisor has an UNCACHED MMIO/PCI hole reserved for devices, such
|
||||
as for LAPIC/IOAPIC access
|
||||
- Hypervisor has its own memory with WRITE-BACK cache type for its
|
||||
code and data (< 1M part is for secondary CPU reset code)
|
||||
|
||||
Service OS Memory Management
|
||||
****************************
|
||||
|
||||
After the ACRN hypervisor starts, it creates the Service OS as its first
|
||||
VM. The Service OS runs all the native device drivers, manage the
|
||||
hardware devices, and provides I/O mediation to guest VMs. The Service
|
||||
OS is in charge of the memory allocation for Guest VMs as well.
|
||||
|
||||
ACRN hypervisor passes the whole system memory access (except its own
|
||||
part) to the Service OS. The Service OS must be able to access all of
|
||||
the system memory except the hypervisor part.
|
||||
|
||||
Guest Physical Memory Layout - E820
|
||||
===================================
|
||||
|
||||
The ACRN hypervisor passes the original E820 table to the Service OS
|
||||
after filtering out its own part. So from Service OS's view, it sees
|
||||
almost all the system memory as shown here:
|
||||
|
||||
.. figure:: images/mem-image3.png
|
||||
:align: center
|
||||
:width: 900px
|
||||
:name: sos-mem-layout
|
||||
|
||||
SOS Physical Memory Layout
|
||||
|
||||
Host to Guest Mapping
|
||||
=====================
|
||||
|
||||
ACRN hypervisor creates Service OS's host (HPA) to guest (GPA) mapping
|
||||
(EPT mapping) through the function
|
||||
``prepare_vm0_memmap_and_e820()`` when it creates the SOS VM. It follows
|
||||
these rules:
|
||||
|
||||
- Identical mapping
|
||||
- Map all memory range with UNCACHED type
|
||||
- Remap RAM entries in E820 (revised) with WRITE-BACK type
|
||||
- Unmap ACRN hypervisor memory range
|
||||
- Unmap ACRN hypervisor emulated vLAPIC/vIOAPIC MMIO range
|
||||
|
||||
The host to guest mapping is static for the Service OS; it will not
|
||||
change after the Service OS begins running. Each native device driver
|
||||
can access its MMIO through this static mapping. EPT violation is only
|
||||
serving for vLAPIC/vIOAPIC's emulation in the hypervisor for Service OS
|
||||
VM.
|
||||
|
||||
User OS Memory Management
|
||||
*************************
|
||||
|
||||
User OS VM is created by the DM (Device Model) application running in
|
||||
the Service OS. DM is responsible for the memory allocation for a User
|
||||
or Guest OS VM.
|
||||
|
||||
Guest Physical Memory Layout - E820
|
||||
===================================
|
||||
|
||||
DM will create the E820 table for a User OS VM based on these simple
|
||||
rules:
|
||||
|
||||
- If requested VM memory size < low memory limitation (defined in DM,
|
||||
as 2GB), then low memory range = [0, requested VM memory size]
|
||||
- If requested VM memory size > low memory limitation (defined in DM,
|
||||
as 2GB), then low memory range = [0, 2GB], high memory range = [4GB,
|
||||
4GB + requested VM memory size - 2GB]
|
||||
|
||||
.. figure:: images/mem-image6.png
|
||||
:align: center
|
||||
:width: 900px
|
||||
:name: uos-mem-layout
|
||||
|
||||
UOS Physical Memory Layout
|
||||
|
||||
DM is doing UOS memory allocation based on hugeTLB mechanism by
|
||||
default. The real memory mapping
|
||||
may be scattered in SOS physical memory space, as shown below:
|
||||
|
||||
.. figure:: images/mem-image5.png
|
||||
:align: center
|
||||
:width: 900px
|
||||
:name: uos-mem-layout-hugetlb
|
||||
|
||||
UOS Physical Memory Layout Based on Hugetlb
|
||||
|
||||
Host to Guest Mapping
|
||||
=====================
|
||||
|
||||
A User OS VM's memory is allocated by the Service OS DM application, and
|
||||
may come from different huge pages in the Service OS as shown in
|
||||
:ref:`uos-mem-layout-hugetlb`.
|
||||
|
||||
As Service OS has the full information of these huge pages size,
|
||||
SOS-GPA and UOS-GPA, it works with the hypervisor to complete UOS's host
|
||||
to guest mapping using this pseudo code:
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
for x in allocated huge pages do
|
||||
x.hpa = gpa2hpa_for_sos(x.sos_gpa)
|
||||
host2guest_map_for_uos(x.hpa, x.uos_gpa, x.size)
|
||||
end
|
||||
|
||||
Trusty
|
||||
======
|
||||
|
||||
For an Android User OS, there is a secure world called "trusty world
|
||||
support", whose memory needs are taken care by the ACRN hypervisor for
|
||||
security consideration. From the memory management's view, the trusty
|
||||
memory space should not be accessible by SOS or UOS normal world.
|
||||
|
||||
.. figure:: images/mem-image7.png
|
||||
:align: center
|
||||
:width: 900px
|
||||
:name: uos-mem-layout-trusty
|
||||
|
||||
UOS Physical Memory Layout with Trusty
|
||||
|
||||
Memory Interaction
|
||||
******************
|
||||
|
||||
Previous sections described different memory spaces management in the
|
||||
ACRN hypervisor, Service OS, and User OS. Among these memory spaces,
|
||||
there are different kinds of interaction, for example, a VM may do a
|
||||
hypercall to the hypervisor that includes a data transfer, or an
|
||||
instruction emulation in the hypervisor may need to access the Guest
|
||||
instruction pointer register to fetch instruction data.
|
||||
|
||||
Access GPA from Hypervisor
|
||||
==========================
|
||||
|
||||
When a hypervisor needs access to the GPA for data transfers, the caller
|
||||
from the Guest must make sure this memory range's GPA is address
|
||||
continuous. But for HPA in the hypervisor, it could be address
|
||||
dis-continuous (especially for UOS under hugetlb allocation mechanism).
|
||||
For example, a 4MB GPA range may map to 2 different 2MB huge pages. The
|
||||
ACRN hypervisor needs to take care of this kind of data transfer by
|
||||
doing EPT page walking based on its HPA.
|
||||
|
||||
Access GVA from Hypervisor
|
||||
==========================
|
||||
|
||||
Likely, when hypervisor need to access GVA for data transfer, both GPA
|
||||
and HPA could be address dis-continuous. The ACRN hypervisor must pay
|
||||
attention to this kind of data transfer, and handle it by doing page
|
||||
walking based on both its GPA and HPA.
|
126
doc/developer-guides/hld/uart-virt-hld.rst
Normal file
@@ -0,0 +1,126 @@
|
||||
.. _uart_virtualization:
|
||||
|
||||
UART Virtualization
|
||||
###################
|
||||
|
||||
In ACRN, UART virtualization is implemented as a fully-emulated device.
|
||||
In the Service OS (SOS), UART virtualization is implemented in the
|
||||
hypervisor itself. In the User OS (UOS), UART virtualization is
|
||||
implemented in the Device Model (DM), and is the primary topic of this
|
||||
document. We'll summarize differences between the hypervisor and DM
|
||||
implementations at the end of this document.
|
||||
|
||||
|
||||
UART emulation is a typical full-emulation implementation and is a
|
||||
good example to learn about I/O emulation in a virtualized environment.
|
||||
There is a detailed explanation about the I/O emulation flow in
|
||||
ACRN in :ref:`ACRN-io-mediator`.
|
||||
|
||||
Architecture
|
||||
************
|
||||
|
||||
The ACRN DM architecture for UART virtualization is shown here:
|
||||
|
||||
.. figure:: images/uart-image1.png
|
||||
:align: center
|
||||
:name: uart-arch
|
||||
:width: 800px
|
||||
|
||||
Device Model's UART virtualization architecture
|
||||
|
||||
There are three objects used to emulate one UART device in DM:
|
||||
UART registers, rxFIFO, and backend tty devices.
|
||||
|
||||
**UART registers** are emulated by member variables in struct
|
||||
``uart_vdev``, one variable for each register. These variables are used
|
||||
to track the register status programed by the frontend driver. The
|
||||
handler of each register depends on the register's functionality.
|
||||
|
||||
A **FIFO** is implemented to emulate RX. Normally characters are read
|
||||
from the backend tty device when available, then put into the rxFIFO.
|
||||
When the Guest application tries to read from the UART, the access to
|
||||
register ``com_data`` causes a ``vmexit``. Device model catches the
|
||||
``vmexit`` and emulates the UART by returning one character from rxFIFO.
|
||||
|
||||
.. note:: When ``com_fcr`` is available, the Guest application can write
|
||||
``0`` to this register to disable rxFIFO. In this case the rxFIFO in
|
||||
device model degenerates to a buffer containing only one character.
|
||||
|
||||
When the Guest application tries to send a character to the UART, it
|
||||
writes to the ``com_data`` register, which will cause a ``vmexit`` as
|
||||
well. Device model catches the ``vmexit`` and emulates the UART by
|
||||
redirecting the character to the **backend tty device**.
|
||||
|
||||
The UART device emulated by the ACRN device model is connected to the system by
|
||||
the LPC bus. In the current implementation, two channel LPC UARTs are I/O mapped to
|
||||
the traditional COM port addresses of 0x3F8 and 0x2F8. These are defined in
|
||||
global variable ``uart_lres``.
|
||||
|
||||
There are two options needed for configuring the UART in the ``arcn-dm``
|
||||
command line. First, the LPC is defined as a PCI device::
|
||||
|
||||
-s 1:0,lpc
|
||||
|
||||
The other option defines a UART port::
|
||||
|
||||
-l com1,stdio
|
||||
|
||||
The first parameter here is the name of the UART (must be "com1" or
|
||||
"com2"). The second parameter is species the backend
|
||||
tty device: ``stdio`` or a path to the dedicated tty device
|
||||
node, for example ``/dev/pts/0``.
|
||||
|
||||
If you are using a specified tty device, find the name of the terminal
|
||||
connected to standard input using the ``tty`` command (e.g.,
|
||||
``/dev/pts/1``). Use this name to define the UART port on the acrn-dm
|
||||
command line, for example::
|
||||
|
||||
-l com1,/dev/pts/1
|
||||
|
||||
|
||||
When arcn-dm starts, ``pci_lpc_init`` is called as the callback of the
|
||||
``vdev_init`` of the PCI device given on the acrn-dm command line.
|
||||
Later, ``lpc_init`` is called in ``pci_lpc_init``. ``lpc_init`` iterates
|
||||
on the available UART instances defined on the command line and
|
||||
initializes them one by one. ``register_inout`` is called on the port
|
||||
region of each UART instance, enabling access to the UART ports to be
|
||||
routed to the registered handler.
|
||||
|
||||
In the case of UART emulation, the registered handlers are ``uart_read``
|
||||
and ``uart_write``.
|
||||
|
||||
A similar virtual UART device is implemented in the hypervisor.
|
||||
Currently UART16550 is owned by the hypervisor itself and is used for
|
||||
debugging purposes. (The UART properties are configured by parameters
|
||||
to the hypervisor command line.) The hypervisor emulates a UART device
|
||||
with 0x3F8 address to the SOS and acts as the SOS console. The general
|
||||
emulation is the same as used in the device model, with the following
|
||||
differences:
|
||||
|
||||
- PIO region is directly registered to the vmexit handler dispatcher via
|
||||
``vuart_register_io_handler``
|
||||
|
||||
- Two FIFOs are implemented, one for RX, the other of TX
|
||||
|
||||
- RX flow:
|
||||
|
||||
- Characters are read from the UART HW into a 2048-byte sbuf,
|
||||
triggered by ``console_read``
|
||||
|
||||
- Characters are read from the sbuf and put to rxFIFO,
|
||||
triggered by ``vuart_console_rx_chars``
|
||||
|
||||
- A virtual interrupt is sent to the SOS that triggered the read,
|
||||
and characters from rxFIFO are sent to the SOS by emulating a read
|
||||
of register ``UART16550_RBR``
|
||||
|
||||
- TX flow:
|
||||
|
||||
- Characters are put into txFIFO by emulating a write of register
|
||||
``UART16550_THR``
|
||||
|
||||
- Characters in txFIFO are read out one by one, and sent to the console
|
||||
by printf, triggered by ``vuart_console_tx_chars``
|
||||
|
||||
- Implementation of printf is based on the console, which finally sends
|
||||
characters to the UART HW by writing to register ``UART16550_RBR``
|
107
doc/developer-guides/hld/virtio-blk.rst
Normal file
@@ -0,0 +1,107 @@
|
||||
.. _virtio-blk:
|
||||
|
||||
Virtio-blk
|
||||
##########
|
||||
|
||||
The virtio-blk device is a simple virtual block device. The FE driver
|
||||
(in the UOS space) places read, write, and other requests onto the
|
||||
virtqueue, so that the BE driver (in the SOS space) can process them
|
||||
accordingly. Communication between the FE and BE is based on the virtio
|
||||
kick and notify mechanism.
|
||||
|
||||
The virtio device ID of the virtio-blk is ``2``, and it supports one
|
||||
virtqueue, the size of which is 64, configurable in the source code.
|
||||
|
||||
.. figure:: images/virtio-blk-image01.png
|
||||
:align: center
|
||||
:width: 900px
|
||||
:name: virtio-blk-arch
|
||||
|
||||
Virtio-blk architecture
|
||||
|
||||
The feature bits supported by the BE device are shown as follows:
|
||||
|
||||
``VIRTIO_BLK_F_SEG_MAX``
|
||||
Maximum number of segments in a request is in seg_max.
|
||||
``VIRTIO_BLK_F_BLK_SIZE``
|
||||
Block size of disk is in blk_size.
|
||||
``VIRTIO_BLK_F_TOPOLOGY``
|
||||
Device exports information on optimal I/O alignment.
|
||||
``VIRTIO_RING_F_INDIRECT_DESC``
|
||||
Support for indirect descriptors
|
||||
``VIRTIO_BLK_F_FLUSH``
|
||||
Cache flush command support.
|
||||
``VIRTIO_BLK_F_CONFIG_WCE``
|
||||
Device can toggle its cache between writeback and writethrough modes.
|
||||
|
||||
|
||||
Virtio-blk-BE design
|
||||
********************
|
||||
|
||||
.. figure:: images/virtio-blk-image02.png
|
||||
:align: center
|
||||
:width: 900px
|
||||
:name: virtio-blk-be
|
||||
|
||||
The virtio-blk BE device is implemented as a legacy virtio device. Its
|
||||
backend media could be a file or a partition. The virtio-blk device
|
||||
supports writeback and writethrough cache mode. In writeback mode,
|
||||
virtio-blk has good write and read performance. To be safer,
|
||||
writethrough is set as the default mode, as it can make sure every write
|
||||
operation queued to the virtio-blk FE driver layer is submitted to
|
||||
hardware storage.
|
||||
|
||||
During initialization, virito-blk will allocate 64 ioreq buffers in a
|
||||
shared ring used to store the I/O requests. The freeq, busyq, and pendq
|
||||
shown in :numref:`virtio-blk-be` are used to manage requests. Each
|
||||
virtio-blk device starts 8 worker threads to process request
|
||||
asynchronously.
|
||||
|
||||
|
||||
Usage:
|
||||
******
|
||||
|
||||
The device model configuration command syntax for virtio-blk is::
|
||||
|
||||
-s <slot>,virtio-blk,<filepath>[,options]
|
||||
|
||||
- ``filepath`` is the path of a file or disk partition
|
||||
- ``options`` include:
|
||||
|
||||
- ``writethru``: write operation is reported completed only when the
|
||||
data has been written to physical storage.
|
||||
- ``writeback``: write operation is reported completed when data is
|
||||
placed in the page cache. Needs to be flushed to the physical storage.
|
||||
- ``ro``: open file with readonly mode.
|
||||
- ``sectorsize``: configured as either
|
||||
``sectorsize=<sector size>/<physical sector size>`` or
|
||||
``sectorsize=<sector size>``.
|
||||
The default values for sector size and physical sector size are 512
|
||||
- ``range``: configured as ``range=<start lba in file>/<sub file size>``
|
||||
meaning the virtio-blk will only access part of the file, from the
|
||||
``<start lba in file>`` to ``<start lba in file> + <sub file site>``.
|
||||
|
||||
A simple example for virtio-blk:
|
||||
|
||||
1. Prepare a file in SOS folder::
|
||||
|
||||
dd if=/dev/zero of=test.img bs=1M count=1024
|
||||
mkfs.ext4 test.img
|
||||
|
||||
#. Add virtio-blk in the DM cmdline, slot number should not duplicate
|
||||
another device::
|
||||
|
||||
-s 9,virtio-blk,/root/test.img
|
||||
|
||||
#. Launch UOS, you can find ``/dev/vdx`` in UOS.
|
||||
|
||||
The ``x`` in ``/dev/vdx`` is related to the slot number used. If
|
||||
If you start DM with two virtio-blks, and the slot numbers are 9 and 10,
|
||||
then, the device with slot 9 will be recognized as ``/dev/vda``, and
|
||||
the device with slot 10 will be ``/dev/vdb``
|
||||
|
||||
#. Mount ``/dev/vdx`` to a folder in the UOS, and then you can access it.
|
||||
|
||||
|
||||
Successful booting of the User OS verifies the correctness of the
|
||||
device.
|
184
doc/developer-guides/hld/virtio-console.rst
Normal file
@@ -0,0 +1,184 @@
|
||||
.. _virtio-console:
|
||||
|
||||
Virtio-console
|
||||
##############
|
||||
|
||||
The Virtio-console is a simple device for data input and output. The
|
||||
console's virtio device ID is ``3`` and can have from 1 to 16 ports.
|
||||
Each port has a pair of input and output virtqueues used to communicate
|
||||
information between the Front End (FE) and Back end (BE) drivers.
|
||||
Currently the size of each virtqueue is 64 (configurable in the source
|
||||
code). The FE driver will place empty buffers for incoming data onto
|
||||
the receiving virtqueue, and enqueue outgoing characters onto the
|
||||
transmitting virtqueue.
|
||||
|
||||
A Virtio-console device has a pair of control IO virtqueues as well. The
|
||||
control virtqueues are used to communicate information between the
|
||||
device and the driver, including: ports being opened and closed on
|
||||
either side of the connection, indication from the host about whether a
|
||||
particular port is a console port, adding new ports, port
|
||||
hot-plug/unplug, indication from the guest about whether a port or a
|
||||
device was successfully added, or a port opened or closed.
|
||||
|
||||
The virtio-console architecture diagram in ACRN is shown below.
|
||||
|
||||
.. figure:: images/virtio-console-arch.png
|
||||
:align: center
|
||||
:width: 700px
|
||||
:name: virtio-console-arch
|
||||
|
||||
Virtio-console architecture diagram
|
||||
|
||||
|
||||
Virtio-console is implemented as a virtio legacy device in the ACRN device
|
||||
model (DM), and is registered as a PCI virtio device to the guest OS. No changes
|
||||
are required in the frontend Linux virtio-console except that the guest
|
||||
(UOS) kernel should be built with ``CONFIG_VIRTIO_CONSOLE=y``.
|
||||
|
||||
Currently the feature bits supported by the BE device are:
|
||||
|
||||
.. list-table:: Feature bits supported by BE drivers
|
||||
:widths: 30 50
|
||||
:header-rows: 0
|
||||
|
||||
* - VTCON_F_SIZE(bit 0)
|
||||
- configuration columns and rows are valid.
|
||||
* - VTCON_F_MULTIPORT(bit 1)
|
||||
- device supports multiple ports, and control virtqueues will be used.
|
||||
* - VTCON_F_EMERG_WRITE(bit 2)
|
||||
- device supports emergency write.
|
||||
|
||||
Virtio-console supports redirecting guest output to various backend
|
||||
devices. Currently the following backend devices are supported in ACRN
|
||||
device model: STDIO, TTY, PTY and regular file.
|
||||
|
||||
The device model configuration command syntax for virtio-console is::
|
||||
|
||||
virtio-console,[@]stdio|tty|pty|file:portname[=portpath]\
|
||||
[,[@]stdio|tty|pty|file:portname[=portpath]]
|
||||
|
||||
- Preceding with ``@`` marks the port as a console port, otherwise it is a
|
||||
normal virtio serial port
|
||||
|
||||
- The ``portpath`` can be omitted when backend is stdio or pty
|
||||
|
||||
- The ``stdio/tty/pty`` is tty capable, which means :kbd:`TAB` and
|
||||
:kbd:`BACKSPACE` are supported, as on a regular terminal
|
||||
|
||||
- When tty is used, please make sure the redirected tty is sleeping,
|
||||
(e.g., by ``sleep 2d`` command), and will not read input from stdin before it
|
||||
is used by virtio-console to redirect guest output.
|
||||
|
||||
- Claiming multiple virtio serial ports as consoles is supported,
|
||||
however the guest Linux OS will only use one of them, through the
|
||||
``console=hvcN`` kernel parameter. For example, the following command
|
||||
defines two backend ports, which are both console ports, but the frontend
|
||||
driver will only use the second port named ``pty_port`` as its hvc
|
||||
console (specified by ``console=hvc1`` in the kernel command
|
||||
line)::
|
||||
|
||||
-s n,virtio-console,@tty:tty_port=/dev/pts/0,@pty:pty_port \
|
||||
-B "root=/dev/vda2 rw rootwait maxcpus=$2 nohpet console=hvc1 console=ttyS0 ..."
|
||||
|
||||
|
||||
Console Backend Use Cases
|
||||
*************************
|
||||
|
||||
The following sections elaborate on each backend.
|
||||
|
||||
STDIO
|
||||
=====
|
||||
|
||||
1. Add a pci slot to the device model (``acrn-dm``) command line::
|
||||
|
||||
-s n,virtio-console,@stdio:stdio_port
|
||||
|
||||
#. Add the ``console`` parameter to the guest OS kernel command line::
|
||||
|
||||
console=hvc0
|
||||
|
||||
PTY
|
||||
===
|
||||
|
||||
1. Add a pci slot to the device model (``acrn-dm``) command line::
|
||||
|
||||
-s n,virtio-console,@pty:pty_port
|
||||
|
||||
#. Add the ``console`` parameter to the guest os kernel command line::
|
||||
|
||||
console=hvc0
|
||||
|
||||
One line of information, such as shown below, will be printed in the terminal
|
||||
after ``acrn-dm`` is launched (``/dev/pts/0`` may be different,
|
||||
depending on your use case):
|
||||
|
||||
.. code-block: console
|
||||
|
||||
virt-console backend redirected to /dev/pts/0
|
||||
|
||||
#. Use a terminal emulator, such as minicom or screen, to connect to the
|
||||
tty node::
|
||||
|
||||
minicom -D /dev/pts/0
|
||||
|
||||
or ::
|
||||
|
||||
screen /dev/pts/0
|
||||
|
||||
TTY
|
||||
===
|
||||
|
||||
1. Identify your tty that will be used as the UOS console:
|
||||
|
||||
- If you're connected to your device over the network via ssh, use
|
||||
the linux ``tty`` command, and it will report the node (may be
|
||||
different in your use case)::
|
||||
|
||||
/dev/pts/0
|
||||
sleep 2d
|
||||
|
||||
- If you do not have network access to your device, use screen
|
||||
to create a new tty::
|
||||
|
||||
screen
|
||||
tty
|
||||
|
||||
you will see (depending on your use case)::
|
||||
|
||||
/dev/pts/0
|
||||
|
||||
Prevent the tty from responding by sleeping::
|
||||
|
||||
sleep 2d
|
||||
|
||||
and detach the tty by pressing :kbd:`CTRL-A` :kbd:`d`.
|
||||
|
||||
#. Add a pci slot to the device model (``acrn-dm``) command line
|
||||
(changing the ``dev/pts/X`` to match your use case)::
|
||||
|
||||
-s n,virtio-console,@tty:tty_port=/dev/pts/X
|
||||
|
||||
#. Add the console parameter to the guest OS kernel command line::
|
||||
|
||||
console=hvc0
|
||||
|
||||
#. Go back to the previous tty. For example, if you're using
|
||||
``screen``, use::
|
||||
|
||||
screen -ls
|
||||
screen -r <pid_of_your_tty>
|
||||
|
||||
FILE
|
||||
====
|
||||
|
||||
The File backend only supports console output to a file (no input).
|
||||
|
||||
1. Add a pci slot to the device model (``acrn-dm``) command line,
|
||||
adjusting the ``</path/to/file>`` to your use case::
|
||||
|
||||
-s n,virtio-console,@file:file_port=</path/to/file>
|
||||
|
||||
#. Add the console parameter to the guest OS kernel command line::
|
||||
|
||||
console=hvc0
|
||||
|
525
doc/developer-guides/hld/virtio-net.rst
Normal file
@@ -0,0 +1,525 @@
|
||||
.. _virtio-net:
|
||||
|
||||
Virtio-net
|
||||
##########
|
||||
|
||||
Virtio-net is the para-virtualization solution used in ACRN for
|
||||
networking. The ACRN device model emulates virtual NICs for UOS and the
|
||||
frontend virtio network driver, simulating the virtual NIC and following
|
||||
the virtio specification. (Refer to :ref:`introduction` and
|
||||
:ref:`virtio-hld` background introductions to ACRN and Virtio.)
|
||||
|
||||
Here are some notes about Virtio-net support in ACRN:
|
||||
|
||||
- Legacy devices are supported, modern devices are not supported
|
||||
- Two virtqueues are used in virtio-net: RX queue and TX queue
|
||||
- Indirect descriptor is supported
|
||||
- TAP backend is supported
|
||||
- Control queue is not supported
|
||||
- NIC multiple queues are not supported
|
||||
|
||||
Network Virtualization Architecture
|
||||
***********************************
|
||||
|
||||
ACRN's network virtualization architecture is shown below in
|
||||
:numref:`net-virt-arch`, and illustrates the many necessary network
|
||||
virtualization components that must cooperate for the UOS to send and
|
||||
receive data from the outside world.
|
||||
|
||||
.. figure:: images/network-virt-arch.png
|
||||
:align: center
|
||||
:width: 900px
|
||||
:name: net-virt-arch
|
||||
|
||||
Network Virtualization Architecture
|
||||
|
||||
(The green components are parts of the ACRN solution, while the gray
|
||||
components are parts of the Linux kernel.)
|
||||
|
||||
Let's explore these components further.
|
||||
|
||||
SOS/UOS Network Stack:
|
||||
This is the standard Linux TCP/IP stack, currently the most
|
||||
feature-rich TCP/IP implementation.
|
||||
|
||||
virtio-net Frontend Driver:
|
||||
This is the standard driver in the Linux Kernel for virtual Ethernet
|
||||
devices. This driver matches devices with PCI vendor ID 0x1AF4 and PCI
|
||||
Device ID 0x1000 (for legacy devices in our case) or 0x1041 (for modern
|
||||
devices). The virtual NIC supports two virtqueues, one for transmitting
|
||||
packets and the other for receiving packets. The frontend driver places
|
||||
empty buffers into one virtqueue for receiving packets, and enqueues
|
||||
outgoing packets into another virtqueue for transmission. The size of
|
||||
each virtqueue is 1024, configurable in the virtio-net backend driver.
|
||||
|
||||
ACRN Hypervisor:
|
||||
The ACRN hypervisor is a type 1 hypervisor, running directly on the
|
||||
bare-metal hardware, and suitable for a variety of IoT and embedded
|
||||
device solutions. It fetches and analyzes the guest instructions, puts
|
||||
the decoded information into the shared page as an IOREQ, and notifies
|
||||
or interrupts the VHM module in the SOS for processing.
|
||||
|
||||
VHM Module:
|
||||
The Virtio and Hypervisor Service Module (VHM) is a kernel module in the
|
||||
Service OS (SOS) acting as a middle layer to support the device model
|
||||
and hypervisor. The VHM forwards a IOREQ to the virtio-net backend
|
||||
driver for processing.
|
||||
|
||||
ACRN Device Model and virtio-net Backend Driver:
|
||||
The ACRN Device Model (DM) gets an IOREQ from a shared page and calls
|
||||
the virtio-net backend driver to process the request. The backend driver
|
||||
receives the data in a shared virtqueue and sends it to the TAP device.
|
||||
|
||||
Bridge and Tap Device:
|
||||
Bridge and Tap are standard virtual network infrastructures. They play
|
||||
an important role in communication among the SOS, the UOS, and the
|
||||
outside world.
|
||||
|
||||
IGB Driver:
|
||||
IGB is the physical Network Interface Card (NIC) Linux kernel driver
|
||||
responsible for sending data to and receiving data from the physical
|
||||
NIC.
|
||||
|
||||
The virtual network card (NIC) is implemented as a virtio legacy device
|
||||
in the ACRN device model (DM). It is registered as a PCI virtio device
|
||||
to the guest OS (UOS) and uses the standard virtio-net in the Linux kernel as
|
||||
its driver (the guest kernel should be built with
|
||||
``CONFIG_VIRTIO_NET=y``).
|
||||
|
||||
The virtio-net backend in DM forwards the data received from the
|
||||
frontend to the TAP device, then from the TAP device to the bridge, and
|
||||
finally from the bridge to the physical NIC driver, and vice versa for
|
||||
returning data from the NIC to the frontend.
|
||||
|
||||
ACRN Virtio-Network Calling Stack
|
||||
*********************************
|
||||
|
||||
Various components of ACRN network virtualization are shown in the
|
||||
architecture diagram shows in :numref:`net-virt-arch`. In this section,
|
||||
we will use UOS data transmission (TX) and reception (RX) examples to
|
||||
explain step-by-step how these components work together to implement
|
||||
ACRN network virtualization.
|
||||
|
||||
Initialization in Device Model
|
||||
==============================
|
||||
|
||||
**virtio_net_init**
|
||||
|
||||
- Present frontend for a virtual PCI based NIC
|
||||
- Setup control plan callbacks
|
||||
- Setup data plan callbacks, including TX, RX
|
||||
- Setup tap backend
|
||||
|
||||
Initialization in virtio-net Frontend Driver
|
||||
============================================
|
||||
|
||||
**virtio_pci_probe**
|
||||
|
||||
- Construct virtio device using virtual pci device and register it to
|
||||
virtio bus
|
||||
|
||||
**virtio_dev_probe --> virtnet_probe --> init_vqs**
|
||||
|
||||
- Register network driver
|
||||
- Setup shared virtqueues
|
||||
|
||||
ACRN UOS TX FLOW
|
||||
================
|
||||
|
||||
The following shows the ACRN UOS network TX flow, using TCP as an
|
||||
example, showing the flow through each layer:
|
||||
|
||||
**UOS TCP Layer**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
tcp_sendmsg -->
|
||||
tcp_sendmsg_locked -->
|
||||
tcp_push_one -->
|
||||
tcp_write_xmit -->
|
||||
tcp_transmit_skb -->
|
||||
|
||||
**UOS IP Layer**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
ip_queue_xmit -->
|
||||
ip_local_out -->
|
||||
__ip_local_out -->
|
||||
dst_output -->
|
||||
ip_output -->
|
||||
ip_finish_output -->
|
||||
ip_finish_output2 -->
|
||||
neigh_output -->
|
||||
neigh_resolve_output -->
|
||||
|
||||
**UOS MAC Layer**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
dev_queue_xmit -->
|
||||
__dev_queue_xmit -->
|
||||
dev_hard_start_xmit -->
|
||||
xmit_one -->
|
||||
netdev_start_xmit -->
|
||||
__netdev_start_xmit -->
|
||||
|
||||
|
||||
**UOS MAC Layer virtio-net Frontend Driver**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
start_xmit --> // virtual NIC driver xmit in virtio_net
|
||||
xmit_skb -->
|
||||
virtqueue_add_outbuf --> // add out buffer to shared virtqueue
|
||||
virtqueue_add -->
|
||||
|
||||
virtqueue_kick --> // notify the backend
|
||||
virtqueue_notify -->
|
||||
vp_notify -->
|
||||
iowrite16 --> // trap here, HV will first get notified
|
||||
|
||||
**ACRN Hypervisor**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
vmexit_handler --> // vmexit because VMX_EXIT_REASON_IO_INSTRUCTION
|
||||
pio_instr_vmexit_handler -->
|
||||
emulate_io --> // ioreq cant be processed in HV, forward it to VHM
|
||||
acrn_insert_request_wait -->
|
||||
fire_vhm_interrupt --> // interrupt SOS, VHM will get notified
|
||||
|
||||
**VHM Module**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
vhm_intr_handler --> // VHM interrupt handler
|
||||
tasklet_schedule -->
|
||||
io_req_tasklet -->
|
||||
acrn_ioreq_distribute_request --> // ioreq can't be processed in VHM, forward it to device DM
|
||||
acrn_ioreq_notify_client -->
|
||||
wake_up_interruptible --> // wake up DM to handle ioreq
|
||||
|
||||
**ACRN Device Model / virtio-net Backend Driver**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
handle_vmexit -->
|
||||
vmexit_inout -->
|
||||
emulate_inout -->
|
||||
pci_emul_io_handler -->
|
||||
virtio_pci_write -->
|
||||
virtio_pci_legacy_write -->
|
||||
virtio_net_ping_txq --> // start TX thread to process, notify thread return
|
||||
virtio_net_tx_thread --> // this is TX thread
|
||||
virtio_net_proctx --> // call corresponding backend (tap) to process
|
||||
virtio_net_tap_tx -->
|
||||
writev --> // write data to tap device
|
||||
|
||||
**SOS TAP Device Forwarding**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
do_writev -->
|
||||
vfs_writev -->
|
||||
do_iter_write -->
|
||||
do_iter_readv_writev -->
|
||||
call_write_iter -->
|
||||
tun_chr_write_iter -->
|
||||
tun_get_user -->
|
||||
netif_receive_skb -->
|
||||
netif_receive_skb_internal -->
|
||||
__netif_receive_skb -->
|
||||
__netif_receive_skb_core -->
|
||||
|
||||
|
||||
**SOS Bridge Forwarding**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
br_handle_frame -->
|
||||
br_handle_frame_finish -->
|
||||
br_forward -->
|
||||
__br_forward -->
|
||||
br_forward_finish -->
|
||||
br_dev_queue_push_xmit -->
|
||||
|
||||
**SOS MAC Layer**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
dev_queue_xmit -->
|
||||
__dev_queue_xmit -->
|
||||
dev_hard_start_xmit -->
|
||||
xmit_one -->
|
||||
netdev_start_xmit -->
|
||||
__netdev_start_xmit -->
|
||||
|
||||
|
||||
**SOS MAC Layer IGB Driver**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
igb_xmit_frame --> // IGB physical NIC driver xmit function
|
||||
|
||||
ACRN UOS RX FLOW
|
||||
================
|
||||
|
||||
The following shows the ACRN UOS network RX flow, using TCP as an example.
|
||||
Let's start by receiving a device interrupt. (Note that the hypervisor
|
||||
will first get notified when receiving an interrupt even in passthrough
|
||||
cases.)
|
||||
|
||||
**Hypervisor Interrupt Dispatch**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
vmexit_handler --> // vmexit because VMX_EXIT_REASON_EXTERNAL_INTERRUPT
|
||||
external_interrupt_vmexit_handler -->
|
||||
dispatch_interrupt -->
|
||||
common_handler_edge -->
|
||||
ptdev_interrupt_handler -->
|
||||
ptdev_enqueue_softirq --> // Interrupt will be delivered in bottom-half softirq
|
||||
|
||||
|
||||
**Hypervisor Interrupt Injection**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
do_softirq -->
|
||||
ptdev_softirq -->
|
||||
vlapic_intr_msi --> // insert the interrupt into SOS
|
||||
|
||||
start_vcpu --> // VM Entry here, will process the pending interrupts
|
||||
|
||||
**SOS MAC Layer IGB Driver**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
do_IRQ -->
|
||||
...
|
||||
igb_msix_ring -->
|
||||
igbpoll -->
|
||||
napi_gro_receive -->
|
||||
napi_skb_finish -->
|
||||
netif_receive_skb_internal -->
|
||||
__netif_receive_skb -->
|
||||
__netif_receive_skb_core --
|
||||
|
||||
**SOS Bridge Forwarding**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
br_handle_frame -->
|
||||
br_handle_frame_finish -->
|
||||
br_forward -->
|
||||
__br_forward -->
|
||||
br_forward_finish -->
|
||||
br_dev_queue_push_xmit -->
|
||||
|
||||
**SOS MAC Layer**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
dev_queue_xmit -->
|
||||
__dev_queue_xmit -->
|
||||
dev_hard_start_xmit -->
|
||||
xmit_one -->
|
||||
netdev_start_xmit -->
|
||||
__netdev_start_xmit -->
|
||||
|
||||
**SOS MAC Layer TAP Driver**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
tun_net_xmit --> // Notify and wake up reader process
|
||||
|
||||
**ACRN Device Model / virtio-net Backend Driver**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
virtio_net_rx_callback --> // the tap fd get notified and this function invoked
|
||||
virtio_net_tap_rx --> // read data from tap, prepare virtqueue, insert interrupt into the UOS
|
||||
vq_endchains -->
|
||||
vq_interrupt -->
|
||||
pci_generate_msi -->
|
||||
|
||||
**VHM Module**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
vhm_dev_ioctl --> // process the IOCTL and call hypercall to inject interrupt
|
||||
hcall_inject_msi -->
|
||||
|
||||
**ACRN Hypervisor**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
vmexit_handler --> // vmexit because VMX_EXIT_REASON_VMCALL
|
||||
vmcall_vmexit_handler -->
|
||||
hcall_inject_msi --> // insert interrupt into UOS
|
||||
vlapic_intr_msi -->
|
||||
|
||||
**UOS MAC Layer virtio_net Frontend Driver**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
vring_interrupt --> // virtio-net frontend driver interrupt handler
|
||||
skb_recv_done --> //registed by virtnet_probe-->init_vqs-->virtnet_find_vqs
|
||||
virtqueue_napi_schedule -->
|
||||
__napi_schedule -->
|
||||
virtnet_poll -->
|
||||
virtnet_receive -->
|
||||
receive_buf -->
|
||||
|
||||
**UOS MAC Layer**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
napi_gro_receive -->
|
||||
napi_skb_finish -->
|
||||
netif_receive_skb_internal -->
|
||||
__netif_receive_skb -->
|
||||
__netif_receive_skb_core -->
|
||||
|
||||
**UOS IP Layer**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
ip_rcv -->
|
||||
ip_rcv_finish -->
|
||||
dst_input -->
|
||||
ip_local_deliver -->
|
||||
ip_local_deliver_finish -->
|
||||
|
||||
|
||||
**UOS TCP Layer**
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
tcp_v4_rcv -->
|
||||
tcp_v4_do_rcv -->
|
||||
tcp_rcv_established -->
|
||||
tcp_data_queue -->
|
||||
tcp_queue_rcv -->
|
||||
__skb_queue_tail -->
|
||||
|
||||
sk->sk_data_ready --> // application will get notified
|
||||
|
||||
How to Use
|
||||
==========
|
||||
|
||||
The network infrastructure shown in :numref:`net-virt-infra` needs to be
|
||||
prepared in the SOS before we start. We need to create a bridge and at
|
||||
least one tap device (two tap devices are needed to create a dual
|
||||
virtual NIC) and attach a physical NIC and tap device to the bridge.
|
||||
|
||||
.. figure:: images/network-virt-sos-infrastruct.png
|
||||
:align: center
|
||||
:width: 900px
|
||||
:name: net-virt-infra
|
||||
|
||||
Network Infrastructure in SOS
|
||||
|
||||
You can use Linux commands (e.g. ip, brctl) to create this network. In
|
||||
our case, we use systemd to automatically create the network by default.
|
||||
You can check the files with prefix 50- in the SOS
|
||||
``/usr/lib/systemd/network/``:
|
||||
|
||||
- `50-acrn.netdev <https://raw.githubusercontent.com/projectacrn/acrn-hypervisor/master/tools/acrnbridge/acrn.netdev>`__
|
||||
- `50-acrn.network <https://raw.githubusercontent.com/projectacrn/acrn-hypervisor/master/tools/acrnbridge/acrn.network>`__
|
||||
- `50-acrn_tap0.netdev <https://raw.githubusercontent.com/projectacrn/acrn-hypervisor/master/tools/acrnbridge/acrn_tap0.netdev>`__
|
||||
- `50-eth.network <https://raw.githubusercontent.com/projectacrn/acrn-hypervisor/master/tools/acrnbridge/eth.network>`__
|
||||
|
||||
When the SOS is started, run ``ifconfig`` to show the devices created by
|
||||
this systemd configuration:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
acrn-br0 Link encap:Ethernet HWaddr B2:50:41:FE:F7:A3
|
||||
inet addr:10.239.154.43 Bcast:10.239.154.255 Mask:255.255.255.0
|
||||
inet6 addr: fe80::b050:41ff:fefe:f7a3/64 Scope:Link
|
||||
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
|
||||
RX packets:226932 errors:0 dropped:21383 overruns:0 frame:0
|
||||
TX packets:14816 errors:0 dropped:0 overruns:0 carrier:0
|
||||
collisions:0 txqueuelen:1000
|
||||
RX bytes:100457754 (95.8 Mb) TX bytes:83481244 (79.6 Mb)
|
||||
|
||||
acrn_tap0 Link encap:Ethernet HWaddr F6:A7:7E:52:50:C6
|
||||
UP BROADCAST MULTICAST MTU:1500 Metric:1
|
||||
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
|
||||
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
|
||||
collisions:0 txqueuelen:1000
|
||||
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
|
||||
|
||||
enp3s0 Link encap:Ethernet HWaddr 98:4F:EE:14:5B:74
|
||||
inet6 addr: fe80::9a4f:eeff:fe14:5b74/64 Scope:Link
|
||||
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
|
||||
RX packets:279174 errors:0 dropped:0 overruns:0 frame:0
|
||||
TX packets:69923 errors:0 dropped:0 overruns:0 carrier:0
|
||||
collisions:0 txqueuelen:1000
|
||||
RX bytes:107312294 (102.3 Mb) TX bytes:87117507 (83.0 Mb)
|
||||
Memory:82200000-8227ffff
|
||||
|
||||
lo Link encap:Local Loopback
|
||||
inet addr:127.0.0.1 Mask:255.0.0.0
|
||||
inet6 addr: ::1/128 Scope:Host
|
||||
UP LOOPBACK RUNNING MTU:65536 Metric:1
|
||||
RX packets:16 errors:0 dropped:0 overruns:0 frame:0
|
||||
TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
|
||||
collisions:0 txqueuelen:1000
|
||||
RX bytes:1216 (1.1 Kb) TX bytes:1216 (1.1 Kb)
|
||||
|
||||
Run ``brctl show`` to see the bridge ``acrn-br0`` and attached devices:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
bridge name bridge id STP enabled interfaces
|
||||
|
||||
acrn-br0 8000.b25041fef7a3 no acrn_tap0
|
||||
enp3s0
|
||||
|
||||
Add a pci slot to the device model acrn-dm command line (mac address is
|
||||
optional):
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
-s 4,virtio-net,<tap_name>,[mac=<XX:XX:XX:XX:XX:XX>]
|
||||
|
||||
When the UOS is lauched, run ``ifconfig`` to check the network. enp0s4r
|
||||
is the virtual NIC created by acrn-dm:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
enp0s4 Link encap:Ethernet HWaddr 00:16:3E:39:0F:CD
|
||||
inet addr:10.239.154.186 Bcast:10.239.154.255 Mask:255.255.255.0
|
||||
inet6 addr: fe80::216:3eff:fe39:fcd/64 Scope:Link
|
||||
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
|
||||
RX packets:140 errors:0 dropped:8 overruns:0 frame:0
|
||||
TX packets:46 errors:0 dropped:0 overruns:0 carrier:0
|
||||
collisions:0 txqueuelen:1000
|
||||
RX bytes:110727 (108.1 Kb) TX bytes:4474 (4.3 Kb)
|
||||
|
||||
lo Link encap:Local Loopback
|
||||
inet addr:127.0.0.1 Mask:255.0.0.0
|
||||
inet6 addr: ::1/128 Scope:Host
|
||||
UP LOOPBACK RUNNING MTU:65536 Metric:1
|
||||
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
|
||||
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
|
||||
collisions:0 txqueuelen:1000
|
||||
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
|
||||
|
||||
Performance Estimation
|
||||
======================
|
||||
|
||||
We've introduced the network virtualization solution in ACRN, from the
|
||||
top level architecture to the detailed TX and RX flow. Currently, the
|
||||
control plane and data plane are all processed in ACRN device model,
|
||||
which may bring some overhead. But this is not a bottleneck for 1000Mbit
|
||||
NICs or below. Network bandwidth for virtualization can be very close to
|
||||
the native bandwidgh. For high speed NIC (e.g. 10Gb or above), it is
|
||||
necessary to separate the data plane from the control plane. We can use
|
||||
vhost for acceleration. For most IoT scenarios, processing in user space
|
||||
is simple and reasonable.
|
||||
|
||||
|
21
doc/developer-guides/hld/virtio-rnd.rst
Normal file
@@ -0,0 +1,21 @@
|
||||
.. _virtio-rnd:
|
||||
|
||||
Virtio-rnd
|
||||
##########
|
||||
|
||||
The virtio-rnd entropy device supplies high-quality randomness for guest
|
||||
use. The virtio device ID of the virtio-rnd device is 4, and it supports
|
||||
one virtqueue, the size of which is 64, configurable in the source code.
|
||||
It has no feature bits defined.
|
||||
|
||||
When the FE driver requires some random bytes, the BE device will place
|
||||
bytes of random data onto the virtqueue.
|
||||
|
||||
To launch the virtio-rnd device, use the following virtio command::
|
||||
|
||||
-s <slot>,virtio-rnd
|
||||
|
||||
To verify the correctness in user OS, use the following
|
||||
command::
|
||||
|
||||
od /dev/random
|
98
doc/developer-guides/hld/watchdog-hld.rst
Normal file
@@ -0,0 +1,98 @@
|
||||
.. _watchdog-hld:
|
||||
|
||||
Watchdog Virtualization in Device Model
|
||||
#######################################
|
||||
|
||||
This document describes the watchdog virtualization implementation in
|
||||
ACRN device model.
|
||||
|
||||
Overview
|
||||
********
|
||||
|
||||
A watchdog is an important hardware component in embedded systems, used
|
||||
to monitor the system's running status, and resets the processor if the
|
||||
software crashes. In general, hardware watchdogs rely on a piece of
|
||||
software running on the machine which must "kick" the watchdog device
|
||||
regularly, say every 10 seconds. If the watchdog doesn't get "kicked"
|
||||
after 60 seconds, for example, then the watchdog device asserts the
|
||||
RESET line which results in a hard reboot.
|
||||
|
||||
For ACRN we emulate the watchdog hardware in the Intel 6300ESB chipset
|
||||
as a PCI device called 6300ESB watchdog and is added into the Device
|
||||
Model following the PCI device framework. The following
|
||||
:numref:`watchdog-device` shows the watchdog device workflow:
|
||||
|
||||
.. figure:: images/watchdog-image2.png
|
||||
:align: center
|
||||
:width: 900px
|
||||
:name: watchdog-device
|
||||
|
||||
Watchdog device flow
|
||||
|
||||
The DM in the Service OS (SOS) treats the watchdog as a passive device.
|
||||
It receives read/write commands from the watchdog driver, does the
|
||||
actions, and returns. In ACRN, the commands are from User OS (UOS)
|
||||
watchdog driver.
|
||||
|
||||
UOS watchdog work flow
|
||||
**********************
|
||||
|
||||
When the UOS does a read or write operation on the watchdog device's
|
||||
registers or memory space (Port IO or Memory map I/O), it will trap into
|
||||
the hypervisor. The hypervisor delivers the operation to the SOS/DM
|
||||
through IPI (inter-process interrupt) or shared memory, and the DM
|
||||
dispatches the operation to the watchdog emulation code.
|
||||
|
||||
After the DM watchdog finishes emulating the read or write operation, it
|
||||
then calls ``ioctl`` to the SOS/kernel (``/dev/acrn_vhm``). VHM will call a
|
||||
hypercall to trap into the hypervisor to tell it the operation is done, and
|
||||
the hypervisor will set UOS-related VCPU registers and resume UOS so the
|
||||
UOS watchdog driver will get the return values (or return status). The
|
||||
:numref:`watchdog-workflow` below is a typical operation flow:
|
||||
from UOS to SOS and return back:
|
||||
|
||||
.. figure:: images/watchdog-image1.png
|
||||
:align: center
|
||||
:width: 900px
|
||||
:name: watchdog-workflow
|
||||
|
||||
Watchdog operation workflow
|
||||
|
||||
Implementation in ACRN and how to use it
|
||||
****************************************
|
||||
|
||||
In ACRN, the Intel 6300ESB watchdog device emulation is added into the
|
||||
DM PCI device tree. Its interface structure is (see
|
||||
``devicemodel/include/pci_core.h``):
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
struct pci_vdev_ops pci_ops_wdt = {
|
||||
.class_name = "wdt-i6300esb",
|
||||
.vdev_init = pci_wdt_init,
|
||||
.vdev_deinit = pci_wdt_deinit,
|
||||
.vdev_cfgwrite = pci_wdt_cfg_write,
|
||||
.vdev_cfgread = pci_wdt_cfg_read,
|
||||
.vdev_barwrite = pci_wdt_bar_write,
|
||||
.vdev_barread = pci_wdt_bar_read
|
||||
};
|
||||
|
||||
All functions follow the ``pci_vdev_ops`` definitions for PCI device
|
||||
emulation.
|
||||
|
||||
The main part in the watchdog emulation is the timer thread. It emulates
|
||||
the watchdog device timeout management. When it gets the kick action
|
||||
from the UOS, it resets the timer. If the timer expires before getting a
|
||||
timely kick action, it will call DM API to reboot that UOS.
|
||||
|
||||
In the UOS launch script, add: ``-s xx,wdt-i6300esb`` into DM parameters.
|
||||
(xx is the virtual PCI BDF number as with other PCI devices)
|
||||
|
||||
Make sure the UOS kernel has the I6300ESB driver enabled: ``CONFIG_I6300ESB_WDT=y``. After the UOS
|
||||
boots up, the watchdog device will be created as node ``/dev/watchdog``,
|
||||
and can be used as a normal device file.
|
||||
|
||||
Usually the UOS needs a watchdog service (daemon) to run in userland and
|
||||
kick the watchdog periodically. If something prevents the daemon from
|
||||
kicking the watchdog, for example the UOS system is hung, the watchdog
|
||||
will timeout and the DM will reboot the UOS.
|