diff --git a/doc/developer-guides/hld/hld-security.rst b/doc/developer-guides/hld/hld-security.rst index 6064065c8..398552c0b 100644 --- a/doc/developer-guides/hld/hld-security.rst +++ b/doc/developer-guides/hld/hld-security.rst @@ -9,20 +9,20 @@ Security high-level design Introduction ************ -This document describes security high level design in ACRN, +This document describes security high-level design in ACRN, including information about: -- Secure Boot in ACRN -- Hypervisor Security Enhancement, including memory management,secure - hypervisor interfaces etc. -- Platform Security Features Virtualizaion, such as the virtualizaion - of TPM(vTPM) and SGX(vSGX) +- Secure booting in ACRN +- Hypervisor security enhancement, including memory management, secure + hypervisor interfaces, etc. +- Platform security features virtualization, such as the virtualization + of TPM (vTPM) and SGX (vSGX) This document is for developers, validation teams, architects, and maintainers of ACRN. -The readers should be familiar with the basic concepts of system -virtualization and ACRN hypervisor implementation. +Readers should be familiar with the basic concepts of system +virtualization and the ACRN hypervisor implementation. Background @@ -37,10 +37,10 @@ single Intel Apollo Lake-I SoC platform. See :numref:`security-ACRN`. :align: center :name: security-ACRN - ACRN hypervisor Overview + ACRN Hypervisor Overview -This document focuses only on the security part of this automotive -system built on top of ACRN hypervisor. This includes how to build a +This document focuses only on the security part of the automotive +system built on top of the ACRN hypervisor. This includes how to build a secure system as well as how to virtualize the security features that the system can provide. 
@@ -48,7 +48,7 @@ Usages ====== As shown in :numref:`security-vehicle`, the ACRN hypervisor can be -used to build a Software Defined Cockpit (SDC) or In-Vehicle Experience +used to build a Software Defined Cockpit (SDC) or an In-Vehicle Experience (IVE) Solution that consolidates multiple VMs together on a single Intel SoC in-vehicle platform. @@ -64,46 +64,46 @@ In this system, the ACRN hypervisor is running at the most privileged level, VMX root mode, in virtualization technology terms. The hypervisor has full control of platform resources, including the processor, memory, devices, and in some cases, secrets of the guest OS. The ACRN -hypervisor supports multiple guest VMs running in parallel, in the less +hypervisor supports multiple guest VMs running in parallel in the less privileged level called VMX non-root mode. The Service OS (SOS) is a special VM OS. While it runs as a guest VM in VMX non-root mode, it behaves as a privileged guest VM controlling the behavior of other guest VMs. The SOS can create a guest VM, suspend and -resume a guest VM, and provides device mediation services (Device +resume a guest VM, and provide device mediation services (Device Models) for other guest VMs it creates. -In a SDC system, the SOS also contains safety-critical IC (Instrument +In an SDC system, the SOS also contains safety-critical IC (Instrument Cluster) applications. ACRN is designed to make sure the IC applications are well isolated from other applications in the SOS such as Device Models (Mediators). A crash in other guest VM systems must not impact -the IC applications, and not cause any DoS (Deny of Service) attack. +the IC applications, and must not cause any DoS (Denial of Service) attacks. Functional safety is out of scope of this document. In :numref:`security-ACRN`, the other guest VMs are referred to as User OS (UOS). These other VMs provide infotainment services (such as navigation, music, and FM/AM radio) for the front seat or rear seat. 
-The UOS systems could be based on Linux (LaaG, Linux as a Guest) or +The UOS systems can be based on Linux (LaaG, Linux as a Guest) or Android\* (AaaG, Android as a Guest) depending on the customer's needs -and board configuration. It could also be a mix of Linux and Android +and board configuration. It can also be a mix of Linux and Android systems. -In each UOS, there could be a "side-car" OS system accompanying the -normal OS system. We call these two OS systems "secure world" and +In each UOS, a "side-car" OS system can accompany the normal OS system. We +call these two OS systems "secure world" and "non-secure world", and they are isolated from each other by the hypervisor. Secure world has a higher "privilege level" than non-secure -world, for example, the secure world can access the non-secure world's +world; for example, the secure world can access the non-secure world's physical memory but not vice-versa. This document discusses how this security works and why it is required. Careful consideration should be made when evaluating using the Service OS (SOS) as the Trusted Computing Base (TCB). The Service OS may be a -fairly large system running many lines of code, hence treating it as a +fairly large system running many lines of code; thus, treating it as a TCB doesn't make sense from a security perspective. To achieve the -design purpose of "defense in depth", the system security designer +design purpose of "defense in depth", system security designers should always ask themselves, "What if the SOS is compromised?" and -"What's the impact if this happens?". This HLD document discusses how to +"What's the impact if this happens?" This HLD document discusses how to security-harden the SOS system and mitigate attacks on the SOS. ACRN High-Level Security Architecture @@ -118,7 +118,7 @@ Secure / Verified Boot The security of the entire system built on top of the ACRN hypervisor depends on the security from platform boot to UOS launching. 
Each layer or module must verify the security of the next layer or module before -transferring control to it. Verification could be checking a +transferring control to it. Verification can be checking a cryptographic signature on the executable of the next step before it is launched. @@ -139,7 +139,7 @@ Boot Flow As shown in :numref:`security-bootflow`, the Converged Security Engine Firmware (CSE FW) behaves as the root of trust in this platform boot -flow. It authenticates and starts the BIOS (SBL), then the SBL is +flow. It authenticates and starts the BIOS (SBL), whereupon the SBL is responsible for authenticating and verifying the ACRN hypervisor image. Currently the SOS kernel is built together with the ACRN hypervisor as one image bundle, so this whole image signature is verified by SBL @@ -147,14 +147,14 @@ before launching. As long as the SOS kernel starts, the SOS kernel will load all its subsystems subsequently. In order to launch a guest UOS, a DM process is -started to launch the virtual BIOS (vSBL), and eventually, the vSBL is +started to launch the virtual BIOS (vSBL), and eventually the vSBL is responsible for verifying and launching the guest UOS kernel (or the Android OS loader for an Android UOS). Secure Boot ----------- -In the entire boot flow, the chain of trust must be unbroken and is +In the entire boot flow, the chain of trust must be unbroken. This is achieved by the secure boot mechanism. Each module in the boot flow must authenticate and verify the next module by using a cryptographic digital signature algorithm. @@ -167,9 +167,9 @@ The 2018 minimal requirements for cryptographic strength currently are: #. SHA256 for image cryptographic hashing. #. RSA2048 for cryptographic digital signature signing and verification. -However, it is strongly recommended that SHA512 and RSA3072+ should be -used for a product shipped in 2018, especially for a product which has a -long production life such as an automotive vehicle. 
+We strongly recommend that SHA512 and RSA3072+ be used for a product shipped +in 2018, especially for a product which has a long production life such as +an automotive vehicle. The CSE FW image is signed with an Intel RSA private key. All other images should be signed by the responsible OEM. Our customers and @@ -182,7 +182,7 @@ securely. SOS Hardening ------------- -In project ACRN, the reference SOS is based on Clear Linux OS. Customers +In the ACRN project, the reference SOS is based on Clear Linux OS. Customers may choose to use different open source OSes or their own proprietary OS systems. To minimize the attack surfaces and achieve the goal of "defense in depth", there are many common guidelines to ensure the @@ -190,23 +190,23 @@ security of SOS system. As shown in :numref:`security-bootflow` above, the integrity of the UOS depends on the integrity of the DM module and vBIOS/vOSloader in the -SOS. Hence SOS integrity is critical to the entire UOS security. If the +SOS. Hence, SOS integrity is critical to the entire UOS security. If the SOS system is compromised, all the other guest UOS VMs may be jeopardized. In practice, the SOS designer and implementer should obey at least the following rules: -#. Verify the SOS is a closed system and doesn't allow the user to +#. Verify that the SOS is a closed system and doesn't allow the user to install any unauthorized 3rd-party software or components. -#. Verify external peripherals are constrained. +#. Verify that external peripherals are constrained. #. Enable kernel-based hardening techniques, for example dm-verity (to - make sure integrity of DM and vBIOS/vOSloaders), and kernel module + ensure integrity of the DM and vBIOS/vOSloaders), and kernel module signing. #. Enable system level hardening such as MAC (Mandatory Access Control). Detailed configurations and policies are out of scope for this document. 
-For good references for OS system security hardening and enhancement, +For good references on OS system security hardening and enhancement, see `AGL security `_ and `Android security `_ @@ -234,8 +234,8 @@ Memory Management Enhancement Background ~~~~~~~~~~ -The ACRN hypervisor has ultimate control access of all the platform -memory spaces. (See :ref:`memmgt-hld`.) Note that on the APL platform, +The ACRN hypervisor has ultimate access control of all the platform +memory spaces (see :ref:`memmgt-hld`). Note that on the APL platform, `SGX `_ and `TME `_ are not currently supported. @@ -267,37 +267,37 @@ Memory Access Restrictions The fundamental rules of restricting hypervisor memory access are: #. By default, prohibit any access to all guest VM memory. This means - that initially when the hypervisor sets up its own MMU paging tables + that when the hypervisor initially sets up its own MMU paging tables (HVA->HPA mapping), it only grants permissions for hypervisor memory - space (Excluding guest VM memory) -#. Grant access permission for hypervisor to read/write a specific guest + space (excluding guest VM memory) +#. Grant access permission for the hypervisor to read/write a specific guest VM memory region on demand. The hypervisor must never grant execution permission for itself to fetch any code instructions from guest memory space because there is no reason to do that. -In addition to these rules, the hypervisor must also implement a generic +In addition to these rules, the hypervisor must also implement generic best-practice memory configurations for access to its own memory in host -CR3 MMU paging tables, for example splitting hypervisor code and data -(stack/heap) sections, and then apply W |oplus| X policy, which means if memory +CR3 MMU paging tables, such as splitting hypervisor code and data +(stack/heap) sections, and then applying W |oplus| X policy, which means if memory is Writable, then the hypervisor must make it non-eXecutable. 
The hypervisor must configure its code as read-only and executable, and configure its data as read-write. Optionally, if there are read-only data sections, it would be best if the hypervisor configures them as read-only. -The following sections will focus on the rules mentioned above for +The following sections focus on the rules mentioned above for memory access restriction on guest VM memory (not restrictions on the hypervisor's own memory access). -SMAP/SMEP Enablement in Hypervisor -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +SMAP/SMEP Enablement in the Hypervisor +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -For the hypervisor to isolate access to guest VM memory space, there are -three typical solutions: +For the hypervisor to isolate access to the guest VM memory space, +three typical solutions exist: -#. **Configure the hypervisor/VMM MMU CR3 paging tables by removing +#. **Configure the hypervisor/VMM MMU CR3 paging tables by removing the execution permission (setting NX bit) or removing mapping completely - (setting not-present) for guest memory space.** + (setting not-present) for the guest memory space.** In practice, this works very well for NX setting to disable instruction fetching from any guest memory space. However, it is not @@ -315,18 +315,18 @@ three typical solutions: #. **Use CR0.WP (write-protection) bit.** This processor feature allows - pages to be protected from supervisor-mode write accesses. - If the host/VMM CR0.WP = 0, supervisor-mode write accesses are + pages to be protected from supervisor-mode write access. + If the host/VMM CR0.WP = 0, supervisor-mode write access is allowed to linear addresses with read-only access rights. If CR0.WP = - 1, they are not allowed. User-mode write accesses are never allowed - to linear addresses with read-only access rights, regardless of the + 1, it is not allowed. User-mode write access is never allowed + for linear addresses with read-only access rights, regardless of the value of CR0.WP. 
To implement this WP protection, the hypervisor must first configure all the guest memory space as "user-mode" accessible memory, and as - read-only access (in other words, the corresponding paging table + read-only access. In other words, the corresponding paging table entry U/S bit and R/W bit must be set in host CR3 paging tables for - all those guest memory pages). + all those guest memory pages. .. figure:: images/security-image3.png :width: 900px @@ -337,7 +337,7 @@ three typical solutions: This setting seems meaningless since all the code in the ACRN hypervisor is running in Ring 0 (supervisor-mode), and no code in the hypervisor - will be executed in Ring 3 (no user-mode applications in hypervisor / + will be executed in Ring 3 (no user-mode applications in the hypervisor / vmx-root). However, these settings are made in order to make use of the CR0.WP @@ -353,7 +353,7 @@ three typical solutions: This solution is better than the 1st solution above because it doesn't need to change the host CR3 paging tables to map or unmap guest memory pages and doesn't need to flush the TLB. - However, it cannot prevent hypervisor (running in Ring 0 mode) from + However, it cannot prevent the hypervisor (running in Ring 0 mode) from reading guest memory space because this CR0.WP bit doesn't control read access behaviors. This read access protection is essentially required because sometimes there may be secrets in guest memory and if the @@ -364,7 +364,7 @@ three typical solutions: This solution is a best solution because SMAP can prevent the hypervisor from both reading and writing guest memory, and SMEP can - prevent hypervisor from fetching/executing code in guest memory. This + prevent the hypervisor from fetching/executing code in guest memory. This solution also has minimal performance impact; like the CR0.WP protection, it doesn't require TLB flush (incurring a performance penalty) and has less code complexity. 
@@ -384,10 +384,10 @@ mode cannot fetch instructions from linear addresses that are accessible in user mode. In the ACRN hypervisor, the attacker-supplied memory could be any guest -memory, because hypervisor doesn't trust all the data/code from guest +memory, because the hypervisor doesn't trust all the data/code from guest memory by design. -In order to activate SMEP protection, ACRN hypervisor must: +In order to activate SMEP protection, the ACRN hypervisor must: #. Configure all the guest memory as user-accessible memory (U/S = 1). No matter what settings for NX bit and R/W bit in corresponding host @@ -399,7 +399,7 @@ As an alternative, NX feature is used for this purpose by setting the corresponding NX (non-execution) bit for all the guest memory mapping in host CR3 paging tables. -Since hypervisor code never runs in Ring 3 mode, either of these two +Since the hypervisor code never runs in Ring 3 mode, either of these two solutions works very well. Both solutions are enabled in the ACRN hypervisor. @@ -426,12 +426,12 @@ To manipulate that flag relatively quickly, STAC (set AC flag) and CLAC (clear AC flag) instructions are introduced for this purpose. Note that STAC and CLAC can only be executed in kernel mode (CPL=0). -To activate SMAP protection in ACRN hypervisor: +To activate SMAP protection in the ACRN hypervisor: #. Configure all the guest memory as user-writable memory (U/S bit = 1, and R/W bit = 1) in corresponding host CR3 paging table entries, as shown in :numref:`security-smap` below. -#. Set CR4.SMAP bit. In the entire lifecycle of hypervisor, this bit +#. Set CR4.SMAP bit. In the entire lifecycle of the hypervisor, this bit value always remains one. #. When needed, use STAC instruction to suppress SMAP protection, and use CLAC instruction to restore SMAP protection. 
@@ -443,7 +443,7 @@ To activate SMAP protection in ACRN hypervisor: Setting SMAP and Configuring U/S=1, R/W=1 for All Guest Memory Pages -For example, :numref:`security-smap` shows a module of hypervisor code +For example, :numref:`security-smap` shows a module of the hypervisor code (running in Ring 0 mode) attempting to perform a legitimate read (or write) access to a data area in guest memory page. @@ -464,10 +464,10 @@ The attack surface can be minimized because there is only a very small window between step 1 and step 3 in which the guest memory can be accessed by hypervisor code running in ring 0. -Rules to Access Guest Memory in Hypervisor -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Rules to Access Guest Memory in the Hypervisor +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -In ACRN hypervisor, functions ``stac()`` and ``clac()`` wrap +In the ACRN hypervisor, functions ``stac()`` and ``clac()`` wrap STAC and CLAC instructions respectively, and functions ``copy_to_gpa()``, and ``copy_from_gpa()`` can be used to copy an arbitrary amount of data to or from VM memory area. @@ -478,9 +478,9 @@ guest memory pages, one of functions above must be used. Otherwise, the unintended access from/to the guest memory pages. These functions must also internally check the address availabilities, -for example, ensuring the input address accessed by hypervisor must have +for example, ensuring the input address accessed by the hypervisor must have a valid mapping (GVA->GPA mapping, GPA->HPA EPT mapping and HVA->HPA -host MMU mapping), and must not be in the range of hypervisor memory. +host MMU mapping), and must not be in the range of the hypervisor memory. Details of these ordinary checks are out of scope in this document. @@ -489,7 +489,7 @@ Avoidance of Memory Information Leakage Protecting the hypervisor's memory is critical to the security of the entire platform. The hypervisor must prevent any memory content (e.g. -stack or heap) from leaking to guest VMs. 
Some of hypervisor memory +stack or heap) from leaking to guest VMs. Some of the hypervisor memory content may contain platform secrets such as SEEDs, which are used as the root key for its guest VMs. `Xen Advisories `_ have many examples of past hypervisor @@ -533,9 +533,9 @@ hypercall invocation in the hypervisor design: #. For those hypercalls that may result in data inconsistent intra hypervisor when they are executed concurrently, such as ``hcall_create_vm()`` ``hcll_destroy_vm()`` etc. spinlock is used to ensure these hypercalls - are processed in hypervisor in a serializing way. + are processed in the hypervisor in a serializing way. -In addition to above rules, there are other regular checks in the +In addition to the above rules, there are other regular checks in the hypercall implementation to prevent hypercalls from being misused. For example, all the parameters must be sanitized, unexpected hypervisor memory overwrite must be avoided, any hypervisor memory content/secrets @@ -550,7 +550,7 @@ emulate legacy I/O access behaviors. Typically, the I/O instructions could be IN, INS/INSB/INSW/INSD, OUT, OUTS/OUTSB/OUTSW/OUTSD with arbitrary port (although not all the I/O -ports are monitored by hypervisor). As with other interface (e.g. +ports are monitored by the hypervisor). As with other interfaces (e.g. hypercalls), the hypervisor performs security checks for all the I/O access parameters to make sure the emulation behaviors are correct. @@ -572,7 +572,7 @@ There are some other VMEXIT handlers in the hypervisor which might take untrusted parameters and registers from guest VM, for example, MSR write VMEXIT, APIC VMEXIT. -Sanity checks are performed by hypervisor to avoid security issue when +Sanity checks are performed by the hypervisor to avoid security issues when handling those special VMEXIT. Guest Instruction Emulation @@ -615,7 +615,7 @@ In the virtualization environment, a security goal is to ensure guest VM but also for data at rest isolation. 
Under this situation, if the memory contents of a previous UOS is not -scrubbed by either DM or hypervisor, then the new launched UOS could +scrubbed by either DM or the hypervisor, then the newly launched UOS could access the previous UOS's secrets by scanning the memory regions allocated for the new UOS. @@ -710,7 +710,7 @@ Extract-and-Expand Key Derivation Function, `RFC5869 The parameters of HDKF derivation in the hypervisor are: -#. VMInfo= vm-uuid (from hypervisor configuration file) +#. VMInfo= vm-uuid (from the hypervisor configuration file) #. theHash=SHA-256 #. OutSeedLen = 64 in bytes #. Guest Dev and User SEED (dvSEED/uvSEED) @@ -868,9 +868,9 @@ between non-secure world and secure world in a UOS VM. Whenever this hypercall is invoked in UOS, the hypervisor will unconditionally switch to the other world. For example, if it is called -in non-secure world, hypervisor will then switch context to secure +in non-secure world, the hypervisor will then switch context to secure world. After secure world completes its security tasks (or an external -interrupt occurs), this hypercall will be called again, then hypervisor +interrupt occurs), this hypercall will be called again, then the hypervisor will switch context back to non-secure world. During entire world switching process, SOS is not involved. This @@ -919,7 +919,7 @@ BIOS (SBL) right after production device ends its manufacturing process. For each reboot, the BIOS/SBL always retrieves the rKey from CSE FW (or generated from a special SEED that is retrieved from CSE FW, refer to :ref:`platform_root_of_trust`). The SBL hands this over to the -ACRN hypervisor, and hypervisor in turn sends it to the SOS kernel. +ACRN hypervisor, and the hypervisor in turn sends it to the SOS kernel. As an example, secure storage virtualization workflow for data write access is like this: @@ -1033,14 +1033,14 @@ implementation. 
MEI/HECI Virtualization (vHECI) ------------------------------- -[TO BE ADDED] +This information is forthcoming. Content Protection ================== -ACRN hypervisor is designed to allow guest VMs (typically UOSs) to +The ACRN hypervisor is designed to allow guest VMs (typically UOSs) to playback premium audio/video content. This section describes how the hypervisor will support content protection for each guest UOS VM on APL platform. -[TO BE ADDED] +Additional information is forthcoming.