From 569652fd36e07ad379a26c307d57bf6a648e8d67 Mon Sep 17 00:00:00 2001
From: Justin Cormack
Date: Sun, 19 Mar 2017 14:20:06 +0000
Subject: [PATCH] Initial overview of the okernel project

Signed-off-by: Justin Cormack
---
 projects/README.md         |  2 +-
 projects/okernel/README.md | 99 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 100 insertions(+), 1 deletion(-)
 create mode 100644 projects/okernel/README.md

diff --git a/projects/README.md b/projects/README.md
index 96061d8a3..b5f3faef4 100644
--- a/projects/README.md
+++ b/projects/README.md
@@ -13,9 +13,9 @@ If you want to create a project, please submit a pull request to create a new d
 - [Kernel Self Protection Project enhancements](kspp/)
 - [Mirage SDK](miragesdk/) privilege separation for userspace services
 - [Wireguard](wireguard/) cryptographic enforced container network separation
+- [OKernel](okernel/) intra-kernel protection using EPT (HPE)
 
 ## Current projects not yet documented
 - Clear Linux integration (Intel)
 - VMWare support (VMWare)
 - ARM port and secure boot integration (ARM)
-- OKernel integration (HPE)
diff --git a/projects/okernel/README.md b/projects/okernel/README.md
new file mode 100644
index 000000000..78c8acd4c
--- /dev/null
+++ b/projects/okernel/README.md
@@ -0,0 +1,99 @@
+Authors: Chris Dalton, Nigel Edwards
+
+Split Kernel
+
+Similar to the nested-kernel work for BSD by Dautenhahn[1], the aim of
+the split kernel is to introduce a level of intra-kernel protection
+into the kernel so that, amongst other things, we can offer lifetime
+guarantees over kernel code and data integrity. Unlike the BSD-based
+nested kernel work, we focus on the Linux kernel rather than BSD and do
+make use of hardware virtualization features such as Extended Page Tables
+(EPT) or equivalent to provide protection from malicious kernel
+changes. (Our initial prototype is based on Intel x86, but the
+intention is to be architecture neutral so we can apply it to other
+architectures, including AMD and ARM.)
+
+The split-kernel provides a (protected) virtualized view of the kernel
+for processes entering the kernel through exceptions, syscalls and
+interrupts. Though we make use of hardware features designed to
+support virtualization, we do not virtualize at the full virtual
+machine level (like KVM or VMware, for example). Instead, conceptually
+our model is closer to the approach prototyped by the DUNE[2] project
+where they virtualize much higher up at the user space process
+level. DUNE uses the hardware virtualization features to support
+virtualization within the user space context of a Linux process to
+safely expose privileged hardware features to user programs. We
+instead take a cut-line lower down in the OS stack and include the
+virtualization of the kernel space context of a process. This kernel
+virtualization allows us to introduce a level of intra-kernel
+protection into the Linux kernel.
+
+Our initial prototype consists of a combination of fairly extensive
+modifications to the existing DUNE Linux kernel module (which itself
+derives from KVM) and a relatively small number of select
+modifications to the core Linux kernel code to support the virtualized
+kernel cut-line.
+
+In terms of operation, a process can be switched into 'outer-kernel'
+mode, which includes creating an EPT 'container' (a lower-level set of
+page tables) for it. After switching, the process resumes running in a
+non-root (NR) mode VMCS context even when in kernel context.
+
+(In the remainder of this README we use root-mode or R-mode to
+describe a process which has full visibility of the page tables:
+upper and lower. NR-mode or non-root mode describes a process which
+only has visibility of the upper level page tables.)
+
+With this model, the majority of kernel code can be run within the EPT
+'container', offering an enhanced memory protection mechanism whilst
+maintaining a single shared kernel image.
+A small handler loop within the kernel for each process (thread)
+handles transitions from NR-mode to R-mode where necessary to support
+VMEXITs and provide a privileged operations interface.
+
+Once a process is in NR-mode, the ability to make changes to kernel
+memory is controlled by permissions on both the upper and lower level
+page tables. Our security goal is to use the lower level page tables
+to prevent an NR-mode process making malicious changes to the
+kernel. For example, as far as possible it should not be able to write
+to code or data pages in NR-mode, or, if changes are made, they should
+be isolated to the NR-mode context.
+
+If a process in NR-mode attempts to change the kernel memory in
+conflict with permissions in the lower-level page tables, a VMEXIT (in
+the current prototype which uses Intel VMX) is triggered. R-mode is
+then entered, where the permission violation is handled.
+
+
+LIMITATIONS AND CAVEATS
+
+The current implementation does not have any protection of the kernel
+in place yet. It is a demonstration that you can create processes and
+run them in NR-mode using EPTs with a shared kernel. As a further
+demonstration of the concept, it implements protected memory pages,
+whereby a process may request a protected memory page which will not
+be mapped into the EPTs for other processes.
+
+The next step, and the subject of our ongoing research, is to design
+the memory protection architecture for the kernel.
+Examples of the things that we are considering protecting from
+non-root (NR) mode processes are:
+ - Protection of the page tables (no NR mode process can modify a
+   page table)
+ - Protection of kernel executable code (RX only)
+ - Protection of kernel data structures (RO)
+
+
+REFERENCES:
+
+[1] Nested Kernel: An Operating System Architecture for Intra-Kernel
+Privilege Separation, Nathan Dautenhahn, Theodoros Kasampalis, Will
+Dietz, John Criswell, Vikram Adve, ASPLOS '15, Proceedings of the
+Twentieth International Conference on Architectural Support for
+Programming Languages and Operating Systems, March 2015.
+
+[2] Dune: Safe user-level access to privileged CPU features, Adam
+Belay, Andrea Bittau, Ali Mashtizadeh, David Terei, David Mazières,
+and Christos Kozyrakis, OSDI '12, Proceedings of the 10th USENIX
+Symposium on Operating Systems Design and Implementation, October
+2012.
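
Note on the README text in this patch: the two-level permission model it describes can be illustrated with a toy model. This is a minimal sketch and not okernel code. The names `Perm`, `effective` and `nr_mode_access`, and the outcome labels, are hypothetical. It assumes an access succeeds only when both the upper and lower level page tables permit it, and that a lower-level (EPT) denial surfaces as a VMEXIT, as the README states. How an upper-level denial is handled is not covered by the README, so it is only labelled here.

```python
from enum import Flag


class Perm(Flag):
    """Toy page permissions (hypothetical names, not okernel identifiers)."""
    NONE = 0
    R = 1
    W = 2
    X = 4


def effective(upper: Perm, lower: Perm) -> Perm:
    """Assumed model: an access is allowed only if both levels allow it."""
    return upper & lower


def nr_mode_access(upper: Perm, lower: Perm, want: Perm) -> str:
    """Classify an access attempted by an NR-mode process.

    The returned labels are illustrative only.
    """
    if (want & upper) != want:
        # Denied by the upper-level page tables; the README does not
        # describe how this case is handled.
        return "upper-fault"
    if (want & lower) != want:
        # Denied by the lower-level page tables (EPT): per the README this
        # triggers a VMEXIT and the violation is handled in R-mode.
        return "vmexit"
    return "ok"
```

Under these assumptions, a kernel code page whose upper-level entry has been made writable but whose lower-level entry is RX only would classify a write as `"vmexit"`, and a protected page that is not mapped in another process's EPT (`Perm.NONE` at the lower level) would classify any access by that process the same way.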