Add Memorizer to projects

This commits an initial version of the Memorizer tracing tool. It collects and
outputs detailed data on objects (traced from kmalloc/kmem_cache_alloc) and
accesses, tracking the context of each event with respect to thread ID, program
counter, and, for allocations, the name of the process.

Signed-off-by: Nathan Dautenhahn <ndd@cis.upenn.edu>
Nathan Dautenhahn 2017-07-07 23:42:05 -04:00
parent 78e5ddc675
commit b47c64f525
11 changed files with 7883 additions and 0 deletions

View File

@ -24,6 +24,8 @@ If you want to create a project, please submit a pull request to create a new di
IMA policies
- [shiftfs](shiftfs/) is a filesystem for mapping mountpoints across user
namespaces
- [Memorizer](memorizer/) is a tool to trace intra-kernel
memory operations.
## Current projects not yet documented
- VMWare support (VMWare)

View File

@ -0,0 +1,33 @@
# Memorizer
Memorizer is a tool to trace fine-grained intra-kernel
operations. The goal is to track interactions with memory
objects for the purpose of analyzing fine-grained
interactions amongst components and execution contexts.
Memorizer tracks the following object operations: creation
(alloc), destruction (free), modify (store), access (load),
call, and return.
Nathan D. ([@ndauten]) presented the umbrella project,
Opportunistic Privilege Separation (OPS), and Memorizer at
the [7/9/17 LinuxKit SIG](../../reports/2017-07-09.md); the
[slides](http://nathandautenhahn.com/talks/2017-06-21_ops+memorizer-linuxkit-sig/linuxkit-sig-remark.html#1)
are available online.
## Usage
See [manual usage docs](./docs/memorizer.txt). Be careful:
if the event queues are not drained, the system will run
out of memory.
For controlled use see [script + readme](./docs/memorizer/).
This script is not yet inserted into the runtime
automatically.
## Issues
- KASAN is reporting some errors within itself, which is
noisy. To quiet the console, lower the console log level so
that only messages with a level below 3 are printed,
e.g., `echo 3 > /proc/sys/kernel/printk`
- Source should be included soon, but for now there is an
image on Docker Hub.
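
To make the effect of the `printk` setting concrete: the first field of `/proc/sys/kernel/printk` is the console log level, and a message reaches the console only when its own level is numerically lower. The Python sketch below illustrates this rule; the function names are illustrative, and it assumes that the KASAN reports are logged at `KERN_ERR` (level 3).

```python
def parse_printk(text):
    """Parse the four integers found in /proc/sys/kernel/printk."""
    names = ("console", "default_message", "minimum_console", "default_console")
    values = [int(field) for field in text.split()]
    if len(values) != len(names):
        raise ValueError("expected 4 fields, got %d" % len(values))
    return dict(zip(names, values))


def reaches_console(message_level, console_level):
    """A message is shown on the console only if its level is lower."""
    return message_level < console_level


KERN_ERR = 3  # assumed level of the KASAN reports

# Example file content after `echo 3 > /proc/sys/kernel/printk`
# (the last three values are placeholders).
levels = parse_printk("3\t4\t1\t7\n")
suppressed = not reaches_console(KERN_ERR, levels["console"])
```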

View File

@ -0,0 +1,92 @@
+--------------------------------------------------+
| Memorizer: Kernel Memory Access Patterns (KMAPs) |
+--------------------------------------------------+
Introduction
============
Memorizer is a tool to record information about access to kernel objects:
specifically, it counts memory accesses from distinct instruction pointer (IP)
addresses in the kernel and records the PID that performed each access, thereby
providing spatial and temporal dimensions.
Interface via debugfs
=====================
The tool has a very simple interface at the moment. It can:
- Print out some statistics about memory allocations and memory accesses
- Control enable/disable of memory object allocation tracking and memory access
tracing
- Print the KMAP using the debugfs file system
Enable object allocation tracking:
```bash
echo 1 > /sys/kernel/debug/memorizer/memorizer_enabled
```
Enable object access tracking:
```bash
echo 1 > /sys/kernel/debug/memorizer/memorizer_log_access
```
Show allocation statistics:
```bash
cat /sys/kernel/debug/memorizer/show_stats
```
Clear free'd objects:
```bash
echo 1 > /sys/kernel/debug/memorizer/clear_object_list
```
Using Memorizer to Collect KMAPs
================================
Memorizer lacks push-style logging and clearing of the object lists, so it
tends to exhaust memory. The only way to manage the log and the current set of
objects is to clear and print the KMAPs manually.
A typical run using Memorizer to create KMAPs therefore includes:
```bash
# mount the debugfs filesystem if it isn't already
mount -t debugfs nodev /sys/kernel/debug
# clear freed objects: the current system traces from boot with a lot of
# uninteresting data
echo 1 > /sys/kernel/debug/memorizer/clear_object_list
# enable memorizer object access tracking, which by default is off
echo 1 > /sys/kernel/debug/memorizer/memorizer_log_access
# Now run whatever test
tar zcf something.tar.gz /somedir &
ssh u@h:/somefile
...
# Disable access logging
echo 0 > /sys/kernel/debug/memorizer/memorizer_log_access
# Disable memorizer object tracking: isn't necessary but will reduce noise
echo 0 > /sys/kernel/debug/memorizer/memorizer_enabled
# Cat the results: make sure to pipe to something
cat /sys/kernel/debug/memorizer/kmap > test.kmap
```
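Because the object lists are never drained automatically, a long run has to repeat the print-and-clear step. The Python sketch below shows one such step. `drain_kmap` is a hypothetical helper, not part of Memorizer, and it assumes that writing `1` to `clear_object_list` right after reading `kmap` is an acceptable way to release freed objects. It makes no attempt to handle events that arrive between the two steps.

```python
import os
import shutil

MEMORIZER_DIR = "/sys/kernel/debug/memorizer"


def drain_kmap(out_path, memorizer_dir=MEMORIZER_DIR):
    """Append the current KMAP to out_path, then clear freed objects."""
    # Stream the kmap so that a large dump is never held in memory at once.
    with open(os.path.join(memorizer_dir, "kmap"), "rb") as src, \
            open(out_path, "ab") as dst:
        shutil.copyfileobj(src, dst)
    # Equivalent of: echo 1 > .../clear_object_list
    with open(os.path.join(memorizer_dir, "clear_object_list"), "w") as ctl:
        ctl.write("1\n")
```

Calling `drain_kmap("test.kmap")` periodically (with permission to access debugfs) keeps appending to the same output file, much like the `cat ... >> test.kmap` pattern.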
Output Format
=============
Memorizer outputs data as text, which may change if space is a problem. The
format of the kmap file is as follows:
alloc_ip,pid,obj_va_ptr,size,alloc_jiffies,free_jiffies,free_ip,executable
access_ip,access_pid,write_count,read_count
access_ip,access_pid,write_count,read_count
access_ip,access_pid,write_count,read_count
...
...
There are a few special error codes:
- Not all free_ip's could be obtained correctly, and therefore some of these
will be 0.
- There is a bug where we insert into the live object map over another
allocation, which implies that we are missing a free. For now we mark
the free_ip of such objects as 0xDEADBEEF.
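The layout above can be read back with a small parser. This Python sketch assumes only what the format description states: an allocation record has eight comma-separated fields, and each access record that follows it has four. Field values are kept as strings because the text does not say whether they are printed in hex or decimal. `parse_kmap` and the tuple names are illustrative, and a process name containing a comma would break the field count.

```python
from collections import namedtuple

Alloc = namedtuple(
    "Alloc",
    "alloc_ip pid obj_va_ptr size alloc_jiffies free_jiffies free_ip "
    "executable accesses",
)
Access = namedtuple("Access", "access_ip access_pid write_count read_count")


def parse_kmap(lines):
    """Group access records under the allocation record that precedes them."""
    allocs = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        fields = [f.strip() for f in line.split(",")]
        if len(fields) == 8:
            # New allocation record: start an empty access list for it.
            allocs.append(Alloc(*fields, accesses=[]))
        elif len(fields) == 4 and allocs:
            allocs[-1].accesses.append(Access(*fields))
        else:
            raise ValueError("unexpected kmap line: %r" % line)
    return allocs
```

With the result in hand, objects that hit the missing-free bug can be selected by comparing `free_ip` against `0xDEADBEEF` in whatever textual form the file uses.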

View File

@ -0,0 +1,36 @@
Files: memorizer.py, test_memorizer.py
Dependencies:
To run test_memorizer with the Linux Test Project (LTP) suite, you must
download the latest release from the ltp GitHub repo and set it up.
Ex:
wget https://github.com/linux-test-project/ltp/releases/download/20170116/ltp-full-20170116.tar.bz2
tar xvfj ltp-full-20170116.tar.bz2
# cd into the untarred dir
./configure
make
sudo make install
Good documentation / examples: http://ltp.sourceforge.net/documentation/how-to/ltp.php
memorizer.py: accepts processes to run in quotes.
Ex: python memorizer.py "ls" "mkdir dir"
To run the script, your user must be in the memorizer group;
create the group if it does not exist.
How-to: sudo groupadd memorizer; sudo usermod -aG memorizer <user>
You will be prompted for your password in order to set group
permissions on the /sys/kernel/debug dirs, which include ftrace
and memorizer.
test_memorizer.py: accepts either -e, -m, or -h flags.
Ex: python test_memorizer.py -e
*All modes will run the setup/cleanup checks to ensure all virtual nodes
are being set correctly.
-e: Runs ls, wget, and tar sequentially.
-m: Runs the linux test suite and saves a human-readable log to
/opt/ltp/results/ltp.log
-h: runs both -e and -m
As with memorizer.py, your user needs to be in the
memorizer group. Additionally, you will be prompted for your
password in order to set group permissions on the /opt/ltp dirs.
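Both scripts expect the invoking user to belong to the memorizer group. One way to check this before running them is sketched below in Python. `user_in_group` is a hypothetical helper and is not part of either script. It only inspects the group's member list, so a user whose primary group is memorizer would not be detected.

```python
import grp


def user_in_group(user, group="memorizer"):
    """Return True if user is listed as a member of group."""
    try:
        return user in grp.getgrnam(group).gr_mem
    except KeyError:
        # The group does not exist yet (see the groupadd step above).
        return False
```

For example, `user_in_group(getpass.getuser())` could be used as a guard at the top of a wrapper script.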

View File

@ -0,0 +1,152 @@
import os
import subprocess
import sys
import threading
import time

mem_path = "/sys/kernel/debug/memorizer/"
directory = ""
completed = False


def worker(cmd):
    # Run one test command in a shell; abort on a non-zero exit status.
    ret = os.system(cmd)
    if ret != 0:
        print("Failed attempt on: " + cmd)
        sys.exit(1)


def basic_cleanup():
    print("Basic tests completed. Now cleaning up.")
    # Remove the file downloaded by the wget test.
    os.system("rm UPennlogo2.jpg")


def memManager():
    # Poll memory usage and drain the kmap whenever usage exceeds 80%.
    while not completed:
        stats = subprocess.check_output(["free"])
        stats_list = stats.split()
        # With a six-column header, tokens 7 and 8 are the "total" and
        # "used" columns of the Mem: row.
        total_mem = float(stats_list[7])
        used_mem = float(stats_list[8])
        memory_usage = used_mem / total_mem
        if memory_usage > 0.8:
            ret = os.system("cat " + mem_path + "kmap >> " + directory + "test.kmap")
            if ret != 0:
                print("Failed to append kmap to temp file")
                sys.exit(1)
            ret = os.system("echo 1 > " + mem_path + "clear_printed_list")
            if ret != 0:
                print("Failed to clear printed list")
                sys.exit(1)
        time.sleep(2)


def startup():
    ret = os.system("sudo chgrp -R memorizer /opt/")
    if ret != 0:
        print("Failed to change group permissions of /opt/")
        sys.exit(1)
    ret = os.system("sudo chmod -R g+wrx /opt/")
    if ret != 0:
        print("Failed to grant wrx permissions to /opt/")
        sys.exit(1)
    # Setup group permissions to ftrace & memorizer directories
    ret = os.system("sudo chgrp -R memorizer /sys/kernel/debug/")
    if ret != 0:
        print("Failed to change memorizer group permissions to /sys/kernel/debug/")
        sys.exit(1)
    ret = os.system("sudo chmod -R g+wrx /sys/kernel/debug/")
    if ret != 0:
        print("Failed to grant wrx permissions to /sys/kernel/debug/")
        sys.exit(1)
    # Memorizer Startup
    ret = os.system("echo 1 > " + mem_path + "clear_object_list")
    if ret != 0:
        print("Failed to clear object list")
        sys.exit(1)
    ret = os.system("echo 0 > " + mem_path + "print_live_obj")
    if ret != 0:
        print("Failed to disable live object dumping")
        sys.exit(1)
    ret = os.system("echo 1 > " + mem_path + "memorizer_enabled")
    if ret != 0:
        print("Failed to enable memorizer object allocation tracking")
        sys.exit(1)
    ret = os.system("echo 1 > " + mem_path + "memorizer_log_access")
    if ret != 0:
        print("Failed to enable memorizer object access tracking")
        sys.exit(1)


def cleanup():
    # Memorizer cleanup
    ret = os.system("echo 0 > " + mem_path + "memorizer_log_access")
    if ret != 0:
        print("Failed to disable memorizer object access tracking")
        sys.exit(1)
    ret = os.system("echo 0 > " + mem_path + "memorizer_enabled")
    if ret != 0:
        print("Failed to disable memorizer object allocation tracking")
        sys.exit(1)
    # Print stats
    ret = os.system("cat " + mem_path + "show_stats")
    if ret != 0:
        print("Failed to display memorizer stats")
        sys.exit(1)
    ret = os.system("echo 1 > " + mem_path + "print_live_obj")
    if ret != 0:
        print("Failed to enable live object dumping")
        sys.exit(1)
    # Make local copies of outputs
    ret = os.system("cat " + mem_path + "kmap >> " + directory + "test.kmap")
    if ret != 0:
        print("Failed to copy live and freed objs to kmap")
        sys.exit(1)
    ret = os.system("echo 1 > " + mem_path + "clear_object_list")
    if ret != 0:
        print("Failed to clear all freed objects in obj list")
        sys.exit(1)


def main(argv):
    global completed
    global directory
    if len(sys.argv) == 1:
        print("Invalid/missing arg. Please enter -e for basic tests, -m for ltp tests, and/or specify a full process to run in quotes. Specify path using the -p <path> otherwise default to .")
        return
    startup()
    processes = []
    easy_processes = False
    next_arg = False
    for arg in argv:
        if next_arg:
            # The argument following -p is the output directory.
            next_arg = False
            directory = str(arg) + "/"
        elif arg == '-p':
            next_arg = True
        elif arg == '-m':
            # User wants to run ltp
            print("Performing ltp tests")
            processes.append("/opt/ltp/runltp -p -l ltp.log")
            print("See /opt/ltp/results/ltp.log for ltp results")
        elif arg == '-e':
            # User wants to run wget, ls, etc.
            easy_processes = True
            print("Performing basic ls test")
            processes.append("ls")
            print("Performing wget test")
            processes.append("wget http://www.sas.upenn.edu/~egme/UPennlogo2.jpg")
    print("Attempting to remove any existing kmaps in the specified path")
    os.system("rm " + directory + "test.kmap")
    print("Startup completed. Generating threads.")
    # The manager thread drains the kmap while the workers run.
    manager = threading.Thread(target=memManager, args=())
    manager.start()
    threads = []
    for process in processes:
        try:
            t = threading.Thread(target=worker, args=(process,))
            threads.append(t)
            t.start()
        except Exception:
            print("Error: unable to start thread")
    for thr in threads:
        thr.join()
    # Signal the manager loop to stop, then wait for it.
    completed = True
    manager.join()
    print("Threads ran to completion. Cleaning up.")
    basic_cleanup()
    cleanup()
    print("Cleanup successful.")
    return 0


if __name__ == "__main__":
    main(sys.argv)
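
`memManager` relies on fixed token positions in the output of `free`, which is fragile if the column layout changes. As an alternative (this is not what the script does), memory pressure can be read from `/proc/meminfo`. The Python sketch below assumes a kernel that reports `MemAvailable`. The function names are illustrative, and the resulting fraction treats reclaimable cache as free, so it is not numerically identical to the `used / total` ratio above.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo key to its numeric value (kB for most keys)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    return info


def memory_usage(info):
    """Fraction of RAM in use, counting reclaimable cache as free."""
    return 1.0 - float(info["MemAvailable"]) / info["MemTotal"]


def current_usage(path="/proc/meminfo"):
    """Read the file at path and return the usage fraction."""
    with open(path) as f:
        return memory_usage(parse_meminfo(f.read()))
```

A polling loop could then compare `current_usage()` against the same 0.8 threshold.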

View File

@ -0,0 +1,122 @@
FROM linuxkit/alpine:a44da41b988024aa2c73b28dee8a51d026f6240b AS kernel-build
RUN apk add \
argp-standalone \
automake \
bash \
bc \
binutils-dev \
bison \
build-base \
curl \
diffutils \
flex \
git \
gmp-dev \
gnupg \
installkernel \
kmod \
libelf-dev \
libressl-dev \
libunwind-dev \
linux-headers \
ncurses-dev \
sed \
squashfs-tools \
tar \
xz \
xz-dev \
zlib-dev
ARG KERNEL_VERSION
ARG KERNEL_SERIES
ARG DEBUG
ENV KERNEL_SOURCE=https://www.kernel.org/pub/linux/kernel/v4.x/linux-${KERNEL_VERSION}.tar.xz
ENV KERNEL_SHA256_SUMS=https://www.kernel.org/pub/linux/kernel/v4.x/sha256sums.asc
ENV KERNEL_PGP2_SIGN=https://www.kernel.org/pub/linux/kernel/v4.x/linux-${KERNEL_VERSION}.tar.sign
# PGP keys: 589DA6B1 (greg@kroah.com) & 6092693E (autosigner@kernel.org) & 00411886 (torvalds@linux-foundation.org)
COPY keys.asc keys.asc
# Download and verify kernel
RUN curl -fsSLO ${KERNEL_SHA256_SUMS} && \
gpg2 -q --import keys.asc && \
gpg2 --verify sha256sums.asc && \
KERNEL_SHA256=$(grep linux-${KERNEL_VERSION}.tar.xz sha256sums.asc | cut -d ' ' -f 1) && \
curl -fsSLO ${KERNEL_SOURCE} && \
echo "${KERNEL_SHA256} linux-${KERNEL_VERSION}.tar.xz" | sha256sum -c - && \
xz -d linux-${KERNEL_VERSION}.tar.xz && \
curl -fsSLO ${KERNEL_PGP2_SIGN} && \
gpg2 --verify linux-${KERNEL_VERSION}.tar.sign linux-${KERNEL_VERSION}.tar && \
cat linux-${KERNEL_VERSION}.tar | tar --absolute-names -x && mv /linux-${KERNEL_VERSION} /linux
#COPY linux-slice /linux
COPY kernel_config-${KERNEL_SERIES} /linux/arch/x86/configs/x86_64_defconfig
COPY kernel_config.debug /linux/debug_config
RUN if [ -n "${DEBUG}" ]; then \
sed -i 's/CONFIG_PANIC_ON_OOPS=y/# CONFIG_PANIC_ON_OOPS is not set/' /linux/arch/x86/configs/x86_64_defconfig; \
cat /linux/debug_config >> /linux/arch/x86/configs/x86_64_defconfig; \
fi
# Apply local patches
COPY patches-${KERNEL_SERIES} /patches
WORKDIR /linux
RUN set -e && for patch in /patches/*.patch; do \
echo "Applying $patch"; \
patch -p1 < "$patch"; \
done
RUN mkdir /out
# Kernel
RUN make defconfig && \
make oldconfig && \
make -j "$(getconf _NPROCESSORS_ONLN)" KCFLAGS="-fno-pie" && \
cp arch/x86_64/boot/bzImage /out/kernel && \
cp System.map /out && \
([ -n "${DEBUG}" ] && cp vmlinux /out || true)
# Modules
RUN make INSTALL_MOD_PATH=/tmp/kernel-modules modules_install && \
( DVER=$(basename $(find /tmp/kernel-modules/lib/modules/ -mindepth 1 -maxdepth 1)) && \
cd /tmp/kernel-modules/lib/modules/$DVER && \
rm build source && \
ln -s /usr/src/linux-headers-$DVER build ) && \
( cd /tmp/kernel-modules && tar cf /out/kernel.tar lib )
# Headers (userspace API)
RUN mkdir -p /tmp/kernel-headers/usr && \
make INSTALL_HDR_PATH=/tmp/kernel-headers/usr headers_install && \
( cd /tmp/kernel-headers && tar cf /out/kernel-headers.tar usr )
# Headers (kernel development)
RUN DVER=$(basename $(find /tmp/kernel-modules/lib/modules/ -mindepth 1 -maxdepth 1)) && \
dir=/tmp/usr/src/linux-headers-$DVER && \
mkdir -p $dir && \
cp /linux/.config $dir && \
cp /linux/Module.symvers $dir && \
find . -path './include/*' -prune -o \
-path './arch/*/include' -prune -o \
-path './scripts/*' -prune -o \
-type f \( -name 'Makefile*' -o -name 'Kconfig*' -o -name 'Kbuild*' -o \
-name '*.lds' -o -name '*.pl' -o -name '*.sh' \) | \
tar cf - -T - | (cd $dir; tar xf -) && \
( cd /tmp && tar cf /out/kernel-dev.tar usr/src )
RUN printf "KERNEL_SOURCE=${KERNEL_SOURCE}\n" > /out/kernel-source-info
# perf (Don't compile for 4.4.x, it's broken and tedious to fix)
#RUN if [ "${KERNEL_SERIES}" != "4.4.x" ]; then \
#mkdir -p /build/perf && \
#make -C tools/perf LDFLAGS=-static O=/build/perf && \
#strip /build/perf/perf && \
#cp /build/perf/perf /out; \
#fi
FROM scratch
ENTRYPOINT []
CMD []
WORKDIR /
COPY --from=kernel-build /out/* /
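
The download step above finds the expected digest with `grep`/`cut` and checks it with `sha256sum -c`. The same check is restated below as a Python sketch for readers who want to reproduce it outside the Dockerfile. The function names are illustrative, signature verification with `gpg2` is not covered, and the lookup matches the file name exactly rather than as a substring as `grep` does.

```python
import hashlib


def expected_digest(sums_text, filename):
    """Return the digest listed for filename in a sha256sums-style text."""
    for line in sums_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].lstrip("*") == filename:
            return parts[0]
    raise KeyError(filename)


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path, sums_text, filename):
    """True if the file at path matches the digest listed for filename."""
    return file_digest(path) == expected_digest(sums_text, filename)
```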

View File

@ -0,0 +1,83 @@
# This builds the supported LinuxKit kernels. Kernels are wrapped up
# in a scratch container, which contains the bzImage, a tar
# ball with modules, the kernel sources, and in some cases, the perf binary.
#
# Each kernel is pushed to hub twice:
# - linuxkit/kernel:<kernel>.<major>.<minor>-<hash>
# - linuxkit/kernel:<kernel>.<major>.<minor>
# The <hash> is the git tree hash of the current directory. The build
# will only rebuild the kernel image if the git tree hash changed.
#
# For some kernels we also build a separate package containing the perf utility
# which is specific to a given kernel. perf packages are tagged the same way
# as kernel packages.
# Git tree hash of this directory. Override to force build
HASH?=$(shell git ls-tree HEAD -- ../$(notdir $(CURDIR)) | awk '{print $$3}')
# Name and Org on Hub
ORG?=linuxkitprojects
IMAGE:=kernel-memorizer
IMAGE_PERF:=kernel-perf
# Add '-dirty' to hash if the repository is not clean. make does not
# concatenate strings without spaces, so we use the documented trick
# of replacing the space with nothing.
DIRTY=$(shell git diff-index --quiet HEAD --; echo $$?)
ifneq ($(DIRTY),0)
HASH+=-dirty
nullstring :=
space := $(nullstring) $(nullstring)
TAG=$(subst $(space),,$(HASH))
else
TAG=$(HASH)
endif
.PHONY: check tag push
# Targets:
# build: builds all kernels
# push: pushes and signs all tagged kernel images to hub
build:
push:
# A template for defining kernel build
# Arguments:
# $1: Full kernel version, e.g., 4.9.22
# $2: Kernel "series", e.g., 4.9.x
# $3: Build a debug kernel (used as suffix for image)
# This defines targets like:
# build_4.9.x and push_4.9.x and adds them as dependencies
# to the global targets
# Set $3 to "_dbg", to build debug kernels. This defines targets like
# build_4.9.x_dbg and adds "_dbg" to the hub image name.
define kernel
build_$(2)$(3): Dockerfile Makefile $(wildcard patches-$(2)/*) kernel_config-$(2) kernel_config.debug
	docker pull $(ORG)/$(IMAGE):$(1)$(3)-$(TAG) || \
		docker build \
		--build-arg KERNEL_VERSION=$(1) \
		--build-arg KERNEL_SERIES=$(2) \
		--build-arg DEBUG=$(3) \
		--no-cache -t $(ORG)/$(IMAGE):$(1)$(3)-$(TAG) .
push_$(2)$(3): build_$(2)$(3)
	@if [ $(DIRTY) -ne 0 ]; then echo "Your repository is not clean. Will not push image"; exit 1; fi
	DOCKER_CONTENT_TRUST=1 docker pull $(ORG)/$(IMAGE):$(1)$(3)-$(TAG) || \
		(DOCKER_CONTENT_TRUST=1 docker push $(ORG)/$(IMAGE):$(1)$(3)-$(TAG) && \
		docker tag $(ORG)/$(IMAGE):$(1)$(3)-$(TAG) $(ORG)/$(IMAGE):$(1)$(3) && \
		DOCKER_CONTENT_TRUST=1 docker push $(ORG)/$(IMAGE):$(1)$(3))
build: build_$(2)$(3)
push: push_$(2)$(3)
endef
#
# Build Targets
# Debug targets only for latest stable and LTS stable
#
#$(eval $(call kernel,4.10,4.10.x))
$(eval $(call kernel,4.10,4.10.x,_dbg))
#$(eval $(call kernel,4.11.7,4.11.x,_dbg))
#$(eval $(call kernel,4.9.34,4.9.x))
#$(eval $(call kernel,4.9.34,4.9.x,_dbg))
#$(eval $(call kernel,4.4.74,4.4.x))
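
The naming scheme used by the `kernel` template can be restated outside of make. This Python sketch mirrors the `$(ORG)/$(IMAGE):$(1)$(3)-$(TAG)` expression and the `-dirty` suffix. `image_ref` is an illustrative name and is not part of the build.

```python
def image_ref(org, image, version, suffix="", tree_hash=None, dirty=False):
    """Build the Hub reference used by the build_*/push_* targets.

    Without tree_hash this is the short tag, $(ORG)/$(IMAGE):$(1)$(3).
    With tree_hash it is the full tag, with "-dirty" appended to the hash
    when the repository has uncommitted changes.
    """
    ref = "%s/%s:%s%s" % (org, image, version, suffix)
    if tree_hash is not None:
        ref += "-" + tree_hash + ("-dirty" if dirty else "")
    return ref
```

For the enabled target (`4.10`, `4.10.x`, `_dbg`), the short tag is `linuxkitprojects/kernel-memorizer:4.10_dbg`.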

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,26 @@
## LinuxKit DEBUG OPTIONS ##
CONFIG_LOCKDEP=y
CONFIG_FRAME_POINTER=y
CONFIG_LOCKUP_DETECTOR=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEBUG_TIMEKEEPING=y
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCK_STAT=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_DEBUG_LIST=y
CONFIG_DEBUG_NOTIFIERS=y
CONFIG_PROVE_RCU=y
CONFIG_RCU_TRACE=y
CONFIG_KGDB=y
CONFIG_KGDB_SERIAL_CONSOLE=y
CONFIG_KGDBOC=y
CONFIG_DEBUG_RODATA_TEST=y
CONFIG_DEBUG_WX=y

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,19 @@
kernel:
  image: "linuxkitprojects/kernel-memorizer:4.10_dbg-17e2eee03ab59f8df8a9c10ace003a84aec2f540"
  cmdline: "console=ttyS0 page_poison=1"
init:
  - linuxkit/init:059b2bb4b6efa5c58cf53fed4d0ea863521959fc
  - linuxkit/runc:4a35484aa6f90a1f06cdf1fb36f7056926a084b9
  - linuxkit/containerd:b6ffbb669248e3369081a6c4427026aa968a2385
onboot:
  - name: dhcpcd
    image: linuxkit/dhcpcd:4b7b8bb024cebb1bbb9c8026d44d7cbc8e202c41
    command: ["/sbin/dhcpcd", "--nobackground", "-f", "/dhcpcd.conf", "-1"]
services:
  - name: getty
    image: linuxkit/getty:0bd92d5f906491c20e4177c57f965338fe5a8c5f
    env:
      - INSECURE=true
trust:
  org:
    - linuxkit