Add Memorizer to projects

This commits an initial version of the Memorizer tracing tool. It collects and outputs detailed data on kernel objects (traced from kmalloc/kmem_cache_alloc) and on accesses to them, recording the context of each event: thread ID, program counter, and, for allocations, the name of the process.

Signed-off-by: Nathan Dautenhahn <ndd@cis.upenn.edu>
2025-09-05 00:42:54 +00:00 · 2017-07-07 23:42:05 -04:00
parent 78e5ddc675
commit b47c64f525
11 changed files with 7883 additions and 0 deletions
--- a/projects/README.md
+++ b/projects/README.md
@@ -24,6 +24,8 @@ If you want to create a project, please submit a pull request to create a new di
  IMA policies
 - [shiftfs](shiftfs/) is a filesystem for mapping mountpoints across user
  namespaces
 - [Memorizer](memorizer/) is a tool to trace intra-kernel
  memory operations. 
 ## Current projects not yet documented
 - VMWare support (VMWare)
--- a/projects/memorizer/README.md
+++ b/projects/memorizer/README.md
@@ -0,0 +1,33 @@
 # Memorizer
 Memorizer is a tool to trace fine-grained intra-kernel
 operations. The goal is to track interactions with memory
 objects for the purpose of analyzing fine-grained
 interactions amongst components and execution contexts.
 Memorizer tracks the following object operations: creation
 (alloc), destruction (free), modify (store), access (load),
 call, and return. 
 Nathan D. ([@ndauten]) presented the umbrella project,
 Opportunistic Privilege Separation (OPS), and Memorizer at
 the [7/9/17 LinuxKit SIG](../../reports/2017-07-09.md); the
 [slides](http://nathandautenhahn.com/talks/2017-06-21_ops+memorizer-linuxkit-sig/linuxkit-sig-remark.html#1)
 are available online.
 ## Usage
 See [manual usage docs](./docs/memorizer.txt). Be careful:
 if the event queues are not drained, the system will run
 out of memory. 
 For controlled use see [script + readme](./docs/scripts/).
 This script is not automatically inserted into the runtime
 yet.
 ## Issues
 - KASAN is reporting some errors within itself. This is
  noisy. To quiet the console, lower the console log level
  so that only messages with priority below 3 are printed,
  e.g., `echo 3 > /proc/sys/kernel/printk`
 - Source should be included soon, but for now there is an
  image on Docker Hub. 
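
The `echo 3` workaround in the KASAN item sets the first of the four values in `/proc/sys/kernel/printk`, the console log level; a message reaches the console only if its priority number is lower than that level. The following is a minimal Python 3 sketch for inspecting the setting. The names `PrintkLevels`, `parse_printk`, and `reaches_console` are hypothetical helpers, not part of Memorizer.

```python
from typing import NamedTuple


class PrintkLevels(NamedTuple):
    """The four integers exposed by /proc/sys/kernel/printk, in order."""
    console_loglevel: int
    default_message_loglevel: int
    minimum_console_loglevel: int
    default_console_loglevel: int


def parse_printk(text: str) -> PrintkLevels:
    """Parse the whitespace-separated contents of /proc/sys/kernel/printk."""
    fields = text.split()
    if len(fields) != 4:
        raise ValueError("expected 4 fields, got %d" % len(fields))
    return PrintkLevels(*(int(f) for f in fields))


def reaches_console(levels: PrintkLevels, priority: int) -> bool:
    """A message is shown on the console if its priority is below the level."""
    return priority < levels.console_loglevel
```

To check a running system, pass the contents of `/proc/sys/kernel/printk` to `parse_printk`.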
--- a/projects/memorizer/docs/memorizer.txt
+++ b/projects/memorizer/docs/memorizer.txt
@@ -0,0 +1,92 @@
             +--------------------------------------------------+
             | Memorizer: Kernel Memory Access Patterns (KMAPs) |
             +--------------------------------------------------+
 Introduction
 ============
 Memorizer is a tool to record information about access to kernel objects:
 specifically, it counts memory accesses from distinct instruction pointer (IP)
 addresses in the kernel code and records the PID that performed each access,
 thereby providing spatial and temporal dimensions.
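
One way to picture this bookkeeping: each tracked object keeps read and write counters keyed by the (instruction pointer, PID) pair that touched it. The Python 3 sketch below is only an illustrative model, not the kernel implementation; the names `AccessCounts`, `TrackedObject`, and `record_access` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class AccessCounts:
    """Write and read counters for one (access_ip, access_pid) pair."""
    writes: int = 0
    reads: int = 0


@dataclass
class TrackedObject:
    """Illustrative model of one allocated kernel object."""
    alloc_ip: int
    pid: int
    size: int
    # Maps (access_ip, access_pid) -> AccessCounts
    accesses: dict = field(default_factory=lambda: defaultdict(AccessCounts))

    def record_access(self, access_ip: int, access_pid: int, is_write: bool) -> None:
        """Increment the counter for the code location and process that accessed."""
        counts = self.accesses[(access_ip, access_pid)]
        if is_write:
            counts.writes += 1
        else:
            counts.reads += 1
```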
 Interface via debugfs
 =====================
 The tool has a very simple interface at the moment. It can:
 - Print out some statistics about memory allocations and memory accesses
 - Control enable/disable of memory object allocation tracking and memory access
  tracing
 - Print the KMAP using the debugfs file system
 Enable object allocation tracking:
 ```bash
 echo 1 > /sys/kernel/debug/memorizer/memorizer_enabled
 ```
 Enable object access tracking:
 ```bash
 echo 1 > /sys/kernel/debug/memorizer/memorizer_log_access
 ```
 Show allocation statistics:
 ```bash
 cat /sys/kernel/debug/memorizer/show_stats
 ```
 Clear freed objects:
 ```bash
 echo 1 > /sys/kernel/debug/memorizer/clear_object_list
 ```
 Using Memorizer to Collect KMAPs
 ================================
 Memorizer does not yet push its logs out or clear its object lists on its own,
 so it is prone to exhausting memory. The only way to manage the log and the
 current set of objects is to print and clear the KMAPs manually.
 Therefore, a typical run using Memorizer to create KMAPs includes:
 ```bash
 # mount the debugfs filesystem if it isn't already
 mount -t debugfs nodev /sys/kernel/debug
 # clear free objects: the current system traces from boot with a lot of
 # uninteresting data
 echo 1 > /sys/kernel/debug/memorizer/clear_object_list
 # enable memorizer object access tracking, which by default is off
 echo 1 > /sys/kernel/debug/memorizer/memorizer_log_access
 # Now run whatever test
 tar zcf something.tar.gz /somedir &
 scp u@h:/somefile .
 ...
 # Disable access logging
 echo 0 > /sys/kernel/debug/memorizer/memorizer_log_access
 # Disable memorizer object tracking: isn't necessary but will reduce noise
 echo 0 > /sys/kernel/debug/memorizer/memorizer_enabled
 # Cat the results: make sure to pipe to something
 cat /sys/kernel/debug/memorizer/kmap > test.kmap
 ```
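
The same sequence can be wrapped in a small helper so that tracing is switched off and the KMAP is saved even if the workload fails. The following Python 3 sketch assumes the debugfs file names shown above. The names `write_control` and `trace_session` and the `base` parameter are hypothetical, introduced only for illustration; the default path needs root and a Memorizer kernel.

```python
import contextlib
import os
import shutil

MEMORIZER_DIR = "/sys/kernel/debug/memorizer"


def write_control(base, name, value):
    """Write an integer flag to one of the Memorizer control files."""
    with open(os.path.join(base, name), "w") as f:
        f.write("%d\n" % value)


@contextlib.contextmanager
def trace_session(kmap_out, base=MEMORIZER_DIR):
    """Clear old objects, log accesses while the body runs, then save the KMAP."""
    write_control(base, "clear_object_list", 1)
    write_control(base, "memorizer_log_access", 1)
    try:
        yield
    finally:
        # Stop logging first so the dump itself adds as little noise as possible.
        write_control(base, "memorizer_log_access", 0)
        write_control(base, "memorizer_enabled", 0)
        with open(os.path.join(base, "kmap"), "rb") as src, \
                open(kmap_out, "wb") as dst:
            shutil.copyfileobj(src, dst)
```

A workload would then run inside `with trace_session("test.kmap"):`. This sketch does not drain the KMAP while the workload runs, so long workloads can still exhaust memory.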
 Output Format
 =============
 Memorizer outputs data as text, which may change if space is a problem. The
 format of the kmap file is as follows:
 alloc_ip,pid,obj_va_ptr,size,alloc_jiffies,free_jiffies,free_ip,executable
  access_ip,access_pid,write_count,read_count
  access_ip,access_pid,write_count,read_count
  access_ip,access_pid,write_count,read_count
    ...
    ...
 There are a few special error codes in the free_ip field: 
    - Not all free_ip values could be obtained correctly, so some of them
      will be 0.
    - There is a bug where we insert into the live object map over another
      allocation, which implies that we missed a free. For now we mark
      the free_ip of such objects as 0xDEADBEEF.
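
A kmap file in this layout can be read back by grouping each access record under the object record that precedes it. The Python 3 sketch below is a minimal reader under stated assumptions: records are told apart by field count (4 for an access, 8 or more for an object), and every value is kept as a string because the number base is not specified here. The name `parse_kmap` is hypothetical.

```python
OBJECT_FIELDS = ["alloc_ip", "pid", "obj_va_ptr", "size",
                 "alloc_jiffies", "free_jiffies", "free_ip", "executable"]
ACCESS_FIELDS = ["access_ip", "access_pid", "write_count", "read_count"]


def parse_kmap(lines):
    """Return a list of object dicts, each with an "accesses" list of dicts."""
    objects = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        parts = line.split(",")
        if len(parts) == len(ACCESS_FIELDS):
            # Access record: belongs to the most recent object record.
            if not objects:
                raise ValueError("access record before any object record")
            objects[-1]["accesses"].append(dict(zip(ACCESS_FIELDS, parts)))
        elif len(parts) >= len(OBJECT_FIELDS):
            # Object record: re-join the tail in case the executable
            # name itself contains a comma.
            head = parts[:7] + [",".join(parts[7:])]
            obj = dict(zip(OBJECT_FIELDS, head))
            obj["accesses"] = []
            objects.append(obj)
        else:
            raise ValueError("unrecognized kmap line: %r" % raw)
    return objects
```

A saved file could be loaded with `parse_kmap(open("test.kmap"))`.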
--- a/projects/memorizer/docs/scripts/README.txt
+++ b/projects/memorizer/docs/scripts/README.txt
@@ -0,0 +1,36 @@
 Files: memorizer.py, test_memorizer.py
 Dependencies:
 In order to run the test_memorizer w/ linux test suite, you must 
 wget the latest version from the ltp github repo and set it up.
 Ex:
 wget https://github.com/linux-test-project/ltp/releases/download/20170116/ltp-full-20170116.tar.bz2
 tar xvfj ltp-full-20170116.tar.bz2
 # cd into the untarred dir
 ./configure
 make
 sudo make install
 Good documentation / examples: http://ltp.sourceforge.net/documentation/how-to/ltp.php
 memorizer.py: accepts processes to run in quotes. 
 Ex: python memorizer.py "ls" "mkdir dir"
 In order to run the script, your user must be in the 
 memorizer group, which you should set up if it does not exist.
 How-to: sudo groupadd memorizer; sudo usermod -aG memorizer <user>
 You will be prompted for your password in order to set group 
 permissions on the /sys/kernel/debug dirs, which include ftrace
 and memorizer.
 test_memorizer.py: accepts either -e, -m, or -h flags.
 Ex: python test_memorizer.py -e
 *All modes will run the setup/cleanup checks to ensure all virtual nodes
 are being set correctly.
 -e: Runs ls, wget, and tar sequentially.
 -m: Runs the linux test suite and saves a human-readable log to 
 /opt/ltp/results/ltp.log
 -h: runs both -e and -m
 As with memorizer.py, your user needs to be in the
 memorizer group.  Additionally, you will be prompted for your
 password in order to set group permissions on the /opt/ltp dirs.
--- a/projects/memorizer/docs/scripts/memorizer.py
+++ b/projects/memorizer/docs/scripts/memorizer.py
@@ -0,0 +1,152 @@
 import sys,threading,os,subprocess,operator,time
 mem_path = "/sys/kernel/debug/memorizer/"
 directory = ""
 completed = False
 def worker(cmd):
  ret = os.system(cmd)    
  if(ret != 0):
    print "Failed attempt on: " + cmd
    exit(1)
 def basic_cleanup():
  print "Basic tests completed. Now cleaning up."
  ret = os.system("rm UPennlogo2.jpg")
 def memManager():
  while(not completed):
    stats = subprocess.check_output(["free"])
    stats_list = stats.split()
    total_mem = float(stats_list[7])
    used_mem = float(stats_list[8])
    memory_usage = used_mem / total_mem
    if(memory_usage > 0.8):
      ret = os.system("cat " + mem_path + "kmap >> " + directory + "test.kmap")
      if ret != 0:
        print "Failed to append kmap to temp file"
        exit(1)
      ret = os.system("echo 1 > " + mem_path + "clear_printed_list")
      if ret != 0:
        print "Failed to clear printed list"
        exit(1)
    time.sleep(2)
 def startup():
  ret = os.system("sudo chgrp -R memorizer /opt/")
  if ret != 0:
    print "Failed to change group permissions of /opt/"
    exit(1)
  ret = os.system("sudo chmod -R g+wrx /opt/")
  if ret != 0:
    print "Failed to grant wrx permissions to /opt/"
    exit(1)
  # Setup group permissions to ftrace & memorizer directories
  ret = os.system("sudo chgrp -R memorizer /sys/kernel/debug/")
  if ret != 0:
    print "Failed to change memorizer group permissions to /sys/kernel/debug/"
    exit(1)
  ret = os.system("sudo chmod -R g+wrx /sys/kernel/debug/")
  if ret != 0:
    print "Failed to grant wrx permissions to /sys/kernel/debug/"
    exit(1)
  # Memorizer Startup
  ret = os.system("echo 1 > " + mem_path + "clear_object_list")
  if ret != 0:
    print "Failed to clear object list"
    exit(1)
  ret = os.system("echo 0 > " + mem_path + "print_live_obj")
  if ret != 0:
    print "Failed to disable live object dumping"
    exit(1)
  ret = os.system("echo 1 > " + mem_path + "memorizer_enabled")
  if ret != 0:
    print "Failed to enable memorizer object allocation tracking"
    exit(1)
  ret = os.system("echo 1 > " + mem_path + "memorizer_log_access")
  if ret != 0:
    print "Failed to enable memorizer object access tracking"
    exit(1)
 def cleanup():
  # Memorizer cleanup
  ret = os.system("echo 0 > " + mem_path + "memorizer_log_access")
  if ret != 0:
    print "Failed to disable memorizer object access tracking"
    exit(1)
  ret = os.system("echo 0 > " + mem_path + "memorizer_enabled")
  if ret != 0:
    print "Failed to disable memorizer object allocation tracking"
    exit(1)
  # Print stats
  ret = os.system("cat " + mem_path + "show_stats")
  if ret != 0:
    print "Failed to display memorizer stats"
    exit(1)
  ret = os.system("echo 1 > " + mem_path + "print_live_obj")
  if ret != 0:
    print "Failed to enable live object dumping"
    exit(1)
  # Make local copies of outputs
  ret = os.system("cat " + mem_path + "kmap >> " +directory+ "test.kmap")
  if ret != 0:
    print "Failed to copy live and freed objs to kmap"
    exit(1)
  ret = os.system("echo 1 > " + mem_path + "clear_object_list")
  if ret != 0:
    print "Failed to clear all freed objects in obj list"
    exit(1)
 def main(argv):
  global completed
  global directory
  if len(sys.argv) == 1:
    print "Invalid/missing arg. Please enter -e for basic tests, -m for ltp tests, and/or specify a full process to run in quotes. Specify path using the -p <path> otherwise default to ."
    return
  startup()
  processes = []
  easy_processes = False
  next_arg = False
  # Skip argv[0], which is the script name
  for arg in argv[1:]:
    if next_arg: 
      next_arg = False
      directory = str(arg) + "/"
    elif arg == '-p':
      next_arg = True
    #User wants to run ltp
    elif arg == '-m':
      print "Performing ltp tests" 
      processes.append("/opt/ltp/runltp -p -l ltp.log")
      print "See /opt/ltp/results/ltp.log for ltp results"
    #User wants to run wget,ls,etc.
    elif arg == '-e':
      easy_processes = True
      print "Performing basic ls test"
      processes.append("ls")
      print "Performing wget test"
      processes.append("wget http://www.sas.upenn.edu/~egme/UPennlogo2.jpg")
    #Any other argument is a full command to run, as the usage message describes
    else:
      processes.append(arg)
  print "Attempting to remove any existing kmaps in the specified path"
  os.system("rm " + directory + "test.kmap")
  print "Startup completed. Generating threads."
  manager = threading.Thread(target=memManager, args=())
  manager.start()
  threads = []
  for process in processes:
    try:
      t = threading.Thread(target=worker, args=(process,))
      threads.append(t)
      t.start()
    except:
      print "Error: unable to start thread"
  for thr in threads:
    thr.join()
  completed = True
  manager.join()
  print "Threads ran to completion. Cleaning up."
  # Only remove the downloaded test file if the basic tests ran
  if easy_processes:
    basic_cleanup()
  cleanup()
  print "Cleanup successful."
  return 0
 if __name__ == "__main__":
  main(sys.argv)
--- a/projects/memorizer/kernel-memorizer/Dockerfile
+++ b/projects/memorizer/kernel-memorizer/Dockerfile
@@ -0,0 +1,122 @@
 FROM linuxkit/alpine:a44da41b988024aa2c73b28dee8a51d026f6240b AS kernel-build
 RUN apk add \
    argp-standalone \
    automake \
    bash \
    bc \
    binutils-dev \
    bison \
    build-base \
    curl \
    diffutils \
    flex \
    git \
    gmp-dev \
    gnupg \
    installkernel \
    kmod \
    libelf-dev \
    libressl-dev \
    libunwind-dev \
    linux-headers \
    ncurses-dev \
    sed \
    squashfs-tools \
    tar \
    xz \
    xz-dev \
    zlib-dev
 ARG KERNEL_VERSION
 ARG KERNEL_SERIES
 ARG DEBUG
 ENV KERNEL_SOURCE=https://www.kernel.org/pub/linux/kernel/v4.x/linux-${KERNEL_VERSION}.tar.xz
 ENV KERNEL_SHA256_SUMS=https://www.kernel.org/pub/linux/kernel/v4.x/sha256sums.asc
 ENV KERNEL_PGP2_SIGN=https://www.kernel.org/pub/linux/kernel/v4.x/linux-${KERNEL_VERSION}.tar.sign
 # PGP keys: 589DA6B1 (greg@kroah.com) & 6092693E (autosigner@kernel.org) & 00411886 (torvalds@linux-foundation.org)
 COPY keys.asc keys.asc
 # Download and verify kernel
 RUN curl -fsSLO ${KERNEL_SHA256_SUMS} && \
    gpg2 -q --import keys.asc && \
    gpg2 --verify sha256sums.asc && \
    KERNEL_SHA256=$(grep linux-${KERNEL_VERSION}.tar.xz sha256sums.asc | cut -d ' ' -f 1) && \
    curl -fsSLO ${KERNEL_SOURCE} && \
    echo "${KERNEL_SHA256}  linux-${KERNEL_VERSION}.tar.xz" | sha256sum -c - && \
    xz -d linux-${KERNEL_VERSION}.tar.xz && \
    curl -fsSLO ${KERNEL_PGP2_SIGN} && \
    gpg2 --verify linux-${KERNEL_VERSION}.tar.sign linux-${KERNEL_VERSION}.tar && \
    cat linux-${KERNEL_VERSION}.tar | tar --absolute-names -x && mv /linux-${KERNEL_VERSION} /linux
 #COPY linux-slice /linux
 COPY kernel_config-${KERNEL_SERIES} /linux/arch/x86/configs/x86_64_defconfig
 COPY kernel_config.debug /linux/debug_config
 RUN if [ -n "${DEBUG}" ]; then \
    sed -i 's/CONFIG_PANIC_ON_OOPS=y/# CONFIG_PANIC_ON_OOPS is not set/' /linux/arch/x86/configs/x86_64_defconfig; \
    cat /linux/debug_config >> /linux/arch/x86/configs/x86_64_defconfig; \
    fi
 # Apply local patches
 COPY patches-${KERNEL_SERIES} /patches
 WORKDIR /linux
 RUN set -e && for patch in /patches/*.patch; do \
        echo "Applying $patch"; \
        patch -p1 < "$patch"; \
    done
 RUN mkdir /out
 # Kernel
 RUN make defconfig && \
    make oldconfig && \
    make -j "$(getconf _NPROCESSORS_ONLN)" KCFLAGS="-fno-pie" && \
    cp arch/x86_64/boot/bzImage /out/kernel && \
    cp System.map /out && \
    ([ -n "${DEBUG}" ] && cp vmlinux /out || true)
 # Modules
 RUN make INSTALL_MOD_PATH=/tmp/kernel-modules modules_install && \
    ( DVER=$(basename $(find /tmp/kernel-modules/lib/modules/ -mindepth 1 -maxdepth 1)) && \
      cd /tmp/kernel-modules/lib/modules/$DVER && \
      rm build source && \
      ln -s /usr/src/linux-headers-$DVER build ) && \
    ( cd /tmp/kernel-modules && tar cf /out/kernel.tar lib )
 # Headers (userspace API)
 RUN mkdir -p /tmp/kernel-headers/usr && \
    make INSTALL_HDR_PATH=/tmp/kernel-headers/usr headers_install && \
    ( cd /tmp/kernel-headers && tar cf /out/kernel-headers.tar usr )
 # Headers (kernel development)
 RUN DVER=$(basename $(find /tmp/kernel-modules/lib/modules/ -mindepth 1 -maxdepth 1)) && \
    dir=/tmp/usr/src/linux-headers-$DVER && \
    mkdir -p $dir && \
    cp /linux/.config $dir && \
    cp /linux/Module.symvers $dir && \
    find . -path './include/*' -prune -o \
           -path './arch/*/include' -prune -o \
           -path './scripts/*' -prune -o \
           -type f \( -name 'Makefile*' -o -name 'Kconfig*' -o -name 'Kbuild*' -o \
                      -name '*.lds' -o -name '*.pl' -o -name '*.sh' \) | \
         tar cf - -T - | (cd $dir; tar xf -) && \
    ( cd /tmp && tar cf /out/kernel-dev.tar usr/src )
 RUN printf "KERNEL_SOURCE=${KERNEL_SOURCE}\n" > /out/kernel-source-info
 # perf (Don't compile for 4.4.x, it's broken and tedious to fix)
 #RUN if [ "${KERNEL_SERIES}" != "4.4.x" ]; then \
       #mkdir -p /build/perf && \
       #make -C tools/perf LDFLAGS=-static O=/build/perf && \
       #strip /build/perf/perf && \
       #cp /build/perf/perf /out; \
     #fi
 FROM scratch
 ENTRYPOINT []
 CMD []
 WORKDIR /
 COPY --from=kernel-build /out/* /
--- a/projects/memorizer/kernel-memorizer/Makefile
+++ b/projects/memorizer/kernel-memorizer/Makefile
@@ -0,0 +1,83 @@
 # This builds the supported LinuxKit kernels. Kernels are wrapped up
 # in a scratch container, which contains the bzImage, a tar
 # ball with modules, the kernel sources, and in some cases, the perf binary.
 #
 # Each kernel is pushed to hub twice:
 # - linuxkit/kernel:<kernel>.<major>.<minor>-<hash>
 # - linuxkit/kernel:<kernel>.<major>.<minor>
 # The <hash> is the git tree hash of the current directory. The build
 # will only rebuild the kernel image if the git tree hash changed.
 #
 # For some kernels we also build a separate package containing the perf utility
 # which is specific to a given kernel. perf packages are tagged the same way
 # as kernel packages.
 # Git tree hash of this directory. Override to force build
 HASH?=$(shell git ls-tree HEAD -- ../$(notdir $(CURDIR)) | awk '{print $$3}')
 # Name and Org on Hub
 ORG?=linuxkitprojects
 IMAGE:=kernel-memorizer
 IMAGE_PERF:=kernel-perf
 # Add '-dirty' to hash if the repository is not clean. make does not
 # concatenate strings without spaces, so we use the documented trick
 # of replacing the space with nothing.
 DIRTY=$(shell git diff-index --quiet HEAD --; echo $$?)
 ifneq ($(DIRTY),0)
 HASH+=-dirty
 nullstring :=
 space := $(nullstring) $(nullstring)
 TAG=$(subst $(space),,$(HASH))
 else
 TAG=$(HASH)
 endif
 .PHONY: check tag push
 # Targets:
 # build: builds all kernels
 # push:  pushes and signs all tagged kernel images to hub
 build:
 push:
 # A template for defining kernel build
 # Arguments:
 # $1: Full kernel version, e.g., 4.9.22
 # $2: Kernel "series", e.g., 4.9.x
 # $3: Build a debug kernel (used as suffix for image)
 # This defines targets like:
 # build_4.9.x and  push_4.9.x and adds them as dependencies
 # to the global targets
 # Set $3 to "_dbg", to build debug kernels. This defines targets like
 # build_4.9.x_dbg and adds "_dbg" to the hub image name.
 define kernel
 build_$(2)$(3): Dockerfile Makefile $(wildcard patches-$(2)/*) kernel_config-$(2) kernel_config.debug
 	docker pull $(ORG)/$(IMAGE):$(1)$(3)-$(TAG) || \
 		docker build \
 			--build-arg KERNEL_VERSION=$(1) \
 			--build-arg KERNEL_SERIES=$(2) \
 			--build-arg DEBUG=$(3) \
 			--no-cache -t $(ORG)/$(IMAGE):$(1)$(3)-$(TAG) .
 push_$(2)$(3): build_$(2)$(3)
 	@if [ $(DIRTY) -ne 0 ]; then echo "Your repository is not clean. Will not push image"; exit 1; fi
 	DOCKER_CONTENT_TRUST=1 docker pull $(ORG)/$(IMAGE):$(1)$(3)-$(TAG) || \
 		(DOCKER_CONTENT_TRUST=1 docker push $(ORG)/$(IMAGE):$(1)$(3)-$(TAG) && \
 		 docker tag $(ORG)/$(IMAGE):$(1)$(3)-$(TAG) $(ORG)/$(IMAGE):$(1)$(3) && \
 		 DOCKER_CONTENT_TRUST=1 docker push $(ORG)/$(IMAGE):$(1)$(3))
 build: build_$(2)$(3)
 push: push_$(2)$(3)
 endef
 #
 # Build Targets
 # Debug targets only for latest stable and LTS stable
 #
 #$(eval $(call kernel,4.10,4.10.x))
 $(eval $(call kernel,4.10,4.10.x,_dbg))
 #$(eval $(call kernel,4.11.7,4.11.x,_dbg))
 #$(eval $(call kernel,4.9.34,4.9.x))
 #$(eval $(call kernel,4.9.34,4.9.x,_dbg))
 #$(eval $(call kernel,4.4.74,4.4.x))
--- a/projects/memorizer/kernel-memorizer/kernel_config-4.10.x
+++ b/projects/memorizer/kernel-memorizer/kernel_config-4.10.x
--- a/projects/memorizer/kernel-memorizer/kernel_config.debug
+++ b/projects/memorizer/kernel-memorizer/kernel_config.debug
@@ -0,0 +1,26 @@
 ## LinuxKit DEBUG OPTIONS ##
 CONFIG_LOCKDEP=y
 CONFIG_FRAME_POINTER=y
 CONFIG_LOCKUP_DETECTOR=y
 CONFIG_DETECT_HUNG_TASK=y
 CONFIG_DEBUG_TIMEKEEPING=y
 CONFIG_DEBUG_RT_MUTEXES=y
 CONFIG_DEBUG_SPINLOCK=y
 CONFIG_DEBUG_MUTEXES=y
 CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y
 CONFIG_DEBUG_LOCK_ALLOC=y
 CONFIG_PROVE_LOCKING=y
 CONFIG_LOCK_STAT=y
 CONFIG_DEBUG_ATOMIC_SLEEP=y
 CONFIG_DEBUG_LIST=y
 CONFIG_DEBUG_NOTIFIERS=y
 CONFIG_PROVE_RCU=y
 CONFIG_RCU_TRACE=y
 CONFIG_KGDB=y
 CONFIG_KGDB_SERIAL_CONSOLE=y
 CONFIG_KGDBOC=y
 CONFIG_DEBUG_RODATA_TEST=y
 CONFIG_DEBUG_WX=y
--- a/projects/memorizer/kernel-memorizer/keys.asc
+++ b/projects/memorizer/kernel-memorizer/keys.asc
--- a/projects/memorizer/memorizer.yml
+++ b/projects/memorizer/memorizer.yml
@@ -0,0 +1,19 @@
 kernel:
  image: "linuxkitprojects/kernel-memorizer:4.10_dbg-17e2eee03ab59f8df8a9c10ace003a84aec2f540"
  cmdline: "console=ttyS0 page_poison=1"
 init:
  - linuxkit/init:059b2bb4b6efa5c58cf53fed4d0ea863521959fc
  - linuxkit/runc:4a35484aa6f90a1f06cdf1fb36f7056926a084b9
  - linuxkit/containerd:b6ffbb669248e3369081a6c4427026aa968a2385
 onboot:
  - name: dhcpcd
    image: linuxkit/dhcpcd:4b7b8bb024cebb1bbb9c8026d44d7cbc8e202c41
    command: ["/sbin/dhcpcd", "--nobackground", "-f", "/dhcpcd.conf", "-1"]
 services:
  - name: getty
    image: linuxkit/getty:0bd92d5f906491c20e4177c57f965338fe5a8c5f
    env:
     - INSECURE=true
 trust:
  org:
    - linuxkit