falco

mirror of https://github.com/falcosecurity/falco.git synced 2026-02-21 14:13:27 +00:00

Author	SHA1	Message	Date
Mark Stemm	1f28f85bdf	K8s audit evts (#450 ) * Add new json/webserver libs, embedded webserver Add two new external libraries: - nlohmann-json is a better json library that has stronger use of c++ features like type deduction, better conversion from stl structures, etc. We'll use it to hold generic json objects instead of jsoncpp. - civetweb is an embeddable webserver that will allow us to accept posted json data. New files webserver.{cpp,h} start an embedded webserver that listens for POSTS on a configurable url and passes the json data to the falco engine. New falco config items are under webserver: - enabled: true\|false. Whether to start the embedded webserver or not. - listen_port. Port that webserver listens on - k8s_audit_endpoint: uri on which to accept POSTed k8s audit events. (This commit doesn't compile entirely on its own, but we're grouping these related changes into one commit for clarity). * Don't use relative paths to find lua code You can look directly below PROJECT_SOURCE_DIR. * Reorganize compiler lua code The lua compiler code is generic enough to work on more than just sinsp-based rules, so move the parts of the compiler related to event types and filterchecks out into a standalone lua file sinsp_rule_utils.lua. The checks for event types/filterchecks are now done from rule_loader, and are dependent on a "source" attribute of the rule being "sinsp". We'll be adding additional types of events next that come from sources other than system calls. * Manage separate syscall/k8s audit rulesets Add the ability to manage separate sets of rules (syscall and k8s_audit). Stop using the sinsp_evttype_filter object from the sysdig repo, replacing it with falco_ruleset/falco_sinsp_ruleset from ruleset.{cpp,h}. It has the same methods to add rules, associate them with rulesets, and (for syscall) quickly find the relevant rules for a given syscall/event type. At the falco engine level, there are new parallel interfaces for both types of rules (syscall and k8s_audit) to: - add a rule: add_k8s_audit_filter/add_sinsp_filter - match an event against rules, possibly returning a result: process_sinsp_event/process_k8s_audit_event At the rule loading level, the mechanics of creating filterchecks objects is handled two factories (sinsp_filter_factory and json_event_filter_factory), both of which are held by the engine. * Handle multiple rule types when parsing rules Modify the steps of parsing a rule's filter expression to handle multiple types of rules. Notable changes: - In the rule loader/ast traversal, pass a filter api object down, which is passed back up in the lua parser api calls like nest(), bool_op(), rel_expr(), etc. - The filter api object is either the sinsp factory or k8s audit factory, depending on the rule type. - When the rule is complete, the complete filter is passed to the engine using either add_sinsp_filter()/add_k8s_audit_filter(). * Add multiple output formatting types Add support for multiple output formatters. Notable changes: - The falco engine is passed along to falco_formats to gain access to the engine's factories. - When creating a formatter, the source of the rule is passed along with the format string, which controls which kind of output formatter is created. Also clean up exception handling a bit so all lua callbacks catch all exceptions and convert them into lua errors. * Add support for json, k8s audit filter fields With some corresponding changes in sysdig, you can now create general purpose filter fields and events, which can be tied together with nesting, expressions, and relational operators. The classes here represent an instance of these fields devoted to generic json objects as well as k8s audit events. Notable changes: - json_event: holds a json object, used by all of the below - json_event_filter_check: Has the ability to extract values out of a json_event object and has the ability to define macros that associate a field like "group.field" with a json pointer expression that extracts a single property's value out of the json object. The basic field definition also allows creating an index e.g. group.field[index], where a std::function is responsible for performing the indexing. This class has virtual void methods so it must be overridden. - jevt_filter_check: subclass of json_event_filter_check and defines the following fields: - jevt.time/jevt.rawtime: extracts the time from the underlying json object. - jevt.value[<json pointer>]: general purpose way to extract any json value out of the underlying object. <json pointer> is a json pointer expression - jevt.obj: Return the entire object, stringified. - k8s_audit_filter_check: implements fields that extract values from k8s audit events. Most of the implementation is in the form of macros like ka.user.name, ka.uri, ka.target.name, etc. that just use json pointers to extact the appropriate value from a k8s audit event. More advanced fields like ka.uri.param, ka.req.container.image use indexing to extract individual values out of maps or arrays. - json_event_filter_factory: used by things like the lua parser api, output formatter, etc to create the necessary objects and return them. - json_event_formatter: given a format string, create the necessary fields that will be used to create a resolved string when given a json_event object. * Add ability to list fields Similar to sysdig's -l option, add --list (<source>) to list the fields supported by falco. With no source specified, will print all fields. Source can be "syscall" for inspector fields e.g. what is supported by sysdig, or "k8s_audit" to list fields supported only by the k8s audit support in falco. * Initial set of k8s audit rules Add an initial set of k8s audit rules. They're broken into 3 classes of rules: - Suspicious activity: this includes things like: - A disallowed k8s user performing an operation - A disallowed container being used in a pod. - A pod created with a privileged pod. - A pod created with a sensitive mount. - A pod using host networking - Creating a NodePort Service - A configmap containing private credentials - A request being made by an unauthenticated user. - Attach/exec to a pod. (We eventually want to also do privileged pods, but that will require some state management that we don't currently have). - Creating a new namespace outside of an allowed set - Creating a pod in either of the kube-system/kube-public namespaces - Creating a serviceaccount in either of the kube-system/kube-public namespaces - Modifying any role starting with "system:" - Creating a clusterrolebinding to the cluster-admin role - Creating a role that wildcards verbs or resources - Creating a role with writable permissions/pod exec permissions. - Resource tracking. This includes noting when a deployment, service, - configmap, cluster role, service account, etc are created or destroyed. - Audit tracking: This tracks all audit events. To support these rules, add macros/new indexing functions as needed to support the required fields and ways to index the results. * Add ability to read trace files of k8s audit evts Expand the use of the -e flag to cover both .scap files containing system calls as well as jsonl files containing k8s audit events: If a trace file is specified, first try to read it using the inspector. If that throws an exception, try to read the first line as json. If both fail, return an error. Based on the results of the open, the main loop either calls do_inspect(), looping over system events, or read_k8s_audit_trace_file(), reading each line as json and passing it to the engine and outputs. * Example showing how to enable k8s audit logs. An example of how to enable k8s audit logging for minikube. * Add unit tests for k8s audit support Initial unit test support for k8s audit events. A new multiplex file falco_k8s_audit_tests.yaml defines the tests. Traces (jsonl files) are in trace_files/k8s_audit and new rules files are in test/rules/k8s_audit. Current test cases include: - User outside allowed set - Creating disallowed pod. - Creating a pod explicitly on the allowed list - Creating a pod w/ a privileged container (or second container), or a pod with no privileged container. - Creating a pod w/ a sensitive mount container (or second container), or a pod with no sensitive mount. - Cases for a trace w/o the relevant property + the container being trusted, and hostnetwork tests. - Tests that create a Service w/ and w/o a NodePort type. - Tests for configmaps: tries each disallowed string, ensuring each is detected, and the other has a configmap with no disallowed string, ensuring it is not detected. - The anonymous user creating a namespace. - Tests for all kactivity rules e.g. those that create/delete resources as compared to suspicious activity. - Exec/Attach to Pod - Creating a namespace outside of an allowed set - Creating a pod/serviceaccount in kube-system/kube-public namespaces - Deleting/modifying a system cluster role - Creating a binding to the cluster-admin role - Creating a cluster role binding that wildcards verbs or resources - Creating a cluster role with write/pod exec privileges * Don't manually install gcc 4.8 gcc 4.8 should already be installed by default on the vm we use for travis.	2018-11-09 10:15:39 -08:00
Mark Stemm	7dbdb00109	Also add endswith to lua parser (#443 ) * Also add endswith to lua parser Add endswith as a symbol so it can be parsed in filter expressions. * Unit test for endswith support Add a test case for endswith support, based on the filename ending with null.	2018-10-18 09:59:13 -07:00
Mark Stemm	6445cdb950	Better copyright notices (#426 ) * Use correct copyright years. Also include the start year. * Improve copyright notices. Use the proper start year instead of just 2018. Add the right owner Draios dba Sysdig. Add copyright notices to some files that were missing them.	2018-09-26 19:49:19 -07:00
Mark Stemm	2352b96d6b	Change license to Apache 2.0 (#419 ) Replace references to GNU Public License to Apache license in: - COPYING file - README - all source code below falco - rules files - rules and code below test directory - code below falco directory - entrypoint for docker containers (but not the Dockerfiles) I didn't generally add copyright notices to all the examples files, as they aren't core falco. If they did refer to the gpl I changed them to apache.	2018-09-20 11:47:10 -07:00
Mark Stemm	512a36dfe1	Conditional rules (#364 ) * Add ability to skip rules for unknown filters Add the ability to skip a rule if its condition refers to a filtercheck that doesn't exist. This allows defining a rules file that contains new conditions that can still has limited backward compatibility with older falco versions. When compiling a filter, return a list of filtercheck names that are present in the ast (which also includes filterchecks from any macros). This set of filtercheck names is matched against the set of filterchecks known to sinsp, expressed as lua patterns, and in the global table defined_filters. If no match is found, the rule loader throws an error. The pattern changes slightly depending on whether the filter has arguments or not. Two filters (proc.apid/proc.aname) can work with or without arguments, so both styles of patterns are used. If the rule has an attribute "skip-if-unknown-filter", the rule will be skipped instead. * Unit tests for skipping unknown filter New unit test for skipping unknown filter. Test cases: - A rule that refers to an unknown filter results in an error. - A rule that refers to an unknown filter, but has "skip-if-unknown-filter: true", can be read, but doesn't match any events. - A rule that refers to an unknown filter, but has "skip-if-unknown-filter: false", returns an error. Also test the case of a filtercheck like evt.arg.xxx working properly with the embedded patterns as well as proc.aname/apid which work both ways.	2018-05-03 14:24:32 -07:00
Mark Stemm	9d3392e9b9	Use better way to skip falco events (#356 ) * Use better way to skip falco events Use the new method falco_consider() to determine which events to skip. This centralizes the logic in a single function. All events will still be considered if falco was run with -A. This depends on https://github.com/draios/sysdig/pull/1105. * Add ability to specify -A flag in tests test attribute all_events corresponds to the -A flag. Add for some tests that would normally refer to skipped events.	2018-04-24 15:23:51 -07:00
Mark Stemm	e922a849a9	Add tests catchall order (#355 ) * Only check whole rule names when matching counts Tweak the regex so a rule my_great_rule doesn't pick up event counts for a rule "great_rule: nnn". * Add ability to skip evttype warnings for rules A new attribute warn_evttypes, if present, suppresses printing warnings related to a rule not matching any event type. Useful if you have a rule where not including an event type is intentional. * Add test for preserving rule order Test the fix for https://github.com/draios/falco/issues/354. A rules file has a event-specific rule first and a catchall rule second. Without the changes in https://github.com/draios/sysdig/pull/1103, the first rule does not match the event.	2018-04-19 09:31:20 -07:00
Mark Stemm	ac190ca457	Properly support syscalls in filter conditions (#352 ) * Properly support syscalls in filter conditions Syscalls have their own numbers but they weren't really handled within falco. This meant that there wasn't a way to handle filters with evt.type=xxx clauses where xxx was a value that didn't have a corresponding event entry (like "madvise", for examples), or where a syscall like open could also be done indirectly via syscall(__NR_open, ...). First, add a new top-level global syscalls that maps from a string like "madvise" to all the syscall nums for that id, just as we do for event names/numbers. In the compiler, when traversing the AST for evt.type=XXX or evt.type in (XXX, ...) clauses, also try to match XXX against the global syscalls table, and return any ids in a standalone table. Also throw an error if an XXX doesn't match any event name or syscall name. The syscall numbers are passed as an argument to sinsp_evttype_filter so it can preindex the filters by syscall number. This depends on https://github.com/draios/sysdig/pull/1100 * Add unit test for syscall support This does a madvise, which doesn't have a ppm event type, both directly and indirectly via syscall(__NR_madvise, ...), as well as an open directly + indirectly. The corresponding rules file matches on madvise and open. The test ensures that both opens and both madvises are detected.	2018-04-17 17:14:45 -07:00
Mark Stemm	c5b3097a65	Add ability to read rules files from directories (#348 ) * Add ability to read rules files from directories When the argument to -r <path> or an entry in falco.yaml's rules_file list is a directory, read all files in the directory and add them to the rules file list. The files in the directory are sorted alphabetically before being added to the list. The installed falco adds directories /etc/falco/rules.available and /etc/falco/rules.d and moves /etc/falco/application_rules.yaml to /etc/falco/rules.available. /etc/falco/rules.d is empty, but the idea is that admins can symlink to /etc/falco/rules.available for applications they want to enable. This will make it easier to add application-specific rulesets that admins can opt-in to. * Unit test for reading rules from directory Copy the rules/trace file from the test multiple_rules to a new test rules_directory. The rules files are in rules/rules_dir/{000,001}*.yaml, and the test uses a rules_file argument of rules_dir. Ensure that the same events are detected.	2018-04-05 17:03:37 -07:00
Mark Stemm	a5daf8b058	Allow append skipped rules (#346 ) * Allow appending to skipped rules If a rule has an append attribute but the original rule was skipped (due to having lower priority than the configured priority), silently skip the appending rule instead of returning an error. * Unit test for appending to skipped rules Unit test verifies fix for appending to skipped rules. One rules file defines a rule with priority WARNING, a second rules file appends to that rules file, and the configured priority is ERROR. Ensures that falco rules without errors.	2018-04-05 10:28:45 -07:00
Mark Stemm	88327abb41	Unit test for fd.net + in operator fixes (#343 ) Tests fix for https://github.com/draios/falco/issues/339. Depends on https://github.com/draios/sysdig/pull/1091.	2018-04-04 14:23:21 -07:00
Mark Stemm	2a3ca21779	Skip output json format (#342 ) * Add option to exclude output property in json fmt New falco.yaml option json_include_output_property controls where the formatted string "output" is included in the json object when json output is enabled. By default the string is included. * Add tests for new json output option New test sets json_include_output_property to false and then verifies that the json output does not contain the surrounding text "Warning an open...".	2018-03-28 11:24:09 -07:00
Mark Stemm	947faca334	Rule updates 2018 02.v2 (#326 ) * Let OMS agent for linux write config Programs are omiagent/omsagent/PerformInventor/in_heartbeat_r* and files are below /etc/opt/omi and /etc/opt/microsoft/omsagent. * Handle really long classpath lines for cassandra Some cassandra cmdlines are so long the classpath truncates the cmdline before the actual entry class gets named. In those cases also look for cassandra-specific config options. * Let postgres binaries read sensitive files Also add a couple of postgres cluster management programs. * Add apt-add-reposit(ory) as a debian mgmt program * Add addl info to debug writing sensitive files Add parent/grandparent process info. * Requrire root directory files to contain / In some cases, a file below root might be detected but the file itself has no directory component at all. This might be a bug with dropped events. Make the test more strict by requiring that the file actually contains a "/". * Let updmap read sensitive files Part of texlive (https://www.tug.org/texlive/) * For selected rules, require proc name to exist Some rules such as reading sensitive files and writing below etc have many exceptions that depend on the process name. In very busy environments, system call events might end up being dropped, which causes the process name to be missing. In these cases, we'll let the sensitive file read/write below etc to occur. That's handled by a macro proc_name_exists, which ensures that proc.name is not "<NA>" (the placeholder when it doesn't exist). * Let ucf write generally below /etc ucf is a general purpose config copying program, so let it generally write below /etc, as long as it in turn is run by the apt program "frontend". * Add new conf writers for couchdb/texmf/slapadd Each has specific subdirectories below /etc * Let sed write to addl temp files below /etc Let sed write to additional temporary files (some directory + "sed") below /etc. All generally related to package installation scripts. * Let rabbitmq(ctl) spawn limited shells Let rabbitmq spawn limited shells that perform read-only tasks like reading processes/ifaces. Let rabbitmqctl generally spawn shells. * Let redis run startup/shutdown scripts Let redis run specific startup/shutdown scripts that trigger at start/stop. They generally reside below /etc/redis, but just looking for the names redis-server.{pre,post}-up in the commandline. * Let erlexec spawn shells https://github.com/saleyn/erlexec, "Execute and control OS processes from Erlang/OTP." * Handle updated trace files As a part of these changes, we updated some of the positive trace files to properly include a process name. These newer trace files have additional opens, so update the expected event counts to match. * Let yum-debug-dump write to rpm database * Additional config writers Symantec AV for Linux, sosreport, semodule (selinux), all with their config files. * Tidy up comments a bit. * Try protecting node apps again Try improving coverage of run shell untrusted by looking for shells below node processes again. Want to see how many FPs this causes before fully committing to it. * Let node run directly by docker count as a service Generally, we don't want to consider all uses of node as a service wrt spawned shells. But we might be able to consider node run directly by docker as a "service". So add that to protected_shell_spawner. * Also add PM2 as a protected shell spawner This should handle cases where PM2 manages node apps. * Remove dangling macros/lists Do a pass over the set of macros/lists, removing most of those that are no longer referred to by any macro/list. The bulk of the macros/lists were related to the rule Run Shell Untrusted, which was refactored to only detect shells run below specific programs. With that change, many of these exceptions were no longer neeeded. * Add a "never_true" macro Add a never_true macro that will never match any event. Useful if you want to disable a rule/macro/etc. * Add missing case to write_below_etc Add the macro veritas_writing_config to write_below_etc, which was mistakenly not added before. * Make tracking shells spawned by node optional The change to generally consider node run directly in a container as a protected shell spawner was too permissive, causing false positives. However, there are some deployments that want to track shells spawned by node as suspect. To address this, create a macro possibly_node_in_container which defaults to never matching (via the never_true) macro. In a user rules file, you can override the macro to remove the never_true clause, reverting to the old behavior. * Add some dangling macros/lists back Some macros/lists are still referred to by some widely used user rules files, so add them back temporarily.	2018-02-26 13:26:28 -05:00
Mark Stemm	8aeef034a6	Remove installer-related traces We removed the installer-related rules, so remove the installer-related traces as well.	2018-01-17 17:40:38 -08:00
Mark Stemm	d6d975e28c	Refactor shell rules (#301 ) * Refactor shell rules to avoid FPs. Refactoring the shell related rules to avoid FPs. Instead of considering all shells suspicious and trying to carve out exceptions for the legitimate uses of shells, only consider shells spawned below certain processes suspicious. The set of processes is a collection of commonly used web servers, databases, nosql document stores, mail programs, message queues, process monitors, application servers, etc. Also, runsv is also considered a top level process that denotes a service. This allows a way for more flexible servers like ad-hoc nodejs express apps, etc to denote themselves as a full server process. * Update event generator to reflect new shell rules spawn_shell is now a silent action. its replacement is spawn_shell_under_httpd, which respawns itself as httpd and then runs a shell. db_program_spawn_binaries now runs ls instead of a shell so it only matches db_program_spawn_process. * Comment out old shell related rules * Modify nodejs example to work w/ new shell rules Start the express server using runit's runsv, which allows falco to consider any shells run by it as suspicious. * Use the updated argument for mkdir In https://github.com/draios/sysdig/pull/757 the path argument for mkdir moved to the second argument. This only became visible in the unit tests once the trace files were updated to reflect the other shell rule changes--the trace files had the old format. * Update unit tests for shell rules changes Shell in container doesn't exist any longer and its functionality has been subsumed by run shell untrusted. * Allow git binaries to run shells In some cases, these are run below a service runsv so we still need exceptions for them. * Let consul agent spawn curl for health checks * Don't protect tomcat There's enough evidence of people spawning general commands that we can't protect it. * Reorder exceptions, add rabbitmq exception Move the nginx exception to the main rule instead of the protected_shell_spawner macro. Also add erl_child_setup (related to rabbitmq) as an allowed shell spawner. * Add additional spawn binaries All off these are either below nginx, httpd, or runsv but should still be allowed to spawn shells. * Exclude shells when ancestor is a pkg mgmt binary Skip shells when any process ancestor (parent, gparent, etc) is a package management binary. This includes the program needrestart. This is a deep search but should prevent a lot of other more detailed exceptions trying to find the specific scripts run as a part of installations. * Skip shells related to serf Serf is a service discovery tool and can in some cases be spawned by apache/nginx. Also allow shells that are just checking the status of pids via kill -0. * Add several exclusions back Add several exclusions back from the shell in container rule. These are all allowed shell spawns that happen to be below nginx/fluentd/apache/etc. * Remove commented-out rules This saves space as well as cleanup. I haven't yet removed the macros/lists used by these rules and not used anywhere else. I'll do that cleanup in a separate step. * Also exclude based on command lines Add back the exclusions based on command lines, using the existing set of command lines. * Add addl exclusions for shells Of note is runsv, which means it can directly run shells (the ./run and ./finish scripts), but the things it runs can not. * Don't trigger on shells spawning shells We'll detect the first shell and not any other shells it spawns. * Allow "runc:" parents to count as a cont entrypnt In some cases, the initial process for a container can have a parent "runc:[0:PARENT]", so also allow those cases to count as a container entrypoint. * Use container_entrypoint macro Use the container_entrypoint macro to denote entering a container and also allow exe to be one of the processes that's the parent of an entrypoint.	2017-11-28 07:04:37 -08:00
Mark Stemm	080305c7a0	Adjust for new severity Shell in container is now debug level, so adjust test case to match.	2017-10-09 13:05:12 -07:00
Mark Stemm	283c6eea99	Fully remove falco package. In case there are modified config files from a prior install.	2017-10-05 18:28:29 -07:00
Mark Stemm	aa073586f1	Add ability to filter events by priority/cleanups Clean up the handling of priority levels within rules. It used to be a mix of strings handled in various places. Now, in falco_common.h there's a consistent type for priority-as-number as well as a list of priority-as-string values. Priorities are passed around as numbers instead of strings. It's still permissive about capitalization. Also add the ability to load rules by severity. New falco config option "priority=<val>"/-o priority=<val> specifies the minimum priority level of rules that will be loaded. Add unit tests for same. The test suppresses INFO notifications for a rule/trace file combination that would otherwise generate them.	2017-10-05 18:07:54 -07:00
Mark Stemm	a38f7f181b	Add ability to append to rules/macros Add the ability to append to rules/macros, like we already do with lists. For rules/macros, if the object has an append: true key, the condition value is appended to the condition of an existing rule/macro with the same name. Like lists, it's an error to specify append: true without there being an existing rule/macro. Also add tests that test the same kind of things we did for lists: - That append: true really does append - That append: false overwrites the rule/macro - That it's an error to append with a prior rule/macro existing.	2017-09-22 17:08:00 -07:00
Mark Stemm	0bc2d4f162	Automated tests for list append. Test the case of appending to a list and appending to a nonexistent list (should error).	2017-08-10 09:36:31 -07:00
Mark Stemm	eecc92736b	Add unit tests for list substitution/order Add new unit tests to check that list substitution is working as expected, with test cases for the list substitution occurring at the beginning, middle, and end of a condition. Also add tests that verify that overrides on list/macro/rule names always occur in order.	2017-06-30 15:12:43 -07:00
Mark Stemm	38f488bfda	Beta rule updates (#247 ) * Updates from beta customers. - add anacron as a cron program * Reorganize package management binaries Split package_management_binaries into two separate lists rpm_binaries and deb_binaries. unattended-upgr is common to both worlds so it's still in package_management_binaries. Also change Write below rpm database to use rpm_binaries instead of its own list. Also add 75-system-updat (truncated) as a shell spawner. * Add rules for jenkins Add rules that allow jenkins to spawn shells, both in containers and directly on the host. Also handle jenkins slaves that run /tmp/slave.jar. * Allow npm to run shells. Not yet allowing node to run shells itself, although we want to add something to reduce node-related FPs. * Allow urlgrabber/git-remote to access /etc urlgrabber and git-remote both try to access the RHEL nss database, containing shared certificates. I may change this in a more general way by changing open_read/open_write to only look for successful opens. * Only look for successful open_read/open_writes Change the macros open_read/open_write to only trigger on successful opens (when fd.num > 0). This is a pretty big change to behavior, but is more intuitive. This required a small update to the open counts for a couple of unit tests, but otherwise they still all passed with this change. * Allow rename_device to write below /dev Part of udev. * Allow cloud-init to spawn shells. Part of https://cloud-init.io/ * Allow python to run a shell that runs sdchecks sdchecks is a part of the sysdig monitor agent. * Allow dev creation binaries to write below etc. Specifically this includes blkid and /etc/blkid/blkid.tab. * Allow git binaries to spawn shells. They were already allowed to run shells in a container. * Add /dev/kmsg as an allowed /dev file Allows userspace programs to write to kernel log. * Allow other make programs to spawn shells. Also allow gmake/cmake to spawn shells and put them in their own list make_binaries. * Add better mesos support. Mesos slaves appear to be in a container due to their cgroup and can run programs mesos-health-check/mesos-docker-exec to monitor the containers on the slave, so allow them to run shells. Add mesos-agent, mesos-logrotate, mesos-fetch as shell spawners both in and out of containers. Add gen_resolvconf. (short for gen_resolvconf.py) as a program that can write to /etc. Add toybox (used by mesos, part of http://landley.net/toybox/about.html) as a shell spawner. * systemd can listen on network ports. Systemd can listen on network ports to launch daemons on demand, so allow it to perform network activity. * Let docker binaries setuid. Let docker binaries setuid and add docker-entrypoi (truncation intentional) to the set of docker binaries. * Change cis-related rules to be less noisy Change the two cis-related falco rules "File Open by Privileged Container" and "Sensitive Mount by Container" to be less noisy. We found in practice that tracking every open still results in too many falco notifications. For now, change the rules to only track the initial process start in the container by looking for vpid=1. This should result in only triggering when a privileged/sensitive mount container is started. This is slightly less coverage but is far less noisy. * Add quay.io/sysdig as trusted containers These are used for sysdig cloud onpremise deployments. * Add gitlab-runner-b(uild) as a gitlab binary. Add gitlab-runner-b (truncated gitlab-runner-build) as a gitlab binary. * Add ceph as a shell spawner. Also allow ceph to spawn shells in a container. * Allow some shells by command line. For some mesos containers, where the container doesn't have an image and is just a tarball in a cgroup/namespace, we don't have any image to work with. In those cases, allow specific command lines. * Allow user 'nobody' to setuid. Allow the user nobody to setuid. This depends on the user nobody being set up in the first place to have no access, but that should be an ok assumption. * Additional allowed shell commandlines * Add additional shells. * Allow multiple users to become themself. Add rule somebody_becoming_themself that handles cases of nobody and www-data trying to setuid to themself. The sysdig filter language doesn't support template/variable values to allow "user.name=X and evt.arg.uid=X for a given X", so we have to enumerate the users. * More known spawn command lines * Let make binaries be run in containers. Some CI/CD pipelines build in containers. * Add additional shell spawning command lines * Add additional apt program apt-listchanges. * Add gitlab-ce as shell spawning container. * Allow PM2 to spawn shells in containers. Was already in the general list, seen in some customers, so adding to the in containers list. * Clean up pass to fix long lines. Take a pass through the rules making sure each line is < 120 characters. * Change tests for privileged container rules. Change unit tests to reflect the new privileged/sensitive mount container rules that only detect container launch.	2017-06-19 11:28:15 -07:00
Mark Stemm	5bafa198c6	Update automated tests to handle new priority lvls The default falco ruleset now has a wider variety of priorities, so adjust the automated tests to match: - Instead of creating a generic test yaml entry for every trace file in traces-{positive,negative,info} with assumptions about detect levels, add a new falco_traces.yaml.in multiplex file that has specific information about the detect priorities and rule detect counts for each trace file. - If a given trace file doesn't have a corresponding entry in falco_traces.yaml.in, a generic entry is added with a simple detect: (True\|False) value and level. That way you can get specific detect levels/counts for existing trace files, but if you forget to add a trace to falco_traces.yaml.in, you'll still get some coverage. - falco_tests.yaml.in isn't added to any longer, so rename it to falco_tests.yaml. - Avocado is now run twice--once on each yaml file. The final test passes if both avocado runs pass.	2017-05-25 12:15:35 -07:00
Mark Stemm	73fbbdb577	Add automated tests for packages/driver installs Add automated tests for running falco from a package and container. As a result, this will also test building the kernel module as well as runnning falco-probe-loader as a backup. In travis.yml, switch to the docker-enabled vm and install dkms. This changed the environment slightly, so change how avocado's python dependencies are installed. After building falco, copy the .deb package to docker/local and build a local docker image based on that package. Add the following new tests: - docker_package: this uses "docker run" to run the image created in travis.yml. This includes using dkms to build the kernel module and load it. In addition, the conf directory is mounted to /host/conf, the rules directory is mounted to /host/rules, and the traces directory is mounted to /host/traces. - docker_package_local_driver: this disables dkms via a volume mount that maps /dev/null to /usr/sbin/dkms and copies the kernel module by hand into the container to /root/.sysdig/falco-probe-....ko. As a result, falco-probe-loader will use the local kernel module instead of building one itself. - debian_package: this installs the .deb package and runs the installed version of falco. Ideally, there'd also be a test for downloading the driver, but since the driver depends on the kernel as well as the falco version string, you can't put a single driver on download.draios.com that will work long-term. These tests depend on the following new test attributes: - package: if present, this points to the docker image/debian package to install. - addl_docker_run_args: if present, will be added to the docker run command. - copy_local_driver: if present, will copy the built kernel module to ~/.sysdig. ~/.sysdig/* is always cleared out before each test. - run_duration: maps to falco's -M <secs> flag - trace_file is now optional. Also add some misc general test changes: - Clean up our use of process.run. By default it will fail a test if the run program returns non-zero, so we don't have to grab the exit status. In addition, get rid of sudo in the command lines and use the sudo attribute instead. - Fix some tests that were writing to files below /tmp/falco_outputs by creating the directory first. Useful when running avocado directly.	2017-03-24 16:54:42 -07:00
Mark Stemm	ec5adfe892	Build and package standalone falco kernel module Start packaging (and building when necessary) a falco-specific kernel module in falco releases. Previously, falco would depend on sysdig and use its kernel module instead. The kernel module was already templated to some degree in various places, so we just had to change the templated name from sysdig/sysdig-probe to falco/falco-probe. In containers, run falco-probe-loader instead of sysdig-probe-loader. This is actually a script in the sysdig repository which is modified in https://github.com/draios/sysdig/pull/789, and uses the filename to indicate what kernel module to build and/or load. For the falco package itself, don't depend on sysdig any longer but instead depend on dkms and its dependencies, using sysdig as a guide on the set of required packages. Additionally, for the package pre-install/post-install scripts start running falco-probe-loader. Finally, add a --version argument to falco so it can pass the desired version string to falco-probe-loader.	2017-03-20 15:56:37 -07:00
Mark Stemm	185729d5d6	Address feedback from PR - Instead of having a possibly null string pointer as the argument to enable_* and process_event, have wrapper versions that assume a default falco ruleset. The default ruleset name is a static member of the falco_engine class, and the default ruleset id is created/found in the constructor. - This makes the whole mechanism simple enough that it doesn't require seprarate testing, so remove the capability within falco to read a ruleset from the environment and remove automated tests that specify a ruleset. - Make pattern/tags/ruleset arguments to enable_* functions const. (I'll squash this down before I commit)	2017-02-10 11:54:30 -08:00
Mark Stemm	88faa7c1e7	Add automated tests for tagged rules Add automated tests that verify the ability to tag sets of rules, disable them with -T, and run them with -t, works: - New test option disable_tags adds -T <tag> arguments to the falco command line, and run_tags adds -t <tag> arguments to the falco command line. - A new trace file open-multiple-files.scap opens 13 different files, and a new rules file has 13 different rules with all combinations of the tags a, b, c (both forward and backward), a rule with an empty list of tags, a rule with no tags field, and a rule with a completely different tag d. Using the above, add tests for: - Both disabling all combations of a, b, c using disable_tags as well as run all combinations of a, b, c, using run_tags. - Specifying both disabled (-T/-D) and enabled (-t) rules. Not allowed. - Specifying a ruleset while having tagged rules enabled, rules based on a name disabled, and no particular rules enabled or disabled.	2017-02-08 11:08:36 -08:00
Mark Stemm	e0a5034a43	Ensure falco-event-generator actions are detected. A new trace file falco-event-generator.scap contains the result of running the falco event generator in docker, via: docker run --security-opt seccomp=unconfined sysdig/falco-event-generator:latest /usr/local/bin/event_generator --once Make sure this trace file detects the exact set of events we expect for each rule. This required adding a new verification method check_detections_by_rule that finds the per-rule counts and compares them to the expected counts, which are included in the test description under the key "detect_counts". This is the first time a trace file for a test is actually in one of the downloaded zip files. This means it will be tested twice (one for simple detect-or-not, once for actual counts). Adding this test showed a problem with Run shell in container rule--since sysdig/falco-event-generator startswith sysdig/falco, it was being treated as a trusted container. Modify the macro trusted_containers to not allow falco-event-generator to be trusted.	2017-02-01 15:02:44 -08:00
Mark Stemm	f4bb49f1f5	Add test for truncated outputs. Add a test that specifically tests truncated outputs. A rule contains an output field %fd.cport which has no value for an open event. Ensure that the rule's output has <NA> for the cport and the remainder of the rule's output is filled in.	2017-01-03 12:58:01 -08:00
Mark Stemm	9ecdf30314	tests for overriding rules/macros/lists New tests that test every possible override: - Overriding a rule with one that doesn't match - Overriding a macro to one that doesn't match - Overriding a top level list to a binary that doesn't match - Overriding an embedded list to one that doesn't match In each case, the override results in no longer matching an open by the program "cat".	2016-12-29 13:32:55 -08:00
Mark Stemm	2a2dcaf25d	Modify plotting script to handle drop stats. New argument --metric, which can be cpu\|drops, controls whether to graph cpu usage or event drop percentage. Titles/axis labels/etc. change appropriately.	2016-12-22 12:55:36 -08:00
Mark Stemm	e6aefef4eb	Add ability to write "extra" stuff to stats file. When run via scripts like run_performance_tests.sh, it's useful to include extra info like the test being run and the specific program variant to the stats file. So support that via the environment. Environment keys starting with FALCO_STATS_EXTRA_XXX will have the XXX and environment value added to the stats file. It's undocumented as I doubt other programs will need this functionality and it keeps the docs simpler.	2016-12-22 12:55:36 -08:00
Mark Stemm	8b116c2ad1	Add unit test for rule with invalid output. Add the ability to check falco's return code with exit_status and to generally match stderr with stderr_contains in a test. Use those to create a test that has an invalid output expression using %not_a_real_field. It expects falco to exit with 1 and the output to contain a message about the invalid output.	2016-12-22 12:55:36 -08:00
Mark Stemm	c6c074ef60	Allow run_performance_tests to run test_mm. Make necessary changes to allow run_performance_tests to invoke the 'test_mm' program we use internally. Also add ability to run with a build directory separate from the source directory and to specify an alternate rules file. Finally, set up the kubernetes demo using sudo, a result of recent changes.	2016-12-22 12:55:36 -08:00
Mark Stemm	1db2339ece	Add test for enabled flag. New test case disables a rule that would otherwise match.	2016-10-24 15:56:45 -07:00
Mark Stemm	897df28036	Add regression tests for configurable outputs. - In the regression tests, make the config file configurable in the multiplex file via 'conf_file'. - A new multiplex file item 'outputs' containing a list of <filename>: <regex> tuples. For each item, the test reads the file and matches each line against the regex. A match must be found for the test to pass. - Add 2 new tests that test file output and program output. They write to files below /tmp/falco_outputs/ and the contents are checked to ensure that alerts are written.	2016-10-24 15:56:45 -07:00
Mark Stemm	81a145fd4f	Verifying rule names can have spaces. Related to discussion on https://github.com/draios/agent/pull/160, verifying we can have rule names with spaces.	2016-10-24 15:56:45 -07:00
Mark Stemm	c140b23678	Add tests for multiple files, disabled rules. Add test that cover reading from multiple sets of rule files and disabling rules. Specific changes: - Modify falco to allow multiple -r arguments to read from multiple files. - In the test multiplex file, add a disabled_rules attribute, containing a sequence of rules to disable. Result in -D arguments when running falco. - In the test multiplex file, 'rules_file' can be a sequence. It results in multiple -r arguments when running falco. - In the test multiplex file, 'detect_level' can be a squence of multiple severity levels. All levels will be checked for in the output. - Move all test rules files to a rules subdirectory and all trace files to a traces subdirectory. - Add a small trace file for a simple cat of /dev/null. Used by the new tests. - Add the following new tests: - Reading from multiple files, with the first file being empty. Ensure that the rules from the second file are properly loaded. - Reading from multiple files with the last being empty. Ensures that the empty file doesn't overwrite anything from the first file. - Reading from multiple files with varying severity levels for each rule. Ensures that both files are properly read. - Disabling rules from a rules file, both with full rule names and regexes. Will result in not detecting anything.	2016-10-24 15:56:45 -07:00
Mark Stemm	917d66e9e8	Create embeddable falco engine. Create standalone classes falco_engine/falco_outputs that can be embedded in other programs. falco_engine is responsible for matching events against rules, and falco_output is responsible for formatting an alert string given an event and writing the alert string to all configured outputs. falco_engine's main interfaces are: - load_rules/load_rules_file: Given a path to a rules file or a string containing a set of rules, load the rules. Also loads needed lua code. - process_event(): check the event against the set of rules and return the results of a match, if any. - describe_rule(): print details on a specific rule or all rules. - print_stats(): print stats on the rules that matched. - enable_rule(): enable/disable any rules matching a pattern. New falco command line option -D allows you to disable one or more rules on the command line. falco_output's main interfaces are: - init(): load needed lua code. - add_output(): add an output channel for alert notifications. - handle_event(): given an event that matches one or more rules, format an alert message and send it to any output channels. Each of falco_engine/falco_output maintains a separate lua state and loads separate sets of lua files. The code to create and initialize the lua state is in a base class falco_common. falco_engine no longer logs anything. In the case of errors, it throws exceptions. falco_logger is now only used as a logging mechanism for falco itself and as an output method for alert messages. (This should really probably be split, but it's ok for now). falco_engine contains an sinsp_evttype_filter object containing the set of eventtype filters. Instead of calling m_inspector->add_evttype_filter() to add a filter created by the compiler, call falco_engine::add_evttype_filter() instead. This means that the inspector runs with a NULL filter and all events are returned from do_inspect. This depends on https://github.com/draios/sysdig/pull/633 which has a wrapper around a set of eventtype filters. Some additional changes along with creating these classes: - Some cleanups of unnecessary header files, cmake include_directory()s, etc to only include necessary includes and only include them in header files when required. - Try to avoid 'using namespace std' in header files, or assuming someone else has done that. Generally add 'using namespace std' to all source files. - Instead of using sinsp_exception for all errors, define a falco_engine_exception class for exceptions coming from the falco engine and use it instead. For falco program code, switch to general exceptions under std::exception and catch + display an error for all exceptions, not just sinsp_exceptions. - Remove fields.{cpp,h}. This was dead code. - Start tracking counts of rules by priority string (i.e. what's in the falco rules file) as compared to priority level (i.e. roughtly corresponding to a syslog level). This keeps the rule processing and rule output halves separate. This led to some test changes. The regex used in the test is now case insensitive to be a bit more flexible. - Now that https://github.com/draios/sysdig/pull/632 is merged, we can delete the rules object (and its lua_parser) safely. - Move loading the initial lua script to the constructor. Otherwise, calling load_rules() twice re-loads the lua script and throws away any state like the mapping from rule index to rule. - Allow an empty rules file. Finally, fix most memory leaks found by valgrind: - falco_configuration wasn't deleting the allocated m_config yaml config. - several ifstreams were being created simply to test which falco config file to use. - In the lua output methods, an event formatter was being created using falco.formatter() but there was no corresponding free_formatter(). This depends on changes in https://github.com/draios/sysdig/pull/640.	2016-10-24 15:56:45 -07:00
Mark Stemm	f05bb2b3ec	Add ability to run agent for performance tests. When the root directory contains the name 'agent', assume we're running an agent and provide appropriate configuration and run the agent using dragent. You can make autodrop or falco configurable within the agent via --agent-autodrop and --falco-agent. Also include some other small changes like timestamping the json points.	2016-08-04 16:03:07 -07:00
Mark Stemm	7b68fc2692	Add tests for event type rule identification Add tests that verify that the event type identification functionality is working. Notable changes: - Modify falco_test.py to additionally check for warnings when loading any set of rules and verify that the event types for each rule match expected values. This is controlled by the new multiplex fields "rules_warning" and "rules_events". - Instead of starting with an empty falco_tests.yaml from scratch from the downloaded trace files, use a checked-in version which defines two tests: - Loading the checked-in falco_rules.yaml and verify that no rules have warnings. - A sample falco_rules_warnings.yaml that has ~30 different mutations of rule filtering expressions. The test verifies for each rule whether or not the rule should result in a warning and what the extracted event types are. The generated tests from the trace files are appended to this file. - Add an empty .scap file to use with the above tests.	2016-07-18 11:26:28 -07:00
Mark Stemm	b76423b31d	Useful scripts to collect/display perf results. Add shell scripts to make it easier to collect performance results from traces, live tests, and phoronix tests. With run_performance_tests.sh you specify the following: - a subject program to run, using --root - a name to give to this set of results, using --variant - a test to run, using --test - a file to write the results to, using --results. For tests that start with "trace", the script runs falco/sysdig on the trace file and measures the time taken to read the file. For other tests, he script handles starting falco/sysdig, starting a cpu measurement script (a wrapper around top, just to provide identical values to what you would see using top) to measure the cpu usage of falco/sysdig, and running a live test. The measurement interval for cpu usage depends on the test being run--10 seconds for most tests, 2 seconds for shorter tests. The output is written as json to the file specified in --results. Also add R scripts to easily display the results from the shell script. plot-live.r shows a linechart of the cpu usage for the provided variants over time. plot-traces.r shows grouped barcharts showing user/system/total time taken for the provided variants and traces. One bug--you have to make the results file actual json by adding leading/trailing []s.	2016-07-18 10:45:30 -07:00
Mark Stemm	8ffb553c75	Add ability to run branch-specific trace files. Pass the travis branch to run_regression_tests.sh. When downloading trace files, first look for a file traces-XXX-$BRANCH and if found download it. This allows testing out a set of changes with a trace file specifically for that branch, that can be moved to the normal file once the PR is merged. Also increase the timeout for the spawned falco process from 1 to 3 minutes. In debug mode, the kubernetes demo was taking slightly over 1 minute.	2016-07-12 08:22:29 -07:00
Mark Stemm	995e61210e	Add regression tests for json output. Modify falco_test.py to look for a boolean multiplex attribute 'json_output'. If true, examine the lines of the output and for any line that begins with '{', parse it as json and ensure it has the 4 attributes we expect. Modify run_regression_tests to have a utility function prepare_multiplex_fileset that does the work of looping over files in a directory, along with detect, level, and json output arguments. The appropriate multiplex attributes are added for each file. Use that utility function to test json output for the positive and informational directories along with non-json output. The negative directory is only tested once.	2016-06-07 14:04:53 -07:00
Mark Stemm	fc6d775e5b	Add additional rules/tests for pipe installers. Add additional rules related to using pipe installers within a fbash session: - Modify write_etc to only trigger if not in a fbash session. There's a new rule write_etc_installer which has the same conditions when in a fbash session, logging at INFO severity. - A new rule write_rpm_database warns if any non package management program tries to write below /var/lib/rpm. - Add a new warning if any program below a fbash session tries to open an outbound network connection on ports other than http(s) and dns. - Add INFO level messages when programs in a fbash session try to run package management binaries (rpm,yum,etc) or service management (systemctl,chkconfig,etc) binaries. In order to test these new INFO level rules, make up a third class of trace files traces-info.zip containing trace files that should result in info-level messages. To differentiate warning and info level detection, add an attribute to the multiplex file "detect_level", which is "Warning" for the files in traces-positive and "Info" for the files in traces-info. Modify falco_test.py to look specifically for a non-zero count for the given detect_level. Doing this exposed a bug in the way the level-specific counts were being recorded--they were keeping counts by level name, not number. Fix that.	2016-06-06 10:29:41 -07:00
Mark Stemm	b3ae480fac	Another round of rule cleanups. Do another round of rule cleanups now that we have a larger set of positive and negative trace files to work with. Outside of this commit, there are now trace files for all the positive rules, a docker-compose startup and teardown, and some trace files from the sysdig cloud staging environment. Also add a script that runs sysdig with a filter that removes all the syscalls not handled by falco as well as a few other high-volume, low-information syscalls. This script was used to create the staging environment trace files. Notable rule changes: - The direction for write_binary_dir/write_etc needs to be exit instead of enter, as the bin_dir clause works on the file descriptor returned by the open/openat call. - Add login as a trusted binary that can read sensitive files (occurs for direct console logins). - sshd can read sensitive files well after startup, so exclude it from the set of binaries that can trigger read_sensitive_file_trusted_after_startup. - limit run_shell_untrusted to non-containers. - Disable the ssh_error_syslog rule for now. With the current restriction on system calls (no read/write/sendto/recvfrom/etc), you won't see the ssh error messages. Nevertheless, add a string to look for to indicate ssh errors and add systemd's true location for the syslog device. - Sshd attemps to setuid even when it's not running as root, so exclude it from the set of binaries to monitor for now. - Let programs that are direct decendants of systemd spawn user management tasks for now. - Temporarily disable the EACCESS rule. This rule is exposing a bug in sysdig in debug mode, https://github.com/draios/sysdig/issues/598. The rule is also pretty noisy so I'll keep it disabled until the sysdig bug is fixed. - The etc_dir and bin_dir macros both have the problem that they match pathnames with /etc/, /bin/, etc in the middle of the path, as sysdig doesn't have a "begins with" comparison. Add notes for that. - Change spawn_process to spawned_process to indicate that it's for the exit side of the execve. Also use it in a few places that were looking for the same conditions without any macro. - Get rid of adduser_binaries and fold any programs not already present into shadowutils_binaries. - Add new groups sysdigcloud_binaries and sysdigcloud_binaries_parent and add them as exceptions for write_etc/write_binary_dir. - Add yum as a package management binary and add it as an exception to write_etc/write_binary_dir. - Change how db_program_spawned_process works. Since all of the useful information is on the exit side of the event, you can't really add a condition based on the process being new. Isntead, have the rule check for a non-database-related program being spawned by a database-related program. - Allow dragent to run shells. - Add sendmail, sendmail-msp as a program that attempts to setuid. - Some of the *_binaries macros that were based on dpkg -L accidentally contained directories in addition to end files. Trim those. - Add systemd-logind as a login_binary. - Add unix_chkpwd as a shadowutils_binary. - Add parentheses around any macros that group items using or. I found this necessary when the macro is used in the middle of a list of and conditions. - Break out system_binaries into a new subset user_mgmt_binaries containing login_, passwd_, and shadowutils_ binaries. That way you don't have to pull in all of system_binaries when looking for sensisitive files or user management activity. - Rename fs-bash to fbash, thinking ahead to its more likely name.	2016-05-25 17:40:01 -07:00
Mark Stemm	4751546c03	Add correctness tests using Avocado Start using the Avocado framework for automated regression testing. Create a test FalcoTest in falco_test.py which can run on a collection of trace files. The script test/run_regression_tests.sh is responsible for pulling zip files containing the positive (falco should detect) and negative (falco should not detect) trace files, creating a Avocado multiplex file that defines all the tests (one for each trace file), running avocado on all the trace files, and showing full logs for any test that didn't pass. The old regression script, which simply ran falco, has been removed. Modify falco's stats output to show the total number of events detected for use in the tests. In travis.yml, pull a known stable version of avocado and build it, including installing any dependencies, as a part of the build process.	2016-05-24 13:56:48 -07:00
Mark Stemm	450c347ef3	Add a basic test to run falco. Add a basic test that loads the kernel module from the source directory and runs falco. No testing of behavior yet.	2016-05-17 17:43:09 -07:00
Mark Stemm	c9d2550ecd	Add minimal travis support. Add minimal travis.yml file that builds and packages falco. No actual tests yet.	2016-05-17 16:16:34 -07:00

49 Commits