* Add falco service to k8s install/update labels
Update the instructions for K8s RBAC installation to also create a
service that maps to port 8765 of the falco pod. This allows other
services to access the embedded webserver within falco.
Also clean up the set of labels to use a consistent app: falco-example,
role:security for each object.
* Cange K8s Audit Example to use falco daemonset
Change the K8s Audit Example instructions to use minikube in conjunction
with a falco daemonset running inside of minikube. (We're going to start
prebuilding kernel modules for recent minikube variants to make this
possible).
When running inside of minikube in conjunction with a service, you have
to go through some additional steps to find the ClusterIP associated
with the falco service and use that ip when configuring the k8s audit
webhook. Overall it's still a more self-contained set of instructions,
though.
* Add new json/webserver libs, embedded webserver
Add two new external libraries:
- nlohmann-json is a better json library that has stronger use of c++
features like type deduction, better conversion from stl structures,
etc. We'll use it to hold generic json objects instead of jsoncpp.
- civetweb is an embeddable webserver that will allow us to accept
posted json data.
New files webserver.{cpp,h} start an embedded webserver that listens for
POSTS on a configurable url and passes the json data to the falco
engine.
New falco config items are under webserver:
- enabled: true|false. Whether to start the embedded webserver or not.
- listen_port. Port that webserver listens on
- k8s_audit_endpoint: uri on which to accept POSTed k8s audit events.
(This commit doesn't compile entirely on its own, but we're grouping
these related changes into one commit for clarity).
* Don't use relative paths to find lua code
You can look directly below PROJECT_SOURCE_DIR.
* Reorganize compiler lua code
The lua compiler code is generic enough to work on more than just
sinsp-based rules, so move the parts of the compiler related to event
types and filterchecks out into a standalone lua file
sinsp_rule_utils.lua.
The checks for event types/filterchecks are now done from rule_loader,
and are dependent on a "source" attribute of the rule being
"sinsp". We'll be adding additional types of events next that come from
sources other than system calls.
* Manage separate syscall/k8s audit rulesets
Add the ability to manage separate sets of rules (syscall and
k8s_audit). Stop using the sinsp_evttype_filter object from the sysdig
repo, replacing it with falco_ruleset/falco_sinsp_ruleset from
ruleset.{cpp,h}. It has the same methods to add rules, associate them
with rulesets, and (for syscall) quickly find the relevant rules for a
given syscall/event type.
At the falco engine level, there are new parallel interfaces for both
types of rules (syscall and k8s_audit) to:
- add a rule: add_k8s_audit_filter/add_sinsp_filter
- match an event against rules, possibly returning a result:
process_sinsp_event/process_k8s_audit_event
At the rule loading level, the mechanics of creating filterchecks
objects is handled two factories (sinsp_filter_factory and
json_event_filter_factory), both of which are held by the engine.
* Handle multiple rule types when parsing rules
Modify the steps of parsing a rule's filter expression to handle
multiple types of rules. Notable changes:
- In the rule loader/ast traversal, pass a filter api object down,
which is passed back up in the lua parser api calls like nest(),
bool_op(), rel_expr(), etc.
- The filter api object is either the sinsp factory or k8s audit
factory, depending on the rule type.
- When the rule is complete, the complete filter is passed to the
engine using either add_sinsp_filter()/add_k8s_audit_filter().
* Add multiple output formatting types
Add support for multiple output formatters. Notable changes:
- The falco engine is passed along to falco_formats to gain access to
the engine's factories.
- When creating a formatter, the source of the rule is passed along
with the format string, which controls which kind of output formatter
is created.
Also clean up exception handling a bit so all lua callbacks catch all
exceptions and convert them into lua errors.
* Add support for json, k8s audit filter fields
With some corresponding changes in sysdig, you can now create general
purpose filter fields and events, which can be tied together with
nesting, expressions, and relational operators. The classes here
represent an instance of these fields devoted to generic json objects as
well as k8s audit events. Notable changes:
- json_event: holds a json object, used by all of the below
- json_event_filter_check: Has the ability to extract values out of a
json_event object and has the ability to define macros that associate
a field like "group.field" with a json pointer expression that
extracts a single property's value out of the json object. The basic
field definition also allows creating an index
e.g. group.field[index], where a std::function is responsible for
performing the indexing. This class has virtual void methods so it
must be overridden.
- jevt_filter_check: subclass of json_event_filter_check and defines
the following fields:
- jevt.time/jevt.rawtime: extracts the time from the underlying json object.
- jevt.value[<json pointer>]: general purpose way to extract any
json value out of the underlying object. <json pointer> is a json
pointer expression
- jevt.obj: Return the entire object, stringified.
- k8s_audit_filter_check: implements fields that extract values from
k8s audit events. Most of the implementation is in the form of macros
like ka.user.name, ka.uri, ka.target.name, etc. that just use json
pointers to extact the appropriate value from a k8s audit event. More
advanced fields like ka.uri.param, ka.req.container.image use
indexing to extract individual values out of maps or arrays.
- json_event_filter_factory: used by things like the lua parser api,
output formatter, etc to create the necessary objects and return
them.
- json_event_formatter: given a format string, create the necessary
fields that will be used to create a resolved string when given a
json_event object.
* Add ability to list fields
Similar to sysdig's -l option, add --list (<source>) to list the fields
supported by falco. With no source specified, will print all
fields. Source can be "syscall" for inspector fields e.g. what is
supported by sysdig, or "k8s_audit" to list fields supported only by the
k8s audit support in falco.
* Initial set of k8s audit rules
Add an initial set of k8s audit rules. They're broken into 3 classes of
rules:
- Suspicious activity: this includes things like:
- A disallowed k8s user performing an operation
- A disallowed container being used in a pod.
- A pod created with a privileged pod.
- A pod created with a sensitive mount.
- A pod using host networking
- Creating a NodePort Service
- A configmap containing private credentials
- A request being made by an unauthenticated user.
- Attach/exec to a pod. (We eventually want to also do privileged
pods, but that will require some state management that we don't
currently have).
- Creating a new namespace outside of an allowed set
- Creating a pod in either of the kube-system/kube-public namespaces
- Creating a serviceaccount in either of the kube-system/kube-public
namespaces
- Modifying any role starting with "system:"
- Creating a clusterrolebinding to the cluster-admin role
- Creating a role that wildcards verbs or resources
- Creating a role with writable permissions/pod exec permissions.
- Resource tracking. This includes noting when a deployment, service,
- configmap, cluster role, service account, etc are created or destroyed.
- Audit tracking: This tracks all audit events.
To support these rules, add macros/new indexing functions as needed to
support the required fields and ways to index the results.
* Add ability to read trace files of k8s audit evts
Expand the use of the -e flag to cover both .scap files containing
system calls as well as jsonl files containing k8s audit events:
If a trace file is specified, first try to read it using the
inspector. If that throws an exception, try to read the first line as
json. If both fail, return an error.
Based on the results of the open, the main loop either calls
do_inspect(), looping over system events, or
read_k8s_audit_trace_file(), reading each line as json and passing it to
the engine and outputs.
* Example showing how to enable k8s audit logs.
An example of how to enable k8s audit logging for minikube.
* Add unit tests for k8s audit support
Initial unit test support for k8s audit events. A new multiplex file
falco_k8s_audit_tests.yaml defines the tests. Traces (jsonl files) are
in trace_files/k8s_audit and new rules files are in
test/rules/k8s_audit.
Current test cases include:
- User outside allowed set
- Creating disallowed pod.
- Creating a pod explicitly on the allowed list
- Creating a pod w/ a privileged container (or second container), or a
pod with no privileged container.
- Creating a pod w/ a sensitive mount container (or second container), or a
pod with no sensitive mount.
- Cases for a trace w/o the relevant property + the container being
trusted, and hostnetwork tests.
- Tests that create a Service w/ and w/o a NodePort type.
- Tests for configmaps: tries each disallowed string, ensuring each is
detected, and the other has a configmap with no disallowed string,
ensuring it is not detected.
- The anonymous user creating a namespace.
- Tests for all kactivity rules e.g. those that create/delete
resources as compared to suspicious activity.
- Exec/Attach to Pod
- Creating a namespace outside of an allowed set
- Creating a pod/serviceaccount in kube-system/kube-public namespaces
- Deleting/modifying a system cluster role
- Creating a binding to the cluster-admin role
- Creating a cluster role binding that wildcards verbs or resources
- Creating a cluster role with write/pod exec privileges
* Don't manually install gcc 4.8
gcc 4.8 should already be installed by default on the vm we use for
travis.
Update the express version to mitigate some security vulnerabilities.
Update the port to match the one used by demo.yml.
Change to /usr/src/app so npm install works as expected.
* Use correct copyright years.
Also include the start year.
* Improve copyright notices.
Use the proper start year instead of just 2018.
Add the right owner Draios dba Sysdig.
Add copyright notices to some files that were missing them.
Replace references to GNU Public License to Apache license in:
- COPYING file
- README
- all source code below falco
- rules files
- rules and code below test directory
- code below falco directory
- entrypoint for docker containers (but not the Dockerfiles)
I didn't generally add copyright notices to all the examples files, as
they aren't core falco. If they did refer to the gpl I changed them to
apache.
* Reopen file/program outputs on SIGUSR1
When signaled with SIGUSR1, close and reopen file and program based
outputs. This is useful when combined with logrotate to rotate logs.
* Example logrotate config
Example logrotate config that relies on SIGUSR1 to rotate logs.
* Ensure options exist for all outputs
Options may not be provided for some outputs (like stdout), so create an
empty set of options in that case.
Add an example puppet module for falco. This module configures the main
falco configuration file /etc/falco/falco.yaml, providing templates for
all configuration options.
It installs falco using debian/rpm packages and installs/manages it as a
systemd service.
* Use kubernetes.default to reach k8s api server
Originally raised in #296, but since then we documented rbac and
without-rbac methods, so mirroring the change here.
* Mount docker socket/dev read-write
This matches the direct docker run commands, which also mount those
resources read-write.
* Refactor shell rules to avoid FPs.
Refactoring the shell related rules to avoid FPs. Instead of considering
all shells suspicious and trying to carve out exceptions for the
legitimate uses of shells, only consider shells spawned below certain
processes suspicious.
The set of processes is a collection of commonly used web servers,
databases, nosql document stores, mail programs, message queues, process
monitors, application servers, etc.
Also, runsv is also considered a top level process that denotes a
service. This allows a way for more flexible servers like ad-hoc nodejs
express apps, etc to denote themselves as a full server process.
* Update event generator to reflect new shell rules
spawn_shell is now a silent action. its replacement is
spawn_shell_under_httpd, which respawns itself as httpd and then runs a
shell.
db_program_spawn_binaries now runs ls instead of a shell so it only
matches db_program_spawn_process.
* Comment out old shell related rules
* Modify nodejs example to work w/ new shell rules
Start the express server using runit's runsv, which allows falco to
consider any shells run by it as suspicious.
* Use the updated argument for mkdir
In https://github.com/draios/sysdig/pull/757 the path argument for mkdir
moved to the second argument. This only became visible in the unit tests
once the trace files were updated to reflect the other shell rule
changes--the trace files had the old format.
* Update unit tests for shell rules changes
Shell in container doesn't exist any longer and its functionality has
been subsumed by run shell untrusted.
* Allow git binaries to run shells
In some cases, these are run below a service runsv so we still need
exceptions for them.
* Let consul agent spawn curl for health checks
* Don't protect tomcat
There's enough evidence of people spawning general commands that we
can't protect it.
* Reorder exceptions, add rabbitmq exception
Move the nginx exception to the main rule instead of the
protected_shell_spawner macro. Also add erl_child_setup (related to
rabbitmq) as an allowed shell spawner.
* Add additional spawn binaries
All off these are either below nginx, httpd, or runsv but should still
be allowed to spawn shells.
* Exclude shells when ancestor is a pkg mgmt binary
Skip shells when any process ancestor (parent, gparent, etc) is a
package management binary. This includes the program needrestart. This
is a deep search but should prevent a lot of other more detailed
exceptions trying to find the specific scripts run as a part of
installations.
* Skip shells related to serf
Serf is a service discovery tool and can in some cases be spawned by
apache/nginx. Also allow shells that are just checking the status of
pids via kill -0.
* Add several exclusions back
Add several exclusions back from the shell in container rule. These are
all allowed shell spawns that happen to be below
nginx/fluentd/apache/etc.
* Remove commented-out rules
This saves space as well as cleanup. I haven't yet removed the
macros/lists used by these rules and not used anywhere else. I'll do
that cleanup in a separate step.
* Also exclude based on command lines
Add back the exclusions based on command lines, using the existing set
of command lines.
* Add addl exclusions for shells
Of note is runsv, which means it can directly run shells (the ./run and
./finish scripts), but the things it runs can not.
* Don't trigger on shells spawning shells
We'll detect the first shell and not any other shells it spawns.
* Allow "runc:" parents to count as a cont entrypnt
In some cases, the initial process for a container can have a parent
"runc:[0:PARENT]", so also allow those cases to count as a container
entrypoint.
* Use container_entrypoint macro
Use the container_entrypoint macro to denote entering a container and
also allow exe to be one of the processes that's the parent of an
entrypoint.
If a daemonset specifies a command, this overrides the entrypoint. In
falco's case, the entrypoint handles the details of loading the kernel
driver, so specifying a command accidently prevents the driver from
being loaded.
This happens to work if you had a previously loaded sysdig_probe driver
lying around.
The fix is to specify args instead. In this case, the driver will be
loaded via the entrypoint.
This fixes https://github.com/draios/falco/issues/225.
Add example k8s yaml files that allow for running falco as a k8s
daemonset and the event generator as a deployment, running on 1 node.
Falco is configured to send its output to a slack webhook corresponding
to the #demo-falco-alerts channel on sysdig's public slack channel.
The output is is k8s friendly by using -pk, -k (k8s api server), and
-K (credentials to communicate with api server).
Small changes to improve the use of falco_event_generator with falco:
- In event_generator, some actions like exec_ls won't trigger
notifications on their own. So exclude them from -a all.
- For all actions, print details on what the action will do.
- For actions that won't result in a falco notification in containers,
note that in the output.
- The short version of --once wasn't working, fix the getopt.
- Explicitly saying -a all wasn't working, fix.
- Don't rely on an external ruleset in the nodejs docker-compose
demo--the built in rules are sufficient now.
Simple docker-compose environment that starts a simple express server
with a poorly-designed /api/exec/<cmd> endpoint that executes arbitrary
commands, and uses falco to detect running bash.
Adding docker-compose based example of man-in-the-middle attack against
installation scripts and how it can be detected using sysdig falco.
The docker-compose environment starts a good web server, compromised
nginx installation, evil web server, and a copy of sysdig falco. The
README walks through the process of compromising a client by using curl
http://localhost/get-software.sh | bash and detecting the compromise
using ./fbash.
The fbash program included in this example fixes https://github.com/draios/falco/issues/46.