First of a handful of PRs to start clarifying the independence of Falco
I don't see any breaking changes here, just cosmetic changes.
Signed-off-by: Kris Nova <kris@nivenly.com>
The call to rule_loader.load_rules only returns 2 values, so only pop
two values from the stack. This fixes #906.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
As a part of the changes in
https://github.com/falcosecurity/falco/pull/826/, we added several
breaking changes to rules files like renaming/removing some filter
fields. This isn't ideal for customers who are using their own rules
files.
We shouldn't break older rules files in this way, so add some minimal
backwards compatibility which adds back the fields that were
removed *and* actually used in k8s_audit_rules.yaml. They have the same
functionality as before. One exception is
ka.req.binding.subject.has_name, which was only used in a single output
field for debugging and shouldn't have been in the rules file in the
first place. This always returns the string "N/A".
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Support the notion of a message for all fields in a single class, and
make sure it's wrapped the same way as the other fields.
This is used to display a single message about how indexing works for
ka.* filter fields and what IDX_ALLOWED/IDX_NUMERIC/IDX_KEY mean,
rather than repeating the same text over and over in every field.
The wrapping is handled by a function falco::utils::wrap_text.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Refactor how JSON event/k8s audit events extract values in two important
ways:
1. An event can now extract multiple values.
2. The extracted value is a class json_event_value instead of a simple
string.
The driver for 1. was that some filtercheck fields like
"ka.req.container.privileged" actually should extract multiple values,
as a pod can have multiple containers and it doesn't make sense to
summarize that down to a single value.
The driver for 2. is that by having an object represent a single
extracted value, you can also hold things like numbers e.g. ports, uids,
gids, etc. and ranges e.g. [0:3]. With an object, you can override
operators ==, <, etc. to do comparisons between the numbers and ranges,
or even set membership tests between extracted numbers and sets of
ranges.
This is really handy for a lot of new fields implemented as a part of
PSP support, where you end up having to check for overlaps between the
paths, images, ports, uids, etc in a K8s Audit Event and the acceptable
values, ranges, path prefixes enumerated in a PSP.
Implementing these changes also involves an overhaul of how aliases are
implemented. Instead of having an optional "formatting" function, where
arguments to the formatting function were expressed as text within the
index, define optional extraction and indexing functions. If an
extraction function is defined, it's responsible for taking the full
json object and calling add_extracted_value() to add values. There's a
default extraction function that uses a list of json_pointers with
automatic iteration over array values returned by a json pointer.
There's still a notion of filter fields supporting indexes--that's
simply handled within the default extraction or custom extraction
function. And for most fields, there won't be a need to write a custom
extraction function simply to implement indexing.
Within a json_event_filter_check object, instead of having a single
extracted value as a string, hold a vector of extracted json_event_value
objects (vector because order matters) and a set of json_event_value
objects (for set comparisons) as m_evalues. Values on the right hand
side of the expression are held as a set m_values.
json_event_filter_check::compare now supports IN/INTERSECTS as set
comparisons. It also supports PMATCH using path_prefix_search objects,
which simplifies checks like ka.req.pod.volumes.hostpath--now they can
be expressed as "ka.req.pod.volumes.hostpath intersects (/proc,
/var/run/docker.sock, /, /etc, /root)" instead of
"ka.req.volume.hostpath[/proc]=true or
ka.req.volume.hostpath[/root]=true or ...".
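As an illustrative sketch (not the exact shipped rule text), a complete
rule using the new operator might look like:

```yaml
# Hypothetical rule; the condition mirrors the example above
- rule: Pod Mounts Sensitive Host Path
  desc: Detect a pod with a hostPath volume overlapping sensitive paths
  condition: >
    ka.req.pod.volumes.hostpath intersects
    (/proc, /var/run/docker.sock, /, /etc, /root)
  output: Pod mounted a sensitive hostPath (user=%ka.user.name)
  priority: WARNING
  source: k8s_audit
```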
Define ~10 new filtercheck fields that extract pod properties like
hostIpc, readOnlyRootFilesystem, etc. that are relevant for PSP validation.
As a part of these changes, also clarify the names of filter fields
related to pods to always have a .pod in the name. Furthermore, fields
dealing with containers in a pod always have a .pod.containers prefix in
the name.
Finally, change the comparisons for existing k8s audit rules to use
"intersects" and/or "in" when appropriate instead of a single equality
comparison.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Related to the changes in https://github.com/draios/sysdig/pull/1501,
add support for an "intersects" operator that verifies if any of the
values in the rhs of an expression are found in the set of extracted
values.
For example:
(a,b,c) in (a,b) is false, but (a,b,c) intersects (a,b) is true.
The code that implements CO_INTERSECTS is in a different commit.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
I wasn't able to compile the dev branch with gcc 5.4 (e.g. not using the
builder), getting this error:
```
.../falco/userspace/falco/grpc_server.cpp:40:109: error: specialization of ‘template<class Request, class Response> void falco::grpc::request_stream_context<Request, Response>::start(falco::grpc::server*)’ in different namespace [-fpermissive]
void falco::grpc::request_stream_context<falco::output::request, falco::output::response>::start(server* srv)
^
In file included from .../falco/userspace/falco/grpc_server.cpp:26:0:
.../falco/userspace/falco/grpc_server.h:102:7: error: from definition of ‘template<class Request, class Response> void falco::grpc::request_stream_context<Request, Response>::start(falco::grpc::server*)’ [-fpermissive]
void start(server* srv);
```
It looks like gcc 5.4 doesn't handle a template specialization that is
declared inside namespace blocks but defined with the namespaces
qualifying the function name.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56480 has more detail.
A workaround is to add `namespace falco {` and `namespace grpc {` around
the declarations.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
If this works as intended, PRs will automatically get area labels depending on the files they modify.
Users can still apply other area labels manually, by slash command, or by editing the PR template when opening the PR.
Signed-off-by: Leonardo Di Donato <leodidonato@gmail.com>
Properly parse multi-document yaml files e.g. blocks separated by
---. This is easily handled by lyaml itself--you just need to pass the
option all = true to yaml.load, and each document will be provided as a table.
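For example, a rules file made of two documents separated by --- (contents hypothetical):

```yaml
- list: shell_binaries
  items: [bash, sh, zsh]
---
- rule: Run Shell
  desc: A shell was executed
  condition: evt.type=execve and proc.name in (shell_binaries)
  output: Shell executed (command=%proc.cmdline)
  priority: NOTICE
```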
This does break the table iteration a bit, so some more refactoring:
- Create a load_state table that holds context like the current
  document index, the required_engine_version, etc.
- Pull out the parts that parse a single document to load_rules_doc(),
  which is given the table for a single document + load_state.
- Simplify get_orig_yaml_obj to just provide a single row index and
  return all rows from that point to the next blank line or line
  starting with '-'.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Make additional improvements to display relevant context when validating
files. This handles cases where a macro/rule overwrites a prior rule.
- Instead of saving the index into the array of lines for each rule,
save the rule yaml itself, as a property 'context' for each object.
- When appending rules, the context of the base macro/rule and the
context of the appended rule/macro are concatenated.
- New functions get_orig_yaml_obj, build_error, and
build_error_with_context handle building the error string.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Fix a couple of small bugs when verifying macro/rule objects:
1) Yaml can have document separators "---", and those were mistakenly
being considered array items.
2) When reading macros and rules and using array position to find the
right document offset, the overall object order should be
used (e.g. this is the 5th object from the file) and not the array
position (e.g. this is the 3rd rule from the file).
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Given the compiler we currently use, you can't actually enable/disable
regexes in falco_engine::enable_rule using a regex pattern. The regex
either will fail to compile or will compile but not actually match
strings. This is noted on the c++11 compatibility notes for gcc 4.8.2:
https://gcc.gnu.org/onlinedocs/gcc-4.8.2/libstdc++/manual/manual/status.html#status.iso.2011.
The only existing use of enable_rule treated the regex pattern as a
substring match anyway, so we can change the engine to treat the pattern
as a substring.
So change the method/supporting sub-classes to note that the argument is
a substring match, and change falco itself to refer to substrings
instead of patterns.
This fixes https://github.com/falcosecurity/falco/issues/742.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Instead of relying on lua errors to pass back parse errors, pass back an
explicit true + required engine version or false + error message.
Also clean up the error message to display info + context on the
error. When the error relates to yaml parsing, use the row number passed
back in lyaml's error string to print the specific line with the error.
When parsing rules/macros/lists, print the object being parsed alongside
the error.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
When parsing rules files with -V (validate), print info on the result of
loading the rules file to stdout. That way a caller can capture stdout
to pass along any rules parsing error.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
To speed up list expansion, instead of using regexes to replace a list
name with its contents, do string searches followed by examining the
preceding/following characters for the proper delimiter.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
We shouldn't need to clean up strings via a cleanup function and don't
need to do it via a bunch of string.gsub() functions.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Instead of iterating over the entire list of filters and doing pattern
matches against each defined filter, perform table lookups.
For filters that take arguments e.g. proc.aname[3] or evt.arg.xxx, split
the filtercheck string on bracket/dot and check the values against a
table.
There are now two tables of defined filters: defined_arg_filters and
defined_noarg_filters. Each filter is put into a table depending on
whether the filter takes an argument or not.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Json-related filtercheck fields supported indexing with brackets, but
when looking at the field descriptions you couldn't tell if a field
allowed an index, required an index, or did not allow an index.
This information was available, but it was a part of the protected
aliases map within the class.
Move this to the public field information so it can be used outside the
class.
Also add m_ prefixes for member names, now that the struct isn't
trivial.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Add more accurate tracking of the number of falco rules loaded per
ruleset, which are made available via the engine method
::num_rules_for_ruleset().
In the ruleset objects, keep track if a filter wrapper is actually
added/removed and if so increment/decrement the count.
For a while, falco has set the inspector drop mode to 1, which should
discard several classes of events that weren't necessary to use most
falco rules.
However, it was mistakenly being called before the inspector was opened,
which meant it wasn't actually doing anything.
Fix this by setting the dropping mode after the inspector open.
On some spot testing on a moderately loaded environment, this results in
a 30-40% drop in the number of system calls processed per second, and
should result in a nice boost in performance.
* Update engine fields checksum for fd.dev.*
New fields fd.dev.*, so updating the fields checksum.
* Print a message why the trace file can't be read.
At debug level only, but better than nothing.
* Adjust tests to match new container_started macro
Now that the container_started macro works either on the container event
or the first process being spawned in a container, we need to adjust the
counts for some rules to handle both cases.
* Add option to display times in ISO 8601 UTC
ISO 8601 time is useful when, say, running falco in a container, which
may have a different /etc/localtime than the host system.
A new config option time_format_iso_8601 controls whether log message
and event times are displayed in ISO 8601 in UTC or in local time. The
default is false (display times in local time).
This option is passed to logger init as well as outputs. For outputs it
eventually changes the time format field from %evt.time/%jevt.time to
%evt.time.iso8601/%jevt.time.iso8601.
Adding this field changes the falco engine version so increment it.
This depends on https://github.com/draios/sysdig/pull/1317.
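For example, in falco.yaml (the option name comes from this commit; the rest of the file is omitted):

```yaml
# Display log message and event times in ISO 8601 UTC instead of local time
time_format_iso_8601: true
```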
* Unit test for ISO 8601 output
A unit test for ISO 8601 output ensures that both the log and event time
is in ISO 8601 format.
* Use ISO 8601 output by default in containers
Now that we have an option that controls iso 8601 output, use it by
default in containers. We do this by changing the value of
time_format_iso_8601 in falco.yaml in the container.
* Handle errors in strftime/asctime/gmtime
A placeholder "N/A" is used in log messages instead.
When creating syscall event drop alerts, instead of including just the
total and dropped event count, include all possible causes of drops as
well as whether bpf is enabled.
* Make stats file interval configurable
New argument --stats_interval=<msec> controls the interval at which
statistics are written to the stats file. The default is 5000 ms (5 sec)
which matches the prior hardcoded interval.
The stats interval is triggered via signals, so an interval below ~250ms
will probably interfere with falco's behavior.
* Add ability to emit general purpose messages
A new method falco_outputs::handle_msg allows emitting generic messages
that have a "rule", message, and output fields, but aren't exactly tied
to any event and aren't passed through an event formatter.
This allows falco to emit "events" based on internal checks like kernel
buffer overflow detection.
* Clean up newline handling for logging
Log messages from falco_logger::log may or may not have trailing
newlines. Handle both by always adding a newline to stderr logs and
always removing any newline from syslog logs.
* Add method to get sequence from subkey
New variant of get_sequence that allows fetching a list of items from a
key + subkey, for example:
```yaml
key:
  subkey:
    - list
    - items
    - here
```
Both use a shared method get_sequence_from_node().
* Monitor syscall event drops + optional actions
Start actively monitoring the kernel buffer for syscall event drops,
which are visible in scap_stats.n_drops, and add the ability
to take actions when events are dropped. The -v (verbose) and
-s (stats filename) arguments also print out information on dropped
events, but they were only printed/logged without any actions.
In falco config you can specify one or more of the following actions to
take when falco notes system call drops:
- ignore (do nothing)
- log a critical message
- emit an "internal" falco alert. It looks like any other alert with a
time, "rule", message, and output fields but is not related to any
rule in falco_rules.yaml/other rules files.
- exit falco (the idea being that the restart would be monitored
elsewhere).
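As a sketch of how these actions could be configured in falco.yaml (the
top-level key name is an assumption; the action names come from the list
above):

```yaml
# Hypothetical config shape for syscall event drop actions
syscall_event_drops:
  actions:
    - log
    - alert
```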
A new module syscall_event_drop_mgr is called for every event and
collects scap stats every second. If in the prior second there were
drops, perform_actions() handles the actions.
To prevent potential flooding in high drop rate environments, actions
are governed by a token bucket with a rate of one action per 30 seconds
and a max burst of 10. We might tune this later based on
experience in busy environments.
This might be considered a fix for
https://github.com/falcosecurity/falco/issues/545. It doesn't
specifically flag falco rules alerts when there are drops, but does
make it easier to notice when there are drops.
* Add unit test for syscall event drop detection
Add unit tests for syscall event drop detection. First, add an optional
config option that artificially increments the drop count every
second. (This is only used for testing.)
Then add test cases for each of the following:
- No dropped events: should not see any log messages or alerts.
- ignore action: should note the drops but not log messages or alert.
- log action: should only see log messages for the dropped events.
- alert action: should only see alerts for the dropped events.
- exit action: should see log message noting the dropped event and exit
with rc=1
A new trace file ping_sendto.scap has 10 seconds worth of events to
allow the periodic tracking of drops to kick in.
* Add support for container metaevent to detect container spawning
Create a new macro "container_started" to check both the old and
the new check.
Also, only look for execve exit events with vpid=1.
* Use TBB_INCLUDE_DIR for consistency w sysdig,agent
Previously it was a mix of TBB_INCLUDE and TBB_INCLUDE_DIR.
* Build using matching sysdig branch, if exists
* Expose required_engine_version when loading rules
When loading a rules file, have alternate methods that return the
required_engine_version. The existing methods remain unchanged and just
call the new methods with a dummy placeholder.
* Add --support argument to print support bundle
Add an argument --support that can be used as a single way to collect
necessary support information, including the falco version, config,
commandline, and all rules files.
There might be a bit of extra structure to the rules files, as they
actually support an array of "variants", but we're thinking ahead to
cases where there might be a comprehensive library of rules files and
choices, so we're adding the extra structure.
* Add ability to print field names only
Add ability to print field names only instead of all information about
fields (description, etc) using -N cmdline option.
This will be used to add some versioning support steps that check for a
changed set of fields.
* Add an engine version that changes w/ filter flds
Add a method falco_engine::engine_version() that returns the current
engine version (e.g. set of supported fields, rules objects, operators,
etc.). It's defined in falco_engine_version.h, starts at 2 and should be
updated whenever a breaking change is made.
The most common reason for an engine change will be an update to the set
of filter fields. To make this easy to diagnose, add a build time check
that compares the sha256 output of "falco --list -N" against a value
that's embedded in falco_engine_version.h. A mismatch fails the build.
* Check engine version when loading rules
A rules file can now have a field "required_engine_version N". If
present, the number is compared to the falco engine version. If the
falco engine version is less, an error is thrown.
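For example, at the top of a rules file (assuming the same top-level
list syntax as other rules objects; version number illustrative):

```yaml
- required_engine_version: 2
```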
* Unit tests for engine versioning
Add a required_engine_version: 2 to one rules file to check the positive
case and add a new test that verifies that a too-new rules file won't be
loaded.
* Rename falco test docker image
Rename sysdig/falco to falcosecurity/falco in unit tests.
* Don't pin falco_rules.yaml to an engine version
Currently, falco_rules.yaml is compatible with versions <= 0.13.1 other
than the required_engine_version object itself, so keep that line
commented out so users can use this rules file with older falco
versions.
We'll uncomment it with the first incompatible falco engine change.
* Allow SSL for k8s audit endpoint
Allow enabling SSL for the Kubernetes audit log web server. This
required adding two new configuration options: webserver.ssl_enabled and
webserver.ssl_certificate. To enable SSL add the below to the webserver
section of the falco.yaml config:
```yaml
webserver:
  enabled: true
  listen_port: 8765s
  k8s_audit_endpoint: /k8s_audit
  ssl_enabled: true
  ssl_certificate: /etc/falco/falco.pem
```
Note that the port number has an s appended to indicate SSL
for the port, which is how civetweb expects SSL ports to be denoted. We
could change this to dynamically add the s if ssl_enabled: true.
The ssl_certificate is a combination SSL Certificate and corresponding
key contained in a single file. You can generate a key/cert as follows:
```
$ openssl req -newkey rsa:2048 -nodes -keyout key.pem -x509 -days 365 -out certificate.pem
$ cat certificate.pem key.pem > falco.pem
$ sudo cp falco.pem /etc/falco/falco.pem
```
Fix ssl option handling.
* Add notes on how to create ssl certificate
Add notes on how to create the ssl certificate to the config comments.
In the common case, falco doesn't generate much output, so it's
desirable to not buffer it in case you're tail -fing some logs.
So change the default for buffered outputs to false.
As per https://github.com/draios/sysdig/pull/1275, the gen_event class
mandates the implementation of two new methods.
This change aims to simplify the implementation of a generic event
processing infrastructure that can handle both sinsp and json
events.
The -Wextra compile-time option enables additional diagnostic
warnings. The -Werror option causes the compiler to treat warnings
as errors. This change adds a build time option,
BUILD_WARNINGS_AS_ERRORS, to conditionally enable those flags. Note
that depending on the compiler you're using, if you enable this option,
compilation may fail (some compiler versions have additional warnings
that have not yet been resolved).
Testing with these options in place identified a destructor that was
throwing an exception. C++11 doesn't allow destructors to throw
exceptions, so those throw's would have resulted in calls to
terminate(). I replaced them with an error log and a call to assert().
It's possible to call event_tags_for_ruleset/evttypes_for_ruleset for a
ruleset that hasn't been loaded. In this case, it's possible to go past
the end of the m_rulesets array.
After fixing that, it's also possible to go past the end of the
event_tags array in event_tags_for_ruleset().
So in both cases, check the index against the array size before
indexing.
* Add new json/webserver libs, embedded webserver
Add two new external libraries:
- nlohmann-json is a better json library that has stronger use of c++
features like type deduction, better conversion from stl structures,
etc. We'll use it to hold generic json objects instead of jsoncpp.
- civetweb is an embeddable webserver that will allow us to accept
posted json data.
New files webserver.{cpp,h} start an embedded webserver that listens for
POSTs on a configurable url and passes the json data to the falco
engine.
New falco config items are under webserver:
- enabled: true|false. Whether to start the embedded webserver or not.
- listen_port. Port that webserver listens on
- k8s_audit_endpoint: uri on which to accept POSTed k8s audit events.
(This commit doesn't compile entirely on its own, but we're grouping
these related changes into one commit for clarity).
* Don't use relative paths to find lua code
You can look directly below PROJECT_SOURCE_DIR.
* Reorganize compiler lua code
The lua compiler code is generic enough to work on more than just
sinsp-based rules, so move the parts of the compiler related to event
types and filterchecks out into a standalone lua file
sinsp_rule_utils.lua.
The checks for event types/filterchecks are now done from rule_loader,
and are dependent on a "source" attribute of the rule being
"sinsp". We'll be adding additional types of events next that come from
sources other than system calls.
* Manage separate syscall/k8s audit rulesets
Add the ability to manage separate sets of rules (syscall and
k8s_audit). Stop using the sinsp_evttype_filter object from the sysdig
repo, replacing it with falco_ruleset/falco_sinsp_ruleset from
ruleset.{cpp,h}. It has the same methods to add rules, associate them
with rulesets, and (for syscall) quickly find the relevant rules for a
given syscall/event type.
At the falco engine level, there are new parallel interfaces for both
types of rules (syscall and k8s_audit) to:
- add a rule: add_k8s_audit_filter/add_sinsp_filter
- match an event against rules, possibly returning a result:
process_sinsp_event/process_k8s_audit_event
At the rule loading level, the mechanics of creating filtercheck
objects is handled by two factories (sinsp_filter_factory and
json_event_filter_factory), both of which are held by the engine.
* Handle multiple rule types when parsing rules
Modify the steps of parsing a rule's filter expression to handle
multiple types of rules. Notable changes:
- In the rule loader/ast traversal, pass a filter api object down,
which is passed back up in the lua parser api calls like nest(),
bool_op(), rel_expr(), etc.
- The filter api object is either the sinsp factory or k8s audit
factory, depending on the rule type.
- When the rule is complete, the complete filter is passed to the
engine using either add_sinsp_filter()/add_k8s_audit_filter().
* Add multiple output formatting types
Add support for multiple output formatters. Notable changes:
- The falco engine is passed along to falco_formats to gain access to
the engine's factories.
- When creating a formatter, the source of the rule is passed along
with the format string, which controls which kind of output formatter
is created.
Also clean up exception handling a bit so all lua callbacks catch all
exceptions and convert them into lua errors.
* Add support for json, k8s audit filter fields
With some corresponding changes in sysdig, you can now create general
purpose filter fields and events, which can be tied together with
nesting, expressions, and relational operators. The classes here
represent an instance of these fields devoted to generic json objects as
well as k8s audit events. Notable changes:
- json_event: holds a json object, used by all of the below
- json_event_filter_check: Has the ability to extract values out of a
json_event object and has the ability to define macros that associate
a field like "group.field" with a json pointer expression that
extracts a single property's value out of the json object. The basic
field definition also allows creating an index
e.g. group.field[index], where a std::function is responsible for
performing the indexing. This class has virtual void methods so it
must be overridden.
- jevt_filter_check: subclass of json_event_filter_check and defines
the following fields:
- jevt.time/jevt.rawtime: extracts the time from the underlying json object.
- jevt.value[<json pointer>]: general purpose way to extract any
json value out of the underlying object. <json pointer> is a json
pointer expression
- jevt.obj: Return the entire object, stringified.
- k8s_audit_filter_check: implements fields that extract values from
k8s audit events. Most of the implementation is in the form of macros
like ka.user.name, ka.uri, ka.target.name, etc. that just use json
pointers to extract the appropriate value from a k8s audit event. More
advanced fields like ka.uri.param, ka.req.container.image use
indexing to extract individual values out of maps or arrays.
- json_event_filter_factory: used by things like the lua parser api,
output formatter, etc to create the necessary objects and return
them.
- json_event_formatter: given a format string, create the necessary
fields that will be used to create a resolved string when given a
json_event object.
* Add ability to list fields
Similar to sysdig's -l option, add --list (<source>) to list the fields
supported by falco. With no source specified, it will print all
fields. Source can be "syscall" for inspector fields e.g. what is
supported by sysdig, or "k8s_audit" to list fields supported only by the
k8s audit support in falco.
* Initial set of k8s audit rules
Add an initial set of k8s audit rules. They're broken into 3 classes of
rules:
- Suspicious activity: this includes things like:
- A disallowed k8s user performing an operation
- A disallowed container being used in a pod.
- A pod created with a privileged container.
- A pod created with a sensitive mount.
- A pod using host networking
- Creating a NodePort Service
- A configmap containing private credentials
- A request being made by an unauthenticated user.
- Attach/exec to a pod. (We eventually want to also do privileged
pods, but that will require some state management that we don't
currently have).
- Creating a new namespace outside of an allowed set
- Creating a pod in either of the kube-system/kube-public namespaces
- Creating a serviceaccount in either of the kube-system/kube-public
namespaces
- Modifying any role starting with "system:"
- Creating a clusterrolebinding to the cluster-admin role
- Creating a role that wildcards verbs or resources
- Creating a role with writable permissions/pod exec permissions.
- Resource tracking. This includes noting when a deployment, service,
  configmap, cluster role, service account, etc. are created or destroyed.
- Audit tracking: This tracks all audit events.
To support these rules, add macros/new indexing functions as needed to
support the required fields and ways to index the results.
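A hypothetical sketch of one suspicious-activity rule (field names are
partly from the commits above and partly assumed; this is not the exact
shipped rule text):

```yaml
# Illustrative only; the real rules live in k8s_audit_rules.yaml
- rule: Create Privileged Pod
  desc: Detect an attempt to start a pod with a privileged container
  condition: ka.target.resource=pods and ka.verb=create and ka.req.container.privileged=true
  output: Pod started with privileged container (user=%ka.user.name uri=%ka.uri)
  priority: WARNING
  source: k8s_audit
```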
* Add ability to read trace files of k8s audit evts
Expand the use of the -e flag to cover both .scap files containing
system calls as well as jsonl files containing k8s audit events:
If a trace file is specified, first try to read it using the
inspector. If that throws an exception, try to read the first line as
json. If both fail, return an error.
Based on the results of the open, the main loop either calls
do_inspect(), looping over system events, or
read_k8s_audit_trace_file(), reading each line as json and passing it to
the engine and outputs.
* Example showing how to enable k8s audit logs.
An example of how to enable k8s audit logging for minikube.
* Add unit tests for k8s audit support
Initial unit test support for k8s audit events. A new multiplex file
falco_k8s_audit_tests.yaml defines the tests. Traces (jsonl files) are
in trace_files/k8s_audit and new rules files are in
test/rules/k8s_audit.
Current test cases include:
- User outside allowed set
- Creating disallowed pod.
- Creating a pod explicitly on the allowed list
- Creating a pod w/ a privileged container (or second container), or a
pod with no privileged container.
- Creating a pod w/ a sensitive mount container (or second container), or a
pod with no sensitive mount.
- Cases for a trace w/o the relevant property + the container being
trusted, and hostnetwork tests.
- Tests that create a Service w/ and w/o a NodePort type.
- Tests for configmaps: tries each disallowed string, ensuring each is
detected, and the other has a configmap with no disallowed string,
ensuring it is not detected.
- The anonymous user creating a namespace.
- Tests for all kactivity rules e.g. those that create/delete
resources as compared to suspicious activity.
- Exec/Attach to Pod
- Creating a namespace outside of an allowed set
- Creating a pod/serviceaccount in kube-system/kube-public namespaces
- Deleting/modifying a system cluster role
- Creating a binding to the cluster-admin role
- Creating a cluster role binding that wildcards verbs or resources
- Creating a cluster role with write/pod exec privileges
* Don't manually install gcc 4.8
gcc 4.8 should already be installed by default on the vm we use for
travis.
Add a signal handler for SIGHUP that sets a global variable g_restart.
All the real execution of falco was already centralized in a standalone
function falco_init(), so simply exit on g_restart=true and call
falco_init() in a loop that restarts if g_restart is set to true.
Take care to not daemonize more than once and to reset the getopt index
to 1 on restart.
This fixes https://github.com/falcosecurity/falco/issues/432.
* Also add endswith to lua parser
Add endswith as a symbol so it can be parsed in filter expressions.
* Unit test for endswith support
Add a test case for endswith support, based on the filename ending with null.
* Use correct copyright years.
Also include the start year.
* Improve copyright notices.
Use the proper start year instead of just 2018.
Add the right owner Draios dba Sysdig.
Add copyright notices to some files that were missing them.
Replace references to the GNU Public License with the Apache license in:
- COPYING file
- README
- all source code below falco
- rules files
- rules and code below test directory
- code below falco directory
- entrypoint for docker containers (but not the Dockerfiles)
I didn't generally add copyright notices to all the examples files, as
they aren't core falco. If they did refer to the gpl I changed them to
apache.
* Proactively enable rules instead of only disabling
Previously, rules were enabled by default. Some performance improvements
in https://github.com/draios/sysdig/pull/1126 broke this, requiring that
each rule is explicitly enabled or disabled for a given ruleset.
So if enabled is true, explicitly enable the rule for the default ruleset.
* Get rid of shadowed res variable.
It was used both for the inspector loop and the falco result.
* Add ability to skip rules for unknown filters
Add the ability to skip a rule if its condition refers to a filtercheck
that doesn't exist. This allows defining a rules file that contains new
conditions while still keeping limited backward compatibility with older
falco versions.
When compiling a filter, return a list of filtercheck names that are
present in the ast (which also includes filterchecks from any
macros). This set of filtercheck names is matched against the set of
filterchecks known to sinsp, expressed as lua patterns, and in the
global table defined_filters. If no match is found, the rule loader
throws an error.
The pattern changes slightly depending on whether the filter has
arguments or not. Two filters (proc.apid/proc.aname) can work with or
without arguments, so both styles of patterns are used.
If the rule has an attribute "skip-if-unknown-filter", the rule will be
skipped instead.
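An illustrative rule using the new attribute (rule content hypothetical):

```yaml
- rule: Uses A New Field
  desc: Skipped, rather than an error, on older falco versions
  condition: evt.type=open and some.new.field=true
  output: New field matched (command=%proc.cmdline)
  priority: INFO
  skip-if-unknown-filter: true
```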
* Unit tests for skipping unknown filter
New unit test for skipping unknown filter. Test cases:
- A rule that refers to an unknown filter results in an error.
- A rule that refers to an unknown filter, but has
"skip-if-unknown-filter: true", can be read, but doesn't match any events.
- A rule that refers to an unknown filter, but has
"skip-if-unknown-filter: false", returns an error.
Also test the case of a filtercheck like evt.arg.xxx working properly
with the embedded patterns as well as proc.aname/apid which work both ways.
* Use better way to skip falco events
Use the new method falco_consider() to determine which events to
skip. This centralizes the logic in a single function. All events will
still be considered if falco was run with -A.
This depends on https://github.com/draios/sysdig/pull/1105.
* Add ability to specify -A flag in tests
test attribute all_events corresponds to the -A flag. Add for some tests
that would normally refer to skipped events.
* Only check whole rule names when matching counts
Tweak the regex so a rule my_great_rule doesn't pick up event counts
for a different rule "great_rule: nnn".
* Add ability to skip evttype warnings for rules
A new attribute warn_evttypes, if present, suppresses printing warnings
related to a rule not matching any event type. Useful if you have a rule
where not including an event type is intentional.
* Add test for preserving rule order
Test the fix for https://github.com/draios/falco/issues/354. A rules
file has an event-specific rule first and a catchall rule second. Without
the changes in https://github.com/draios/sysdig/pull/1103, the first
rule does not match the event.
* Properly support syscalls in filter conditions
Syscalls have their own numbers but they weren't really handled within
falco. This meant that there wasn't a way to handle filters with
evt.type=xxx clauses where xxx was a value that didn't have a
corresponding event entry (like "madvise", for examples), or where a
syscall like open could also be done indirectly via syscall(__NR_open,
...).
First, add a new top-level global syscalls that maps from a string like
"madvise" to all the syscall nums for that id, just as we do for event
names/numbers.
In the compiler, when traversing the AST for evt.type=XXX or evt.type in
(XXX, ...) clauses, also try to match XXX against the global syscalls
table, and return any ids in a standalone table.
Also throw an error if an XXX doesn't match any event name or syscall name.
The syscall numbers are passed as an argument to sinsp_evttype_filter so
it can preindex the filters by syscall number.
This depends on https://github.com/draios/sysdig/pull/1100
* Add unit test for syscall support
This does a madvise, which doesn't have a ppm event type, both directly
and indirectly via syscall(__NR_madvise, ...), as well as an open
directly + indirectly. The corresponding rules file matches on madvise
and open.
The test ensures that both opens and both madvises are detected.
To further reduce falco's cpu usage, start setting the inspector in
"autodrop" mode with a sampling ratio of 1. When autodrop mode is
enabled, a second class of events (those having EF_ALWAYS_DROP in the
syscall table, or those syscalls that do not have specific handling in
the syscall table) are also excluded.
* Add ability to read rules files from directories
When the argument to -r <path> or an entry in falco.yaml's rules_file
list is a directory, read all files in the directory and add them to the
rules file list. The files in the directory are sorted alphabetically
before being added to the list.
The installed falco adds directories /etc/falco/rules.available and
/etc/falco/rules.d and moves /etc/falco/application_rules.yaml to
/etc/falco/rules.available. /etc/falco/rules.d is empty, but the idea is
that admins can symlink to /etc/falco/rules.available for applications
they want to enable.
This will make it easier to add application-specific rulesets that
admins can opt-in to.
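With this, the rules_file list in falco.yaml can mix files and
directories (a sketch; paths from these commits):

```yaml
rules_file:
  - /etc/falco/falco_rules.yaml
  - /etc/falco/falco_rules.local.yaml
  - /etc/falco/rules.d
```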
* Unit test for reading rules from directory
Copy the rules/trace file from the test multiple_rules to a new test
rules_directory. The rules files are in rules/rules_dir/{000,001}*.yaml,
and the test uses a rules_file argument of rules_dir. Ensure that the
same events are detected.
* Reopen file/program outputs on SIGUSR1
When signaled with SIGUSR1, close and reopen file and program based
outputs. This is useful when combined with logrotate to rotate logs.
* Example logrotate config
Example logrotate config that relies on SIGUSR1 to rotate logs.
* Ensure options exist for all outputs
Options may not be provided for some outputs (like stdout), so create an
empty set of options in that case.
* Allow appending to skipped rules
If a rule has an append attribute but the original rule was skipped (due
to having lower priority than the configured priority), silently skip
the appending rule instead of returning an error.
* Unit test for appending to skipped rules
Unit test verifies fix for appending to skipped rules. One rules file
defines a rule with priority WARNING, a second rules file appends to
that rule, and the configured priority is ERROR.
Ensures that falco runs without errors.
* Add option to exclude output property in json fmt
New falco.yaml option json_include_output_property controls whether the
formatted string "output" is included in the json object when json
output is enabled. By default the string is included.
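A falco.yaml sketch (json_output is the existing option; json_include_output_property is the new one):

```yaml
json_output: true
# When false, omit the formatted "output" string from each json alert
json_include_output_property: false
```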
* Add tests for new json output option
New test sets json_include_output_property to false and then verifies
that the json output does *not* contain the surrounding text "Warning an
open...".
* Add the ability to validate multiple rules files
Allow multiple -V arguments just as we do with multiple -r arguments.
* With verbose output, print dangling macros/lists
Start tracking whether or not a given macro/list is actually used when
compiling the set of rules. Every macro/list has an attribute used,
which defaults to false and is set to true whenever it is referred to in
a macro/rule/list.
When run with -v, any macro/list that still has used=false results in a
warning message.
Also, it turns out the fix for
https://github.com/draios/falco/issues/197 wasn't being applied to
macros. Fix that.
The rules CMakeLists.txt, which controls the installation of the falco
rules files, was in the engine CMakeLists.txt, which meant that programs
that included the engine would also include rules files.
This may not always be desired, so move the rules CMakeLists.txt to the
main falco CMakeLists.txt instead.
A new falco.yaml option buffered_outputs, also controlled by
-U/--unbuffered, sets unbuffered outputs for the output methods. This is
especially useful with keep_alive files/programs where you want the
output right away.
Also add cleanup methods for the output channels that ensure output to
the file/program is flushed and closed.
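A falco.yaml sketch combining this option with the keep_alive file
output described just below (option names from the commits; filename
illustrative):

```yaml
buffered_outputs: false
file_output:
  enabled: true
  keep_alive: true
  filename: /var/log/falco_events.txt
```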
Add the ability to keep file/program outputs open (i.e. writing to the
same open file/program for multiple notifications). A new option to the
file/program output "keep_alive", if true, keeps the file/program pipe
open across events.
This makes the need for unbuffered output aka
https://github.com/draios/falco/issues/211 more pressing. Will add that next.
When json output is set, add a sub-object called output_fields to the
json output that contains the individual templated fields from the
output string. Makes it easier to parse those fields.
This fixes https://github.com/draios/falco/issues/261.
These changes allow for a local rules file that will be preserved across
upgrades, while allowing the main rules file to be overwritten on upgrade.
- Move all config/rules files below /etc/falco/
- Add a "local rules" file /etc/falco/falco_rules.local.yaml. The intent
is that it contains modifications/deltas to the main rules file
/etc/falco/falco_rules.yaml. The main falco_rules.yaml should be
treated as immutable.
- All config files are flagged so they are not overwritten on upgrade.
- Change the handling of the config item "rules_file" in falco.yaml to
allow a list of files. By default, this list contains:
[/etc/falco/falco_rules.yaml, /etc/falco/falco_rules.local.yaml].
Also change rpm/debian packaging to ensure that the above files are
preserved across upgrades:
- Use relative paths for share/bin dirs. This ensures that when packaged
as rpms they won't be flagged as config files.
- Add CMAKE_INSTALL_PREFIX to FALCO_ENGINE_LUA_DIR now that it's relative.
- In debian packaging, flag
/etc/falco/{falco.yaml,falco_rules.yaml,falco_rules.local.yaml} as
conffiles. That way they are preserved across upgrades if modified.
- In rpm packaging when using cmake, any files installed with an
absolute path are automatically flagged as %config. The only files
directly installed are now the config files, so that addresses the problem.
Add CMAKE_INSTALL_PREFIX to lua dir.
Clean up the handling of priority levels within rules. It used to be a
mix of strings handled in various places. Now, in falco_common.h there's
a consistent type for priority-as-number as well as a list of
priority-as-string values. Priorities are passed around as numbers
instead of strings. It's still permissive about capitalization.
Also add the ability to load rules by severity. New falco
config option "priority=<val>"/-o priority=<val> specifies the minimum
priority level of rules that will be loaded.
Add unit tests for same. The test suppresses INFO notifications for a
rule/trace file combination that would otherwise generate them.
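For example, in falco.yaml (or equivalently -o priority=error on the
command line; level value illustrative):

```yaml
# Load only rules at priority error or higher
priority: error
```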
Add the ability to append to rules/macros, like we already do with
lists. For rules/macros, if the object has an append: true key, the
condition value is appended to the condition of an existing rule/macro
with the same name.
Like lists, it's an error to specify append: true without there being an
existing rule/macro.
Also add tests that test the same kind of things we did for lists:
- That append: true really does append
- That append: false overwrites the rule/macro
- That it's an error to append without a prior rule/macro existing.
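A sketch of the rule case (rule content hypothetical):

```yaml
- rule: program_accesses_file
  desc: Track a set of programs opening files
  condition: proc.name in (cat, grep) and evt.type=open
  output: A program opened a file (command=%proc.cmdline file=%fd.name)
  priority: INFO

- rule: program_accesses_file
  append: true
  condition: and not user.name=root
```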
List nodes can now have an 'append' key. If present and true, any values
in this list will be appended to the end of any existing list with the
same name.
It is an error to have a list with 'append' true that has a name that is
not an existing list.
When performing list substitution, only replace a list name when it is
surrounded by whitespace or expected punctuation characters. Lua
patterns don't have a notion of this-or-that patterns e.g. (^|abc), so
we have 3 versions of the substitution depending on whether the list name
occurs in the beginning, middle, or end of a string.
This fixes #197.
Also validate macros when they are parsed. Macros are also validated as
a part of rules being parsed, but it's possible to have an individual
rules file containing only macros, or a macro not explicitly tied to any
rule. In this case, it's useful to be able to check the macro to see if
it contains dangling macro references.
When parsing condition expressions, if the type of an ast node is
String (aka quoted string), don't trim whitespace from the value. This
ensures that conditions that want to match exact strings e.g. command
lines with leading/trailing spaces will work properly.
This fixes #253.
Start packaging (and building when necessary) a falco-specific kernel
module in falco releases. Previously, falco would depend on sysdig and
use its kernel module instead.
The kernel module was already templated to some degree in various
places, so we just had to change the templated name from
sysdig/sysdig-probe to falco/falco-probe.
In containers, run falco-probe-loader instead of
sysdig-probe-loader. This is actually a script in the sysdig repository
which is modified in https://github.com/draios/sysdig/pull/789, and uses
the filename to indicate what kernel module to build and/or load.
For the falco package itself, don't depend on sysdig any longer but instead
depend on dkms and its dependencies, using sysdig as a guide on the set
of required packages.
Additionally, for the package pre-install/post-install scripts start
running falco-probe-loader.
Finally, add a --version argument to falco so it can pass the desired
version string to falco-probe-loader.
Use the sinsp_evt_formatter_cache added in
https://github.com/draios/sysdig/pull/771 instead of a local cache. This
simplifies the lua side quite a bit, as it only needs to call
format_output(), and clean up everything via free_formatters() in
output_cleanup().
On the C side, use a sinsp_evt_formatter object and use it in
format_event().
In C functions that implement lua functions, don't directly throw
falco_exceptions, which results in opaque error messages like:
Mon Feb 27 10:09:58 2017: Runtime error: Error invoking function output:
C++ exception. Exiting.
Instead, return lua errors via lua_error().
- Instead of having a possibly null string pointer as the argument to
enable_* and process_event, have wrapper versions that assume a
default falco ruleset. The default ruleset name is a static member of
the falco_engine class, and the default ruleset id is created/found
in the constructor.
- This makes the whole mechanism simple enough that it doesn't require
separate testing, so remove the capability within falco to read a
ruleset from the environment and remove automated tests that specify
a ruleset.
- Make pattern/tags/ruleset arguments to enable_* functions const.
- in lua, look for a tags attribute to each rule (see the sketch after
this list). This is passed up in add_filter as a tags argument (as a
lua table). If not present, an empty table is used. The tags table is
iterated to populate a set of tags as strings, which is passed to
add_filter().
- A new method falco_engine::enable_rule_by_tag is similar to
enable_rule(), but is given a set of tag strings. Any rules containing
one of the tags are enabled/disabled.
- The list of event types has been changed to a set to more accurately
reflect its purpose.
- New argument to falco -T allows disabling all rules matching a given
tag, via enable_rule_by_tag(). It can be provided multiple times.
- New argument to falco -t allows running those rules matching a given
tag. If provided all rules are first disabled. It can be
provided multiple times, but can not be combined with -T or
-D (disable rules by name)
- falco_engine supports the notion of a ruleset. The idea is that you
can choose a set of rules that are enabled/disabled by using
enable_rule()/enable_rule_by_tag() in combination with a
ruleset. Later, in process_event() you include that ruleset and the
rules you had previously enabled will be run.
- rulesets are provided as strings in enable_rule()/enable_rule_by_tag()
and as numbers in process_event()--this avoids the overhead of string
lookups per-event. Ruleset ids are created on the fly as needed. A
utility method find_ruleset_id() looks up the ruleset id for a given
name. The default ruleset is NULL string/0 numeric if not provided.
- Although the ruleset is a useful falco engine feature, it isn't that
important to the falco standalone program, so it's not
documented. However, you can change the ruleset by providing
FALCO_RULESET in the environment.
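A sketch of a tagged rule (content hypothetical):

```yaml
- rule: Write Below Etc
  desc: An attempt to open a file below /etc
  condition: evt.type=open and fd.name startswith /etc
  output: File below /etc opened (command=%proc.cmdline file=%fd.name)
  priority: WARNING
  tags: [filesystem, cis]
```

Running falco -t filesystem would then run only rules carrying the
filesystem tag, while falco -T filesystem would disable them.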
Prefix output strings with * so they are always permissive in the
engine.
In falco outputs, which adds its own prefix, remove any leading * before
adding the custom prefix.
Allow any list/macro/rule to be overridden by a subsequent file. The
persistent state that lives across invocations of load_rules are the 3
arrays ordered_{list,macro,rule}_names, which have the
lists/macros/rules in the order in which they first appear, and tables
{rules,macros,lists}_by_name, which maps from a name to a yaml object.
With each call to load_rules, the set of loaded rules is reset and the
state of expanded lists, compiled macros, compiled rules, and rule
metadata are recreated from scratch, using the ordered_*_names arrays
and *_by_name tables. That way, any list/macro/rule can be redefined in
a subsequent file with new values.
Add the ability to clear the set of loaded rules from lua. It simply
recreates the sinsp_evttype_filter instance m_evttype_filter, which is
now a unique_ptr.
sinsp_utils::get_current_time_ns() has the same purpose as
get_epoch_ns(), and now that we're including the token bucket in
falco_engine, it's easy to package the dependency. So use that function
instead.
Add token-bucket based rate limiting for falco notifications.
The token bucket is implemented in token_bucket.cpp (actually in the
engine directory, just to make it easier to include in other
programs). It maintains a current count of tokens (i.e. right to send a
notification). Its main method is claim(), which attempts to claim a
token and returns true if one was claimed successfully. It has a
configurable max burst size and rate. The token bucket
gains "rate" tokens per second, up to a maximum of max_burst tokens.
These parameters are configurable in falco.yaml via the config
options (defaults shown):
```yaml
outputs:
  rate: 1
  max_burst: 1000
```
In falco_outputs::handle_event(), try to claim a token, and if
unsuccessful log a debug message and return immediately.
Previously, log messages had levels, but it only influenced the level
argument passed to syslog(). Now, add the ability to control log level
from falco itself.
New falco.yaml argument "log_level" can be one of the strings
corresponding to the well-known syslog levels, which is converted to a
syslog-style level as integer.
In falco_logger::log(), skip messages below the specified level.
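For example, in falco.yaml (level value illustrative):

```yaml
# One of the syslog level strings, e.g. emergency, alert, critical,
# error, warning, notice, info, debug
log_level: info
```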
Instead of creating a formatter for each event, cache them and create
them only when needed. A new function output_cleanup cleans up the
cached formatters, and is called in the destructor if init() was called.
When run via scripts like run_performance_tests.sh, it's useful to
include extra info like the test being run and the specific program
variant to the stats file. So support that via the
environment. Environment keys starting with FALCO_STATS_EXTRA_XXX will
have the XXX and environment value added to the stats file.
It's undocumented as I doubt other programs will need this functionality
and it keeps the docs simpler.
With -s, periodically fetch capture stats from the inspector and write
them to the provided file.
Separate class StatsFileWriter handles the details. It does rely on a
timer + SIGALRM handler so you can only practically create a single
object, but it does keep the code/state separate.
The output format has a sample number, the set of current stats, a
delta with the difference from the prior sample, and the percentage of
events dropped during that sample.
Change falco_engine::process_event to return a unique_ptr that wraps the
rule result, so it won't be leaked if this method throws an exception.
This means that callers don't need to create their own.
Validate rule outputs when loading rules by attempting to create a
formatter based on the rule's output field. If there's an error, it will
propagate up through load_rules and cause falco to exit rather than
discover the problem only when trying to format the event and the rule's
output field.
This required moving formats.{cpp,h} into the falco engine directory
from the falco general directory. Note that these functions are loaded
twice in the two lua states used by falco (engine and outputs).
There's also a couple of minor cleanups:
- falco_formats had a private instance variable that was unused, remove
it.
- rename the package for the falco_formats functions to formats instead
of falco so it's more standalone.
- don't throw a c++ exception in falco_formats::formatter. Instead
generate a lua error, which is handled more cleanly.
- free_formatter doesn't return any values, so set the return value of
the function to 0.
container.info handling used to be done by the falco_outputs
object. However, this caused problems for applications that only used
the falco engine, doing their own output formatting for matching events.
Fix this by moving output formatting into the falco engine itself. The
part that replaces %container.info/adds extra formatting to the end of a
rule's output now happens while loading the rule.
Related to the changes in https://github.com/draios/agent/pull/267,
improve error messages when trying to load sets of rules with errors:
- Check that yaml parsing of rules_content actually resulted in
something.
- Return an error for rules that have an empty name.
- Return an error for yaml objects that aren't a rule/macro/list.
- When compiling, don't print an error message, simply return one,
including a wrapper "can not compile ..." string.
Instead of having FALCO_SHARE_DIR be a relative path, fully specify it
by prepending CMAKE_INSTALL_PREFIX in the top level CMakeLists.txt and
don't prepend CMAKE_INSTALL_PREFIX in config_falco_engine.h.in. This
makes it consistent with its use in the agent.
Collect stats on the number of events processed and dropped. When run
with -v, print these stats. This duplicates sysdig behavior and can be
useful when diagnosing problems related to dropped events throwing off
internal state tracking.
Bring over functionality from sysdig to write trace files. This is easy
as all of the code to actually write the files is in the inspector. This
just handles the -w option and arguments.
This can be useful to write a trace file in parallel with live event
monitoring so you can reproduce it later.
The logic for detecting if a file exists was backwards. It would treat a
file as existing if it could *not* be opened. Reverse that logic so it
works.
This fixes https://github.com/draios/falco/issues/135.
Copy handling of -pk/-pm/-pc/-k/-m arguments from sysdig. All of the
relevant code was already in the inspector so that was easy.
The information from k8s/mesos/containers is used in two ways:
- In rule outputs, if the format string contains %container.info, that
is replaced with the value from -pk/-pm/-pc, if one of those options
was provided. If no option was provided, %container.info is replaced
with a generic %container.name (id=%container.id) instead.
- If the format string does not contain %container.info, and one of
-pk/-pm/-pc was provided, that is added to the end of the formatting
string.
- If -p was specified with a general value (i.e. not
kubernetes/mesos/container), the value is simply added to the end and
any %container.info is replaced with the generic value.
There are a lot of command line options now, so sort them alphabetically
in the usage and getopt handling to make them easier to find.
Also rename -p <pidfile> to -P <pidfile>, thinking ahead to the next
commit.
If a rule has an enabled attribute, and if the value is false, call the
engine's enable_rule() method to disable the rule. Like add_filter,
there's a static method which takes the object as the first argument and
a non-static method that calls the engine.
This fixes #72.
The falco engine changes broke the output methods that take
configuration (like the filename for file output, or the program for
program output). Fix that by properly passing the options argument to
each method's output function.
Add tests that cover reading from multiple sets of rule files and
disabling rules. Specific changes:
- Modify falco to allow multiple -r arguments to read from multiple
files.
- In the test multiplex file, add a disabled_rules attribute,
containing a sequence of rules to disable. This results in -D arguments
when running falco.
- In the test multiplex file, 'rules_file' can be a sequence. It
results in multiple -r arguments when running falco.
- In the test multiplex file, 'detect_level' can be a sequence of
multiple severity levels. All levels will be checked for in the
output.
- Move all test rules files to a rules subdirectory and all trace files
to a traces subdirectory.
- Add a small trace file for a simple cat of /dev/null. Used by the
new tests.
- Add the following new tests:
- Reading from multiple files, with the first file being
empty. Ensure that the rules from the second file are properly
loaded.
- Reading from multiple files with the last being empty. Ensures
that the empty file doesn't overwrite anything from the first
file.
- Reading from multiple files with varying severity levels for each
rule. Ensures that both files are properly read.
- Disabling rules from a rules file, both with full rule names
and regexes. Will result in not detecting anything.
Add the ability to drop events at the falco engine level in a way that
can scale with the dropping that already occurs at the kernel/inspector
level.
New inline function should_drop_evt() determines whether or not events
are matched against the set of rules, governed by two values: the
sampling ratio and the sampling multiplier.
Here's how the sampling ratio and multiplier influence whether or not an
event is dropped in should_drop_evt(). The intent is that
m_sampling_ratio generally changes outside the engine, e.g. in the main
inspector class based on how busy the inspector is. A sampling ratio
of 1 implies no dropping. Values > 1 imply increasing levels of
dropping. External to the engine, the sampling ratio results in events
being dropped at the kernel/inspector interface. The sampling
multiplier is an amplification to the sampling factor in
m_sampling_ratio. If 0, no additional events are dropped other than
those that might be dropped by the kernel/inspector interface. If 1,
events that make it past the kernel module are subject to an additional
level of dropping at the falco engine, scaling with the sampling ratio
in m_sampling_ratio.
Unlike the dropping that occurs at the kernel level, where the events in
the first part of each second are dropped, this dropping is random.
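A sketch of the logic described above (illustrative; the real member
variables are m_sampling_ratio and m_sampling_multiplier):

    #include <cstdint>
    #include <cstdlib>

    // Returns true if the engine should skip matching this event.
    inline bool should_drop_evt(uint32_t ratio, double multiplier)
    {
        if(multiplier == 0)
        {
            return false; // only kernel/inspector-level dropping applies
        }

        // Random dropping, in contrast to the kernel's policy of
        // dropping events from the first part of each second.
        double coin = random() * (1.0/RAND_MAX);
        return (coin >= (1.0/(ratio * multiplier)));
    }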
Move the c++ and lua code implementing falco engine/falco common to its
own directory userspace/engine. It's compiled as a static library
libfalco_engine.a, and has its own CMakeLists.txt so it can be included
by other projects.
The engine's CMakeLists.txt has an add_subdirectory for the falco rules
directory, so including the engine also builds the rules.
The variables you need to set to use the engine's CMakeLists.txt are:
- CMAKE_INSTALL_PREFIX: the root directory below which everything is
installed.
- FALCO_ETC_DIR: where to install the rules file.
- FALCO_SHARE_DIR: where to install lua code, relative to the
install/package root.
- LUAJIT_INCLUDE: where to find header files for lua.
- FALCO_SINSP_LIBRARY: the library containing sinsp code. It will be
considered a dependency of the engine.
- LPEG_LIB/LYAML_LIB/LIBYAML_LIB: locations for third-party libraries.
- FALCO_COMPONENT: if set, will be included as a part of any install()
commands.
Instead of specifying /usr/share/falco in config_falco_*.h.in, use
CMAKE_INSTALL_PREFIX and FALCO_SHARE_DIR.
The lua code for the engine has also moved, so the two lua source
directories (userspace/engine/lua and userspace/falco/lua) need to be
available separately via falco_common; make the lua directory an
argument to falco_common::init.
As a part of making it easy to include in another project, also clean up
LPEG build/defs. Modify build-lpeg to add a PREFIX argument to allow for
object files/libraries being in an alternate location, and when building
lpeg, put object files in a build/ subdirectory.
Create standalone classes falco_engine/falco_outputs that can be
embedded in other programs. falco_engine is responsible for matching
events against rules, and falco_outputs is responsible for formatting an
alert string given an event and writing the alert string to all
configured outputs.
falco_engine's main interfaces are:
- load_rules/load_rules_file: Given a path to a rules file or a string
containing a set of rules, load the rules. Also loads needed lua code.
- process_event(): check the event against the set of rules and return
the results of a match, if any.
- describe_rule(): print details on a specific rule or all rules.
- print_stats(): print stats on the rules that matched.
- enable_rule(): enable/disable any rules matching a pattern. New falco
command line option -D allows you to disable one or more rules on the
command line.
falco_outputs' main interfaces are:
- init(): load needed lua code.
- add_output(): add an output channel for alert notifications.
- handle_event(): given an event that matches one or more rules, format
an alert message and send it to any output channels.
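Taken together, a minimal embedding might look like the following
sketch (exact signatures and types differ from the real classes;
the inspector setup is assumed to be done by the embedding
application):

    // Sketch only.
    void event_loop(sinsp *inspector, const std::string &rules_filename, bool verbose)
    {
        falco_engine engine;
        falco_outputs outputs;

        engine.load_rules_file(rules_filename, verbose);
        outputs.init();

        sinsp_evt *ev;
        while(inspector->next(&ev) == SCAP_SUCCESS)
        {
            // A result is returned only when some rule matched.
            falco_engine::rule_result *res = engine.process_event(ev);
            if(res)
            {
                outputs.handle_event(ev, res->rule, res->priority, res->format);
                delete res;
            }
        }
    }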
Each of falco_engine/falco_outputs maintains a separate lua state and
loads separate sets of lua files. The code to create and initialize the
lua state is in a base class falco_common.
falco_engine no longer logs anything. In the case of errors, it throws
exceptions. falco_logger is now only used as a logging mechanism for
falco itself and as an output method for alert messages. (This should
probably be split, but it's ok for now.)
falco_engine holds a sinsp_evttype_filter object containing the set
of eventtype filters. Instead of calling
m_inspector->add_evttype_filter() to add a filter created by the
compiler, call falco_engine::add_evttype_filter(). This means
that the inspector runs with a NULL filter and all events are returned
from do_inspect. This depends on
https://github.com/draios/sysdig/pull/633 which has a wrapper around a
set of eventtype filters.
Some additional changes along with creating these classes:
- Some cleanups of unnecessary header files, cmake include_directories()
calls, etc., so files only pull in the includes they actually need, and
headers are only included in header files when required.
- Try to avoid 'using namespace std' in header files, or assuming
someone else has done that. Generally add 'using namespace std' to all
source files.
- Instead of using sinsp_exception for all errors, define a
falco_engine_exception class for exceptions coming from the falco
engine and use it instead. For falco program code, switch to general
exceptions under std::exception and catch + display an error for all
exceptions, not just sinsp_exceptions.
- Remove fields.{cpp,h}. This was dead code.
- Start tracking counts of rules by priority string (i.e. what's in the
falco rules file) as compared to priority level (i.e. roughly
corresponding to a syslog level). This keeps the rule processing and
rule output halves separate. This led to some test changes. The regex
used in the test is now case insensitive to be a bit more flexible.
- Now that https://github.com/draios/sysdig/pull/632 is merged, we can
delete the rules object (and its lua_parser) safely.
- Move loading the initial lua script to the constructor. Otherwise,
calling load_rules() twice re-loads the lua script and throws away any
state like the mapping from rule index to rule.
- Allow an empty rules file.
Finally, fix most memory leaks found by valgrind:
- falco_configuration wasn't deleting the allocated m_config yaml
config.
- several ifstreams were being created simply to test which falco
config file to use.
- In the lua output methods, an event formatter was being created using
falco.formatter() but there was no corresponding free_formatter().
This depends on changes in https://github.com/draios/sysdig/pull/640.
New command line option -A, tied to the boolean all_events, instructs
falco to run on all events, and not just those without the EF_DROP_FALCO
flag set.
When all_events is true, the checks for ignored events/syscalls are
skipped when loading rules.
Add a new output type "program" that writes a formatted event to a
configurable program, using io.popen().
Each notification results in one invocation of the program.
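Sketched in c++ for illustration (the actual implementation uses lua's
io.popen(); program and msg are assumed std::string values):

    #include <cstdio>
    #include <string>

    // One invocation of the configured program per notification.
    static void program_output(const std::string &program, const std::string &msg)
    {
        FILE *prog = popen(program.c_str(), "w");
        if(prog != NULL)
        {
            fputs(msg.c_str(), prog);
            pclose(prog);
        }
    }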
Instead of combining all rules into one huge filter expression and
giving it to the inspector, keep each filter expression separate and
annotate it with the events for which the rule applies.
This uses the capabilities in draios/sysdig#627
to have multiple sets of event-specific filters.
Change traverse_ast to allow a set of node types instead of a single
node type.
Within the compiler, a new pass over the ast, get_evttypes, looks for
evt.type clauses, converts the evt.type as a string to any event type
ids for which it may apply, and passes that back with the compiled
rule.
As rule conditions may refer to evt.types in negative
contexts (i.e. evt.type != XXX, or not evt.type = XXX), this pass
prefers rules that list event type checks at the beginning of
conditions, and allows other rules with a warning.
When traversing the ast looking for evt.type checks, once any "!=" or
"not ..." is seen, no other evt.type checks are "allowed". If one
is found, the rule is considered ambiguous wrt event types. In this
case, a warning is printed and the rule is associated with a catchall
set that runs for all event types.
Also, instead of rejecting rules with no event type check, print a
warning and associate it with the catchall set.
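A compact sketch of that check (the real pass, get_evttypes, is
implemented in lua over the ast; the node type here is a hypothetical
stand-in):

    #include <string>
    #include <vector>

    struct ast_node { std::string op; std::string lhs; };

    // True if no evt.type check appears after a "!=" or "not ..." has
    // been seen; false means the rule is ambiguous wrt event types and
    // goes to the catchall set with a warning.
    static bool evttypes_unambiguous(const std::vector<ast_node> &nodes)
    {
        bool saw_negative = false;
        for(const auto &n : nodes)
        {
            if(n.op == "!=" || n.op == "not")
            {
                saw_negative = true;
            }
            else if(n.lhs == "evt.type" && saw_negative)
            {
                return false;
            }
        }
        return true;
    }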
In the rule loader, create a new global table, events, that maps each
event name to the list of event ids for which it may apply. Instead of
calling install_filter once after all rules have been loaded, call a new
function add_filter for each rule. In turn, it passes the rule and list
of event ids to the inspector using add_evttype_filter().
Also, with -v (verbose), print the exact set of event types found for
each rule. This is used by an upcoming change to the set of unit
tests.
Add a verbose flag -v which implies printing additional info. This is
passed down to lua during load_rules and sets the per-module verbose
value for the compiler and parser modules.
Later commits will use this to print additional info when loading rules.
https://github.com/draios/sysdig/pull/623 adds support for a startswith
operator to allow for string prefix matching. Modify the parser to
recognize that operator, and use that operator for rules that really
want to check the beginning of a pathname, directory, etc. to make them
faster and avoid FPs.
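For example (illustrative), a condition that checks whether a file
lives under /etc can be written as:

    fd.name startswith /etc/

which is both cheaper than a substring match and avoids matching paths
that merely contain /etc/ somewhere in the middle.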
Once sysdig adds support for handling "in (...)" filter expressions as
set membership tests, it will be advantageous to combine lists of items
together into a single list so they can all be checked in a single set
membership test.
This commit adds support for a new yaml item type "list" containing a
field "name" and a field "items" holding a list of items. These are
represented as a yaml list, which allows yaml to handle some of the
initial parsing with the list items maintained natively in lua.
Allow lists to contain list references by expanding any references to
the items in the list, before storing the list items in
state.lists.
When parsing macro or rule conditions, replace all references to a list
name with the list items as a comma separated string.
Modify the falco rules to switch to lists whenever possible. The
new convention is to use the suffix _binaries for lists of program names
and _procs for macros that define a filter expression using the list.
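For example, an illustrative pair following that convention:

    - list: shell_binaries
      items: [bash, csh, ksh, sh, tcsh, zsh]
    - macro: shell_procs
      condition: proc.name in (shell_binaries)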
Instead of using sysdig's json output, which only contains the fields
from the format string without any formatting text, use the string
output to build a json object containing the format string, rule name,
severity, and the event time (converted to a json-friendly ISO8601).
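An alert then looks something like this (hypothetical rule name and
values):

    {"output": "16:31:56.746609046: Warning Shell running in container (bash, aa1b2c3d4e5f)", "priority": "Warning", "rule": "shell_in_container", "time": "2016-05-10T16:31:56.746609046Z"}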
This fixes https://github.com/draios/falco/issues/82.
Add additional rules related to using pipe installers within a fbash
session:
- Modify write_etc to only trigger if *not* in a fbash session. There's
a new rule write_etc_installer which has the same conditions when in
a fbash session, logging at INFO severity.
- A new rule write_rpm_database warns if any non package management
program tries to write below /var/lib/rpm.
- Add a new warning if any program below a fbash session tries to open
an outbound network connection on ports other than http(s) and dns.
- Add INFO level messages when programs in a fbash session try to run
package management binaries (rpm,yum,etc) or service
management (systemctl,chkconfig,etc) binaries.
In order to test these new INFO level rules, add a third class of
trace files, traces-info.zip, containing traces that should result in
info-level messages.
To differentiate warning and info level detection, add an attribute to
the multiplex file "detect_level", which is "Warning" for the files in
traces-positive and "Info" for the files in traces-info. Modify
falco_test.py to look specifically for a non-zero count for the given
detect_level.
Doing this exposed a bug in the way the level-specific counts were being
recorded--they were keeping counts by level name, not number. Fix that.
Start using the Avocado framework for automated regression
testing. Create a test FalcoTest in falco_test.py which can run on a
collection of trace files. The script test/run_regression_tests.sh is
responsible for pulling zip files containing the positive (falco should
detect) and negative (falco should not detect) trace files, creating an
Avocado multiplex file that defines all the tests (one for each trace
file), running avocado on all the trace files, and showing full logs for
any test that didn't pass.
The old regression script, which simply ran falco, has been removed.
Modify falco's stats output to show the total number of events detected
for use in the tests.
In travis.yml, pull a known stable version of avocado and build it,
including installing any dependencies, as a part of the build process.
At shutdown, print stats on the number of rules triggered by severity
and rule name. This is done by a lua function print_stats and the
associated table rule_output_counts.
When passing rules to outputs, update the counts in rule_output_counts.
Add signal handlers for SIGINT/SIGTERM that set a shutdown
flag. Initialize the live inspector with a timeout so the main loop can
watch the flag set by the signal handlers.
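A sketch of the shutdown flag handling (illustrative):

    #include <csignal>

    static volatile sig_atomic_t g_terminate = 0;

    static void signal_callback(int signal)
    {
        g_terminate = 1;
    }

    static void install_signal_handlers()
    {
        signal(SIGINT, signal_callback);
        signal(SIGTERM, signal_callback);
        // The live inspector is opened with a timeout so the main loop
        // periodically regains control and can test g_terminate.
    }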
Make sure that references to variables that may be paths (which in turn
may contain spaces) are quoted, so cmake won't break on the spaces.
This fixes https://github.com/draios/falco/issues/79.
When run with -l <rule>, falco will print the name/description of the
single rule <rule> and exit. With -L, falco will print the
name/description of all rules.
All the work is done in lua in the rule loader. A new lua function
describe_rule calls the local function describe_single_rule once or
multiple times depending on -l/-L. describe_single_rule prints the rule
name and a wrapped version of the rule description.
Add name and description fields to all rules. The name field is actually
a field called 'rule', which corresponds to the 'macro' field for
macros.
Within the rule loader, the state changes slightly. There are two
indices into the set of rules: 'rules_by_name' and
'rules_by_idx' (formerly 'outputs'). Both now contain the original
table from the yaml parse, plus an added field 'level', which is the
priority mapped to a number.
Get rid of the notion of default priority or output. Every rule must now
provide both.
Go through all current rules and add names and descriptions.
Remove the old use of the '-o' command line option; it wasn't being
used.
Allow any config file option to be overridden on the command line, via
--option/-o. These options are applied to the configuration object after
reading the file, ensuring the command line options override anything in
the config file.
To support this, add some methods to yaml_configuration that allow you
to set the value for a top level key or key + subkey, and methods to
falco_configuration that allow providing a set of command line arguments
alongside the config file.
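For example, assuming the syslog_output section in falco.yaml, syslog
output could be forced off from the command line with:

    falco -c falco.yaml -o syslog_output.enabled=false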
Ensure that any fatal error is always printed to stderr even if stderr
logging is not enabled. This makes sure that falco won't silently exit
on an error. This is especially important when daemonizing and a
fatal error occurs early in startup.
As a part of this, change all fatal errors to throw exceptions instead,
so all fatal errors get routed through the exception handler.
Improve daemonization by reopening stdin/stdout/stderr to /dev/null so
you don't have to worry about writing to a closed stderr on exit.
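The reopening looks roughly like this (a sketch):

    #include <cstdio>

    // After moving into the background, detach the standard streams so
    // later writes can't hit a closed descriptor:
    freopen("/dev/null", "r", stdin);
    freopen("/dev/null", "w", stdout);
    freopen("/dev/null", "w", stderr);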
Henri pointed out that events may also be flagged as ignored. So
populate a second table with the set of ignored events, rename
check_for_ignored_syscalls to check_for_ignored_syscalls_events, and
separately check each table based on whether the LHS of the expression
is evt.type or syscall.type.
Add support for daemonizing via the --daemon flag. If daemonized, the
pid is written to the file provided via the --pidfile flag. When
daemonized, falco immediately returns an error if stderr output or
logging was chosen on the command line.
Clean up handling of outputs to match the expected use case (daemon):
- syslog output is enabled by default
- stdout output is disabled by default
- If not configured at all, both outputs are enabled.
Also fix some bugs I found while running via packages:
- There were still some references to the old rules filename
falco_rules.conf.
- The redhat package mistakenly defined some system directories like
/etc, /etc/init.d. Add them to the exclusion list (See
https://cmake.org/Bug/view.php?id=13609 for context).
- Clean up some of the error messages to be more consistent.
After this I was able to build and install debian and rpm
packages. Starting the falco service ran falco as a daemon with syslog
output.
Create a table containing the filtered syscalls and set it as the lua
global ignored_syscalls (m_lua_ignored_syscalls on the c++ side).
In the parser, add a general purpose ast traversal function
traverse_ast that calls a callback for all nodes of a specific type.
In the compiler, add a new function check_for_ignored_syscalls that uses
the traversal function to be called back for all "BinaryRelOp"
nodes (i.e. X = Y, X in [a, b, c], etc). For those nodes, if the lhs is
a field 'evt.type' or 'syscall.type' and the rhs contains one of the
ignored syscalls, throw an error.
Call check_for_ignored_syscalls after parsing any macro or rule
filter. The thrown error will contain the macro or rule that had the
ignored syscall.
In the next commit I'll change the rules to skip the ignored syscalls.
Use a yaml parsing lib to parse a yaml file consisting of a list of
macros and rules, like:
- macro: bin_dir
condition: fd.directory in (/bin, /sbin, /usr/bin, /usr/sbin)
- macro: core_binaries
condition: proc.name in (ls, mkdir, cat, less, ps)
- condition: (fd.typechar = 4 or fd.typechar=6) and core_binaries
output: "%evt.time: %proc.name network with %fd.l4proto"
- condition: evt.type = write and bin_dir
output: "%evt.time: System binary modified (file '%fd.filename' written by process %proc.name)"
- condition: container.id != host and proc.name = bash
output: "%evt.time: Shell running in container (%proc.name, %container.id)"
Rather than running include_directories() on the whole sysdig repo,
just do it for driver, libscap, and libsinsp.
This is a step on the way to building a digwatch package.
As pointed out by Loris, timestamping output messages should be a
responsibility of the output/collection system.
So as a first step towards this, add timestamps automatically for output
formats, and remove them from rules.
The high-level change is that events matching a rule are now sent to a
lua "on_event" function for handling, rather than doing the handling
down in c++.
More specifics:
Before, the lua "load_rule" function registered formatters with
associated IDs with the c++ side, which later used this state to
reconcile events with formats and print output accordingly.
Now, no such state is kept on the c++ side. The lua "load_rule" function
maintains the id->formatters map, and uses it to print outputs when it
receives events.
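A sketch of the c++ side of this handoff via the lua C API (ls, ev and
rule_id are assumed to be in scope; names illustrative):

    lua_getglobal(ls, "on_event");
    lua_pushlightuserdata(ls, ev);
    lua_pushnumber(ls, rule_id);
    if(lua_pcall(ls, 2, 0, 0) != 0)
    {
        // on error, lua leaves an error string at the top of the stack
    }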
This change simplifies the existing flow and will also make the
forthcoming implementation of function outputs far simpler than it
would have been in the current setup.
This change adds syntax support for function call outputs. For example:
... | syslog(evt, WARN)
Regular outputs are still allowed and parsed in the same way.
Before this change, events were only printed if they had all the
fields (same behavior as with sysdig when the output format doesn't have
a leading "*"). With this change, all events are printed; those that
don't have all fields are prefixed with a notification.
Move compiler loading out of libsinsp/lua_parser.cpp and into a new
class in digwatch/rules.cpp.
This way the libsinsp support is strictly about providing a lua API for
scripts to set up filters. Loading the actual parser and rules is logic
that belongs in the app (digwatch in this case, maybe sysdig down the
line) rather than there.