Now the context doesn't have a column pointer but it either a line for
yaml parse errors or an individual object (rule, macro, etc) from the
parsed yaml.
Ensure that compiling filters for rules or macros doesn't result in
throwing lua errors. Instead, return a bool status + the return
value(s). If the status is false, the next return value is an error
message.
Change the semantics of lua load_rules to return a
successful/unsuccessful status instead of throwing errors when loading
fails. On success, load_rules returns [true, required_engine_version]
and on failure load_rules returns [false, row, col, error string]. The
row/col will be used to include context on error.
Falco's output itself now prints validation results to stdout, which
makes it easier to capture the output and pass it along. Log messages
continue to go to stderr.
Still need to finish going through load_rules and return all errors
directly instead of throwing a lua error.
If the rules file can't be parsed as yaml, lyaml returns a line and
column number. Add some context showing the lines around the line number
and a pointer to the column.
To speed up list expansion, instead of using regexes to replace a list
name with its contents, do string searches followed by examining the
preceding/following characters for the proper delimiter.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
We shouldn't need to clean up strings via a cleanup function and don't
need to do it via a bunch of string.gsub() functions.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Instead of iterating over the entire list of filters and doing pattern
matches against each defined filter, perform table lookups.
For filters that take arguments e.g. proc.aname[3] or evt.arg.xxx, split
the filtercheck string on bracket/dot and check the values against a
table.
There are now two tables of defined filters: defined_arg_filters and
defined_noarg_filters. Each filter is put into a table depending on
whether the filter takes an argument or not.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Json-related filtercheck fields supported indexing with brackets, but
when looking at the field descriptions you couldn't tell if a field
allowed an index, required an index, or did not allow an index.
This information was available, but it was a part of the protected
aliases map within the class.
Move this to the public field information so it can be used outside the
class.
Also add m_ prefixes for member names, now that the struct isn't
trivial.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
Add more accurate tracking of the number of falco rules loaded per
ruleset, which are made available via the engine method
::num_rules_for_ruleset().
In the ruleset objects, keep track if a filter wrapper is actually
added/removed and if so increment/decrement the count.
For a while, falco has set the inspector drop mode to 1, which should
discard several classes of events that weren't necessary to use most
falco rules.
However, it was mistakenly being called before the inspector was opened,
which meant it wasn't actually doing anything.
Fix this by setting the dropping mode after the inspector open.
On some spot testing on a moderately loaded environment, this results in
a 30-40% drop in the number of system calls processed per second, and
should result in a nice boost in performance.
* Update engine fields checksum for fd.dev.*
New fields fd.dev.*, so updating the fields checksum.
* Print a message why the trace file can't be read.
At debug level only, but better than nothing.
* Adjust tests to match new container_started macro
Now that the container_started macro works either on the container event
or the first process being spawned in a container, we need to adjust the
counts for some rules to handle both cases.
* Add option to display times in ISO 8601 UTC
ISO 8601 time is useful when, say, running falco in a container, which
may have a different /etc/localtime than the host system.
A new config option time_format_iso_8601 controls whether log message
and event times are displayed in ISO 8601 in UTC or in local time. The
default is false (display times in local time).
This option is passed to logger init as well as outputs. For outputs it
eventually changes the time format field from %evt.time/%jevt.time to
%evt.time.iso8601/%jevt.time.iso8601.
Adding this field changes the falco engine version so increment it.
This depends on https://github.com/draios/sysdig/pull/1317.
* Unit test for ISO 8601 output
A unit test for ISO 8601 output ensures that both the log and event time
is in ISO 8601 format.
* Use ISO 8601 output by default in containers
Now that we have an option that controls iso 8601 output, use it by
default in containers. We do this by changing the value of
time_format_iso_8601 in falco.yaml in the container.
* Handle errors in strftime/asctime/gmtime
A placeholder "N/A" is used in log messages instead.
When creating syscall event drop alerts, instead of including just the
total and dropped event count, include all possible causes of drops as
well as whether bpf is enabled.
* Make stats file interval configurable
New argument --stats_interval=<msec> controls the interval at which
statistics are written to the stats file. The default is 5000 ms (5 sec)
which matches the prior hardcoded interval.
The stats interval is triggered via signals, so an interval below ~250ms
will probably interfere with falco's behavior.
* Add ability to emit general purpose messages
A new method falco_outputs::handle_msg allows emitting generic messages
that have a "rule", message, and output fields, but aren't exactly tied
to any event and aren't passed through an event formatter.
This allows falco to emit "events" based on internal checks like kernel
buffer overflow detection.
* Clean up newline handling for logging
Log messages from falco_logger::log may or may not have trailing
newlines. Handle both by always adding a newline to stderr logs and
always removing any newline from syslog logs.
* Add method to get sequence from subkey
New variant of get_sequence that allows fetching a list of items from a
key + subkey, for example:
key:
subkey:
- list
- items
- here
Both use a shared method get_sequence_from_node().
* Monitor syscall event drops + optional actions
Start actively monitoring the kernel buffer for syscall event drops,
which are visible in scap_stats.n_drops, and add the ability
to take actions when events are dropped. The -v (verbose) and
-s (stats filename) arguments also print out information on dropped
events, but they were only printed/logged without any actions.
In falco config you can specify one or more of the following actions to
take when falco notes system call drops:
- ignore (do nothing)
- log a critical message
- emit an "internal" falco alert. It looks like any other alert with a
time, "rule", message, and output fields but is not related to any
rule in falco_rules.yaml/other rules files.
- exit falco (the idea being that the restart would be monitored
elsewhere).
A new module syscall_event_drop_mgr is called for every event and
collects scap stats every second. If in the prior second there were
drops, perform_actions() handles the actions.
To prevent potential flooding in high drop rate environments, actions
are goverened by a token bucket with a rate of 1 actions per 30 seconds,
with a max burst of 10 seconds. We might tune this later based on
experience in busy environments.
This might be considered a fix for
https://github.com/falcosecurity/falco/issues/545. It doesn't
specifically flag falco rules alerts when there are drops, but does
make it easier to notice when there are drops.
* Add unit test for syscall event drop detection
Add unit tests for syscall event drop detection. First, add an optional
config option that artifically increments the drop count every
second. (This is only used for testing).
Then add test cases for each of the following:
- No dropped events: should not see any log messages or alerts.
- ignore action: should note the drops but not log messages or alert.
- log action: should only see log messages for the dropped events.
- alert action: should only see alerts for the dropped events.
- exit action: should see log message noting the dropped event and exit
with rc=1
A new trace file ping_sendto.scap has 10 seconds worth of events to
allow the periodic tracking of drops to kick in.
* Add support for container metaevent to detect container spawning
Create a new macro "container_started" to check both the old and
the new check.
Also, only look for execve exit events with vpid=1.
* Use TBB_INCLUDE_DIR for consistency w sysdig,agent
Previously it was a mix of TBB_INCLUDE and TBB_INCLUDE_DIR.
* Build using matching sysdig branch, if exists
* Expose required_engine_version when loading rules
When loading a rules file, have alternate methods that return the
required_engine_version. The existing methods remain unchanged and just
call the new methods with a dummy placeholder.
* Add --support argument to print support bundle
Add an argument --support that can be used as a single way to collect
necessary support information, including the falco version, config,
commandline, and all rules files.
There might be a big of extra structure to the rules files, as they
actually support an array of "variants", but we're thinking ahead to
cases where there might be a comprehensive library of rules files and
choices, so we're adding the extra structure.
* Add ability to print field names only
Add ability to print field names only instead of all information about
fields (description, etc) using -N cmdline option.
This will be used to add some versioning support steps that check for a
changed set of fields.
* Add an engine version that changes w/ filter flds
Add a method falco_engine::engine_version() that returns the current
engine version (e.g. set of supported fields, rules objects, operators,
etc.). It's defined in falco_engine_version.h, starts at 2 and should be
updated whenever a breaking change is made.
The most common reason for an engine change will be an update to the set
of filter fields. To make this easy to diagnose, add a build time check
that compares the sha256 output of "falco --list -N" against a value
that's embedded in falco_engine_version.h. A mismatch fails the build.
* Check engine version when loading rules
A rules file can now have a field "required_engine_version N". If
present, the number is compared to the falco engine version. If the
falco engine version is less, an error is thrown.
* Unit tests for engine versioning
Add a required version: 2 to one trace file to check the positive case
and add a new test that verifies that a too-new rules file won't be loaded.
* Rename falco test docker image
Rename sysdig/falco to falcosecurity/falco in unit tests.
* Don't pin falco_rules.yaml to an engine version
Currently, falco_rules.yaml is compatible with versions <= 0.13.1 other
than the required_engine_version object itself, so keep that line
commented out so users can use this rules file with older falco
versions.
We'll uncomment it with the first incompatible falco engine change.
* Allow SSL for k8s audit endpoint
Allow enabling SSL for the Kubernetes audit log web server. This
required adding two new configuration options: webserver.ssl_enabled and
webserver.ssl_certificate. To enable SSL add the below to the webserver
section of the falco.yaml config:
webserver:
enabled: true
listen_port: 8765s
k8s_audit_endpoint: /k8s_audit
ssl_enabled: true
ssl_certificate: /etc/falco/falco.pem
Note that the port number has an s appended to indicate SSL
for the port which is how civetweb expects SSL ports be denoted. We
could change this to dynamically add the s if ssl_enabled: true.
The ssl_certificate is a combination SSL Certificate and corresponding
key contained in a single file. You can generate a key/cert as follows:
$ openssl req -newkey rsa:2048 -nodes -keyout key.pem -x509 -days 365 -out certificate.pem
$ cat certificate.pem key.pem > falco.pem
$ sudo cp falco.pem /etc/falco/falco.pem
fix ssl option handling
* Add notes on how to create ssl certificate
Add notes on how to create the ssl certificate to the config comments.
In the common case, falco doesn't generate much output, so it's
desirable to not buffer it in case you're tail -fing some logs.
So change the default for buffered outputs to false.
As per https://github.com/draios/sysdig/pull/1275, the gen_event class
mandate the implementation of two new methods.
This change aims to simplify the implementation of a generic event
processing infrastructure, that could handle both sinsp and json
events.
The -Wextra compile-time option will enable additional diagnostic
warnigns. The -Werror option will cause the compiler to treat warnings
as errors. This change adds a build time option,
BUILD_WARNINGS_AS_ERRORS, to conditionally enable those flags. Note
that depending on the compiler you're using, if you enable this option,
compilation may fail (some compiler version have additional warnings
that have not yet been resolved).
Testing with these options in place identified a destructor that was
throwing an exception. C++11 doesn't allow destructors to throw
exceptions, so those throw's would have resulted in calls to
terminate(). I replace them with an error log and a call to assert().
It's possible to call event_tags_for_ruleset/evttypes_for_ruleset for a
ruleset that hasn't been loaded. In this case, it's possible to go past
the end of the m_rulesets array.
After fixing that, it's also possible to go past the end of the
event_tags array in event_tags_for_ruleset().
So in both cases, check the index against the array size before
indexing.
* Add new json/webserver libs, embedded webserver
Add two new external libraries:
- nlohmann-json is a better json library that has stronger use of c++
features like type deduction, better conversion from stl structures,
etc. We'll use it to hold generic json objects instead of jsoncpp.
- civetweb is an embeddable webserver that will allow us to accept
posted json data.
New files webserver.{cpp,h} start an embedded webserver that listens for
POSTS on a configurable url and passes the json data to the falco
engine.
New falco config items are under webserver:
- enabled: true|false. Whether to start the embedded webserver or not.
- listen_port. Port that webserver listens on
- k8s_audit_endpoint: uri on which to accept POSTed k8s audit events.
(This commit doesn't compile entirely on its own, but we're grouping
these related changes into one commit for clarity).
* Don't use relative paths to find lua code
You can look directly below PROJECT_SOURCE_DIR.
* Reorganize compiler lua code
The lua compiler code is generic enough to work on more than just
sinsp-based rules, so move the parts of the compiler related to event
types and filterchecks out into a standalone lua file
sinsp_rule_utils.lua.
The checks for event types/filterchecks are now done from rule_loader,
and are dependent on a "source" attribute of the rule being
"sinsp". We'll be adding additional types of events next that come from
sources other than system calls.
* Manage separate syscall/k8s audit rulesets
Add the ability to manage separate sets of rules (syscall and
k8s_audit). Stop using the sinsp_evttype_filter object from the sysdig
repo, replacing it with falco_ruleset/falco_sinsp_ruleset from
ruleset.{cpp,h}. It has the same methods to add rules, associate them
with rulesets, and (for syscall) quickly find the relevant rules for a
given syscall/event type.
At the falco engine level, there are new parallel interfaces for both
types of rules (syscall and k8s_audit) to:
- add a rule: add_k8s_audit_filter/add_sinsp_filter
- match an event against rules, possibly returning a result:
process_sinsp_event/process_k8s_audit_event
At the rule loading level, the mechanics of creating filterchecks
objects is handled two factories (sinsp_filter_factory and
json_event_filter_factory), both of which are held by the engine.
* Handle multiple rule types when parsing rules
Modify the steps of parsing a rule's filter expression to handle
multiple types of rules. Notable changes:
- In the rule loader/ast traversal, pass a filter api object down,
which is passed back up in the lua parser api calls like nest(),
bool_op(), rel_expr(), etc.
- The filter api object is either the sinsp factory or k8s audit
factory, depending on the rule type.
- When the rule is complete, the complete filter is passed to the
engine using either add_sinsp_filter()/add_k8s_audit_filter().
* Add multiple output formatting types
Add support for multiple output formatters. Notable changes:
- The falco engine is passed along to falco_formats to gain access to
the engine's factories.
- When creating a formatter, the source of the rule is passed along
with the format string, which controls which kind of output formatter
is created.
Also clean up exception handling a bit so all lua callbacks catch all
exceptions and convert them into lua errors.
* Add support for json, k8s audit filter fields
With some corresponding changes in sysdig, you can now create general
purpose filter fields and events, which can be tied together with
nesting, expressions, and relational operators. The classes here
represent an instance of these fields devoted to generic json objects as
well as k8s audit events. Notable changes:
- json_event: holds a json object, used by all of the below
- json_event_filter_check: Has the ability to extract values out of a
json_event object and has the ability to define macros that associate
a field like "group.field" with a json pointer expression that
extracts a single property's value out of the json object. The basic
field definition also allows creating an index
e.g. group.field[index], where a std::function is responsible for
performing the indexing. This class has virtual void methods so it
must be overridden.
- jevt_filter_check: subclass of json_event_filter_check and defines
the following fields:
- jevt.time/jevt.rawtime: extracts the time from the underlying json object.
- jevt.value[<json pointer>]: general purpose way to extract any
json value out of the underlying object. <json pointer> is a json
pointer expression
- jevt.obj: Return the entire object, stringified.
- k8s_audit_filter_check: implements fields that extract values from
k8s audit events. Most of the implementation is in the form of macros
like ka.user.name, ka.uri, ka.target.name, etc. that just use json
pointers to extact the appropriate value from a k8s audit event. More
advanced fields like ka.uri.param, ka.req.container.image use
indexing to extract individual values out of maps or arrays.
- json_event_filter_factory: used by things like the lua parser api,
output formatter, etc to create the necessary objects and return
them.
- json_event_formatter: given a format string, create the necessary
fields that will be used to create a resolved string when given a
json_event object.
* Add ability to list fields
Similar to sysdig's -l option, add --list (<source>) to list the fields
supported by falco. With no source specified, will print all
fields. Source can be "syscall" for inspector fields e.g. what is
supported by sysdig, or "k8s_audit" to list fields supported only by the
k8s audit support in falco.
* Initial set of k8s audit rules
Add an initial set of k8s audit rules. They're broken into 3 classes of
rules:
- Suspicious activity: this includes things like:
- A disallowed k8s user performing an operation
- A disallowed container being used in a pod.
- A pod created with a privileged pod.
- A pod created with a sensitive mount.
- A pod using host networking
- Creating a NodePort Service
- A configmap containing private credentials
- A request being made by an unauthenticated user.
- Attach/exec to a pod. (We eventually want to also do privileged
pods, but that will require some state management that we don't
currently have).
- Creating a new namespace outside of an allowed set
- Creating a pod in either of the kube-system/kube-public namespaces
- Creating a serviceaccount in either of the kube-system/kube-public
namespaces
- Modifying any role starting with "system:"
- Creating a clusterrolebinding to the cluster-admin role
- Creating a role that wildcards verbs or resources
- Creating a role with writable permissions/pod exec permissions.
- Resource tracking. This includes noting when a deployment, service,
- configmap, cluster role, service account, etc are created or destroyed.
- Audit tracking: This tracks all audit events.
To support these rules, add macros/new indexing functions as needed to
support the required fields and ways to index the results.
* Add ability to read trace files of k8s audit evts
Expand the use of the -e flag to cover both .scap files containing
system calls as well as jsonl files containing k8s audit events:
If a trace file is specified, first try to read it using the
inspector. If that throws an exception, try to read the first line as
json. If both fail, return an error.
Based on the results of the open, the main loop either calls
do_inspect(), looping over system events, or
read_k8s_audit_trace_file(), reading each line as json and passing it to
the engine and outputs.
* Example showing how to enable k8s audit logs.
An example of how to enable k8s audit logging for minikube.
* Add unit tests for k8s audit support
Initial unit test support for k8s audit events. A new multiplex file
falco_k8s_audit_tests.yaml defines the tests. Traces (jsonl files) are
in trace_files/k8s_audit and new rules files are in
test/rules/k8s_audit.
Current test cases include:
- User outside allowed set
- Creating disallowed pod.
- Creating a pod explicitly on the allowed list
- Creating a pod w/ a privileged container (or second container), or a
pod with no privileged container.
- Creating a pod w/ a sensitive mount container (or second container), or a
pod with no sensitive mount.
- Cases for a trace w/o the relevant property + the container being
trusted, and hostnetwork tests.
- Tests that create a Service w/ and w/o a NodePort type.
- Tests for configmaps: tries each disallowed string, ensuring each is
detected, and the other has a configmap with no disallowed string,
ensuring it is not detected.
- The anonymous user creating a namespace.
- Tests for all kactivity rules e.g. those that create/delete
resources as compared to suspicious activity.
- Exec/Attach to Pod
- Creating a namespace outside of an allowed set
- Creating a pod/serviceaccount in kube-system/kube-public namespaces
- Deleting/modifying a system cluster role
- Creating a binding to the cluster-admin role
- Creating a cluster role binding that wildcards verbs or resources
- Creating a cluster role with write/pod exec privileges
* Don't manually install gcc 4.8
gcc 4.8 should already be installed by default on the vm we use for
travis.