The trace file traces-positive/run-shell-untrusted.scap has an old
execve event number (PPME_SYSCALL_EXECVE_18), which was replaced by
PPME_SYSCALL_EXECVE_19 in 2018.
Given the changes in https://github.com/falcosecurity/libs/pull/94,
these events are now skipped. So change the test to note that *no*
events will be detected.
As a bit of context, event numbers won't be changing any longer--a
change around the same time 298fbde8029020ce3fbddd07e2910b59cc402b8b
allowed for extending existing events to add new parameters instead of
having to define a new event number just to add a new parameter. So
the notion of "old events" should not exist for any event created
after mid-to-late 2018.
Signed-off-by: Mark Stemm <mark.stemm@gmail.com>
options
The following options have been added:
* -v (verbose)
* -p (prepare falco_traces test suite)
* -b (specify custom branch for downloading trace files)
* -d (specify the build directory)
Signed-off-by: Leonardo Di Donato <leodidonato@gmail.com>
* Update engine fields checksum for fd.dev.*
New fields fd.dev.*, so updating the fields checksum.
* Print a message why the trace file can't be read.
At debug level only, but better than nothing.
* Adjust tests to match new container_started macro
Now that the container_started macro works either on the container event
or the first process being spawned in a container, we need to adjust the
counts for some rules to handle both cases.
* Skip incomplete container info for container start
In the container_started macro, ensure that the container metadata is
complete after either the container event (very unlikely) or after the
exec of the first process into the container (very likely now that
container metadata fetches are async).
When using these rules with older falco versions, this macro will still
work as the synchronous container metadata fetch will result in a
repository that isn't "incomplete".
* Update test traces to have full container info
Some test trace files used for regression tests didn't have full
container info, and once we started looking for those fields, the tests
stopped working.
So update the traces, and event counts to match.
* Use correct copyright years.
Also include the start year.
* Improve copyright notices.
Use the proper start year instead of just 2018.
Add the right owner Draios dba Sysdig.
Add copyright notices to some files that were missing them.
Replace references to GNU Public License to Apache license in:
- COPYING file
- README
- all source code below falco
- rules files
- rules and code below test directory
- code below falco directory
- entrypoint for docker containers (but not the Dockerfiles)
I didn't generally add copyright notices to all the examples files, as
they aren't core falco. If they did refer to the gpl I changed them to
apache.
* Let OMS agent for linux write config
Programs are omiagent/omsagent/PerformInventor/in_heartbeat_r* and files
are below /etc/opt/omi and /etc/opt/microsoft/omsagent.
* Handle really long classpath lines for cassandra
Some cassandra cmdlines are so long the classpath truncates the cmdline
before the actual entry class gets named. In those cases also look for
cassandra-specific config options.
* Let postgres binaries read sensitive files
Also add a couple of postgres cluster management programs.
* Add apt-add-reposit(ory) as a debian mgmt program
* Add addl info to debug writing sensitive files
Add parent/grandparent process info.
* Requrire root directory files to contain /
In some cases, a file below root might be detected but the file itself
has no directory component at all. This might be a bug with dropped
events. Make the test more strict by requiring that the file actually
contains a "/".
* Let updmap read sensitive files
Part of texlive (https://www.tug.org/texlive/)
* For selected rules, require proc name to exist
Some rules such as reading sensitive files and writing below etc have
many exceptions that depend on the process name. In very busy
environments, system call events might end up being dropped, which
causes the process name to be missing.
In these cases, we'll let the sensitive file read/write below etc to
occur. That's handled by a macro proc_name_exists, which ensures that
proc.name is not "<NA>" (the placeholder when it doesn't exist).
* Let ucf write generally below /etc
ucf is a general purpose config copying program, so let it generally
write below /etc, as long as it in turn is run by the apt program
"frontend".
* Add new conf writers for couchdb/texmf/slapadd
Each has specific subdirectories below /etc
* Let sed write to addl temp files below /etc
Let sed write to additional temporary files (some directory + "sed")
below /etc. All generally related to package installation scripts.
* Let rabbitmq(ctl) spawn limited shells
Let rabbitmq spawn limited shells that perform read-only tasks like
reading processes/ifaces.
Let rabbitmqctl generally spawn shells.
* Let redis run startup/shutdown scripts
Let redis run specific startup/shutdown scripts that trigger at
start/stop. They generally reside below /etc/redis, but just looking for
the names redis-server.{pre,post}-up in the commandline.
* Let erlexec spawn shells
https://github.com/saleyn/erlexec, "Execute and control OS processes
from Erlang/OTP."
* Handle updated trace files
As a part of these changes, we updated some of the positive trace files
to properly include a process name. These newer trace files have
additional opens, so update the expected event counts to match.
* Let yum-debug-dump write to rpm database
* Additional config writers
Symantec AV for Linux, sosreport, semodule (selinux), all with their
config files.
* Tidy up comments a bit.
* Try protecting node apps again
Try improving coverage of run shell untrusted by looking for shells
below node processes again. Want to see how many FPs this causes before
fully committing to it.
* Let node run directly by docker count as a service
Generally, we don't want to consider all uses of node as a service wrt
spawned shells. But we might be able to consider node run directly by
docker as a "service". So add that to protected_shell_spawner.
* Also add PM2 as a protected shell spawner
This should handle cases where PM2 manages node apps.
* Remove dangling macros/lists
Do a pass over the set of macros/lists, removing most of those that are
no longer referred to by any macro/list. The bulk of the macros/lists
were related to the rule Run Shell Untrusted, which was refactored to
only detect shells run below specific programs. With that change, many
of these exceptions were no longer neeeded.
* Add a "never_true" macro
Add a never_true macro that will never match any event. Useful if you
want to disable a rule/macro/etc.
* Add missing case to write_below_etc
Add the macro veritas_writing_config to write_below_etc, which was
mistakenly not added before.
* Make tracking shells spawned by node optional
The change to generally consider node run directly in a container as a
protected shell spawner was too permissive, causing false
positives. However, there are some deployments that want to track shells
spawned by node as suspect. To address this, create a macro
possibly_node_in_container which defaults to never matching (via the
never_true) macro. In a user rules file, you can override the macro to
remove the never_true clause, reverting to the old behavior.
* Add some dangling macros/lists back
Some macros/lists are still referred to by some widely used user rules
files, so add them back temporarily.
* Refactor shell rules to avoid FPs.
Refactoring the shell related rules to avoid FPs. Instead of considering
all shells suspicious and trying to carve out exceptions for the
legitimate uses of shells, only consider shells spawned below certain
processes suspicious.
The set of processes is a collection of commonly used web servers,
databases, nosql document stores, mail programs, message queues, process
monitors, application servers, etc.
Also, runsv is also considered a top level process that denotes a
service. This allows a way for more flexible servers like ad-hoc nodejs
express apps, etc to denote themselves as a full server process.
* Update event generator to reflect new shell rules
spawn_shell is now a silent action. its replacement is
spawn_shell_under_httpd, which respawns itself as httpd and then runs a
shell.
db_program_spawn_binaries now runs ls instead of a shell so it only
matches db_program_spawn_process.
* Comment out old shell related rules
* Modify nodejs example to work w/ new shell rules
Start the express server using runit's runsv, which allows falco to
consider any shells run by it as suspicious.
* Use the updated argument for mkdir
In https://github.com/draios/sysdig/pull/757 the path argument for mkdir
moved to the second argument. This only became visible in the unit tests
once the trace files were updated to reflect the other shell rule
changes--the trace files had the old format.
* Update unit tests for shell rules changes
Shell in container doesn't exist any longer and its functionality has
been subsumed by run shell untrusted.
* Allow git binaries to run shells
In some cases, these are run below a service runsv so we still need
exceptions for them.
* Let consul agent spawn curl for health checks
* Don't protect tomcat
There's enough evidence of people spawning general commands that we
can't protect it.
* Reorder exceptions, add rabbitmq exception
Move the nginx exception to the main rule instead of the
protected_shell_spawner macro. Also add erl_child_setup (related to
rabbitmq) as an allowed shell spawner.
* Add additional spawn binaries
All off these are either below nginx, httpd, or runsv but should still
be allowed to spawn shells.
* Exclude shells when ancestor is a pkg mgmt binary
Skip shells when any process ancestor (parent, gparent, etc) is a
package management binary. This includes the program needrestart. This
is a deep search but should prevent a lot of other more detailed
exceptions trying to find the specific scripts run as a part of
installations.
* Skip shells related to serf
Serf is a service discovery tool and can in some cases be spawned by
apache/nginx. Also allow shells that are just checking the status of
pids via kill -0.
* Add several exclusions back
Add several exclusions back from the shell in container rule. These are
all allowed shell spawns that happen to be below
nginx/fluentd/apache/etc.
* Remove commented-out rules
This saves space as well as cleanup. I haven't yet removed the
macros/lists used by these rules and not used anywhere else. I'll do
that cleanup in a separate step.
* Also exclude based on command lines
Add back the exclusions based on command lines, using the existing set
of command lines.
* Add addl exclusions for shells
Of note is runsv, which means it can directly run shells (the ./run and
./finish scripts), but the things it runs can not.
* Don't trigger on shells spawning shells
We'll detect the first shell and not any other shells it spawns.
* Allow "runc:" parents to count as a cont entrypnt
In some cases, the initial process for a container can have a parent
"runc:[0:PARENT]", so also allow those cases to count as a container
entrypoint.
* Use container_entrypoint macro
Use the container_entrypoint macro to denote entering a container and
also allow exe to be one of the processes that's the parent of an
entrypoint.
* Updates from beta customers.
- add anacron as a cron program
* Reorganize package management binaries
Split package_management_binaries into two separate lists rpm_binaries
and deb_binaries. unattended-upgr is common to both worlds so it's still
in package_management_binaries.
Also change Write below rpm database to use rpm_binaries instead of its
own list.
Also add 75-system-updat (truncated) as a shell spawner.
* Add rules for jenkins
Add rules that allow jenkins to spawn shells, both in containers and
directly on the host.
Also handle jenkins slaves that run /tmp/slave.jar.
* Allow npm to run shells.
Not yet allowing node to run shells itself, although we want to add
something to reduce node-related FPs.
* Allow urlgrabber/git-remote to access /etc
urlgrabber and git-remote both try to access the RHEL nss database,
containing shared certificates. I may change this in a more general way
by changing open_read/open_write to only look for successful opens.
* Only look for successful open_read/open_writes
Change the macros open_read/open_write to only trigger on successful
opens (when fd.num > 0). This is a pretty big change to behavior, but
is more intuitive.
This required a small update to the open counts for a couple of unit
tests, but otherwise they still all passed with this change.
* Allow rename_device to write below /dev
Part of udev.
* Allow cloud-init to spawn shells.
Part of https://cloud-init.io/
* Allow python to run a shell that runs sdchecks
sdchecks is a part of the sysdig monitor agent.
* Allow dev creation binaries to write below etc.
Specifically this includes blkid and /etc/blkid/blkid.tab.
* Allow git binaries to spawn shells.
They were already allowed to run shells in a container.
* Add /dev/kmsg as an allowed /dev file
Allows userspace programs to write to kernel log.
* Allow other make programs to spawn shells.
Also allow gmake/cmake to spawn shells and put them in their own list
make_binaries.
* Add better mesos support.
Mesos slaves appear to be in a container due to their cgroup and can run
programs mesos-health-check/mesos-docker-exec to monitor the containers
on the slave, so allow them to run shells.
Add mesos-agent, mesos-logrotate, mesos-fetch as shell spawners both in
and out of containers.
Add gen_resolvconf. (short for gen_resolvconf.py) as a program that can
write to /etc.
Add toybox (used by mesos, part of http://landley.net/toybox/about.html)
as a shell spawner.
* systemd can listen on network ports.
Systemd can listen on network ports to launch daemons on demand, so
allow it to perform network activity.
* Let docker binaries setuid.
Let docker binaries setuid and add docker-entrypoi (truncation
intentional) to the set of docker binaries.
* Change cis-related rules to be less noisy
Change the two cis-related falco rules "File Open by Privileged
Container" and "Sensitive Mount by Container" to be less noisy. We found
in practice that tracking every open still results in too many falco
notifications.
For now, change the rules to only track the initial process start in the
container by looking for vpid=1. This should result in only triggering
when a privileged/sensitive mount container is started. This is slightly
less coverage but is far less noisy.
* Add quay.io/sysdig as trusted containers
These are used for sysdig cloud onpremise deployments.
* Add gitlab-runner-b(uild) as a gitlab binary.
Add gitlab-runner-b (truncated gitlab-runner-build) as a gitlab binary.
* Add ceph as a shell spawner.
Also allow ceph to spawn shells in a container.
* Allow some shells by command line.
For some mesos containers, where the container doesn't have an image and
is just a tarball in a cgroup/namespace, we don't have any image to work
with. In those cases, allow specific command lines.
* Allow user 'nobody' to setuid.
Allow the user nobody to setuid. This depends on the user nobody being
set up in the first place to have no access, but that should be an ok
assumption.
* Additional allowed shell commandlines
* Add additional shells.
* Allow multiple users to become themself.
Add rule somebody_becoming_themself that handles cases of nobody and
www-data trying to setuid to themself. The sysdig filter language
doesn't support template/variable values to allow "user.name=X and
evt.arg.uid=X for a given X", so we have to enumerate the users.
* More known spawn command lines
* Let make binaries be run in containers.
Some CI/CD pipelines build in containers.
* Add additional shell spawning command lines
* Add additional apt program apt-listchanges.
* Add gitlab-ce as shell spawning container.
* Allow PM2 to spawn shells in containers.
Was already in the general list, seen in some customers, so adding to
the in containers list.
* Clean up pass to fix long lines.
Take a pass through the rules making sure each line is < 120 characters.
* Change tests for privileged container rules.
Change unit tests to reflect the new privileged/sensitive mount
container rules that only detect container launch.
The default falco ruleset now has a wider variety of priorities, so
adjust the automated tests to match:
- Instead of creating a generic test yaml entry for every trace file in
traces-{positive,negative,info} with assumptions about detect levels,
add a new falco_traces.yaml.in multiplex file that has specific
information about the detect priorities and rule detect counts for each
trace file.
- If a given trace file doesn't have a corresponding entry in
falco_traces.yaml.in, a generic entry is added with a simple
detect: (True|False) value and level. That way you can get specific
detect levels/counts for existing trace files, but if you forget to
add a trace to falco_traces.yaml.in, you'll still get some coverage.
- falco_tests.yaml.in isn't added to any longer, so rename it to
falco_tests.yaml.
- Avocado is now run twice--once on each yaml file. The final test
passes if both avocado runs pass.