log2ban

Parse a log stream and track per-IP-address anomaly scores. An ipset is maintained for blocking subsequent access according to the configured ruleset and threshold.

Public-facing services such as webservers are persistently exposed to opportunistic malicious probing. Such mostly automated requests are a continuous annoyance, needlessly waste resources, and may pose a security risk over time. For these reasons, tools such as fail2ban parse log files and ban offending IP addresses at the firewall layer.

By giving the watched daemons control over reporting which addresses misbehave, this forms an el-cheapo WAF/IDS to some extent, just asynchronously relying on access or error logs. Apart from reading the systemd journal, fail2ban uses polling of on-disk log files and timestamp parsing for continuation.

If such logs are not desirable or available, one alternative is to monitor the log stream directly in real time from a pipe. Thus, log2ban similarly evaluates a given ruleset against the input stream from a named pipe and maintains “anomaly scores” and a firewall ipset accordingly.

single, self-contained Python script without any dependencies
monitor single log stream from a fifo or stdin, as provided by a syslog or service daemon
batch multiple ipset executions in one restore operation, relying on builtin set timeouts
multiple rulesets, single per-IP anomaly score
simple and transparent configuration

Usage

The script can be directly invoked and contains all relevant parts: Parse the input stream, maintain internal counters, and perform ipset actions. Both input stream and ipset can be created on-demand, which is often not optimal though, as logging daemons and firewall rules might depend on them beforehand.

Updates are applied at a configurable interval, which allows merging multiple ipset operations into a single restore call. Note that the ban timeout is delegated to the ipset itself; see below for an example of creating a set with the timeout argument.
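
As an illustration of the batching idea, the following is a minimal sketch of how pending changes could be merged into the input for one `ipset restore` call. The helper name `build_restore_input` and the sample addresses are made up for this example and are not taken from log2ban itself; only the `add`/`del` line format and the `-exist` option are standard ipset syntax.

```python
def build_restore_input(set_name, to_add, to_remove):
    """Return the text for a single `ipset restore` call (hypothetical helper)."""
    # One command per line, in the same syntax that `ipset save` emits
    lines = [f"add {set_name} {addr}" for addr in sorted(to_add)]
    lines += [f"del {set_name} {addr}" for addr in sorted(to_remove)]
    return "".join(line + "\n" for line in lines)


payload = build_restore_input(
    "logbanset",
    to_add={"203.0.113.88", "198.51.100.7"},
    to_remove={"192.0.2.1"},
)
print(payload, end="")

# Applying it needs elevated permissions, so it is only hinted at here:
# subprocess.run(["/usr/sbin/ipset", "-exist", "restore"],
#                input=payload, text=True, check=True)
```

With `-exist`, adding an entry that is already present or deleting a missing one does not abort the batch, and no per-entry timeout is given because the set's own default timeout applies.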

Whether an IP address is added to or removed from this set depends on the configured threshold. As negative scores (when used) can also accumulate to build up a “trust budget”, there is a configurable minimum, too. The score is tracked internally up to the given timeout and limit.
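
The interplay of threshold and minimum can be sketched roughly as follows. This is an assumption-laden illustration, not the script's actual code: the function `update_score` is hypothetical, and the score timeout and address limit are left out for brevity.

```python
def update_score(scores, address, delta, score_min=0, threshold=0):
    """Add delta to the address score, clamp it at score_min, and return
    whether the address belongs in the set afterwards (hypothetical helper)."""
    score = max(score_min, scores.get(address, 0) + delta)
    scores[address] = score
    # "add address when above, remove otherwise"
    return score > threshold


scores = {}
addr = "203.0.113.88"

# Many benign requests cannot build unlimited trust: -9 is clamped to -5
update_score(scores, addr, -9, score_min=-5, threshold=5)
trust_budget = scores[addr]

# Four suspicious requests with score 3 each: -2, 1, 4, 7
results = [update_score(scores, addr, 3, score_min=-5, threshold=5) for _ in range(4)]
print(trust_budget, scores[addr], results)
```

In this run the address only exceeds the threshold of 5 on the fourth penalty, because the earlier benign traffic had to be "used up" first.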

usage: log2ban.py [-h] [--dry-run] --input STREAM [--input-create] --ruleset INI
                  --ipset NAME [--ipset-create] [--ipset-create-maxelem NUM] [--ipset-create-timeout SEC]
                  [--ipset-exe CMD] [--update-interval SEC] [--score-threshold NUM] [--score-min NUM]
                  [--score-timeout SEC] [--score-limit LIMIT] [--exclude-local]
                  [--systemd] [--log-level LVL] [--log-time]

Parse a log stream and track per-IP 'anomaly scores'.
An ipset is maintained for blocking subsequent access according to the configured ruleset and threshold.

options:
  -h, --help                  show this help message and exit
  --dry-run                   do not actually perform ipset operations (default: False)

input options:
  --input STREAM              pipe to read from or '-' for stdin
  --input-create              create the fifo with current umask if needed (default: False)
  --ruleset INI               ruleset configuration file

ipset options:
  --ipset NAME                name of the ipset to maintain
  --ipset-create              create the ipset if needed (default: False)
  --ipset-create-maxelem NUM  max ipset size when creating (default: 65536)
  --ipset-create-timeout SEC  ipset timeout when creating (default: 300)
  --ipset-exe CMD             path to ipset command binary (default: /usr/sbin/ipset)

update options:
  --update-interval SEC       how often to apply ipset changes (default: 5.0)
  --score-threshold NUM       add address when above, remove otherwise (default: 0)
  --score-min NUM             minimum value, if negative scores are involved (default: 0)

tracking options:
  --score-timeout SEC         how long to locally keep address scores (default: 300.0)
  --score-limit LIMIT         number of addresses to track locally (default: 65536)
  --exclude-local             ignore results for reserved local addresses (default: False)

log options:
  --systemd                   try to signal systemd readiness (default: False)
  --log-level LVL             log level (default: info)
  --log-time                  prefix log with timestamp (default: False)

Apart from the pipe and ipset to use, a ruleset configuration file is needed. It defines the regular expression matches that increment or decrement the anomaly score of the corresponding IP address.

Ruleset Configuration

Matches for parsing log lines are defined by a single file in standard Python INI syntax. A ruleset is formed by a main section and arbitrary but uniquely named child sections. The main regular expression must match for all subsections to be evaluated, too.

Individual runs maintain a “variable context”, i.e., each parsing result can provide named capture groups that subsequent matches can reference as their target. In the end, an address variable that contains the IP address to be blocked should exist.
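
To make the “variable context” more concrete, here is a small sketch using only the standard library. It assumes that targets are expanded like `string.Template` placeholders, which matches the `$name` syntax of the configuration but is not confirmed to be what log2ban does internally; the shortened log line and patterns are made up for the example.

```python
import re
from string import Template

line = '203.0.113.88 www.example.com "GET /foo/ HTTP/2.0" 200'
context = {}

# Main match: named capture groups populate the context
main = re.match(
    r'^(?P<address>[0-9.]+) (?P<host>[^ ]+) "(?P<request>[^"]*)" (?P<status>[0-9]+)$',
    line,
)
context.update(main.groupdict())

# Child match: run against a previously captured variable instead of the line
target = Template("$request").substitute(context)
request = re.match(r"^(?P<method>[^ ]+) (?P<url>[^ ]+) .*$", target)
context.update(request.groupdict())

# Several variables can be templated together into one target string
combined = Template("$host $method $url").substitute(context)
print(combined)
print(context["address"])
```

After both matches, the context holds `address`, `host`, `request`, `status`, `method`, and `url`, so later rules can target any of them.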

; Declare main ruleset match called 'www', parsing log lines such as:
; 2020-10-16T08:56:45.973503+00:00 www http: 203.0.113.88 www.example.com 0.004 [16/Oct/2020:10:56:45 +0200] "GET /foo/ HTTP/2.0" 200 2207 "https://www.example.com/" "Mozilla/5.0 (X11; Linux x86_64)"
; Note that this first match must apply and should already define an 'address'.
[ruleset.www]
match = ^[^ ]+ [^ ]+ http: (?P<address>([0-9]+\.)+[0-9]+) (?P<host>[^ ]*) ([0-9.]+) \[([^\]]*)\] "(?P<request>[^"]*)" (?P<status>[0-9]+) ([0-9]+) "(?P<referer>[^"]*)" "(?P<ua>[^"]*)".*

; Capture group variables can be parsed further into other variables, with fallback values.
[ruleset.www.request]
target = $request
match = ^(?P<method>[^ ]+) (?P<url>[^ ]+) .*$
defaults =
    method = XXX
    url = /

; Multiple variables can be templated together. Here, processing stops without assigning any score, so the request is ignored.
[ruleset.www.default-request]
target = $host $method $url
match = ^www\.example\.com GET /$
last = true

; Successful and thus most likely benign requests can contribute to a negative score.
[ruleset.www.ok-status]
target = $status
match = ^(2..|304)$
score = -1
last = true

; URLs matching a blacklist of suspicious patterns receive an additional penalty.
[ruleset.www.bad-url]
target = $url
match = .*[/_.&?=-]((wp-)?admin|(wp-)?login|wp-includes|wlwmanifest|cgi-bin|phpunit|phpinfo|php[mM]y[aA]dmin|passwd|xmlrpc|w00tw00t)([/_.?&=-].*|$)
score = 3

; Matches can be negated and case-insensitive.
[ruleset.www.bad-method]
target = $method
match = ^(GET|HEAD|POST|OPTIONS)$
negated = true
ignore_case = true
score = 2

; By default, increase the score upon webserver errors.
[ruleset.www.bad-status]
target = $status
match = ^[45]..$
score = 1
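
Since the file uses standard Python INI syntax, it can be inspected with `configparser`. The following sketch shows one possible way to read such a ruleset, including the multi-line `defaults` value. The shortened example string and the helper `parse_defaults` are assumptions made for illustration, not the script's own loader.

```python
import configparser

EXAMPLE = r"""
[ruleset.www]
match = ^(?P<address>[0-9.]+) "(?P<request>[^"]*)" (?P<status>[0-9]+)$

[ruleset.www.request]
target = $request
match = ^(?P<method>[^ ]+) (?P<url>[^ ]+) .*$
defaults =
    method = XXX
    url = /

[ruleset.www.bad-status]
target = $status
match = ^[45]..$
score = 1
"""


def parse_defaults(raw):
    """Turn the indented 'key = value' lines into a dict (hypothetical helper)."""
    result = {}
    for entry in raw.splitlines():
        if "=" in entry:
            key, _, value = entry.partition("=")
            result[key.strip()] = value.strip()
    return result


# Disable interpolation so that '%' in regular expressions is taken literally
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(EXAMPLE)

sections = parser.sections()
defaults = parse_defaults(parser["ruleset.www.request"]["defaults"])
score = parser["ruleset.www.bad-status"].getint("score")
is_last = parser["ruleset.www.request"].getboolean("last", fallback=False)
print(sections, defaults, score, is_last)
```

Sections are returned in file order, which matters here because child sections are evaluated one after another.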

Each section accepts the following settings:

target: Template to run against according to the current variable context, otherwise match the log line.
match: Regular expression, named capture groups will be added to the current line’s context.
ignore_case: Perform case-insensitive matching.
negated: Invert the outcome of the match, i.e., successfully match if the pattern does not apply.
defaults: Variables to set even if the pattern as a whole or a capture group does not apply.
enabled: Whether this match should be evaluated.
last: Stop processing of the current ruleset after successful match.
score: Upon successful match, add this value to the IP address score.

The score for the current line can be increased or decreased by each rule. If none is assigned, the address will be ignored. If zero, the detected activity will not change the overall score, but will reset a possibly existing block timeout. Otherwise, the value will be added to the overall per-address score, and cause an (un-)ban according to the configured threshold.
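
How `negated`, `ignore_case`, `last`, and `score` might interact for a single log line can be sketched as below. This is a simplified model under stated assumptions: the function `evaluate` is hypothetical, each rule targets exactly one context variable rather than a template, and patterns are applied with `re.search`, which may differ from the actual implementation.

```python
import re


def evaluate(rules, context):
    """Return the summed score for one line, or None if no matching rule
    assigned a score (hypothetical, simplified evaluation)."""
    total = None
    for rule in rules:
        flags = re.IGNORECASE if rule.get("ignore_case") else 0
        matched = re.search(rule["match"], context[rule["target"]], flags) is not None
        if rule.get("negated"):
            matched = not matched
        if not matched:
            continue
        if "score" in rule:
            total = (total or 0) + rule["score"]
        if rule.get("last"):
            break  # stop processing the ruleset after this match
    return total


rules = [
    {"target": "status", "match": r"^(2..|304)$", "score": -1, "last": True},
    {"target": "method", "match": r"^(GET|HEAD|POST|OPTIONS)$",
     "negated": True, "ignore_case": True, "score": 2},
    {"target": "status", "match": r"^[45]..$", "score": 1},
]

benign = evaluate(rules, {"status": "200", "method": "GET"})
probe = evaluate(rules, {"status": "404", "method": "PROPFIND"})
redirect = evaluate(rules, {"status": "301", "method": "get"})
print(benign, probe, redirect)
```

The third call returns `None` because no rule assigned a score, which corresponds to the line being ignored, whereas an explicit score of zero would still count as detected activity.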

Installation

While technically no installation is required, preparations include creating (and persisting) an ipset, firewall rule, and named pipe – for example:

# install the script in a convenient place
install -v -T log2ban.py /usr/local/bin/log2ban
# create ipset and firewall rule
ipset -exist create logbanset hash:ip family inet timeout 300
iptables -t filter -I INPUT -p tcp -m set --match-set logbanset src -m comment --comment "log2ban" -j DROP
# restore set and rule across reboots
netfilter-persistent save
# create the ruleset and log pipe that should be read from
mkfifo --mode=0660 /var/log/http.log.fifo
(umask 137 ; touch /etc/log2ban.ini)
chown root:syslog /var/log/http.log.fifo /etc/log2ban.ini

For the firewall rule (not necessarily iptables), the positioning or chain with respect to conntrack should be taken into account. If only banning of new connections is possible, effectiveness can be improved by closing suspicious connections early in the daemons to be watched, such as via keepalive_timeout/0 for nginx.

For being able to access ipsets, elevated permissions are usually needed. With systemd, the CAP_NET_ADMIN capability can be inherited selectively:

[Unit]
# /etc/systemd/system/netfilter-log2ban.service
Description=log2ban
After=netfilter-persistent.service rsyslog.service
BindsTo=netfilter-persistent.service rsyslog.service

[Service]
User=syslog
Group=syslog
AmbientCapabilities=CAP_NET_ADMIN
ProtectSystem=strict

Type=notify
ExecStart=/usr/local/bin/log2ban --systemd --input /var/log/http.log.fifo --ruleset /etc/log2ban.ini \
          --ipset logbanset --update-interval 5 --score-min -5 --score-threshold 5 --exclude-local
KillSignal=SIGINT
Restart=always

[Install]
WantedBy=rsyslog.service

Centrally redirecting (access) logs can be done for example by an rsyslog match on the daemon or webserver name, here called http:

# /etc/rsyslog.d/10-http-to-pipe.conf
module(load="builtin:ompipe")
if $programname == "http" then {
    action(type="ompipe" Pipe="/var/log/http.log.fifo")
}