Syslog logsearch dashboard
Syslog server and built-in Web-GUI with a simple query language, statistics plotting, and alerts. Logs are maintained as JSON files, no databases are involved.
For smaller or self-hosted scenarios, stacks for log management such as Loki/Grafana, Elasticsearch ELK, or Graylog are often overkill – in especially for low volume logs, or on limited hardware such as on a Raspberry Pi. The standard setup on most distributions is still a local syslog daemon, where even the journal or container logs can be forwarded to. Relying on syslog only also allows to centrally collect logs from multiple machines very easily with no additional software required for a bit more advanced use-cases.
This is the successor and partial reimplementation of the now legacy Logsearch Web-GUI project that finds and parses existing plain-text log files on-demand.
With syslogsearch
, logs can be centrally collected by forwarding syslog messages to it.
Loglines are parsed and stored in rotating local JSON files.
A convenient built-in Web interface allows searching with a simple query language and statistics plotting support.
Also, a list of queries that is continuously evaluated can be given for alerting. An arbitrary command is triggered upon match, with the commandline being interpolated according to the record in question.
To summarize, this project aims to be the all-in-one solution for private medium-sized setups, with a single script file that contains:
- Syslog server (RFC5424 over TCP)
- Rotating file storage (per default 15 daily JSON shard files)
- Alerts (watched queries that trigger arbitrary commands)
- Webserver (aiohttp with basic authentication and HTTPS support)
- Frontend with statistics and logtail (self-contained vanilla HTML/CSS/JS)
- Query language (simple DSL with boolean, string, and expression matches)
- JSON HTTP API (due to completely static frontend)
Even “longer searches” over at least “tens of thousands” log messages per day should be no problem for neither weak backend servers (Raspberry Pi) nor weak frontend devices (Mobile).
Query Language and GUI
The log query language resembles other DSLs for this purpose.
In general, matches are formed by field:value pairs.
Available fields names are *
for any, facility
, severity
, remote_host
/client
, hostname
/host
, application
/app
, pid
, and message
/msg
.
Search values are (optionally quoted) strings or regular expressions in between /
slash characters (case-insensitive with /i
).
Multiple matches can be chained together with AND
/OR
.
More complex queries involving both AND
/OR
must be fully parenthesized.
Matches as well as sub-queries in parentheses can be negated by a NOT
prefix.
With this, the query language can be kept simple and context-sensitive. For example:
*:"foo bar"
NOT application:kernel
application:kernel OR NOT facility:/daemon|kernel/
(app:/^cron$/i OR hostname:localhost) AND NOT (NOT message:"init:" OR pid:1)
The full lark grammar specification can be found in the sources.
Apart from the main query input, the timeframe to consider can be restricted by two datetime pickers.
This is also the interval for the result histogram, which depicts the number of overall and matched records.
The plot can be scaled interactively or automatically and total numbers can be hidden for a per-query result view.
The pie charts below group all query results into the values of facility
, severity
, hostname
, and application
.
Only an excerpt of the actual logs is shown per default. This can be adjusted by the controls for result limiting and pagination.
The frontend is implemented with plain vanilla HTML/CSS/JS, the plotly JS library as only dependency for interactive plots comes with the corresponding local python package – so no CDN is involved. While inspecting logs on small screens is not really convenient, the GUI itself is also usable on and responsive to mobile devices.
Configuration and Usage
Syslog server, HTTP/S API server, and frontend come bundled together as a single Python script.
usage: syslogsearch.py [-h] [--verbose] [--debug] [--read-only] [--reverse]
[--syslog-bind-all] [--syslog-port PORT] [--http-bind-all] [--http-port PORT]
[--http-basic-auth] [--https-certfile PEM] [--https-keyfile PEM]
[--data-dir DIR] [--flush-interval SEC] [--max-buffered LEN] [--max-shards NUM]
[--file-format EXT] [--json-full] [--filters TXT] [--alerts JSON]
Syslog server accepting RFC5424 compliant messages over TCP.
Records are maintained in flat JSON files.
A convenient built-in Web interface allows searching with a simple query language and statistics plotting support.
optional arguments:
-h, --help show this help message and exit
--verbose enable verbose logging (default: False)
--debug enable even more verbose debug logging (default: False)
--read-only don't actually write received records (default: False)
--reverse order results and pagination by most recent entry first (default: False)
--systemd-notify try to signal systemd readiness (default: False)
--syslog-bind-all bind to 0.0.0.0 instead of localhost only (default: False)
--syslog-port PORT syslog TCP port to listen on (default: 5140)
--http-bind-all bind to 0.0.0.0 instead of localhost only (default: False)
--http-port PORT HTTP port to listen on (default: 5141)
--http-basic-auth require basic authentication using credentials from LOGSEARCH_USER and LOGSEARCH_PASS environment variables (default: False)
--https-certfile PEM certificate for HTTPS
--https-keyfile PEM key file for HTTPS
--data-dir DIR storage directory (default: .)
--flush-interval SEC write out records this often (default: 60.0)
--max-buffered LEN record limit before early flush (default: 1000)
--max-shards NUM number of 'daily' files to maintain (default: 15)
--file-format EXT format for new files, either 'json' or 'csv' (default: json)
--json-full assume full objects instead of tuples in JSON mode (default: False)
--filters TXT filter query definitions file
--alerts JSON alert definitions file
Even though there is HTTPS and authentication support, using a reverse proxy is recommended when exposing Python webserver implementations to untrusted environments in general.
Tested for Python 3.8 and 3.10. Once started, logging, storage, and frontend can be tried out by for example:
logger -t logger -p user.info --rfc5424 --server 127.0.0.1 --port 5140 --tcp -- "Hello World!"
Log Alerts and Filters
The log stream can be continuously watched for certain queries in order to trigger alert traps. Arbitrary commands are executed upon match – the configured commandline will be interpolated according to the record in question. For example, an alerts configuration file could look like:
[
{
"name": "Undervoltage Alert Notification",
"query": "application:kernel AND message:/Under-?voltage detected/",
"command": ["alert-trap.sh", "Alert: Undervoltage detected at $hostname", "$timestamp $facility $severity $application $message"],
"limit": 0.0
}
]
Note that using variables in the command to call (first list item) is technically supported but might impose a security risk. Trap commands are executed by a separate thread that sequentially spawns processes and are subject to being killed after a certain timeout. An optional rate-limit in seconds prevents bursts.
In order to skip certain log messages, a filter ruleset can be provided.
The file should consist of one query per line, empty lines and #
prefixes are ignored.
If any expression matches, the record will be discarded right away and is not subject to further processing.
JSON Web API
As the served Web GUI frontend is completely static and self-contained, it does nothing but call its own HTTP API.
This main endpoint can also be accessed directly or via custom scripts at /api/search?q=…
for performing arbitrary queries.
Corresponding search statistics and structured results are returned in a single JSON object.
Accepted GET parameters:
q=query
- Search query expression string to evaluate, as supported by the Web GUI. Note that standard URL encoding rules apply.
s=start
e=end
- Start and end time for limiting the search window, as Unix timestamp.
o=offset
l=limit
- Pagination for actually returned entries. Defaults to zero offset and no limit – everything. Note that all found matches affect the statistics, not just the currently returned window.
The same signature is provided by /api/tail
for plaintext results.
As auxiliary endpoint, /api/flush
writes buffered records to disk on next occasion. This is already done periodically and/or automatically, though.
For verifying the respective configuration, /api/alerts
and /api/filters
return the list of active alert rules or filters correspondingly.
File Formats
While records in API responses are always JSON objects, the storage format can be chosen independently. As repeating the keys for every line would be a huge waste, JSON tuples (i.e., lists) are used instead per default. This can be changed by the corresponding commandline flag, for example to ease external processing. Also, each line is parseable for itself, there is no global surrounding array.
As third option, tab-separated CSV is supported, which is slightly more compact than JSON. Although the parser can be kept very simple, JSON deserialization is usually still faster, presumably because it can be backed by “native” C code.
Installation
With the provided setup.py
, installation to /usr/local/bin/syslogsearch
is as simple as:
sudo pip install -U --ignore-installed .
sudo pip show syslogsearch
sudo pip uninstall --yes syslogsearch
These install
and uninstall
commands are also wrapped by the Makefile
, as well as virtual environment installation, type checking, and linting.
Lots of linux distributions come with rsyslog
per default.
So when not using syslogsearch
directly, all logs can centrally be forwarded by for example in /etc/rsyslogd.d/10-syslogsearch.conf
:
*.* @@127.0.0.1:5140;RSYSLOG_SyslogProtocol23Format
Similarly, syslog-ng
should also be able to forward RFC5424-compliant log streams.
In order to make sure that no important logs bypass a local syslog daemon,
the Docker logging driver and systemd journal forward
can become relevant.
A minimal systemd unit file /etc/systemd/system/syslogsearch.service
could look like the following:
[Unit]
Description=Syslog logsearch dashboard
[Service]
User=syslog
Group=adm
Type=notify
ExecStart=/usr/local/bin/syslogsearch --systemd-notify --verbose --data-dir /var/log/syslogsearch/
[Install]
WantedBy=syslog.service
Similarly, building and running with Docker is also straight-forward:
FROM python:3.10-slim
RUN mkdir /install /var/log/syslogsearch
COPY requirements.txt setup.py syslogsearch.py /install/
RUN pip install -U /install && rm -rf /install
EXPOSE 5140
EXPOSE 5141
VOLUME /var/log/syslogsearch
ENTRYPOINT ["/usr/local/bin/syslogsearch"]
CMD ["--syslog-bind-all", "--http-bind-all", "--verbose", "--data-dir", "/var/log/syslogsearch/"]
The container image itself can (and should) run --read-only
and with any non-root --user
that matches the external volume permissions.