Logfile merger

Merge multiple logfiles into a single log stream; the interleaved result is again sorted by time. Input files can be decompressed on the fly.

logmerge is a small but handy tool that reads multiple logfiles in parallel and outputs a single combined log stream. The timestamp of each line is evaluated, so the interleaved result is again sorted by time overall. Input files ending in .gz are decompressed on the fly.
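The suffix-based decompression described above can be illustrated with a short Python sketch. This is not logmerge's own code; the helper name open_log and the choice of UTF-8 with replacement of invalid bytes are assumptions made for the illustration.

```python
import gzip


def open_log(path):
    """Open a logfile for line-wise reading (hypothetical helper).

    Files whose name ends in .gz are decompressed while they are read;
    everything else is opened as plain text.
    """
    if path.endswith(".gz"):
        # "rt" makes gzip yield decoded text lines instead of raw bytes
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")
```

Both branches return an object that can be iterated line by line, so the code consuming the lines does not need to know whether a file was compressed.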

A common scenario is reading logs from multiple sources to provide an overview or to allow further unified processing, similar to what journalctl offers nowadays. logmerge can also work around shell globbing issues, when an argument list like 'log log.1 log.11.tag.gz log.2 …' is not properly sorted or is partially compressed. In addition, the output can be restricted to a given time frame.

As each individual file is assumed to be sorted already, logmerge does not need to actually sort the loglines; scanning through the inputs and merging selectively is enough. No full buffering or real sorting is needed, which keeps resource usage low and throughput high.

        Input A           Output          Input B        
+---------------------+           +---------------------+
| Jan 01 13:01:59 ... | --> |     |                     |
|                     |     | <-- | Jan 05 21:34:33 ... |
|                     |     | <-- | Feb 19 10:25:01 ... |
| Mar 05 17:06:03 ... | --> |     |                     |
|                     |     | <-- | Mar 23 02:42:20 ... |
| Apr 24 09:42:23 ... | --> |     |                     |
+---------------------+     v     +---------------------+

In this example with only two input files, the result will contain six lines of sorted output. Both files are processed in parallel and in each step, the line with the smallest timestamp gets chosen for output.
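The selection step can be sketched in Python as a merge of already sorted streams. This illustrates the technique only and is not taken from logmerge's source; the function names, the heap-based bookkeeping, and the placeholder year 2000 (prepended because the default format carries no year) are assumptions.

```python
import heapq
from datetime import datetime

FMT = "%b %d %H:%M:%S"  # default format from the usage section


def timestamp(line, fmt=FMT):
    # Assumption: the timestamp is the first three space-separated tokens.
    # A fixed leap year is prepended so that strptime gets a complete date.
    text = " ".join(line.split()[:3])
    return datetime.strptime("2000 " + text, "%Y " + fmt)


def merge(*streams):
    """Yield lines from individually sorted streams in overall time order."""
    iters = [iter(s) for s in streams]
    heap = []
    # Prime the heap with the first line of every input
    for idx, it in enumerate(iters):
        line = next(it, None)
        if line is not None:
            heapq.heappush(heap, (timestamp(line), idx, line))
    while heap:
        # Take the line with the smallest timestamp ...
        _, idx, line = heapq.heappop(heap)
        yield line
        # ... and refill from the input it came from
        nxt = next(iters[idx], None)
        if nxt is not None:
            heapq.heappush(heap, (timestamp(nxt), idx, nxt))


input_a = ["Jan 01 13:01:59 a1", "Mar 05 17:06:03 a2", "Apr 24 09:42:23 a3"]
input_b = ["Jan 05 21:34:33 b1", "Feb 19 10:25:01 b2", "Mar 23 02:42:20 b3"]
merged = list(merge(input_a, input_b))
# Order of origin: a1, b1, b2, a2, b3, a3 (as in the diagram)
```

Only one pending line per input is held at any time, which matches the low-buffering property described above.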

Merge and sort logfiles: Usage

Call logmerge by providing all the input logfiles to be combined:

logmerge [-b] [-v] [-s timestamp] [-e timestamp] [-t num] [-f format] files...
-b
best-effort/graceful: don’t fail if a file could not be opened, there is no input, or upon other correctable errors
-v
verbose: log upon parsing errors (to stderr)
-s
start: the timestamp for the earliest entry to print (inclusive), parsed according to the given format
-e
end: timestamp for when to stop (exclusive), parsed according to the given format
-t
token: skip this number of space-separated tokens before timestamp parsing, e.g. useful when the line does not start with the timestamp
-f
format: the format used for timestamp parsing – default: %b %d %H:%M:%S (e.g. Jan 01 13:01:59)
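How the -t, -f, -s and -e options interact can be sketched in Python. This is an illustration under assumptions, not logmerge's implementation: the function names are hypothetical, the format is assumed to contain no year (a placeholder year 2000 is prepended), and the number of tokens belonging to the timestamp is derived from the number of space-separated fields in the format.

```python
from datetime import datetime


def parse_timestamp(line, fmt="%b %d %H:%M:%S", skip=0):
    """Parse a line's timestamp after skipping `skip` leading tokens (cf. -t, -f)."""
    width = len(fmt.split())  # tokens covered by the format itself
    tokens = line.split()[skip:skip + width]
    # Assumption: fmt has no year field, so a fixed leap year is prepended
    return datetime.strptime("2000 " + " ".join(tokens), "%Y " + fmt)


def in_window(ts, start=None, end=None):
    """Return True if start <= ts < end (cf. -s inclusive, -e exclusive)."""
    if start is not None and ts < start:
        return False
    if end is not None and ts >= end:
        return False
    return True


line = "host1 Jan 05 21:34:33 sshd[1]: session opened"
ts = parse_timestamp(line, skip=1)  # skip the leading host name token
start = parse_timestamp("Jan 05 21:34:33")
end = parse_timestamp("Feb 19 10:25:01")
print(in_window(ts, start, end))
```

A line stamped exactly at the start value is kept, while a line stamped exactly at the end value is dropped, mirroring the inclusive and exclusive bounds listed above.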

For example:

logmerge /var/log/hosts/example.org.log /var/log/hosts/example.org.log.1.gz /var/log/hosts/example.com.log | webalizer

To build locally, simply type make. No special dependencies or additional external libraries are needed apart from libz for decompression support (which can be disabled). System-wide installation is not required, but make install can build the binary and copy it into the system. Alternatively, use the 64-bit Debian package provided below.

Code & Download