Title: Systemd journald cheatsheet
       Author: Solène
       Date: 24 December 2024
       Tags: linux
       Description: In this blog post, you will learn how to deal with logs
       managed by systemd-journald
       
       # Introduction
       
       This blog post is part of a series about the Systemd ecosystem.
       Today's focus is on journaling.
       
       Systemd has had a regrettable reputation since its arrival in the
       2010s.  I think this is because Systemd was radically different from
       the traditional tooling, and people felt lost: they were not warned
       beforehand that they would have to deal with it.  The transition was
       maybe a bit rushed, with a half-baked product, on top of users having
       to learn new paradigms and tooling to operate their computer.
       
       Nowadays, Systemd is working well, and there are serious non-Systemd
       alternatives, so everyone should be happy. :)
       
       # Introduction to journald
       
       Journald is the logging system that was created as part of Systemd.  It
       handles logs created by all Systemd units.  A huge difference compared
       to traditional logs is that there is a single journal, stored in a
       binary format, acting as a database to store all the data.  If you
       want to read logs, you need to use the `journalctl` command to extract
       data from the database, as it is not plain text.
       
       Most of the time journald logs data from units by reading their
       standard error and output, but it is possible to send data to journald
       directly.
       
       On the command line, you can use `systemd-cat` either to run a program
       and capture its output, or to pipe data into it, so the data ends up
       in the logs.
       
 (HTM) systemd-cat man page
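
Here is a minimal sketch of both ways of using `systemd-cat`.  The helper name `log_to_journal` and the identifier `backup-demo` are made up for the example; `-t` (identifier) and `-p` (priority) are regular `systemd-cat` options.

```shell
# Hypothetical helper: send whatever arrives on stdin to the journal.
# $1 = identifier (stored as SYSLOG_IDENTIFIER), $2 = priority (default: info)
log_to_journal() {
    systemd-cat -t "$1" -p "${2:-info}"
}

# Usage (needs a running journald):
#   echo "backup finished" | log_to_journal backup-demo
#   systemd-cat -t backup-demo -p warning echo "disk almost full"
#   journalctl -t backup-demo -n 5
```

The last command reads the entries back by filtering on the identifier, which is handy to check that your messages really reached the journal.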
       
       # Journalctl 101
       
       Here is a list of the most common cases you will encounter:
       
       * View new logs live: `journalctl -f`
       * View last 2000 lines of logs: `journalctl -n 2000`
       * Restrict logs to a given unit: `journalctl -u nginx.service`
       * Pattern matching: `journalctl -g somepattern`
       * Filter by date (since): `journalctl --since="10 minutes ago"` or
       `journalctl --since="1 hour ago"` or `journalctl --since=2024-12-01`
       * Filter by date (range): `journalctl --since="today" --until="1 hour
       ago"` or `journalctl --since="2024-12-01 12:30:00" --until="2024-12-01
       16:00:00"`
       * Filter logs since boot: `journalctl -b`
       * Filter logs to previous (n-1) boot: `journalctl -b -1`
       * Switch date time output to UTC: `journalctl --utc`
       
       You can use multiple parameters at the same time:
       
       * Last 200 lines of logs of nginx since current boot: `journalctl -n
       200 -u nginx -b`
       * Live display of nginx logs matching "wp-content": `journalctl
       -f -g wp-content -u nginx`
       
 (HTM) journalctl man page
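
The filters above also stack across several units, because `-u` can be repeated.  Below is a small sketch of that idea; the function name `units_between` and its argument order are my own invention, and `--no-pager` is only added so the output can be piped or redirected.

```shell
# Hypothetical helper: logs of one or more units over a time range.
# $1 = start of the range, $2 = end of the range, remaining args = unit names
units_between() {
    start="$1"
    end="$2"
    shift 2
    # Rewrite the argument list from "unit1 unit2" into "-u unit1 -u unit2".
    for unit in "$@"; do
        set -- "$@" -u "$unit"
        shift
    done
    journalctl --no-pager --since="$start" --until="$end" "$@"
}

# Usage:
#   units_between "1 hour ago" "now" nginx php-fpm
#   units_between "2024-12-01 12:30:00" "2024-12-01 16:00:00" nginx
```

Entries from all the listed units come out interleaved in chronological order, which is what makes reading several services side by side comfortable.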
       
       # Send logs to syslog
       
       If you would rather handle your logs with syslog, you can edit the
       file `/etc/systemd/journald.conf` and add the line
       `ForwardToSyslog=yes` in its `[Journal]` section.
       
       This will make journald relay all incoming messages to syslog, so you
       can process your logs as you want.
       
       Restart the journald service: `systemctl restart systemd-journald.service`
       
 (HTM) systemd-journald man page
 (HTM) journald.conf man page
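
If you prefer not to touch the main configuration file, the same setting can live in a drop-in file.  This is only a sketch: the helper name `write_syslog_dropin` and the file name `forward-to-syslog.conf` are arbitrary choices of mine, while `/etc/systemd/journald.conf.d/` is the standard drop-in directory read by journald.

```shell
# Hypothetical helper: write a drop-in enabling forwarding to syslog.
# $1 = target directory (defaults to the standard journald drop-in directory)
write_syslog_dropin() {
    conf_dir="${1:-/etc/systemd/journald.conf.d}"
    mkdir -p "$conf_dir"
    printf '[Journal]\nForwardToSyslog=yes\n' > "$conf_dir/forward-to-syslog.conf"
}

# Usage (as root):
#   write_syslog_dropin
#   systemctl restart systemd-journald.service
```

A drop-in keeps your change separate from the file shipped by the distribution, so package upgrades are less likely to conflict with it.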
       
       # Journald entries metadata
       
       Journald stores a lot more information than just the log line (raw
       content).  Traditional syslog files contain the date and time, maybe
       the hostname, and the log message.
       
       This is mostly for information: only system administrators will ever
       need to dig through this, but it is important to know it exists in
       case you need it.
       
       ## Example
       
       Here is what journald stores for each line (pretty printed from JSON
       output), using a Samba server as an example.
       
       ```
       # journalctl -u smbd -o json -n 1 | jq
       {
         "_EXE": "/usr/libexec/samba/rpcd_winreg",
         "_CMDLINE": "/usr/libexec/samba/rpcd_winreg --configfile=/etc/samba/smb.conf --worker-group=4 --worker-index=5 --debuglevel=0",
         "_RUNTIME_SCOPE": "system",
         "__MONOTONIC_TIMESTAMP": "749298223244",
         "_SYSTEMD_SLICE": "system.slice",
         "MESSAGE": "  Copyright Andrew Tridgell and the Samba Team 1992-2023",
         "_MACHINE_ID": "f23c6ba22f8e02aaa8a9722df464cae3",
         "_SYSTEMD_INVOCATION_ID": "86f0f618c0b7dedee832aef6b28156e7",
         "_BOOT_ID": "42d47e1b9a109551eaf1bc82bd242aef",
         "_GID": "0",
         "PRIORITY": "5",
         "SYSLOG_IDENTIFIER": "rpcd_winreg",
         "SYSLOG_TIMESTAMP": "Dec 19 11:00:03 ",
         "SYSLOG_RAW": "<29>Dec 19 11:00:03 rpcd_winreg[4142801]:   Copyright Andrew Tridgell and the Samba Team 1992-2023\n",
         "_CAP_EFFECTIVE": "1ffffffffff",
         "_SYSTEMD_UNIT": "smbd.service",
         "_PID": "4142801",
         "_HOSTNAME": "pelleteuse",
         "_SYSTEMD_CGROUP": "/system.slice/smbd.service",
         "_UID": "0",
         "SYSLOG_PID": "4142801",
         "_TRANSPORT": "syslog",
         "__REALTIME_TIMESTAMP": "1734606003126791",
         "__CURSOR": "s=1ab47d484c31144909c90b4b97f3061d;i=bcdb43;b=42d47e1b9a109551eaf1bc82bd242aef;m=ae75a7888c;t=6299d6ea44207;x=8d7340882cc85cab",
         "_SOURCE_REALTIME_TIMESTAMP": "1734606003126496",
         "SYSLOG_FACILITY": "3",
         "__SEQNUM": "12376899",
         "_COMM": "rpcd_winreg",
         "__SEQNUM_ID": "1ab47d484c31144909c90b4b97f3061d",
         "_SELINUX_CONTEXT": "unconfined\n"
       }
       ```
       
       The "real" log line is the value of `SYSLOG_RAW`; everything else is
       created by journald when it receives the information.
       
       ## Filter
       
       As the logs can be extracted in JSON format, it becomes easy to parse
       them properly using any programming language able to deserialize JSON
       data.  This is far more robust than piping lines to AWK / grep,
       although that can work "most of the time" (until it does not, due to
       a weird input).
       
       On the command line, you can query/filter such logs using `jq`, which
       is a bit like the awk of JSON.  For instance, to take all the logs of
       "today" and keep only the lines generated by the binary
       `/usr/sbin/sshd`, I can use this:
       
       ```
       journalctl --since="today" -o json | jq 'select(._EXE == "/usr/sbin/sshd")'
       ```
       
       This command line will report each log entry where the `_EXE` field
       is exactly "/usr/sbin/sshd", along with all its metadata.  This kind
       of data can be useful when you need to filter tightly for a problem
       or a security incident.
       
       The example above is a bit silly in its form, as filtering on the SSH
       server can be done more simply with `journalctl -u sshd.service
       --since=today`.
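
For exact matches on a metadata field, journalctl can also do the filtering itself: any `FIELD=VALUE` argument restricts the output to entries carrying that value.  Here is a hedged sketch of the same sshd query without `jq`; the helper name `logs_by_exe` is hypothetical, and `--no-pager` is only there to make the output easy to pipe.

```shell
# Hypothetical helper: entries produced by a given executable.
# $1 = absolute path of the binary, $2 = start of the time window (default: today)
logs_by_exe() {
    journalctl --no-pager --since="${2:-today}" "_EXE=$1"
}

# Usage:
#   logs_by_exe /usr/sbin/sshd
#   logs_by_exe /usr/sbin/sshd "1 hour ago"
#   journalctl -F _EXE    # list every value recorded for the _EXE field
```

`jq` remains the better tool once you need partial matches or conditions involving several fields at once.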
       
       # Conclusion
       
       Journald is a powerful logging system, and journalctl provides a
       single entry point to extract and filter logs in a unified system.
       
       With journald, it became easy to read logs of multiple services over a
       time range, and log rotation is now a problem of the past for me.