https://two-wrongs.com/awk-state-machine-parser-pattern.html

The AWK State Machine Parser Pattern

by kqr, published 2018-01-16
Tags: programming, awk, unix

I run the Ganglia Monitoring System on the home LAN to keep an eye on temperatures, loads, disk usage, network throughput and such on the gateway and, more recently, my desktop workstation as well. (The primary reason is that I have noticed some instability, and I would like to know if it's related to temperature.)

Since temperature is not reported by default, let's say we want to parse the output of lm_sensors and report it to the monitoring system by using the gmetric command. (We are using the default format, which is only loosely structured. There is a -u flag that you should use to get machine-friendly output, but we'll pretend that doesn't exist for the purpose of this article.)

So this is what we have to extract relevant numbers from.

    radeon-pci-0100
    Adapter: PCI adapter
    temp1:        +50.5degC  (crit = +120.0degC, hyst = +90.0degC)

    k10temp-pci-00c3
    Adapter: PCI adapter
    temp1:        +36.8degC  (high = +70.0degC)
                             (crit = +80.0degC, hyst = +78.0degC)

    f71889ed-isa-0480
    Adapter: ISA adapter
    +3.3V:        +3.23 V
    in1:          +1.07 V  (max = +2.04 V)
    in2:          +1.09 V
    in3:          +0.89 V
    in4:          +0.58 V
    in5:          +1.23 V
    in6:          +1.53 V
    3VSB:         +3.25 V
    Vbat:         +3.31 V
    fan1:         3978 RPM
    fan2:            0 RPM  ALARM
    fan3:            0 RPM  ALARM
    temp1:        +31.0degC  (high = +85.0degC, hyst = +81.0degC)
                             (crit = +80.0degC, hyst = +76.0degC)  sensor = transistor
    temp2:        +43.0degC  (high = +85.0degC, hyst = +77.0degC)
                             (crit = +100.0degC, hyst = +92.0degC)  sensor = thermistor
    temp3:        +38.0degC  (high = +70.0degC, hyst = +68.0degC)
                             (crit = +85.0degC, hyst = +83.0degC)  sensor = transistor

It's divided into sections, but not in a way that awk understands right away.
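To make the problem concrete, here is a minimal sketch of what a naive, stateless awk program does with this kind of input. The function name `naive_temps` and the trimmed-down sample are made up for illustration; the awk program just prints the label and value of every temperature line it sees.

```sh
#!/bin/sh
# Hypothetical helper: a stateless awk program that prints every
# temperature line, with no idea which chip the reading belongs to.
naive_temps() {
    awk '/^temp[0-9]:/ { print $1, $2 }'
}

# Trimmed-down sample in the same shape as the sensors output above.
naive_temps <<'EOF'
radeon-pci-0100
Adapter: PCI adapter
temp1: +50.5degC (crit = +120.0degC, hyst = +90.0degC)

k10temp-pci-00c3
Adapter: PCI adapter
temp1: +36.8degC (high = +70.0degC)
EOF
```

Both readings come out labelled `temp1:`, so the chip each value came from is lost unless the program remembers the most recent section header.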
As you may have suspected from the title, the key is to realise that whenever you encounter a section heading (or something else that indicates you've left the previous section), you make a note of that in the program state. In my script, that is expressed as

    #!/usr/bin/gawk -f
    # Needs gawk: the three-argument match() below is a gawk extension.

    BEGIN {
        # Which section do we start the script in?
        unit = "none";
    }

    /^radeon-pci-/ {
        # Encountered the GPU section header
        unit = "GPU";
    }

    /^k10temp-pci-/ {
        # Encountered the CPU section header
        unit = "CPU";
    }

    /^f71889ed-isa-/ {
        # I don't know what this is, actually, so
        # I'm just going to call them SYS_1 etc.
        unit = "SYS";
    }

    /^temp[0-9]:/ {
        # Yay, a temperature reading!
        # Which number does this reading have?
        number = substr($1, 5, 1);

        # Which temperature is given? The plus sign is escaped so
        # that it is matched literally; matches[1] holds the digits.
        match($2, /\+([0-9.]+)degC/, matches);
        temp = matches[1];

        # Store the results in an array
        temps[unit "_" number] = temp;
    }

    END {
        # Call gmetric for all temperatures found. The readings have
        # decimals, so they are reported as float rather than uint16.
        for (name in temps) {
            system(
                "gmetric -t float -u Celsius"
                " -n " name
                " -v " temps[name]
            );
        }
    }

The general pattern, as you probably realise, is a bunch of small pattern-action statements that look like this, where text enclosed in angle brackets represents metasyntactic variables, i.e. placeholders in this generic example:

    /<section header pattern>/ {
        <state variable> = <section name>
    }

and then one or more pattern-action statements that actually extract data, which look something like

    /<data pattern>/ {
        switch (<state variable>) {
        case <section name>:
            <extract data>;
            break;
        case <other section name>:
            <extract other data>;
            break;
        }
    }

Note that switch is a gawk extension, and that its cases fall through unless they end with break.

Once you're fairly comfortable with awk, using it for these things is so convenient that I highly recommend spending some time to get familiar with it. It's a tiny language that can be learnt in a day or so, but it is surprisingly useful.
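The script in this article uses the three-argument form of `match`, which is a gawk extension, and so is `switch`. For systems that only have a POSIX awk, the same state machine can be sketched roughly as follows. This is an illustration under stated assumptions, not the script I actually run: the function name `parse_temps` is made up, an `if` on the state variable stands in for `switch`, and it prints name and value pairs instead of calling gmetric.

```sh
#!/bin/sh
# Hypothetical helper: the same state machine, restricted to POSIX awk.
# Prints "<unit>_<number> <value>" lines instead of calling gmetric.
parse_temps() {
    awk '
    BEGIN { unit = "none" }

    # Section headers only update the state variable.
    /^radeon-pci-/   { unit = "GPU" }
    /^k10temp-pci-/  { unit = "CPU" }
    /^f71889ed-isa-/ { unit = "SYS" }

    # Data lines consult the state to decide what the reading means.
    /^temp[0-9]:/ {
        # Ignore readings seen before any known section header.
        if (unit == "none") next

        number = substr($1, 5, 1)
        value = $2
        sub(/^\+/, "", value)      # drop the leading plus sign
        sub(/degC.*$/, "", value)  # drop the unit suffix
        print unit "_" number, value
    }
    '
}

# Trimmed-down sample in the same shape as the sensors output.
parse_temps <<'EOF'
radeon-pci-0100
Adapter: PCI adapter
temp1: +50.5degC (crit = +120.0degC, hyst = +90.0degC)

k10temp-pci-00c3
Adapter: PCI adapter
temp1: +36.8degC (high = +70.0degC)
 (crit = +80.0degC, hyst = +78.0degC)

f71889ed-isa-0480
Adapter: ISA adapter
fan1: 3978 RPM
temp2: +43.0degC (high = +85.0degC, hyst = +77.0degC)
EOF
```

On the sample above this prints `GPU_1 50.5`, `CPU_1 36.8` and `SYS_2 43.0`, one per line. Printing rather than calling gmetric keeps the parsing easy to check on its own; the output could then be piped into a loop that reports each pair.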