* * * * *

The monitoring of uninterruptible power supplies

I've been dealing with UPS (Uninterruptible Power Supply) problems [1] for a week and a half now, and it's finally calmed down a bit. Bunny's UPS has been replaced, and I'm waiting for Smirk to order replacement batteries for my UPS, so in the meantime I'm using a spare UPS from The Company.

Bunny suspects the power situation here at Chez Boca is due to some overgrown trees interfering with the power lines, causing momentary fluctuations in the power and basically playing hell with not only the UPSes but the DVRs as well.

This past Wednesday was particularly bad: the UPS would take a hit and drop power to my computers, and by the time I got up and running, I would take another hit (three times, all within half an hour). It got so bad I ended up climbing around underneath the desks rerunning power cables with the hope of keeping my computers powered for more than ten minutes. It wasn't helping matters that I was fighting my syslogd replacement [2] during each reboot (but that's another post [3]).

So Smirk dropped off a replacement UPS, and had I just used the thing, yesterday might have been better. But nooooooooooooooooo! I want to monitor the device (because, hey, I can), but since it's not an APC [4], I can't use apcupsd [5] to monitor it (Bunny's new UPS is an APC, and the one I have with the dead battery is an APC).

In searching for some software to monitor the Cyber Power 1000AVR LCD [6] UPS, I came across NUT (Network UPS Tools) [7], which supports a whole host of UPSes [8], and it looks like it can support monitoring multiple UPSes on a single computer (functionality that apcupsd lacks). It's nice, but it does have its quirks (and caused me to have nuclear meltdowns yesterday).

I did question the need for five configuration files and its own user accounting system, but upon reflection, the user accounting system is probably warranted (maybe), given that you can remotely command the UPSes to shut down.
And the configuration files aren't that complex; I just found them annoying. I also found the one process per UPS, plus two processes for monitoring, a bit excessive, but the authors of the program were following the Unix philosophy of small tools collectively working together. Okay, I can deal.

The one quirk that drove me towards nuclear meltdown was the inability of the USB (Universal Serial Bus) “driver” (the program that actually queries the UPS over the USB bus) to work properly when a particular directive was present in the configuration file and running in “explore” mode (used to query the UPS for all its information). So I have the following in the UPS configuration file:

> [apc1000]
> driver = usbhid-ups
> port = auto
> desc = "APC Back UPS XS 1000"
> vendorid = 051D

I try to run usbhid-ups in explore mode, and it fails. Comment out the vendorid, but add it to the command line, and it works. But without the vendorid in the configuration file, the usbhid-ups program wouldn't function normally (it's the interface between the monitoring processes and the UPS). It's bad enough that you can only use the explore mode when the rest of the UPS monitoring software isn't running, but this? It took me about three hours to figure out what was (or wasn't) going on.

[You can obviously generate kilowatt usage, yet I can't query for it over USB? Not even as a vendor extension? You suck!] [9]

Then there was the patch I made to keep NUT from logging every second to syslogd (I changed one line from “if result > 0 return else log error” to “if result >= 0 return else log error” since 0 isn't an error code). Then I found this bug report [10] on the mailing list archive, and yes, that bug was affecting me as well; after I applied the patch, I was able to get more information from the Cyber Power UPS (and it didn't affect the monitoring of the APC).

And their logging program, upslog, doesn't log to syslogd. It's not even an option.
I could, however, have it output to stdout and pipe that into logger, but that's an additional four processes (two per UPS) just to log some stats into syslogd. Fortunately, the protocol used to communicate with the UPS monitoring software is well documented and easy to implement, so it was an easy thing to write a script (Lua, of course) to query the information I wanted to log to syslogd and run that every five minutes via cron.

Now, the information you get is impressive. apcupsd gives out rather terse information like (from Bunny's system, which is still running apcupsd):

> APC      : 001,038,0997
> DATE     : Sat Apr 17 22:23:25 EDT 2010
> HOSTNAME : bunny-desktop
> VERSION  : 3.14.6 (16 May 2009) debian
> UPSNAME  : apc-xs900
> CABLE    : USB Cable
> MODEL    : Back-UPS XS 900
> UPSMODE  : Stand Alone
> STARTTIME: Thu Apr 08 23:20:10 EDT 2010
> STATUS   : ONLINE
> LINEV    : 118.0 Volts
> LOADPCT  : 16.0 Percent Load Capacity
> BCHARGE  : 084.0 Percent
> TIMELEFT : 48.4 Minutes
> MBATTCHG : 5 Percent
> MINTIMEL : 3 Minutes
> MAXTIME  : 0 Seconds
> SENSE    : Low
> LOTRANS  : 078.0 Volts
> HITRANS  : 142.0 Volts
> ALARMDEL : Always
> BATTV    : 25.9 Volts
> LASTXFER : Unacceptable line voltage changes
> NUMXFERS : 6
> XONBATT  : Fri Apr 16 00:40:37 EDT 2010
> TONBATT  : 0 seconds
> CUMONBATT: 11 seconds
> XOFFBATT : Fri Apr 16 00:40:39 EDT 2010
> SELFTEST : NO
> STATFLAG : 0x07000008 Status Flag
> MANDATE  : 2007-07-03
> SERIALNO : JB0727006727
> BATTDATE : 2143-00-36
> NOMINV   : 120 Volts
> NOMBATTV : 24.0 Volts
> NOMPOWER : 540 Watts
> FIRMWARE : 830.E6 .D USB FW:E6
> APCMODEL : Back-UPS XS 900
> END APC  : Sat Apr 17 22:24:00 EDT 2010

NUT will give back:

> battery.charge: 42
> battery.charge.low: 10
> battery.charge.warning: 50
> battery.date: 2001/09/25
> battery.mfr.date: 2003/02/18
> battery.runtime: 3330
> battery.runtime.low: 120
> battery.type: PbAc
> battery.voltage: 24.8
> battery.voltage.nominal: 24.0
> device.mfr: American Power Conversion
> device.model: Back-UPS RS 1000
> device.serial: JB0307050741
> device.type: ups
> driver.name: usbhid-ups
> driver.parameter.pollfreq: 30
> driver.parameter.pollinterval: 2
> driver.parameter.port: auto
> driver.parameter.vendorid: 051D
> driver.version: 2.4.3
> driver.version.data: APC HID 0.95
> driver.version.internal: 0.34
> input.sensitivity: high
> input.transfer.high: 138
> input.transfer.low: 97
> input.transfer.reason: input voltage out of range
> input.voltage: 121.0
> input.voltage.nominal: 120
> ups.beeper.status: disabled
> ups.delay.shutdown: 20
> ups.firmware: 7.g3 .D
> ups.firmware.aux: g3
> ups.load: 2
> ups.mfr: American Power Conversion
> ups.mfr.date: 2003/02/18
> ups.model: Back-UPS RS 1000
> ups.productid: 0002
> ups.serial: JB0307050741
> ups.status: OL CHRG
> ups.test.result: No test initiated
> ups.timer.reboot: 0
> ups.timer.shutdown: -1
> ups.vendorid: 051d

Same information, but better variable names, plus you can query for any number of variables. Not all UPSes support all variables, though (and there are plenty more variables that my UPSes don't support, like temperature). You can also send commands to the UPS (for instance, I was able to shut off the beeper on the failing APC) using this software.

So yes, it's nice, but its quirky nature was something I wasn't expecting after a week of electric musical chairs.

[1] gopher://gopher.conman.org/1Phlog:2010/04/07
[2] gopher://gopher.conman.org/0Phlog:2010/02/09.1
[3] gopher://gopher.conman.org/0Phlog:2010/04/18.1
[4] http://www.apc.com/
[5] http://sourceforge.net/projects/apcupsd/
[6] http://www.cyberpowersystems.com/products/ups-systems/browse-by-category/intelligent-lcd-ups/CP1000AVRLCD.html?selectedTabId=overview&imageI=#tab-box
[7] http://www.networkupstools.org/
[8] http://www.networkupstools.org/compat/stable.html
[9] gopher://gopher.conman.org/IPhlog:2010/04/17/ups.jpg
[10] http://lists.alioth.debian.org/pipermail/nut-upsdev/2010-March/004673.html

Email Sean Conner at sean@conman.org.
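
A minimal sketch of the kind of query script described above, for anyone curious what talking to the NUT server involves. It is written in Python rather than Lua and is not the original script. It assumes upsd is listening on its default TCP port (3493) and that the UPS is named apc1000 as in the configuration shown earlier; the function names and the choice of variables to log are made up for illustration. The protocol exchange itself is a plain-text `LIST VAR <ups>` request answered by `VAR <ups> <name> "<value>"` lines between BEGIN and END markers.

```python
import shlex
import socket

NUT_PORT = 3493  # default upsd port; adjust if the server is configured otherwise


def parse_list_var(lines, ups):
    """Turn the response lines of a LIST VAR request into a dict of strings."""
    variables = {}
    for line in lines:
        # shlex handles the double-quoted value, including spaces inside it.
        fields = shlex.split(line)
        if not fields:
            continue
        if fields[0] == "ERR":
            raise RuntimeError("upsd error: " + " ".join(fields[1:]))
        if fields[:3] == ["END", "LIST", "VAR"]:
            break
        if fields[0] == "VAR" and len(fields) >= 4 and fields[1] == ups:
            variables[fields[2]] = fields[3]
    return variables


def query_ups(ups, host="localhost", port=NUT_PORT, timeout=5.0):
    """Ask upsd for every variable of one UPS and return them as a dict."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(f"LIST VAR {ups}\n".encode("ascii"))
        lines = []
        with sock.makefile("r", encoding="utf-8") as reader:
            for line in reader:
                lines.append(line)
                if line.startswith(("END LIST VAR", "ERR")):
                    break
    return parse_list_var(lines, ups)


def format_stats(ups, stats, names):
    """Build a one-line summary; variables the UPS lacks show up as '?'."""
    return f"{ups}: " + " ".join(f"{n}={stats.get(n, '?')}" for n in names)


def log_stats(ups, names=("ups.status", "battery.charge", "input.voltage")):
    """Query the UPS and write one summary line to syslog (Unix only)."""
    import syslog  # imported here so the parsing code works on any platform

    syslog.syslog(syslog.LOG_INFO, format_stats(ups, query_ups(ups), names))
```

Calling `log_stats("apc1000")` from a small script run by cron every five minutes would give roughly the behaviour described in the post, without the extra upslog and logger processes.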