Post AxfaGIUzUSSAOYDQky by zwol@masto.hackers.town
 (DIR) Post #AxfaGGJvacPpdf1ZEe by zwol@masto.hackers.town
       2025-08-29T16:21:22Z
       
       2 likes, 0 repeats
       
       I have to rebuild a server again and so I've got a buncha heterodox hot takes about server configuration best practices rattling around in my head again. Who wants to hear them? Each like = 1 take, until I run out.
       
 (DIR) Post #AxfaGHXn2Z69QwW9Ts by zwol@masto.hackers.town
       2025-08-29T17:07:15Z
       
       0 likes, 0 repeats
       
       More hits than I was expecting! Ok, here we go.
       
       1. Don't install sudo. Instead, set sshd to allow root logins via key only ("PermitRootLogin prohibit-password"). This makes privilege escalation significantly harder.
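
       One way to confirm the sshd setting is what you think it is: read it back out of the config. This is a minimal sketch, assuming a single sshd_config file with no Include or Match overrides. The function name and the sample text are made up for illustration, not taken from the thread.

```python
def root_login_policy(sshd_config_text):
    """Return the PermitRootLogin value from the global part of sshd_config.

    sshd uses the first value it obtains for a keyword, so later
    duplicates are ignored. Include files and Match blocks are NOT
    handled; parsing stops at the first Match line.
    """
    for raw in sshd_config_text.splitlines():
        line = raw.split("#", 1)[0].strip()          # drop comments
        parts = line.replace("=", " ", 1).split()    # "Key value" or "Key=value"
        if not parts:
            continue
        keyword = parts[0].lower()                   # keywords are case-insensitive
        if keyword == "match":
            break
        if keyword == "permitrootlogin" and len(parts) > 1:
            return parts[1].lower()
    # Upstream OpenSSH default since 7.0; a distro may patch this (assumption).
    return "prohibit-password"


# Hypothetical config fragment, for illustration only.
SAMPLE = """\
PasswordAuthentication no
PermitRootLogin prohibit-password
"""
print(root_login_policy(SAMPLE))
```

       On a live machine, running `sshd -T` as root prints the effective configuration, defaults included, and is the more reliable check.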
       
 (DIR) Post #AxfaGIUzUSSAOYDQky by zwol@masto.hackers.town
       2025-08-29T17:08:42Z
       
       0 likes, 0 repeats
       
       2. Closely related to 1: If you can swing it such that root is the _only_ account that's unlocked for shell access -- every other account has a locked password and '/usr/sbin/nologin' for its shell -- that also makes privilege escalation significantly harder.
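
       A rough way to audit this is to list every account whose shell field is not one of the usual "no login" programs. The sketch below works on /etc/passwd-style text. The set of non-login shells and the sample accounts are assumptions, so adjust them for your system.

```python
# Shells that refuse interactive logins (assumed list, extend as needed).
NON_LOGIN_SHELLS = {
    "/usr/sbin/nologin", "/sbin/nologin", "/bin/false", "/usr/bin/false",
}


def accounts_with_shell(passwd_text):
    """Return the names of accounts that can still get a shell."""
    names = []
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) != 7:          # skip blank or malformed lines
            continue
        name, shell = fields[0], fields[6]
        # An empty shell field means /bin/sh, so it counts as a login shell.
        if shell not in NON_LOGIN_SHELLS:
            names.append(name)
    return names


# Made-up passwd entries, for illustration only.
SAMPLE = """\
root:x:0:0:root:/root:/bin/bash
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash
"""
print(accounts_with_shell(SAMPLE))
```

       This only covers the shell half of the take. Whether a password is locked is recorded in /etc/shadow, which only root can read and which this sketch does not look at.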
       
 (DIR) Post #AxfaGJjusRzEF8Crey by zwol@masto.hackers.town
       2025-08-29T17:11:12Z
       
       1 likes, 0 repeats
       
       3. A really good reason to separate /usr and /home is that you can then mount /home with the nosuid flag, and, ideally, also the noexec flag.
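
       To check that the flags are really in effect, you can look at the mount table. A small sketch, assuming Linux and text in the /proc/mounts format (device, mount point, type, options, ...). The function names and sample lines are hypothetical.

```python
def mount_options(mounts_text, mount_point):
    """Return the set of options for mount_point, or None if it is not a mount."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            options = set(fields[3].split(","))   # the last matching entry wins
    return options


def missing_flags(mounts_text, mount_point="/home", wanted=("nosuid", "noexec")):
    """Return the wanted flags that mount_point is mounted without."""
    options = mount_options(mounts_text, mount_point)
    if options is None:
        # Not a separate filesystem, so none of the flags can apply to it.
        return list(wanted)
    return [flag for flag in wanted if flag not in options]


# Made-up mount table, for illustration only.
SAMPLE = """\
/dev/sda2 / ext4 rw,relatime 0 0
/dev/sda3 /home ext4 rw,nosuid,nodev,relatime 0 0
"""
print(missing_flags(SAMPLE))
```

       On a real system you would pass in the contents of /proc/mounts. The flags themselves are set in the options field of the /home entry in /etc/fstab.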
       
 (DIR) Post #AxfaGRFiwaaZXo8ri4 by zwol@masto.hackers.town
       2025-08-29T17:11:39Z
       
       1 likes, 0 repeats
       
       4. Turn off memory overcommit.
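
       The post does not say which knob it means. On Linux the usual reading is `vm.overcommit_memory = 2`, and that is an assumption here. One thing worth working out before flipping it: in that mode the kernel caps total committed memory at roughly swap plus `vm.overcommit_ratio` percent of RAM, and the ratio defaults to 50. The sketch below just does that arithmetic (it ignores huge pages).

```python
# Meaning of /proc/sys/vm/overcommit_memory on Linux.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (the kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}

GIB_IN_KIB = 1024 * 1024


def commit_limit_kib(ram_kib, swap_kib, overcommit_ratio=50):
    """Approximate CommitLimit in mode 2: swap + RAM * ratio / 100."""
    return swap_kib + ram_kib * overcommit_ratio // 100


# An 8 GiB machine with no swap and the default ratio (illustrative numbers).
limit = commit_limit_kib(8 * GIB_IN_KIB, 0)
print(limit // GIB_IN_KIB, "GiB can be committed")
```

       So on a swapless box, mode 2 with the default ratio lets processes commit only about half of RAM. Raising `vm.overcommit_ratio` or adding swap moves the limit.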
       
 (DIR) Post #AxfaGZ1U3Px3dxXbcm by zwol@masto.hackers.town
       2025-08-29T17:14:08Z
       
       0 likes, 0 repeats
       
       5. Moving ssh to a nonstandard port _probably_ isn't worth the hassle.  I need to do some actual statistics on this, but I _think_ that if you disable remote password login, and set up fail2ban or equivalent for port 22, that's more than good enough, unless you're a hyperscaler, in which case you probably have a separate control-plane network interface anyway.
       
 (DIR) Post #AxfaGgWEBVhetKyl16 by zwol@masto.hackers.town
       2025-08-29T17:14:41Z
       
       0 likes, 0 repeats
       
       6. fail2ban is most definitely worth it, but other firewalling has a good chance of being more trouble than it's worth.  In particular, I wouldn't bother with dropping instead of rejecting SYNs to closed ports, and I'm dubious about firewalling by port number. (About the only thing that's good for, IMO, is putting an extra speed bump in front of an adversary who already has remote execution and wants to persist it.)
       
 (DIR) Post #AxfaGoSGg7HzVNBhxI by zwol@masto.hackers.town
       2025-08-29T17:21:43Z
       
       0 likes, 0 repeats
       
       7. The biggest single thing you can do to protect yourself against privilege escalation *into the kernel* is to use a monolithic kernel.  (There are a bunch of other places that you need to tweak if you want to be _really sure_ no one can ever inject code into supervisor mode, but without CONFIG_MODULES=n or equivalent they're all pointless.)  It's a damn shame that current Linux distros make this so hard.
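
       To see where a given kernel stands, you can inspect its build configuration. A minimal sketch, assuming you already have the config as text. Many distros ship it as /boot/config-<release>, and some kernels expose /proc/config.gz, but neither is guaranteed. The function name is made up.

```python
def modules_enabled(kernel_config_text):
    """True if the config text says loadable module support is built in."""
    for raw in kernel_config_text.splitlines():
        line = raw.strip()
        if line == "# CONFIG_MODULES is not set":   # Kconfig's way of writing "=n"
            return False
        if line.startswith("CONFIG_MODULES="):
            return line.split("=", 1)[1] == "y"
    return False   # symbol absent: treated as not set


# Made-up config excerpts, for illustration only.
DISTRO_STYLE = "CONFIG_MODULES=y\nCONFIG_MODULE_UNLOAD=y\n"
MONOLITHIC = "# CONFIG_MODULES is not set\n"
print(modules_enabled(DISTRO_STYLE), modules_enabled(MONOLITHIC))
```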
       
 (DIR) Post #AxfaGwE1mweTbWaRs0 by zwol@masto.hackers.town
       2025-08-29T17:23:47Z
       
       1 likes, 0 repeats
       
       8. Closely related to 7: The sheer complexity of Linux's various "mandatory access control" modules - SELinux, AppArmor, etc - leads me to believe that most people are better off not only not using them, but configuring them out of the kernel.
       
       I don't know what the *BSD equivalents are, or if they even have any, but if they do I suspect they're equally complex, because the people who *want* this feature have complex needs.
       
 (DIR) Post #AxfaH43Kgaqlsfe1JI by zwol@masto.hackers.town
       2025-08-29T17:26:13Z
       
       0 likes, 0 repeats
       
       9. Don't bother with "secure boot" or with encrypted disks. The only threats these protect against, that are relevant to normal people, are the "malicious hotel staff" and/or "laptop thief" attacks, which are moot for a server. And the risk of bricking the system is very real.
       
 (DIR) Post #AxfaHBqrgpdA4JsAzI by zwol@masto.hackers.town
       2025-08-29T17:28:54Z
       
       0 likes, 0 repeats
       
       10. You need remote system monitoring, but you probably don't need it to be fancy.  The most important thing is that you get an alert if the server goes down.  Something like <https://github.com/Cyclenerd/static_status>, run from a cron job on a separate machine, is plenty good enough for most people.
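
       To give a sense of how little code "not fancy" can mean, here is a bare-bones reachability check of the kind you could run from cron on the separate machine. It is a sketch, not a substitute for the linked tool: it only tests whether a TCP port accepts connections, and the function names are hypothetical.

```python
import socket


def is_reachable(host, port, timeout=5.0):
    """True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:        # refused, timed out, DNS failure, no route, ...
        return False


def down_targets(targets, timeout=5.0):
    """Return the (host, port) pairs from targets that did not answer."""
    return [t for t in targets if not is_reachable(t[0], t[1], timeout)]
```

       A wrapper script would call `down_targets` with your own list of hosts and ports and print whatever comes back. Most cron daemons mail a job's output to MAILTO when it is set, so printing only on failure is enough to get an alert.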
       
 (DIR) Post #AxfaHJtHniJx094Ds8 by zwol@masto.hackers.town
       2025-08-29T17:33:20Z
       
       0 likes, 0 repeats
       
       11. Find a way to put the system configuration under version control. However, don't get too fancy with it.  NixOS-style declarative system configuration is plenty good enough for most people.  Ansible or similar is also plenty good enough for most people. Kubernetes is for hyperscalers.
       
 (DIR) Post #AxfaHRVTU81kcbzJrs by zwol@masto.hackers.town
       2025-08-29T17:36:50Z
       
       0 likes, 0 repeats
       
       12. Containers are massively overrated.  You can get 90% of the same inter-service isolation, for less than half as much effort, with traditional Unix accounts.  If you're willing to put up with systemd, it can impose restrictions on daemons that, as far as I can tell, are every bit as strong as what containers do.
       
 (DIR) Post #AxfaHZOg9HLR5qs0O0 by zwol@masto.hackers.town
       2025-08-29T17:46:21Z
       
       0 likes, 0 repeats
       
       13. I call this the Highlander Principle of Software Package Management: There should be *one* package manager on your computer, and it should manage *all* of the installed software on your computer.
       
       This is, IMO, true for all computers no matter what, but it is especially important for servers.  If you have to run more than two or three commands to get *everything* updated with the latest security patches, that's too much work and you're going to fall behind on it.
       
 (DIR) Post #AxfaHhOyO4Kjv54LxI by zwol@masto.hackers.town
       2025-08-29T17:47:55Z
       
       0 likes, 0 repeats
       
       13a. The Highlander Principle has three important corollaries:
       
       - If you insist on using containers, use them *only* for service configuration and isolation, not for package management.
       - An OS that manages its "core" separately from "applications" or "ports" or whatever is misdesigned.
       - If you need software that hasn't been packaged in the form your OS-level package manager wants, you need to package it in that form yourself.
       
 (DIR) Post #AxfaHpDvJ2FSB7xdqK by zwol@masto.hackers.town
       2025-08-29T17:50:37Z
       
       0 likes, 0 repeats
       
       (thread paused till evening, I should do my actual job somewhat)
       
 (DIR) Post #Axi5OiQwR3CA2ejfSy by zwol@masto.hackers.town
       2025-08-30T00:41:35Z
       
       0 likes, 0 repeats
       
       Resuming the thread. Can I think of 36 more hot takes about server administration? We'll see.
       
       I probably should've said back at the beginning that I'm a strictly small-time sysadmin. I've got two cloud servers, one for public-facing stuff and one for private stuff, and there's only one other person who has write access to anything on them. My problems are simple, and my goal is to spend as little time on maintenance as possible. If your situation is different, you might not want to do like me.
       
 (DIR) Post #Axi5OjVwPwmxOS5Atk by zwol@masto.hackers.town
       2025-08-30T00:45:59Z
       
       0 likes, 0 repeats
       
       14. You should definitely have an automated process scanning your logfiles and notifying you when unusual things happen. But you need to spend time tuning it so it _only_ notifies you about the unusual stuff.
       
       For example, out of the box, logcheck will (or used to, anyway) tell you about all the _failed_ attempts to guess ssh passwords. You don't care. They failed. Filter that out.
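
       The idea behind that kind of tuning can be sketched in a few lines: keep a list of patterns for messages you have decided are noise, and report only what matches none of them. This is not how logcheck itself is configured (it uses its own rule files). The patterns and the sample log lines below are illustrative assumptions, written against sshd's usual message wording.

```python
import re

# Messages considered noise. "sshd-session" covers newer OpenSSH releases
# that log from a separate session process (assumption about your version).
IGNORE = [
    re.compile(r"sshd(?:-session)?\[\d+\]: Failed password for "),
    re.compile(r"sshd(?:-session)?\[\d+\]: Invalid user "),
]


def interesting(lines):
    """Return only the lines that match none of the IGNORE patterns."""
    return [line for line in lines if not any(p.search(line) for p in IGNORE)]


# Made-up log lines, for illustration only.
SAMPLE = [
    "Aug 30 01:02:03 host sshd[1234]: Failed password for invalid user admin from 203.0.113.5 port 51234 ssh2",
    "Aug 30 01:02:04 host sshd[1234]: Invalid user admin from 203.0.113.5 port 51234",
    "Aug 30 01:05:00 host sshd[1300]: Accepted publickey for root from 198.51.100.7 port 40022 ssh2",
]
for line in interesting(SAMPLE):
    print(line)
```

       Only the successful login survives the filter, which is the line you would actually want to hear about.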
       
 (DIR) Post #Axi5OkQJ2NsKDGSBkm by zwol@masto.hackers.town
       2025-08-30T00:51:55Z
       
       1 likes, 0 repeats
       
       15. Remember I said my goal is to do as little maintenance work as possible? So I am a big believer in boring, slow-moving operating systems. I don't need the fancy new stuff, I need to be confident that I can install the latest batch of security patches without anything breaking.
       
       For twenty years I trusted Debian to do this for me. Then they botched their transition to merged /usr. I don't know if I will ever be able to trust them again.
       
 (DIR) Post #Axi5OrIPU8ibWnoTuy by zwol@masto.hackers.town
       2025-08-30T00:57:25Z
       
       1 likes, 0 repeats
       
       16. You're probably thinking, but what if I actually do need the fancy new stuff?  My answer is: Rearrange things so you don't.  It's fine to be a late adopter. It's fine to be still using HTTP/1.1 and CGI and other things that work the same way they did five years ago.
       
 (DIR) Post #Axi5OyQoxGbbeunnV2 by zwol@masto.hackers.town
       2025-08-30T00:59:39Z
       
       0 likes, 0 repeats
       
       17. Make your servers automatically patch themselves. Since you're using a boring, slow-moving OS, this is safe, and it means you don't have to remember to do it.
       
 (DIR) Post #Axi5P692HH8Ha4XhtA by zwol@masto.hackers.town
       2025-08-30T01:05:14Z
       
       0 likes, 0 repeats
       
       18. You need an NTP daemon. For people who aren't running a time _server_, 'chrony' currently appears to be the least bad NTP daemon on the market.
       
 (DIR) Post #Axi5PDMlQdGzyg1Gl6 by zwol@masto.hackers.town
       2025-08-30T01:06:59Z
       
       0 likes, 0 repeats
       
       19. Most Unixes come with a bunch of off-the-shelf services that you probably _don't_ need on your server.  Take some time to go through the list of all the installed packages and weed out everything you don't need.  Be aggressive.  You can always put them back later.
       
 (DIR) Post #Axi5PKSh2zAvz5qbTM by zwol@masto.hackers.town
       2025-08-30T01:09:53Z
       
       1 likes, 0 repeats
       
       20.  However, you should make sure you _do_ have 'dig', 'ping', 'traceroute', 'tcpdump', 'lsof', and 'strace' installed.  These are essential troubleshooting tools, and some of them are very useful in situations where the network is busted and you _can't_ download them on the fly.
       
 (DIR) Post #Axi5PRSx0QXZhv1PLU by zwol@masto.hackers.town
       2025-08-30T01:11:41Z
       
       0 likes, 0 repeats
       
       21. This is more of a "Unix shell usability" hot take than a "sysadmin" hot take, but did you know you can embed newlines in your shell prompt?  PS1='\u@\h\n\w\n\$ ' is significantly more ergonomic than the usual PS1='\u@\h \w \$ '.
       
 (DIR) Post #Axi5PZKjkqiw6lExea by zwol@masto.hackers.town
       2025-08-30T01:20:02Z
       
       0 likes, 0 repeats
       
       22. You gotta have automated data backups. You knew that already. But: those data backups should be going to a machine in a *different data center* than the one where the server itself is. Preferably, one run by a different organization.
       
 (DIR) Post #Axi5PglE8l4Z8wgiPo by zwol@masto.hackers.town
       2025-08-30T01:21:47Z
       
       0 likes, 0 repeats
       
       23. You should _not_ be backing up the OS. Instead, you should be prepared to recreate the server from scratch at any moment, and then restore the latest data backup onto the new machine. That's what the version controlled system configuration is for.
       
 (DIR) Post #Axi5PnnbyI8gyMrDbU by zwol@masto.hackers.town
       2025-08-30T01:25:50Z
       
       0 likes, 0 repeats
       
       24. It's a pain in the ass to set up, but consider having block-level integrity protection (dm_integrity on Linux) as your lowest layer of storage management. This is especially valuable with RAID, because it fixes the problem where, if a RAID stripe is self-inconsistent, you don't know which copy of the stripe is the good one.
       
       (Disk encryption *should* have integrity protection built in, but doesn't always. Disk encryption without integrity protection is dangerous, for cryptographic reasons.)
       
 (DIR) Post #Axi5Pub6hBIQ3c3paK by zwol@masto.hackers.town
       2025-08-30T01:27:10Z
       
       0 likes, 0 repeats
       
       25. If you're building or specifying a server from the hardware on up: get the ECC RAM.  Actually do this no matter what kind of computer you're building.
       
 (DIR) Post #Axi5Q258ruTrGnAPs8 by zwol@masto.hackers.town
       2025-08-30T01:35:53Z
       
       0 likes, 0 repeats
       
       26. Some security hardening tips that don't appear in most security hardening guides:
       
       * Disable core dumps.
       * On 64-bit machines, disable 32-bit executable compatibility.
       * Enlarge the size of the NULL pointer guard region at the bottom of the address space.  4194304 (4MiB) is safe. Higher is more effective as a security measure (I'd *like* to set this to 2**32!) but higher breaks non-PIE executables.
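
       Two of these three can be checked from userspace on Linux. The sketch below assumes that the guard region is the `vm.mmap_min_addr` sysctl and that "disable core dumps" means a zero RLIMIT_CORE plus `fs.suid_dumpable = 0`. The post names neither, so treat both mappings as assumptions. The 32-bit compatibility switch is a kernel build or boot option and is not covered here.

```python
import resource
from pathlib import Path

MIB = 1024 * 1024


def findings(mmap_min_addr, suid_dumpable, core_hard_limit):
    """Return human-readable problems for the given values (empty list = fine)."""
    problems = []
    if mmap_min_addr < 4 * MIB:
        problems.append(f"vm.mmap_min_addr={mmap_min_addr} is below 4 MiB")
    if suid_dumpable != 0:
        problems.append(f"fs.suid_dumpable={suid_dumpable}, setuid programs may dump core")
    if core_hard_limit != 0:
        problems.append("RLIMIT_CORE hard limit is nonzero, core dumps are possible")
    return problems


def gather(root="/proc/sys"):
    """Read the live values on a Linux system."""
    def read(name):
        # sysctl names map to paths: vm.mmap_min_addr -> /proc/sys/vm/mmap_min_addr
        return int(Path(root, *name.split(".")).read_text())

    _, hard = resource.getrlimit(resource.RLIMIT_CORE)   # limit of this process
    return read("vm.mmap_min_addr"), read("fs.suid_dumpable"), hard
```

       Calling `findings(*gather())` on the server reports what is still at its default. The RLIMIT_CORE reading only reflects the process running the check, so services started with different limits need to be checked separately.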
       
 (DIR) Post #Axi5QAKgC1f8rIUwCW by zwol@masto.hackers.town
       2025-08-30T01:39:40Z
       
       0 likes, 0 repeats
       
       27. Linux's defaults for how much RAM can get filled up with "dirty pages" (data that needs to get written to persistent storage Real Soon Now), before it actually starts doing writeback, are *way* too high for how much RAM modern computers have.  Set both `vm.dirty_background_bytes` and `vm.dirty_bytes` to no more than a few tens of megabytes.
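
       To put numbers on "way too high": the kernel defaults are ratios, `vm.dirty_background_ratio = 10` and `vm.dirty_ratio = 20`, measured as a percentage of available memory. The sketch below approximates that with total RAM and then renders a sysctl fragment using byte limits instead. The 16 MiB and 48 MiB figures are placeholders picked to fit "a few tens of megabytes", not values from the post.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def default_dirty_thresholds(available_bytes, background_ratio=10, dirty_ratio=20):
    """Return (background, hard) dirty-page thresholds implied by the ratios."""
    return (available_bytes * background_ratio // 100,
            available_bytes * dirty_ratio // 100)


def sysctl_fragment(background_mib=16, dirty_mib=48):
    """Render sysctl.conf-style lines that set byte limits instead of ratios."""
    return (f"vm.dirty_background_bytes = {background_mib * MIB}\n"
            f"vm.dirty_bytes = {dirty_mib * MIB}\n")


background, hard = default_dirty_thresholds(64 * GIB)
print(f"64 GiB box, defaults: writeback starts at {background / GIB:.1f} GiB, "
      f"writers block at {hard / GIB:.1f} GiB")
print(sysctl_fragment(), end="")
```

       Writing one of the `*_bytes` settings makes its `*_ratio` counterpart read back as 0, since the kernel honours only one of each pair at a time.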
       
 (DIR) Post #Axi5QHaXDTVLMUyCO0 by zwol@masto.hackers.town
       2025-08-30T01:46:16Z
       
       0 likes, 0 repeats
       
       28. Use a boring, reliable file system. Ideally, use a boring, reliable file system written by people who understand that file systems exist to serve the needs of applications -- but I'm not sure there *is* any such file system!  (see https://wiki.postgresql.org/wiki/Fsync_Errors )
       
 (DIR) Post #Axi5QPZPXXMK7KVPkG by zwol@masto.hackers.town
       2025-08-30T01:47:25Z
       
       0 likes, 0 repeats
       
       29. Do not use any file system that doesn't have an offline consistency check and repair tool (fsck), no matter how boring and reliable it otherwise appears to be (yeah, ZFS, I'm talking about you)
       
 (DIR) Post #Axi5QX6HXioPTIwGS8 by zwol@masto.hackers.town
       2025-08-30T01:53:46Z
       
       0 likes, 0 repeats
       
       30. You probably do not want the tsuris of being a sysadmin for people you don't know personally. In fact, you probably do not want the tsuris of being a sysadmin for more than about five people who you *do* know personally.
       
       (Yes, I speak from experience.  If I could send *one message* back in time to myself in 1995, it would be "do not sign up to sysadmin the I.I. Rabi Program computer lab.")
       
 (DIR) Post #Axi5Qem10xM1GlWByS by zwol@masto.hackers.town
       2025-08-30T02:36:16Z
       
       0 likes, 0 repeats
       
       that feels like a good place to end the thread for this evening, might pick it up again tomorrow
       
 (DIR) Post #Axi7KmZ9KSr5blpKkK by zwol@masto.hackers.town
       2025-08-30T23:22:47Z
       
       0 likes, 0 repeats
       
       @lanodan see take #29
       
 (DIR) Post #Axi8lUtBG8skYbUzYm by zwol@masto.hackers.town
       2025-08-30T23:31:17Z
       
       0 likes, 0 repeats
       
       @lanodan zpool scrub *does not* repair, or even validate, the on-disk data structure. It only checks the block checksums. I have actually been burnt by this.