[HN Gopher] All my servers have an 8 GB empty file on disk
       ___________________________________________________________________
        
       All my servers have an 8 GB empty file on disk
        
       Author : sonicrocketman
       Score  : 524 points
       Date   : 2021-03-25 18:40 UTC (4 hours ago)
        
 (HTM) web link (brianschrader.com)
 (TXT) w3m dump (brianschrader.com)
        
       | joana035 wrote:
        | Mind you, you can also use tune2fs. It has an option "-m" that
        | lets you tune how much reserved space is dedicated to the root
        | user.
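As a concrete sketch of the tune2fs option mentioned above (the device path is a placeholder for your actual ext2/3/4 filesystem):

```shell
# Show the current reserved-block setting (device path is hypothetical).
tune2fs -l /dev/sda1 | grep -i 'reserved block'

# Reserve 8% of the filesystem for root instead of the 5% default.
tune2fs -m 8 /dev/sda1
```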
        
       | pritambarhate wrote:
        | All my servers have an alarm for when disk usage goes above
        | 70%. It sends an email every hour once usage crosses that
        | threshold. I've never had a server go down because of a disk
        | space issue after adopting this practice.
       | 
        | Also, one of the main reasons server disks fill up is log
        | files. Always remember to "logrotate" your log files and you
        | will rarely have this issue.
       | 
        | One more thing: for all user-uploaded files, use external
        | storage like NFS or S3.
        
         | ghostly_s wrote:
         | Is there a package I can install to set this up?
        
           | Moto7451 wrote:
            | Icinga is a common solution for monitoring filesystem and
            | other usage metrics. I imagine his setup, if custom rolled,
            | is a shell script checking df and sending an email when the
            | usage is at or above 70%.
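A minimal cron-driven version of such a script might look like this (the threshold, recipient address, and use of mail(1) are all assumptions):

```shell
#!/bin/sh
# Alert when root filesystem usage reaches a threshold (70% here).
THRESHOLD=70
# df -P guarantees POSIX single-line output; field 5 is "Use%".
USAGE=$(df -P / | awk 'NR==2 { sub("%", "", $5); print $5 }')
if [ "$USAGE" -ge "$THRESHOLD" ]; then
    echo "Disk usage on $(hostname) is at ${USAGE}%" \
        | mail -s "Disk space warning: $(hostname)" ops@example.com
fi
```

Run hourly from cron, e.g. `0 * * * * /usr/local/bin/disk-alert.sh`.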
        
         | tetha wrote:
          | This is in the same vein as the point I was going to make.
         | 
         | Most uncontrolled space usage comes from logs, users doing user
         | things, or something like build servers just eating temporary
         | and caching storage for lunch. Databases also tend to have
         | uncontrolled space usage, but that tends to be wanted.
         | 
          | So, if you push /var/log to its own 20-30 GB partition, a mad
          | logger cannot fill up /. It can kill logging, but no logging
          | is better than fighting with a full /. Similar things with
          | /home -
         | let users fill up their home dirs and pout and scream about
         | it... but / is still fine. And you can use their input to
         | provide more storage, if they have useful workflows.
         | 
         | Something like databases - where their primary use case is to
         | grow - need monitoring though to add storage as necessary.
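Carving out a dedicated /var/log as described could look roughly like this on an LVM setup (the volume group name "vg0" and the 20 GB size are hypothetical):

```shell
# Give /var/log its own 20 GB volume so a runaway logger can't fill /.
lvcreate -L 20G -n varlog vg0
mkfs.ext4 /dev/vg0/varlog
mount /dev/vg0/varlog /mnt
cp -a /var/log/. /mnt/        # preserve the existing logs
umount /mnt
echo '/dev/vg0/varlog /var/log ext4 defaults 0 2' >> /etc/fstab
mount /var/log
```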
        
         | bonestamp2 wrote:
         | > for all user uploaded files use external storage like NFS or
         | S3
         | 
         | We send our log files to S3 too. I mean, we write them locally
         | (EC2) and then push them to S3 every minute.
         | 
         | Then we have a tool that will let us search the log files in S3
         | and it will parse these rotated log files and join together the
         | relevant pieces depending on what we're looking for (or all of
         | it for a specific time period if we don't know what we're
         | looking for).
         | 
          | This is great because if the server goes down and we can't
          | access it, or the instance is gone, we can still see log
          | files from shortly before the problem occurred. We also use
          | bugsnag, etc. for real-time logging and tracking where
          | possible.
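The write-locally-then-push pattern described above can be as simple as a cron entry using the AWS CLI (bucket name and log path are hypothetical):

```shell
# Cron entry (crontab -e): push locally written logs to S3 every minute.
# `aws s3 sync` only uploads files that changed since the last run.
* * * * * aws s3 sync /var/log/myapp "s3://example-log-bucket/logs/$(hostname)/"
```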
        
         | gowld wrote:
         | Disk space for an active server is so cheap; why not alert at
         | 30%?
        
       | 1123581321 wrote:
       | This is clever. Our shaky version of this, historically, has been
       | to run ncdu and hastily delete the first large log file we see.
       | It's not ideal.
        
         | kernelsanderz wrote:
         | ncdu saves my bacon at least every few months. I do machine
         | learning and am always running out of space!
        
           | dylan604 wrote:
           | sounds like your machine doesn't seem to be learning the
           | right things.
        
         | sonicrocketman wrote:
         | (that's actually how I solved the original issue that I
         | reference in the post, and how I got the idea for this silly
         | solution)
        
         | ttyprintk wrote:
         | The filesystems where headroom matters are var, tmp and
         | sometimes root. I like this strategy with logfiles because
         | nethack.log.gz.30 was approximately as important as empty
          | space. Keeping another 8 GB on root and tmp seems extreme.
        
       | johbjo wrote:
       | No. Careful partitioning is the solution to this problem. Monitor
        | the growth of your partitions and make sure nothing on rootfs
        | or other sensitive partitions grows significantly.
        
         | slaymaker1907 wrote:
         | I don't think this has to be an either/or scenario. Having some
         | bloat you can get rid of quickly is a nice backup in case your
         | monitoring fails for whatever reason.
        
       | reph2097 wrote:
       | This is stupid. Just don't make your servers use up all space.
       | That's why ext can reserve space for root, 5% by default.
        
       | michelb wrote:
        | Showing off that I'm not a sysadmin, but wouldn't a monitoring
       | daemon work? Once disk usage grows past a certain uncomfortable
       | threshold you get an email/notification to see what's up. I mean
       | you obviously are monitoring other server vitals anyway right?
        
         | jodrellblank wrote:
          | For the cases mentioned below where space fills up quickly
          | due to a bug, maybe yes. Outside that, there's the problem
          | that you can ignore the emails (or be sick, asleep, etc.).
          | Worse if they go to a team and everyone is busy and assumes
          | someone else will deal with it. Bad if you aren't in charge
          | and tell the people in charge and they nod and don't decide
          | anything - they prefer to run at 70/80/90/95% used
          | indefinitely instead of signing a cheque.
         | 
          | When the drive fills and everything breaks, you /have/ to
          | respond, and it becomes socially OK to make it your highest
          | priority and drop everything. An email saying "only a few
          | weeks until maybe it runs out of space" is much harder to
          | prioritise and get anyone to care about. With this system,
          | the time when it fills and breaks has some flex for you to
          | not go down with the server, and save your own bacon. It's
          | as much a fix for the organization as anything else.
         | 
          | I see this most in smaller companies' aging systems: they
          | had ample storage and drives a few years ago when they were
          | new; now they're crammed full with the growth of software,
          | libraries, updates, data, new services being deployed on
          | them, and increased demands. Nobody wants to commit to more
          | storage for an older system towards the end of its
          | warranties, but they definitely don't want to commit to the
          | much larger cost of a replacement and all the project work,
          | and running at 90% full costs nothing and involves no
          | decisions. 91%. 92%.
        
         | willbutler wrote:
         | Monitoring is a good idea, regardless. However, there are cases
         | where a bug or some other issue can cause disk usage to ramp
         | too quickly for someone to respond to an alert.
        
       | 00deadbeef wrote:
       | Isn't using LVM and holding some space back a better solution for
       | this?
       | 
       | Also I keep databases on their own partition so that nothing else
       | can accidentally fill up the space and lead to data loss.
        
         | jstanley wrote:
          | Maybe, but the author says:
         | 
         | > On Linux servers it can be incredibly difficult for any
         | process to succeed if the disk is full.
         | 
         | You won't feel too clever if you come to grow your LVM volume
         | into the free space and it won't work because there's no free
         | space on the filesystem! :)
         | 
         | (I don't actually know if this would fail or not - but the
         | point is "rm spacer.img" is pretty much guaranteed not to
         | fail).
        
           | LinuxBender wrote:
           | I've used LVM for this purpose plenty of times. lvm2 at least
           | has not prevented me from extending a full disk. lvm +
           | reserved blocks + a small spacer file are all decent options,
           | even better when used together.
        
         | JshWright wrote:
         | rm /path/to/big.file is faster than looking up the commands to
         | expand the LVM volume and grow the filesystem.
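For comparison, the multi-step LVM path being referenced is roughly the following (volume group and LV names are placeholders):

```shell
# Grow the logical volume by 8 GB and resize the filesystem in one step
# (-r invokes fsadm to grow the fs after extending the LV).
lvextend -r -L +8G /dev/vg0/root

# Or as separate steps:
lvextend -L +8G /dev/vg0/root
resize2fs /dev/vg0/root     # ext2/3/4; use xfs_growfs for XFS
```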
        
           | 00deadbeef wrote:
           | They're not hard to memorise. I never look them up.
        
             | nkellenicki wrote:
             | I don't use lvm often enough to memorise the commands. "rm
             | spacer.img" is short and easy for _anyone_ to remember.
        
             | JshWright wrote:
             | Good for you. That's not how my memory works. If I don't
             | use a command regularly, I don't trust myself to remember
             | it correctly. Even if I did though, that's a multi-step
             | process, compared to the single command needed to remove a
             | file.
        
         | Spivak wrote:
         | Tomato potato. If you use LVM or anything like it to reserve
         | space then in your failure situation you have to extend the lv,
         | partition, and fs before the space becomes available. More work
         | than just rm'ing the file.
         | 
          | I think the ideal is tuning the reserved blocks in your
          | filesystem.                  xfs_io -x -c 'resblks ...
        
           | 00deadbeef wrote:
           | It's not really much more work. One command extra.
        
       | pgray wrote:
       | tune2fs -m not good enough?
        
       | k__ wrote:
       | I remember a discussion here about a dude who did this with
       | memory in game development. People didn't like the idea very
       | much.
       | 
        | To me it has a taste of domain squatting or GPU scalping,
        | except you don't do it to strangers, but to your own team.
        
       | crabmusket wrote:
       | This reminds me of a similar story in a classic Gamasutra
       | article[1] (the section is "The Programming Antihero", and I'd
       | recommend the other pages of the article for a few good
       | chuckles). Apocryphal or not, it makes for a good story.
       | 
       | > I can see how sometimes, when you're up against the wall,
       | having a bit of memory tucked away for a rainy day can really
       | make a difference. Funny how time and experience changes
       | everything.
       | 
       | [1]
       | https://www.gamasutra.com/view/feature/132500/dirty_coding_t...
        
       | Denvercoder9 wrote:
       | > Copy commands and even deletions can fail or take forever as
       | memory tries to swap to a full disk
       | 
       | That's only a problem if your memory is full as well, and even
       | then, I've never encountered a server that uses a swapfile
       | instead of a swap partition.
        
         | jstanley wrote:
          | That's also only a problem if your swap is backed by a file
          | on your filesystem, which is an exceedingly uncommon
          | configuration.
        
         | yjftsjthsd-h wrote:
         | Even a swap file shouldn't matter, since it's still not sparse.
         | The one exception is if you're on a system that dynamically
         | adds and removes swap files - I believe darwin does that, and I
          | _think_ it might be possible to do on Linux(?) but I've not
          | actually seen it done.
        
         | Deathmax wrote:
         | Not quite the same situation as described in the article, but
         | it is still possible for the kernel to swap memory in and out
         | of disk even without a swap file/partition. Memory used for
         | storing executable binaries is allowed to be moved out of
         | memory, as a copy of it lives on disk. This means you can still
         | encounter memory thrashing (and thus system unresponsiveness)
         | under low memory situations.
        
           | Denvercoder9 wrote:
           | > Memory used for storing executable binaries is allowed to
           | be moved out of memory, as a copy of it lives on disk.
           | 
           | On Linux, this is not necessarily the case, as you can change
           | the file on disk while the executable is running. I don't
           | know if Linux just keeps executable code in memory all the
           | time, or if it is smart enough to detect whether a copy of
           | executable pages still lives on disk.
        
             | unilynx wrote:
             | You should get a "text file busy" error if you try that.
             | 
             | What you can do is delete and then recreate the executable.
             | Then the deleted data simply sticks around on disk until
             | it's no longer referenced
        
       | benibela wrote:
        | I have a dual-boot laptop with Windows and Linux, and use the
        | NTFS partition to share data between them.
       | 
        | Recently, I extracted a large archive onto the NTFS partition
        | from Linux, and the partition filled up.
       | 
       | Then Windows did not start anymore
       | 
       | Linux would only mount the partition as read-only, because it was
       | marked dirty after the failed start. Finally I found a tool to
       | reset the mark, and delete the files.
       | 
        | Now Windows starts again, but my user account is broken. It
        | always says "Your Start menu isn't working. We'll try to fix
        | it the next time you sign in." Then I sign out, and it is
        | still broken.
       | 
        | I had to make a new user account.
        
         | MayeulC wrote:
         | > Finally I found a tool to reset the mark, and delete the
         | files.
         | 
         | fsck?
        
       | CodeBeater wrote:
       | The gastric balloon of linux servers
        
       | Blikkentrekker wrote:
       | > _On Linux servers it can be incredibly difficult for any
       | process to succeed if the disk is full. Copy commands and even
       | deletions can fail or take forever as memory tries to swap to a
       | full disk_
       | 
       | I don't understand this. Swap is either a swap partition, or a
       | specific swap file, all of which allocated in advance, so the
       | fullness of the storage should have no bearing.
        
       | rrauenza wrote:
       | I thought this was gonna be about the obscenely large sparse file
       | /var/log/last.
       | 
        | I really wish they would move it from a sparse mmap() file to
        | a btree or something.
        
       | AlisdairO wrote:
       | One other option is increasing the reserved block count (
       | https://ma.ttias.be/change-reserved-blocks-ext3-ext4-filesys...
       | ). This has the nice side effect of increasing the space
       | available for critical daemons.
       | 
       | If you haven't customised this, in a pinch you can still lower it
       | down a bit to buy some time.
        
         | throw0101a wrote:
         | ZFS has explicit reservations:
         | 
         | > _The minimum amount of space guaranteed to a dataset and its
         | descendants. When the amount of space used is below this value,
         | the dataset is treated as if it were taking up the amount of
         | space specified by its reservation. Reservations are accounted
          | for in the parent datasets' space used, and count against the
         | parent datasets' quotas and reservations._
         | 
         | * https://openzfs.github.io/openzfs-docs/man/8/zfsprops.8.html
         | 
         | These are done on a per dataset basis (basically a directory
         | delineated boundary).
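In practice, the ZFS reservation approach is a couple of commands (pool and dataset names here are hypothetical):

```shell
# Reserve 8 GB in an empty dataset; nothing is stored there, it just
# holds the space back from the rest of the pool.
zfs create -o reservation=8G tank/spacer

# When the pool fills up, drop the reservation to get breathing room:
zfs set reservation=none tank/spacer
```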
        
         | stonesweep wrote:
         | I suspect the blog author did not understand this (based on the
         | content) - as a Linode user myself, I just had a look at one of
         | my VMs and they install with the regular 5% reserved space
         | (ext4/Debian).
        
           | birdyrooster wrote:
            | Funny, because I have always run tune2fs -m1 or tune2fs
            | -m0, because the reserved space was never supposed to scale
            | linearly with hard drive capacities and is not useful to
            | userspace in any way. I have never had any issues, and I've
            | been doing it for decades in commercial applications. In
            | some cases, where you probably shouldn't be using ext3/4
            | anyway, we are talking about reclaiming TBs of reserved
            | space.
           | 
           | It's important to note that mkfs doesn't care if you are
           | formatting the root partition or a data volume partition, it
           | will still reserve space for the kernel.
        
             | stonesweep wrote:
             | Fun trivia: you can actually set per-device settings in
             | /etc/mke2fs.conf with all sorts of alternate defaults.
             | https://man7.org/linux/man-pages/man5/mke2fs.conf.5.html
        
             | tytso wrote:
              | If you try to run a file system at 99% full --- and it
              | doesn't matter whether it is a 10GB file system or a
              | 10TB file system --- you _will_ see significant
              | performance penalties as the file system gets badly
              | fragmented. So that's why having a fixed percentage even
              | for massively big disks still makes sense.
             | 
             | Disk space is cheap enough that even 5% of a 14TB disk is
             | really not that much money --- and if you see bad
             | performance, and then have to pay $$$ to switch to an SSD,
             | maybe it would have been much cheaper to use a HDD with a
             | larger free space reserve....
        
         | sonicrocketman wrote:
         | I did not know this. Good solution.
        
         | GekkePrutser wrote:
          | I know about this, but I still think what he does is not a
          | bad idea, because the reserved block count is for root and
          | most server processes still run as root. And it's usually
          | those that are causing the disk to fill. Though I suppose
          | this also makes the problem itself more prominent in the
          | first place. I guess if you run into this a lot, stricter
          | monitoring would be a better solution.
         | 
         | The way I found out about it originally was because I was using
         | external storage drives and I was never able to fit as much as
         | I expected :D
         | 
         | Luckily you can easily change this without reformatting.
        
           | chousuke wrote:
           | What servers usually run as root? Some may start as root, but
           | usually drop privileges for the actual server processes
           | quickly, eg. apache, nginx, sshd.
           | 
           | Nothing that actually does the "serving" or accesses data
           | should be running as root.
        
             | GekkePrutser wrote:
             | No but the logfile writers are usually running as root
             | AFAIK. And this is what tends to fill up the disk.
        
               | edoceo wrote:
               | Mine don't run as root.
        
               | vntok wrote:
               | You should fix it then, takes no time at all.
        
               | repiret wrote:
                | That's not really something that needs fixing.
        
               | derefr wrote:
               | On systemd systems, logfiles are written to disk under
               | the journald user, `systemd-journal`.
        
               | [deleted]
        
               | kiwijamo wrote:
               | Is that true for all logfiles? I still have plenty of
               | daemons (by default) writing directly to some file in
               | /var/log eg EXIM, Apache, and the like. Also plenty of
               | system stuff still write to files in that directory. And
               | yes this is a machine that uses systemd.
        
               | comex wrote:
               | But those daemons don't usually have their own log writer
               | processes running as root, do they? Instead, either the
               | log file is accessible by the user the daemon is running
               | as, or the daemon opens the log file as root before
               | dropping privileges for the rest of its operation.
        
               | stonesweep wrote:
               | Most vendors (Debian/Ubuntu, RHEL/clones, etc.) add a
               | hook into rsyslog to be a partner with the systemd logger
               | and write out text files next to the journal - they
               | realize that a lot of people dislike dealing with
               | journalctl (I'm one of them) and provide an alternate
               | hook already installed and working for you behind the
               | scenes.
               | 
               | This is for daemons using syslog methodology, not direct
               | writers like apache/nginx/mysql/etc; think more like
               | cron, systemd, chrony, NetworkManager, and so forth. The
               | vendors are not all aligned on what goes where (example:
                | on RHEL, pacemaker/crm write to their own logs but on
               | openSUSE they're sent to syslog) - the actual results
               | differ slightly from vendor to vendor.
               | 
               | DIY distros like Arch do not implement the rsyslog
               | backend by default, you have to set it up yourself
               | following the wiki - only journalctl is there by default.
        
               | GekkePrutser wrote:
               | Ah good point, I use Alpine on all my servers so it's
               | more traditional logs.
        
             | znpy wrote:
             | It used to be common, before "the cloud", to have many
             | apparently unnecessary partitions in a server install. One
             | for /, one for /var, one for /home, one for swap at the low
             | sector numbers...
             | 
             | The idea is that /var filling up would not make the system
             | unrecoverable.
        
       | zeta0134 wrote:
       | If you happen to use ext as your default filesystem, check the
       | output of tune2fs; it's possible your distro has conveniently
       | defaulted some 2-5% of disk space as "reserved" for just such an
       | occasion. As the root user, in a pinch, you can set that to 0%
       | and immediately relieve filesystem pressure, buying you a little
       | bit more time to troubleshoot whatever the real problem is that
       | filled the disk in the first place.
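The in-a-pinch sequence described above might look like this (the device path is a placeholder):

```shell
# See how much space is currently held back for root.
tune2fs -l /dev/sda1 | grep -i 'reserved block count'

# Temporarily release the reserve to relieve the pressure...
tune2fs -m 0 /dev/sda1

# ...and restore the default once the real problem is fixed.
tune2fs -m 5 /dev/sda1
```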
        
       | tomaszs wrote:
       | Interesting. I would try to notify myself when space is getting
       | low. But I like the solution because of it's simplicity
        
       | loloquwowndueo wrote:
       | If you don't have monitoring to tell you when the disk is more
       | than X% full, then you're at risk for more failure scenarios than
       | just a full disk (usually trivial to buy time by deleting old
       | logs).
        
         | ed25519FUUU wrote:
          | Happens all the time even with monitoring. Somebody enables
          | debug logging and the disk fills up in 3 minutes.
        
         | macintux wrote:
         | If the problem arises during a migration or other significant
         | event, which it sounds like Marco's did, the alert will usually
         | be triggered just in time to tell you why you're already in a
         | world of pain.
        
       | badcc wrote:
       | This trick has certainly saved me more times than I am willing to
       | admit! I usually roll with: `fallocate -l 8G
       | DELETE_IF_OUT_OF_SPACE.img`
        
         | innomatics wrote:
         | Nice. I liked the trick in the article, but was wondering if it
         | might confuse the next admin who needs to figure it out.
         | 
         | Was also thinking of customising the login greeting to mention
         | the file (/etc/motd).
        
       | fggg444 wrote:
        | It feels like setting your watch 5 minutes fast; it's not a
        | real solution.
        
         | dylan604 wrote:
          | Can you do this on smartwatches? I know someone who went to
          | the full extreme of a one-hour offset, but they did this by
          | setting their system time to the next time zone over.
        
       | cmckn wrote:
       | This really goes to show, there is more than one way to skin a
       | cat. Yeah the guy could probably overhaul his entire approach to
       | system administration, but also...this works. Well-placed hacks
       | are maybe my favorite thing.
        
       | IncRnd wrote:
       | This solution is what creates the problem. If you want warnings
       | when the free disk space is low, set-up warnings for when the
       | free disk space is low.
        
       | londons_explore wrote:
       | Linux has this built in...
       | 
       | By default, only root can use the last 5% of disk space.
       | 
       | That means you can fire up a root shell and know you have a
       | buffer of free space to resolve the issue.
        
         | mamon wrote:
         | But now we have Docker, which means all the containerized
         | workflows will run as root....
        
           | londons_explore wrote:
           | I suspect you need to be root in the root user namespace...
           | So docker doesn't get this special power...
        
             | cmeacham98 wrote:
             | Docker does not use user namespaces by default (and some
             | features are unavailable when using them).
        
         | cmeacham98 wrote:
         | I believe this is an ext{2,3,4} feature. Unsure if it exists on
         | btrfs, zfs, etc.
        
       | tshaddox wrote:
       | How is this better than sounding alarms when free disk space
       | drops below 8GB? If you're going to ignore the alarms, then
       | you're going to have the same problem after you remove your
       | spacer file and the disk fills up again!
        
         | [deleted]
        
         | _wldu wrote:
         | It requires far less configuration.
        
           | tshaddox wrote:
           | How so? You'll presumably need to configure some way to be
           | notified when the disk space is full anyway.
        
         | lostcolony wrote:
         | It isn't either/or. It's very likely both.
        
         | busterarm wrote:
         | Sometimes your alarms are broken due to misconfiguration.
        
           | tshaddox wrote:
           | Both solutions assume that you will have some way of knowing
           | when the disk is full. Whether the "alarm" is an automated
           | health monitoring system, or an angry customer calling your
           | cell phone, there's no point in discussing how to solve
           | problems without assuming that you have some way of knowing
           | there a problem exists.
        
         | frenchie14 wrote:
         | This would work if you have sufficient time between alarm and
         | failure. If some issue or process uses up all of your available
         | disk space in a short time span, you won't have that luxury.
          | Hopefully, the author is using alerts on top of having this
          | failsafe.
        
         | zokier wrote:
          | Compare the rate at which a haywire process can fill up a
          | disk to your response time to alarms, and you've got your
          | answer right there.
        
           | tshaddox wrote:
           | I don't understand. You will still have an alarm when the
           | disk fills up, and you will need to respond and delete the
           | spacer file. Your response time latency will be the same,
           | right?
        
           | luckylion wrote:
            | Okay, so now you have a full disk, and you only become
            | aware of it when your database throws errors. You have an
           | easy way to fix it, just delete the spacer file. But what
           | good does that do? You're still in the mess where your
           | database is really unhappy.
           | 
           | On the other hand, if your monitoring was set up well, you
           | got a notification and had time to react to it _before_ it
           | was at 100%.
           | 
           | Granted, if you have a process that just wrote a file at
           | maximum speed, that time window is tiny, but that's not
           | usually what happens in my experience. What happens is that
           | something starts logging more and it slowly builds up while
           | you're happy that your server is running so well that you
           | don't need to pay attention. And then the alert comes and
           | tells you that there's less than 10% space available, and you
           | have plenty of time to investigate and avert the crisis.
        
             | vineyardmike wrote:
             | >You have an easy way to fix it, just delete the spacer
             | file. But what good does that do?
             | 
                | You solve the issue right then and there. Step 1:
                | realize there is a space issue and get to a terminal.
                | Step 2: free space so any solution has room to work.
                | Step 3: solve it by doing <??? specifics ???>.
        
               | tshaddox wrote:
               | So if your alarm sounded when there was 8 GB of free disk
               | space (instead of 0 GB), then you could still respond in
               | the same amount of time and you would still have an
               | additional 8 GB worth of padding while you determined the
               | root cause. The only difference is that you wouldn't need
               | to actually go in and delete the spacer file (and
               | potentially have downtime in the time it takes you to
               | delete the spacer file).
               | 
               | Another way to think of this is that you have the 8 GB
               | spacer file, but when the disk fills up the spacer file
               | is automatically deleted and your alarm goes off. Which
               | is literally the same as having your alarm go off when
               | free disk space reaches 8 GB.
        
       | alvarlagerlof wrote:
       | Sorry, but I cannot read this at all. Please increase the font
       | thickness.
        
       | kaydub wrote:
       | Stories like this and my own past memories make me so happy to
       | work somewhere big.
        
       | Something1234 wrote:
       | I have an empty leader on my hard drive so that I can recover if
       | I accidentally nuke the front of it with dd while making a live
       | usb. So it's not a bad idea, and it's super effective so far it
       | hasn't been tested, and hopefully I never will need to.
        
         | ttyprintk wrote:
         | A good reason to partition swap before /boot.
        
       | aidenn0 wrote:
       | This won't work with ZFS, as it may be impossible to delete a
       | file on ZFS when disk is full. The equivalent in ZFS is to create
       | an empty dataset with reserved space.
        
         | throwaway525142 wrote:
         | For me, it was possible to truncate -s 0 a large file on a full
         | disk with ZFS.
        
         | davemtl wrote:
         | A way to prevent this is to create a dataset, reserve some
         | amount of space (typically 10-20%), and set it read-only before
         | the pool gets full. Then when the pool fills up, you can reduce
         | the reservation to be able to clean up files.
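For reference, the dataset-reservation approach described above looks roughly like this (a sketch, not tested commands; the pool name "tank" and the 16G figure are made up):

```shell
# Create an empty, read-only dataset that reserves space in the pool
# before it ever fills up.
zfs create -o reservation=16G -o readonly=on tank/reserved

# When the pool hits 100%, release the reservation so cleanup can proceed:
zfs set reservation=none tank/reserved
```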
        
         | hikarudo wrote:
         | Thanks, I've been wondering about the "proper" way of doing
         | this in ZFS and this method hadn't come up in my searches.
        
       | geocrasher wrote:
       | For everyone saying "This isn't a real solution!" I'd like to
       | explain why I think you're wrong.
       | 
       | 1) It's not intended to be a Real Solution(tm). It's intended to
       | buy the admin some time to solve the Real Issue.
       | 
       | 2) Having a failsafe on standby such as this will save an admin's
       | butt when it's 2am and PagerDuty won't shut up, and you're just
       | awake enough to apply a temp fix and work on it in the morning.
       | 
       | 3) Because "FIX IT NOW OR ELSE" is a thing. Okay, sure. Null the
       | file and then fill it with 7GB. Problem solved, for now.
       | Everybody is happy and now I can work on the Real Problem: Bob
       | won't stop hoarding spam.
       | 
       | That is all.
        
         | tlibert wrote:
         | Real Solutions (tm) are indeed nice, but hackers get shit done
         | - this is an utterly shameless hack, and I do it myself.
        
         | keeperofdakeys wrote:
         | I find that either a server needs more space, or has files that
         | can be deleted. For the former you just increase the disk
         | space, since most things are VMs these days and increasing
         | space is easy. For the latter you can usually delete enough
         | files to get the service back up before you start the proper
         | cleanup.
         | 
         | If you really need some reserve space (physical server), I'd
         | much rather store it in a vg (or zfs/btrfs subvolume). Will you
         | remember the file exists at 2am? What about the other admins on
         | your team?
        
           | cbo100 wrote:
           | > Will you remember the file exists at 2am? What about the
           | other admins on your team?
           | 
           | Hopefully if you were doing something like this it would be
           | part of your standard incident response runsheet/checklist.
        
         | luckylion wrote:
         | > 1) It's not intended to be a Real Solution(tm). It's intended
         | to buy the admin some time to solve the Real Issue.
         | 
         | If you don't have monitoring, will you even be aware that your
         | disk is filling up?
         | 
         | If you do have monitoring, why are you artificially filling up
         | your disk so that it will be at 100% more quickly instead of
         | just setting your monitoring up to alert you when it's at
         | $whateverItWasSetToMinusEightGB?
        
           | ben509 wrote:
           | One argument in favor of it is that the 8GB file may cause a
           | runaway process to crash, so it stops chewing up space and
           | leaves you able to recover.
           | 
           | A second argument is it's not opened by any process. One
           | problem I've had fixing disk full errors was figuring out
           | which process still had a file open.
           | 
           | (For any POSIX noobs: the space occupied by a file is
           | controlled by its inode. Deleting a file "unlinks" the inode
           | from the directory, but an open filehandle counts as a link
           | to that inode. Until all links to the inode are deleted, the
           | OS won't release the space occupied by the file. Particularly
           | with log files, you need to kill any processes that have it
           | open to actually reclaim the disk space.)
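The unlinked-inode behaviour described above is easy to demonstrate (a Linux-specific sketch; the `/proc` read is not portable):

```shell
# Deleting a file that a process still holds open does not free its data:
tmp=$(mktemp)
exec 3>"$tmp"                 # keep an open write handle (fd 3)
printf 'still here' >&3
rm "$tmp"                     # unlink the name; the inode survives

if [ -e "$tmp" ]; then exists_after_rm=yes; else exists_after_rm=no; fi

data=$(cat "/proc/$$/fd/3")   # data is still reachable via the open fd
exec 3>&-                     # closing the last handle finally frees it

echo "$exists_after_rm $data"
```

This is why `df` can report a full disk while `du` finds nothing to account for it: some process is still holding a deleted log open.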
        
           | [deleted]
        
           | smarx007 wrote:
           | An extra failsafe? You can do both. What if your cron/netdata
           | are not forwarding emails for some reason (eg nullmailer gets
           | errors from Mailgun)?
        
             | luckylion wrote:
             | Right, but again, what good does the spacer file do if
             | you're not aware that you're running low on disk space?
             | That is: if your monitoring isn't working, how do you know
             | that you need to quickly make room?
             | 
             | And if your monitoring is working correctly, the spacer
             | file really serves no purpose other than lowering the
             | available disk space.
        
               | smarx007 wrote:
               | 1. When your DBMS is no longer responding to queries,
               | your boss and your customers replace your monitoring
               | system (unlimited free phone calls 24/7 included ;). Case
               | in point: HN is often a better place to check than Google
               | Cloud status page, for example.
               | 
               | 2. Maybe you didn't get it, but "nullmailer not
               | forwarding cron email due to mailgun problems" was a bit
               | too specific to be an example I just made up, wasn't it?
               | Again, the premise "if your monitoring is working
               | correctly" is not a good one to base your reasoning upon.
               | Especially if you have 1 VM (VPS) and not a whole k8s
               | cluster with a devops team with rotational on-call
               | assignments.
        
               | CJefferson wrote:
               | The reason was, I thought, discussed in the article.
               | 
               | When you actually fill up your disc, many linux commands
               | will simply fail to run, meaning getting out of that
               | state is extremely difficult. Deleting the file means you
               | have room to move files around / run emacs / whatever, to
               | fix the problem.
        
               | pvorb wrote:
               | Somebody _will_ notify you. If the service is just for
                | yourself, you don't need monitoring at all.
        
               | luckylion wrote:
               | Yes, yes, but they will notify you _after_ your service
                | is down (because that's when they notice), in part
               | thanks to a spacer file that eats up available disk space
               | without being of any use. A monitoring service would
               | notify you _before_ your service is down, users grab
               | pitchforks and start looking for torches.
               | 
               | I understand the benefit to be able to quickly delete
               | some file to be able to run some command that would need
               | space, though I find that highly theoretical. If it's
               | your shell that requires space to start, you won't be
               | able to run the command to remove the spacer, and once
               | you're in the shell, I've never found it hard to clean up
               | space; path autocompletion is the only noticeable victim
               | usually. And at this point, the services are down anyhow,
               | and you likely don't want to restart them before figuring
               | out what the problem was, so I don't see the point of
               | quickly being able to make some room.
               | 
                | It feels like "having two flat tires at the same time is
               | highly unlikely, so I always drive with a flat tire just
               | to make sure I don't get an unforeseen flat tire". It's
               | cute, but I'd look for a new job if anyone in the company
               | suggested that unironically.
        
           | ineedasername wrote:
           | Because even if you have monitoring, some unforseen issue
           | rapidly eating disk space at 3:00 am may not give you the
           | time to solve it without downtime or degraded performance
           | unless you can _immediately_ remove the bottleneck while you
           | troubleshoot.
        
             | tshaddox wrote:
             | Then why not automate the removal of the 8 GB spacer file
             | when the disk gets full? Or in other words, just sound your
             | alarms when there is 8 GB of free disk space.
        
               | ineedasername wrote:
                | I actually suggested exactly that in another comment,
                | though to do it in stages: 4GB with an alarm, then more
                | alarms and the other 4GB if not resolved.
        
           | marricks wrote:
           | Monitors can fail, you can miss an email, etc etc etc
           | 
           | There's always a big gap between what should never happen
           | because you planned well and what does happen
        
           | apocalyptic0n3 wrote:
           | Besides runaway log files that aren't being properly rotated,
           | human error can cause it too. I managed to completely eat up
           | the disk space of one of our staging servers a few weeks ago
           | trying to tar up a directory so I could work on it locally.
           | Didn't realize the directory was 31GB and we only had 25GB of
           | space. By the time the notification for 80% usage was
           | triggered (no more than 2 minutes after we hit 80%), the
           | entire disk was full. Luckily it was just a staging server
           | and no real harm was done, but such a mistake could have just
           | as easily been made on a production server. In this case, the
           | obvious solution is to just delete the file you were creating
           | but if you're running a more complicated process that is
           | generating logs and many files, it may not be so easy and
           | this 8GB empty file might be useful after you cancel the
           | process.
        
         | tehjoker wrote:
         | This reminds me of the reserve tank toggle on some motorcycles.
         | When you run out of gas, you switch the toggle and drive
         | directly to a gas station.
        
           | hprotagonist wrote:
           | The bikes I've had that have had reserve tanks have also been
           | old enough to raise the disconcerting follow-on question,
           | which is: "is the reserve gas also full of sludgey crap
           | that's settled in the tank and hasn't been disturbed really
           | in a year, and am i about to run that through my poor carbs?"
        
             | jedberg wrote:
             | My friend had a truck with a reserve tank, but it was the
             | same size as the main tank, so he would just flip the
             | switch at every fill up to make sure they both got used.
        
           | rcthompson wrote:
           | Motorboat fuel tanks have a reserve as well. It's just a
           | raised area that splits the bottom of the tank into 2
           | separate concave areas. One of the concave areas contains the
           | end of the fuel line, and the other doesn't. When you run out
           | of gas, you tip the tank up to dump the remaining gas from
           | the other basin into the main one, and then you restart the
           | engine (or keep it from stopping at all if you're quick
           | enough on the draw) and head for the docks.
        
           | 0_____0 wrote:
           | always fun when you're barreling down the highway and the
           | engine starts to lean out, prompting you to hurriedly locate
           | and switch the petcock over before the engine stalls
           | completely.
           | 
           | suppose then that you go fill up and forget to set the
           | petcock back to normal. 8ball says: "I see a long walk in
           | your future."
        
             | geocrasher wrote:
             | I once put a new fuel pump in a Chevy pickup with two tanks
             | on the side of the road because I was switched to the empty
             | tank. Good times.
        
             | jessaustin wrote:
             | IME it doesn't take too many hikes to learn that part of
             | the procedure for turning off the engine is "turn the fuel
             | switch off reserve".
        
               | 0_____0 wrote:
               | out of years of riding it's only happened to me a couple
               | times.
               | 
               | one time i was eastbound on the bay bridge when my bike
               | started to sputter. i'd just reassembled the tank and had
               | left the screw-style reserve fuel valve open, so there
               | was no reserve fuel to be had. a very kind lady put her
               | blinkers on behind me and followed as i coasted the last
               | few hundred yards toward yerba buena island.
               | 
               | i pushed my bike up the ramp and looked in the tank to
               | assess. it's a dirtbike, so the tank has two distinct
                | "lobes" to accommodate the top tube of the frame. I had a
               | few ounces in the tank but they were not in the lobe with
               | the fuel pickup, so i dumped the bike on its side to get
               | the fuel to slosh over to where i wanted it.
               | 
               | i got back on the highway and, going quite slowly and
               | gently, managed to get to the gas station at west oakland
               | bart, the engine leaning out and sputtering right as i
               | rolled into their lot.
        
             | dnautics wrote:
             | Surprised there isn't a mechanism that mechanically
             | switches the petcock over when you put a fuel nozzle up to
             | the port
        
               | mplewis wrote:
               | Most motorcycles with a manual petcock are very manual in
               | nature. Often this is to minimize the number of moving
               | parts that could die on you if you take it into rural
               | areas. An automatic petcock adds more complexity that
               | could cause a malfunction.
        
               | xkcd-sucks wrote:
               | Typically there aren't two separate tanks - In one tank
               | there are two tubes at different heights. As the fuel
               | level falls below the height of the "main" tube the
               | engine sputters, then turning the petcock engages the
               | lower down "reserve" tube which is still below the fuel
               | level. It's more of a warning than a true reserve, and
               | most bikes with an actual fuel gauge don't have a
               | reserve.
        
               | quesera wrote:
               | Most motorcycles are surprisingly manual. This was
               | originally a necessity (like in cars), but remains
               | aesthetically preferable for many riders.
               | 
               | OTOH, Honda Goldwings have stereo systems. They might
               | grow an automatic fuel reserve switcher-backer someday
               | too. :)
        
               | abruzzi wrote:
               | Fuel injected motorcycles don't have reserve (at least,
                | none that I've seen). Instead they have low fuel lights
               | or full fuel gauges. I'm guessing it's because the fuel
               | pumps are in the tank and the fuel injection system needs
               | high pressure.
        
               | ericbarrett wrote:
               | Fuel injectors require filtered gas because even small
               | particles can clog them, and said filter is more likely
               | to be clogged or even compromised by sucking up the last
               | drops of fuel (and scale and debris) in the tank, so the
               | low-fuel warning is required.
               | 
               | Carb jets can get clogged, too, but are wider since
               | they're not under as much pressure. Also, since they're a
               | wear item they're a lot easier to clean and/or replace.
        
               | names_are_hard wrote:
               | Many new bikes come with a lot of rider aids for safety
               | (ABS, TCS) as well as all kinds of electronics (fuel
               | maps), so this is changing. But of course manual
                | transmission won't go away until bikes are electric.
               | 
               | I am one of those who likes things old school. My bike
               | still has a carburetor, has no fuel light or tachometer,
               | and I have certainly had some practice reaching down to
               | turn the fuel petcock to reserve while sputtering on the
               | highway. If they didn't intend for me to do that, why did
               | they put it on the left side? :)
        
               | driverdan wrote:
               | Some newer bikes, like mine, don't have a reserve
               | petcock. They have a low fuel light. No forgetting about
               | the petcock and an obvious warning light instead of
               | sputtering.
        
               | bigiain wrote:
               | Some older bikes, like my '99 Ducati Monster, don't have
               | a petcock. It has a low fuel light that first failed in
               | around 2002, and for which that part that fails (the in-
               | tank float switch) stopped being available in about 2015
               | or so. No petcock _or_ warning light. (And that trip
                | where the speedo cable failed so I couldn't even use the
               | trip meter to estimate fuel requirements was a fun
               | one...)
        
         | dilyevsky wrote:
         | Set up proper monitoring and never get to the Real Issue to
         | begin with. These sysadmin hacks are not helpful.
        
           | berkes wrote:
           | "proper monitoring" is extremely broad. And, I would say, an
           | almost unreachable goal.
           | 
           | You have it mail you when it goes over 80% disk usage (and
           | what if you are on holiday)? Does it mail all colleagues? Who
           | picks it up (I thought Bob picked it up, but Bob thought Anne
           | picked it up. So no one did)? Does it come and wake you in
           | person when it reaches 92%?
           | 
           | Will this catch this async job that fails (but should never)
           | in an endless loop but keeps creating 20MB json files as fast
           | as the disk allows it to?
           | 
           | Is it an alerting that finds anomalies in trends? Will it be
           | fast enough for you to come online before that job has filled
           | the disk?
           | 
           | I've been doing a lot of hosting management and such. And
           | there is one constant: all unforeseen issues are unforeseen.
        
             | geocrasher wrote:
             | > I've been doing a lot of hosting management and such. And
             | there is one constant: all unforeseen issues are
             | unforeseen.
             | 
             | I work in hosting too, and have been for a long time. I
             | feel ya.
        
             | dilyevsky wrote:
             | Slack warning/ticket at 75%, page at 85% (to oncall
             | obviously). Don't let user workload crap into your root
             | partition. I've been doing this for over 10 years and
             | managed many thousands of nodes and literally don't recall
             | full disk problem unless it was in staging somewhere where
             | monitoring was deliberately disabled.
        
           | geocrasher wrote:
           | In a perfect world, this is true. But we don't have one of
           | those.
        
           | [deleted]
        
         | solidasparagus wrote:
         | This is one of those great solutions where they got 90% of the
         | value of the Real Solution(tm) with 5 minutes of work.
        
       | diego wrote:
       | This points to a much more serious problem. This is 2021 and the
       | technology is from the 90s, with a really poor user experience
       | design. Your car warns you when you're low on fuel, but your
       | server doesn't if you're low on critical resources.
        
         | dmingod666 wrote:
         | Exactly, it's 1990s 'cool'. In the time it took him to write
         | the blog post, he could have written a script that would send
         | him updates on all his devices...
        
           | ineedasername wrote:
           | There's no reason not to have multiple fail-safes. Receiving
           | the alert on a device at 3am would still mean he could free
           | up 8gb immediately and have breathing room to solve the
           | problem. And remember this is for a single admin. Asking such
           | a person to be on call 24-7 all year, vacations, holidays,
           | weekends... Having a quick way to get breathing room can
           | significantly reduce the stress & cognitive load of worrying
           | about such things in your off-time.
        
             | dmingod666 wrote:
              | He didn't mention he has alerts. Sure, if alerts are your
              | first line of defense, this is a nice thing to do.
        
         | rozap wrote:
         | Everyone has this kind of alerting set up, but that's not the
         | point. The beauty of this solution is that it's dead simple and
         | will never fail. Alerting can fail or be ignored.
         | 
         | It's the same as old VW beetles which had a reserve gas tank.
         | When you ran out of gas you opened a valve and you could limp
         | to a gas station. Less likely to fail versus a 1950's era gauge
         | that is telling you you're low. Also impossible to ignore it.
        
           | dmingod666 wrote:
           | The 'beauty' artificially chokes your HDD and produces the
            | same problems that you are trying to avoid... not a sane way
           | to proactively manage your disk usage.
        
           | goatinaboat wrote:
            |  _It's the same as old VW beetles which had a reserve gas
           | tank. When you ran out of gas you opened a valve and you
           | could limp to a gas station_
           | 
           | In scuba diving there used to be "J-valves". When you had 50
           | bar left in the tank they would cut out. Then you would pull
           | to reenable your air and return to the surface.
           | Unsurprisingly they are no longer popular.
        
           | mypalmike wrote:
           | Same was true of most motorcycles until rather recently,
           | though with motorcycles it was rare that there was a fuel
           | gauge at all. A sputtering engine was how you knew it was
           | low. And I believe that like with motorcycles, the "reserve
           | tank" in an old Beetle is really the same tank - there are
           | two hoses located in the tank at different heights.
        
           | copperfoil wrote:
           | > The beauty of this solution is that it's dead simple and
           | will never fail. Alerting can fail or be ignored.
           | 
           | It's not that straightforward IMO. Would this file be deleted
           | before the space is filled? If so, there is alerting in
           | place, and it assumes there's a way to delete files before
           | space fills up. If this file is deleted after space fills up,
           | how is this different from not having the file, other than
           | making finding files to delete easier? Then what happens
           | after that? If you delete the file and realize there's
           | nothing else to delete, you'd have to solve the problem the
           | same way if you didn't use this method.
        
             | vineyardmike wrote:
             | >you'd have to solve the problem the same way if you didn't
             | use this method.
             | 
             | What if the solution required some amount of free space?
             | (eg. installing a package or swap)
        
         | mtone wrote:
         | Assuming we're talking about VMs (2021 etc.), for an SME is
         | there any downside to giving 2TB of space to your disks and
         | letting dynamic allocation do the work?
         | 
         | Perhaps consolidate/defrag once a year. Even monitoring total
         | usage more often than that is probably not worth the effort -
         | just buy ample cheap storage.
         | 
         | Also, there was a tradition to split drives into OS, DB, DB
         | Logs. That was mostly a rust performance thing and these days
         | is probably just voluntary management overhead.
         | 
         | RAM is another story.
        
           | jodrellblank wrote:
           | If you are using less space than the underlying datastore,
           | there's no benefit to dynamic allocation, you may as well
           | give the servers larger fixed disks. If you are thinking that
           | one server might need more than the fixed size for a sudden
           | growth, then you need to be monitoring to deal with that
           | because that will run out of your space. If you are
           | overprovisioning the datastore, you have the same problem at
           | a level lower, and need to be monitoring that and alerting
           | for that instead (as well).
           | 
           | > " _just buy ample cheap storage_ "; " _That was mostly a
           | rust performance thing and these days is probably just_ "
           | 
            | In the UK a 6TB enterprise rust disk is £150 and a 2TB
            | enterprise SSD is £300; it's 6x the price to SSD everything,
            | and takes 3x more drive bays, so add more for that. And you
            | can
           | never "just" buy more storage than you ever need - apart from
           | the obvious "when you bought it, you thought you were buying
           | enough, because if you thought you needed more you would have
           | bought more", so that amounts to saying "just know the future
           | better", but it can't happen because Parkinson's Law ("work
           | expands so as to fill the time available for its completion")
           | applies to storage, the more there is available, the more
           | things appear to fill it up.
           | 
           | Room for a test restore of the backups in that space. Room
           | for a clone of the database to do some testing. Room for a
           | trial of a new product. Room for a copy of all the installers
           | and packagers for convenience. Room for a massive central
           | logging server there. What do you mean it's full?
        
           | qw3rty01 wrote:
           | One VM using excessively more disk space than it's supposed
           | to can potentially cause data corruption in all the other VMs
           | on that system. For just spinning VMs up and down for
           | testing, you probably won't run into that issue, but on a
           | production system, it could potentially cause some massive
           | downtime
        
             | cure wrote:
             | Virtual machine disk space (e.g. Xen, Linode, AWS EC2, or
             | similar) does not work this way. Each VM gets a dedicated
             | amount of disk space allocated to it, they don't all share
             | a pool of free space.
        
               | jodrellblank wrote:
                | Yes they do with the "dynamic allocation" the parent
               | comment mentions; VMware datastore has 1TB total, you put
               | VMs in with dynamically expanding disks they are sharing
               | the same 1TB of free space and will fill it if they all
               | want their max space at the same time and you've
               | overprovisioned their max space.
               | 
               | And if you haven't overprovisioned their max space, you
               | may as well not be using dynamic allocation and use fixed
               | size disks.
               | 
               | Even then, snapshots will grow forever and fill the
               | space, and then you hope you have a "spacer.img" file you
               | can delete from the datastore, because you can't remove
               | snapshots when the disk is full and you're stuck. It's
               | the same problem, at a lower level.
        
               | cure wrote:
               | I see, a VMware feature, thanks for clarifying. I suppose
               | it's a nice idea in theory, but you'd have to be crazy to
               | use that in production, or for any workload that you care
               | about. It would just be a ticking time bomb.
        
         | Yizahi wrote:
         | Alerting is also a hack really. In 2021 the operating SYSTEM
         | should work as a system - holistically managing its resources
         | and making intelligent decisions. Ideally the OS should
         | dynamically reserve as many resources as needed on its own.
        
         | copperfoil wrote:
         | Linux servers aren't like mass consumer products. It's assumed
         | users know what they're doing and can build and configure what
         | they need on top of it.
         | 
         | > This is 2021 and the technology is from the 90s
         | 
         | I don't see how this is a valid point. Is integrated circuit
         | technology outdated because it was developed in the 60s?
        
         | lostcolony wrote:
         | Your car also doesn't drop from "alarm" to "empty" in thirty
         | seconds. A disk on a VM with a badly behaved process can.
       | tyingq wrote:
       | Careful how you create it. Several ways to create large files can
       | make a sparse file, which won't actually free any space when you
       | remove it later.
        
         | zepearl wrote:
         | I think that in the past I saw that when creating a file with
         | e.g. ...                 dd if=/dev/zero of=deleteme.file bs=1M
          | count=8192
         | 
         | ...the "free space" shown by "df" slowly decreased while the
         | file was being created, but then once the operation completed
         | that "free space" magically went back to its original value =>
         | the big existing file (full of "0"s) was basically not using
         | any storage.
         | 
         | Is this what you mean?
         | 
         | I just tried to replicate this behaviour but, dammit, I cannot
         | demonstrate that right now as the behaviour so far was the
         | expected one (free storage decreasing when creating the file
         | and sticking to that even after the completion of the
         | operation).
         | 
         | I strongly believe that that's what I saw in the past (when I
         | was preallocating image files to then be used by KVM VMs), but
         | now I'm wondering if I'm imagining things... :P
         | 
         | EDIT: this happened when using ext4 and/or xfs (don't remember)
         | without using any compression.
        
           | tyingq wrote:
           | dd will create sparse files if you use the seek option, like:
           | dd if=/dev/zero of=a_sparse_file bs=1 count=0 seek=8G
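You can verify whether a file is sparse by comparing its apparent size with the blocks actually allocated (a sketch; `stat -c` is GNU coreutils, filenames are illustrative):

```shell
# A sparse "8G" file: huge apparent size, (almost) no blocks on disk,
# so deleting it would free essentially nothing.
dd if=/dev/zero of=sparse.img bs=1 count=0 seek=8G 2>/dev/null

apparent=$(stat -c %s sparse.img)    # apparent size in bytes
allocated=$(stat -c %b sparse.img)   # 512-byte blocks actually allocated
echo "apparent=$apparent blocks=$allocated"
rm sparse.img

# To really reserve the space, allocate the blocks up front, e.g.
#   fallocate -l 8G spacer.img
# or write real zeroes with dd from /dev/zero.
```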
        
       | beervirus wrote:
       | > even deletions can fail or take forever
       | 
       | > in a moment of full-disk crisis I can simply delete it and buy
       | myself some critical time to debug and fix the problem
       | 
       | Uhh...
        
       | cwt137 wrote:
       | In theory, this is a good idea, but doesn't protect you in all
       | cases. I have had instances on a few of my application servers
       | where an event happened that dumped GB's worth of log data to the
       | log files in a matter of a couple of minutes and filled up the
       | drive (Thanks fast SSDs!). If I employed the strategy in the
       | article, it would have only bought me a couple of more minutes
       | worth of time, if that!
        
       | mgarfias wrote:
       | Why not keep an eye on the disk and expand the fs before it goes
       | south?
        
         | jbverschoor wrote:
         | Because sometimes it can fill up quite quickly, and the extra
         | space will give you the headroom you need. Will def do this on
         | all my servers.
        
           | qwertox wrote:
           | Yes. I like the idea very much and also think that I will do
           | this with a couple of my machines.
        
       | terramex wrote:
       | Ah, the classic 'speed-up loop' approach:
       | https://thedailywtf.com/articles/The-Speedup-Loop
       | 
       | About the blogpost itself:
       | 
       |  _The disk filled up, and that 's one thing you don't want on a
       | Linux server--or a Mac for that matter. When the disk is full
       | nothing good happens._
       | 
       | I had this happen a few times on a Mac, and every time I was
       | shocked that when the disk gets full you cannot even delete a
       | file and the only option is a full system reboot. I was also
       | unable to save any open file, even to an external disk, and
       | suffered minor data loss every time because of that.
       | 
       | What is the proper way of dealing with such issue on macOS? (or
       | other systems, if they behave the same way)
        
         | ghostly_s wrote:
         | MacOS has given the user nagging "startup disk is almost full "
         | prompts for as long as I can remember, yours doesn't?
        
           | dylan604 wrote:
            | And users have been ignoring that message for as long as
            | MacOS has been giving it to them. Maybe even longer.
        
         | whartung wrote:
         | _I had this happen few times on a Mac and every time I was
         | shocked that if disk gets full you cannot even delete a file
         | and the only option is to do a full system reboot. I was also
         | unable to save any open file, even to external disk and
         | suffered minor data loss every time due to that._
         | 
         | This just happened to me. I got the best error message I've
         | ever seen. Something akin to "Can not remove file because the
         | disk is full." This wasn't from the Finder, this was command
         | line rm.
         | 
         | On the Mac it's also exacerbated by the fact that swap will use
         | the system drive and can fill up the disk, and can not be
         | stopped. If you have some rogue process consuming RAM, among
         | other things, your disk will suffer until it is full. And, as
         | mentioned, macOS does not behave well with a full disk.
         | 
         | And, even if you've remedied the swap issue (i.e. killed the
         | process), there's no way I know to recover the swap files
         | created without restarting.
         | 
         | Just seems like the design is trouble waiting to happen, and it
         | has happened to me.
         | 
         | When this last happened, somehow it managed to corrupt my
         | external Time Machine volume.
        
         | TheAdamAndChe wrote:
          | I don't know about Macs, but this is why many Linux distros
          | recommend putting /home on a separate partition. If it
          | fills, it won't lock up the whole system.
         | 
         | Fun story with this. Ubuntu now has an experimental root-on-zfs
         | feature. I installed it and started playing with some docker
         | containers, trying to compile a certain version of pytorch.
         | Suddenly, my computer crashed. Apparently, my root partition
         | filled because docker installed everything on the same
         | partition as my OS, crashing everything immediately.
        
       | davidelettieri wrote:
       | I always thought that database files should be on a different
       | drive from the os. If the db fills up the HD, the os is still
       | running smoothly.
        
         | lazyweb wrote:
          | Yep, ideally you'd have separate partitions for /var, /tmp,
          | /home, root, any application/db data ..
        
       | monksy wrote:
       | To prevent the root fs from filling up. That's why I always put
       | home+var+opt on partitions separate from the root partition.
        
       | anonymousisme wrote:
       | One thing that many Linux/Unix users do not know is that all
       | commonly used filesystems have a "reserved" amount of space to
       | which only "root" can write. The typical format (mkfs) default is
       | to leave 5% of the disk reserved. The reserved space can be
       | modified (by root) any time, and it can be specified as a block
       | count or a percentage.
       | 
       | As long as your application does not have root privileges, it
       | will hit the wall while the reserved space is still free. Instead
       | of the clumsy "spacer.img" solution, one could simply
       | (temporarily) reduce the reserved space to quickly recover from a
       | disk-full condition.
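The reserve can be inspected and adjusted with tune2fs on ext2/3/4. A sketch, demonstrated on a throwaway ext4 image so nothing real is touched (on a live system you would point tune2fs at the block device, e.g. /dev/sda1 -- an illustrative name):

```shell
# Build a small ext4 filesystem in a regular file (no root needed).
dd if=/dev/zero of=/tmp/demo-fs.img bs=1M count=16 status=none
mkfs.ext4 -F -q /tmp/demo-fs.img

# The default reserve is 5% of blocks, writable only by root.
tune2fs -l /tmp/demo-fs.img | grep 'Reserved block count'

# Emergency: shrink the reserve to 1% to hand free space back...
tune2fs -m 1 /tmp/demo-fs.img

# ...and restore the 5% default once the crisis is over.
tune2fs -m 5 /tmp/demo-fs.img
```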
        
         | reph2097 wrote:
         | Of course the application is running as root, duh.
        
       | gkarthik92 wrote:
       | So far, my first stop to temporarily get more disk space was to
       | reduce the size of the swapfile which on a lot of servers seems
       | to be allotted >1x the requirement.
       | 
       | Will be switching to this hack! Perfect illustration of the KISS
       | principle (Keep it simple, stupid).
        
         | bluedino wrote:
          | Useful for people who still do 2x physical memory and have a
          | server with 64+ GB.
        
       | rags2riches wrote:
       | Back in my early university days the disks always seemed to be
       | full at inconvenient times on the shared Unix systems we used.
       | Some students resorted to "reserving" disk space when available.
       | Which of course made the overall situation even worse.
        
       | Saris wrote:
       | It's interesting to me that linux doesn't natively reserve a
       | little space to allow basic commands like directory listing and
       | file deletion to function even with a full disk.
       | 
       | Because really the biggest problem when I've had a partition get
       | full, is I sometimes can't even delete the offending log file.
        
         | myself248 wrote:
         | It still boggles my mind that the act of deleting a file can
         | fail because it requires space to perform the act.
         | 
         | If y'ask me, that's a fundamental design flaw. Of course nobody
         | asked me...
        
           | chungy wrote:
           | Depends entirely on the design of the file system. In copy-
           | on-write file systems, it's a necessity: you need to at least
           | allocate a new metadata block that doesn't record the
           | existence of some file... and that's assuming you don't have
           | snapshots keeping it allocated anyway.
           | 
            | You can run into real trouble on btrfs if you fill it: it
            | has no reserved space to protect against this scenario. ZFS
            | at least reserves a fraction of the total space so that
            | deletes still work even when the pool reaches 100% capacity.
        
       | SethTro wrote:
       | Same idea as this game development legend
       | 
       | https://www.dodgycoder.net/2012/02/coding-tricks-of-game-dev...
       | 
       | > he had put aside those two megabytes of memory early in the
       | development cycle. He knew from experience that it was always
       | impossible to cut content down to memory budgets, and that many
       | projects had come close to failing because of it. So now, as a
       | regular practice, he always put aside a nice block of memory to
       | free up when it's really needed.
        
         | Cerium wrote:
         | In my work it is very common to make the memory map a little
         | smaller than it has to be. If you can't ship an initial version
         | in a reduced footprint you will have no hope of shipping future
         | bugfixes.
        
         | 295310e0 wrote:
         | If true, I hate that story. Think of the better art assets that
         | were needlessly left behind. How is it that said block of
         | memory had never been identified by any profiling?
        
           | emmab wrote:
            | If it would be detected by profiling, that does make the
            | technique asymmetric: it only sticks around if nobody
            | profiles to find it.
        
             | hinkley wrote:
             | Or if you didn't have an understanding with the sort of
             | people who would run the profiler...
        
           | usefulcat wrote:
           | > Think of the better art assets that were needlessly left
           | behind.
           | 
           | Consider how long it takes to edit or recreate art assets to
           | reduce their size. Depending on the asset, you might be
           | basically starting over from scratch. Rewriting code to
           | reduce its size is likely to be an even worse option,
           | introducing new bugs and possibly running slower to boot. At
           | least smaller, simpler art assets are likely to render
           | faster.
           | 
           | This is also the kind of problem that's more likely to occur
           | later in the schedule, when time is even more scarce. Between
           | these two factors (lack of time and amount of effort required
           | to get art assets which are both decent looking and smaller),
           | I think in practice you're actually more likely to get better
           | quality art assets by having an artificially reduced memory
           | budget from the outset.
        
           | _carbyau_ wrote:
           | I see it as a "Choose your problem." affair.
           | 
           | 1. Deal with possibly multiple issues possibly involving
           | multiple people with the politics that entails resulting in a
           | lot of stress for all involved as any one issue could render
           | it a complete failure.
           | 
           | 2. Have extra space you can decide to optimise if you want.
           | You could even have politics and arguments over what to
           | optimise, but if nothing happens it all still works so there
           | is a lot less stress.
           | 
           | I pick 2.
        
         | Bost wrote:
          | There's a difference between "The server is not responding
          | right now. We're losing customers." and "Low resources during
          | product development". Actually the latter may be a case of
          | enforcing premature optimization. So no, it's not the same
          | idea.
        
           | smarx007 wrote:
           | I think we are thinking of a different baseline. You are
           | thinking along the lines of "this should run, we can reduce
           | server costs later", I would suggest (if I may) "the app
           | needs to run on any Android device with 2GB RAM". And then
           | you develop a game to run on a 1.5GB RAM phone, expecting
           | that it will eventually fit into 2GB RAM budget.
        
         | benhurmarcel wrote:
         | https://thedailywtf.com/articles/The-Speedup-Loop
        
         | pjmorris wrote:
         | I'd read in 'Apollo: Race To The Moon', Murray and Cox, that
         | the booster engineers had done something similar with their
         | weight budget, something the spacecraft engineers wound up
         | needing. Contingency funds of all sorts are a great thing.
        
       | xen2xen1 wrote:
       | I will be doing this. Marvelous idea.
        
       | gfody wrote:
       | having an 8gb file you know you can delete isn't really all that
       | helpful if everything has already gone disk-full-fracked. you
       | should really have an alarm on free space, especially if you're
       | an indie.
        
         | kelnos wrote:
         | Sure, but sometimes the disk filling up is caused by something
         | runaway and fast. If your "60% full" alarm goes off and the
         | disk fills up 2 minutes later, you're still stuck.
         | 
         | With a "ballast file" (as another commenter termed it), you can
         | decide exactly when processes get to start consuming disk
         | again, and that can give you some headroom to fix the problem.
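One common way to create such a ballast file is fallocate, which allocates real blocks instantly without writing data, so the file is not sparse and deleting it genuinely frees space. A sketch; the path is illustrative:

```shell
# Reserve 8 GiB of real blocks up front (instant: no data written).
fallocate -l 8G /var/spacer.img

# Verify it is not sparse: allocated size should match apparent size.
du -h /var/spacer.img

# In a full-disk emergency, delete it to buy breathing room:
#   rm /var/spacer.img
```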
        
       | liaukovv wrote:
       | I'm sorry for meta comment, but this font is barely readable.
        
         | nielsbot wrote:
         | It's pretty light. Use Reader mode?
        
         | davidcollantes wrote:
         | Not just the font of choice, but the formatting (hanging
         | indent?). It makes it harder to read when there is little
         | volume of text.
        
           | patrickserrano wrote:
           | Glad I wasn't the only one that found the indented paragraphs
           | made it difficult to read.
        
       | kchoudhu wrote:
       | This is why I insist on data and root partitions on all the
       | machines I administer. Go ahead and kill the data partition, at
       | least the root partition will keep the system up and running.
        
       | rgj wrote:
       | In the mid nineties I worked in a research institute. There was a
       | large shared Novell drive which was always on the verge of full.
       | Almost every day we were asked to clean up our files as much as
       | possible. There were no disk quotas for some reason.
       | 
       | One day I was working with my colleague and when the fileserver
       | was full he went to a project folder and removed a file called
       | balloon.txt which immediately freed up a few percent of disk
       | space.
       | 
       | Turned out that we had a number of people who, as soon as the
       | disk had some free space, created large files in order to reserve
       | that free space for themselves. About half the capacity of the
       | fileserver was taken up by balloon.txt files.
        
         | twistedpair wrote:
          | I worked at a large company during a migration from Lotus to
          | Outlook. We were told we'd get our current Lotus email storage
          | + 100MiB as a new email quota limit under Outlook.
         | 
         | I made a bunch of 100MiB files of `/dev/random` noise (so they
         | don't compress, compressed size was part of the quota) and
         | emailed them to myself before the migration, to get a few GiB
         | of quota buffer.
         | 
         | My co-workers were constantly having to delete old emails in
         | Outlook to stay under quota, but not me. I'd just delete one of
         | my jumbo attachment emails, as needed. ;)
        
           | ABeeSea wrote:
           | Email quotas aren't just a cost thing. It forces deletion of
           | files/communications that aren't relevant anymore. The last
            | thing the legal department wants is some executive's laptop
            | with 10 years of undeleted email making its way into
            | discovery.
        
             | easton wrote:
             | Then why not just tell Exchange to delete any emails older
             | than 5 years (or whatever your lawyers tell you to put)?
        
         | gowld wrote:
         | You had a community-driven quota system.
        
           | manifoldgeo wrote:
           | Sounds like they had a tragedy of the commons, too haha
        
             | agumonkey wrote:
             | is there a model to solve these ?
        
             | Dylan16807 wrote:
             | Partly that, partly the opposite.
             | 
             | It's basically reserving part of the disk for very
             | important things only, which scares off less important
             | uses. Like making the commons seem more polluted than it
             | actually is to get some action taken.
             | 
             | If those files weren't there, the space would probably fill
             | up, but now without any emergency relief valves.
             | 
             | It would be better if these files were a smaller fraction
             | of space and had more oversight... but that's just a quota
             | system. This is something halfway in between real quotas
             | and full-on tragedy of the commons.
        
         | the-rc wrote:
         | At the opposite end, I heard a story of actually full storage
         | from the beginning of the century, when I worked at a "large
         | medical and research institution in the Midwest". They had
         | expensive SMB shares (NetApp?) that kept getting full all the
         | time. So they did the sane thing in the era of Napster: they
         | started deleting MP3 files, with or without prior warning.
         | Pretty soon, they got an angry call that music could not be
         | played in the operating room. Oops. Surgeons, as you can guess,
         | were treated like royalty and didn't appreciate seeing their
         | routines disrupted.
        
           | tapland wrote:
           | D: I like getting to listen to something in the MRI though.
        
         | bane wrote:
         | This is a surprisingly common hoarding behavior among humans
         | using scarce resources. In technology you see it everywhere,
         | virtualization infrastructure, disk storage, etc.
         | 
          | This is actually kind of clever. It would be pretty
          | interesting to study how the tribal knowledge of "reserving
          | space" was developed and disseminated.
        
           | jachee wrote:
           | In Pittsburgh, it's evolved into the parking chair.
           | 
           | https://en.wikipedia.org/wiki/Parking_chair
        
         | tomrod wrote:
         | This is very common in F500 companies. It's also a symptom of
         | dysfunction.
        
         | victor9000 wrote:
         | Perfect example of the tragedy of the commons. If individuals
         | don't create these balloon files then they won't be able to use
         | the file server when they need it, yet by creating these
         | balloon files the collective action depletes the shared
         | resource of its main function.
        
           | Rule35 wrote:
           | It's a decentralized implementation of a quota system.
           | 
           | By slowly releasing supply you prevent anyone having to self-
           | regulate (which requires unreasonable deprivation, OR global
           | knowledge) and everyone bases their decisions off of the only
           | global signal, free space.
        
           | [deleted]
        
           | njovin wrote:
           | This is similar to how some government agencies retain their
           | budgets.
           | 
           | At the end of the budget period they've only spent 80% of
           | their allocated budget, so they throw out a bunch of
           | perfectly good equipment/furniture/etc. and order new stuff
           | so that their budget doesn't get cut the following year,
           | rather than accepting that maybe they were over-budgeted to
           | begin with.
           | 
           | Rinse, repeat, thus continuing the cycle of wasting X% of the
           | budget every year.
        
         | ChuckMcM wrote:
         | Okay, that is hilarious.
         | 
         | I use some scripts that monitor disk space, and monitor disk
         | usages by "subsystem" (logs, mail, services, etc) using Nagios.
          | And as DevOps Borat says, "Disk not full unless Nagios say
          | 'Disk is full'" :-) Although long before it is full it starts
          | warning me.
         | 
         | It doesn't go off very much, but it did when I had a bunch of
         | attacks on my web server that started core dumping and that
         | filled up disk reasonably quickly.
         | 
         | Back in the day we actually put different things in different
         | partitions so that we could partition failures but that seems
         | out of favor with a lot of the distros these days.
        
         | tinus_hn wrote:
         | Same thing happens with floating licenses, if they are too
         | scarce, people open the program first thing in the morning
         | 'just in case' and keep a license reserved all day.
        
           | legulere wrote:
           | Seems like a good example how private property and therefore
           | capitalism can be seen as the same as a tragedy of the
           | commons.
        
             | [deleted]
        
             | dmingod666 wrote:
             | What?
        
               | michaelt wrote:
               | If it's morally incorrect to get to the license server
               | early and snatch a roaming license you might not use to
               | its fullest, is it not also wrong to access the property
               | market early and snatch a bunch of land you might not use
               | to its fullest?
        
               | jsmith99 wrote:
               | The difference is that market mechanisms are meant to
               | allocate items to those that value them the most (assumed
               | to be the same as their willingness to pay). Claiming a
               | licence by contrast has little cost.
        
               | Kaze404 wrote:
               | I think a homeless person values an empty house way more
               | than a real estate owner.
        
               | ipaddr wrote:
                | It is unhealthy group behaviour. Your actions reduce
                | the group's access. By sharing you increase group
                | access and group success.
               | 
               | When you buy a property for speculation you take a risk
               | on your ability to increase market value (or outside
               | reasons). You could lose money.
               | 
               | Hoarding group resources makes you a net negative in your
               | group. Hoarding property alone means you had to create or
               | borrow enough value to obtain the property and finance
               | upkeep. Someone has the money to build a new home. That
               | transaction is a net positive.
        
             | buffet_overflow wrote:
             | I still know people that pre-emptively buy toilet paper
             | "because hoarders might buy the rest of it" with absolutely
             | no introspection.
        
               | woko wrote:
               | Hoarders will buy the rest of it.
               | 
               | During the 2-month lockdown a year ago, I would purchase
               | 4 frozen pizzas at a time when I had a chance to buy
               | them, because I was so upset that I could not buy one
               | when I wanted a single one during the 2 previous weeks,
               | because of hoarders who had been faster than me.
               | 
               | People think of toilet papers, but it is not just that.
               | Pasta, rice, flour, yeast, plenty of useful things went
               | missing due to hoarders during lockdown. Thankfully, I
               | don't eat meat, because there was no meat at the
               | supermarkets. Fish was also hard to find in the frozen
               | shelves of the supermarkets. I am not a big fish eater
               | either, so it was fine for me as well, but you get the
               | idea: if you don't hoard a tiny bit, you get increasingly
               | frustrated because hoarders will hoard, and you will have
               | to wait for weeks before you get the chance to have what
               | you want.
        
               | _carbyau_ wrote:
               | I noticed a weird thing here. All the regular pasta
               | rapidly disappeared. Rice also was gone.
               | 
               | But "Gluten Free"(I am a GFer) pasta - in this case,
               | pasta made from rice - was fine!
               | 
               | I amused myself with the idea that people:"would rather
               | starve to death than eat that GF crap." :-)
        
               | jakeva wrote:
               | but that's the point, by responding in that way you have
               | become one of those hoarders.
        
               | overboard2 wrote:
               | You could say it's a real tragedy
        
               | smegger001 wrote:
               | a common one at that
        
               | akdor1154 wrote:
               | This is a really perfect illustration of both parts of
               | the parent comment.
        
               | bane wrote:
               | The meat situation was kind of interesting actually. At
               | the beginning of the lockdown I remember going shopping
               | for lots of shelf-stable foods. Very perishable stuff
               | like meat or fresh veggies were out or hard to find (and
               | yes freezing meat works fine I know). However, lots of
               | stores had _tons_ of shelf stable boxed and canned goods
               | and for meat jerky and other dried meats which are simple
               | to boil and turn into simple soups in case of emergencies
               | -- and yet nobody was really buying them at that time.
               | 
               | I think I "prepped" for the worst by buying a 10 lb bag
               | of flour, 40 lb of rice, 5 lbs of oatmeal, a sack of
               | potatoes and a few bags of beef jerky and trail mix. Even
               | if the worst didn't come to pass I figured we'd
               | eventually eat it all anyways and it wouldn't really be
               | hoarded, but in pinch we could ration it and it would
               | last a few months and give us nutritious meals. It's
               | basically what ships crews used to survive on during the
               | age of sail as they spent months at sea. Not a ton of
               | variety but it will keep you alive.
        
               | plank_time wrote:
               | I bought a bunch of rice, and all that happened was
               | little rice bugs started living in it, so I had to throw
               | it all away. But if there were a rice shortage, I would
               | probably have eaten it.
        
               | icansearch wrote:
               | If you freeze it for 24 hours it will kill them off. Can
               | be a good idea to do that when it comes into the house
               | anyway as they might be in there already.
               | 
               | Easy enough to separate them out of the rice after
               | freezing too.
        
               | joshjdr wrote:
               | Extra protein (not to mention a few deviations from the
               | OP)?
        
               | random5634 wrote:
                | The trick is to do exactly what they tell you not to,
                | immediately.
               | 
               | Masks don't protect against covid - jump on Alibaba and
               | buy some.
               | 
                | Don't buy extra toilet paper, there will be no limits on
                | its sale - immediately place an order on amazon prime.
                | Because within a few days there will be price caps on the
                | sale of TP (increased prices would deter hoarding) so
                | everything will sell out right away.
               | 
               | I think people have really learned this by now - as SOON
               | as the announcements come that there is nothing to worry
               | about and they won't do price caps or other things, they
               | will be doing exactly that soon.
               | 
               | Places like Alibaba do seem to continue to function -
               | masks are maybe 50 cents per mask instead of 5, so folks
               | buy a lot less, but you can still get a few for $5. Same
               | with TP, it's not dirt cheap per roll, but you can get a
               | few.
        
               | whatshisface wrote:
               | To rationalize this phenomenon... They don't warn you
               | about it until they think you'd be worried about it, and
               | they don't think you'd be worried about it until they
               | start thinking about doing it. ;)
        
               | nostrademons wrote:
               | It's not really irrational or some behavior that they'd
               | change if they _did_ introspect. I suspect a significant
               | portion of these people are internally like  "Yeah, I'm
               | part of the problem now, but the problem is not going to
               | go away if I don't join in on it, I'll just get screwed."
               | 
               | In other words, they understand game theory and tragedy
               | of the commons.
        
               | 45ure wrote:
               | The introspection or lack thereof, might be defined by
               | Nash equilibrium. Nevertheless, the situation depicted in
               | the article is a life hack -- exactly as described. There
               | is no malicious intent/compliance, rather it is about
               | having the foresight and ingenuity to save the day.
               | 
               | https://theconversation.com/a-toilet-paper-run-is-like-a-
               | ban...
        
             | [deleted]
        
             | xupybd wrote:
             | This is the opposite of private property ownership.
             | Property ownership comes at a cost. That cost increases as
             | supply dries up. In this case there is no cost to the
             | individual, only to the group.
        
               | worik wrote:
                | There is nothing intrinsic about property that implies
                | cost.
               | 
               | I think you are confusing property and scarcity.
        
               | frenchy wrote:
               | > Property ownership comes at a cost.
               | 
               | Sort-of. Typically there is a cost associated with
               | getting rights over a property, either by manufacturing,
               | or in the case of land, by purchase or through the
               | efforts of settling. However, once you have ownership of
                | a property, it's usually relatively cheap to continue
                | owning it (except, I suppose, if you consider the risk
                | of a communist revolution or something).
               | 
                | This is essentially how the British monarchy earns their
               | generous sums of money. They stole land from the Brits
               | during the Norman conquest about 1000 years ago, and now
               | they rent it back to them for a handsome profit (though
               | the whole thing is rather complicated now and they only
               | get a portion of the money).
        
               | DevKoala wrote:
               | In California you pay ~1.5% in property taxes per year.
               | Owning property is not cheap.
               | 
               | In fact you never really own it.
        
               | angry_octet wrote:
               | Unless you're a golf course in LA, and cajole a permanent
               | exemption!
        
               | [deleted]
        
             | gwright wrote:
             | Not really.
        
             | economusty wrote:
             | More like the opposite. Imagine if real estate were
             | available without trading something of value (like the free
             | space on the drive). I don't see how that is like
             | capitalism; in one case the resource was free and finite,
             | and it was almost always taken by those who didn't need it.
             | In the other case you have to trade something of value, and
             | there is ample land in general to trade. Imagine tomorrow
             | Biden gets on TV and says "property rights are dissolved,
             | take what you want"; my guess is that in a short time all
             | land in the world would be claimed. Now imagine if the
             | users of the shared disks had to plan and requisition disk
             | space.
        
       | ineedasername wrote:
       | A great idea, but it still leaves the possibility of performance
       | issues before an admin can address them. Something like two 4 GB
       | blocks might work better: if you get within, say, 200 MB of the
       | storage limit you remove the first one and trigger an
       | email/text/whatever to the admin, so they can address it
       | before it goes further. It's an early warning and automated
       | solution. Then, if the situation continues, the second 4 GB block
       | is also automatically removed with another message sent to the
       | admin. Nothing fails silently.
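
The two-stage scheme described above can be sketched as a cron-able shell script. Everything here is illustrative (directory name, stage sizes, thresholds), and the alert hook is a stub; a real deployment would use something like 4 GB per stage rather than the small demo size.

```shell
#!/bin/sh
# Two-stage ballast sketch. Directory, sizes, and thresholds are
# illustrative; a real deployment would use something like 4G per stage.
BALLAST_DIR="${BALLAST_DIR:-/tmp/ballast.demo}"
SIZE="${BALLAST_SIZE:-8M}"   # e.g. 4G in production
WARN=95                      # release stage1 and alert at this usage %
CRIT=98                      # release stage2 and alert at this usage %

mkdir -p "$BALLAST_DIR"
# (Re)create both stages if missing. fallocate reserves real blocks on
# ext4/XFS; truncate would only make a sparse file.
for stage in stage1 stage2; do
    f="$BALLAST_DIR/$stage"
    [ -e "$f" ] || fallocate -l "$SIZE" "$f"
done

# Usage percentage of the filesystem holding the ballast.
used=$(df -P "$BALLAST_DIR" | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
if [ "$used" -ge "$CRIT" ] && [ -e "$BALLAST_DIR/stage2" ]; then
    rm -f "$BALLAST_DIR/stage2"
    echo "ballast: released stage2 at ${used}% used"   # alert hook here
elif [ "$used" -ge "$WARN" ] && [ -e "$BALLAST_DIR/stage1" ]; then
    rm -f "$BALLAST_DIR/stage1"
    echo "ballast: released stage1 at ${used}% used"   # alert hook here
fi
echo "ballast: disk ${used}% used, $(ls "$BALLAST_DIR" | wc -l) stage(s) intact"
```

Run from cron; once the disk is cleaned up, the next run re-creates any released stage and re-arms the early warning.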
        
       | nycdotnet wrote:
       | This is an old trick for when you need to deploy to media with a
       | fixed size - floppy/CD-ROM/etc. Make a file that is 5-10% of the
       | size of your media and don't remove it unless you're running out
       | of space at crunch time.
        
       | danielrhodes wrote:
       | My understanding is this is why one should partition a drive. If
       | you have a data partition, a swap partition, and an OS partition,
       | you can get around issues where a server's lack of disk space
       | hoses the whole system.
        
         | yabones wrote:
         | 100% agree. I think at the bare minimum every system should
         | have two partitions: `/` and `/var`
         | 
         | /var is usually where the most data gets added. Logs, database
         | files, caches, and whatever other junk your app spits out. 99%
         | of the time that's what causes the out of space issues. By
         | keeping that separate from root, you've saved your system from
         | being completely hosed when it fills up (which it will).
         | 
         | Obviously there are other places that _should_ get their own
         | mounts, like  /home and /usr, but before you know it you've got
         | an OpenBSD install on your hands with 15 partitions :)
        
           | nijave wrote:
            | A place I used to work achieved something similar with LVM
            | thin provisioning and split out something like /, /home,
            | /var, /var/log and maybe a couple others. I think they also
            | had something clever with LVM snapshots to roll back bad
            | updates (snapshot system, upgrade, verify), so even if an
            | update went rogue and deleted some important, unrelated files
            | it could be undone.
        
           | sedachv wrote:
           | OpenBSD default partition allocation is really well thought
           | out:
           | 
           | https://man.openbsd.org/disklabel#AUTOMATIC_DISK_ALLOCATION
           | 
           | At least put /tmp in its own partition as well.
           | 
           | Multiple partitions can also save a lot of recovery time when
           | there is a multiple bad sector event that corrupts an entire
           | partition beyond recovery.
        
           | ClumsyPilot wrote:
           | Sounds like a poor-mans quota system
        
       | znpy wrote:
       | That's a dumb idea?
       | 
       | Iirc some filesystems allow you to reserve a percentage of blocks
       | for this particular use case (recovery by root).
       | 
       | Ext2/3 for sure, ext4 probably too.
       | 
       | Not sure you can do that on linode on the rootfs, since the
       | filesystem is mounted, tho.
        
         | joana035 wrote:
         | tune2fs -m
        
       | rektide wrote:
       | hope you're not running -o compress=lz4 , because you are going
       | to be in for a big surprise when you try to pull this emergency
       | lever! you may be shocked to see you don't actually get much
       | space back!
       | 
       | i do wonder how many FS would actually allocate the 8GB if you,
       | for example, opened a file, seeked to 8GB mark, and wrote a
       | character. many file systems support "sparse files"[1]. for
       | example on btrfs, i can run 'dd if=/dev/zero of=example.sparse
       | count=1 seek=2000000' to make a "1GB" file that has just one
       | 512-byte block in it. btrfs will only allocate a very small
       | amount in this case,
       | some meta-data to record an "extent", and a page of data.
       | 
       | i was expecting this article to be about a rude-and-crude
       | overprovisioning method[2], but couldn't guess how it was going
       | to work. SSDs notably perform much much better when they have
       | some empty space to make shuffling data around easier. leaving a
       | couple GB for the drive to do whatever can be a colossal
       | performance improvement, versus a full drive, where every
       | operation has to scrounge around to find some free space. i
       | wasn't sure how the author was going to make an empty file that
       | could have this effect. but that's not what was going on here.
       | 
       | [1] https://wiki.archlinux.org/index.php/sparse_file
       | 
       | [2] https://superuser.com/questions/944913/over-provisioning-
       | an-...
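
The sparse-file behavior described above is easy to see by comparing a file's apparent size (stat) with the blocks actually allocated (du). The paths below are illustrative:

```shell
# A sparse "1 GB" file: large apparent size, almost no blocks allocated.
truncate -s 1G /tmp/sparse.demo
# For comparison, fallocate actually reserves blocks (1 MiB here).
fallocate -l 1M /tmp/dense.demo

echo "sparse apparent size: $(stat -c %s /tmp/sparse.demo) bytes"
echo "sparse allocated:     $(du -k /tmp/sparse.demo | cut -f1) KiB"  # near zero
echo "dense allocated:      $(du -k /tmp/dense.demo | cut -f1) KiB"
```

This is why a spacer file meant as an emergency reserve should be created with fallocate (or filled with real data) rather than truncate: only the former guarantees the blocks are actually set aside.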
        
         | spijdar wrote:
         | Before reading this, I had presumed that sparse files did not
         | overcommit drive space, but apparently, they do. I don't use
         | them regularly and certainly not to "reserve disk space" but I
         | was surprised that you can make sparse files way larger than
         | available free space on the drive. I had assumed they were
         | simply not initialized, but the FS still required <x> amount of
         | free space in case a block is accessed.
        
         | sonicrocketman wrote:
         | Good point. I didn't think of that. I'm not sure how my CentOS
         | servers handle this scenario, but it seemed to take up the full
         | 8GB however I checked.
        
         | secabeen wrote:
         | > hope you're not running -o compress=lz4 , because you are
         | going to be in for a big surprise when you try to pull this
         | emergency lever! you may be shocked to see you don't actually
         | get much space back!
         | 
         | This is true. If you are replicating this, copy from
         | /dev/urandom rather than using an empty file.
        
           | yjftsjthsd-h wrote:
            | It feels more elegant to do something like `touch /bigfile &&
            | chattr -c /bigfile && truncate --size 8G /bigfile` to
            | outright disable compression on that file
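
The point about compression can be demonstrated with gzip standing in for a compressing filesystem: random bytes are effectively incompressible, while zeros collapse to almost nothing. The file path is illustrative:

```shell
# Random data is effectively incompressible, so a ballast file built this
# way frees its full size when deleted, even under transparent compression.
dd if=/dev/urandom of=/tmp/ballast.rnd bs=1M count=4 2>/dev/null

# gzip barely shrinks the random file; 4 MiB of zeros collapses to almost
# nothing, which is roughly what a compressing filesystem would do to it.
echo "original:       $(stat -c %s /tmp/ballast.rnd) bytes"
echo "random gzipped: $(gzip -c /tmp/ballast.rnd | wc -c) bytes"
echo "zeros gzipped:  $(dd if=/dev/zero bs=1M count=4 2>/dev/null | gzip -c | wc -c) bytes"
```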
        
       | gherkinnn wrote:
       | An architect once told me that he always plans for a solid gold
       | block hidden away in the cellar.
       | 
       | Once the project invariably goes over budget, he drops the plans
       | for the gold and frees up extra funds.
       | 
       | Edit: I think it was a large marble slab. Same thing.
        
         | emmab wrote:
         | What?!? How does that work? Does he just draw up a blueprint
         | and write "solid gold block goes here" and them some contractor
         | says "yes that gold block will be $NNNNN" and includes it in
         | the budget??
        
           | marshmallow_12 wrote:
           | Check your basement!
        
           | viraptor wrote:
           | It's likely not literal. He likely quotes for price +50k or
           | something like that, so that people will start thinking about
           | reducing the price before they run out of budget.
        
       | h4waii wrote:
       | This is like carrying around a pound of beef because you refuse
       | to look up the address of a McDonald's 7 minutes away.
       | 
       | Setup quotas or implement some damn monitoring -- if you're not
       | monitoring something as simple and critical as disk usage, what
       | else are you not monitoring?
        
         | bostonsre wrote:
         | Not all environments require a stringent SLA. I have some
         | servers that don't have a stringent SLA and aren't worth being
         | woken up at night over if their disk is filling up fast.
        
           | luckylion wrote:
           | Okay, so you wake up with a full disk. What did the spacer
           | accomplish?
        
         | Dylan16807 wrote:
         | Monitoring doesn't prevent random things from spiking, and
         | something like this makes it easier to recover.
         | 
         | Quotas are tricky to set up when things are sharing disk space,
         | and that could easily give you a false positive where a service
         | unnecessarily runs out of space.
        
       | jedberg wrote:
       | Since the late 90s, this was always my solution:
       | tune2fs -m 2 /dev/hda1
       | 
       | That sets the root reserve on a disk. It's space that only root
       | can use, but also you can change it on the fly. So if you run out
       | of userland space you can make it smaller, and if a process
       | running as root fills your disk, well, you probably did something
       | real bad anyway. :)
       | 
       | But yeah, this is a pretty good hack.
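
The effect of `tune2fs -m` can be explored without root access or a spare disk by practicing on an ext4 filesystem built inside a regular file (requires e2fsprogs; the image path is illustrative):

```shell
# Build a small ext4 filesystem inside a regular file and adjust its
# root reserve; no root privileges or real block device needed.
img=/tmp/reserve.demo.img
dd if=/dev/zero of="$img" bs=1M count=16 2>/dev/null
mkfs.ext4 -q -F "$img"

tune2fs -l "$img" | grep 'Reserved block count'   # default reserve: 5%
tune2fs -m 2 "$img" >/dev/null                    # shrink the reserve to 2%
tune2fs -l "$img" | grep 'Reserved block count'   # now 2% of the blocks
```

On a live server the same `-m` invocation against the mounted device takes effect immediately, which is what makes it usable as an emergency lever.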
        
       | ryandrake wrote:
       | A lot of tips in this thread are about how to better alert when
       | you get low on disk space, how to recover, etc. but I'd like to
       | highlight the statement: "The disk filled up, and that's one
       | thing you don't want on a Linux server--or a Mac for that matter.
       | When the disk is full nothing good happens."
       | 
       | As developers, we need to be better at handling edge cases like
       | out of disk space, out of memory, pegged bandwidth and pegged
       | CPU. We typically see the bug in our triage queue and think in
       | our minds "Oh! out of disk space: Edge case. P3. Punt it to the
       | backlog forever." This is how we get in this place where every
       | tool in the toolbox simply stops working when there's zero disk
       | space.
       | 
       | Especially on today's mobile devices, running out of disk space
       | is common. I know people who install apps, use them, then
       | uninstall them when they're done, in order to save space, because
       | their filesystem is choked with thousands of pictures and videos.
       | It's not an edge case anymore, and should not be treated as such.
        
         | probably_wrong wrote:
         | I know when our server's /tmp directory is full because Bash's
         | tab autocompletion stops working.
         | 
         | /home still has space, though, so nothing truly breaks. Perhaps
         | I should file a bug report about that.
        
         | jandrese wrote:
         | It doesn't help that the base model of many phones had
         | ridiculously undersized storage for so many years.
         | 
         | "I have an unlimited data plan, I'll just store everything in
         | the cloud." only to discover later that unlimited has an
         | asterisk by it and a footnote that says "LOL it's still
         | limited".
        
       | moistbar wrote:
       | When I worked at SevOne, we had 10x500 MB files on each disk that
       | were called ballast files. They served the same purpose, but
       | there were a couple nice tools built in to make sure they got
       | repopulated when disk space was under control, plus alerting you
       | whenever one got "blown." IIRC it could also blow ballast
       | automatically in later versions, but I don't remember it being
       | turned on by default.
        
       | da_big_ghey wrote:
       | The full-disk problem on Linux machines has been partially
       | solved for decades: put /home, /tmp, /var, and /usr each on its
       | own partition. This reduces the problem, if not completely
       | removing it. The small disadvantage is reduced fungibility of
       | disk space.
        
       | dmingod666 wrote:
       | It sounds 'cool' and all for 1995.. but, what about one script
       | that'll email you when the disk is at 80%?
        
         | ineedasername wrote:
          | So that if, by the time you get the email, the issue is at
          | 97%, you can immediately give yourself enough breathing room
          | to figure things out without downtime or significantly
          | degraded performance.
        
           | dmingod666 wrote:
           | Sure, alerts + this sounds like an approach someone would
           | take
        
         | dylan604 wrote:
         | Besides immediately after first spin-up, when is a drive not at
         | 80% capacity?
        
       | capableweb wrote:
       | I've had my disks so full that a `rm` command doesn't even work,
       | would this workaround work in those cases too?
        
         | geocrasher wrote:
         | Yes because you could just do
         | 
         | > save_my_butt.img
         | 
         | and now it's 0 bytes.
        
           | kiwijamo wrote:
              | Would that work? The fs may actually allocate a new file
              | before deleting the existing allocation, so the risk of it
              | not working is still there, I would think.
        
             | geocrasher wrote:
             | Worst case scenario, if
             | 
             | > filename
             | 
             | didn't null it, then just
             | 
             | echo "0" > filename
        
             | quesera wrote:
             | It might vary by kernel, filesystem, or shell, but in my
             | experience and confirmed with a quick test: shell
             | redirection does not create a new file/inode.
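
The quick test is easy to reproduce: `>` redirection opens the existing file with O_TRUNC, so the shell truncates it in place rather than creating a new inode, and no new allocation is needed. The path below is illustrative:

```shell
f=/tmp/trunc.demo
dd if=/dev/zero of="$f" bs=1k count=64 2>/dev/null

ino_before=$(stat -c %i "$f")
: > "$f"                    # truncate in place: O_TRUNC on the existing file
ino_after=$(stat -c %i "$f")

echo "inode $ino_before -> $ino_after, size now $(stat -c %s "$f") bytes"
```

The inode number is unchanged after the redirection, which is why `> spacer.img` works even on a completely full filesystem.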
        
       | louwrentius wrote:
       | Please use LVM (Logical Volume Manager) if you really are afraid
       | of filling up disks.
       | 
       | If the disk ever fills up:
       | 
       |     1. Buy an additional virtual disk
       |     2. Add the disk to the LVM volume group
       |     3. Expand the logical volume
       | 
       | A really good primer on LVM:
       | 
       | https://wiki.archlinux.org/index.php/LVM
        
         | amelius wrote:
         | Can you do all that while blocking the original request for
         | space?
        
           | louwrentius wrote:
           | Yes you can. And all of this is on-line.
        
       | tanseydavid wrote:
       | I am surprised Windows did not make the short list of OSes that
       | get hosed when disk space is almost gone.
       | 
       | I have been there a couple of times and it is a land of crazy
       | unpredictable behavior.
        
       | njacobs5074 wrote:
       | As hacks go, it's a good one. I also like it because you don't
       | have to be root to implement it and you don't have to reconfigure
       | your file system params in ways that might or might not be great
       | for other reasons.
        
       | [deleted]
        
       | robin21 wrote:
       | This is a great idea. Hit this so many times.
        
       | freeone3000 wrote:
       | This sounds like you should, instead, use the "Filespace Reserved
       | for Root" functionality of your filesystem, which exists
       | specifically for this contingency. The default for ext3 is 5%.
        
       | SpaceInvader wrote:
       | To extend space in any filesystem in the root volume group on AIX
       | you need space in /tmp. Years ago, while working for a major
       | bank, I proposed creating such a dummy file in /tmp exactly for
       | the purpose of extending filesystems. It saved us several times :)
        
       | fractal618 wrote:
         | I think I know where this is going without even reading. Any
         | attempt from the outside to pull this 8 GB file would be a very
         | noticeable red flag.
        
         | outworlder wrote:
         | Nope. Try again :)
        
           | qwertox wrote:
           | But the idea isn't that bad. Name it properly, like saudi-
           | arabia-customer-data.sql.pgp next to a directory named pgp-
           | keys and fill it with /dev/random.
        
       | clipradiowallet wrote:
       | An alternative approach here... make sure (all) your filesystems
       | are on top of LVM. This reduces the steps needed to grow your
       | free space. Whether you have a 8gb empty file laying around, or
       | an 8gb block device to attach...LVM will happily take them both
       | as pv's, add them to your vg's, and finally expand your lv's.
       | 
       | some reading if LVM is new and you want to know more:
       | https://opensource.com/business/16/9/linux-users-guide-lvm
       | 
       | edit to add: pv=physical volume, vg=volume group, lv=logical
       | volume
        
         | uniformlyrandom wrote:
         | I would not say this is an alternative, more like yet another
         | tool in a shed:                 1. Tunefs       2. spacer.8gb
         | 3. lvm
        
         | ttyprintk wrote:
         | Added benefit of not waiting to backup and restore 8 GB.
        
         | rubiquity wrote:
         | Yes LVM can help here. Another approach would be when you
         | create the logical volume to intentionally under allocate.
         | Perhaps only use 80-90% of the physical volume.
        
         | jonhermansen wrote:
         | If you are using LVM on all of your filesystems, it seems like
         | a bad idea to use a file residing on LVM block device as
         | another PV. And actually I'd be surprised if this was even
         | allowed. Though maybe it is difficult to detect.
         | 
         | You'd effectively send all block changes through LVM twice
         | (once through the file, then through the underlying block
         | device(s))
        
           | labawi wrote:
           | LVM is just fancy orchestration for the device-mapper
           | subsystem with some headers for setup information.
           | 
           | For block operations it's no different from manual setup of
           | loop-mounted volumes, that also need to travel a couple of
           | layers to hit the backing device.
           | 
           | Though there is an important caveat - LVM is more abstracted,
           | making it easier to mistakenly map a drive onto itself, which
           | may create a spectacular failure (haven't tried).
        
       | scottlamb wrote:
       | > On Linux servers it can be incredibly difficult for any process
       | to succeed if the disk is full. Copy commands and even deletions
       | can fail or take forever as memory tries to swap to a full disk
       | and there's very little you can do to free up large chunks of
       | space.
       | 
       | This reasoning doesn't make sense. On Linux, swap is
       | preallocated. This is true regardless of whether you're using a
       | swap partition or a swap file. See man swapon(8):
       | 
       | > The swap file implementation in the kernel expects to be able
       | to write to the file directly, without the assistance of the
       | filesystem. This is a problem on files with holes or on copy-on-
       | write files on filesystems like Btrfs.
       | 
       | > Commands like cp(1) or truncate(1) create files with holes.
       | These files will be rejected by swapon.
       | 
       | I just verified on Linux 5.8.0-48-generic (Ubuntu 20.10) / ext4
       | that trying to swapon a sparse file fails with "skipping - it
       | appears to have holes".
       | 
       | Now, swap is horribly slow, particularly on spinning rust rather
       | than SSD. I run my systems without any swap for that reason. But
       | swapping shouldn't fail on a full filesystem, unless you're
       | trying to create & swapon a new swapfile after the filesystem is
       | filled.
        
         | Yizahi wrote:
          | I've seen Linux systems simply crash when the root partition
          | was 100% full, though it was an embedded system, not
          | representative of big servers.
        
           | Blikkentrekker wrote:
           | Define crash of a "system"?
           | 
            | Kernel panic? Some user process you deem essential stopping?
        
         | bostonsre wrote:
         | Not sure about their reasoning.. but if you don't have root ssh
         | enabled, sudo can break if there is no free disk space. I do
         | something similar where I write a 500mb file to /tmp and chmod
         | 777 it so anyone can free it up without needing sudo.
        
           | reph2097 wrote:
           | In that case, use "su".
        
           | franga2000 wrote:
           | I've experienced far more full disks than I'd want to admit,
           | on many different hardware and software configurations, and
           | I've never seen sudo break. Is this something you've
           | experienced recently?
           | 
            | I definitely agree with your advice and will go double-check
            | that /filler is 777 on all my servers (not in /tmp since it's
            | sometimes mounted tmpfs), but if sudo does break in that
           | situation, that sounds like a pretty severe and most likely
           | fixable bug.
        
           | jeroenhd wrote:
           | I've never had sudo break on my full disks. However, that
           | doesn't mean recovery is easy...
           | 
           | Working in a terminal to find out what on earth has just
           | filled up your disk is a real pain when your shell complains
           | about failing to write to your $HISTFILE and such. And, of
           | course, the problem always shows up on that one server that
           | doesn't have ncdu installed...
           | 
           | I'm sure sudo can theoretically break with 0 free disk space,
           | but that's not the usual mode of failure in my experience. At
            | most sudo needs to touch a dotfile or two, so deleting _any_
           | temporary file or old log archive will do for it to recover.
           | 
           | The balloon file is not a bad idea. I think I will apply it
           | on my own servers just for good measure, although 8GiB is a
           | bit much for my tastes.
        
         | MayeulC wrote:
         | IIRC, swap is actually needed for some memory operations. And
         | when you run out of memory, the behaviour is often worse
         | without swap.
         | 
         | These days I always at least configure an in-memory compressed
         | swap (zram).
        
           | scottlamb wrote:
           | You recall incorrectly; swap is not needed. It's not just me
           | who runs without it; Google production machines did for many
           | years.
           | 
           | "The behavior is often worse without swap" is more vague /
            | subjective. I prefer that a process die cleanly rather than
            | everything slowing to a crawl semi-permanently. I've previously
           | written about swap causing the latter:
           | https://news.ycombinator.com/item?id=13715917 To some extent
           | the bad behavior can happen even without swap because non-
           | locked file-backed pages get paged in slowly also, but it
           | seems significantly worse with swap.
           | 
           | zram is a decent idea though. I use it on my memory-limited
           | Raspberry Pi machines.
        
           | zamadatix wrote:
            | Depends if you want things to gracefully degrade because you
            | know you don't have enough RAM, or if you'd rather things
            | just straight up die.
            | 
            | E.g. for the things I work on with my laptop, if whatever
            | I'm doing isn't going to work with 128 GB of RAM (80% of
            | which was meant to be cached data, not actually used), then
            | it's because something went horribly wrong and needs to be
            | halted, not because I needed some swap which is just going
            | to hide that things have gone horribly wrong for a minute
            | and then die anyways.
            | 
            | Now if I were doing the same things on a machine with 8 GB
            | or 16 GB of RAM then yeah, I'd want to gracefully handle
            | running out of physical memory, because things are probably
            | working correctly; it's just a heavier load, and it can be
            | better to swap pages to disk than to drop them from a small
            | amount of cache completely.
        
       | ohazi wrote:
       | Ah yes... good ole'
       | in_case_of_fire_break_glass.bin
        
       | eecc wrote:
       | That's what tune2fs is for
       | https://www.unixtutorial.org/commands/tune2fs
        
       | CrLf wrote:
       | This is why the invention of LVM was such a good idea even for
       | simpler systems (where some people claimed it was useless
       | overhead). In my old sysadmin days I _never_ allocated a full
       | disk. The  "menace" of an almost full filesystem was usually
       | enough to incentivize cleanups but, when necessity came, the
       | volume could be easily expanded.
       | 
       | I guess a big file is not a bad idea either.
        
       | midasuni wrote:
       | I do similar, I keep multiple files though - 4GB, 2GB, 1GB and
       | 100M, which I also use for testing speed
        
       | nijave wrote:
       | The real question... Why does Linux or at least the common
       | filesystems get stuck so easily running out of disk space? Surely
       | normal commands like `rm` should still function.
        
         | nemo1618 wrote:
         | > Surely normal commands like `rm` should still function
         | 
         | They do. In my experience, the only disruption to most terminal
         | operations is that tab completion will fail with an error.
        
           | hobofan wrote:
           | They sometimes don't. The article even acknowledges this:
           | 
           | > Copy commands and even deletions can fail
           | 
            | I've had that happen too many times, so I don't know why I
            | would fill up my disk with a hacky spacer file, which surely
            | can also fail to be deleted when the disk is already full.
        
         | npongratz wrote:
         | As recently as 2016 I experienced major problems using `rm`
         | with an intentionally-filled btrfs (and current Linux kernel at
         | the time), and per my notes, it was even mounted as `-o
          | nodatacow`:
          | 
          |     # rm -f /mnt/data/zero.*.fill
          |     rm: cannot remove '/mnt/data/zero.1.fill': No space left
          |     on device
        
       | arthurmorgan wrote:
       | It's a really bad problem on iOS where a full disk won't allow
       | you to delete anything and a reboot puts your phone in a boot
       | loop.
        
       | andimm wrote:
       | ot: the first two links are "swapped"
        
         | sonicrocketman wrote:
         | Thanks for pointing this out. Fixed.
        
       | dominotw wrote:
       | sysadmin version of setting the clock 5 mins ahead?
        
       | johnchristopher wrote:
       | It's the google photos thumbnails db. /s
        
       | harperlee wrote:
       | At work, OneDrive does not sync by policy if there is less than
       | 30 GB of free space. Apparently to ensure space for updates when
       | they come...
        
       | TazeTSchnitzel wrote:
       | I am reminded of a tweet that suggested adding a sleep() call to
       | your application that makes some part of it needlessly slow, so
       | that you can give users a reason to upgrade when there's a
       | security fix (it's 1 second faster now)!
        
         | Black101 wrote:
         | Apple did something like that...
         | https://www.npr.org/2020/11/18/936268845/apple-agrees-to-pay...
        
           | TazeTSchnitzel wrote:
           | They did it so old phones wouldn't suddenly hard-shutdown at
           | 30% battery. I appreciate that they did that, it's very
           | annoying.
        
             | Black101 wrote:
             | That's what Apple likes to say. Luckily they got at least a
             | small fine.
        
       | kristjansson wrote:
       | Lots of comments assailing this approach as a poor replacement
       | for monitoring miss the point. Of course monitoring and proactive
       | repair are preferable - but those are systems that can also fail!
       | 
       | This is a low-cost way to make failure of your first line of
       | defense less painful to recover from, and seems like a Good Idea
       | for those managing bare-metal, non-cattle systems.
        
       | davidmoffatt wrote:
       | Dumb idea. Read the man page for tunefs. The filesystem has
       | something called minfree which does the same thing. However,
       | this does not interfere with wear leveling. Dummy data does.
        
         | 404mm wrote:
          | Not commenting on whether OP's approach is sound or not, but
          | tune2fs implies the now less and less used ext4 (many distros
          | are switching to XFS or btrfs :-/). On another note, that
          | limit applies to non-privileged processes only. Some crap
          | running as root will just fill up the disk too.
        
       | deeblering4 wrote:
       | if you are on an ext filesystem, reducing the reserved percentage
       | on the full filesystem can save the day. its more or less this
       | same trick built in to the filesystem
       | 
        | IIRC 5% is reserved when the filesystem is created, and if it
       | gets full you can run:
       | 
       | tune2fs -m 4 /dev/whatever
       | 
       | which will instantly make 1% of the disk available.
       | 
       | of course should be used sparingly and restored when finished
        
       | AcerbicZero wrote:
       | In VMware clusters that use resource pools extensively, I've
       | always maintained a small emergency CPU reservation on a pool
       | that would never use it, just in case I had to free up some
       | compute without warning.
        
       | raldi wrote:
       | This reminds me of Perl's esoteric $^M variable. You assign it
       | some giant string, and in an out-of-memory condition, the value
       | is cleared to free up some emergency space for graceful shutdown.
       | 
       | "To discourage casual use of this advanced feature, there is no
       | English long name for this variable."
       | 
       | But the language-build flag to enable it has a great name:
       | -DPERL_EMERGENCY_SBRK, obviously inspired by emergency brake.
        
         | Wibjarm wrote:
         | I'd expect the name is also inspired by the sbrk(2) system
         | call, so you can allocate some memory "for emergency use" if
         | needed.
        
         | amock wrote:
         | I think it likely relates to
         | https://en.m.wikipedia.org/wiki/Sbrk.
        
           | raldi wrote:
           | Yes, of course. That's what makes the pun work.
        
         | [deleted]
        
       ___________________________________________________________________
       (page generated 2021-03-25 23:00 UTC)