[HN Gopher] All my servers have an 8 GB empty file on disk
___________________________________________________________________
All my servers have an 8 GB empty file on disk
Author : sonicrocketman
Score : 524 points
Date : 2021-03-25 18:40 UTC (4 hours ago)
(HTM) web link (brianschrader.com)
(TXT) w3m dump (brianschrader.com)
| joana035 wrote:
| Mind you, you can also use tune2fs. Its "-m" option tunes how
| much reserved space is dedicated to the root user.
| pritambarhate wrote:
| All my servers have an alarm when disk usage goes above 70%. It
| sends an email every hour once usage crosses that threshold.
| Never had a server go down because of a disk space issue after
| adopting this practice.
|
| Also, one of the main reasons server disks fill up is log
| files. Always remember to "logrotate" your log files and you
| will mostly avoid this issue.
|
| One more thing: for all user-uploaded files, use external
| storage like NFS or S3.
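| The "logrotate" advice above usually takes the form of a small
| config stanza; a sketch (the path and rotation policy here are
| illustrative, not from the comment):

```
/var/log/myapp/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
}
```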
| ghostly_s wrote:
| Is there a package I can install to set this up?
| Moto7451 wrote:
| Icinga is a common solution for monitoring FS and other use
| metrics. I imagine his setup, if custom rolled, is a shell
| script checking df and sending an email when usage is at or
| above 70%.
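| A custom-rolled check like that fits in a few lines of shell; a
| minimal sketch (the threshold and email address are illustrative):

```shell
#!/bin/sh
# Alert when the root filesystem crosses a usage threshold.
THRESHOLD=70
# -P forces POSIX single-line output so awk can grab column 5 (Use%).
USAGE=$(df -P / | awk 'NR==2 { gsub("%", "", $5); print $5 }')
if [ "$USAGE" -ge "$THRESHOLD" ]; then
    echo "Disk usage on $(hostname) is at ${USAGE}%" \
        | mail -s "Disk usage alert: $(hostname)" admin@example.com
fi
```

| Dropped into an hourly cron job, this reproduces the "email every
| hour above 70%" behaviour described above.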
| tetha wrote:
| This is in the same vein as what I was going to point out.
|
| Most uncontrolled space usage comes from logs, users doing user
| things, or something like build servers just eating temporary
| and caching storage for lunch. Databases also tend to have
| uncontrolled space usage, but that tends to be wanted.
|
| So, if you push /var/log to its own 20-30 GB partition, a mad
| logger cannot fill up /. It can kill logging, but no logging is
| better than fighting with a full /. The same goes for /home -
| let users fill up their home dirs and pout and scream about
| it... but / is still fine. And you can use their input to
| provide more storage, if they have useful workflows.
|
| Something like databases - where their primary use case is to
| grow - need monitoring though to add storage as necessary.
| bonestamp2 wrote:
| > for all user uploaded files use external storage like NFS or
| S3
|
| We send our log files to S3 too. I mean, we write them locally
| (EC2) and then push them to S3 every minute.
|
| Then we have a tool that will let us search the log files in S3
| and it will parse these rotated log files and join together the
| relevant pieces depending on what we're looking for (or all of
| it for a specific time period if we don't know what we're
| looking for).
|
| This is great because if the server goes down and we can't
| access it, or the instance is gone, we can still see log files
| from shortly before the problem occurred. We also use bugsnag,
| etc. for real-time logging and tracking where possible.
| gowld wrote:
| Disk space for an active server is so cheap; why not alert at
| 30%?
| 1123581321 wrote:
| This is clever. Our shaky version of this, historically, has been
| to run ncdu and hastily delete the first large log file we see.
| It's not ideal.
| kernelsanderz wrote:
| ncdu saves my bacon at least every few months. I do machine
| learning and am always running out of space!
| dylan604 wrote:
| sounds like your machine isn't learning the right things.
| sonicrocketman wrote:
| (that's actually how I solved the original issue that I
| reference in the post, and how I got the idea for this silly
| solution)
| ttyprintk wrote:
| The filesystems where headroom matters are var, tmp and
| sometimes root. I like this strategy with logfiles because
| nethack.log.gz.30 was approximately as important as empty
| space. Keeping another 8 GB on root and tmp seems extreme.
| johbjo wrote:
| No. Careful partitioning is the solution to this problem. Monitor
| the growth of your partitions and make sure nothing on rootfs or
| other sensitive partitions grows significantly.
| slaymaker1907 wrote:
| I don't think this has to be an either/or scenario. Having some
| bloat you can get rid of quickly is a nice backup in case your
| monitoring fails for whatever reason.
| reph2097 wrote:
| This is stupid. Just don't make your servers use up all space.
| That's why ext can reserve space for root, 5% by default.
| michelb wrote:
| Showing off that i'm not a sysadmin, but wouldn't a monitoring
| daemon work? Once disk usage grows past a certain uncomfortable
| threshold you get an email/notification to see what's up. I mean
| you obviously are monitoring other server vitals anyway right?
| jodrellblank wrote:
| Cases mentioned below where space fills up quickly due to a
| bug, maybe yes. Outside that there's the problem that you can
| ignore the emails (or be sick, asleep, etc). Worse if they go
| to a team and everyone is busy and assumes someone else will
| deal. Bad if you aren't in charge and tell people in charge and
| they nod and don't decide anything - they prefer to run at
| 70/80/90/95% used indefinitely instead of signing a cheque.
|
| When the drive fills and everything breaks, you /have/ to
| respond, and it becomes socially OK to make it your highest
| priority and drop everything. An email "only a few weeks until
| maybe it runs out of space" is much harder to prioritise and
| get anyone to care. With this system, the time when it fills
| and breaks has some flex for you to not go down with the
| server, and save your own bacon. It's as much a fix for the
| organization as anything else.
|
| I see this most in smaller companies' aging systems: they had
| ample storage and drives a few years ago when they were new,
| but now they're crammed full with the growth of software,
| libraries, updates, data, new services being deployed on them,
| and increased demands. Nobody wants to commit to more storage
| for an older system towards the end of its warranty, but they
| definitely don't want to commit to the much larger cost of a
| replacement and all the project work, and running at 90% full
| costs nothing and involves no decisions. 91%. 92%.
| willbutler wrote:
| Monitoring is a good idea, regardless. However, there are cases
| where a bug or some other issue can cause disk usage to ramp
| too quickly for someone to respond to an alert.
| 00deadbeef wrote:
| Isn't using LVM and holding some space back a better solution for
| this?
|
| Also I keep databases on their own partition so that nothing else
| can accidentally fill up the space and lead to data loss.
| jstanley wrote:
| Maybe, but author says:
|
| > On Linux servers it can be incredibly difficult for any
| process to succeed if the disk is full.
|
| You won't feel too clever if you come to grow your LVM volume
| into the free space and it won't work because there's no free
| space on the filesystem! :)
|
| (I don't actually know if this would fail or not - but the
| point is "rm spacer.img" is pretty much guaranteed not to
| fail).
| LinuxBender wrote:
| I've used LVM for this purpose plenty of times. lvm2 at least
| has not prevented me from extending a full disk. lvm +
| reserved blocks + a small spacer file are all decent options,
| even better when used together.
| JshWright wrote:
| rm /path/to/big.file is faster than looking up the commands to
| expand the LVM volume and grow the filesystem.
| 00deadbeef wrote:
| They're not hard to memorise. I never look them up.
| nkellenicki wrote:
| I don't use lvm often enough to memorise the commands. "rm
| spacer.img" is short and easy for _anyone_ to remember.
| JshWright wrote:
| Good for you. That's not how my memory works. If I don't
| use a command regularly, I don't trust myself to remember
| it correctly. Even if I did though, that's a multi-step
| process, compared to the single command needed to remove a
| file.
| Spivak wrote:
| Tomato potato. If you use LVM or anything like it to reserve
| space then in your failure situation you have to extend the lv,
| partition, and fs before the space becomes available. More work
| than just rm'ing the file.
|
| I think the ideal is tuning the reserved blocks in your
| filesystem: xfs_io -x -c 'resblks ...
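| For reference, the multi-step LVM path being compared here looks
| roughly like this (the VG/LV names are illustrative, and it
| assumes free extents were deliberately left in the volume group):

```shell
# Grow the logical volume into spare extents held back in the VG,
# then grow the filesystem to match.
lvextend -L +8G /dev/vg0/root
resize2fs /dev/vg0/root      # ext4; for XFS use xfs_growfs instead
```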
| 00deadbeef wrote:
| It's not really much more work. One command extra.
| pgray wrote:
| tune2fs -m not good enough?
| k__ wrote:
| I remember a discussion here about a dude who did this with
| memory in game development. People didn't like the idea very
| much.
|
| To me it has a taste of domain squatting or GPU scalping,
| except done to your own team rather than to strangers.
| crabmusket wrote:
| This reminds me of a similar story in a classic Gamasutra
| article[1] (the section is "The Programming Antihero", and I'd
| recommend the other pages of the article for a few good
| chuckles). Apocryphal or not, it makes for a good story.
|
| > I can see how sometimes, when you're up against the wall,
| having a bit of memory tucked away for a rainy day can really
| make a difference. Funny how time and experience changes
| everything.
|
| [1]
| https://www.gamasutra.com/view/feature/132500/dirty_coding_t...
| Denvercoder9 wrote:
| > Copy commands and even deletions can fail or take forever as
| memory tries to swap to a full disk
|
| That's only a problem if your memory is full as well, and even
| then, I've never encountered a server that uses a swapfile
| instead of a swap partition.
| jstanley wrote:
| That's also only a problem if your swap is backed by a file
| on your filesystem, which is an exceedingly uncommon
| configuration.
| yjftsjthsd-h wrote:
| Even a swap file shouldn't matter, since it's still not sparse.
| The one exception is if you're on a system that dynamically
| adds and removes swap files - I believe Darwin does that, and I
| _think_ it might be possible to do on Linux(?) but I've not
| actually seen it done.
| Deathmax wrote:
| Not quite the same situation as described in the article, but
| it is still possible for the kernel to swap memory in and out
| of disk even without a swap file/partition. Memory used for
| storing executable binaries is allowed to be moved out of
| memory, as a copy of it lives on disk. This means you can still
| encounter memory thrashing (and thus system unresponsiveness)
| under low memory situations.
| Denvercoder9 wrote:
| > Memory used for storing executable binaries is allowed to
| be moved out of memory, as a copy of it lives on disk.
|
| On Linux, this is not necessarily the case, as you can change
| the file on disk while the executable is running. I don't
| know if Linux just keeps executable code in memory all the
| time, or if it is smart enough to detect whether a copy of
| executable pages still lives on disk.
| unilynx wrote:
| You should get a "text file busy" error if you try that.
|
| What you can do is delete and then recreate the executable.
| Then the deleted data simply sticks around on disk until
| it's no longer referenced
| benibela wrote:
| I have a dual-boot laptop with Windows and Linux, and use the
| NTFS partition to share data between them
|
| Recently, I extracted a large archive onto the NTFS partition
| with Linux, and the partition filled up
|
| Then Windows did not start anymore
|
| Linux would only mount the partition as read-only, because it was
| marked dirty after the failed start. Finally I found a tool to
| reset the mark, and delete the files.
|
| Now Windows starts again, but my user account is broken. It
| always says "Your Start menu isn't working. We'll try to fix it
| the next time you sign in.", then I sign out, and it is still
| broken
|
| I had to make a new user account
| MayeulC wrote:
| > Finally I found a tool to reset the mark, and delete the
| files.
|
| fsck?
| CodeBeater wrote:
| The gastric balloon of linux servers
| Blikkentrekker wrote:
| > _On Linux servers it can be incredibly difficult for any
| process to succeed if the disk is full. Copy commands and even
| deletions can fail or take forever as memory tries to swap to a
| full disk_
|
| I don't understand this. Swap is either a swap partition, or a
| specific swap file, all of which allocated in advance, so the
| fullness of the storage should have no bearing.
| rrauenza wrote:
| I thought this was gonna be about the obscenely large sparse
| file /var/log/lastlog.
|
| I really wish they would move it from a sparse mmap()ed file to
| a btree or something.
| AlisdairO wrote:
| One other option is increasing the reserved block count (
| https://ma.ttias.be/change-reserved-blocks-ext3-ext4-filesys...
| ). This has the nice side effect of increasing the space
| available for critical daemons.
|
| If you haven't customised this, in a pinch you can still lower
| it a bit to buy some time.
| throw0101a wrote:
| ZFS has explicit reservations:
|
| > _The minimum amount of space guaranteed to a dataset and its
| descendants. When the amount of space used is below this value,
| the dataset is treated as if it were taking up the amount of
| space specified by its reservation. Reservations are accounted
| for in the parent datasets' space used, and count against the
| parent datasets' quotas and reservations._
|
| * https://openzfs.github.io/openzfs-docs/man/8/zfsprops.8.html
|
| These are done on a per dataset basis (basically a directory
| delineated boundary).
| stonesweep wrote:
| I suspect the blog author did not understand this (based on the
| content) - as a Linode user myself, I just had a look at one of
| my VMs and they install with the regular 5% reserved space
| (ext4/Debian).
| birdyrooster wrote:
| Funny because I have always tune2fs -m1 or tune2fs -m0,
| because the reserved space was never supposed to scale
| linearly with hard drive capacities and is not useful to
| userspace in any way. Have never had any issues and been doing
| it for decades in commercial applications. In some cases,
| where you probably shouldn't be using ext3/4 anyway, we are
| talking about reclaiming TBs of reserved space.
|
| It's important to note that mkfs doesn't care if you are
| formatting the root partition or a data volume partition; it
| will still reserve the space for root.
| stonesweep wrote:
| Fun trivia: you can actually set per-device settings in
| /etc/mke2fs.conf with all sorts of alternate defaults.
| https://man7.org/linux/man-pages/man5/mke2fs.conf.5.html
| tytso wrote:
| If you try to run a file system at 99% full --- and it
| doesn't matter whether it is a 10GB file system or a 10TB
| file system --- you _will_ see significant performance
| penalties as the file system gets badly fragmented. So that's
| why having a fixed percentage even for massively big disks
| still makes sense.
|
| Disk space is cheap enough that even 5% of a 14TB disk is
| really not that much money --- and if you see bad
| performance, and then have to pay $$$ to switch to an SSD,
| maybe it would have been much cheaper to use a HDD with a
| larger free space reserve....
| sonicrocketman wrote:
| I did not know this. Good solution.
| GekkePrutser wrote:
| I know about this, but I still don't think what he does is a
| bad idea, because the reserved block count is for root, and
| most server processes still run as root. And it's usually those
| that fill up the disk. Though I suppose this also makes the
| problem itself more prominent in the first place. I guess if
| you run into this a lot, stricter monitoring would be a better
| solution.
|
| The way I found out about it originally was because I was using
| external storage drives and I was never able to fit as much as
| I expected :D
|
| Luckily you can easily change this without reformatting.
| chousuke wrote:
| What servers usually run as root? Some may start as root, but
| usually drop privileges for the actual server processes
| quickly, e.g. apache, nginx, sshd.
|
| Nothing that actually does the "serving" or accesses data
| should be running as root.
| GekkePrutser wrote:
| No but the logfile writers are usually running as root
| AFAIK. And this is what tends to fill up the disk.
| edoceo wrote:
| Mine don't run as root.
| vntok wrote:
| You should fix it then, takes no time at all.
| repiret wrote:
| That's not really something that needs fixing.
| derefr wrote:
| On systemd systems, logfiles are written to disk under
| the journald user, `systemd-journal`.
| [deleted]
| kiwijamo wrote:
| Is that true for all logfiles? I still have plenty of
| daemons (by default) writing directly to some file in
| /var/log eg EXIM, Apache, and the like. Also plenty of
| system stuff still write to files in that directory. And
| yes this is a machine that uses systemd.
| comex wrote:
| But those daemons don't usually have their own log writer
| processes running as root, do they? Instead, either the
| log file is accessible by the user the daemon is running
| as, or the daemon opens the log file as root before
| dropping privileges for the rest of its operation.
| stonesweep wrote:
| Most vendors (Debian/Ubuntu, RHEL/clones, etc.) add a
| hook into rsyslog to be a partner with the systemd logger
| and write out text files next to the journal - they
| realize that a lot of people dislike dealing with
| journalctl (I'm one of them) and provide an alternate
| hook already installed and working for you behind the
| scenes.
|
| This is for daemons using syslog methodology, not direct
| writers like apache/nginx/mysql/etc; think more like
| cron, systemd, chrony, NetworkManager, and so forth. The
| vendors are not all aligned on what goes where (example:
| on RHEL, pacemaker/crm write to their own logs but on
| openSUSE they're sent to syslog) - the actual results
| differ slightly from vendor to vendor.
|
| DIY distros like Arch do not implement the rsyslog
| backend by default, you have to set it up yourself
| following the wiki - only journalctl is there by default.
| GekkePrutser wrote:
| Ah good point, I use Alpine on all my servers so it's
| more traditional logs.
| znpy wrote:
| It used to be common, before "the cloud", to have many
| apparently unnecessary partitions in a server install. One
| for /, one for /var, one for /home, one for swap at the low
| sector numbers...
|
| The idea is that /var filling up would not make the system
| unrecoverable.
| zeta0134 wrote:
| If you happen to use ext as your default filesystem, check the
| output of tune2fs; it's possible your distro has conveniently
| defaulted some 2-5% of disk space as "reserved" for just such an
| occasion. As the root user, in a pinch, you can set that to 0%
| and immediately relieve filesystem pressure, buying you a little
| bit more time to troubleshoot whatever the real problem is that
| filled the disk in the first place.
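| A sketch of that emergency procedure (the device path is
| illustrative, and these commands need root):

```shell
# See how many blocks are currently reserved for root:
tune2fs -l /dev/sda1 | grep -i 'reserved block count'
# Emergency: hand the reserve back to ordinary processes...
tune2fs -m 0 /dev/sda1
# ...then restore the 5% default once the real problem is fixed:
tune2fs -m 5 /dev/sda1
```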
| tomaszs wrote:
| Interesting. I would try to notify myself when space is getting
| low. But I like the solution because of its simplicity
| loloquwowndueo wrote:
| If you don't have monitoring to tell you when the disk is more
| than X% full, then you're at risk for more failure scenarios than
| just a full disk (usually trivial to buy time by deleting old
| logs).
| ed25519FUUU wrote:
| Happens all of the time even with monitoring. Somebody enables
| debug logging and it fills up in 3 minutes.
| macintux wrote:
| If the problem arises during a migration or other significant
| event, which it sounds like Marco's did, the alert will usually
| be triggered just in time to tell you why you're already in a
| world of pain.
| badcc wrote:
| This trick has certainly saved me more times than I am willing to
| admit! I usually roll with: `fallocate -l 8G
| DELETE_IF_OUT_OF_SPACE.img`
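| A sketch of creating and reclaiming such a ballast file (dd shown
| as a fallback for filesystems without fallocate support):

```shell
# fallocate reserves real blocks near-instantly without writing data;
# truncate -s would only make a sparse file, which reserves nothing.
fallocate -l 8G /DELETE_IF_OUT_OF_SPACE.img \
    || dd if=/dev/zero of=/DELETE_IF_OUT_OF_SPACE.img bs=1M count=8192
# In an emergency, one short command frees 8 GB immediately:
rm /DELETE_IF_OUT_OF_SPACE.img
```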
| innomatics wrote:
| Nice. I liked the trick in the article, but was wondering if it
| might confuse the next admin who needs to figure it out.
|
| Was also thinking of customising the login greeting to mention
| the file (/etc/motd).
| fggg444 wrote:
| it feels like setting your watch 5 minutes fast, it's not a real
| solution
| dylan604 wrote:
| can you do this on smartwatches? I know someone who went to the
| full extreme of a one-hour push, but they did this by setting
| their system time to the next time zone over.
| cmckn wrote:
| This really goes to show, there is more than one way to skin a
| cat. Yeah the guy could probably overhaul his entire approach to
| system administration, but also...this works. Well-placed hacks
| are maybe my favorite thing.
| IncRnd wrote:
| This solution is what creates the problem. If you want warnings
| when the free disk space is low, set-up warnings for when the
| free disk space is low.
| londons_explore wrote:
| Linux has this built in...
|
| By default, only root can use the last 5% of disk space.
|
| That means you can fire up a root shell and know you have a
| buffer of free space to resolve the issue.
| mamon wrote:
| But now we have Docker, which means all the containerized
| workflows will run as root....
| londons_explore wrote:
| I suspect you need to be root in the root user namespace...
| So docker doesn't get this special power...
| cmeacham98 wrote:
| Docker does not use user namespaces by default (and some
| features are unavailable when using them).
| cmeacham98 wrote:
| I believe this is an ext{2,3,4} feature. Unsure if it exists on
| btrfs, zfs, etc.
| tshaddox wrote:
| How is this better than sounding alarms when free disk space
| drops below 8GB? If you're going to ignore the alarms, then
| you're going to have the same problem after you remove your
| spacer file and the disk fills up again!
| [deleted]
| _wldu wrote:
| It requires far less configuration.
| tshaddox wrote:
| How so? You'll presumably need to configure some way to be
| notified when the disk space is full anyway.
| lostcolony wrote:
| It isn't either/or. It's very likely both.
| busterarm wrote:
| Sometimes your alarms are broken due to misconfiguration.
| tshaddox wrote:
| Both solutions assume that you will have some way of knowing
| when the disk is full. Whether the "alarm" is an automated
| health monitoring system or an angry customer calling your
| cell phone, there's no point in discussing how to solve
| problems without assuming that you have some way of knowing
| that a problem exists.
| frenchie14 wrote:
| This would work if you have sufficient time between alarm and
| failure. If some issue or process uses up all of your available
| disk space in a short time span, you won't have that luxury.
| Hopefully the author is using alerts on top of having this
| failsafe.
| zokier wrote:
| Compare the rate an haywire process can fill up disk to your
| response time to alarms, and you got your answer right there.
| tshaddox wrote:
| I don't understand. You will still have an alarm when the
| disk fills up, and you will need to respond and delete the
| spacer file. Your response time latency will be the same,
| right?
| luckylion wrote:
| Okay, so now you have a disk full, only become aware of it
| when it's full and your database throws errors. You have an
| easy way to fix it, just delete the spacer file. But what
| good does that do? You're still in the mess where your
| database is really unhappy.
|
| On the other hand, if your monitoring was set up well, you
| got a notification and had time to react to it _before_ it
| was at 100%.
|
| Granted, if you have a process that just wrote a file at
| maximum speed, that time window is tiny, but that's not
| usually what happens in my experience. What happens is that
| something starts logging more and it slowly builds up while
| you're happy that your server is running so well that you
| don't need to pay attention. And then the alert comes and
| tells you that there's less than 10% space available, and you
| have plenty of time to investigate and avert the crisis.
| vineyardmike wrote:
| >You have an easy way to fix it, just delete the spacer
| file. But what good does that do?
|
| You solve the issue right then and there. Step 1: realize
| there is a space issue and get to a terminal. Step 2: free
| space so any solution has room to work. Step 3: solve by
| doing <??? specifics ???>.
| tshaddox wrote:
| So if your alarm sounded when there was 8 GB of free disk
| space (instead of 0 GB), then you could still respond in
| the same amount of time and you would still have an
| additional 8 GB worth of padding while you determined the
| root cause. The only difference is that you wouldn't need
| to actually go in and delete the spacer file (and
| potentially have downtime in the time it takes you to
| delete the spacer file).
|
| Another way to think of this is that you have the 8 GB
| spacer file, but when the disk fills up the spacer file
| is automatically deleted and your alarm goes off. Which
| is literally the same as having your alarm go off when
| free disk space reaches 8 GB.
| alvarlagerlof wrote:
| Sorry, but I cannot read this at all. Please increase the font
| thickness.
| kaydub wrote:
| Stories like this and my own past memories make me so happy to
| work somewhere big.
| Something1234 wrote:
| I have an empty leader on my hard drive so that I can recover if
| I accidentally nuke the front of it with dd while making a live
| USB. So it's not a bad idea, and it's super effective: so far it
| hasn't been tested, and hopefully it never will be.
| ttyprintk wrote:
| A good reason to partition swap before /boot.
| aidenn0 wrote:
| This won't work with ZFS, as it may be impossible to delete a
| file on ZFS when disk is full. The equivalent in ZFS is to create
| an empty dataset with reserved space.
| throwaway525142 wrote:
| For me, it was possible to truncate -s 0 a large file on a full
| disk with ZFS.
| davemtl wrote:
| A way to prevent this is to create a dataset, reserve n
| amount of space (typically 10-20%), and set it read-only -
| before the pool gets full. Then when the pool fills up, you can
| reduce the reservation to be able to clean up files.
| hikarudo wrote:
| Thanks, I've been wondering about the "proper" way of doing
| this in ZFS and this method hadn't come up in my searches.
| geocrasher wrote:
| For everyone saying "This isn't a real solution!" I'd like to
| explain why I think you're wrong.
|
| 1) It's not intended to be a Real Solution(tm). It's intended to
| buy the admin some time to solve the Real Issue.
|
| 2) Having a failsafe on standby such as this will save an admin's
| butt when it's 2am and PagerDuty won't shut up, and you're just
| awake enough to apply a temp fix and work on it in the morning.
|
| 3) Because "FIX IT NOW OR ELSE" is a thing. Okay, sure. Null the
| file and then fill it with 7GB. Problem solved, for now.
| Everybody is happy and now I can work on the Real Problem: Bob
| won't stop hoarding spam.
|
| That is all.
| tlibert wrote:
| Real Solutions (tm) are indeed nice, but hackers get shit done
| - this is an utterly shameless hack, and I do it myself.
| keeperofdakeys wrote:
| I find that either a server needs more space, or has files that
| can be deleted. For the former you just increase the disk
| space, since most things are VMs these days and increasing
| space is easy. For the latter you can usually delete enough
| files to get the service back up before you start the proper
| cleanup.
|
| If you really need some reserve space (physical server), I'd
| much rather store it in a vg (or zfs/btrfs subvolume). Will you
| remember the file exists at 2am? What about the other admins on
| your team?
| cbo100 wrote:
| > Will you remember the file exists at 2am? What about the
| other admins on your team?
|
| Hopefully if you were doing something like this it would be
| part of your standard incident response runsheet/checklist.
| luckylion wrote:
| > 1) It's not intended to be a Real Solution(tm). It's intended
| to buy the admin some time to solve the Real Issue.
|
| If you don't have monitoring, will you even be aware that your
| disk is filling up?
|
| If you do have monitoring, why are you artificially filling up
| your disk so that it will be at 100% more quickly instead of
| just setting your monitoring up to alert you when it's at
| $whateverItWasSetToMinusEightGB?
| ben509 wrote:
| One argument in favor of it is that the 8 GB file may cause a
| runaway process to crash sooner, so it stops chewing up space
| and you can recover.
|
| A second argument is it's not opened by any process. One
| problem I've had fixing disk full errors was figuring out
| which process still had a file open.
|
| (For any POSIX noobs: the space occupied by a file is
| controlled by its inode. Deleting a file "unlinks" the inode
| from the directory, but an open filehandle counts as a link
| to that inode. Until all links to the inode are deleted, the
| OS won't release the space occupied by the file. Particularly
| with log files, you need to kill any processes that have it
| open to actually reclaim the disk space.)
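| The unlink-vs-open-filehandle behaviour described above can be
| demonstrated without root (Linux-specific; /proc is used to peek
| at the still-open inode):

```shell
#!/bin/sh
tmpfile=$(mktemp)
exec 3>"$tmpfile"                 # hold an open descriptor on the file
dd if=/dev/zero of="$tmpfile" bs=1M count=4 status=none
rm "$tmpfile"                     # the name is gone...
wc -c < /proc/$$/fd/3             # ...but the 4 MiB inode is still there
exec 3>&-                         # closing the fd finally frees the space
```

| This is exactly why restarting (or killing) the process that
| still holds a deleted log file open is what actually returns the
| space.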
| [deleted]
| smarx007 wrote:
| An extra failsafe? You can do both. What if your cron/netdata
| are not forwarding emails for some reason (eg nullmailer gets
| errors from Mailgun)?
| luckylion wrote:
| Right, but again, what good does the spacer file do if
| you're not aware that you're running low on disk space?
| That is: if your monitoring isn't working, how do you know
| that you need to quickly make room?
|
| And if your monitoring is working correctly, the spacer
| file really serves no purpose other than lowering the
| available disk space.
| smarx007 wrote:
| 1. When your DBMS is no longer responding to queries,
| your boss and your customers replace your monitoring
| system (unlimited free phone calls 24/7 included ;). Case
| in point: HN is often a better place to check than Google
| Cloud status page, for example.
|
| 2. Maybe you didn't get it, but "nullmailer not
| forwarding cron email due to mailgun problems" was a bit
| too specific to be an example I just made up, wasn't it?
| Again, the premise "if your monitoring is working
| correctly" is not a good one to base your reasoning upon.
| Especially if you have 1 VM (VPS) and not a whole k8s
| cluster with a devops team with rotational on-call
| assignments.
| CJefferson wrote:
| The reason was, I thought, discussed in the article.
|
| When you actually fill up your disc, many linux commands
| will simply fail to run, meaning getting out of that
| state is extremely difficult. Deleting the file means you
| have room to move files around / run emacs / whatever, to
| fix the problem.
| pvorb wrote:
| Somebody _will_ notify you. If the service is just for
| yourself, you don 't need monitoring at all.
| luckylion wrote:
| Yes, yes, but they will notify you _after_ your service
| is down (because that's when they notice), in part
| thanks to a spacer file that eats up available disk space
| without being of any use. A monitoring service would
| notify you _before_ your service is down and users grab
| pitchforks and start looking for torches.
|
| I understand the benefit to be able to quickly delete
| some file to be able to run some command that would need
| space, though I find that highly theoretical. If it's
| your shell that requires space to start, you won't be
| able to run the command to remove the spacer, and once
| you're in the shell, I've never found it hard to clean up
| space; path autocompletion is the only noticeable victim
| usually. And at this point, the services are down anyhow,
| and you likely don't want to restart them before figuring
| out what the problem was, so I don't see the point of
| quickly being able to make some room.
|
| It feels like "having two flat tires at the same time is
| highly unlikely, so I always drive with a flat tire just
| to make sure I don't get an unforeseen flat tire". It's
| cute, but I'd look for a new job if anyone in the company
| suggested that unironically.
| ineedasername wrote:
| Because even if you have monitoring, some unforseen issue
| rapidly eating disk space at 3:00 am may not give you the
| time to solve it without downtime or degraded performance
| unless you can _immediately_ remove the bottleneck while you
| troubleshoot.
| tshaddox wrote:
| Then why not automate the removal of the 8 GB spacer file
| when the disk gets full? Or in other words, just sound your
| alarms when there is 8 GB of free disk space.
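| A minimal cron-able sketch of that idea (the path, the 90%
| threshold, and the use of logger are placeholder assumptions,
| not anything from the thread):

```shell
#!/bin/sh
# Release the spacer file automatically once disk usage crosses a
# threshold. SPACER and THRESHOLD are assumed values for illustration.
SPACER="${SPACER:-/var/spacer.img}"
THRESHOLD="${THRESHOLD:-90}"   # percent used

# GNU df: print just the "use%" column for the spacer's filesystem
usage=$(df --output=pcent "$(dirname "$SPACER")" | tail -n 1 | tr -dc '0-9')

if [ "$usage" -ge "$THRESHOLD" ] && [ -f "$SPACER" ]; then
    rm -f -- "$SPACER"
    logger -t spacer "released $SPACER at ${usage}% disk usage"
fi
```

| The same check, pointed at a lower threshold, doubles as the
| "alarm when only 8 GB is free" version.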
| ineedasername wrote:
| I actually suggested exactly that in another comment,
| though to do it in stages: 4 GB with an alarm, then more
| alarms and the other 4 GB if not resolved.
| marricks wrote:
| Monitors can fail, you can miss an email, etc etc etc
|
| There's always a big gap between what should never happen
| because you planned well and what does happen
| apocalyptic0n3 wrote:
| Besides runaway log files that aren't being properly rotated,
| human error can cause it too. I managed to completely eat up
| the disk space of one of our staging servers a few weeks ago
| trying to tar up a directory so I could work on it locally.
| Didn't realize the directory was 31GB and we only had 25GB of
| space. By the time the notification for 80% usage was
| triggered (no more than 2 minutes after we hit 80%), the
| entire disk was full. Luckily it was just a staging server
| and no real harm was done, but such a mistake could have just
| as easily been made on a production server. In this case, the
| obvious solution is to just delete the file you were creating
| but if you're running a more complicated process that is
| generating logs and many files, it may not be so easy and
| this 8GB empty file might be useful after you cancel the
| process.
| tehjoker wrote:
| This reminds me of the reserve tank toggle on some motorcycles.
| When you run out of gas, you switch the toggle and drive
| directly to a gas station.
| hprotagonist wrote:
| The bikes I've had that have had reserve tanks have also been
| old enough to raise the disconcerting follow-on question,
| which is: "is the reserve gas also full of sludgey crap
| that's settled in the tank and hasn't been disturbed really
| in a year, and am i about to run that through my poor carbs?"
| jedberg wrote:
| My friend had a truck with a reserve tank, but it was the
| same size as the main tank, so he would just flip the
| switch at every fill up to make sure they both got used.
| rcthompson wrote:
| Motorboat fuel tanks have a reserve as well. It's just a
| raised area that splits the bottom of the tank into 2
| separate concave areas. One of the concave areas contains the
| end of the fuel line, and the other doesn't. When you run out
| of gas, you tip the tank up to dump the remaining gas from
| the other basin into the main one, and then you restart the
| engine (or keep it from stopping at all if you're quick
| enough on the draw) and head for the docks.
| 0_____0 wrote:
| always fun when you're barreling down the highway and the
| engine starts to lean out, prompting you to hurriedly locate
| and switch the petcock over before the engine stalls
| completely.
|
| suppose then that you go fill up and forget to set the
| petcock back to normal. 8ball says: "I see a long walk in
| your future."
| geocrasher wrote:
| I once put a new fuel pump in a Chevy pickup with two tanks
| on the side of the road because I was switched to the empty
| tank. Good times.
| jessaustin wrote:
| IME it doesn't take too many hikes to learn that part of
| the procedure for turning off the engine is "turn the fuel
| switch off reserve".
| 0_____0 wrote:
| out of years of riding it's only happened to me a couple
| times.
|
| one time i was eastbound on the bay bridge when my bike
| started to sputter. i'd just reassembled the tank and had
| left the screw-style reserve fuel valve open, so there
| was no reserve fuel to be had. a very kind lady put her
| blinkers on behind me and followed as i coasted the last
| few hundred yards toward yerba buena island.
|
| i pushed my bike up the ramp and looked in the tank to
| assess. it's a dirtbike, so the tank has two distinct
| "lobes" to accommodate the top tube of the frame. I had a
| few ounces in the tank but they were not in the lobe with
| the fuel pickup, so i dumped the bike on its side to get
| the fuel to slosh over to where i wanted it.
|
| i got back on the highway and, going quite slowly and
| gently, managed to get to the gas station at west oakland
| bart, the engine leaning out and sputtering right as i
| rolled into their lot.
| dnautics wrote:
| Surprised there isn't a mechanism that mechanically
| switches the petcock over when you put a fuel nozzle up to
| the port
| mplewis wrote:
| Most motorcycles with a manual petcock are very manual in
| nature. Often this is to minimize the number of moving
| parts that could die on you if you take it into rural
| areas. An automatic petcock adds more complexity that
| could cause a malfunction.
| xkcd-sucks wrote:
| Typically there aren't two separate tanks - In one tank
| there are two tubes at different heights. As the fuel
| level falls below the height of the "main" tube the
| engine sputters, then turning the petcock engages the
| lower down "reserve" tube which is still below the fuel
| level. It's more of a warning than a true reserve, and
| most bikes with an actual fuel gauge don't have a
| reserve.
| quesera wrote:
| Most motorcycles are surprisingly manual. This was
| originally a necessity (like in cars), but remains
| aesthetically preferable for many riders.
|
| OTOH, Honda Goldwings have stereo systems. They might
| grow an automatic fuel reserve switcher-backer someday
| too. :)
| abruzzi wrote:
| Fuel injected motorcycles don't have reserve (at least,
| none that I've seen.) instead they have low fuel lights
| or full fuel gauges. I'm guessing it's because the fuel
| pumps are in the tank and the fuel injection system needs
| high pressure.
| ericbarrett wrote:
| Fuel injectors require filtered gas because even small
| particles can clog them, and said filter is more likely
| to be clogged or even compromised by sucking up the last
| drops of fuel (and scale and debris) in the tank, so the
| low-fuel warning is required.
|
| Carb jets can get clogged, too, but are wider since
| they're not under as much pressure. Also, since they're a
| wear item they're a lot easier to clean and/or replace.
| names_are_hard wrote:
| Many new bikes come with a lot of rider aids for safety
| (ABS, TCS) as well as all kinds of electronics (fuel
| maps), so this is changing. But of course manual
| transmissions won't go away until bikes are electric.
|
| I am one of those who likes things old school. My bike
| still has a carburetor, has no fuel light or tachometer,
| and I have certainly had some practice reaching down to
| turn the fuel petcock to reserve while sputtering on the
| highway. If they didn't intend for me to do that, why did
| they put it on the left side? :)
| driverdan wrote:
| Some newer bikes, like mine, don't have a reserve
| petcock. They have a low fuel light. No forgetting about
| the petcock and an obvious warning light instead of
| sputtering.
| bigiain wrote:
| Some older bikes, like my '99 Ducati Monster, don't have
| a petcock. It has a low fuel light that first failed in
| around 2002, and for which that part that fails (the in-
| tank float switch) stopped being available in about 2015
| or so. No petcock _or_ warning light. (And that trip
| where the speedo cable failed so I couldn't even use the
| trip meter to estimate fuel requirements was a fun
| one...)
| dilyevsky wrote:
| Setup proper monitoring and never get to the Real Issue to
| begin with. These sysadmin hacks are not helpful
| berkes wrote:
| "proper monitoring" is extremely broad. And, I would say,
| an almost unreachable goal.
|
| You have it mail you when it goes over 80% disk usage (and
| what if you are on holiday)? Does it mail all colleagues? Who
| picks it up (I thought Bob picked it up, but Bob thought Anne
| picked it up. So no one did)? Does it come and wake you in
| person when it reaches 92%?
|
| Will this catch this async job that fails (but should never)
| in an endless loop but keeps creating 20MB json files as fast
| as the disk allows it to?
|
| Is it alerting that finds anomalies in trends? Will it be
| fast enough for you to come online before that job has filled
| the disk?
|
| I've been doing a lot of hosting management and such. And
| there is one constant: all unforeseen issues are unforeseen.
| geocrasher wrote:
| > I've been doing a lot of hosting management and such. And
| there is one constant: all unforeseen issues are
| unforeseen.
|
| I work in hosting too, and have been for a long time. I
| feel ya.
| dilyevsky wrote:
| Slack warning/ticket at 75%, page at 85% (to oncall
| obviously). Don't let user workload crap into your root
| partition. I've been doing this for over 10 years and
| managed many thousands of nodes and literally don't recall
| full disk problem unless it was in staging somewhere where
| monitoring was deliberately disabled.
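| Sketched out (the notification commands are placeholders to
| fill in; the 75/85 thresholds are the ones from this comment):

```shell
#!/bin/sh
# Tiered disk alerting: warn at 75% usage, page the on-call at 85%.
WARN=75
PAGE=85

# POSIX `df -P` prints one line per filesystem; field 5 is "use%".
df -P | awk 'NR > 1 { gsub(/%/, "", $5); print $5, $6 }' |
while read -r pct mount; do
    case "$pct" in *[!0-9]*|'') continue ;; esac   # skip pseudo-filesystems
    if [ "$pct" -ge "$PAGE" ]; then
        echo "PAGE: $mount at ${pct}%"   # hook into your pager here
    elif [ "$pct" -ge "$WARN" ]; then
        echo "WARN: $mount at ${pct}%"   # e.g. post to a Slack webhook
    fi
done
```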
| geocrasher wrote:
| In a perfect world, this is true. But we don't have one of
| those.
| [deleted]
| solidasparagus wrote:
| This is one of those great solutions where they got 90% of the
| value of the Real Solution(tm) with 5 minutes of work.
| diego wrote:
| This points to a much more serious problem. This is 2021 and the
| technology is from the 90s, with a really poor user experience
| design. Your car warns you when you're low on fuel, but your
| server doesn't if you're low on critical resources.
| dmingod666 wrote:
| Exactly, it's 1990s 'cool' - in the time it took him to
| write the blog post, he could have written a script that
| would send him updates on all his devices...
| ineedasername wrote:
| There's no reason not to have multiple fail-safes. Receiving
| the alert on a device at 3am would still mean he could free
| up 8gb immediately and have breathing room to solve the
| problem. And remember this is for a single admin. Asking such
| a person to be on call 24-7 all year, vacations, holidays,
| weekends... Having a quick way to get breathing room can
| significantly reduce the stress & cognitive load of worrying
| about such things in your off-time.
| dmingod666 wrote:
| He didn't mention he has alerts. Sure, if alerts are your
| first line of defense, this is a nice thing to do.
| rozap wrote:
| Everyone has this kind of alerting set up, but that's not the
| point. The beauty of this solution is that it's dead simple and
| will never fail. Alerting can fail or be ignored.
|
| It's the same as old VW beetles which had a reserve gas tank.
| When you ran out of gas you opened a valve and you could limp
| to a gas station. Less likely to fail versus a 1950's era gauge
| that is telling you you're low. Also impossible to ignore it.
| dmingod666 wrote:
| The 'beauty' artificially chokes your disk and produces the
| same problems that you are trying to avoid... not a sane way
| to proactively manage your disk usage.
| goatinaboat wrote:
| _It's the same as old VW beetles which had a reserve gas
| tank. When you ran out of gas you opened a valve and you
| could limp to a gas station_
|
| In scuba diving there used to be "J-valves". When you had 50
| bar left in the tank they would cut out. Then you would
| pull a lever to re-enable your air and return to the surface.
| Unsurprisingly they are no longer popular.
| mypalmike wrote:
| Same was true of most motorcycles until rather recently,
| though with motorcycles it was rare that there was a fuel
| gauge at all. A sputtering engine was how you knew it was
| low. And I believe that like with motorcycles, the "reserve
| tank" in an old Beetle is really the same tank - there are
| two hoses located in the tank at different heights.
| copperfoil wrote:
| > The beauty of this solution is that it's dead simple and
| will never fail. Alerting can fail or be ignored.
|
| It's not that straightforward IMO. Would this file be deleted
| before the space is filled? If so, there is alerting in
| place, and it assumes there's a way to delete files before
| space fills up. If this file is deleted after space fills up,
| how is this different from not having the file, other than
| making finding files to delete easier? Then what happens
| after that? If you delete the file and realize there's
| nothing else to delete, you'd have to solve the problem the
| same way if you didn't use this method.
| vineyardmike wrote:
| >you'd have to solve the problem the same way if you didn't
| use this method.
|
| What if the solution required some amount of free space?
| (eg. installing a package or swap)
| mtone wrote:
| Assuming we're talking about VMs (2021 etc.), for an SME is
| there any downside to giving 2TB of space to your disks and
| letting dynamic allocation do the work?
|
| Perhaps consolidate/defrag once a year. Even monitoring total
| usage more often than that is probably not worth the effort -
| just buy ample cheap storage.
|
| Also, there was a tradition to split drives into OS, DB, DB
| Logs. That was mostly a rust performance thing and these days
| is probably just voluntary management overhead.
|
| RAM is another story.
| jodrellblank wrote:
| If you are using less space than the underlying datastore,
| there's no benefit to dynamic allocation, you may as well
| give the servers larger fixed disks. If you are thinking that
| one server might need more than the fixed size for a sudden
| growth, then you need to be monitoring to deal with that
| because that will run out of your space. If you are
| overprovisioning the datastore, you have the same problem at
| a level lower, and need to be monitoring that and alerting
| for that instead (as well).
|
| > " _just buy ample cheap storage_ "; " _That was mostly a
| rust performance thing and these days is probably just_ "
|
| In the UK a 6TB enterprise rust disk is £150 and a 2TB
| enterprise SSD is £300, it's 6x the price to SSD everything,
| and take 3x more drive bays so add more for that. And you can
| never "just" buy more storage than you ever need - apart from
| the obvious "when you bought it, you thought you were buying
| enough, because if you thought you needed more you would have
| bought more", so that amounts to saying "just know the future
| better", but it can't happen because Parkinson's Law ("work
| expands so as to fill the time available for its completion")
| applies to storage, the more there is available, the more
| things appear to fill it up.
|
| Room for a test restore of the backups in that space. Room
| for a clone of the database to do some testing. Room for a
| trial of a new product. Room for a copy of all the installers
| and packagers for convenience. Room for a massive central
| logging server there. What do you mean it's full?
| qw3rty01 wrote:
| One VM using excessively more disk space than it's supposed
| to can potentially cause data corruption in all the other VMs
| on that system. For just spinning VMs up and down for
| testing, you probably won't run into that issue, but on a
| production system, it could potentially cause some massive
| downtime
| cure wrote:
| Virtual machine disk space (e.g. Xen, Linode, AWS EC2, or
| similar) does not work this way. Each VM gets a dedicated
| amount of disk space allocated to it, they don't all share
| a pool of free space.
| jodrellblank wrote:
| Yes they do with the "dynamic allocation" the parent
| comment mentions; VMware datastore has 1TB total, you put
| VMs in with dynamically expanding disks they are sharing
| the same 1TB of free space and will fill it if they all
| want their max space at the same time and you've
| overprovisioned their max space.
|
| And if you haven't overprovisioned their max space, you
| may as well not be using dynamic allocation and use fixed
| size disks.
|
| Even then, snapshots will grow forever and fill the
| space, and then you hope you have a "spacer.img" file you
| can delete from the datastore, because you can't remove
| snapshots when the disk is full and you're stuck. It's
| the same problem, at a lower level.
| cure wrote:
| I see, a VMware feature, thanks for clarifying. I suppose
| it's a nice idea in theory, but you'd have to be crazy to
| use that in production, or for any workload that you care
| about. It would just be a ticking time bomb.
| Yizahi wrote:
| Alerting is also a hack really. In 2021 the operating SYSTEM
| should work as a system - intelligently managing its
| resources and making decisions. Ideally the OS should
| dynamically reserve as much resources as needed on its own.
| copperfoil wrote:
| Linux servers aren't like mass consumer products. It's assumed
| users know what they're doing and can build and configure what
| they need on top of it.
|
| > This is 2021 and the technology is from the 90s
|
| I don't see how this is a valid point. Is integrated circuit
| technology outdated because it was developed in the 60s?
| lostcolony wrote:
| Your car also doesn't drop from "alarm" to "empty" in thirty
| seconds. A disk on a VM with a badly behaving process can.
| tyingq wrote:
| Careful how you create it. Several ways to create large files
| can make a sparse file, which won't actually free any space
| when you remove it later.
| zepearl wrote:
| I think that in the past I saw that when creating a file with
| e.g. ... dd if=/dev/zero of=deleteme.file bs=1M
| count=8192
|
| ...the "free space" shown by "df" slowly decreased while the
| file was being created, but then once the operation completed
| that "free space" magically went back to its original value =>
| the big existing file (full of "0"s) was basically not using
| any storage.
|
| Is this what you mean?
|
| I just tried to replicate this behaviour but, dammit, I cannot
| demonstrate that right now as the behaviour so far was the
| expected one (free storage decreasing when creating the file
| and sticking to that even after the completion of the
| operation).
|
| I strongly believe that that's what I saw in the past (when I
| was preallocating image files to then be used by KVM VMs), but
| now I'm wondering if I'm imagining things... :P
|
| EDIT: this happened when using ext4 and/or xfs (don't remember)
| without using any compression.
| tyingq wrote:
| dd will create sparse files if you use the seek option, like:
| dd if=/dev/zero of=a_sparse_file bs=1 count=0 seek=8G
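| A quick way to see the difference (file names here are just
| for illustration): compare apparent size against allocated
| blocks. The spacer only buys you anything if the two match.

```shell
# truncate (or dd with seek=) produces a sparse file: a large
# apparent size but no blocks actually allocated.
truncate -s 100M sparse.img

# fallocate (or dd with a real count=) allocates the blocks.
fallocate -l 100M solid.img

ls -lh sparse.img solid.img   # both *look* like 100M...
du -h  sparse.img solid.img   # ...but only solid.img occupies 100M
```

| Deleting sparse.img frees essentially nothing, which is the
| trap being described here.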
| beervirus wrote:
| > even deletions can fail or take forever
|
| > in a moment of full-disk crisis I can simply delete it and buy
| myself some critical time to debug and fix the problem
|
| Uhh...
| cwt137 wrote:
| In theory, this is a good idea, but doesn't protect you in all
| cases. I have had instances on a few of my application servers
| where an event happened that dumped GB's worth of log data to the
| log files in a matter of a couple of minutes and filled up the
| drive (Thanks fast SSDs!). If I employed the strategy in the
| article, it would have only bought me a couple of more minutes
| worth of time, if that!
| mgarfias wrote:
| Why not keep an eye on the disk and expand the fs before it goes
| south?
| jbverschoor wrote:
| Because sometimes it can fill up quite quickly. And the extra
| space will give you the headroom you need. Will definitely do
| this on all my servers.
| qwertox wrote:
| Yes. I like the idea very much and also think that I will do
| this with a couple of my machines.
| terramex wrote:
| Ah, the classic 'speed-up loop' approach:
| https://thedailywtf.com/articles/The-Speedup-Loop
|
| About the blogpost itself:
|
| _The disk filled up, and that's one thing you don't want on a
| Linux server--or a Mac for that matter. When the disk is full
| nothing good happens._
|
| I had this happen a few times on a Mac and every time I was
| shocked that if the disk gets full you cannot even delete a
| file and the only option is to do a full system reboot. I was
| also unable to save
| any open file, even to external disk and suffered minor data loss
| every time due to that.
|
| What is the proper way of dealing with such issue on macOS? (or
| other systems, if they behave the same way)
| ghostly_s wrote:
| MacOS has given the user nagging "startup disk is almost
| full" prompts for as long as I can remember, yours doesn't?
| dylan604 wrote:
| And users have been ignoring that message as long as MacOS
| has been giving them. Maybe even longer
| whartung wrote:
| _I had this happen a few times on a Mac and every time I was
| shocked that if the disk gets full you cannot even delete a
| file and the only option is to do a full system reboot. I was
| also unable to save any open file, even to external disk and
| suffered minor data loss every time due to that._
|
| This just happened to me. I got the best error message I've
| ever seen. Something akin to "Can not remove file because the
| disk is full." This wasn't from the Finder, this was command
| line rm.
|
| On the Mac it's also exacerbated by the fact that swap will use
| the system drive and can fill up the disk, and can not be
| stopped. If you have some rogue process consuming RAM, among
| other things, your disk will suffer until it is full. And, as
| mentioned, macOS does not behave well with a full disk.
|
| And, even if you've remedied the swap issue (i.e. killed the
| process), there's no way I know to recover the swap files
| created without restarting.
|
| Just seems like the design is trouble waiting to happen, and it
| has happened to me.
|
| When this last happened, somehow it managed to corrupt my
| external Time Machine volume.
| TheAdamAndChe wrote:
| I don't know about Mac, but this is why many Linux distros
| recommend putting /home on a separate partition. If it
| fills, it won't lock up the whole system.
|
| Fun story with this. Ubuntu now has an experimental root-on-zfs
| feature. I installed it and started playing with some docker
| containers, trying to compile a certain version of pytorch.
| Suddenly, my computer crashed. Apparently, my root partition
| filled because docker installed everything on the same
| partition as my OS, crashing everything immediately.
| davidelettieri wrote:
| I always thought that database files should be on a different
| drive from the os. If the db fills up the HD, the os is still
| running smoothly.
| lazyweb wrote:
| Yep, ideally you'd have separate partitions for /var, /tmp,
| /home, root, any application/db data ..
| monksy wrote:
| To prevent the root fs from filling up. That's why I always
| keep home+var+opt on partitions separate from the root partition.
| anonymousisme wrote:
| One thing that many Linux/Unix users do not know is that all
| commonly used filesystems have a "reserved" amount of space to
| which only "root" can write. The typical format (mkfs) default is
| to leave 5% of the disk reserved. The reserved space can be
| modified (by root) any time, and it can be specified as a block
| count or a percentage.
|
| As long as your application does not have root privileges, it
| will hit the wall when the free+reserved space runs out. Instead
| of the clumsy "spacer.img" solution, one could simply
| (temporarily) reduce the reserved space to quickly recover from a
| disk full condition.
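| On the ext2/3/4 family that looks something like this
| (/dev/sda1 is a placeholder device; both commands need root):

```shell
# Inspect the current reserve (the mkfs default is 5% of blocks):
tune2fs -l /dev/sda1 | grep -i 'reserved block count'

# In a disk-full emergency, shrink the reserve to 1%, instantly
# handing roughly 4% of the disk back to non-root processes:
tune2fs -m 1 /dev/sda1

# ...and restore the default once things are cleaned up:
tune2fs -m 5 /dev/sda1
```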
| reph2097 wrote:
| Of course the application is running as root, duh.
| gkarthik92 wrote:
| So far, my first stop to temporarily get more disk space was to
| reduce the size of the swapfile which on a lot of servers seems
| to be allotted >1x the requirement.
|
| Will be switching to this hack! Perfect illustration of the KISS
| principle (Keep it simple, stupid).
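| For reference, the shrink-the-swapfile dance on a typical
| Linux box looks roughly like this (the path and sizes are
| placeholder values; it needs root, plus enough free RAM to
| absorb whatever is currently swapped out):

```shell
swapoff /swapfile            # page everything back into RAM
rm /swapfile                 # drop the oversized file
fallocate -l 4G /swapfile    # recreate it smaller (fine on ext4;
                             # some filesystems need dd instead)
chmod 600 /swapfile          # swap files must not be world-readable
mkswap /swapfile             # write a fresh swap signature
swapon /swapfile             # and update /etc/fstab if needed
```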
| bluedino wrote:
| Useful for people who still do 2x physical memory and have
| a server with 64+ GB.
| rags2riches wrote:
| Back in my early university days the disks always seemed to be
| full at inconvenient times on the shared Unix systems we used.
| Some students resorted to "reserving" disk space when available.
| Which of course made the overall situation even worse.
| Saris wrote:
| It's interesting to me that linux doesn't natively reserve a
| little space to allow basic commands like directory listing and
| file deletion to function even with a full disk.
|
| Because really the biggest problem when I've had a partition get
| full, is I sometimes can't even delete the offending log file.
| myself248 wrote:
| It still boggles my mind that the act of deleting a file can
| fail because it requires space to perform the act.
|
| If y'ask me, that's a fundamental design flaw. Of course nobody
| asked me...
| chungy wrote:
| Depends entirely on the design of the file system. In copy-
| on-write file systems, it's a necessity: you need to at least
| allocate a new metadata block that doesn't record the
| existence of some file... and that's assuming you don't have
| snapshots keeping it allocated anyway.
|
| You can run into real trouble on btrfs if you fill it; it has
| no reserve space to protect against this scenario. ZFS at
| least reserves a fraction of the total space so that deletes
| are allowed to work even when the pool reaches 100% capacity.
| SethTro wrote:
| Same idea as this game development legend
|
| https://www.dodgycoder.net/2012/02/coding-tricks-of-game-dev...
|
| > he had put aside those two megabytes of memory early in the
| development cycle. He knew from experience that it was always
| impossible to cut content down to memory budgets, and that many
| projects had come close to failing because of it. So now, as a
| regular practice, he always put aside a nice block of memory to
| free up when it's really needed.
| Cerium wrote:
| In my work it is very common to make the memory map a little
| smaller than it has to be. If you can't ship an initial version
| in a reduced footprint you will have no hope of shipping future
| bugfixes.
| 295310e0 wrote:
| If true, I hate that story. Think of the better art assets that
| were needlessly left behind. How is it that said block of
| memory had never been identified by any profiling?
| emmab wrote:
| If it would be detected by profiling that does make the
| technique asymmetric in that it would only stick around if
| nobody profiled to find it.
| hinkley wrote:
| Or if you didn't have an understanding with the sort of
| people who would run the profiler...
| usefulcat wrote:
| > Think of the better art assets that were needlessly left
| behind.
|
| Consider how long it takes to edit or recreate art assets to
| reduce their size. Depending on the asset, you might be
| basically starting over from scratch. Rewriting code to
| reduce its size is likely to be an even worse option,
| introducing new bugs and possibly running slower to boot. At
| least smaller, simpler art assets are likely to render
| faster.
|
| This is also the kind of problem that's more likely to occur
| later in the schedule, when time is even more scarce. Between
| these two factors (lack of time and amount of effort required
| to get art assets which are both decent looking and smaller),
| I think in practice you're actually more likely to get better
| quality art assets by having an artificially reduced memory
| budget from the outset.
| _carbyau_ wrote:
| I see it as a "Choose your problem." affair.
|
| 1. Deal with possibly multiple issues possibly involving
| multiple people with the politics that entails resulting in a
| lot of stress for all involved as any one issue could render
| it a complete failure.
|
| 2. Have extra space you can decide to optimise if you want.
| You could even have politics and arguments over what to
| optimise, but if nothing happens it all still works so there
| is a lot less stress.
|
| I pick 2.
| Bost wrote:
| There's a difference between "The server is not responding
| right now. We're losing customers.", and "Low resources during
| product development". Actually the latter may be a case of
| enforcing premature optimization. So no, it's not the same
| idea.
| smarx007 wrote:
| I think we are thinking of a different baseline. You are
| thinking along the lines of "this should run, we can reduce
| server costs later", I would suggest (if I may) "the app
| needs to run on any Android device with 2GB RAM". And then
| you develop a game to run on a 1.5GB RAM phone, expecting
| that it will eventually fit into 2GB RAM budget.
| benhurmarcel wrote:
| https://thedailywtf.com/articles/The-Speedup-Loop
| pjmorris wrote:
| I'd read in 'Apollo: Race To The Moon', Murray and Cox, that
| the booster engineers had done something similar with their
| weight budget, something the spacecraft engineers wound up
| needing. Contingency funds of all sorts are a great thing.
| xen2xen1 wrote:
| I will be doing this. Marvelous idea.
| gfody wrote:
| having an 8gb file you know you can delete isn't really all that
| helpful if everything has already gone disk-full-fracked. you
| should really have an alarm on free space, especially if you're
| an indie.
| kelnos wrote:
| Sure, but sometimes the disk filling up is caused by something
| runaway and fast. If your "60% full" alarm goes off and the
| disk fills up 2 minutes later, you're still stuck.
|
| With a "ballast file" (as another commenter termed it), you can
| decide exactly when processes get to start consuming disk
| again, and that can give you some headroom to fix the problem.
| liaukovv wrote:
| I'm sorry for meta comment, but this font is barely readable.
| nielsbot wrote:
| It's pretty light. Use Reader mode?
| davidcollantes wrote:
| Not just the font of choice, but the formatting (hanging
| indent?). It makes it harder to read when there is little
| volume of text.
| patrickserrano wrote:
| Glad I wasn't the only one that found the indented paragraphs
| made it difficult to read.
| kchoudhu wrote:
| This is why I insist on data and root partitions on all the
| machines I administer. Go ahead and kill the data partition, at
| least the root partition will keep the system up and running.
| rgj wrote:
| In the mid nineties I worked in a research institute. There was a
| large shared Novell drive which was always on the verge of full.
| Almost every day we were asked to clean up our files as much as
| possible. There were no disk quotas for some reason.
|
| One day I was working with my colleague and when the fileserver
| was full he went to a project folder and removed a file called
| balloon.txt which immediately freed up a few percent of disk
| space.
|
| Turned out that we had a number of people who, as soon as the
| disk had some free space, created large files in order to reserve
| that free space for themselves. About half the capacity of the
| fileserver was taken up by balloon.txt files.
| twistedpair wrote:
| I worked at a large company during a migration from Lotus to
| Outlook. We were told we'd get our current Lotus email storage +
| 100MiB as a new email quota limit under Outlook.
|
| I made a bunch of 100MiB files of `/dev/random` noise (so they
| don't compress, compressed size was part of the quota) and
| emailed them to myself before the migration, to get a few GiB
| of quota buffer.
|
| My co-workers were constantly having to delete old emails in
| Outlook to stay under quota, but not me. I'd just delete one of
| my jumbo attachment emails, as needed. ;)
| ABeeSea wrote:
| Email quotas aren't just a cost thing. It forces deletion of
| files/communications that aren't relevant anymore. The last
| thing the legal department wants is some executive's laptop
| with 10 years of undeleted email make its way to
| discovery.
| easton wrote:
| Then why not just tell Exchange to delete any emails older
| than 5 years (or whatever your lawyers tell you to put)?
| gowld wrote:
| You had a community-driven quota system.
| manifoldgeo wrote:
| Sounds like they had a tragedy of the commons, too haha
| agumonkey wrote:
| is there a model to solve these ?
| Dylan16807 wrote:
| Partly that, partly the opposite.
|
| It's basically reserving part of the disk for very
| important things only, which scares off less important
| uses. Like making the commons seem more polluted than it
| actually is to get some action taken.
|
| If those files weren't there, the space would probably fill
| up, but now without any emergency relief valves.
|
| It would be better if these files were a smaller fraction
| of space and had more oversight... but that's just a quota
| system. This is something halfway in between real quotas
| and full-on tragedy of the commons.
| the-rc wrote:
| At the opposite end, I heard a story of actually full storage
| from the beginning of the century, when I worked at a "large
| medical and research institution in the Midwest". They had
| expensive SMB shares (NetApp?) that kept getting full all the
| time. So they did the sane thing in the era of Napster: they
| started deleting MP3 files, with or without prior warning.
| Pretty soon, they got an angry call that music could not be
| played in the operating room. Oops. Surgeons, as you can guess,
| were treated like royalty and didn't appreciate seeing their
| routines disrupted.
| tapland wrote:
| D: I like getting to listen to something in the MRI though.
| bane wrote:
| This is a surprisingly common hoarding behavior among humans
| using scarce resources. In technology you see it everywhere,
| virtualization infrastructure, disk storage, etc.
|
| This is actually kind of clever. How the tribal knowledge for
| how to "reserve space" was developed and disseminated would be
| pretty interesting to study.
| jachee wrote:
| In Pittsburgh, it's evolved into the parking chair.
|
| https://en.wikipedia.org/wiki/Parking_chair
| tomrod wrote:
| This is very common in F500 companies. It's also a symptom of
| dysfunction.
| victor9000 wrote:
| Perfect example of the tragedy of the commons. If individuals
| don't create these balloon files then they won't be able to use
| the file server when they need it, yet by creating these
| balloon files the collective action depletes the shared
| resource of its main function.
| Rule35 wrote:
| It's a decentralized implementation of a quota system.
|
| By slowly releasing supply you prevent anyone having to self-
| regulate (which requires unreasonable deprivation, OR global
| knowledge) and everyone bases their decisions off of the only
| global signal, free space.
| [deleted]
| njovin wrote:
| This is similar to how some government agencies retain their
| budgets.
|
| At the end of the budget period they've only spent 80% of
| their allocated budget, so they throw out a bunch of
| perfectly good equipment/furniture/etc. and order new stuff
| so that their budget doesn't get cut the following year,
| rather than accepting that maybe they were over-budgeted to
| begin with.
|
| Rinse, repeat, thus continuing the cycle of wasting X% of the
| budget every year.
| ChuckMcM wrote:
| Okay, that is hilarious.
|
| I use some scripts that monitor disk space, and monitor disk
| usages by "subsystem" (logs, mail, services, etc) using Nagios.
| And as DevOps Borat says, "Disk not full unless Nagios say
| 'Disk is full'" :_) Although long before it is full it starts
| warning me.
|
| It doesn't go off very much, but it did when I had a bunch of
| attacks on my web server that started core dumping and that
| filled up disk reasonably quickly.
|
| Back in the day we actually put different things in different
| partitions so that we could partition failures but that seems
| out of favor with a lot of the distros these days.
| tinus_hn wrote:
| Same thing happens with floating licenses, if they are too
| scarce, people open the program first thing in the morning
| 'just in case' and keep a license reserved all day.
| legulere wrote:
| Seems like a good example how private property and therefore
| capitalism can be seen as the same as a tragedy of the
| commons.
| [deleted]
| dmingod666 wrote:
| What?
| michaelt wrote:
| If it's morally incorrect to get to the license server
| early and snatch a roaming license you might not use to
| its fullest, is it not also wrong to access the property
| market early and snatch a bunch of land you might not use
| to its fullest?
| jsmith99 wrote:
| The difference is that market mechanisms are meant to
| allocate items to those that value them the most (assumed
| to be the same as their willingness to pay). Claiming a
| licence by contrast has little cost.
| Kaze404 wrote:
| I think a homeless person values an empty house way more
| than a real estate owner.
| ipaddr wrote:
          | It is unhealthy group behaviour. Your actions reduce the
          | group's access. By sharing you increase group access and
| group success.
|
| When you buy a property for speculation you take a risk
| on your ability to increase market value (or outside
| reasons). You could lose money.
|
| Hoarding group resources makes you a net negative in your
| group. Hoarding property alone means you had to create or
| borrow enough value to obtain the property and finance
| upkeep. Someone has the money to build a new home. That
| transaction is a net positive.
| buffet_overflow wrote:
| I still know people that pre-emptively buy toilet paper
| "because hoarders might buy the rest of it" with absolutely
| no introspection.
| woko wrote:
| Hoarders will buy the rest of it.
|
| During the 2-month lockdown a year ago, I would purchase
| 4 frozen pizzas at a time when I had a chance to buy
| them, because I was so upset that I could not buy one
| when I wanted a single one during the 2 previous weeks,
| because of hoarders who had been faster than me.
|
        | People think of toilet paper, but it is not just that.
| Pasta, rice, flour, yeast, plenty of useful things went
| missing due to hoarders during lockdown. Thankfully, I
| don't eat meat, because there was no meat at the
| supermarkets. Fish was also hard to find in the frozen
| shelves of the supermarkets. I am not a big fish eater
| either, so it was fine for me as well, but you get the
| idea: if you don't hoard a tiny bit, you get increasingly
| frustrated because hoarders will hoard, and you will have
| to wait for weeks before you get the chance to have what
| you want.
| _carbyau_ wrote:
| I noticed a weird thing here. All the regular pasta
| rapidly disappeared. Rice also was gone.
|
| But "Gluten Free"(I am a GFer) pasta - in this case,
| pasta made from rice - was fine!
|
| I amused myself with the idea that people:"would rather
| starve to death than eat that GF crap." :-)
| jakeva wrote:
| but that's the point, by responding in that way you have
| become one of those hoarders.
| overboard2 wrote:
| You could say it's a real tragedy
| smegger001 wrote:
| a common one at that
| akdor1154 wrote:
| This is a really perfect illustration of both parts of
| the parent comment.
| bane wrote:
| The meat situation was kind of interesting actually. At
| the beginning of the lockdown I remember going shopping
| for lots of shelf-stable foods. Very perishable stuff
| like meat or fresh veggies were out or hard to find (and
| yes freezing meat works fine I know). However, lots of
| stores had _tons_ of shelf stable boxed and canned goods
| and for meat jerky and other dried meats which are simple
| to boil and turn into simple soups in case of emergencies
| -- and yet nobody was really buying them at that time.
|
| I think I "prepped" for the worst by buying a 10 lb bag
| of flour, 40 lb of rice, 5 lbs of oatmeal, a sack of
| potatoes and a few bags of beef jerky and trail mix. Even
| if the worst didn't come to pass I figured we'd
| eventually eat it all anyways and it wouldn't really be
| hoarded, but in pinch we could ration it and it would
| last a few months and give us nutritious meals. It's
| basically what ships crews used to survive on during the
| age of sail as they spent months at sea. Not a ton of
| variety but it will keep you alive.
| plank_time wrote:
| I bought a bunch of rice, and all that happened was
| little rice bugs started living in it, so I had to throw
| it all away. But if there were a rice shortage, I would
| probably have eaten it.
| icansearch wrote:
| If you freeze it for 24 hours it will kill them off. Can
| be a good idea to do that when it comes into the house
| anyway as they might be in there already.
|
| Easy enough to separate them out of the rice after
| freezing too.
| joshjdr wrote:
| Extra protein (not to mention a few deviations from the
| OP)?
| random5634 wrote:
          | The trick is to do exactly what they tell you not to,
          | immediately.
|
| Masks don't protect against covid - jump on Alibaba and
| buy some.
|
          | Don't buy extra toilet paper, there will be no limits on
          | its sale - immediately place an order on Amazon Prime.
| Because within a few days there will be price caps on the
| sale of TP (increased prices would deter hoarding) so
| everything will sell out right away.
|
| I think people have really learned this by now - as SOON
| as the announcements come that there is nothing to worry
| about and they won't do price caps or other things, they
| will be doing exactly that soon.
|
| Places like Alibaba do seem to continue to function -
| masks are maybe 50 cents per mask instead of 5, so folks
| buy a lot less, but you can still get a few for $5. Same
| with TP, it's not dirt cheap per roll, but you can get a
| few.
| whatshisface wrote:
| To rationalize this phenomenon... They don't warn you
| about it until they think you'd be worried about it, and
| they don't think you'd be worried about it until they
| start thinking about doing it. ;)
| nostrademons wrote:
| It's not really irrational or some behavior that they'd
| change if they _did_ introspect. I suspect a significant
| portion of these people are internally like "Yeah, I'm
| part of the problem now, but the problem is not going to
| go away if I don't join in on it, I'll just get screwed."
|
| In other words, they understand game theory and tragedy
| of the commons.
| 45ure wrote:
          | The introspection, or lack thereof, might be defined by a
| Nash equilibrium. Nevertheless, the situation depicted in
| the article is a life hack -- exactly as described. There
| is no malicious intent/compliance, rather it is about
| having the foresight and ingenuity to save the day.
|
| https://theconversation.com/a-toilet-paper-run-is-like-a-
| ban...
| [deleted]
| xupybd wrote:
| This is the opposite of private property ownership.
| Property ownership comes at a cost. That cost increases as
| supply dries up. In this case there is no cost to the
| individual, only to the group.
| worik wrote:
| There is nothing intrinsic about property that implies
    | cost.
|
| I think you are confusing property and scarcity.
| frenchy wrote:
| > Property ownership comes at a cost.
|
| Sort-of. Typically there is a cost associated with
| getting rights over a property, either by manufacturing,
| or in the case of land, by purchase or through the
| efforts of settling. However, once you have ownership of
    | a property, it's usually relatively cheap to continue owning it
| (except, I suppose, if you consider the risk of a
| communist revolution or something).
|
    | This is essentially how the British monarchy earns their
| generous sums of money. They stole land from the Brits
| during the Norman conquest about 1000 years ago, and now
| they rent it back to them for a handsome profit (though
| the whole thing is rather complicated now and they only
| get a portion of the money).
| DevKoala wrote:
| In California you pay ~1.5% in property taxes per year.
| Owning property is not cheap.
|
| In fact you never really own it.
| angry_octet wrote:
| Unless you're a golf course in LA, and cajole a permanent
| exemption!
| [deleted]
| gwright wrote:
| Not really.
| economusty wrote:
| More like the opposite. Imagine if real estate was
    | available without trading something of value (like the free
| space on the drive). I don't see how that is like
| capitalism, in one case the resource was free and finite
| and it was almost always taken by those who don't need it.
| In the other case you have to trade something of value and
    | there is ample land in general to trade. Imagine tomorrow
    | Biden gets on TV and says "property rights are dissolved,
    | take what you want"; my guess is that in a very short time
    | all land in the world would be claimed. Now imagine if the
    | users of the shared disk space had to plan and requisition
    | their disk space.
| ineedasername wrote:
| A great idea, but it still leaves the possibility of performance
| issues arising before an admin is able to address them. Something
| like two 4GB blocks might work better: if you get within, say,
| 200MB of the storage limit, you remove the first one and trigger
| an email/text/whatever to the admin; that way they can address it
| before it goes further. It's an early warning and an automated
| solution. Then, if the situation continues, the second 4GB block
| is also automatically removed, with another message sent to the
| admin. Nothing fails silently.
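A sketch of that two-stage scheme as a cron-able shell script; the mount point, spacer paths, threshold, and the commented-out mail step are all illustrative assumptions, not anything prescribed in the thread:

```shell
#!/bin/sh
# Two-stage balloon release: as the disk crosses a threshold, delete
# one pre-made spacer file at a time and notify the admin each time.
MOUNT="${MOUNT:-/}"
SPACERS="/spacer-1.img /spacer-2.img"   # e.g. two 4GB files made in advance
THRESHOLD="${THRESHOLD:-95}"            # % used that triggers a release

used_pct() {
    df -P "$MOUNT" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

for f in $SPACERS; do
    [ "$(used_pct)" -lt "$THRESHOLD" ] && break
    if [ -e "$f" ]; then
        rm -f -- "$f"
        echo "released spacer $f on $(hostname)"  # pipe to mail(1) in practice
    fi
done
```

Run from cron every few minutes; each spacer is only released when the threshold is crossed again, so nothing fails silently.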
| nycdotnet wrote:
| This is an old trick for when you need to deploy to media with a
| fixed size - floppy/CD-ROM/etc. Make a file that is 5-10% the
| size of your media and don't remove unless you're running out of
| space in crunch time.
| danielrhodes wrote:
| My understanding is this is why one should partition a drive. If
| you have a data partition, a swap partition, and an OS partition,
| you can get around issues where a server's lack of disk space
| hoses the whole system.
| yabones wrote:
| 100% agree. I think at the bare minimum every system should
| have two partitions: `/` and `/var`
|
| /var is usually where the most data gets added. Logs, database
| files, caches, and whatever other junk your app spits out. 99%
| of the time that's what causes the out of space issues. By
| keeping that separate from root, you've saved your system from
| being completely hosed when it fills up (which it will).
|
| Obviously there are other places that _should_ get their own
| mounts, like /home and /usr, but before you know it you've got
| an OpenBSD install on your hands with 15 partitions :)
| nijave wrote:
    | A place I used to work achieved something similar with LVM
    | thin provisioning and split out something like /, /home,
    | /var, /var/log and maybe a couple others. I think they also
    | had something clever with LVM snapshots to roll back bad
    | updates (snapshot system, upgrade, verify), so even if an
    | update went rogue and deleted some important, unrelated
    | files it could be undone.
| sedachv wrote:
| OpenBSD default partition allocation is really well thought
| out:
|
| https://man.openbsd.org/disklabel#AUTOMATIC_DISK_ALLOCATION
|
| At least put /tmp in its own partition as well.
|
| Multiple partitions can also save a lot of recovery time when
| there is a multiple bad sector event that corrupts an entire
| partition beyond recovery.
| ClumsyPilot wrote:
| Sounds like a poor-mans quota system
| znpy wrote:
| That's a dumb idea?
|
| Iirc some filesystems allow you to reserve a percentage of blocks
| for this particular use case (recovery by root).
|
| Ext2/3 for sure, ext4 probably too.
|
| Not sure you can do that on linode on the rootfs, since the
| filesystem is mounted, tho.
| joana035 wrote:
| tune2fs -m
| rektide wrote:
| hope you're not running -o compress=lz4 , because you are going
| to be in for a big surprise when you try to pull this emergency
| lever! you may be shocked to see you don't actually get much
| space back!
|
| i do wonder how many FS would actually allocate the 8GB if you,
| for example, opened a file, seeked to 8GB mark, and wrote a
| character. many file systems support "sparse files"[1]. for
| example on btrfs, i can run 'dd if=/dev/zero of=example.sparse
| count=1 seek=2000000' to make a "1GB" file that has just one
| 512-byte block in it. btrfs will only allocate a very small amount,
| some meta-data to record an "extent", and a page of data.
|
| i was expecting this article to be about a rude-and-crude
| overprovisioning method[2], but couldn't guess how it was going
| to work. SSDs notably perform much much better when they have
| some empty space to make shuffling data around easier. leaving a
| couple GB for the drive to do whatever can be a colossal
| performance improvement, versus a full drive, where every
| operation has to scrounge around to find some free space. i
| wasn't sure how the author was going to make an empty file that
| could have this effect. but that's not what was going on here.
|
| [1] https://wiki.archlinux.org/index.php/sparse_file
|
| [2] https://superuser.com/questions/944913/over-provisioning-
| an-...
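The sparse-file behaviour described above is easy to check by hand. This sketch compares a file created with `truncate` (one big hole) against one actually written from /dev/zero; file names and sizes are arbitrary:

```shell
# Apparent size vs. blocks actually allocated: a sparse file reports
# a large size via stat(2) but occupies almost nothing on disk.
cd "$(mktemp -d)"

truncate --size 100M sparse.img                            # hole only
dd if=/dev/zero of=solid.img bs=1M count=100 status=none   # real blocks

stat -c '%n: apparent size %s bytes' sparse.img solid.img
du -k sparse.img solid.img    # sparse.img should show ~0 KB allocated
```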
| spijdar wrote:
| Before reading this, I had presumed that sparse files did not
| overcommit drive space, but apparently, they do. I don't use
| them regularly and certainly not to "reserve disk space" but I
| was surprised that you can make sparse files way larger than
| available free space on the drive. I had assumed they were
| simply not initialized, but the FS still required <x> amount of
| free space in case a block is accessed.
| sonicrocketman wrote:
| Good point. I didn't think of that. I'm not sure how my CentOS
| servers handle this scenario, but it seemed to take up the full
| 8GB however I checked.
| secabeen wrote:
| > hope you're not running -o compress=lz4 , because you are
| going to be in for a big surprise when you try to pull this
| emergency lever! you may be shocked to see you don't actually
| get much space back!
|
| This is true. If you are replicating this, copy from
| /dev/urandom rather than using an empty file.
| yjftsjthsd-h wrote:
| It feels more elegant to do something like `touch /bigfile &&
    | chattr -c /bigfile && truncate --size 8G /bigfile` to outright disable
| compression on that file
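Replicating the /dev/urandom suggestion might look like this; the size and path are placeholders:

```shell
# An incompressible spacer: random bytes defeat transparent
# compression (btrfs/ZFS lz4 and the like), so deleting the file
# frees real blocks. 64 MiB here; size it to taste.
SPACER="$(mktemp -d)/spacer.img"
dd if=/dev/urandom of="$SPACER" bs=1M count=64 status=none
ls -l "$SPACER"
```

In an emergency, `rm -f "$SPACER"` (or `: > "$SPACER"` to truncate in place) releases the space.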
| gherkinnn wrote:
| An architect once told me that he always plans for a solid gold
| block hidden away in the cellar.
|
| Once the project invariably goes over budget, he drops the plans
| for the gold and frees up extra funds.
|
| Edit: I think it was a large marble slab. Same thing.
| emmab wrote:
| What?!? How does that work? Does he just draw up a blueprint
| and write "solid gold block goes here" and them some contractor
| says "yes that gold block will be $NNNNN" and includes it in
| the budget??
| marshmallow_12 wrote:
| Check your basement!
| viraptor wrote:
| It's likely not literal. He likely quotes for price +50k or
| something like that, so that people will start thinking about
| reducing the price before they run out of budget.
| h4waii wrote:
| This is like carrying around a pound of beef because you refuse
| to look up the address of a McDonald's 7 minutes away.
|
| Set up quotas or implement some damn monitoring -- if you're not
| monitoring something as simple and critical as disk usage, what
| else are you not monitoring?
| bostonsre wrote:
| Not all environments require a stringent SLA. I have some
| servers that don't have a stringent SLA and aren't worth being
| woken up at night over if their disk is filling up fast.
| luckylion wrote:
| Okay, so you wake up with a full disk. What did the spacer
| accomplish?
| Dylan16807 wrote:
| Monitoring doesn't prevent random things from spiking, and
| something like this makes it easier to recover.
|
| Quotas are tricky to set up when things are sharing disk space,
| and that could easily give you a false positive where a service
| unnecessarily runs out of space.
| jedberg wrote:
| Since the late 90s, this was always my solution:
| tune2fs -m 2 /dev/hda1
|
| That sets the root reserve on a disk. It's space that only root
| can use, but also you can change it on the fly. So if you run out
| of userland space you can make it smaller, and if a process
| running as root fills your disk, well, you probably did something
| real bad anyway. :)
|
| But yeah, this is a pretty good hack.
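tune2fs operates on the filesystem, not the mount, so the root reserve can be inspected and changed on a plain file image too. A sketch, assuming e2fsprogs is installed; the image size and reserve percentage are arbitrary:

```shell
# Shrink the root-reserved percentage of an ext4 filesystem.
# Using a scratch file image so this can be tried without root.
cd "$(mktemp -d)"
dd if=/dev/zero of=disk.img bs=1M count=64 status=none
mke2fs -F -q -t ext4 disk.img         # default reserve is 5%

tune2fs -m 2 disk.img                 # root reserve down to 2%
tune2fs -l disk.img | grep 'Reserved block count'
```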
| ryandrake wrote:
| A lot of tips in this thread are about how to better alert when
| you get low on disk space, how to recover, etc. but I'd like to
| highlight the statement: "The disk filled up, and that's one
| thing you don't want on a Linux server--or a Mac for that matter.
| When the disk is full nothing good happens."
|
| As developers, we need to be better at handling edge cases like
| out of disk space, out of memory, pegged bandwidth and pegged
| CPU. We typically see the bug in our triage queue and think in
| our minds "Oh! out of disk space: Edge case. P3. Punt it to the
| backlog forever." This is how we get in this place where every
| tool in the toolbox simply stops working when there's zero disk
| space.
|
| Especially on today's mobile devices, running out of disk space
| is common. I know people who install apps, use them, then
| uninstall them when they're done, in order to save space, because
| their filesystem is choked with thousands of pictures and videos.
| It's not an edge case anymore, and should not be treated as such.
| probably_wrong wrote:
| I know when our server's /tmp directory is full because Bash's
| tab autocompletion stops working.
|
| /home still has space, though, so nothing truly breaks. Perhaps
| I should file a bug report about that.
| jandrese wrote:
| It doesn't help that the base model of many phones had
| ridiculously undersized storage for so many years.
|
| "I have an unlimited data plan, I'll just store everything in
| the cloud." only to discover later that unlimited has an
| asterisk by it and a footnote that says "LOL it's still
| limited".
| moistbar wrote:
| When I worked at SevOne, we had 10x500 MB files on each disk that
| were called ballast files. They served the same purpose, but
| there were a couple nice tools built in to make sure they got
| repopulated when disk space was under control, plus alerting you
| whenever one got "blown." IIRC it could also blow ballast
| automatically in later versions, but I don't remember it being
| turned on by default.
| da_big_ghey wrote:
| The full-disk problem on Linux machines has been partially solved
| for many decades: put /home, /tmp, /var, and /usr each on its own
| partition. This reduces the problem, if not completely removing
| it. The small disadvantage is reduced fungibility of disk space.
| dmingod666 wrote:
| It sounds 'cool' and all for 1995.. but, what about one script
| that'll email you when the disk is at 80%?
| ineedasername wrote:
  | So that if, by the time you get the email, the issue is at 97%,
  | you can immediately give yourself enough breathing room to
  | figure things out without downtime or significantly degraded
  | performance.
| dmingod666 wrote:
| Sure, alerts + this sounds like an approach someone would
| take
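The "email me at 80%" script the commenter describes can be as small as this; the threshold and the mail recipient are placeholders:

```shell
# Cron-able disk-usage alert: parse the use% column from df and
# complain once it crosses a threshold.
usage=$(df -P / | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
if [ "$usage" -ge 80 ]; then
    printf 'disk at %s%% on %s\n' "$usage" "$(hostname)"  # pipe to mail(1)
fi
```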
| dylan604 wrote:
| Besides immediately after first spin-up, when is a drive not at
| 80% capacity?
| capableweb wrote:
| I've had my disks so full that a `rm` command doesn't even work,
| would this workaround work in those cases too?
| geocrasher wrote:
| Yes because you could just do
|
| > save_my_butt.img
|
| and now it's 0 bytes.
| kiwijamo wrote:
| Would that work? The fs may actually allocate a new file
    | before deleting the existing allocation so the risk of it
| not working is still there I would think?
| geocrasher wrote:
| Worst case scenario, if
|
| > filename
|
| didn't null it, then just
|
| echo "0" > filename
| quesera wrote:
| It might vary by kernel, filesystem, or shell, but in my
| experience and confirmed with a quick test: shell
| redirection does not create a new file/inode.
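That the shell truncates in place, reusing the existing inode rather than allocating a new file, can be checked directly; a quick sketch:

```shell
# '> file' truncates in place: the inode number is unchanged and
# the size drops to zero, which is why it works on a full disk.
cd "$(mktemp -d)"
dd if=/dev/zero of=balloon.img bs=1M count=8 status=none

before=$(stat -c %i balloon.img)
: > balloon.img                  # truncate via redirection
after=$(stat -c %i balloon.img)

echo "inode $before -> $after, size now $(stat -c %s balloon.img)"
```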
| louwrentius wrote:
| Please use LVM (Logical Volume Manager) if you really are afraid
| of filling up disks.
|
| If the disk would ever fill up:
| 1. Buy an additional virtual disk
| 2. Add the disk to the LVM volume group
| 3. Expand the logical volume
|
| A really good primer on LVM:
|
| https://wiki.archlinux.org/index.php/LVM
| amelius wrote:
| Can you do all that while blocking the original request for
| space?
| louwrentius wrote:
| Yes you can. And all of this is on-line.
| tanseydavid wrote:
| I am surprised Windows did not make the short list of OSes that
| are lost when disk space is almost gone.
|
| I have been there a couple of times and it is a land of crazy
| unpredictable behavior.
| njacobs5074 wrote:
| As hacks go, it's a good one. I also like it because you don't
| have to be root to implement it and you don't have to reconfigure
| your file system params in ways that might or might not be great
| for other reasons.
| [deleted]
| robin21 wrote:
| This is a great idea. Hit this so many times.
| freeone3000 wrote:
| This sounds like you should, instead, use the "Filespace Reserved
| for Root" functionality of your filesystem, which exists
| specifically for this contingency. The default for ext3 is 5%.
| SpaceInvader wrote:
| To extend space in any filesystem in the root volume group on AIX
| you need space in /tmp. Years ago while working for some major
| bank I proposed creating such a dummy file in /tmp exactly for the
| purpose of extending filesystems. It saved us several times :)
| fractal618 wrote:
| I think I know where this is going without even reading. Any
| attempt from the outside to pull this 8 GB file would be a very
| noticeable red flag.
| outworlder wrote:
| Nope. Try again :)
| qwertox wrote:
| But the idea isn't that bad. Name it properly, like saudi-
| arabia-customer-data.sql.pgp next to a directory named pgp-
| keys and fill it with /dev/random.
| clipradiowallet wrote:
| An alternative approach here... make sure (all) your filesystems
| are on top of LVM. This reduces the steps needed to grow your
| free space. Whether you have a 8gb empty file laying around, or
| an 8gb block device to attach...LVM will happily take them both
| as pv's, add them to your vg's, and finally expand your lv's.
|
| some reading if LVM is new and you want to know more:
| https://opensource.com/business/16/9/linux-users-guide-lvm
|
| edit to add: pv=physical volume, vg=volume group, lv=logical
| volume
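The pv/vg/lv steps above, spelled out. This is a sketch only: /dev/sdb, vg0, and lv_data are placeholder names, and these commands need root on a real LVM setup, so treat it as an admin recipe rather than something to paste blindly:

```shell
# Grow-path sketch: absorb a new disk into a volume group and
# expand a logical volume plus its filesystem, all online.
pvcreate /dev/sdb                         # initialize new disk as a PV
vgextend vg0 /dev/sdb                     # add it to the volume group
lvextend -l +100%FREE /dev/vg0/lv_data    # grow the LV into the free space
resize2fs /dev/vg0/lv_data                # grow ext4 online
# (for XFS use 'xfs_growfs /mount/point' instead)
```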
| uniformlyrandom wrote:
| I would not say this is an alternative, more like yet another
| tool in a shed: 1. Tunefs 2. spacer.8gb
| 3. lvm
| ttyprintk wrote:
  | Added benefit of not waiting to back up and restore 8 GB.
| rubiquity wrote:
  | Yes LVM can help here. Another approach would be to
  | intentionally under-allocate when you create the logical
  | volume. Perhaps only use 80-90% of the physical volume.
| jonhermansen wrote:
| If you are using LVM on all of your filesystems, it seems like
| a bad idea to use a file residing on LVM block device as
| another PV. And actually I'd be surprised if this was even
| allowed. Though maybe it is difficult to detect.
|
| You'd effectively send all block changes through LVM twice
| (once through the file, then through the underlying block
| device(s))
| labawi wrote:
| LVM is just fancy orchestration for the device-mapper
| subsystem with some headers for setup information.
|
| For block operations it's no different from manual setup of
| loop-mounted volumes, that also need to travel a couple of
| layers to hit the backing device.
|
| Though there is an important caveat - LVM is more abstracted,
| making it easier to mistakenly map a drive onto itself, which
| may create a spectacular failure (haven't tried).
| scottlamb wrote:
| > On Linux servers it can be incredibly difficult for any process
| to succeed if the disk is full. Copy commands and even deletions
| can fail or take forever as memory tries to swap to a full disk
| and there's very little you can do to free up large chunks of
| space.
|
| This reasoning doesn't make sense. On Linux, swap is
| preallocated. This is true regardless of whether you're using a
| swap partition or a swap file. See man swapon(8):
|
| > The swap file implementation in the kernel expects to be able
| to write to the file directly, without the assistance of the
| filesystem. This is a problem on files with holes or on copy-on-
| write files on filesystems like Btrfs.
|
| > Commands like cp(1) or truncate(1) create files with holes.
| These files will be rejected by swapon.
|
| I just verified on Linux 5.8.0-48-generic (Ubuntu 20.10) / ext4
| that trying to swapon a sparse file fails with "skipping - it
| appears to have holes".
|
| Now, swap is horribly slow, particularly on spinning rust rather
| than SSD. I run my systems without any swap for that reason. But
| swapping shouldn't fail on a full filesystem, unless you're
| trying to create & swapon a new swapfile after the filesystem is
| filled.
| Yizahi wrote:
| I've seen Linux systems simply crash when root partition was
| 100% full, though it was an embedded system, not representative
  | of big servers.
| Blikkentrekker wrote:
| Define crash of a "system"?
|
| Kernel panic? some user process you deem essential stopping?
| bostonsre wrote:
| Not sure about their reasoning.. but if you don't have root ssh
| enabled, sudo can break if there is no free disk space. I do
| something similar where I write a 500mb file to /tmp and chmod
| 777 it so anyone can free it up without needing sudo.
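A sketch of that world-writable spacer; the 500 MB size and /tmp path follow the comment above, and note that /tmp may be tmpfs (in RAM) on some systems, in which case another path is needed:

```shell
# World-writable spacer anyone can delete or truncate without sudo.
SPACER="${SPACER:-/tmp/spacer.img}"
dd if=/dev/zero of="$SPACER" bs=1M count="${COUNT:-500}" status=none
chmod 777 "$SPACER"
ls -l "$SPACER"
```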
| reph2097 wrote:
| In that case, use "su".
| franga2000 wrote:
| I've experienced far more full disks than I'd want to admit,
| on many different hardware and software configurations, and
| I've never seen sudo break. Is this something you've
| experienced recently?
|
| I definitely agree with your advice and will go double check
| all my servers if /filler is 777 (not in /tmp since it's
| sometimes mounted tmpfs), but if sudo does break in that
| situation, that sounds like a pretty severe and most likely
| fixable bug.
| jeroenhd wrote:
| I've never had sudo break on my full disks. However, that
| doesn't mean recovery is easy...
|
| Working in a terminal to find out what on earth has just
| filled up your disk is a real pain when your shell complains
| about failing to write to your $HISTFILE and such. And, of
| course, the problem always shows up on that one server that
| doesn't have ncdu installed...
|
| I'm sure sudo can theoretically break with 0 free disk space,
| but that's not the usual mode of failure in my experience. At
    | most, sudo needs to touch a dotfile or two, so deleting _any_
| temporary file or old log archive will do for it to recover.
|
| The balloon file is not a bad idea. I think I will apply it
| on my own servers just for good measure, although 8GiB is a
| bit much for my tastes.
| MayeulC wrote:
| IIRC, swap is actually needed for some memory operations. And
| when you run out of memory, the behaviour is often worse
| without swap.
|
| These days I always at least configure an in-memory compressed
| swap (zram).
| scottlamb wrote:
| You recall incorrectly; swap is not needed. It's not just me
| who runs without it; Google production machines did for many
| years.
|
| "The behavior is often worse without swap" is more vague /
| subjective. I prefer that a process die cleanly than
| everything slow to a crawl semi-permanently. I've previously
| written about swap causing the latter:
| https://news.ycombinator.com/item?id=13715917 To some extent
| the bad behavior can happen even without swap because non-
| locked file-backed pages get paged in slowly also, but it
| seems significantly worse with swap.
|
| zram is a decent idea though. I use it on my memory-limited
| Raspberry Pi machines.
| zamadatix wrote:
| Depends if you want things to gracefully degrade because you
| know you don't have enough RAM or if you'd rather things just
  | straight up die. E.g. for the things I work on with my laptop:
  | if whatever I do isn't going to work with 128 GB of RAM (80% of
  | which was meant to be cached data, not actually used), then it's
  | because something went horribly wrong and needs to be halted, not
  | because I needed some swap, which is just going to try to hide
  | that things have gone horribly wrong for a minute and then die
  | anyway. Now if I were doing the same things on a
| machine with 8 GB or 16 GB of RAM then yeah I want to
| gracefully handle running out of physical memory because
| things are probably working correctly it's just a heavier
| load and it can be better to swap pages to disk than drop
| them from a small amount of cache completely.
| ohazi wrote:
| Ah yes... good ole'
| in_case_of_fire_break_glass.bin
| eecc wrote:
| That's what tune2fs is for
| https://www.unixtutorial.org/commands/tune2fs
| CrLf wrote:
| This is why the invention of LVM was such a good idea even for
| simpler systems (where some people claimed it was useless
| overhead). In my old sysadmin days I _never_ allocated a full
| disk. The "menace" of an almost full filesystem was usually
| enough to incentivize cleanups but, when necessity came, the
| volume could be easily expanded.
|
| I guess a big file is not a bad idea either.
| midasuni wrote:
| I do something similar, though I keep multiple files - 4GB,
| 2GB, 1GB, and 100M - which I also use for testing speed.
| nijave wrote:
| The real question... Why does Linux or at least the common
| filesystems get stuck so easily running out of disk space? Surely
| normal commands like `rm` should still function.
| nemo1618 wrote:
| > Surely normal commands like `rm` should still function
|
| They do. In my experience, the only disruption to most terminal
| operations is that tab completion will fail with an error.
| hobofan wrote:
| They sometimes don't. The article even acknowledges this:
|
| > Copy commands and even deletions can fail
|
| I've had that happen too many times, so I don't know why I
| would fill up my disk with a hacky spacer file, which surely
| can also fail to be deleted when the disk is already full.
| npongratz wrote:
| As recently as 2016 I experienced major problems using `rm`
| with an intentionally-filled btrfs (and current Linux kernel at
| the time), and per my notes, it was even mounted as `-o
| nodatacow`:
|
|     # rm -f /mnt/data/zero.*.fill
|     rm: cannot remove '/mnt/data/zero.1.fill': No space left on device
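| When `rm` itself hits ENOSPC on a CoW filesystem, a commonly
| suggested workaround is to truncate the ballast file in place
| first, since releasing its extents that way tends to need less
| new metadata than an unlink. A hedged sketch (the path is taken
| from the example above; on btrfs this often, but not always,
| succeeds where a direct rm fails):

```shell
# Truncate the ballast file to zero length to release its data
# extents, then the unlink should have room to proceed.
truncate -s 0 /mnt/data/zero.1.fill
rm -f /mnt/data/zero.1.fill
```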
| arthurmorgan wrote:
| It's a really bad problem on iOS where a full disk won't allow
| you to delete anything and a reboot puts your phone in a boot
| loop.
| andimm wrote:
| ot: the first two links are "swapped"
| sonicrocketman wrote:
| Thanks for pointing this out. Fixed.
| dominotw wrote:
| sysadmin version of setting the clock 5 mins ahead?
| johnchristopher wrote:
| It's the google photos thumbnails db. /s
| harperlee wrote:
| At work, OneDrive does not sync by policy if there is less
| than 30 GB of free space. Apparently to ensure space for
| updates when they come...
| TazeTSchnitzel wrote:
| I am reminded of a tweet that suggested adding a sleep() call to
| your application that makes some part of it needlessly slow, so
| that you can give users a reason to upgrade when there's a
| security fix (it's 1 second faster now)!
| Black101 wrote:
| Apple did something like that...
| https://www.npr.org/2020/11/18/936268845/apple-agrees-to-pay...
| TazeTSchnitzel wrote:
| They did it so old phones wouldn't suddenly hard-shutdown at
| 30% battery. I appreciate that they did that; the sudden
| shutdowns are very annoying.
| Black101 wrote:
| That's what Apple likes to say. Luckily they got at least a
| small fine.
| kristjansson wrote:
| Lots of comments assailing this approach as a poor replacement
| for monitoring miss the point. Of course monitoring and proactive
| repair are preferable - but those are systems that can also fail!
|
| This is a low cost way to make failure of your first line of
| defense less painful to recover, and seems like a Good Idea for
| those managing bare-metal non-cattle systems.
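| For comparison, the first line of defense can be as small as a
| cron-driven df check. A minimal sketch (the threshold, mount
| point, and mail step are illustrative, not a specific setup
| from the thread):

```shell
#!/bin/sh
# Warn when usage on / crosses a threshold; run hourly from cron.
# df -P gives stable one-line-per-filesystem output; field 5 is
# the "Use%" column, e.g. "42%".
threshold=70
usage=$(df -P / | awk 'NR==2 { gsub(/%/, ""); print $5 }')
if [ "$usage" -ge "$threshold" ]; then
    echo "WARNING: / is ${usage}% full"   # pipe to mail(1) in real use
fi
```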
| davidmoffatt wrote:
| Dumb idea. Read the man page for tunefs. The filesystem has
| something called minfree which does the same thing. However,
| that does not interfere with wear leveling. Dummy data does.
| 404mm wrote:
| Not commenting on whether OPs is sound or not, however tuners
| implies the now less and less used ext4 (many distros are
| switching to XFS or btrfs :-/). On another note, that limit
| applies to non-privileged processes only. Some crap running as
| root will just fill up the disk too.
| deeblering4 wrote:
| if you are on an ext filesystem, reducing the reserved percentage
| on the full filesystem can save the day. its more or less this
| same trick built in to the filesystem
|
| IIRC 5% is reserved when the filesystem os created, and if it
| gets full you can run:
|
| tune2fs -m 4 /dev/whatever
|
| which will instantly make 1% of the disk available.
|
| of course should be used sparingly and restored when finished
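| The reserved-space knob can be tried safely on a loopback
| image before touching a real disk. A sketch, assuming
| e2fsprogs is installed (the image path and sizes are
| illustrative):

```shell
# Build a small ext4 image, then shrink its root-reserved
# blocks from the default 5% to 1% and confirm the change.
img=/tmp/demo.ext4
dd if=/dev/zero of="$img" bs=1M count=64 status=none
mkfs.ext4 -F -q "$img"                  # 5% reserved by default
tune2fs -m 1 "$img"                     # reserve only 1%
tune2fs -l "$img" | grep 'Reserved block count'
rm "$img"
```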
| AcerbicZero wrote:
| In most VMware clusters that use resource pools extensively I've
| always maintained a small emergency CPU reservation on a pool
| that would never use it, just in case I had to free up some
| compute without warning.
| raldi wrote:
| This reminds me of Perl's esoteric $^M variable. You assign it
| some giant string, and in an out-of-memory condition, the value
| is cleared to free up some emergency space for graceful shutdown.
|
| "To discourage casual use of this advanced feature, there is no
| English long name for this variable."
|
| But the language-build flag to enable it has a great name:
| -DPERL_EMERGENCY_SBRK, obviously inspired by emergency brake.
| Wibjarm wrote:
| I'd expect the name is also inspired by the sbrk(2) system
| call, so you can allocate some memory "for emergency use" if
| needed.
| amock wrote:
| I think it likely relates to
| https://en.m.wikipedia.org/wiki/Sbrk.
| raldi wrote:
| Yes, of course. That's what makes the pun work.
| [deleted]
___________________________________________________________________
(page generated 2021-03-25 23:00 UTC)