[HN Gopher] I'm "still afraid to use spaces in file names" years...
___________________________________________________________________
I'm "still afraid to use spaces in file names" years old
Author : dario_satu
Score : 1318 points
Date : 2021-11-11 09:40 UTC (2 days ago)
(HTM) web link (twitter.com)
(TXT) w3m dump (twitter.com)
| fortran77 wrote:
| I think people who use a terminal interface, regardless of OS,
| don't like spaces in file names. I avoid them.
| mrb wrote:
| Sort of related, but here's a joke: _Windows 95 does support long
| filena~1_
| necovek wrote:
| I don't use spaces because it's so much faster to type filenames
| out (including with TAB-completion) in the terminal.
|
| I do, however, use Cyrillic (UTF-8) in filenames, and I regularly
| try out if moving a file into ASCII-path will let some programs
| open it (half the time it's that when I am having trouble).
| spurgu wrote:
| I'm not "afraid" of it, I just think it's unnecessarily
| complicated to work with spaces in filenames on the command line.
| gorgoiler wrote:
| I like to store data on USB flash drives. After being left to
| mature for a few years in a humidity and temperature environment,
| you get some really interesting and _complex_ byte streams where
| your original file names used to be.
|
| Often they are not even valid UTF8 which, when you uncork the
| filesystem for the first time in a decade causes the most
| delightful crashes. The more years the better the aroma.
| boffinAudio wrote:
| Every week, I encounter a user - just like I did in the 80's -
| who cannot explain the _difference_ between a file and a folder.
|
| "What do I use a folder for?", they ask, in the same breath that
| they request "some way to organize things logically".
|
| The no-filesystem movement has worked hard to eradicate this
| scourge from user experiences, but I fear that this is the devil's
| work. Computer users _should know what a file is_, and what it's
| for - and they should _know_ what a folder is for, and why they
| would want to create one to put their files into it ..
|
| But yet: they don't.
|
| It hasn't improved since the 80's. Taking away the user's
| responsibility to understand these things only makes computing
| worse. The fact that "special chars in paths" breaks things also
| helps keep this problem in place, imho.
| octorian wrote:
| > The no-filesystem movement
|
| Is that the movement to store all your data as an amorphous
| pile of crap, and then provide easy-to-use search tools to
| actually find the content you're looking for?
|
| On one hand, I really like the search tools that come from
| this. But I still like to actually organize my data, so I can
| browse it if I want to. Also, these search tools seem to only
| work well enough on macOS and fall flat on their face in
| Windows. (and no idea where Linux falls on this)
| boffinAudio wrote:
| You had me at "amorphous pile of crap", but lost me at
| 'actually find the content'... ;)
|
| Meanwhile, I've got a single directory full of PDF files
| (over 60,000+) which I routinely "ls -alF | grep <search
| term>" for, and I've also got some PyPDF scripts for doing
| deeper content search - but yet I yearn for a way to
| automatically parse the filenames and organize things
| categorically into a folder tree resembling a word cloud,
| symbolic links and all .. one of these days ..
| vbg wrote:
| Spaces in file names are a bad idea because spaces delimit the
| name of separate distinct files,
|
| At least in my crazy old illogical head anyway.
| vbg wrote:
| File names should be long enough to clearly communicate
| meaning/purpose/context, no more no less.
| koziserek wrote:
| .doc
| koziserek wrote:
| och my emojis didn't display, sorry
| iknowstuff wrote:
| Hahah how ironic.
| hajile wrote:
| I dislike constantly having to backslash escape files on the
| command line, so I use dashes instead.
| sixdimensional wrote:
| This seems like a case for an axiom I hear infrequently, but I
| think comes up a lot - things that seem like they should be
| simple and easy, but are in fact difficult.
| LennyHenrysNuts wrote:
| Me too, I never do it.
| sieve wrote:
| This is a UI/UX problem that I only face when dealing with shells
| and shell scripts. Never had any issues when spawning processes
| from within languages/runtimes that support sane argument arrays.
|
| _sh_, _bash_ and _cmd.exe_ are shit. The shell needs serious
| rethinking.
| necovek wrote:
| This is a difference between $@ and "$@" (note the quotes):
|
|     $ cat proba.sh
|     #!/bin/sh
|     echo "Using quotes:"
|     for i in "$@"; do echo "$i"; done
|     echo "No quotes:"
|     for i in $@; do echo "$i"; done
|     $ ./proba.sh "ho ho ho"
|     Using quotes:
|     ho ho ho
|     No quotes:
|     ho
|     ho
|     ho
| tomcam wrote:
| Damn I didn't know that. Thanks
| Joker_vD wrote:
| I see that there are lots of comments about problems of TAB-
| completions with filenames with spaces in this comment section
| and I am frankly puzzled: both Bash and cmd.exe actually TAB-
| complete those perfectly fine, inserting quoting where it's
| needed.
| tremon wrote:
| And where it isn't needed. If you have a path that contains a
| variable _and_ a space, bash will happily escape the $,
| making the path invalid. See the following:
|
|     $ cd $HOME
|     $ mkdir my\ dir
|     $ ls my[tab]
|     $ cd /
|     $ ls $HOME/my[tab]
|     ls: cannot access '$HOME/my dir/': No such file or directory
|
| That error is because when you press [tab], bash changed the
| path to \$HOME/my\ dir/ but that isn't obvious from the
| output and I couldn't find a proper way to include the tab-
| expanded result in the transcript.
|
| (edit: this is on GNU bash, version 4.3.48(1)-release but
| I've seen this behaviour for years)
| Joker_vD wrote:
| Depends on the Bash version, I guess? Mine is 4.4.20(1) and
| when I do "cd $HOME/my[TAB]", it replaces the input line
| with "cd /home/joker/my\ dir/", and pressing [ENTER]
| changes the directory to '/home/joker/my dir', as can be
| seen from the prompt.
| akovaski wrote:
| The variable escaping behavior has existed for a while:
| https://stackoverflow.com/questions/32463052/bash-tabbing-fo...
| https://askubuntu.com/questions/70750/how-to-get-bash-to-sto...
| https://askubuntu.com/questions/41891/bash-auto-complete-for...
|
| And I experience the problematic behavior on my Ubuntu
| VM. However, I can get the expansion behavior described
| above if I run:
|
|     shopt -s direxpand
| necovek wrote:
| I seem to remember bash losing preferred escaping when TAB-
| completing, but can't reproduce it now with 5.0.17.
|
| Eg. you'd type `ls -l "Spaced [TAB]` and it would turn it
| into `ls -l Spaced\ Name`. I remember similar annoyances with
| other special shell characters (eg. single quotes, dollars,
| slashes), but that all seems to behave sane now.
| xyzzy_plugh wrote:
| I didn't even know this was a thing, but can't say I've
| ever preferred an escape style. I actually use backslashes
| a fair bit, usually just with spaces. I tend to reserve
| double quotes for variable or shell expansion, explicitly.
| necovek wrote:
| It's not so much about a preference, but your cursor
| would jump about and you'd need to be on the lookout if
| you wanted to edit the completion (eg. to change the
| extension).
| sieve wrote:
| > inserting quoting where it's needed
|
| You have to remind yourself to do this manually in scripts if
| you don't want to see lines full of "No such file or
| directory."
|
| One of the reasons the shell is broken is because the
| character they use as an argument array member separator is
| something that regular people use to distinguish between two
| words, such as in a file name.
| Joker_vD wrote:
| Well, writing scripts would be much less painful if
| $VARNAME did not explode into pieces by default. Alas, this
| ship has sailed long ago.
| goohle wrote:
| IMHO, it's possible to add a flag to bash, which will
| turn on this behavior, so problem can be fixed, but it
| will diverge bash from POSIX sh a lot.
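
A minimal sketch of the word splitting this subthread complains about, assuming a POSIX-style shell such as sh or bash with the default IFS. The helper `count_args` and the file name are made up for illustration.

```shell
#!/bin/sh
# count_args (hypothetical helper) prints how many arguments it received.
count_args() {
    echo "$#"
}

name="my file.txt"   # illustrative file name containing one space

# Unquoted: the shell splits the value on whitespace, giving two arguments.
unquoted=$(count_args $name)

# Quoted: the value is passed through as a single argument.
quoted=$(count_args "$name")

echo "unquoted: $unquoted, quoted: $quoted"
# prints "unquoted: 2, quoted: 1"
```

Quoting every expansion is the usual defence in scripts; the cost is that forgetting it once is enough to get "No such file or directory".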
| v-erne wrote:
| I have come back to this thread, which I have spotted and
| forgotten something like two days ago, to say that just like a
| minute ago one of new Jenkins jobs that I added failed because I
| named the item using space and some custom Gradle/Maven magic
| tool failed to load one of its own auto generated files (I could
| tell that space was the culprit because error message printed
| only second half of item name).
|
| How can I not be afraid of spaces if this happens like every
| other day with every other custom tool ...
| Pxtl wrote:
| Yes, but working with filenames with spaces in them is a huge
| PITA in command-line tools, because you have to quote everything.
| The ergonomics is just really annoying.
|
| Personally I wish console shells had chosen another delimiter
| than space, but here we are.
| shadowgovt wrote:
| And honestly, it's a good fear to have; there are contexts where
| it still just doesn't work.
|
| Last I checked, the standard answer for GNU make is "Spaces are
| expected to break the tool, that's working as intended, it will
| never be fixed." And because we build our towering edifices of
| software on the pillars of the past, I can't guarantee to you
| that a project of arbitrary complexity _won't_ try to cram a
| list of filenames through a make script.
| xvilka wrote:
| Spaces in path are a pain for the shell autocompletion, since you
| have to escape them by using either "" for the whole string or
| use the "\ " instead.
| timakro wrote:
| Maybe if we'd do it more software would actually learn to deal
| with it.
| TacticalCoder wrote:
| Define "_space_". Is the Hangul filler we talked about
| yesterday a spacing character? Is the zero-width non-breaking
| space a spacing character? What about the typographic spacing
| characters?
|
| You had better be _very_ afraid of using spaces in filenames.
|
| You should do everything you can to support them but you have to
| know you'll invariably encounter countless cases where you'll
| have this or that tool that won't work properly with them.
|
| I still live in a world where I cannot name a song from the
| french group _L'imperatrice_ with an eacute in the filename or
| my car's media system will display garbage (it's running QNX and
| I don't know which filesystem).
|
| FWIW, and it should be food for thought, every single Git
| repository in the world contains a pre-commit hook sample
| (disabled by default but it's there) that enforces that every
| committed file in the repo is named using a subset of ASCII
| characters.
|
| Every Git repository in the world has that example: let that sink
| in.
| selfhoster11 wrote:
| > FWIW, and it should be food for thought, every single Git
| repository in the world contains a pre-commit hook sample
| (disabled by default but it's there) that enforces that every
| committed file in the repo is named using a subset of ASCII
| characters.
|
| I use Git for documents too, not only code. Why shouldn't I use
| my native language?
| cerved wrote:
| non-ascii characters cause annoying hard to fix problems. If
| you're willing to deal with that - kudos. Personally I don't
| find it worthwhile
| selfhoster11 wrote:
| I haven't had problems yet. Spaces, punctuation, and quotes
| are the main offenders, most of the time.
| numpad0 wrote:
| Tab completion doesn't work well for languages that require
| an IME. That is one reason why _I_ don't.
| dredmorbius wrote:
| IME == Input method editor?
|
| https://en.wikipedia.org/wiki/Input_method
| numpad0 wrote:
| yup, I type in pronunciation and let it guess what I'm
| trying to say. Works okay in editors but doesn't work great
| with shells in a terminal emulator, so I just prefer not
| having to use it in shell operations.
| selfhoster11 wrote:
| That's actually a good point. On the other hand, not all
| languages use IMEs. Mine just uses the AltGr modifier key,
| but is otherwise just a standard QWERTY layout without any
| features.
| glandium wrote:
| Tab completion works just fine for me with a Japanese IME.
| chungy wrote:
| > I still live in a world where I cannot name a song from the
| french group L'imperatrice with an eacute in the filename or my
| car's media system will display garbage (it's running QNX and I
| don't know which filesystem).
|
| I have an Android phone and I tell MusicBrainz Picard to save
| all files with ASCII-only names and Windows-compatible names
| for the ones that get sent over to the phone. Basically for
| this reason. Sometimes it's players on Android itself, but even
| more frequently, whatever bluetooth radio I'm connected to
| freaking out with non-ASCII characters.
| torstenvl wrote:
| What do you mean, display garbage?
|
| L'imp?ratrice? L'impratrice? L'impA(c)ratrice? L'imp,ratrice?
| L'impUratrice?
| kingcharles wrote:
| You get all those space characters working and then some jerk
| comes along and uploads a file like this:
| regex-this.exe [Zalgo text: every letter stacked with combining diacritics]
| dang wrote:
| Please don't Zalgo on HN. It's enough to speak its name.
| allemagne wrote:
| It would be one thing if it was making other comments
| difficult to read or causing browser issues, but I
| appreciated the demonstration that both would presumably be
| possible on certain browsers
| quantified wrote:
| Glad you didn't choose a sequence that crashes my browser.
| aasasd wrote:
| Until now, I haven't actually thought of what would happen if
| zalgotext occurred anywhere other than a web browser. Looking
| forward to the five minutes of fun with the file manager and
| whatnot.
| meepmorp wrote:
| regex this, bravo
| jagged-chisel wrote:
| 768 characters is too long for macOS it seems. (References
| online say HFS+ has a limit of 255 UTF-16 characters. Didn't
| find anything for APFS immediately... edit: same for APFS)
| hnuser847 wrote:
| Honest question - what the heck are those characters?
| sdenton4 wrote:
| Zalgo text: https://zalgo.org/
|
| It was a great joke for a couple weeks two internets ago.
| Sohcahtoa82 wrote:
| > two internets ago
|
| It's been like three internets since I heard someone
| using "internet" as a measurement of time.
|
| It's actually interesting to think about "generations" of
| internet, just like generations of people, and how the
| culture shifted between them.
|
| There was a time in the early '00s when broadband was
| catching on, yet YouTube didn't exist. A time when
| Ebaumsworld and Newgrounds ruled the internet. When
| Homestar Runner was pop internet culture. Weebls Stuff.
| The frog blender.
| grishka wrote:
| Combining diacritic marks.
| Valgrim wrote:
| It's corrupted text, or "Zalgo" text; it relies on diacritics.
|
| See this answer on stackoverflow:
|
| https://stackoverflow.com/questions/1732348/regex-match-open...
| lmkg wrote:
| I disagree with calling it "corrupted." We're not
| tricking the browser into trying to render garbage bytes
| that are actually the middle of a jpeg or something. It's
| actually valid Unicode. It's an edge-case which is not
| seen in regular usage, but it's technically following all
| of the rules.
| orangepurple wrote:
| In digital typography, combining characters are characters
| that are intended to modify other characters. The most
| common combining characters in the Latin script are the
| combining diacritical marks (including combining accents).
|
| https://en.wikipedia.org/wiki/Combining_character
| db48x wrote:
| Specifically _Vietnamese_ combining characters. The
| Vietnamese writing system uses multiple combining
| characters at a time, and stacks them vertically. Throw
| in a few that wrap around the character like
| t*1.000.000*his, some alternat lttr frms, disturbing
| imagery, and perhaps a few other tricks, and you have
| zalgo. See also
| https://stackoverflow.com/a/1732454/823846
| Loughla wrote:
| This legitimately made me laugh out loud in my office.
|
| The characters reach up off the screen as I reply to this.
| They overlay the comment above you. Amazing. How?
| pxndxx wrote:
| It's usually called Zalgo text, and it's what you get when
| you start stacking all kinds of Unicode diacritics on poor
| unsuspecting characters.
|
| https://en.wikipedia.org/wiki/Zalgo_text
| Macha wrote:
| Interestingly I get different behaviour per browser/OS.
| Firefox/Linux clips it to the bounding box of the parent
| element, Firefox/Mac and Safari/Mac clip it to the line
| height, and only Chrome/Mac lets it extend further.
| terr-dav wrote:
| Firefox and Safari on iOS 15 both render all the glyphs
| attached to the base character. Vivaldi, Chrome and
| Firefox on Win10 all render them stacked and overlapping
| the parent and child comments.
| detritus wrote:
| Huh, I tried it in Chrome to see how it reacted here and
| it maintained about the same position as it did in my
| usual browser, Firefox.
| kingcharles wrote:
| This is the best generator I found:
| https://lingojam.com/GlitchTextGenerator
| nyanpasu64 wrote:
| I find
| http://animalswithinanimals.com/generator/generator.html
| much more controllable.
| Liquix wrote:
| For anyone who is curious (and acolytes of Zalgo): "In
| Unicode, character rendering does not use a simple character
| cell model where each glyph fits into a box with given
| height. Combining marks may be rendered above, below, or
| inside a base character. So you can easily construct a
| character sequence, consisting of a base character and
| "combining above" marks, of any length, to reach any desired
| visual height, assuming that the rendering software conforms
| to the Unicode rendering model."
|
| [https://stackoverflow.com/questions/6579844/how-does-zalgo-t...]
| shadowgovt wrote:
| Hah, lucky for me Chrome on Ubuntu didn't implement the
| spec correctly. ;)
| prepend wrote:
| Let me tell you how much of a pain in the ass that my employer
| forces spaces in the corporate OneDrive directory.
|
| PS-Microsoft is horrible about stupidly named folders being
| created and dumped in there.
| mtift wrote:
| I have an overly-aggressive function in my .bashrc to rename all
| files in the current directory:
|
|     # Rename all files in a directory
|     rn() {
|         rename "s/ /-/g" *
|         rename "s/_/-/g" *
|         rename "s/–/-/g" *
|         rename "s/://g" *
|         rename "s/\(//g" *
|         rename "s/\)//g" *
|         rename "s/\[//g" *
|         rename "s/\]//g" *
|         rename 's/"//g' *
|         rename "s/'//g" *
|         rename "s/,//g" *
|         rename "y/A-Z/a-z/" *
|         rename "s/---/--/g" *
|         rename "s/---/--/g" *
|     }
|
| I use this all the time, especially when I download files.
| BiteCode_dev wrote:
| Thanks to all the comments in this thread, I now have "sudo
| apt install rename detox" in my install script, and:
|
|     normalize_names() {
|         rename "s/-/_/g" *
|         detox -s lower *
|     }
|
| in my .bashrc.
|
| I've thrown some edge cases at it, and it handles it super
| well. It deals with consecutive "_", remove leading garbage,
| normalize unicode, and even prevents naming conflicts by opting
| out early.
|
| Thank you.
| IX-103 wrote:
| You missed ~. You really don't want to create a directory named
| "~".....
| cerved wrote:
| I wonder if rename has an -e flag like sed. It might be worth
| baking this into one monolithic regex if you call this often
| mrzool wrote:
| You might be interested in detox:
|
| https://github.com/dharple/detox
| OskarS wrote:
| Overly aggressive is right! I don't know if this is genius or
| deranged! I'm leaning towards genius and stealing the idea.
|
| By the way: what's your beef with en dashes? I mean, if it was
| "everything should be 'HYPHEN-MINUS' (U+002D)", then fine, but
| why specifically en dashes and not em dashes?
| michaelt wrote:
| _> By the way: what's your beef with en dashes?_
|
| Of all the changes in that list, removing _the character that
| doesn't appear on a standard keyboard_ seems like the least
| controversial...
| mywittyname wrote:
| To add, it's a character that gets magically inserted for
| no reason in various situations.
|
| It's up there with those damn angled quotes.
| jedimastert wrote:
| A better question might be "how did it get there in the
| first place?"
| dredmorbius wrote:
| Presume all inputs are hostile.
|
| Whether people or processes, something is likely to
| introduce the character at some point.
| ggm wrote:
| Sw which converts -- and __ on the fly. Same sw converts
| quote pairs "for your convenience"
| eyelidlessness wrote:
| Opt+- if you use macOS, long press on - if you use any
| Apple touch OS.
| mtift wrote:
| I totally agree that for some people, this could be a
| terrible command to have around. However, I know that it has
| been working for me for about 8+ years or so. I almost always
| run it in my ~/Downloads folder on files that I don't really
| care about. I download a lot of academic papers and books,
| and this just saves me a lot of time to put files in the
| format I like: author--paper-title.pdf. And that's part of
| the reason why I make all of the dashes the same, so if I'm
| opening something by an author, I can easily autocomplete and
| not have to remember how to make other sorts of dashes on the
| command line.
| OskarS wrote:
| For a download folder in particular, this does sound like a
| great idea. You'd break the list in the browser or
| whatever, but who cares about that?
| l0b0 wrote:
| If you're a developer you're doing yourself a big disservice by
| not learning how to deal with special characters.
| mtift wrote:
| I agree. I am a developer and I know how to deal with special
| characters. But this isn't something I use professionally. I
| just prefer not to have to deal with special characters in
| the pdfs, m4as, txts, and other files that I use on a daily
| basis. When I write papers, I'll write u or N or c or
| whatever (incidentally, I have a lot of shortcuts in my
| .vimrc for those). I would not say I am "afraid" to use
| spaces in filenames, but I get a certain satisfaction storing
| academic papers in the author--paper-title.pdf format and my
| notes in author--paper-title.md because it helps me find
| things.
| tgbugs wrote:
| Word of warning from hard experience: rn is a really dangerous
| thing to name a function because it is one char away from rm.
| post-it wrote:
| Looks like it's typically run without any arguments, so it's
| probably fine.
| lioeters wrote:
| A typo can go the other way, like "rn somefile" where it
| was meant to remove a file but instead it renames all
| files.
| spurgu wrote:
| One char away also physically on the keyboard (maybe that's
| what you meant?).
| tgbugs wrote:
| Yeah, the physical layout is the primary concern. I should
| have noted that since there is ambiguity because n and m
| also happen to be next to each other in the alphabet.
| jatone wrote:
| laughs in dvorak
| kataklasm wrote:
| cries in colemak
| Extigy wrote:
| I once ran "crontab -r" instead of "crontab -e" and also
| thought that was terrible design for the same reason.
| TheSkyHasEyes wrote:
| ren would be better than rn. :)
| eyelidlessness wrote:
| Note to self: snag
| "notTerseAtAllMoreVerboseIdentifiersForGreatGood.js" on NPM
| theshowmustgo wrote:
| Nice but how do you prevent overwrites? What about
| directories/folders and the files in that directory/folder?
|
| I have:
|
|     Movie Bla (2020)
|     Movie Bla (2020).mp4
|
| But also:
|
|     Movie_Bla_(2020)
|     Movie_Bla_(2020).mp4
|     Movie_Bla_(2020).srt
|
| Would not like to lose files like the srt.
| BiteCode_dev wrote:
| rename will stop and output an error.
| mtift wrote:
| Yeah, sometimes I end up renaming things I don't want to, but
| it really doesn't happen all that often. And sometimes I
| throw caution to the wind, add some excitement to my life,
| and rename a bunch of files (not for anything professional)
| in some really old directory and hope I don't break anything.
| But I'm not aiming for perfect with this comment. I just
| mentioned in another comment, but the vast majority of times
| I run this is in my ~/Downloads folder on files I don't
| really worry about breaking.
| Tempest1981 wrote:
| Surely you must run into conflicts now and then?
| nybble41 wrote:
| That's the most beautiful part! After running this script
| there are no more conflicts, because it just silently
| overwrites all but one version of the "cleaned" filename.
|
| (Also--that entire function is super inefficient and could be
| replaced with a single invocation of "rename".)
| mtift wrote:
| Totally inefficient. But for me it's readable and
| practical. This is mostly just a convenience function for
| me to help store files in a format I like rather than
| something I need optimized. If it ever started to feel
| slow, sure I could optimize. But for now, when I
| occasionally download a file that has some weird character,
| I just prefer to add another line to my function.
| nybble41 wrote:
| Without changing the design too much, you could rearrange
| it like so to avoid renaming multiple times and still
| have the option to just "add another line":
|
|     # Rename all files in a directory
|     rn() {
|         rename \
|             -e "s/ /-/g" \
|             -e "s/_/-/g" \
|             -e "s/–/-/g" \
|             -e "s/://g" \
|             -e "s/\(//g" \
|             -e "s/\)//g" \
|             -e "s/\[//g" \
|             -e "s/\]//g" \
|             -e 's/"//g' \
|             -e "s/'//g" \
|             -e "s/,//g" \
|             -e "y/A-Z/a-z/" \
|             -e "s/---/--/g" \
|             -e "s/---/--/g" \
|             *
|     }
|
| Though I would at least take advantage of character
| classes to reduce the number of substitutions:
|
|     # Rename all files in a directory
|     rn() {
|         rename \
|             -e 's/[ _–-]/-/g' \
|             -e 's/[:\(\)\[\]",]//g' \
|             -e "s/'//g" \
|             -e 'y/A-Z/a-z/' \
|             -e 's/--+/--/g' \
|             *
|     }
|
| (I'm using the `rename` command provided by the `rename`
| Debian package, a.k.a `file-rename`. The options may vary
| if you're using a different version.)
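
For readers without the Perl `rename`, here is a rough sketch of the same cleanup using only `sed`, `tr` and `mv`, with an explicit check so that two names that clean to the same result never overwrite each other. The function names `clean_name` and `rn_safe` are hypothetical, the rules only approximate the ones above (ASCII characters only), and this is an illustration under those assumptions rather than a drop-in replacement.

```shell
#!/bin/sh
# clean_name NAME: print a sanitized version of NAME (hypothetical helper).
clean_name() {
    printf '%s\n' "$1" |
        sed -e 's/[ _]/-/g' \
            -e 's/[(),":]//g' \
            -e "s/'//g" \
            -e 's/[][]//g' \
            -e 's/---*/--/g' |
        tr 'A-Z' 'a-z'
    # sed steps, in order: spaces and underscores become hyphens; parentheses,
    # commas, double quotes and colons are dropped; single quotes are dropped;
    # square brackets are dropped; runs of two or more hyphens become "--".
    # tr then lowercases ASCII letters.
}

# rn_safe: rename every entry in the current directory to its cleaned name,
# skipping any entry whose cleaned name is empty, unchanged, or already taken.
# (f and new are plain globals; POSIX sh has no "local".)
rn_safe() {
    for f in *; do
        [ -e "$f" ] || continue          # empty directory: glob stays literal
        new=$(clean_name "$f")
        if [ -z "$new" ] || [ "$f" = "$new" ]; then
            continue
        fi
        if [ -e "$new" ]; then
            printf 'skipping "%s": "%s" already exists\n' "$f" "$new" >&2
            continue
        fi
        mv -- "$f" "$new"
    done
}
```

For example, `clean_name 'Author - Paper Title.pdf'` prints `author--paper-title.pdf`. On a case-insensitive filesystem the existence check also fires for names that differ only in case, so those files are left alone.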
| donio wrote:
| https://github.com/dharple/detox is a nice tool for this. Sane
| defaults but configurable.
|
| In addition to CLI I use it from emacs dired-mode too:
|
|     (defun my-dired-detox ()
|       (interactive)
|       (dired-do-shell-command "detox" nil (dired-get-marked-files))
|       (revert-buffer))
|
| I bind it to "_" in dired-mode.
| niccl wrote:
| I use this snippet, to change spaces to underscore for
| directories and files in the current directory and below.
| Haven't made it a function yet, but should. I got it from stack
| overflow or somewhere, but no attribution. Thanks to whoever
| did it first:
|
|     find . -depth -name '* *' |
|     while IFS= read -r f ; do
|         mv -i "$f" "$(dirname "$f")/$(basename "$f"|tr ' ' _)"
|     done
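
A variant sketch of the snippet above that hands the paths to a small inline script via `-exec ... {} +` instead of a pipe. In the piped form, `mv -i` reads its confirmation from the same pipe that carries the file list, and a file name containing a newline is split into two lines. The function name `spaces_to_underscores` is hypothetical, and `-mindepth` is assumed to be available (GNU and BSD find both provide it).

```shell
#!/bin/sh
# spaces_to_underscores DIR: rename every entry below DIR (default: .)
# whose name contains a space, replacing each space with "_".
# Hypothetical helper; a sketch, not a hardened tool.
spaces_to_underscores() {
    # -mindepth 1 leaves the starting directory itself alone, so every
    # path handed to the inner script contains at least one slash.
    # -depth lists a directory's contents before the directory, so children
    # are renamed while their parent path is still valid.
    find "${1:-.}" -mindepth 1 -depth -name '* *' -exec sh -c '
        for f do
            dir=${f%/*}      # everything before the last slash
            base=${f##*/}    # final path component
            # Caveat: $(...) drops trailing newlines, so a name that ends
            # in a newline would still be altered beyond the space swap.
            new=$dir/$(printf "%s" "$base" | tr " " "_")
            if [ -e "$new" ]; then
                printf "skipping %s: %s exists\n" "$f" "$new" >&2
            else
                mv -- "$f" "$new"
            fi
        done
    ' sh {} +
}
```

Called as `spaces_to_underscores ~/Downloads`, it skips (rather than prompts for) any rename whose target already exists.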
| cmg wrote:
| I nearly gave up on learning newer front-end JavaScript stuff
| like React & webpack and so on a few years ago because of spaces
| in paths.
|
| node-gyp doesn't like it when there's a space anywhere in your
| working path. Stuff I was messing around with was all in ~/Code
| Projects at the time, and using npm install on some things just
| broke. Looking back, I definitely could have done a better job
| parsing the error messages but still...
|
| There's an issue but it was closed in 2018 as "The workaround is
| to use a path without blanks":
| https://github.com/nodejs/node-gyp/issues/439
| xdennis wrote:
| Looks like I'm in the minority. I always use spaces and non-ASCII
| characters in filenames.
|
| In many languages it's a requirement. For example, in Romanian,
| there are 8 words that collide with ,,fata" if you remove the
| diacritics (fata, fata, fata, fata, fata, fata, fata, fata).
|
| Given that we have to use diacritics, spaces don't seem like a
| big deal.
| vadfa wrote:
| >In many languages it's a requirement. For example, in
| Romanian, there are 8 words that collide with ,,fata" if you
| remove the diacritics
|
| That is what context is for.
| selfhoster11 wrote:
| So do I. I have a language, and I'm not afraid to use it. My
| computer should speak it just as well as I do.
| cerved wrote:
| There's a server at work that's named with a non-ascii
| character. I've run into compatibility issues lots of times
| where I can't connect. I prefer to just use English with
| ASCII and be happy
| selfhoster11 wrote:
| Server names are different. They are by and large machine-
| facing identifiers, whereas filenames have a 50-50 split of
| whether they are machine-facing, human-facing, or both.
| That makes their support of Unicode a much more critical
| (and appealing) proposition.
| rob74 wrote:
| Hmmm, I thought I was fluent in Romanian (born there and lived
| there for 26 years), but I only know 5 of those 8 words...
| xdennis wrote:
| That doesn't seem unusual. Only the first 5 are very common.
| theshrike79 wrote:
| According to Google Translate the first two are "girl" and
| the rest are "face". =)
| xdennis wrote:
| * fata - the girl
|
| * fata - girl
|
| * fata - the face
|
| * fata - face
|
| * fata - was giving birth
|
| * fata - a small fish, or a child who won't sit still
|
| * fata - was fussing
|
| * fata - variant of fata
|
| As you might infer from the first 4, Romanian uses postfix
| "the" and for singular feminine words you can't tell the
| difference if you use only ASCII.
| qayxc wrote:
| Google Translate is a horrible tool for "translating"
| single words or lists of unrelated words.
|
| Use a proper dictionary for that. The very nature of
| statistical models makes proper translation without context
| impossible for these systems, especially when uncommon
| words and diacritics are involved.
| hdjjhhvvhga wrote:
| So how did you deal with it in the 80s/90s?
| PeterisP wrote:
| Not sure about Romanian, but for many other languages people
| essentially came up with transliteration schemes (multiple,
| incompatible, ambiguous) to squeeze your language into ascii.
|
| The resulting text was understandable by the "computer
| people" but not the general population who did not use the
| networks back then, perhaps somewhat comparable to when some
| time ago USA parents encountered the "SMS slang" used by
| their teenagers.
| octorian wrote:
| Back in the day there were dozens of character sets that were
| alternatives to US-ASCII. Having once worked on an Email
| client, I needed to bake in a bunch of translation tables to
| convert stuff sent that way into UTF-8.
| xdennis wrote:
| As you would assume: use ASCII and deduce from context. Many
| people still do that.
|
| That has led to phantom diacritics: reading letters in
| unfamiliar words/names based on what you assume they are. For
| example some pronounce Chirica as Chirică because they assume
| someone forgot to type the breve in a.
| apricot wrote:
| I call it the habanero trap. There is no ñ in "habanero",
| yet a lot of people say "habanyero", probably by analogy
| with "jalapeño".
| masklinn wrote:
| > Given that we have to use diacritics, spaces don't seem like
| a big deal.
|
| There is one big difference: CLI utilities don't usually care
| about diacritics (though encoding issues can throw a wrench in
| that), but they care a lot about spaces. So putting spaces in
| filenames requires properly quoting or escaping parameters,
| whereas diacritics do not. That makes one-off shell snippets
| and scripts a lot more annoying (though TBH I tend to shy away
| from those anyway, these days).
| yread wrote:
| We have a few words that depend on diacritics to be unique in
| Czech as well - though not as bad as this example - but people
| just manage without. Hell, I don't even bother installing the
| Czech keyboard, if I REALLY need it (like in names), I just
| google for words that have the character and copy it
| enriquto wrote:
| Why stop here? Why not put spaces in your variable names also?
| Allowing spaces only in file names and not in variable names is
| short-sighted if not inconsistent.
| ricardobayes wrote:
| You can now?
| morpheuskafka wrote:
| I'm 19 now and learned this advice from my dad growing up. Still
| run into situations in my IT work and programming stuff where it
| makes a difference.
| pulse7 wrote:
| I'm still afraid to use national specific characters in file
| names...
| Joker_vD wrote:
| One of the main reasons why Windows used "Program Files" and
| "Documents and Settings" was to _force_ the programs (and
| programmers) to deal with paths with spaces. And you know, for
| the most part it kinda, more or less worked out although of
| course even today you will find programs that ask you to install
| them in a folder without spaces in the path.
| Rerarom wrote:
| VFAT and stuff like that actually provided alternate names like
| PROGRA~1
| beardyw wrote:
| Yes, I was doing code to quickly read FAT folders (on a micro
| controller) and got to the bit about filenames more than 8.3.
| I decided my life was too short (and processing time) to go
| and sort out what the "real" file name is. Enforced 8.3 as a
| requirement!
| alerighi wrote:
| That annoys me every time I use a Windows system. It was a
| terrible decision, especially since neither the command prompt nor
| the new PowerShell accepts, as bash does, a backslash before a
| space; you have to quote the whole path! I get that most
| users on Windows don't use the shell, but as a developer I do a
| lot, and every time it's a pain (no wonder they added the WSL
| in Windows after the failure of Powershell...)
| rashil2000 wrote:
| Why would they accept a backslash? Backslash is a path
| separator on Windows. In most Windows programs, you don't
| even need to escape the space - arguments can contain spaces
| and it will understand it, like `notepad My file.txt`
|
| The escape character on PowerShell is backtick, and on cmd it
| is caret. You don't need to quote everything.
| toyg wrote:
| The main culprit for space issues is stuff relying on BAT or
| CMD files, where escaping variables seems to be a black art.
|
| Sadly such set includes loads of Java programs. If only SUN had
| shipped a standard way to generate isolated exe files in
| 1998... but they worked under the presumption that you'd have a
| JVM already there, because distributing that monster was
| difficult in dialup times, so you could just hand people a jar;
| and the enterprise market did not care, since they had webapp
| servers. Sadly it's an "optimization" that became obsolete very
| quickly but wasn't rectified until it was too late (java 9+).
| ReleaseCandidat wrote:
| > The main culprit for space issues is stuff relying on BAT
| or CMD files, where escaping variables seems to be a black
| art.
|
| Actually it isn't, just use double quotes and add a '~'. It's
| just about the only thing batch files handle better than
| shell scripts. set "VARIABLE=%~PATH"
| makecheck wrote:
| They may have thought that would happen but I saw just as much
| stuff end up in C:\Windows or \Users or (always my favorite)
| those "Documents" that are really just "whatever random crap
| every app wants to put there".
| dale_glass wrote:
| And that was a good idea, if only Microsoft also fixed the
| CreateProcess function, Windows would be somewhat sane in this
| regard. But somehow nobody seemed to think of it. Seriously,
| look at it:
|
| https://docs.microsoft.com/en-us/windows/win32/api/processth...
|
| The arguments are a single string. So you want to pass
| parameters with spaces in them? You've got to add quotes and
| stuff all of that into a single string. Instead of doing it in
| a more sane manner, like oh, the arguments to main().
| IiydAbITMvJkqKf wrote:
| The root cause is that argv isn't a first-class citizen like
| on linux, but an abstraction. The kernel only cares about a
| single string argument. If you use main instead of WinMain,
| the CRT will transform the single string into an argv for
| you.
|
| Oh and cmd.exe uses a different escaping scheme than the CRT.
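|
| A minimal Python sketch of that "single string" model, offered as
| an illustration only: `subprocess.list2cmdline` is an undocumented
| standard-library helper (so relying on it is an assumption) that
| joins an argument list into one command line using the MS C
| runtime's quoting rules. The program name and arguments are made up.

```python
import subprocess

# Hypothetical argument list; only the quoting matters here.
args = [
    "program.exe",
    "arg_1",
    "second argument",
    'argument with literal " symbol',
]

# CreateProcess receives one flat string. Python builds that string with
# the rules the MS C runtime later uses to split it back into argv.
cmdline = subprocess.list2cmdline(args)
print(cmdline)
```

| Arguments containing spaces come out wrapped in double quotes and
| the embedded quote is backslash-escaped; cmd.exe's own escaping is
| a separate layer that this helper does not handle.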
| dale_glass wrote:
| Microsoft is in full control of the Windows kernel, so they
| can make it care about whatever they want to, and one would
| think better argument passing would be a nice quality of
| life improvement. Less nonsense for developers to deal
| with, and less weird bugs on the platform.
| exciteabletom wrote:
| Sure, but MS values backwards compatibility a lot.
|
| They aren't going to break existing API or bloat the
| kernel with a bunch of functions that do the same thing.
| Joker_vD wrote:
| They can either add a new API which almost nobody would
| use -- because everyone already learned to use the
| existing one and either reused or reimplemented the
| MSVCRT's logic so that most of the software parse the
| command lines the same way; or they can literally break
| every single program in existence by breaking the
| interface of CreateProcess -- which is just as likely as
| Linux breaking the interface of execve(2).
|
| Giving CreateProcess a new flag so it would correctly
| accept "path\\\to\\\my\\\program.exe\0arg_1\0second
| argument\0argument with literal \" symbol" (with an
| implicit \0 terminating it) as lpszCmdLine is an easy
| part; the hard part would be forcing everyone to switch
| to using it.
|
| Also, I'm pretty certain this processing happens in the
| user space, and Win32 API is already bloated beyond any
| belief.
| naikrovek wrote:
| maintaining backwards compatibility means maintaining silly
| decisions, and Microsoft does both.
| Avalaxy wrote:
| Yet in Microsoft's own cmd tool I need to put quotes around my
| path if I want to refer to any files/folders below those
| folders.
| alerighi wrote:
| It's not a matter of being afraid, spaces in filenames are
| annoying.
|
| I mostly use the shell and navigating in directories with spaces
| is annoying, you have either to quote it or put a \ before each
| space. You also have to remember to quote everything, and in bash
| that can become complex, you start adding quotes everywhere to
| solve problems caused by spaces (or other special characters like
| *) in filenames.
|
| So I prefer to not use them, a simple _ is as readable as a
| space. Only thing is that spaces get rendered better on
| graphical file managers, but... that could have been solved (and
| can still be solved) by simply adding an option to render a _ as
| a space graphically if there is no ambiguity. I don't care that
| much since I don't use graphical file managers that much.
| canjobear wrote:
| Python has made me afraid to use hyphens in file names
| uwagar wrote:
| or Capital letters
| mindslight wrote:
| I\ am\ not\ afraid,\ I\ just\ do\ not\ see\ how\ it\ benefits\
| my\ quality\ of\ life.
| student2k wrote:
| I recently found out that a Windows folder name can't end with a
| space. But with Python, for example, you can create a folder named
| 'example ', and every file you create in that folder will be
| inaccessible and impossible to delete.
| goto11 wrote:
| I still use the "web safe palette" when choosing color codes for
| CSS
| anodyne33 wrote:
| Reminds me of the time I watched a coworker's head explode when
| he tried to extract an archive (from a 'Nix environment) on his
| Windows machine and was indignant about getting duplicate
| filename errors.
| anodyne33 wrote:
| As a Windows guy, case still seems like a weird thing to worry
| about.
| GuB-42 wrote:
| Spaces in file names break half of the shell scripts I have
| encountered.
|
| And it is one of the biggest reasons I hate Unix shells as
| programming languages, it is a minefield. In fact I think that
| after a dozen lines, Perl is a better option. It has most of what
| shells are good at (i.e. running commands), but saner and more
| powerful.
| ndesaulniers wrote:
| my god, I was simply trying to loop over every file in a dir
| and zip it in a bash one liner. Of course, some of the inputs
| had spaces in the file names. What an exercise in
| frustration!!!
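|
| For comparison, a hedged Python sketch of the same chore. It assumes
| "zip it" means one archive per regular file; the helper name
| `zip_each_file` and the demo file names are invented. Paths stay
| `Path` objects throughout, so no word splitting ever happens.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_each_file(directory: Path) -> list[Path]:
    """Create one .zip archive next to every regular file in `directory`."""
    archives = []
    # sorted() snapshots the listing, so new archives are not revisited.
    for path in sorted(directory.iterdir()):
        if not path.is_file() or path.suffix == ".zip":
            continue
        archive = path.with_name(path.name + ".zip")
        with zipfile.ZipFile(archive, "w") as zf:
            # arcname stores only the file name, spaces included.
            zf.write(path, arcname=path.name)
        archives.append(archive)
    return archives


# Demo in a throwaway directory with a space in one file name.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "my file.txt").write_text("hello")
    (demo_dir / "plain.txt").write_text("world")
    created = [p.name for p in zip_each_file(demo_dir)]

print(created)
```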
| chrisBob wrote:
| I know I _can_ put spaces in file names, but \ is one of the
| characters I still can't touch type, so I still hate dealing
| with them in the terminal.
| Aulig wrote:
| I had to move my development folders because you can't develop
| Android apps if your project path contains a space. Not sure
| where the issue is, if it's gradle or something else.
|
| Edit: thinking about it again, it might not have even been the
| space but the exclamation mark in my path. Or both.
| matchagaucho wrote:
| Keep%20the%20names%40and%20links%20readable%20or%20submit%20to%20
| encoding
| davidjgraph wrote:
| You think spaces are bad (and yes I'm old enough that I don't use
| them)... We work with a company that has forward slashes "/" in
| their trading name and insist on shared cloud directories
| involving them to be prefixed with that trading name.
|
| As soon as you do anything programmatic in/out of these drives it
| all hits the fan. So I'd add to the original statement - "Avoid
| 'technical' companies with special characters in their name",
| it's just not right...
| Decabytes wrote:
| It's just such a pain in the butt to work with files with spaces.
| In a script it's fine b/c I just surround it in double quotes,
| but on the command line I hate having to escape the spaces.
|
| This might already exist, but I wonder about a terminal that was
| really just a multi-line repl to a language. It would be
| preloaded with libraries that replicated all the features of the
| gnu core utils, but instead of calling grep like normal, you
| called a function like grep("args"). The advantage would be that
| you had access to a full blown programming language at all times.
| So when you needed to do something more complicated you would
| still have access to all the standard language features. And when
| you didn't need that, your canned core utils like functions would
| work
| glitcher wrote:
| Wait, what's a file? :P
| mherdeg wrote:
| There was some prior discussion about a generational shift here
| at https://news.ycombinator.com/item?id=28615884 -- there's an
| idea that people no longer need to know what files or folders are
| in order to get things done day-to-day with software (
| https://www.theverge.com/22684730/students-file-folder-direc...
| ).
|
| I'm wondering when the first generation of college students will
| start who have never used a physical keyboard to input text.
| ctur wrote:
| i feel so seen
| douglaswelch wrote:
| Yep. Yep yep.
| mgdv wrote:
| Years of Java has me seeing the world in camel case
| maydup-nem wrote:
| Not afraid, but typing a dash in the terminal is easier and
| shorter than typing a reverse slash and a space. Spaces are kind
| of a pain in the ass in the terminal, tbh.
| ezfe wrote:
| Quotes around the path is easier and avoids any issues - but
| tab completion and drag and drop files into terminal handles
| most cases for me.
| fragmede wrote:
| And I'm older than Google. If you want some hilarity, newlines
| are allowed in filenames as well (\n, \r, \r\n). Try getting bash
| to handle that! (It's possible, though annoying. Try redirecting
| to `while read line` in addition to `find -print0 | xargs -0`
| hackery)
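|
| A small Python sketch of why NUL-delimited lists are the usual
| workaround. The file names are invented and nothing touches the
| disk; the two strings only imitate what a newline-separated and a
| NUL-separated listing would look like.

```python
# Hypothetical file names, one of which contains a newline.
names = ["plain.txt", "with space.txt", "two\nlines.txt"]

# A newline-separated listing cannot tell a separator from name content.
newline_stream = "\n".join(names) + "\n"
from_newlines = newline_stream.splitlines()

# NUL can never appear inside a file name, so it is an unambiguous separator.
nul_stream = "\0".join(names) + "\0"
from_nuls = [n for n in nul_stream.split("\0") if n]

print(len(from_newlines), len(from_nuls))
```

| The newline-based parse yields one entry too many, while the
| NUL-based parse returns the original three names.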
| mikewarot wrote:
| File names shouldn't have anything except a-z,0-9,_ and perhaps a
| -. No unicode, no spaces, no nulls.
|
| It's not fear that keeps me from using spaces in file names, it's
| habit.
|
| If we're going to play this dangerous game, from now on I'll
| figure out how to use nulls (\0) in my file names, and make all
| the C/C++ programmers cry.
| codetrotter wrote:
| I do it the other way around. I used to be afraid of spaces. But
| I have come to realize that it is better to learn sooner than
| later which pieces of software are in such a bad state that they
| aren't handling spaces correctly.
|
| That being said, even after all these years I sometimes need to
| try a few times in order to get the quoting and the escapes right
| when communicating names of files with spaces through multiple
| layers of software.
| douglaswelch wrote:
| Yep. Me too. Early bad experience with spaces in file name and
| Unix cured me of that.
| duxup wrote:
| Some react scripts freaked out on me recently because my login
| (and thus user folder) in windows contained a space.
| MisterTea wrote:
| 2021-11-11_I_have_absolutely_no_idea_what_you_are_talking_about.txt
| DeathArrow wrote:
| If I'm going to use the file in the command line, I won't use
| spaces, since I don't know what sick bug I might encounter.
| [deleted]
| alpaca128 wrote:
| I avoid spaces because they make tab completion more cumbersome
| in bash.
| cerved wrote:
| 100% this
| eloisius wrote:
| Same. For documents and stuff that I use in normiespace I give
| them friendly names with capitalization and spaces and such,
| but for anything I'm going to be working on via CLI I try to
| use filenames that will be easily chunked as "words" when doing
| things like double clicking it in terminal to select, ^w to
| erase it, tab completion etc.
| floatingatoll wrote:
| Coming from web-heavy _and_ perl5 backgrounds, it's insane to me
| that people don't treat filenames and arguments and environment
| variables as tainted user input, and just blindly trust
| properties about them like "does not contain whitespace or
| control characters".
| jmull wrote:
| This is a general issue to this day. So that isn't very old.
| apricot wrote:
| That's funny because the first operating system I used (Apple DOS
| 3.3) was very liberal about file names. There was a 30-character
| limit which was a lot, and it didn't mind spaces in file names.
| Even control characters were fair game, which made things fun
| when you accidentally inserted a ^A in a SAVE command.
| yboris wrote:
| I've been stuck for years with a bug in my commercial Electron
| application where images do not get displayed if the folder path
| has spaces in it :'(
|
| https://github.com/whyboris/Video-Hub-App/issues/667
|
| Any help would be really appreciated!
| oshiar53-0 wrote:
| Yet another reason to ditch Make /s
| authed wrote:
| I try to avoid spaces and special characters because issues still
| happen to this day (just yesterday, I had an issue with a file
| with an accent in it).
| shockeychap wrote:
| Maybe it's just me, but it always seemed like prohibiting spaces
| and other special characters was a reasonable way to avoid
| unnecessary complexity (and the bugs that accompany it) when
| parsing and navigating directory trees and files.
|
| I'm old enough to remember working with 8.3 filenames in DOS, and
| while the length limitation was maddening, the space part never
| was. Then Windows 95 came out and all restrictions were thrown
| out.
|
| Why couldn't we just have a file system that robustly supports
| long filenames, including variable length extensions, while
| prohibiting certain special characters - namely spaces, slashes
| or any directory denoting characters in files, and characters
| that have special meaning in regex context? (brackets, asterisk,
| etc.)
| kasabali wrote:
| Related: David Wheeler's Fixing Unix/Linux/POSIX Filenames
|
| https://dwheeler.com/essays/fixing-unix-linux-filenames.html
| tgv wrote:
| By coincidence, I found another reason just two days ago. A web
| app lists uploaded files' names, and (in a rarely used context)
| lets the user search for them. One user has copied a file name
| from the web page, and pasted it into the search box, but got
| no results. Turned out that the file name contained two
| consecutive spaces, which the browser turns into a single
| space, hence no match. Every layer between the user and file
| system can do something unexpected.
| IiydAbITMvJkqKf wrote:
| Posix makefiles don't support spaces in dependency names. Not
| sure about gmake.
|
| Cmake doesn't support semicolons, because everything in cmake is
| a string, and ; is the list item separator.
|
| PATH is separated by colons, so you can't add directories
| containing : to it.
| reaperducer wrote:
| Spaces are still not "permitted" in URLs.
|
| Browsers will take http://example.com/some name.pdf and
| automagically turn it into http://example.com/some%20name.pdf,
| and deliver the goods without a problem. But having that space in
| the URL is still out of spec, and will cause your web page to
| fail validation, even though it works fine.
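|
| A minimal Python sketch of the percent-encoding a browser applies
| here, using the standard library's `urllib.parse.quote`. The host
| and file name are the same placeholders as above.

```python
from urllib.parse import quote, unquote

path = "/some name.pdf"

# quote() leaves "/" alone by default and percent-encodes the space.
encoded = quote(path)
url = "http://example.com" + encoded
print(url)

# unquote() reverses the encoding on the receiving side.
decoded = unquote(encoded)
```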
| crescentfresh wrote:
| Our local development environment has evolved to a complex enough
| sequence of steps to set up and troubleshoot that I spent 2 weeks
| creating tooling that you can simply point at source checkout
| locations and the tool will take care to setup that repo.
|
| It broke on the first try on a jr hire's machine, the source
| checkout location was `C:\source code`.
| Vrondi wrote:
| If you're in tech long enough, you can be traumatized by
| anything. Like the time a vendor-supplied system decided after an
| update that nothing could have a hyphen in the title, and a lot
| of existing content just... broke at once. Fun times.
| mhd wrote:
| I'm >>still tempted to write umlauts like 'Mot"orhead' old.<<
|
| But also a "use a font that has a proper capital ss" hipster.
| jl6 wrote:
| I had half a feeling that the warning against using spaces in
| names pre-dates computing, but after a little research into
| library call numbers and archive accession numbers, which turn
| out to have both historically included spaces, I have found no
| evidence to support this feeling.
| mindvirus wrote:
| Heck, I'm still afraid to use caps!
| msoucy wrote:
| My coworkers still don't quote strings in their bash scripts,
| even when they're paths... and yet they wonder why everything
| falls apart.
| branko_d wrote:
| I have an uneasy feeling whenever I see a path parameter declared
| as string. Path is not a string - it's a sequence of path
| components and should be treated as such by our APIs. A path
| should be parsed once - on user input - and then used in its
| "sequence form" throughout the software stack.
|
| And "path component" is not an arbitrary string either - e.g.
| appending a path component to the path should first require
| converting/parsing the string into the path component, and only
| if that's successful appending it to the path.
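|
| One way to sketch this idea in Python, assuming POSIX-style paths:
| `pathlib.PurePosixPath` already stores a path as a sequence of
| components, and `parse_component` below is a hypothetical validator
| (its rules are an assumption, not a standard) applied before
| anything is appended.

```python
from pathlib import PurePosixPath


def parse_component(text: str) -> str:
    """Validate a single path component (illustrative rules only)."""
    if text in ("", ".", "..") or "/" in text or "\0" in text:
        raise ValueError(f"not a valid path component: {text!r}")
    return text


# Parse the user-supplied path once; spaces need no special treatment.
base = PurePosixPath("/home/user/My Documents")

# Appending goes through the validator, then through the "/" operator.
child = base / parse_component("report final.txt")
print(child.parts)

# A string that smuggles in a separator is rejected up front.
try:
    parse_component("../etc/passwd")
    rejected = False
except ValueError:
    rejected = True
```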
| dahfizz wrote:
| > I have an uneasy feeling whenever I see a path parameter
| declared as string. Path is not a string
|
| I guess that depends on what you mean by "string". `open` and
| `fopen` need a char* path to open a file. Whatever fancy Path
| abstraction you use eventually becomes a char* string, because
| that's what the kernel needs.
| VWWHFSfQ wrote:
| yeah. it's a string.
| dwheeler wrote:
| On POSIX systems file names are not strings, they are
| sequences of bytes. They might not be UTF-8 or have any
| meaning. Python3 had to hack around this, they thought they
| could force everything to Unicode and discovered that
| doesn't work.
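|
| A short Python sketch of that workaround, as far as I understand it:
| undecodable bytes are smuggled through `str` with the
| "surrogateescape" error handler. The byte string below is an
| invented example of a Latin-1 file name that is not valid UTF-8.

```python
# Latin-1 encoded "café.txt": the lone 0xE9 byte is invalid in UTF-8.
raw = b"caf\xe9.txt"

# Strict decoding fails, which is why "force everything to Unicode" broke.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each bad byte to a lone surrogate code point...
name = raw.decode("utf-8", errors="surrogateescape")

# ...so encoding with the same handler restores the exact original bytes.
round_trip = name.encode("utf-8", errors="surrogateescape")
print(strict_ok, round_trip == raw)
```

| `os.fsdecode` and `os.fsencode` apply the same handler with the
| filesystem encoding on POSIX systems.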
| account42 wrote:
| On POSIX systems, file paths are C strings, which are
| sequences of bytes that cannot include the 0 character.
| UTF-8 or other meaning is not required for something to be
| a string.
| guntars wrote:
| Which makes for fun issues like there's no standard way
| to display a filename in Unix. A system that's, you know,
| all about files.
| warkdarrior wrote:
| Unix: everything is a file, including file names!
| sipos wrote:
| At least for most Linux systems (not sure about other
| *nix, but I expect the same?), there is a system default
| encoding, defined by the locale, and I think decoding the
| filename in that encoding and displaying the resulting
| string, is probably the correct way to display a
| filename? That seems as good as you are likely to get on
| any system really.
|
| I think for any POSIX system, either there is locale
| support defining the encoding, or it uses the POSIX
| locale, which defines the encoding (ASCII).
|
| Of course you need to handle cases where filenames cannot
| be decoded in the system encoding (probably by replacing
| characters that cannot be decoded), because a filename in
| a different encoding, or even with no valid encoding, has
| been used on disk. While systems can say that file names
| containing bytes that are not valid characters in the
| system's encoding are not valid file names, that doesn't
| stop people mounting disks with them, so the problem
| never goes away if you support opening media from other
| systems.
|
| What I am saying is that this is no more a Unix problem
| than it is a problem on any system that supports
| removable media.
| duped wrote:
| That's probably because paths aren't properties of the
| file itself, they're helpers to reference the file.
| naikrovek wrote:
| things like this are why the Unix philosophy is so bad.
|
| text processing is hard if you must support Unicode, and that
| means every Unix command line tool must implement or employ a
| text processor to handle input. it would be much easier if
| objects were passed back and forth. PowerShell got this right.
| SAI_Peregrinus wrote:
| POSIX filenames (as opposed to the much stricter "portable
| filename character set") allow all bytes except 0x2F (/) and
| 0x00 (NUL). That means file names can include line feeds,
| backspaces, EOF, etc.
|
| "This is `a
|
| perfectly vali'd.\010! file name\377, despite the weirdness"
| jerf wrote:
| "Path is not a string - it's a sequence of path components and
| should be treated as such by our APIs."
|
| For maximum correctness, you want to turn it into a file handle
| as soon as possible, and do all operations through the
| variations of the file functions that end in "at", like:
| https://linux.die.net/man/2/openat
|
| The downside of this approach is that you still technically
| have to carry the path around with you if you ever want to
| present it back to the user, because once you have a directory
| handle, you can get back to the root directory easily enough by
| following parent links and seeing what directories you end up
| in, but that may not be what the user "thinks" the path is, and
| they want to see their path, not a canonicalized one. And
| they're mostly right. And it's not easy to correctly track
| changes to their intended path from this basis either.
|
| Basically, I don't know of a really solid, 100% correct way to
| handle this with any reasonable degree of effort.
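|
| A hedged Python sketch of the "*at" style, assuming a platform
| where `os.open` supports `dir_fd` (Linux and macOS do; Windows does
| not). The file name is invented and the directory is a temporary
| one.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "notes with spaces.txt"), "w") as f:
        f.write("hello")

    # Resolve the directory path once, then work relative to the handle.
    dir_fd = os.open(tmp, os.O_RDONLY)
    try:
        # With dir_fd=, the name is looked up relative to the open
        # directory, in the manner of openat(2).
        fd = os.open("notes with spaces.txt", os.O_RDONLY, dir_fd=dir_fd)
        with os.fdopen(fd) as f:
            content = f.read()
    finally:
        os.close(dir_fd)

print(content)
```

| The original path string is still needed if it has to be shown
| back to the user, which is the downside described above.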
| Pxtl wrote:
| "you want to turn it into a file handle as soon as possible"
|
| But no sooner.
|
| For example, I've run into problems where I'm configuring
| program A server to talk to file location B... but _I_ don't
| have access to file location B. But the client-side library
| for talking to the server tries to convert location B into a
| file handle and then freaks out because I can't access it.
| When I don't want to access it. I want that program to serve
| it.
|
| If it was using simple "path" objects that _didn't_ confirm
| that I have access to the path, everything would be hunky
| dory. But because it tried to convert it into a file handle
| unnecessarily, I get blocked.
| jmull wrote:
| > For maximum correctness, you want to turn it into a file
| handle as soon as possible
|
| That's not right. You want to resolve a file/folder path to a
| file/folder at the exact point it makes sense.
|
| It's a problem if you're using a path when you wanted the
| file. The file can be switched/modified out from underneath
| you.
|
| It's also a problem if you've got the file when you only
| wanted a reference. Now you can't simply switch/modify the
| file independent of the reference. E.g., maybe you want
| config file changes to take effect immediately and
| transparently.
|
| You can also have the hybrid case, e.g., where you want the
| folder directly, but have a relative path to a file that is
| resolved late.
|
| If you're unsure, I'd err on the side of late resolution.
| BoorishBears wrote:
| > For maximum correctness, you want to turn it into a file
| handle as soon as possible
|
| This is why I get stressed out when I see paths turned into
| special objects encoding separators and such.
|
| It tells me the path is living for way too long compared to
| the file handle.
|
| I only want to see path-specific objects if we're modifying
| the path, and even then I want that to happen as late as
| possible.
| cerved wrote:
| doesn't this lock the file?
| aspaceman wrote:
| Why not just hold onto both? The users representation and the
| file handle. Only ever "display" the representation, while
| you do all operations on the handle. (Not trying to be
| sarcastic, just curious).
| globular-toast wrote:
| This goes for most instances of user input. Timestamps are the
| other common one people get wrong. I've even seen programs
| that pass around timestamps as strings in multiple formats
| and as integers (Unix time).
| aqfamnzc wrote:
| As a programming noob, I'm wondering what would be the
| better way to pass or return a unix time value as opposed
| to an integer?
| globular-toast wrote:
| Depends on the language but most high-level languages
| have a timestamp or datetime abstraction which you should
| be using.
| joe_guy wrote:
| If it's being serialized, consider fully qualified
| iso8601.
| mleonhard wrote:
| If you need to keep the timezone with it, then use an
| ISO8601 [0] string: "2021-11-11T15:32:35-07:00".
|
| Otherwise, use an integer unix timestamp, the number of
| seconds since 1970-01-01T00:00:00Z: 1636673555. Use an
| unsigned 32-bit integer or a 64-bit integer to avoid the
| 2038 problem [1]. The largest integer that JavaScript-based JSON
| parsers represent exactly is 2^53 - 1 [2], so if you're using
| HTTP JSON RPC, you'll have to check for overflow.
|
| [0] https://en.wikipedia.org/wiki/ISO_8601
|
| [1] https://en.wikipedia.org/wiki/Year_2038_problem
|
| [2] https://developer.mozilla.org/en-
| US/docs/Web/JavaScript/Refe...
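|
| A minimal Python sketch of those limits. The constant and function
| names are invented for illustration; the ISO 8601 string is the one
| quoted above.

```python
from datetime import datetime, timezone

# Parse an ISO 8601 string that carries its own UTC offset.
parsed = datetime.fromisoformat("2021-11-11T15:32:35-07:00")
seconds = int(parsed.timestamp())  # offset-aware, so this is unambiguous

# A signed 32-bit counter of seconds runs out at this instant.
INT32_MAX = 2**31 - 1
rollover = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)

# Largest integer an IEEE 754 double (a JavaScript number) holds exactly.
JS_MAX_SAFE_INTEGER = 2**53 - 1


def fits_in_json_number(value: int) -> bool:
    """True if `value` survives a round trip through a double."""
    return -JS_MAX_SAFE_INTEGER <= value <= JS_MAX_SAFE_INTEGER


print(rollover.isoformat(), fits_in_json_number(seconds))
```

| Second-resolution Unix times fit comfortably; the overflow check
| matters mainly for nanosecond counters or arbitrary 64-bit values.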
| globular-toast wrote:
| ISO8601 is a serialisation format. You wouldn't want to
| use it in internal function calls simply for performance
| reasons. You also wouldn't want to pass it around as just
| a "string" type. I think the question was asking about
| internal function calls. For external data interchange,
| ISO8601 is the only sane option and deals with all known
| timezone and leap second bollocks.
| tmerr wrote:
| Another inconvenience with this approach is that you can keep
| thousands of paths in memory no problem. But thousands of FDs
| may cause you to exceed per-process limits.
| anyfoo wrote:
| Strings following certain rules are entirely valid
| _representations_ of paths, just like sequences of path
| components in the chosen language/framework are. Similarly,
| the sequences of bits that make up the sequences of your
| language/framework in memory are an entirely valid
| representation of said sequences of components.
|
| Yes, paths have structure, but saying "a path is not a string"
| is equivalent of saying "C source code is not a string". Both
| are strings, and both are something else, represented by
| strings according to rules. Different internal representations
| have different advantages and disadvantages. I fully agree that
| for things such as "adding components" an internal
| sequence/list representation is better, but strings can pass
| arbitrary IPC or even ABI boundaries much easier for example.
| (And you wouldn't bat an eye for example when you see FQDNs
| like "www.google.com" passed as a string instead of as
| ["www","google","com"] because the string representation works
| pretty well.)
| fouric wrote:
| C source code and paths are both representable by strings,
| true, but the fact that they're not actually strings is still
| important, because most people don't know that, and in the
| case of paths that leads to a lot of edge cases (in the case
| of source code it leads to a bunch of inefficient and weak
| tooling, which isn't quite as bad).
|
| Because neither are strings, their _native representation_
| shouldn't be such - it should be something structured, and
| only when necessary (IPC, FFI, serdes) be serialized into a
| string representation. This would save people a lot of time
| and effort.
| anyfoo wrote:
| It really depends. Do you usually keep hostnames as
| strings? URLs? JPEGs? Why or why not?
|
| Sure, a browser will hopefully quickly parse that URL and
| break it up, an image viewer will do the same with a JPEG.
| Will anything that's only interested in opening/displaying
| that URL or JPEG, through a library or external program?
|
| POSIX paths are actually remarkably simple in structure[1].
| The only caveat is equality and normalization: Without
| normalization, a path a might be equal to a path b while
| their representations differ, e.g. "/etc/foo" and
| "/etc/bar/../foo". But this is the same whether you have a
| string or a list of strings, you need to normalize in
| whatever representation you choose to check for equality.
|
| [1] Almost shocking myself, even Haskell defines its
| primary FilePath type literally as "String".
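|
| The equality caveat can be sketched in Python with
| `posixpath.normpath`, using the two paths from the comment. Note the
| assumption: this normalization is purely lexical, so it gives the
| wrong answer if `/etc/bar` happens to be a symlink.

```python
import posixpath

a = "/etc/foo"
b = "/etc/bar/../foo"

# The raw strings differ even though they can name the same file.
same_as_strings = a == b

# Lexical normalization collapses "bar/.." without touching the disk.
same_after_normpath = posixpath.normpath(a) == posixpath.normpath(b)

print(same_as_strings, same_after_normpath)
```

| `os.path.realpath` resolves symlinks by consulting the filesystem,
| at the cost of needing the path to be inspected on disk.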
| gadders wrote:
| Where I used to work they had a risk system that created
| directories on the Windows server that matched the book name. They
| had a trader that named one of his books "COM1"...
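|
| A rough Python sketch of a guard against that failure. The set
| below is a simplified reading of the device names Windows reserves
| (CON, PRN, AUX, NUL, COM1-9, LPT1-9), and the function name is
| invented; treat both as assumptions rather than a complete rule.

```python
# Simplified list of reserved DOS device names.
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def is_reserved_windows_name(name: str) -> bool:
    """True if `name` collides with a device name, ignoring case and extension."""
    stem = name.split(".")[0].rstrip(" ").upper()
    return stem in RESERVED


for candidate in ["COM1", "com1.txt", "COMMODITIES", "Book 7"]:
    print(candidate, is_reserved_windows_name(candidate))
```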
| rob74 wrote:
| Well, you should still be afraid! Be very afraid! Seriously: only
| a few months ago I was confronted with a video encoding tool that
| didn't work properly when the file names contained spaces - so
| yes, even in 2021 it's still safer not to use spaces in file
| names...
| nojs wrote:
| Not to mention most naively written bash scripts!
| tiagod wrote:
| Honestly, this still causes a lot of problems with some software.
| I've had friends asking for help with obscure errors that were
| ultimately caused by the files they were using being on a path
| that contains a space or special character.
| shoto_io wrote:
| On a similar note: "it makes sense to add a date to a file name"
| years old.
| cbushko wrote:
| Base64 is your best friend!
| shmerl wrote:
| Never use spaces in file names. It shouldn't depend on age, it's
| common sense.
| imchillyb wrote:
| This is why \Program Files, and \Program Files (x86) exist as they
| do. With spaces, and strange characters, in the name.
| shaoner wrote:
| Any shell script that uses files should use double quotes for at
| least the variables: `mv $1 $2` is not safe, should be `mv "$1"
| "$2"`
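|
| The effect can be imitated in Python with the standard `shlex`
| module, which tokenizes roughly like a POSIX shell. This is only a
| sketch with invented file names; it builds command strings and
| never runs them.

```python
import shlex

src, dst = "my report.txt", "backup copy.txt"

# Pasting names in unquoted is what `mv $1 $2` effectively does.
unquoted = f"mv {src} {dst}"

# shlex.quote() wraps each name so the shell sees it as one word.
quoted = f"mv {shlex.quote(src)} {shlex.quote(dst)}"

unquoted_words = shlex.split(unquoted)
quoted_words = shlex.split(quoted)
print(len(unquoted_words), len(quoted_words))
```

| The unquoted form splits into five words, so `mv` would receive four
| arguments instead of two; the quoted form keeps each name whole.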
| pseingatl wrote:
| You need them for URLs. Running a stand-alone web page maker
| using Rust. Document structure:
| [Introduction](./Introduction.md)\\ [Chapter
| One](./chapter one.md)\\
|
| Crashed on trying to deal with building html when there are
| spaces in the file name. It is still an issue.
| analog31 wrote:
| I\'m%20still%20afraid%20to%20use%spaces%20too.
| neogodless wrote:
| I work in Azure Data Factory, and there are places where a space
| in a name will cause difficult-to-troubleshoot errors. But I
| can never remember where. It's not universal. So I just avoid
| them entirely.
| sva_ wrote:
| What about long filenames and paths?
| xenocyon wrote:
| Not exactly spaces, but I have been bitten by something like this
| at my work quite recently. A Confluence page with special
| characters in the page title was working fine for a while. At
| some point there was a Confluence version update which made the
| page URL broken (and apparently unrecoverable, or at least not
| easily recoverable).
|
| One way to look at it is that people of a certain generation
| eschew spaces because the tools of their formative years simply
| couldn't handle spaces - but another is that the olds have
| learned that generally erring on the side of KISS ("Keep it
| simple, stupid!") isn't a bad idea.
| rndgermandude wrote:
| I still feel slight unease sometimes when using more characters
| than 8.3
|
| Damn, I feel old now :P
| ourmandave wrote:
| A lot of my stuff is cross platform so making filenames portable
| means avoiding spaces.
|
| Ironically, even NASA doesn't like space.
|
| https://www.nas.nasa.gov/hecc/support/kb/portable-file-names...
| zibzab wrote:
| Touché my friend, had a good laugh
| hardwaresofton wrote:
| I am also that age, and kebab-case is the best case for
| filenames.
|
| 2021-01-01-some-important-document.pdf gives me the warm fuzzies.
| On the off chance that some more differentiation is needed, throw
| in an underscore and a whole new world opens up
| ModernMech wrote:
| Kebab case is the often overlooked benefit of prefix notation
| and semantic white space in programming languages. Honestly the
| best case of all cases imo.
| kibwen wrote:
| One glorious day we'll accept programming languages that
| require spaces around infix arithmetic operators so that we
| can make kebab case a reality!
| JasonFruit wrote:
| Lisps, especially Scheme with its `x->something-else`
| convention, have ruined naming in other languages for me.
| MaxBarraclough wrote:
| Forth does something like this, by virtue of its reverse
| Polish notation.
|
| In Forth, 'words' (which are roughly analogous to functions
| and operators) must always be separated by whitespace, as
| Forth doesn't parse out operators the way most languages
| do. In exchange, you get the ability to use symbols in
| identifiers, as Forth has no reason to single out symbols
| like _+_ as being syntactically special. You can even use a
| number for the first character. (For that matter, Forth
| will even let you override the usual interpretation of a
| numerical literal, but that's always struck me as going a
| bit far.)
|
| It gives you a _+_ word, analogous to the _+_ operator of
| most languages [0]. It also gives you a _1+_ word, as an
| (admittedly slight) abbreviation of the sequence _1 +_. [1]
| If you wanted a _2+_ word, you could easily define it
| yourself.
|
| (This property of Forth evidently wasn't enough to get it
| to take over the world, but it's still neat.)
|
| [0] https://www.complang.tuwien.ac.at/forth/ansforth-
| cvs/documen...
|
| [1] https://www.complang.tuwien.ac.at/forth/ansforth-
| cvs/documen...
| eCa wrote:
| Maybe Raku[1] is for you!
|
| [1] https://raku.guide/#_syntax_overview (see section
| 1.7.1)
| ravel-bar-foo wrote:
| This used to be my default, and then I used Matlab, and "-" was
| interpreted as subtraction.
| apricot wrote:
| I'm of the opinion that kebab-case is the best case for all
| identifiers, because it's easy to read and to type. As always,
| Lispers were right all along.
| jerry1979 wrote:
| I found that some_document_2021-01-01_v03.pdf works best
| because it keeps the same document next to its other versions
| alphabetically, keeps them in date order, and keeps them in a
| sub-day version order.
| jaclaz wrote:
| As a side note, in the good ol' times of ISO9660 level 1-4 and
| the various mkisofs parameters, an underscore _ which is a
| CAPITAL -, may have given issues, only for the record/as a
| curiosity:
|
| https://web.archive.org/web/20151007005513/http://www.911cd....
|
| P.S. should anyone want to see/run the actual batch, a copy has
| been uploaded here:
|
| http://reboot.pro/index.php?showtopic=18962&page=29#entry204...
| Raineer wrote:
| In my work, today's date would be 21K11, to save space over the
| longer date.
| blackboxlogic wrote:
| How do you distinguish 21K111 and 21K111?
| inanutshellus wrote:
| Are you trying to catch GP on differentiating hours, were
| it to be appended to his time format (1st @ 11 vs 11th @
| 1am)?
|
| Notably he didn't promise any, but presumably one'd need a
| separator... Maybe, per his "K" usage of the month, one'd
| use the alphabet again. 11am would be "K" again... or
| lowercase just for giggles?
|
| I don't think it reads very well, but I also think one'd
| get used to it pretty quickly.
| blackboxlogic wrote:
| I was thinking January 11th vs November 1st. Maybe their
| "date" doesn't need/support day-of-month? Or they typod
| and I should just focus on my work.
| apricot wrote:
| I imagine January is A and November is K, so 21A11 vs.
| 21K1 (or maybe 21K01).
| blackboxlogic wrote:
| Ah yes, I missed that K was a month.
| onychomys wrote:
| Are you working in some embedded system with tiny memory
| space or something? What's the use of saving one character?
| Just make it YYMMDD!
| jjoonathan wrote:
| > kebab-case
|
| I hadn't heard that before and I love it.
| FpUser wrote:
| Same. I had tears in my eyes from laughing. For some
| inexplicable reason it seems incredibly funny.
| Asraelite wrote:
| Google considers it too violent apparently. In one of their
| recent changes to their style guide, they started
| recommending "dash-case" instead.
|
| https://developers.google.com/style/word-list#letter-k
| [deleted]
| prepend wrote:
| This guide is stupid. They recommend not using "janky."
| cerved wrote:
| Tbf, dash-case is more descriptive. Kebab doesn't mean
| skewer everywhere
| sodapopcan wrote:
| If you hadn't heard kebab-case called that before, there's a
| chance you haven't heard SCREAMING_SNAKE_CASE called that
| before, and I couldn't live with myself if I didn't let you know.
| inanutshellus wrote:
| that's hilarious, thanks for sharing that.
|
| Perennially relevant xkcd: https://xkcd.com/1053/
| sodapopcan wrote:
| Aw, in turn I have never seen that particular xkcd--it's
| great! I learned to call it "feigning surprise" and I
| always try and be conscious of it (though I still catch
| myself doing it from time-to-time).
| ur-whale wrote:
| > 2021-01-01
|
| Yes on the date format.
|
| Saves you so much time.
| hnburnsy wrote:
| I don't bother with the century or the dashes, saves time...
|
| 211111_foobar_v1.txt
|
| I am old enough that I still save before printing. I think it
| was Lotus 123 that engrained it for me.
| zz865 wrote:
| Agreed on dates ordering problem but 20210101 is so much
| easier to type.
| testplzignore wrote:
| Years that end in a 1 are awful when doing this, especially
| in October and November. We've had 20211001, 20211010,
| 20211101, 20211110, and now today 20211111.
| nicoburns wrote:
| But much less easy to read!
| zokier wrote:
| I just tend to use $(date -Is) so I don't need to think
| what date it happens to be today. I guess -Id would work if
| you don't want the time part.
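|
| For anyone scripting this outside the shell, a rough sketch of
| the same idea: let the machine produce the ISO 8601 date prefix.
| dated_name is a hypothetical helper, and the dash between date
| and name is just one of the conventions argued over above.

```python
from datetime import date


def dated_name(stem: str, ext: str, day: date) -> str:
    """Build 'YYYY-MM-DD-stem.ext' so plain string sorting is date sorting."""
    return f"{day.isoformat()}-{stem}.{ext}"


names = [
    dated_name("notes", "txt", date(2021, 11, 11)),
    dated_name("notes", "txt", date(2021, 1, 1)),
    dated_name("notes", "txt", date(2020, 12, 31)),
]
# Lexicographic order equals chronological order with this prefix.
print(sorted(names))
```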
| tambourine_man wrote:
| I go one step further: 2021-11-11_client_project-name.ext
|
| 2021-11-11_client_projectName.ext is also OK. But underscore
| separates fields, hyphens for space replacement.
| hardwaresofton wrote:
| I see and applaud your use of the underscore there, but I
| must reject the premise!
|
| work/client/project/2021-11-11-file.ext is more or less how I
| lay stuff out. I'd say client/project is a folder level
| distinction (arguably dates too).
|
| [EDIT] Realistically most of the stuff under <project> is git
| repos and I usually make a "home" repo where I keep org files
| for tracking hours, notes, and resources related to the
| engagement.
| Zababa wrote:
| I'll be the opposite voice: the file system isn't for
| precise organisation, it's just for storing. For
| organisation, the ideal thing to use is tags. Since most
| file systems don't have tags and using software for that
| would be a pain, the best way to do this is to list the
| tags in the file name.
| cmg wrote:
| work/client/project/2021-11-11-file.ext is great until
| you've got a '2021-11-11-project-status.txt' in a few
| directories and you need to find one quickly! I do a
| combination:
| clients/client/project/2021-11-11-client-project-update.txt
| renewiltord wrote:
| I just store it as a content hash and then when I want to
| find the file, I just have to recreate its content and I
| can then just get the hash.
| ModernMech wrote:
| It sounds like what everyone in this thread needs is a
| database file system. This was always my favorite
| proposed feature of Windows Longhorn that never made the
| cut. Almost 2 decades later and Microsoft's latest OS
| still doesn't have this feature.
| tambourine_man wrote:
| Have you used BeOS?
| ModernMech wrote:
| For sure! I actually used Be before I ever used Linux.
| nayuki wrote:
| I wrote about what I perceived as deficiencies of
| hierarchical file systems, and proposed an alternative
| organization based on tags and hashes. It was discussed
| on Hacker News last week and many years ago.
|
| https://www.nayuki.io/page/designing-better-file-organizatio...
| https://news.ycombinator.com/item?id=29141800
| zajio1am wrote:
| > But underscore separates fields, hyphens for space
| replacement
|
| But why not the other way, hyphen-minus for separating fields
| and underscore for space replacement? That seems to me more
| consistent with how underscores and dashes are used.
| zepearl wrote:
| I fully agree, that's how I do it :)
|
| my_project-some_activity-this_document-20210923-v02.txt
| ridaj wrote:
| Maybe you mean `2021-11-11_client_project-name_v2_final.ext`
| whatusername wrote:
| 2021-11-11_client_project-name_v2_final_ridaj(1).ext
| eurasiantiger wrote:
| Copy (2) of 2021-11-11_client_project-name_v2_final_ridaj(1)__FINAL-v2.ext
| pluc wrote:
| this is the way
| tomcam wrote:
| but the extra Shifts, no thank you
| pluc wrote:
| you gotta involve your pinky or it'll atrophy
| reaperducer wrote:
| Cut most of mine off in an unsupervised Halloween pumpkin
| carving accident when I was a kid. I think the lack of
| length actually allows me to type faster.
| FpUser wrote:
| I use this style:
|
| 2021-01-01_what-happened_who-did-it_possible-reason
| jonnycomputer wrote:
| I've recently shifted sharply toward the dash from the
| underscore. I find it more readable, and it doesn't require the
| shift key. However, I do find it useful to use underscores to
| create groups, e.g. test-001_2021-10-11.log. Including hours,
| minutes, seconds is still awkward.
| FpUser wrote:
| Brother in arms. I just posted a similar thing below.
| kingcharles wrote:
| Burn the witch!
| discreteevent wrote:
| There's a customer for everything. I've just never liked the
| aesthetics of the underscore. Also if your underscored thing
| gets put in some document and then underlined the underscores
| can become invisible.
| jonnycomputer wrote:
| A lot of this is personal aesthetics, for sure. Personally,
| I am not a big fan of camel casing. In code, I only use it
| for class names, generally. I don't find it particularly
| readable, and for filenames, not all filesystems are case
| sensitive, so best not to rely on case to differentiate
| files. Camel case does have the nice property of being more
| compact, as no character is required. That's its main
| benefit.
|
| R traditionally uses the . as a legal character in
| identifiers. Once I got used to it not being syntactic, I
| found I actually prefer dots to underscores.
| ur-whale wrote:
| If any of you reading this have to deal with very large scale
| data pipelines for data science / ML type processing, and if
| "don't use spaces and weird chars in file names" hasn't become
| second nature by now, let me just say: you are very, very brave.
| intrasight wrote:
| My first job as a SW Eng was in 1989 in the nuclear industry. Our
| folders and files were limited to 8 letters. So names were
| effectively acronyms. It was actually pretty awesome. Clean and
| concise. Years later, I still remembered the whole folder
| structure.
| roody15 wrote:
| me too... still use underscore all the time.
| hirako2000 wrote:
| I never put spaces, and won't go over 32 characters, preferably
| less than 16, even when sending a file to my grandmom.
| That's how deep-rooted the trauma is. And yes, it remains an
| issue with some parsers and whatnot.
| johnchristopher wrote:
| I still find files on the internet that my browser can't
| download because too many characters :(.
|
| Edit: can't save, downloading works.
| denysvitali wrote:
| This is a Windows-only issue AFAIK. It's the same reason why
| people decide to put their projects in something like C:\dev
|
| Apparently it's quite easy to reach the 260 chars limit
| johnchristopher wrote:
| No, it's also a Linux issue.
| denysvitali wrote:
| Too many characters on Linux? Quite difficult to reach to
| be fair. Do you have an example?
| johnchristopher wrote:
| I have been trying to repro with a small nodejs server
| but either the server cut off the content-disposition
| filename or firefox truncates it. When I get that in the
| wild I'll post an update.
|
| In the meantime:
|
| $ touch 11111111111111
| 111111111111111111111111111111111111111111111111111111111
| 111111111111111111111111111111111111111111111111111111111
| 111111111111111111111111111111111111111111111111111111111
| 111111111111111111111111111111111111111111111111111111111
| 11111111111111
| touch: impossible de faire un
| touch '11111111111111111111111111111111111111111111111111
| 111111111111111111111111111111111111111111111111111111111
| 111111111111111111111111111111111111111111111111111111111
| 111111111111111111111111111111111111111111111111111111111
| 11111111111111111111111111111111111': Nom de fichier trop
| long [i.e. "File name too long"]
|
| https://serverfault.com/questions/9546/filename-length-limit...
|
| 255 bytes it is then.
|
| Firefox cut off at ~217, httpie at 255.
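|
| A small sketch of checking a name against that limit before
| saving. It assumes the limit is counted in UTF-8 bytes rather than
| characters, which is why non-ASCII names hit it sooner;
| fits_name_limit is a hypothetical helper, and 255 is the figure
| from the link above, not a universal constant.

```python
def fits_name_limit(name: str, limit: int = 255) -> bool:
    """Return True if a single file name fits within `limit` bytes of UTF-8."""
    return len(name.encode("utf-8")) <= limit


print(fits_name_limit("1" * 255))   # 255 one-byte characters
print(fits_name_limit("1" * 256))   # one byte over
print(fits_name_limit("я" * 128))   # 128 characters, two bytes each
```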
| qwertox wrote:
| I have experienced a person using a space _in a password_ for
| Windows login.
|
| I still don't know how to process this emotionally. Either it is
| somehow naively really genius, or stupid.
|
| In any case, it scares me, mostly because it is a non-IT person.
| Waterluvian wrote:
| Even if libraries all handled it, I'd still personally avoid
| spaces because spaces get semantically used to separate tokens
| and I see file names as tokens.
| ajsnigrutin wrote:
| ascii, no spaces for me
|
| I still get issues with old one-off scripts that still work but
| where I forgot to properly quote stuff... plus URLs are a pain in
| the ass with the %20s.
| vbezhenar wrote:
| [0-9A-Za-z_-]+ for me.
| lkuty wrote:
| Same here, and most of the time it's even just [0-9a-z_]+ because
| it's simple and there are no surprises around the corner.
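|
| A sketch of enforcing the wider [0-9A-Za-z_-]+ pattern from the
| parent comment. The helper names are hypothetical, and replacing
| every run of other characters with a single dash is just one
| possible policy. Note that the pattern excludes the dot, so apply
| it to the stem and not to the extension.

```python
import re

SAFE_NAME = re.compile(r"[0-9A-Za-z_-]+")


def is_safe_name(name: str) -> bool:
    """True if the whole name consists only of the allowed characters."""
    return SAFE_NAME.fullmatch(name) is not None


def to_safe_name(name: str) -> str:
    """Collapse each run of disallowed characters into one dash."""
    return re.sub(r"[^0-9A-Za-z_-]+", "-", name).strip("-")


print(is_safe_name("project-name_v2"))
print(to_safe_name("Ailsa's Cool Picture"))
```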
| anovikov wrote:
| But it still breaks in so many situations and becomes a pain in
| the ass in so many other ones! I HATE people who use spaces in
| file names. For me it is a sign of a "deeply nontechnical
| person".
| joshlemer wrote:
| I don't know that this is really hacker news material guys...
| amelius wrote:
| You should be still afraid. Many commands such as Unix "xargs"
| don't work properly with spaces if the right flag is omitted.
| AdamN wrote:
| The meta point here is that spaces are the type of thing that
| work fine ... until they don't. This class of bug is best avoided
| entirely, especially if there is an easy workaround (not using
| spaces).
| doodpants wrote:
| I'm not young, but I've been using Macintosh computers regularly
| since 1990, and even back then file names could be up to 31
| characters long, and could include any character except colon.[1]
| So I'm pretty comfortable using spaces, and sometimes even non-
| ASCII characters, in file names.
|
| Also back then Mac file names typically did not include an
| extension, because the file's type was stored as part of the
| metadata in its resource fork. I remember one time a friend of
| mine was visiting and was playing around with a paint program on
| my Mac. Being used to DOS, when she went to save her file, she
| typed a very short name, and then asked me what the proper file
| extension should be. I smirked and said, "That's not how you name
| files on a Mac. THIS is how you name files on a Mac." And then I
| named her file "Ailsa's Cool Picture". Her mind was blown. :-)
|
| [1] This is because the colon was the path separator. But since the
| classic Mac OS had no command line interface, the typical user
| would never type or even see a file path written out.
| forgotmypw17 wrote:
| All of that was very cool and impressive and extremely user-
| friendly.
|
| However, I found the lack of a command-line to be restricting.
| badsectoracula wrote:
| On the other hand Mac had some great GUI programs.
|
| Sometimes i think that the command-line is a crutch that
| keeps programmers from learning how to make good UIs.
| forgotmypw17 wrote:
| True, but most Mac apps were virtually inaccessible by
| keyboard, and with the slow cursor rate made them a
| nightmare for the wrist.
| harshadwaj wrote:
| I have been following the guidelines from this presentation for
| all my filenames, everywhere and it has been working well so far
| - https://speakerdeck.com/jennybc/how-to-name-files
| kreeben wrote:
| Slightly off topic but I find myself stuck at being "please for
| the love of god don't use spaces in git branch names" old. Anno
| dazumal this might not even have been an issue and I'm just cargo
| culting.
| jrimbault wrote:
| And on that topic, git branches are case sensitive but windows
| filesystem API isn't. Git branches are materialized on the
| filesystem as files and directories.
| qayxc wrote:
| The Windows filesystem API supports CS file- and directory
| names just fine.
|
| It can be enabled on a per-directory basis like so:
|
| > fsutil.exe file setCaseSensitiveInfo C:\folder enable
|
| NTFS had support for this for decades now - it was designed
| that way to be POSIX-compliant.
|
| It's shoddy software that lacks support for it, not the OS or
| the file system.
| jhallenworld wrote:
| Yep, I recently got bit by this, someone checked in a branch
| named something like "x<-->y", Windows was unhappy. I think
| this is a git bug: git should escape these names for the
| native platform.
|
| https://stackoverflow.com/questions/1976007/what-
| characters-...
| masklinn wrote:
| If people actually abuse git branches being CS, odds are good
| they're also abusing CS in the repository content.
|
| The linux kernel is one of the offenders, if you check it out
| on Windows or macOS (which supports CS but remains CI by
| default) you'll immediately get garbage in netfilter, because
| it's an habitual user of having different files with names
| identical but for the casing e.g. xt_TCPMSS.h and
| xt_tcpmss.h.
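|
| A rough way to spot such names before a checkout goes wrong. This
| is only a sketch: str.casefold() approximates case-insensitive
| matching, and real filesystems apply their own folding rules, so
| treat the result as a hint. case_collisions is a hypothetical
| helper.

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that differ only by letter case."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Keep only the groups where more than one spelling exists.
    return [sorted(group) for group in groups.values() if len(group) > 1]


print(case_collisions(["xt_TCPMSS.h", "xt_tcpmss.h", "Kconfig"]))
```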
| chrismorgan wrote:
| I enjoy choosing fun branch names from time to time. A few of
| them: Russian when a user reported a typo in a Russian
| translation; emoji (mostly _added_ emoji rather than _pure_
| emoji); and my personal favourite, a ~250 character diatribe
| about a single-character bug I was fixing (~250 after I
| discovered that Git's error messages when you cause it to try
| to use file names too long for the file system are fairly
| mediocre).
| swayvil wrote:
| Me too. Afraid of dashes too as they might be interpreted as
| minus. I use a lot of underscores __ _____ _ _ _
|
| Weirdly, my friend hates underscores. But he's a baseball fan
| nvilcins wrote:
| Tangentially, I frequently add dates to filenames to keep things
| organized. And _always_ in the `YYYYMMDD` format for clarity and
| technical reasons; `DDMMYYYY` (or God forbid the Americans'
| `MMDDYYYY`) never made much sense to me.
| wglb wrote:
| I do this so often that I have an emacs macro or two that helps
| me out:
|
|     (defun mdy ()
|       (interactive)
|       (insert (format-time-string "%04Y-%02m-%02d")))
|
| That inserts the "proper" date format (e.g., 2021-11-11) at the
| current point.
|
| Then to create a date-stamped file name:
|
|     (defun file-mdy (file-name)
|       (interactive "sbasename: ")
|       (find-file (format "%s-%s.org"
|                          (format-time-string "%04Y-%02m-%02d")
|                          file-name))
|       (save-buffer))
|
| And a few others.
|
| Nobody seems to misunderstand this date format. US folks might
| find it annoying, but understand what it means.
| sclangdon wrote:
| If you're developing on Windows, I find a good way of dealing
| with this is to convert paths to short format before using them
| (e.g. GetShortPathName in kernel32.dll).
| andreareina wrote:
| Spaces breaking tab completion is still an issue, so, yeah.
|
| ETA: not broken in a technical sense, but having to escape them
| isn't the best experience. So it's just easier for me to avoid
| spaces.
| JadeNB wrote:
| Where? It works fine in bash and I think most shells ....
| andreareina wrote:
| That was a bit of hyperbole on my end, my bad. But you do
| have to escape the space, which I'm counting as a minor
| break.
| thriftwy wrote:
| I have had a huge music library on my RAID, and naturally it had
| a lot of spaces, and non-ASCII, in the file names.
|
| It's cumbersome-ish, but can be made to work.
|
| Then there's shell injection via files containing a newline
| character in their name...
| slmjkdbtl wrote:
| Can someone convince me to not use spaces in music, film, and
| book files where they have a "standard title"?
| wglb wrote:
| I still find them annoying, doing lots of work on the command
| line. I use this hack:
|
|     #!/usr/local/bin/sbcl --script
|     (load "~/.sbclrc2")
|     (require 'replace-all)
|     (in-package :replace-all)
|     (format t "file is ~s"
|             (second sb-ext:*posix-argv*)
|             (probe-file (second sb-ext:*posix-argv*)))
|     (let* ((args sb-ext:*posix-argv*)
|            (orig (second args))
|            (newfn (if orig (replace-all orig "(" "-") orig))
|            (newfn1 (replace-all newfn ")" "_"))
|            (newfn2 (replace-all newfn1 " " "-"))
|            (newfn3 (replace-all newfn2 "&" "-"))
|            (newfn4 (replace-all newfn3 ":" "-")))
|       (when orig
|         (format t "renaming \"~a\" to \"~a\"~%" orig newfn4)
|         (multiple-value-bind (new-name old-truename true-newname)
|             (rename-file orig newfn4)
|           (format nil "new-name ~a old-true ~a new true ~a"
|                   new-name old-truename true-newname))))
| forgotmypw17 wrote:
| I'm "whitespace as syntax is stupid" years old
| adfm wrote:
| Kids these days will say "What's a file name?" and mean it.
| Typing? That's for the olds.
| 3guk wrote:
| Somehow the OneDrive clients still refuse to allow leading or
| trailing spaces in the filenames, along with a few other
| characters that are not allowed - seems to cause quite a bit of
| user friction at least with the non-tech guys that I work with
| who are confused about why OneDrive is one of the few file
| syncing clients that has these requirements....
| icefo wrote:
| Gdrive has the same "issue". I think it's on purpose to avoid files
| that seem to have exactly the same name.
|
| This can cause user confusion
| luckman212 wrote:
| I have had to deal with that nightmare multiple times this
| year! It was a real head scratcher at first.
| bryanrasmussen wrote:
| I'm "still afraid to use spaces in file names" wise, dammit!
| nocman wrote:
| I would say I'm "wise enough to not use spaces in filenames".
|
| It's not about fear, it's about making good decisions, and
| avoiding unnecessary complication.
| cabaalis wrote:
| I'm hoping to one day be "Windows adds user root folder to the
| quick links in explorer by default" years old.
| toyg wrote:
| Shells are indeed the main culprits for the continued fear of
| spaces, but not the only ones. A lot of programs that deal with
| "metadata" which will then generate database tables and stuff
| like that, still struggle when working with any sort of special
| character. And the same for anything that, behind the scenes,
| just feeds text into regexes.
| frzj wrote:
| Just this weekend I learned that the Espressif Framework
| doesn't like them either.
| jimnotgym wrote:
| I won't use a space if I think I may need to address that file
| from the command line...
| neilv wrote:
| I'm apparently in the minority of people who know how to write
| shell scripts that have a chance of working correctly with
| filenames with spaces in them... _and_ that's not the only
| reason I avoid spaces in filenames. :)
| foxrider wrote:
| I must be a nightmare customer, because I've always been exploiting
| my ability to use filenames in full UTF-8. I'm that guy that
| sends .pdf to your website.
| notacoward wrote:
| If putting spaces in file names makes you queasy, try punctuation
| - especially punctuation like semicolon or ampersand or single
| quote that's meaningful to shells and such. <shudder>
|
| Also, emoji.
| hutzlibu wrote:
| Or for more fun, use language specific characters, like
| aouss...
|
| And even more fun is, when it mostly works, but then it doesn't
| and you notice too late.
| sokoloff wrote:
| You don't name your files with extensions && rm -rf?
| amitaibu wrote:
| I can relate! :)
| stavros wrote:
| I saw this and felt old, but then the comments in here made me
| realize that the fear\ is%20real.
| Pensacola wrote:
| I'm newly afraid to use emojis in domain names:
| https://tinyprojects.dev/projects/mailoji
| dukoid wrote:
| I'm "still afraid to use more than 8.3 characters in file names"
| years old!
| alanhaha wrote:
| Today, WSL will try to add PATH in Windows to PATH in Linux. So
| if you install something like NodeJS in Windows, and run node in
| Linux, it will try to call /mnt/c/Program Files/nodejs/node.exe
| and say "no such file or directory: /mnt/c/Program".
| oytis wrote:
| In the shell spaces have to be escaped which is annoying. This
| doesn't change with age I think
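|
| When a command line is built by a program instead of typed, the
| escaping can at least be automated. A minimal sketch using
| Python's shlex module; shell_command is a hypothetical helper, and
| the quoting targets POSIX shells only.

```python
import shlex


def shell_command(program: str, *paths: str) -> str:
    """Join a program and its path arguments into one POSIX shell line."""
    # shlex.quote() wraps anything containing unsafe characters in single quotes.
    return " ".join([program, *map(shlex.quote, paths)])


line = shell_command("ls", "my file.txt", "plain.txt")
print(line)
# Splitting the line again recovers the original arguments.
print(shlex.split(line))
```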
| distant_hat wrote:
| I had a guy in my team use forward slashes in filenames. Terrible
| idea, caused all sorts of weird issues.
| zokier wrote:
| Did you mean backslashes? I don't know if any filesystem/OS
| supports forward slashes in filenames
| kps wrote:
| OS X does in the GUI; they're isomorphic to ':' at the UNIX
| level. (The Mac used ':' as the directory separator.)
| rootbear wrote:
| And a : in a file name at the GUI level gets turned into a
| dash! I just tried to name a text file "Foo/Bar 10:01.rtf"
| and it changed it to "Foo/Bar 10-01.rtf"!
| kps wrote:
| In that case the GUI is merely changing the file name you
| type; in a shell you'll see it as "Foo:Bar 10-01.rtf".
| danachow wrote:
| How was this possible? None of the mainstream operating systems
| allow this.
| distant_hat wrote:
| via GUI in OS X.
| danachow wrote:
| Ah so that's not really putting a slash in the name on disk
| - finder is just displaying the colon that way - it
| substitutes with a colon for historical reasons that have
| to do with pre OSX MacOS (but you can see if you create a
| file from a program or the command line with a colon in it,
| it will display as a slash in finder). It shouldn't cause
| any problems on its own on the system - but the colon is
| troublesome if you have to interact with DOS/Windows
| lineage machines.
| mrweasel wrote:
| But nice for testing. I spent a few months on Windows while
| doing a Django project and found a number of bugs no one else
| had discovered because they used Mac or Linux.
| 1970-01-01 wrote:
| I'm still afraid of any non-8.3 filename.
|
| https://en.wikipedia.org/wiki/8.3_filename
| zwieback wrote:
| Anything more than 8.3 is for sissies.
| rsync wrote:
| acme.sh - a shell script that I use to create "Let's Encrypt" SSL
| certificates - creates and maintains directories with asterisks
| in them:
|
| https://github.com/acmesh-official/acme.sh/issues/1408
|
| This is the sysadmin equivalent of piercing your nose just to
| make your parents mad.
| lostgame wrote:
| I name almost everything with underscores still. I think it's a
| programming habit.
|
| Although lately I have started saving my Logic Pro files with
| spaces, simply because I prefer it to be the name of the song as-
| is.
| ReleaseCandidat wrote:
| Still way too many libraries and programs can't handle spaces in
| filenames.
|
| And shells and other programs still have problems with perfectly
| legal characters in filenames too, like '!' or ':'.
| HenryKissinger wrote:
| > Still way too many libraries and programs can't handle spaces
| in filenames.
|
| "It's nothing."
|
| "What do you mean?"
|
| "It's nothing... It's empty space. I never taught the computer
| how to read empty space!"
|
| "I never taught Virgil how to fly."
| Pxtl wrote:
| Colons are a problem on Windows, so it's reasonable to
| discourage creating files with colons in the name.
| danielvaughn wrote:
| yep, I still don't use spaces. I also don't use uppercase
| characters. Just underscores or hyphens.
| boringg wrote:
| Sometimes I break the rule and use uppercase but never
| spaces.
| ptha wrote:
| I've had issues when moving between Windows/*nix file
| systems, where Windows file names are case insensitive and
| *nix systems are case sensitive.
|
| Build script works fine locally on Windows, but then chokes
| in *nix test server, as it's effectively a different path.
| cerved wrote:
| file names aren't case insensitive, it's the windows API
| that is
| ptha wrote:
| I assume you mean that the Windows API (standard for
| Windows apps) is case insensitive, but if using the WSL
| (Windows Subsystem for Linux) it's possible to get case
| sensitivity: https://docs.microsoft.com/en-
| us/windows/wsl/case-sensitivit...
| efreak wrote:
| Even if you're not using WSL, you've always been able to
| turn on case sensitivity via a registry key. This has not
| been recommended in the past due to possible issues with
| windows itself as well as third party software. This
| history is mentioned here[0]. Everywhere that mentioned a
| registry key seems to be referring to windows nfs server,
| not to general file access, however I know that SFU
| (Services for Unix) installer had an option to do so, so
| it's certainly possible.
|
| As of sometime in 2018, fsutil can set specific directory
| trees to be treated case sensitive in Windows 10 without
| setting it for the OS. This ability is mentioned here[1]
|
| [0]: https://devblogs.microsoft.com/commandline/per-directory-cas...
| [1]: https://docs.microsoft.com/en-us/windows/wsl/case-sensitivit...
| danielvaughn wrote:
| I've had issues with git when changing a filename, if the
| only change is the casing.
| jerf wrote:
| Was recently encoding my Stargate: SG-1 DVDs to move them to
| plex. I was encoding it on a system other than what was serving
| it, so I had to copy it. It's surprisingly difficult to "scp" a
| file with a colon in it directly.
|
| I also love when you're using bash and you have a file with !
| in the name, and you accidentally fail to correctly backslash
| it, you not only get "bash: !rest_of_filename: event not
| found", but it _also_ fails to add that command line to the
| history, so you can't just hit up and fix it. You have to
| actually go to the mouse and copy and paste.
| philote wrote:
| Can't you usually just put quotes around the filename and/or
| path to prevent all those issues?
|
| Edit: nope, just tried it and scp still sees the quoted
| filename as a host + path
| warkdarrior wrote:
| That is just lazy programming. If the input "foo:bar" is
| ambiguous, the program should try both interpretations
| (HOST:FILE and FILE) and then present the user with a
| prompt that provides sufficient information.
|
| "Does foo:bar refer to the local file `foo:bar' (size:
| 102kB, date: 2021-11-11) or to the file `bar' on host `foo'
| (FQDN: foo.example.com, IP address: 1.2.3.4)?
|
| 1: local file `foo:bar'
|
| 2: file `bar' on remote host `foo'
|
| Your selection: "
| AnIdiotOnTheNet wrote:
| It's almost like in-band signaling isn't a good idea or
| something.
| kerblang wrote:
| That sounds like... Puzzle time! I had to cheat, sort of, by
| looking at the man page:
|
| > Local file names can be made explicit using absolute or
| relative pathnames to avoid scp treating file names
| containing ':' as host specifiers.
|
| So `scp foo:bar user@host:~` fails because it tries to find
| the host foo. But `scp ./foo:bar user@host:~` works just
| fine. I feel kind of stupid for not guessing as much.
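|
| The same trick as a helper for scripts that hand local paths to
| scp. This is a sketch under an assumption: that a colon appearing
| before any slash is what makes an argument look like HOST:FILE.
| The man page excerpt above only promises that explicit pathnames
| avoid the problem. local_scp_arg is a hypothetical name.

```python
def local_scp_arg(path: str) -> str:
    """Prefix './' when a local path could be mistaken for HOST:FILE."""
    head = path.split(":", 1)[0]
    # Assumed rule: a colon with no slash before it reads as a host name.
    if ":" in path and "/" not in head:
        return "./" + path
    return path


for candidate in ["foo:bar", "./foo:bar", "/tmp/foo:bar", "plain.txt"]:
    print(candidate, "->", local_scp_arg(candidate))
```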
| koheripbal wrote:
| WinSCP currently has a bug that crashes if it tries to sync a
| folder with a space in the name
| mywittyname wrote:
| Is "!" legal in Windows? I'm pretty sure it is not, but I'm not
| on a Windows machine to test.
| remram wrote:
| If you suspect that the file might be handed to a bash script
| at any point, being afraid of spaces is very healthy for sure.
| chrisseaton wrote:
| > And shells and other programs still have problems with
| perfectly legal characters in filenames too, like '!' or ':'.
|
| Without asking you to always quote and escape every file name -
| what alternative is there? If they tried this you'd probably
| find you didn't like it.
| zeroimpl wrote:
| Not exactly - the problem is mostly when doing variable
| expansion. The fact that bash treats "$x" and $x as different
| is a bit of a design flaw. Of course there's still an issue
| with evaluating dynamically generated code, but that problem
| is partly solved by working with arrays.
| chrisseaton wrote:
| I mean how do you want shells to deal with file names with
| spaces in? Do you think we should have to quote and escape
| all file names all the time? If not then how do you think
| it should work?
| rcxdude wrote:
| Shells should treat data as data, and not have the
| default behaviour be treating it as code (i.e. you should
| need to do 'eval $x' or some equivalent if you actually
| want the string to be treated as a shell command). This
| would also mean having a real list type, instead of
| depending on arbitrary separators in strings. This is
| exactly how other languages treat it, and it is not a
| significant challenge for interactive use (in fact, it
| would substantially reduce the opportunity for surprises
| when running commands interactively as well).
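|
| For comparison, a rough sketch of the "data as data" model in
| Python, where arguments are passed as a real list and never
| re-parsed. The file name is an invented example, and the child
| process only counts the arguments it receives:

```python
import subprocess
import sys

# A name with spaces and a shell metacharacter, used purely as data.
name = "Holiday Schedule 2021; echo oops.txt"

# Each list element reaches the child as exactly one argument:
# no word splitting, no globbing, no command substitution.
child = [sys.executable, "-c", "import sys; print(len(sys.argv) - 1)", name]
result = subprocess.run(child, capture_output=True, text=True, check=True)

argc = int(result.stdout)
print(argc)
```

| No shell is involved here, so nothing in the name is ever evaluated.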
| billpg wrote:
| "You need to add -print0 to your find call and -0 to your
| xargs."
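|
| The reason this works is that NUL is one of the two bytes (with '/')
| that cannot appear in a Unix file name. A small sketch of consuming
| such output in Python; `split_nul` and the sample bytes are
| illustrative assumptions, not output from a real run:

```python
def split_nul(data: bytes) -> list:
    """Split NUL-terminated records, the format `find -print0` emits."""
    return [record for record in data.split(b"\0") if record]


# Two names: one with a space, one with an embedded newline.
sample = b"./a b.txt\0./c\nd.txt\0"
names = split_nul(sample)
print(names)
```

| Splitting on newlines or spaces instead would break both names.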
| jrootabega wrote:
| I tend to follow a Postel-like system when it comes to this. When
| I write a script I'll usually get paranoid and make at least
| token efforts to handle spaces. Which I will then never, ever
| use.
| adulion wrote:
| I don't even use spaces in csv column names
| uncomputation wrote:
| I don't think this is so much an age thing as a programmer thing.
| Old people will still name files all sorts of things, and a lot
| of young programmers today avoid spaces.
| NoblePublius wrote:
| I love it when characters like | break OneDrive
| alephan wrote:
| I've never created a filesystem entry name with a space. Mainly
| because fear and when fear is not proven, "\" looks so ugly. But
| I think I'm even worse, I dislike capital letters too.
| ubermonkey wrote:
| Our tool has no issues with spaces in fields, but we still advise
| users not to do it because other systems OFTEN STILL DO, in the
| year of our lord 2021.
| JadeNB wrote:
| So, born today, eh?--says the guy who still regularly runs into
| build scripts that cheerily command that they be run from
| directories without spaces, since that's easier than proper
| quoting in the script.
| HNo wrote:
| Anyone else totally fine with spaces in filenames? I used to rip
| _a lot_ of CDs back in the day, and never had an issue with the
| spaces in the file names.
|
| 01 - Metallica - Metallica - For Whom the Bell Tolls.mp3
|
| Names like that were common, and had many spaces.
| snvzz wrote:
| Spaces in filenames were a mistake to begin with.
|
| Spaces are used to separate parameters in the command line.
| There's also no real need for filenames to support spaces.
| jfb wrote:
| The filename belongs to the user. Therefore, it is incumbent on
| the computer to adapt, not the other way around.
| nomel wrote:
| Or, one could claim that the poor parsing of a text interface
| shouldn't dictate the for-human names of files, especially when
| an exceedingly small percentage of users deal with that text
| interface.
|
| But, of course, if you mix the abstractions of metadata
| (filename) with location, things won't be trivial.
| kazinator wrote:
| The nice thing about spaces is there are so many to choose from,
| thanks to Unicode.
| makapuf wrote:
| Well, I'm using makefiles old
| hknapp wrote:
| Literally just fixed a bug in our software because of an issue
| with spaces.
| rvense wrote:
| There was a discussion yesterday at work about allowing quotation
| marks and semicolons in some user-set titles. We use Mongo. But I
| empathize.
| bborud wrote:
| Not obeying the "Robustness Principle" in software is just poor
| engineering.
|
| https://en.wikipedia.org/wiki/Robustness_principle
| vertere wrote:
| Definitely applicable here. There's no way we're going to
| eliminate all problems with spaces etc, so why invite trouble.
|
| I wouldn't say it's _always_ poor engineering though,
| especially the 'liberal in what you accept' half.
| bborud wrote:
| Yes, you have a point there, but in this case would being
| _liberal in what you accept_ be to accept filenames with
| spaces or (arguably) doing filename handling correctly (ie
| accept filenames with spaces)?
| armandososa wrote:
| I'm "8 characters max plus a 3 character extension in your file
| names" old.
| Havoc wrote:
| I_promise_I'm_not.
| comeonseriously wrote:
| Without exception, I never ever ever use spaces in filenames.
| Ever.
| trudler wrote:
| tbh using spaces in file names is still stupid.
| DonHopkins wrote:
| Then you must also be "still afraid to write Python instead of
| Bash scripts" years old, too.
| dncornholio wrote:
| Remember when we put + instead of %20? Spaces in URLs are still
| a nightmare IMO. I still get strange access log entries where
| some encoding went loose, especially in heavy JavaScript
| environments.
|
| Same goes for capitalisation. All filenames should be lowercase.
|
| Maybe it's not strictly necessary, but it can avoid headaches.
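|
| To make the + versus %20 split concrete, here is a short sketch with
| Python's standard urllib.parse. The file name is just an example:

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

name = "my silly webpage.html"

path_form = quote(name)        # percent-encoding, as used in URL paths
query_form = quote_plus(name)  # form encoding, as used in query strings

print(path_form)
print(query_form)

# Decoding with the wrong function leaves the plus signs in place.
wrong = unquote(query_form)
right = unquote_plus(query_form)
print(wrong, "|", right)
```

| Mixing the two decoders is one plausible source of odd log entries.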
| necovek wrote:
| Plus sign actually came from
| https://en.wikipedia.org/wiki/Query_string#Indexed_search
| jasode wrote:
| Yes, spaces in filenames introduce edge cases and bugs that
| people are not always aware of.
|
| E.g. Here's a random StackOverflow q&a about a Git pre-commit
| hook where the _top-voted answer does not properly handle
| filenames with spaces_ :
| https://stackoverflow.com/questions/2412450/git-pre-commit-h...
|
| However, the 2nd and 3rd most upvoted answers do mention the "-z"
| option to handle spaces:
| https://stackoverflow.com/questions/2412450/git-pre-commit-h...
| jonathanoliver wrote:
| I always format my filesystems (macOS) as case sensitive and I'm
| surprised by the software that has a hard time with that.
|
| On Unix/Linux we've grown up with case sensitive by default but
| everywhere else it still seems to be a problem now and again.
|
| I should qualify this...I'm en-US so I have no idea what the
| experience is like for anyone else.
| phreack wrote:
| My username has been my name which has an accented character and
| has broken countless Windows apps every year since forever, so I
| just keep a C:/Programs folder where I run stuff. You should
| never not fear filenames.
| ASalazarMX wrote:
| I am overly aggressive with spaces and special characters in
| filenames: I use them everywhere and report a bug when they
| cause errors, because they shouldn't in this UTF-8 age.
|
| I still don't use the special character of my name in my
| username because that has caused me many hard to fix troubles.
| Think "cannot recover user password because this user doesn't
| exist".
| antiquark wrote:
| You mean, I'm linux years old?
| meepmorp wrote:
| This is much older than linux or gnu.
| darepublic wrote:
| If you're working on cli this is reasonable
| luke2m wrote:
| I'm 15. I am as well.
| glandium wrote:
| Spaces in file names are a nightmare in Makefiles.
| necovek wrote:
| Not if you are careful (a bit like "$@" vs $@ in shell
| scripts).
|
| Edit: replace $@ with quoted version which actually changes the
| behavior (I was wrong that the difference is between $* and
| $@).
| chrismorgan wrote:
| I don't think it's fair to claim that any Make implementation
| supports spaces: there are too many fundamental bugs and
| breakages, so that lots of rather important Make
| functionality is off-limits if any of your file names will
| have spaces.
|
| https://www.cmcrossroads.com/article/gnu-make-meets-file-
| nam... explains the situation in GNU Make in 2007 (and I
| don't think it's changed since then, though jgrahamc
| especially could correct me). Not being able to use such
| features as $^ and $(patsubst) is _severely_ debilitating for
| all but the simplest of makefiles.
| necovek wrote:
| That's a fair point, thanks!
| rkangel wrote:
| Software engineers - particularly of the more embedded variety -
| absolutely still have this problem.
|
| The main culprit is GNU Make which does not cope with spaces in
| filenames. As far as it is concerned an array is a string
| separated by spaces so it gets very confused. Yes there are some
| partial workarounds, no none of them consistently work. You learn
| very quickly to check all code out in a file tree with no spaces
| in it, otherwise builds can randomly break in strange ways. It's
| not always clear up front whether Make is going to be involved
| somewhere in the build, so it's just easier to be safe.
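|
| A toy model of the problem in Python, not Make itself: once a list
| of names is flattened into one space-separated string, the original
| boundaries cannot be recovered. The file names are invented:

```python
sources = ["main.c", "my file.c"]

# Roughly how a Make variable holds a list: one flat string.
flat = " ".join(sources)

# Roughly how word-based functions then see it: split on whitespace.
words = flat.split()

print(words)
print(len(sources), "files became", len(words), "words")
```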
| 123pie123 wrote:
| I still use the Netbios limitations (15 Characters) when naming
| servers
| Jenda_ wrote:
| I don't use spaces, because I want to be able to run ad-hoc shell
| one-liners when working with my data without worrying about
| quotation and similar stuff.
|
| I also don't use :, as I have run into problems with both Bash
| and its completion and FAT FS. Unfortunately, I routinely have
| timestamps in filenames, so I need to use +%F-%H-%M-%S instead of
| simple +%F-%T.
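|
| The same colon-free timestamp can be sketched in Python. The format
| string spells out what +%F-%H-%M-%S expands to, and `stamp` is a
| made-up helper name:

```python
from datetime import datetime


def stamp(moment: datetime) -> str:
    """Format a timestamp for use in file names, without any ':'."""
    return moment.strftime("%Y-%m-%d-%H-%M-%S")


example = stamp(datetime(2021, 11, 11, 9, 40, 0))
print(example)
```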
|
| One thing has improved, though: I have not run into problems with
| ěščřžýáíé (which my language is full of) for maybe a decade,
| except on OpenWRT where space seems to be scarce to support non-
| ascii.
|
| Edit: I now remember one problem, getting images for a website
| from an OS X user, which used combining characters instead of
| direct code points
| (https://en.wikipedia.org/wiki/Unicode_equivalence#Example), but
| HTTP requests got normalized in some browsers, leading to strange
| 404s.
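|
| A small sketch of that mismatch using Python's unicodedata module.
| The file name is an invented example:

```python
import unicodedata

composed = "caf\u00e9.png"     # é as a single code point (NFC form)
decomposed = "cafe\u0301.png"  # e plus COMBINING ACUTE ACCENT (NFD form)

# The two names look identical on screen but are different strings.
print(composed == decomposed)

# Normalizing both sides to the same form makes them comparable.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
print(same_after_nfc)
```

| A server that compares raw bytes sees two different file names.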
| kabdib wrote:
| My proposal for a shell on the Mac, in the late 80s, was:
|
| - Spaces in filenames get transformed to non-breaking spaces by
| the filesystem;
|
| - The filesystem treats nbsp as equal to space (just as case-
| folding treats A=a, B=b, etc.)
|
| Now, argument parsing, mouse double-clicks, etc. all respect
| filenames as "words", and the output from things like 'ls' just
| work.
|
| (Yes, I'm well aware that there are case-sensitive filesystems
| out there. I'd forgotten that iOS was one of those).
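|
| A toy sketch of the proposal in Python, assuming the "filesystem" is
| just two helper functions. `stored_name` and `same_file` are
| hypothetical names, not any real API:

```python
NBSP = "\u00a0"  # NO-BREAK SPACE


def stored_name(name: str) -> str:
    """What the filesystem would store: spaces become non-breaking."""
    return name.replace(" ", NBSP)


def same_file(a: str, b: str) -> bool:
    """Lookup treats space and NBSP as equal, like case folding."""
    return stored_name(a) == stored_name(b)


# A command line that splits on plain spaces keeps the name whole.
line = "cat " + stored_name("Holiday Schedule 2021")
args = line.split(" ")
print(len(args))
```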
| throwawayffffas wrote:
| If a filename doesn't match \w+\.\w+ I hate it
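|
| Reading that pattern as one run of word characters, a dot, and
| another run of word characters, a quick check in Python could look
| like this (`is_tidy` is a made-up name):

```python
import re

# Word characters, one literal dot, word characters; nothing else.
PATTERN = re.compile(r"\w+\.\w+")


def is_tidy(name: str) -> bool:
    """True if the whole name matches the pattern."""
    return PATTERN.fullmatch(name) is not None


for candidate in ["report.txt", "holiday schedule.txt", "archive.tar.gz"]:
    print(candidate, is_tidy(candidate))
```

| Note that in Python 3, \w also matches non-ASCII letters by default.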
| rapind wrote:
| I wonder why "space" wasn't always simply treated as another
| character. To save a couple bytes back in the 50s (when it
| mattered) I assume?
| deepsun wrote:
| All because we programmatically use interfaces that were intended
| for humans to write: command line, sql, html, email headers.
| qayxc wrote:
| It's worse than that. Whitespace is a hellish invention in the
| world of computers: there are multiple characters that may or
| may not render as whitespace with no way to distinguish them by
| just looking at the output.
|
| Yet to the machine (script, shell, program, ...) it matters a
| lot, since U+0020 ≠ U+0009 ≠ U+00A0 ≠ U+2000 ≠ U+2001, etc., whereas
| the aforementioned codepoints render like this: " " (and yes,
| that's indeed the five codepoints in that order - at least I
| typed them that way).
|
| (Ab)Using whitespace like that can lead to all sorts of funny
| business, not just when dealing with shell scripts and variable
| expansion.
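|
| A short sketch that makes those invisible differences visible,
| using Python's unicodedata module on the five codepoints above:

```python
import unicodedata

spaces = ["\u0020", "\u0009", "\u00a0", "\u2000", "\u2001"]

for ch in spaces:
    # Control characters such as TAB have no Unicode name, hence the default.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<control>"), ch.isspace())

# All five count as whitespace, yet no two are the same character.
all_distinct = len(set(spaces)) == len(spaces)

# Splitting on a plain space does not split on a no-break space.
parts = "a\u00a0b".split(" ")
print(all_distinct, len(parts))
```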
| bravetraveler wrote:
| Admittedly trite/unhelpful comment: avoid xargs
| stochastic_tn wrote:
| I see that this guy must be in his early twenties as well.
| fallingfrog wrote:
| No way I would put anything but a-z, 0-9, and underscore in any
| file name. Too many stupid ways it can go wrong. I guess I have
| very little trust in my fellow programmers!
| pixelbeat__ wrote:
| POSIX portable file names were defined not to have spaces, and
| just contain '[[:alnum:]_./]'.
|
| The findnl script as part of fslint identifies problematic
| patterns, and has 4 levels of stringency, with "POSIX" being the
| most stringent.
| https://github.com/pixelb/fslint/blob/master/fslint/findnl
| zibzab wrote:
| Why stop at spaces?
|
| An old prof of mine used to send emails where the subject line
| was always a valid identifier in C.
|
| Hello_dear_students_where_are_your_reports_
| wruza wrote:
| That identifier is clearly too long.
|
| MISRA C:2004, 5.1 - Identifiers (internal and external) shall
| not rely on the significance of more than 31 characters.
| meshaneian wrote:
| As a software engineer, I require testing of paths and files
| with spaces, and forbid spaces in any system-generated file
| name where possible, to make the CLI easier.
| RickJWagner wrote:
| Oh, yeah. Me too!
|
| Except nowadays I worry more about user names that get fed into
| collaborating applications (with different edit criteria) and
| password characters (again for systems with differing, strange
| edit rules.)
| ineedasername wrote:
| Instead of spaces I just use U+2215
| kazinator wrote:
| Spaces in file names are a poor idea. File names are identifiers,
| not titles.
|
| Let's test something: http://example.com/my silly webpage.html.
|
| Hey look, HackerNews just broke a URL with spaces in it. And it's
| written in a Lisp dialect and all; it's not some Unix job cobbled
| together with shell, sed and awk. The language has a string data
| type, and strings are passed to functions without word-breaking
| interpolations taking place.
|
| You know what else breaks on spaces? Basic everyday gui text
| manipulation.
|
| Suppose that in a block of text we have the sentence:
|
| > Please look for the Holiday Schedule 2021 file.
|
| If you double click on any part of the name like Schedule, pretty
| much every text widget on the planet will just select only that
| word, and not the entire filename.
|
| However, if you have:
|
| > Please look for the holiday-schedule-2021 file.
|
| There is at least a ghost of a chance that a semi-intelligent GUI
| can pick that out as a word.
|
| There exist good reasons to keep identifiers as a clump beyond just
| command line shells.
|
| It's why we need encoding like %20 in URLs that never pass
| through a shell script.
| NelsonMinar wrote:
| Nothing old about that; lots of stuff is still broken. What are
| the odds Homebrew works if installed to a directory with a space
| in the name? Maybe the core brew manager itself, but all the
| packages?
| totetsu wrote:
| It messes with tab completion in bash is why I avoid spaces
| foxfluff wrote:
| I'm hardly afraid but I just think it's poor ergonomics. Same as
| the move from
|
|     xset m 0 0
|
| to
|
|     xinput --set-prop 'pointer:Logitech USB Receiver' 'libinput Accel Profile Enabled' 0, 1
|
| Everything seems to be going this way in Linux land. Longer
| names, harder to type names, camelcase names, spaces... I'm
| looking forward to an OS that treats command line ergonomics as a
| first class feature and where camelcase & spaces are verboten.
| martin-t wrote:
| I find this attitude misguided. More descriptive names are more
| ergonomic for things you only use rarely but they need to be
| combined with much better autocompletion than most shells
| provide by default.
| Too wrote:
| Short option for interactive terminal. Long option in
| automation.
|
| I'll be damned if I have to remember or lookup what -n means
| to some obscure program, when reading someone else's script.
| Exception given for super common tools where everybody knows
| like ls -la.
|
| With the disclaimer that shell scripts, especially ls, aren't
| exactly suitable for reliable automation in the first place.
| foxfluff wrote:
| You state that as if that were objective.. but that's not my
| subjective experience at all. Somehow I have a hard time
| remembering these long names, (is it --conf or --config or
| --config-file or --config-path? -c would've done it for me.
| --set or --set-prop or --set-property or --prop or
| --property?), and I need to look them up in a man page
| anyway, and I make more typos typing them, and shell
| completion rarely works well if at all. I also find it harder
| to read and edit long lines that wrap.
|
| Somehow these short letters stick much better for me, and the
| effort for finding them in the manual is the same, although
| in case of extra complexity as with xinput, it's even worse
| with the long names. I don't use either command often, but
| it's hard to forget xset m. The only thing I remember about
| xinput is that it's a horribly long litany of things which I
| need to look up every time, and the syntax still feels weird.
| me_me_me wrote:
| the most used options for properly written tools have both
| short single char option like -c and long-form version
| --config if you need verbose self-describing option.
|
| If you are using CLI tools off GitHub written by a random
| person, then no wonder you will see non-standard approaches
| to UX.
| sidpatil wrote:
| PowerShell takes an interesting approach in that it
| accepts any truncated variant of a long-form flag as a
| short form, provided it isn't ambiguous (i.e. if the
| interpreter can't decide which long-form flag to expand a
| short-form flag to.)
|
| For example, if a command features a "-ConfigFile" flag,
| valid short-form variants include "-C", "-Co", "-Con",
| "-Conf", and so on. But if the command featured an
| additional flag "-ConfigURL" for example, the
| aforementioned short-form flags would be ambiguous.
| Mindless2112 wrote:
| getopt_long (and thus most GNU programs) works this way. I
| think it's probably a misfeature though since it means
| that adding a new option can introduce ambiguity. Having
| both short (ex. -x) and long (ex. --exclude) options is a
| less problematic solution.
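|
| Python's argparse happens to accept unambiguous prefixes as well,
| so it can stand in for a sketch of the hazard. The option names
| follow the -ConfigFile / -ConfigURL example above, and `make_parser`
| is a made-up helper:

```python
import argparse


def make_parser(with_url: bool) -> argparse.ArgumentParser:
    """Build a parser, optionally with a second, similarly named option."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--config-file")
    if with_url:
        parser.add_argument("--config-url")
    return parser


# With one option, the abbreviation --conf resolves to --config-file.
ok = make_parser(False).parse_args(["--conf", "a.ini"])
print(ok.config_file)

# After adding --config-url, the same command line is rejected.
try:
    make_parser(True).parse_args(["--conf", "a.ini"])
    ambiguous = False
except SystemExit:
    ambiguous = True
print(ambiguous)
```

| A script that worked yesterday breaks only because an option was added.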
| efreak wrote:
| The few scripts that I've written for personal use
| generally lack documentation or help commands of any sort;
| instead, they take all possible straightforward variants I
| can think of for each command (`--config`, `--config-file`,
| `--cfg`, `--conf`, etc). They usually convert everything to
| lowercase before processing, too. It's easier to fail
| safely on too much/too little input than it is to provide
| actual help.
| ufo wrote:
| The shell ought to be able to help with that. There's no
| need to remember if it's --conf or --config if you can
| press --conf<tab>.
|
| One of the things I like about Fish is that by default it
| can tab-complete program options and also shows a one-line
| description of what each of them does. (It grabs that info
| from the man page).
| ori_b wrote:
| So much of computing is dedicated to solving problems
| that could be omitted.
| salawat wrote:
| Seriously. Just get up from the computer and go do
| something else. /s
|
| We computer people are truly an odd bunch.
| Joker_vD wrote:
| I mean, that's precisely my thoughts on copyright and
| licensing in general but what can you realistically do?
| forgotmypw17 wrote:
| Realistically, on an individual scale, you can pretend it
| doesn't exist and go on with living your life?
| Joker_vD wrote:
| I very much would if only that pesky State didn't
| persecute me for that. Apparently, when I refuse to
| acknowledge the copyright and software license terms,
| other people get upset to the point of bringing the wrath
| of that Leviathan of oppression upon me! The nerve of
| some people!
| fouc wrote:
| > and shell completion rarely works well if at all
| foxfluff wrote:
| I just tried fish. xinput --set-[TAB] and nothing.
| Apparently it doesn't understand the standard long-option
| format that is supported by xinput and documented in the
| man page. You have to know to omit the dashes and then
| it'll complete. And it's downhill from there.
|
| Yeah I used to have all kinds of simple as well as
| supposedly sophisticated completion setups with zsh years
| ago but I've given up on it since then. It's always half-
| assed and half the time causes more problems than it
| solves. Same with bash. There are some places where I
| must resist the urge to try complete a filename because
| the shell starts trying to figure out which target it can
| complete from a Makefile in a large build system and just
| freezes. The only practical way out is to interrupt and
| type the command again or wait a stupidly long time.
| There are other issues like completion trying to be smart
| and filtering out things it thinks you don't want to
| complete. Nothing is more frustrating than a shell
| refusing to complete a filename that you know is there.
| throw10920 wrote:
| I run fish. I was able to get long-option completion for
| gcc, polybar, firefox, man, emacs, xrandr, and fish
| itself. The only command I was _not_ able to get long-
| option completion for was xinput. You just picked a bad
| program to try.
| johnchristopher wrote:
| I like the long form version. It helps me remember what it
| does and why. Eg: `iptables --insert INPUT --protocol tcp
| --jump ACCEPT` was more helpful to me than `iptables -I
| INPUT -p tcp -j ACCEPT` when told how to allow TCP traffic.
|
| For everyday command like `ls -l` I don't mind but anything
| more serious I take a more cautious approach.
| tambourine_man wrote:
| I'm with you. Terseness is paramount.
|
| I could never overcome my repulsion for Java and ObjC
| because of that. On the other hand, I feel at home with
| crazy RegEx that look like line noise to most people.
| yepguy wrote:
| I think shells could use something like a built-in
| eldoc[1], in addition to tab completion. It would make
| terse command line interfaces much more usable if you
| could see what the positional arguments were for.
|
| [1]: https://docs.cider.mx/cider/config/eldoc.html
| 8bitsrule wrote:
| I hate .methodNameAsLongAsMyArm as well, but there's the
| opposite extreme:
|
| As a beginner, I liked short variable names. When I came
| back a few months later, I learned my lesson. Years
| later? easier to just start over.
| omnicognate wrote:
| Spaces don't make anything more descriptive, they just cause
| completely unnecessary quoting and escaping hassle.
|
| The amount of time that has been wasted by Windows using
| "C:\Program Files" instead of "C:\Program_Files" far
| outweighs any highly questionable aesthetic benefit IMO.
| ygra wrote:
| On the other hand, how much broken code has been fixed to
| properly deal with paths just because of that? I'd argue
| that to be a major benefit. Same with Windows Vista forcing
| developers to write applications that work properly as a
| non-admin user.
| skohan wrote:
| What's wrong with camelCase? It's easier to type than snake
| thrwyoilarticle wrote:
| There's a tendency away from snake_case and towards kebab-
| case in things you interact with via CLI. Even moreso towards
| nocase.
|
| Programs like Powershell eschew ease of use in CLI for
| readability in scripts.
| pvaldes wrote:
| Snake_case is problematic for including filenames in TeX
| also. This is a big no for me, even if I find it more
| readable than the other.
| JadeNB wrote:
| > Even moreso towards nocase.
|
| Nocase (did I break a rule by writing it that way?) seems
| great when you're enmeshed in the domain and you can see
| the implicit separators, but then someone looks at your
| naming from the outside and you're guaranteed to have an
| 'expertsexchange' in there somewhere.
| thrwyoilarticle wrote:
| oh, fsck
| rk06 wrote:
| Powershell is case-insensitive, so camelCase is only a
| writing preference
| thrwyoilarticle wrote:
| It's still verbose in places
| chrismorgan wrote:
| camelCase is objectively harder to read than snake_case or
| kebab-case, though familiarity can mitigate that.
| skohan wrote:
| I'd argue it's at most a tiny bit harder to read, and a
| _lot_ easier to type. On balance I'd rather avoid making a
| pinky key one of the keys I have to use the most.
| frenchyatwork wrote:
| Having used a lot of all the formats, I'd argue it's a
| lot easier to read and a tiny bit harder to type. For
| typing it's basically just an extra `-`, unless
| your alternative is nocase.
|
| For reading, CamelCase has 2 significant ambiguity
| issues: similarity between I and l, and what do you do
| with acronyms. Acronyms wouldn't actually be a problem if
| everybody just wrote them as they would in snake_case (i.e. only
| capitalize the first letter), but they don't and so it's
| anyone's guess whether you're going to get "Id" or "ID".
|
| There's also a minor issue where if you're on a case-
| insensitive file system it can be a little difficult to
| change casing, but adding/removing underscores is easy.
| skohan wrote:
| Adding an underscore everywhere is horrible! The spacebar
| is huge, and gets your thumbs basically to itself because
| space will be one of, if not the most commonly typed key.
| To replace that with one of the least ergonomic keys
| makes no sense.
|
| And if CamelCase is so hard to read, why is it the norm
| for "high level languages"? Shouldn't those be optimized
| for ease of use?
| frenchyatwork wrote:
| > And if CamelCase is so hard to read, why is it the norm
| for "high level languages"
|
| That's over-selling it a bit. It's more common, but not
| dramatically so. Outside of class names, CamelCase isn't
| the norm for Python, PHP, CSS, HTML. It's also not the
| norm for shell scripting, but shell scripting has
| horrible readability for other reasons.
|
| I believe CamelCase is more common for languages like Go,
| C#, and Java because they grew up in large organizations
| where having god objects/classes with 400 methods is
| kinda normal and having aMethodWithAReallyLongName is
| pretty common. One of the advantages of CamelCase is that
| it does shorten really long names.
| [deleted]
| daneel_w wrote:
| _" On balance I'd rather avoid making a pinky key one of
| the keys I have to use the most."_
|
| And you use something else than your pinky finger for the
| shift key specifically when typing capitalized letters
| for camelCase?
| skohan wrote:
| At least it's where they sit naturally on the keyboard.
| And the shift key is wider specifically so you don't have
| to be accurate with your pinky when you're pressing it.
| The underscore is one of the least ergonomic keys there
| is. And you need _both_ pinkies to do it
| daneel_w wrote:
| I might be misunderstanding. On all layouts I'm familiar
| with the underscore key is directly next to one of the
| shift keys, or left of backspace. Neither layout requires
| the Vulcan death grip. Shift should always be under your
| pinky fingers to avoid contortions.
| skohan wrote:
| On the US layout it is next to the zero key on the top
| row.
| Pxtl wrote:
| imho, the fundamental problem is using space as a delimiter.
| Also, case-sensitivity is a disaster for ergonomics.
|
| If you had comma-delimiting like in an algol-derived language,
| you wouldn't need to quote things with spaces.
|
| edit: also, code is read more times than it is written, so
| optimizing for readability over brevity is generally a good
| move.
| Dudeman112 wrote:
| I could infer a lot about the second and what those params mean
| and what they do.
|
| The first one is some magical incantation.
| zsmi wrote:
| Another interpretation is:
|
| On the first, you think you know what it does, but you're not
| sure. So maybe it gets looked up.
|
| On the second, you know you don't know what it does. So you
| know to look it up.
|
| Personally, I'll take the second. Assumptions during
| debugging are dangerous things.
| foxfluff wrote:
| Sure. One could also make "move-down-one-line" be the
| incantation to move the cursor down a line in vi, but I
| prefer j.
|
| Ergonomics isn't all about making everything self-descriptive
| for someone seeing the thing for the first time. It's about
| making things comfortable to actually use. If it's so long
| and complicated that you can't even remember how to do it,
| it's not very comfortable to use. Even if I could remember,
| xset m 0 0 is still far more comfortable.
|
| And fwiw you still don't know what 0, 1 in accel profile do;
| you need to look that up or take a wild guess, and if you
| want to use that command, you'll also have to know how to
| look up the device because chances are yours is not the same
| as mine. So it's not any less magical in the end, just more
| verbose.
|
| The "cool" thing about the xinput command is that you don't
| even find accel profile in the man page. You gotta look
| elsewhere if you want to understand what it is and what it
| does and what the parameters are.
|
| xset m? Yes, that is documented in the man page.
| Gigachad wrote:
| It should be based on frequency of usage. I can tell you
| that moving down a line in vim is a little more common than
| toggling the mouse acceleration.
|
| I would never even type such a command. I would just copy
| paste it once.
| foxfluff wrote:
| Yeah well, given that mouse acceleration tends to be on
| by default, I need to turn it off every time I'm on a
| fresh install or computer I haven't used before. The last
| time I needed that was yesterday.
|
| I don't want to waste time searching for a command to
| copy-paste when it could just be made short, simple,
| memorable and ergonomic. I could type xset m 0 0 faster
| than I could open a browser and ask google how to disable
| acceleration with libinput. And again: you can't just
| copy-paste the xinput command unless you're lucky enough
| that it matches your device. On my new computer, the
| device has a different name than on my old laptop even
| though it's the same damn mouse.
| TheOtherHobbes wrote:
| It should be, but how would you keep track of usage
| frequency?
|
| At least it would push all the "This switch was added by
| someone playing with UNIX at a university in 1986 and
| hasn't been used since" options to the end of the list.
| ReleaseCandidat wrote:
| > Ergonomics isn't all about making everything self-
| descriptive for someone seeing the thing for the first
| time.
|
| We're talking about `xset`. It doesn't make sense to
| optimize that for usage of more than once a year.
| foxfluff wrote:
| The less frequently I need something, the more
| frustrating it is if it's not short and memorable (or
| easy to look up in the synopsis or built-in help).
| Forgetting and googling a needlessly complicated command
| over and over again every year isn't fun.
|
| xset achieves that perfectly. If I somehow _didn't_
| remember how to set mouse acceleration with it, a quick
| glance at the synopsis immediately tells me. Or I can
| just run the command and it'll tell me:
|     To set mouse acceleration and threshold:
|         m [acc_mult[/acc_div] [thr]]    m default
|
| Zero frustration, and the command is so short and simple
| that I end up remembering it without trying.
|
| This is something I've observed more than once: I easily
| memorize useful sets of one-letter flags even if I can't
| remember or know what they all stand for. This just
| doesn't happen nearly as much with long options. Commands
| like ls -ctrl or ss -nap quickly become part of my
| repertoire even if I don't use them very often, but I
| really couldn't remember ss --numeric --all --processes
| (if I had written that from memory, it could've ended up
| as --num --all --pid or --numeric --any --process), and I
| don't even know what the corresponding long options for
| ls are. In the rare case when I have to deal with an
| option that has no short equivalent, I feel like I have
| to look it up every time if it's been longer than a few
| weeks.
|
| You talk of optimization but I think this is just a very
| basic (and reasonably successful) attempt at sane design.
| It's not like someone had to go far out of their way to
| make this in a manner that isn't batshit insane.
| eloisius wrote:
| But which case should software interfaces optimize for?
| Ergonomics of someone who uses a tool frequently, or
| interpretability for casual by-standers of some out-of-
| context shell command?
| formerly_proven wrote:
| Cue nmcli (CLI for Gnome's NetworkManager) which uses UUIDs for
| everything and (at least a while ago) did not accept partial-
| but-unique UUIDs. Basically goes "nmcli connection up
| 5095665a-d82c-4ae6-8964-283623387941".
| gertlex wrote:
| Weird, I haven't had to do this. Most(/all?) connections have
| nice names you can see with `nmcli c`... and so I can do
| `nmcli c up id DroidNet` and that's pretty dang nice. Pretty
| sure this worked with Ubuntu 14.04 (though, nmcli has gotten
| much more featureful since then)
|
| (The ability to shorthand connection->c and similar is great,
| too; obviously not unique to nmcli)
| apricot wrote:
| By this point, I'm pretty sure there are people at gnome who
| compete to see who will make the stupidest suggestion that
| gets put in production.
| MonkeyClub wrote:
| It's a Gnomespiracy to determine whether worse is actually
| better.
| 8bitsrule wrote:
| Fits right in with COP26. (Could Of Punted?)
| prionassembly wrote:
| apt-get install nmtui # it's better
| O_H_E wrote:
| nmtui is a life saver tbh
| apricot wrote:
| The problem is we're optimizing for "easy to learn" rather than
| "easy to use".
| jjoonathan wrote:
| In a world of broken promises and tool churn, minimizing
| tooling investment isn't laziness, it's a defense mechanism.
|
| This is a lesson I had to learn the hard way, multiple times.
| forgotmypw17 wrote:
| I've learned this lesson too, and I now avoid using any
| tools that have broken backwards compatibility in the past
| 20 years.
| foxfluff wrote:
| That may be a part of the problem but honestly I don't feel
| like all these new crazy interfaces are easy to learn either.
| I mean how do you come up with the litany xinput calls for?
| You need to understand the syntax for specifying a device.
| You need to know that you're to set a libinput property, and
| you need to know the name of that property, and it's not
| documented in xinput man page, and of course you need to know
| the values to pass which again are not documented in xinput
| man page. You can play with --list-props and then take your
| search elsewhere because it is completely opaque and doesn't
| explain what the properties actually do.
|
| I suspect the number of people who figured all that out
| without having to find it by googling / arch wiki / whatever
| is very very low.
|
| Now I'm not gonna say xset is the easiest interface to figure
| out, but the syntax for setting mouse acceleration is right
| there in the synopsis, and if you search down the man page,
| you'll learn a little more (and also if you just run xset
| without arguments, it'll tell you how to set mouse
| acceleration). It might not be the best designed tool but
| it's something I learned back in the day as a teenager just
| by looking at the man page.
|
| I think the real issue is that people nowadays are designing
| these interfaces to be consumed by interactive configuration
| tools, GUI apps, and desktop environments; they're more
| dynamic, more complex, more flexible, but not easier to
| figure out, not for you on the command line. The command line
| is just a last resort. Second class citizen if you will.
| forgotmypw17 wrote:
| alias mouseoff='xinput set-prop 11 "Device Enabled" 0'
| alias mouseon='xinput set-prop 11 "Device Enabled" 1'
|
| Kind of ridiculous if you ask me.
| foxfluff wrote:
| It is, but they actually have a shortcut for that
| (--disable, --enable).
| forgotmypw17 wrote:
| Direct quote from my console:
|
|     $ xinput --disable
|     Segmentation fault (core dumped)
| deckard1 wrote:
| On some level it makes sense. The problem with the command
| line is familiarity.
|
| How often do you reach for iptables? If you're like myself,
| and most home/desktop users, then probably once in a blue
| moon to set it up and then you leave it alone. But a system
| admin? Maybe they touch it a few times a week or month. Every
| time I use iptables I have to relearn how Linux networking
| works.
|
| Similarly, the xset/xinput thing. When I need those tools I
| just create a script or throw it in .bashrc. I adjust the
| settings once and will not touch them again for a couple
| years. It makes sense to have long parameters that are
| _readable_. I can look at my .bashrc and see exactly what
| device is getting adjusted.
| zibzab wrote:
| I've a feeling you will hate powershell
| akersten wrote:
| Needlessly long parameter/command names and the bizarre
| insistence on capital letters are the #1 and #2 reasons I
| detest PowerShell. Like GP, I resent that Linux tools are
| moving in that direction.
| ansible wrote:
| Well, if you think that's bad, behold the recent trend in
| network interface names on Linux.
|
| We started out with 'eth0', 'eth1', etc. Which adapter was
| which could change when adding and removing a network card.
| That was bad, so that prompted the evolution.
|
| Now we have 'enp1s0', 'enp0s31f6', 'enp13s0' and many similar
| variations. These are supposedly more stable across device
| changes. As it turns out, they weren't.
|
| But wait, there is more! Now we have the "predictable names"
| scheme that produces interface names that are even longer, and
| not even slightly easier to remember.
|
| Read about the whole sorry saga here:
|
| https://wiki.debian.org/NetworkInterfaceName
|
| I do get that it is not an easy problem to solve, especially in
| the face of removable network interfaces (like USB Ethernet /
| WLAN). But surely this is not the best we can do.
| foxfluff wrote:
| I was actually ranting about this on IRC last night (yeah now
| my laptop has two enp* interfaces and enx[MAC])..
|
| One thing I like about OpenBSD is that buses are scanned and
| drivers probe in order and there's no race between drivers
| coming up. Unless your hardware is physically tampered with
| or broken, all interfaces come up with the same name across
| reboots. Linux isn't like that (even if you don't touch your
| hardware, interfaces could swap across reboots), so you need
| to do something about it.
|
| As is typical on Linux, the default is unergonomic and if you
| want something nice, you're on your own to make it so.
|
| If you already have userspace daemons responsible for device
| insertion and naming, it really wouldn't have been so hard
| for it to e.g. automatically add a config file / database
| entry for each interface the first time it is seen. So the
| devices that came up as eth0 and eth1 are still eth0 and eth1
| on the next boot; if I unplug eth0 and add a new card, the
| new one would be eth2 because eth0 is still reserved for the
| first card I had.
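|
| A minimal sketch of that reservation idea in Python, not how udev
| or any real daemon does it. The function name assign_name, the
| in-memory registry dict and the MAC addresses are all made up for
| illustration; a real implementation would persist the mapping to
| disk.
|
```python
def assign_name(registry: dict, mac: str, prefix: str = "eth") -> str:
    """Return the reserved name for `mac`, allocating the next free index if new."""
    if mac in registry:
        return registry[mac]
    # Indices already handed out, including those of cards no longer present.
    used = {int(name[len(prefix):]) for name in registry.values()}
    index = 0
    while index in used:
        index += 1
    name = f"{prefix}{index}"
    registry[mac] = name  # the name stays reserved even if the card is removed
    return name


registry = {}
first = assign_name(registry, "aa:aa:aa:aa:aa:01")
second = assign_name(registry, "aa:aa:aa:aa:aa:02")
# The first card is unplugged and a new one is added: its name stays reserved.
third = assign_name(registry, "aa:aa:aa:aa:aa:03")
again = assign_name(registry, "aa:aa:aa:aa:aa:01")
print(first, second, third, again)
```
|
| Keying on the MAC address is itself an assumption; it would not
| work for devices with randomized or duplicated MACs.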
| ReleaseCandidat wrote:
| > add a config file / database entry for each interface the
| first time it is seen.
|
| Ubuntu did that with their persistent-net.rules udev rule.
| That was a part of the PITA of the old naming.
| account42 wrote:
| If network interfaces were files we could just have both short
| names and stable names, like what we have for block devices.
| nocman wrote:
| Missed the 's', it's:
|
| https://wiki.debian.org/NetworkInterfaceNames
| ReleaseCandidat wrote:
| > These are supposedly more stable across device changes.
|
| No. These are stable across reboots. The old eth? weren't.
| And yes, that had been a PITA.
| nomorecommas wrote:
| Long option names are more descriptive, more easily
| distinguished, and easier to remember. Your shell should be
| intelligent enough to provide tab completion for option names,
| assuming it is configured to.
| forgotmypw17 wrote:
| >Your shell should be intelligent enough to provide tab
| completion for option names, assuming it is configured to.
|
| Wait, are you saying that I need to change my shell or config
| to make up for another tool's poor design?
|
| No, thanks.
| Jiro wrote:
| Long option names are more difficult to remember because a
| long option name can be spelled multiple ways and it is
| difficult to remember which spelling is correct.
| Angostura wrote:
| > Long option names are ... easier to remember ... Your shell
| should be intelligent enough to provide tab completion
|
| They are so easy to remember that you need to configure your
| shell to remember them for you?
| [deleted]
| kaba0 wrote:
| IMO, powershell got it right. Yeah, its syntax is strange,
| but it has standard flag usage with proper autocomplete, and
| you can shorten any flag the way you want (eg. fuzzy match)
| if it is unambiguous.
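|
| The same idea exists outside PowerShell. As a small sketch,
| Python's argparse accepts any unambiguous prefix of a long option
| by default (this is prefix matching, not fuzzy matching). The
| tool name "ss-like" and its three options are hypothetical,
| borrowed from the ss example earlier in the thread.
|
```python
import argparse

parser = argparse.ArgumentParser(prog="ss-like")  # hypothetical tool
parser.add_argument("--numeric", action="store_true")
parser.add_argument("--all", action="store_true")
parser.add_argument("--processes", action="store_true")

# Unique prefixes are accepted because allow_abbrev defaults to True.
args = parser.parse_args(["--num", "--al", "--proc"])
print(args.numeric, args.all, args.processes)
```
|
| An ambiguous prefix is rejected with a usage error rather than
| guessed at.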
| throw10920 wrote:
| These changes are meant to make it easier to _read and
| understand_ command-line incantations (and to make them more
| explicit, which is always good), because the command-line
| paradigm, being text-based, imposes an unavoidable trade-off
| between ergonomics and understandability/ease-of-use. It
| sounds like you prefer ergonomics - although I wouldn't be
| surprised if most users would prefer ease-of-use.
|
| Of course, if one doesn't write a CLI to begin with, this
| trade-off doesn't exist - you can have your cake and eat it
| too.
| hackbinary wrote:
| It seems to me that many of the problems associated with spaces
| in filenames are due to the OS assuming that a space signals the
| end of a command or filename.
|
| Maybe we ought to have a different character signify the end of a
| name? Or signify an option section, or the next option section of
| a command?
| bcrl wrote:
| The Amiga supported spaces in filenames in 1985... =-)
| pimterry wrote:
| I work on a complex desktop application, and it's been astounding
| the number of bugs that have appeared over the years triggered by
| spaces and other unusual characters in file names. If you do
| anything with subprocesses or path processing, it's absurdly easy
| to hit in a thousand different ways, over and over again.
|
| Pro tip: rename your development directory (or even better: the
| workspace path in CI) to put a space and/or special characters in
| it.
|
| Forces you to deal with this properly, and immediately ensures
| that every automated test checks this case without you having to
| remember every time. Hasn't been particularly inconvenient, since
| I'm autocompleting it 99% of the time anyway, and I haven't
| shipped a single path parsing bug since.
| franga2000 wrote:
| More importantly than your source files, put your testing data
| on such a path as well. Nobody uses absolute paths in testing
| so it doesn't matter how many spaces your absolute path has if
| your input is "./tests/file1". Put those files in a folder with
| spaces too and throw in a unicode character for good measure.
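|
| One way to make this automatic is to have the test setup create
| its scratch directory with an awkward name. A rough sketch in
| Python; the helper name make_awkward_dir and the file name are
| invented for the example, and it assumes the file system accepts
| non-ASCII names.
|
```python
import os
import shutil
import tempfile


def make_awkward_dir() -> str:
    """Create a temp directory whose name has spaces, a quote and a non-ASCII letter."""
    return tempfile.mkdtemp(prefix="my proj 'é' ")


workdir = make_awkward_dir()
data_file = os.path.join(workdir, "file 1.txt")

# Round-trip some data through the awkward path.
with open(data_file, "w", encoding="utf-8") as fh:
    fh.write("hello")
with open(data_file, encoding="utf-8") as fh:
    content = fh.read()

dir_name = os.path.basename(workdir)
shutil.rmtree(workdir)  # clean up the scratch directory
print(repr(dir_name), content)
```
|
| Any code under test that builds a command line or splits a path
| by hand then sees the hard case on every run.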
| josteink wrote:
| > it's been astounding the number of bugs that have appeared
| over the years triggered by spaces and other unusual characters
| in file names
|
| If you consider spaces "unusual" I would say you haven't
| encountered a single average user in your lifetime. Spaces in
| file names are the single most common thing people have, outside
| programming environments.
|
| As a x-plat developer, the only platforms where I (still)
| regularly encounter these kinds of bugs are platforms where
| solving problems through scripting is common, like Linux, where
| the primary means of operation is through stringly-typed
| statements getting parsed and processed in an untyped fashion.
| It's not very reliable.
|
| On Windows people more often use "real APIs" (because scripting
| doesn't really work as well), but then these problems just go
| away.
|
| Pros and cons, I guess.
| SAI_Peregrinus wrote:
| It's especially funny that it affects Linux so much. Most
| file systems allow everything except `/` and NULL in file
| names. Early AT&T UNIX even allowed NULLs! POSIX shells use
| the IFS variable to perform field splitting, and it defaults
| to <space>, <tab>, and <newline>. The choice to perform field
| splitting by default (particularly with spaces in the default
| IFS set) has caused no end of headaches for developers and
| users.
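|
| The effect can be imitated in Python without a shell. This is
| only an approximation: str.split() on whitespace stands in for
| field splitting with the default IFS, and shlex stands in for a
| POSIX-style parser. The path is a made-up example.
|
```python
import shlex

path = "/home/me/My Documents/notes.txt"

# Naive whitespace splitting, roughly what an unquoted expansion gets
# with the default IFS: the single path falls apart into two words.
naive = f"cat {path}".split()

# Quoting the value keeps it as one word when a shell-like parser reads it.
quoted = shlex.split(f"cat {shlex.quote(path)}")

print(naive)
print(quoted)
```
|
| The first list has three entries, the second has two, which is
| the whole bug in miniature.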
| InfiniteRand wrote:
| It's easy to tell users to make a folder with no spaces if
| you're setting up a global path, however if you have an
| application that runs in user directories things can become
| painful fast. Changing your user name is a pain and can leave
| things inconsistent, but having to handle all the variations in
| people's names with spaces, punctuation, international
| characters, can just be mind boggling.
| ralphc wrote:
| Late '90s I worked on Java software that got installed on
| several Unix platforms, including Linux for IBM mainframes.
| When you deal with the default en/de-coding of Unicode to
| EBCDIC you never have trouble with Java byte encodings ever
| again.
| dheera wrote:
| Or not, which when bugs crop up will teach the businessy types
| to stop putting spaces in their filenames.
| macintux wrote:
| The beatings will continue until morale improves?
|
| Spaces are very useful for readability.
| cerved wrote:
| depends entirely what you're using to browse files
| lifthrasiir wrote:
| While I agree that we should do this in the ideal world, doing
| so will inevitably break other necessary tools so it is
| unworkable for me :(
| Spooky23 wrote:
| Someone should provide the OneDrive/SharePoint people some of
| this religion.
|
| Mysterious character requirements that do not conform with
| Microsoft's OS limits, limits on the fully qualified pathname
| length, etc.
| alpaca128 wrote:
| Seems like MS had the same idea according to an answer in the
| link:
|
| _> Microsoft intentionally made programs install to C:\Program
| Files on Windows 95+ to force programmers to deal with spaces
| in filenames._
| vesinisa wrote:
| Except for programs that were too old / obscure to fix I
| guess. I think at least the Symbian Development Kit was such
| that builds would fail with strange errors if you installed
| it in any path other than the default immediate subdirectory
| of C:\, let alone under "Program Files".
| dr-detroit wrote:
| Plenty of new stuff does this. As long as you're not .net or
| javascript nobody scrutinizes the trash work developers
| charge money for.
| henrikschroder wrote:
| C:\PROGRA~1
|
| Easy fix!
| 8bitsrule wrote:
| At one time there was no number 0. Half of binary was
| missing.
| billti wrote:
| And then to really mess you up and ensure you handle parens
| properly, threw "(x86)" into the mix. (A real pain on some
| REPLs as well as dealing with environment variables).
| lifthrasiir wrote:
| And yet they introduced C:\ProgramData in later versions.
| kitkat_new wrote:
| why "yet"?
|
| one occurrence is enough to make devs care about it
| jjoonathan wrote:
| Imagine if they made programmers put 64 bit DLLs in a
| "System32" directory and 32 bit DLLs in a "SysWoW64"
| directory. That would really keep 'em on their toes!
| kevin_thibedeau wrote:
| They already hamstrung themselves with LONG because DWORD
| just wasn't good enough and now long can't be 64-bit
| either.
| eyegor wrote:
| You should look into the behavior of the
| /windows/sysnative link. It appears and disappears
| depending on whether your process is running as 32 bit or
| 64 bit.
| Karuma wrote:
| Programmers should never put DLLs in those folders... Or
| even ever touch them.
| mastax wrote:
| Except for \Windows\System32\drivers\etc\hosts, of
| course.
| jaywalk wrote:
| I occasionally try to search for the reasoning behind the
| location of the hosts file in Windows, and I always come
| up blank.
| jve wrote:
| https://superuser.com/questions/355297/why-does-windows-
| have...
| HideousKojima wrote:
| They originally copied BSD's network stack, IIRC
| blincoln wrote:
| Maybe it's from back before Windows had a built-in TCP/IP
| stack? If it were a third-party/optional driver, having
| files related to it in a path under system32\drivers
| would make sense.
| mjevans wrote:
| Back around Win 95 when they added networking it was
| based off of (IIRC) BSD's TCP stack and related tools.
| They were an optional 'third party' driver of sorts, but
| shipped by the first party. I'm not positive about WinNT
| or Win3.11 (for workgroups?)
| mixmastamyk wrote:
| I remember adding "trumpet winsock" to Win 3.x back in
| the day. Says '94 for that, and summer of '93 for NT 3.1
| debut:
|
| https://en.wikipedia.org/wiki/Trumpet_Winsock
|
| https://en.wikipedia.org/wiki/Windows_NT_3.1
| drdeca wrote:
| I know that at least like, idk like 3-5 years ago, when I had
| gotten a new windows laptop (windows 7 or 8 I think), setting
| the main account to have the name "" (without the quotes),
| caused some problems with the basic functioning, including, I
| think, with some pre-installed programs,
|
| So, some things were still being handled not quite right
| (whether that's because it shouldn't be allowed to be the
| username, or because programs should handle it being in the
| path, I'm not sure, but probably one of those.)
| [deleted]
| anarazel wrote:
| I just wish they had a decent way to execute programs with
| arguments that might include spaces. But no, every program
| can do argument delineation differently.
| account42 wrote:
| And Microsoft even provides three different slightly
| incompatible ways to parse arguments: CommandLineToArgvW,
| the CRT and cmd.exe.
| [deleted]
| cerved wrote:
| Sure. Microsoft only ever ships features
| hetspookjee wrote:
| I wonder how much global work could have been saved if
| Microsoft also provided a covered interface for all paths in
| the system. Not sure if there is any, but one good
| implementation might save thousands of poor implementations
| required to handle it.
| Too wrote:
| You have %Appdata% and friends.
| moontear wrote:
| You mean like the Environment.SpecialFolder enum?
|
| https://docs.microsoft.com/en-
| us/dotnet/api/system.environme...
|
| There are several other classes that take care of getting
| folders, least of which checking system variables.
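|
| The same "ask the environment, don't hardcode" approach can be
| sketched in Python. This is an illustration only, not an
| equivalent of the .NET API: the function config_dir and the app
| name are hypothetical, and the lookup order (APPDATA, then
| XDG_CONFIG_HOME, then ~/.config) is an assumption.
|
```python
import os
from pathlib import Path


def config_dir(app: str, env=os.environ, home=None) -> Path:
    """Pick a per-user config directory from environment variables.

    `app` is a hypothetical application name; nothing is created on disk.
    """
    if env.get("APPDATA"):            # normally set on Windows
        return Path(env["APPDATA"]) / app
    if env.get("XDG_CONFIG_HOME"):    # XDG Base Directory convention
        return Path(env["XDG_CONFIG_HOME"]) / app
    base = Path(home) if home is not None else Path.home()
    return base / ".config" / app


# Path objects keep "my conf" as a single component, spaces and all.
print(config_dir("exampleapp", env={"XDG_CONFIG_HOME": "/tmp/my conf"}))
```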
| zerr wrote:
| Could you please link the reference?
| lamontcg wrote:
| Then they made poor APIs so that you have to do this to get
| it correct:
|
| https://docs.microsoft.com/en-
| gb/archive/blogs/twistylittlep...
|
| In *nix at least you can call execve or other APIs that take
| a char* argv[] and the whole problem is largely solved and
| you don't need to quote things.
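|
| A small sketch of the argv-array approach from Python, where
| passing a list to subprocess.run avoids building a command
| string by hand. The child program is just the current
| interpreter echoing its arguments, and the file name is made up.
|
```python
import subprocess
import sys

tricky = "my file (final) v2.txt"

# The child just echoes the arguments it received, one per line.
child = "import sys; print('\\n'.join(sys.argv[1:]))"

# Passing a list hands each element to the child as one argv entry;
# on POSIX no shell is involved, so nothing re-splits the string.
result = subprocess.run(
    [sys.executable, "-c", child, tricky, "second arg"],
    capture_output=True, text=True, check=True,
)
received = result.stdout.splitlines()
print(received)
```
|
| On Windows the list still has to be flattened into one command
| line for the child, which is where the quoting rules discussed
| above come back in.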
| ealexhudson wrote:
| I wish they did "User Files" instead of "Users" too, because
| so much software breaks on the home area having a space in
| it.
|
| Not least, it makes writing scripts for various shells and
| getting the quoting rules right an absolute pain as well...
| the_mitsuhiko wrote:
| They used to. The folder was called `Documents and
| Settings` until Win7.
| 323 wrote:
| "Documents and Settings" still exists on Windows 10, as a
| soft link to "Users".
| WalterBright wrote:
| Nothing says progress like renaming all your paths.
| sixothree wrote:
| I know this is completely tangential. But you can Win-R
| and just type Documents and it will load your documents
| folder. Same for downloads, pictures, temp (windows
| temp), and I'm sure many others.
|
| Works from File-Open dialogs and address bars and even in
| the command prompt you can even do "explorer documents".
| thedday wrote:
| Yeah, it's a junction point, but it's also useless. Open
| a command box and CD to it; now what? A file explorer and
| set it as the directory, again, now what?
| 0des wrote:
| You know, this makes me wonder.. tangentially speaking- I
| wonder how hard it would be to rearrange the folder
| structure in linux so that I have something like this:
|
| /Users/{root, user0, user1, ... }...
|
| /System/{Logs, Apps/{opt, container, ...}, Temp, Conf
| ...}...
|
| /Devices/{Mount, sda, sdb, null ...}...
|
| /Boot/...
| Bad_CRC wrote:
| macos does something like that.
| matheusmoreira wrote:
| > I wonder how hard it would be to rearrange the folder
| structure in linux
|
| Restructuring the directories is the easy part. You just
| delete the old tree and make a new one. You can also
| mount procfs and sysfs wherever you want.
|
| The hard part is modifying existing software to work with
| the new tree. So many programs assume you have a
| "standard" file system tree. So many programs assume
| procfs is mounted at /proc. So many programs have
| hardcoded paths. Shared library locations can become part
| of the binaries when they're compiled. It's insane and
| you'd essentially be creating a new Linux distribution.
| anyfoo wrote:
| Is it coincidence that you almost exactly replicated what
| macOS has? Except that /Devices is /Volumes, .../Apps is
| .../Applications. and /Boot is handled differently.
|
| Of course, that's not perfect either, because a) decades
| of changes vs. compatibility have made it less clean in
| certain places, and b) pretty much all the POSIX paths
| still exist for unix-y compatibility, but overall it's
| like that.
| caymanjim wrote:
| You monster!
| 0des wrote:
| Don't even get me started on /usr/local/bin..
| ThaJay wrote:
| You mean "Start Menu"?
| riccardomc wrote:
| Why not just symlink them? You can have best of both
| worlds with relatively little effort.
|
| Make the overlay of your dreams!
| Spivak wrote:
| I mean we're heading there with /usr being your /System.
| Redhat/Pottering are doing heroic work in this space.
|
|     /Users -> /home
|     /System -> /usr
|     /Data -> /var
|     /Config -> /etc
|     /Boot -> /boot
|     /Ephemeral Temp -> /run
|     /Persistent Temp -> /tmp
|
| The only real holdouts are proc/sys/dev which are the
| kernel and mnt/media/opt/srv which are really for the
| user/sysadmin and aren't really used by the OS anymore.
| woodruffw wrote:
| Genuine question: on what systems is `/tmp` persistent?
| Both macOS and Ubuntu 20.04 clear `/tmp` on every reboot
| for me, and I haven't changed the defaults at all.
| earthboundkid wrote:
| All storage is temporary. You just gotta wait long
| enough.
| novok wrote:
| People don't reboot often. Persistent tmp basically means
| it will be cleared in an infrequent manner, so the
| likelihood of it going away 1s after you release your
| file handle is low.
| mike_hock wrote:
| "Persistent Temp" should be /var/tmp. "Persistent Temp"
| is also an oxymoron.
| nybble41 wrote:
| > "Persistent Temp" is also an oxymoron.
|
| It's not an oxymoron to have files which are temporary
| but not limited in scope to a single power cycle. For
| example, you could have a long-running process which you
| want to be able to resume if it's interrupted; /var/tmp
| would be an appropriate place for the state. The data is
| temporary because it will be deleted once the process is
| finished, but you wouldn't want it wiped out by a system
| reset. Generally /tmp is cleared at every reset, and is
| often a tmpfs mount, while files in /var/tmp are
| automatically cleaned up only when they reach a certain
| age.
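|
| A rough sketch of that resumable-state pattern in Python. The
| helper checkpoint_dir, the "job state" prefix and the file
| contents are invented for the example; it assumes /var/tmp
| exists and is writable, and falls back to the default temp
| directory otherwise.
|
```python
import os
import tempfile


def checkpoint_dir() -> str:
    """Prefer /var/tmp for resumable state; fall back to the default temp dir."""
    candidate = "/var/tmp"
    if os.path.isdir(candidate) and os.access(candidate, os.W_OK):
        return candidate
    return tempfile.gettempdir()


# State for a hypothetical long-running job; delete=False keeps it after close.
with tempfile.NamedTemporaryFile(
    mode="w", dir=checkpoint_dir(), prefix="job state ", suffix=".ckpt", delete=False
) as fh:
    fh.write("step=42\n")
    state_path = fh.name

# Later (possibly after a restart) the job reads its state back.
with open(state_path) as fh:
    restored = fh.read()

os.remove(state_path)  # the job finished, so the temporary state goes away
print(restored.strip())
```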
| tremon wrote:
| Except that the FHS says that "data stored in /var/tmp is
| typically deleted in a site-specific manner", and as an
| application vendor you have no control over that site-
| specific clean frequency. On all my systems, /var/tmp is
| a symlink to /tmp and that has never caused any issue.
| nybble41 wrote:
| The FHS is not wrong; cleaning policies are indeed site-
| specific and files placed in any temp directory can in
| principle disappear at any time. (Though, in theory, it's
| not supposed to happen while the files are still "in use"
| by running programs.) Still, historically you could count
| on files in /var/tmp lasting longer than files in /tmp,
| including across reboots.
|
| Nothing will immediately break because you linked
| /var/tmp to /tmp. Whether it causes issues depends on the
| programs that you (or your users) run and how they make
| use of /var/tmp. However, if someone did have to restart
| a long-running process from the beginning because recent
| state information in /var/tmp was not preserved across a
| reset, I would say that is a problem with the
| administration of the system and not the program that
| stored its state there.
| Spivak wrote:
| Basically no one uses /var/tmp for anything (and nobody
| should either). World writable directories are a mistake
| and only continue to exist because apps assume they are
| available.
|
| /tmp and friends are poorly named. They really should be
| /shared or /dmz or /freeforall or something.
|
| * If you need service-specific tmp space use
| RuntimeDirectory or PrivateTmp if your app is hardcoded
| to /tmp.
|
| * If you need service-specific persistent data that goes
| in /var/lib/your-app.
|
| * If you need temp space for your user it's at
| /var/run/user/your-uid.
|
| * If you need more than one user/service to share files
| _but not everyone_ then god have mercy on your soul
| because all options are bad. There sure are a lot of them
| but none of them are at all satisfying.
| account42 wrote:
| > Basically no one uses /var/tmp for anything
|
| Gentoo does, at least by default: https://wiki.gentoo.org
| /wiki//etc/portage/make.conf#PORTAGE_...
| nybble41 wrote:
| Right, /var/tmp is the "Persistent Temp" directory, and
| /tmp is "Ephemeral Temp". The /run directory is for
| _runtime data_ such as PID files, Unix sockets, named
| FIFOs, and generated systemd units--it has a specific
| internal structure and shouldn't be used as a direct
| alternative to the relatively unstructured /tmp
| directory. While both are generally ephemeral tmpfs
| mounts, only /tmp is writable to all users.
| carlhjerpe wrote:
| I'm not sure I'm a fan of the capitalization and spaces,
| other than that I'm all for more self-explanatory names.
| johnlorentzson wrote:
| Why not? That's how proper English text is written. Of
| course there are many programs that can't handle it
| properly (or handle it inconveniently) so in practice it
| might be problematic at times, but otherwise I see
| nothing wrong with it.
| carlhjerpe wrote:
| Generally just because typing it out with tab completion
| in zsh sucks, and I don't see a good solution (if it was
| solved nicely it'd be solved already)
| xeyownt wrote:
| Why compare with English? It's computer domain, it's not
| a book or a poem. It should be clear and unambiguous.
|
| Caps are annoying to type, and difficult to remember (Do
| You Caps, or Do you caps, or DO YOU CAPS, etc).
|
| Spaces are nuisances that bring no benefit. At best we
| should use non-breaking space for filenames, but that
| would be even more atrocious.
| abdusco wrote:
| This is what I want from Linux. Sensible & guessable
| names for newcomers to figure out where to put files and
| programs.
|
| It's frustrating having to spend time to decide whether I
| should install a program in /var or /opt or /usr. What do
| they even mean!
|
| So, I disagree with this convention altogether and use
| /apps or ~/apps now.
| tenebrisalietum wrote:
| The directories that house your executables are read only
| to users other than root, to prevent attacks and
| overwriting them by non-root users.
|
| /var stands for variable data--like log files, cache
| directories, spool directories, etc. You shouldn't put
| executables there. Ideally you should be able to set the
| noexec flag on it.
|
| `/usr` actually exists because the original UNIX
| developers ran out of disk space and had to attach
| another disk. The difference between /bin and /usr/bin is
| not worth it and even Debian symlinks /bin to /usr/bin.
|
| But your _distribution's package manager_ should be
| putting stuff in /bin or /usr/bin, not you. Anything that
| follows the regex "{asterisk}/local{asterisk}" is
| something the system owner can do whatever with. So you
| should be using /usr/local/bin or $HOME/local/bin. I
| don't know why there's no /local off of the root. (One
| thing I do on my own systems is make and use an
| /etc/local although I think you're supposed to use
| something like /usr/local/etc).
|
| /opt is for third party programs that aren't installed
| via your distro's package manager.
|
| If you do this, any customizations you make to a system
| can be easily backed up by copying all dirs with local in
| the name.
|
| There's multiple decades of tradition behind these names,
| but they do date back to the age where actual teletypes
| were used.
| chasil wrote:
| Oh, my young friend, you have no idea what POSIX has done
| to you.
|
| "While no one sane would put newlines in directory names,
| such corruption of the results could lead to exploitable
| vulnerabilities in scripts."
|
| http://www.etalabs.net/sh_tricks.html
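|
| A sketch of why newline-delimited file lists are fragile, in
| Python. The two file names are made up; the NUL-terminated form
| imitates what `find -print0` emits, and split_nul is a
| hypothetical helper.
|
```python
def split_nul(raw: bytes) -> list:
    """Split NUL-terminated records, e.g. the output of `find -print0`."""
    return [name for name in raw.split(b"\0") if name]


# Two files; the first has a newline embedded in its name.
names = [b"./evil\nname.txt", b"./plain.txt"]
newline_output = b"".join(n + b"\n" for n in names)
nul_output = b"".join(n + b"\0" for n in names)

by_line = newline_output.splitlines()  # newline parsing invents an extra "file"
by_nul = split_nul(nul_output)         # NUL cannot occur in a name, so this is exact

print(len(by_line), len(by_nul))
```
|
| NUL and `/` are the only bytes a file name cannot contain, which
| is why NUL is the only safe separator for a list of names.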
| oblio wrote:
| He he.
|
| Want to see true craziness? POSIX file names are just a
| bag of bytes. They don't even have to be text, they can
| be anything (almost), there's no standard text encoding:
|
| https://lwn.net/Articles/325304/
|
| And in typical Open Source fashion, someone actually
| claims it's a feature: https://lwn.net/Articles/325398/
| because hey, you 99.999% percenters can suffer so that I,
| 0.001% percenter can implement my wacky system.
|
| https://xkcd.com/1172/
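|
| For what it's worth, Python copes with the "bag of bytes" on
| POSIX by smuggling undecodable bytes through as lone surrogates.
| A minimal sketch, assuming a POSIX system; the file name is a
| made-up Latin-1 example and nothing is written to disk.
|
```python
import os

raw = b"caf\xe9.txt"  # Latin-1 bytes, not valid UTF-8

# os.fsdecode uses the surrogateescape error handler on POSIX, so
# undecodable bytes survive as lone surrogates instead of raising.
as_text = os.fsdecode(raw)
round_trip = os.fsencode(as_text)

# Lone surrogates cannot be printed as UTF-8; replace them for display.
printable = as_text.encode("utf-8", errors="replace").decode("utf-8")
print(round_trip == raw, printable)
```
|
| The name round-trips byte for byte, but any code that assumes it
| is valid text (logging, JSON, a GUI) still has to handle it.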
| ygra wrote:
| It's basically the same on Windows with NTFS. Just a bag
| of 16-bit words instead of bytes.
| chasil wrote:
| This appears to demonstrate the full range of abuse.
|
|     $ mkdir hold
|     $ cd hold
|     $ cat ../wildname.c
|     #include <stdio.h>
|     int main(int argc, char **argv)
|     {
|       char n[256]; int i,j=0; FILE *fp;
|       for(i=1; i<256; i++) if(i!=47) n[j++] = i;
|       n[j] = 0;
|       if(fp = fopen(n, "w")) { fprintf(fp, "hello world!"); fclose(fp); }
|     }
|     $ cc ../wildname.c
|     $ ./a.out
|     $ ls -l
|     total 16
|     -rw-r--r--. 1 luser lgroup   12 Nov 11 16:32 ???????????????????????????????
|     !"#$%&'()*+,-.0123456789:;<=>? @ABCDEFG
|     HIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~??
|     ?????????????????????????????????????????????????????????
|     ?????????????????????????????????????????????????????????
|     ?????????????
|     -rwxr-xr-x. 1 luser lgroup 8464 Nov 11 16:32 a.out
|
| Just because you can do something does not mean that you
| should.
| oblio wrote:
| It's software. Software's contract is the same as a legal
| contract. And a legal contract mostly says what you can't
| do.
|
| So anything not directly blocked by the software is
| allowed.
|
| Ergo, clear specifications, strict yet flexible types and
| APIs, etc.
|
| Otherwise, it's just bad design.
| 0des wrote:
| Behold! https://en.m.wikipedia.org/wiki/Filesystem_Hierar
| chy_Standar...
| LordDragonfang wrote:
| I feel like it just highlights the problem of how
| antiquated and confusing linux terminology is that so many
| of those reference "single-user mode", used to refer to
| booting into root, when the vast majority of computing
| devices a given user will interact with only have a
| single _actual_ user, making this a confusing and almost
| meaningless distinction to someone not already intimately
| familiar with linux.
| emteycz wrote:
| Yeah, except that tells me nothing useful... The question
| is exactly the same: So where do I install this random
| binary I downloaded from the internet or compiled myself?
| Is it /opt, /usr/bin, /usr/local/bin, or /bin? Where do I
| put the dependencies I compiled for this software -
| /usr/lib, /usr/local/lib, /lib, /opt/lib, /opt/<app
| name>/lib, or what?
| woodruffw wrote:
| Is your account the only account that's expected to run
| the binary? If so, then `$HOME/bin` is a perfectly
| acceptable (albeit not standard) place to put it.
|
| If you expect other users to be able to execute the
| program, then you should put it in either `/usr/bin` or
| `/usr/local/bin`, depending on whether the former is
| already being used by a package manager. `/opt` is
| _generally_ for self-contained software that doesn't
| play nicely with the rest of the system, but _might_
| still be installable through the default package manager.
| megous wrote:
| $HOME/.local is the equivalent of /usr/local for per-user
| stuff.
| mananaysiempre wrote:
| I don't think there's any "official" word on that (the
| XDG spec that defines ~/.local/share doesn't mention
| ~/.local/{bin,lib} IIRC, and the traditional per-user
| entry in PATH seems to be ~/bin), but a fair number of
| people use it this way, yes, including me.
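|
| Whichever per-user directory you pick, it only helps if it is
| actually in PATH. A small sketch of that check in Python; the
| helper dir_on_path and the example PATH value are hypothetical.
|
```python
import os


def dir_on_path(directory: str, path_value: str) -> bool:
    """Return True if `directory` is one of the entries in a PATH-style string."""
    wanted = os.path.normpath(os.path.expanduser(directory))
    entries = [e for e in path_value.split(os.pathsep) if e]
    return any(os.path.normpath(os.path.expanduser(e)) == wanted for e in entries)


example_path = os.pathsep.join(["/usr/local/bin", "/usr/bin", "/home/u/.local/bin"])
has_local_bin = dir_on_path("/home/u/.local/bin", example_path)
has_home_bin = dir_on_path("/home/u/bin", example_path)
print(has_local_bin, has_home_bin)
```
|
| Against a real session you would pass os.environ["PATH"] as the
| second argument.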
| tom_ wrote:
| I started out using $HOME/bin, but a fair amount of stuff
| assumes a /usr- or /usr/local-style folder structure when
| doing make install, so I've settled on using
| $HOME/usr/bin instead, so that programs can create
| $HOME/usr/include and $HOME/usr/share and whatever,
| without trampling on stuff in my home folder.
|
| Can't remember the last time I had a problem arranging
| this. If using autotools, which covers 95+% of stuff,
| it's usually a question of something like "./configure
| --prefix=$HOME/usr".
|
| (If I want to share stuff between users, /usr/local/ is
| of course a better place. macOS is a bit more
| restrictive, so I have a separate user for this, whose
| /usr folder is readable by everybody.)
| woodruffw wrote:
| Yeah, it definitely gets hairier when using anything
| that's more than just a drop-in binary.
| matheusmoreira wrote:
| > $HOME/bin
|
| On freedesktop systems there's the ~/.local directory
| which is supposed to be a mirror of the file system
| hierarchy. Seems like a good place for bin, lib, include
| directories.
| mananaysiempre wrote:
| The standard is, indeed, excessively vague because it was
| written to let many existing implementations be
| conformant as is, though I'd say it's still more helpful
| than many other standards with that deficiency. There's a
| method to it, however:
|
| - Things installed in /, if it's different from /usr,
| are generally not to be touched;
|
| - Things installed in /usr are under the distro's
| purview or otherwise under a package manager, any
| modifications are on pain of confusing it;
|
| - Things installed in /usr/local are under the
| admin's purview and unmanaged one-offs, there are always
| some but overuse will lead to anarchy;
|
| - Things installed in /opt are for whatever is so
| foreign and hopeless in not conforming to the usual
| factoring that you just give up and put it in its own
| little padded cell (hello, Mathematica);
|
| - Everything is generally configured using files in
| /etc, possibly with the exception of some of the
| special snowflakes in /opt; the package manager will
| put config files meant to be edited there and expect the
| admin to merge any changes in manually, and sometimes put
| default settings meant to be overridden by them in
| /usr/share (see below)--both approaches can be
| problematic, but the difficulty is with migrating
| configuration in general, not the FHS as such.
|
| There used to be additional hierarchies like
| /usr/X11R6, and even a /usr/etc on some (non-Linux?)
| systems, but AFAIU everyone agrees their existence makes
| no sense (anymore?), so much that even FHS doesn't lower
| itself to permitting them.
|
| The distinction between / and /usr might appear to be
| pointless as well, and nowadays it might be (some distros
| symlink them together), but previously (especially before
| initial ramdisks were widespread) stuff in / was
| whatever was needed to bring up the system enough that it
| could netmount a shared /usr.
|
| Inside each of /, /usr and /usr/local there is
| _bin_ for things that are supposed to be directly
| executable, whether binary or a script and all in a
| single place; _share_ and _lib_ for other portable and
| non-portable (usually but not necessarily text and
| binary) shared files, respectively, segregated by
| application or purpose; finally, due to the dominance of
| C ABIs and APIs on Unices, the top level of _lib_ also
| hosts C and C++ library files and there's an additional
| directory called _include_ for the headers required to
| use them. Some people also felt that putting auxiliary
| executables (things like _cc1_ , the first pass of the C
| compiler) inside _lib_ was awkward so they created
| _libexec_ for that purpose, but I don't think the
| distinction turned out to be particularly useful so not
| all distros maintain it.
|
| That's it, basically. There are subtler but logical
| points (files _vs_ subdirectories in /etc) and things
| people haven't found an obviously superior solution for
| (multilib and cross environments), and I made no attempt
| to be historically accurate (the original separation of /
| and /usr happened for intensely silly reasons), but
| those are the fundamental principles of the system, and I
| feel it does make sense as a coherent implementation of a
| particular design. Other designs are possible (separation
| by application or package not purpose, Plan 9-ish
| overlays, NixOS's isolated environments), but that's a
| discussion on a different level; the point is that _this_
| one is at the very least internally consistent.
|
| Re the unfriendly names ... I honestly don't know.
| Newbie-friendliness matters, but it's not the only thing
| that does; particularly in a system intended for
| interactive text-mode use, concise names have a quality
| of their own. There's a reason I'm more willing to reach
| for curl and jq rather than for httpx and lxml, for
| regular expressions rather than for Parsec, and even for
| cmd.exe, as miserable as it is, rather than for
| PowerShell.
|
| I feel weird that no HCI people seem to have seriously
| considered the tension between interactive and
| programmatic environments and what the text-mode user's
| experience in Unix says about it, but even Tcl, which is
| in many ways a Bourne shell done right, loses something
| in casual REPL use when it eliminates (as far as
| idiomatic libraries are concerned) short switches. Coming
| up with things like _rsync -avz_ or _objdump -Ctsr_ is
| not very pleasant initially, but I certainly wouldn't
| want to type out the longhand form that would be the only
| possible one in most programming languages (even if I
| find their syntax beautiful, _e.g._ Smalltalk/Self).
| nobody9999 wrote:
| >the original separation of / and /usr happened for
| intensely silly reasons
|
| As I recall, there were _very_ good reasons for
| separating / and /usr (as well as /home and /var). The
| biggest one was that various Unix kernels would panic[0]
| if / was full. But that issue was almost universally
| fixed by 1990 or so.
|
| And netmounts of pretty much _everything_ other than /
| were pretty common for many years, due to the high cost
| of storage.
|
| So no, the reasons weren't silly, they just don't apply
| to more modern systems.
|
| [0] https://en.wikipedia.org/wiki/Kernel_panic
| mananaysiempre wrote:
| OK, I didn't put this completely correctly. The
| _original_ separation of /usr to hold user home
| directories (!) and / to hold everything else was because
| the first RK05 disk ran out, but it makes sense in any
| case. The additional hierarchy under /usr was created
| some time later when space on the first RK05 disk ran out
| _again_ , and while this can be a perfectly sensible
| decision for a single installation on a single site,
| taking it seriously decades later is silly. Neither does
| that mean that there weren't good reasons the split got
| preserved in subsequent systems, just that they couldn't
| have been the same as the original ones; there are no
| netmounts in V6, after all.
|
| (I have an old Unix intro book that describes /usr as
| user home directories, the rest is a second-hand
| retelling[1].)
|
| [1] http://lists.busybox.net/pipermail/busybox/2010-Decem
| ber/074...
| nobody9999 wrote:
| Interesting stuff. Thanks for sharing it!
| emteycz wrote:
| Thank you for the thoughtful reply, the point about
| netmounting shared usr makes it much easier to
| understand.
| aranchelk wrote:
| I was taught /usr/local/bin
|
| /opt is for standalone packages, so if it's a single
| file, no.
|
| /bin is only for stuff needed in single-user mode, so
| probably not (unless that's what the binary is for).
|
| /usr/bin is going to typically contain files installed by
| your package manager and should probably be left
| unaltered by human hands.
|
| The deps I would assume /usr/local/lib but it hasn't ever
| come up for me.
| krinchan wrote:
| Fun fact: Debian is working towards[1] and Arch already
| has merged / and /usr. /bin is a soft link to /usr/bin
| and similarly with sbin and lib.
|
| [1]: https://wiki.debian.org/UsrMerge
| graycat wrote:
| Where?
|
| See my post to this thread at
|
| https://news.ycombinator.com/item?id=29198222
| sergeykish wrote:
| Follow your distribution. For example Arch Linux provides
| PKGBUILDs for official repos and AUR. Most of the time
| someone has already published PKGBUILD, but if not I just
| patch accordingly.
|
| And the conditions that led to the separation are long
| gone; Arch Linux symlinks most of it:
|
|     /bin -> /usr/bin
|     /sbin -> /usr/bin
|     /usr/sbin -> /usr/bin
|     /lib -> /usr/lib
|     /lib64 -> /usr/lib
|     /usr/lib64 -> /usr/lib
| nsv wrote:
| To add: when you install software yourself you choose
| this, when you install software from e.g. a distribution
| package it is chosen by the package maintainers, and to a
| larger extent the maintainers of the distribution.
|
| This is one of the big advantages of using a ready-made
| Linux distribution:
| beyond the convenience of having an installer or easy to
| install packages, you get some assurance that the system
| as a whole has been thoughtfully put together.
|
| Arch Linux for example symlinks /bin and /sbin to
| /usr/bin and /lib to /usr/lib among other things.
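Whether a given machine has this merged layout can be checked directly. A small Python sketch, assuming the Arch-style pairs quoted in the thread (other distributions may link /sbin elsewhere); `resolves_to` is a made-up helper name:

```python
import os

def resolves_to(link, expected_target):
    """True if `link` is a symlink that resolves to the same place as `expected_target`."""
    return (
        os.path.islink(link)
        and os.path.realpath(link) == os.path.realpath(expected_target)
    )

if __name__ == "__main__":
    # Pairs taken from the comments above; adjust for your distribution.
    pairs = [("/bin", "/usr/bin"), ("/sbin", "/usr/bin"), ("/lib", "/usr/lib")]
    for link, target in pairs:
        print(f"{link} -> {target}: {resolves_to(link, target)}")
```

On a system without the merge, the function simply reports False for each pair, because /bin is then a real directory and not a link.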
| matheusmoreira wrote:
| > So where do I install this random binary I downloaded
| from the internet or compiled myself?
|
| In your home directory.
| db48x wrote:
| Wherever you want. All of the above, or none. It really
| is up to you.
| emteycz wrote:
| That's exactly the problem. This leads to mess. The
| Windows model of C:\Program Files\<app name> is much
| better.
| db48x wrote:
| No, it frees you to pick whatever unmessy solution you
| want.
|
| You can do `configure --prefix=/Program\ Files/<app>` if
| you want.
| emteycz wrote:
| If I am not writing all of my installation scripts by
| hand, because that would be really intense, then _every
| folder_ gets filled with random bits of software.
|
| Offering too many similar choices leads to mess. There's
| nothing fundamentally different between using one or more
| of these options and using the only option, except that
| in the second case there isn't any opportunity to make
| mess.
|
| > You can do `configure --prefix=/Program\ Files/<app>`
| if you want.
|
| Thanks for the tip! Can't do that with distro repo
| software though :-/
| db48x wrote:
| > then every folder gets filled with random bits of
| software.
|
| What does that even mean? When you install something, you
| put it where you want it.
|
| If you don't like where your distribution puts files,
| choose a different one. Not all of them use the same
| convention.
| emteycz wrote:
| All (except aforementioned GoboLinux) use FHS.
| kevin_thibedeau wrote:
| Use Gnu Stow to keep the random bits contained in their
| own app directory that is symlinked into the /usr/local
| tree. Then you can manage everything without leaving
| orphan files behind.
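The idea behind that workflow can be sketched in a few lines of Python. This is not how GNU Stow is implemented (Stow also links whole directories where it can and handles conflicts more carefully); the function names and the directory layout in the usage note are assumptions for illustration only:

```python
import os

def link_package(package_dir, target_dir):
    """Symlink every file under package_dir into the same relative place
    under target_dir. Existing files are never overwritten."""
    created = []
    for root, _dirs, files in os.walk(package_dir):
        rel = os.path.relpath(root, package_dir)
        dest_dir = os.path.normpath(os.path.join(target_dir, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in files:
            dest = os.path.join(dest_dir, name)
            if os.path.lexists(dest):
                raise FileExistsError(f"refusing to overwrite {dest}")
            os.symlink(os.path.join(os.path.abspath(root), name), dest)
            created.append(dest)
    return created

def unlink_package(package_dir, target_dir):
    """Remove only those links in target_dir that point back into package_dir."""
    prefix = os.path.realpath(package_dir) + os.sep
    removed = []
    for root, _dirs, files in os.walk(target_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path) and os.path.realpath(path).startswith(prefix):
                os.unlink(path)
                removed.append(path)
    return removed
```

With a hypothetical package unpacked to /usr/local/stow/hello-1.0, `link_package("/usr/local/stow/hello-1.0", "/usr/local")` would populate /usr/local/bin and friends, and `unlink_package` with the same arguments would remove exactly those links again, which is what prevents orphan files.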
| emteycz wrote:
| Very cool
| Shared404 wrote:
| Except instead of config files, Windows has the registry.
|
| Also, as mentioned by the siblings to this comment, the
| 'mess' has a purpose, and is less messy than it appears.
|
| Want to manually install something? Into /usr/local it
| goes. Done.
|
| The only way to handle this that I've been really
| impressed with is Mac's "Applications" folder.
| Unfortunately, I dislike most other things about Mac.
| yjftsjthsd-h wrote:
| When you download a portable app (just a bare .exe), do
| you make a folder for it and drop it in program files?
| (quite possible, you'd just be unusual) If not, why does
| Windows get a free pass?
| drewzero1 wrote:
| Okay, but what about ProgramData? I have enough programs
| that put their junk in there instead of Program Files,
| and others that make their own directories on the root of
| the drive (driver installers are really bad about this).
|
| I think the best model I've seen for consistent binary
| locations is the 'Applications' folder in Mac OS X, but
| it fails as well by retaining the /usr/bin elsewhere.
| tremon wrote:
| But why are many Windows programs under
| C:\Windows\System32 then, if Windows has only a single
| model? Why aren't all Steam-provided (for example) games
| in a single location? Or, if they are, does Windows
| really have a single model?
|
| Yes, the Linux/POSIX model is confusing, but the split is
| to segregate administrative domains:
|
| - / and /usr are the domain of the distribution. As a
| user, you should never install there. The administrative
| group is root.
|
| - /usr/local is the domain of the machine admin. If the
| machine is yours to manage, you can install software
| there. The administrative group is staff.
|
| - /opt/$vendor is the domain of third-party vendors. Each
| vendor (like Steam, Eclipse, Arduino Studio) can get its
| own subdirectory and its own administrative user group.
|
| How would you achieve the same on Windows? How do you
| make sure the Adobe updater can only install new versions
| of CS, but not surreptitiously install a new (free!)
| spyware package under C:\Windows? How would you allow
| certain power users to share one Google Chrome
| installation, allow each of them to update it, but not
| let them install additional software system-wide?
| somehnguy wrote:
| I've read that a handful of times (whenever trying to
| figure out where to put some new random thing), and still
| have never come to a clear conclusion. Even better,
| because there are so many similar places, you might
| choose completely different ones depending on the day of
| the week and your current mood.
|
| Too much choice for things like this is harmful IMO. Deep
| down I truly couldn't care less where the files end up,
| as long as that place is the 'right' place. There are too
| many 'right' places which makes it hard to find random
| things at a later date or when on a box you're not super
| familiar with. It's also a complete waste of time to
| think about it at all.
| graycat wrote:
| > I've read that a handful of times (whenever trying to
| figure out where to put some new random thing), and still
| have never come to a clear conclusion.
|
| So, given some data, say a file and/or directory, maybe
| from saving a Web page, that is relevant to subjects A,
| K, T, and Z, where in the file system directory trees to
| put that data?
|
| My solution: Put the data in a directory for one of the
| subjects A, K, T, or Z without thinking very hard about
| which of these. Then go to a file I call FACTS.DAT
| (right, an old idea with an old 8.3 file name!). I
| maintain that file with a few, simple editor macros. So,
| sure, the file is a catch-all for entries of random short
| _facts_. And each entry starts with a time-date stamp and
| a list of key words. So, in the case of subjects A, K, T,
| or Z, include the key words appropriate for each of
| those. Then in the _body_ of the entry, put the tree name
| of the file/directory where I did store the data.
|
| In a few seconds with my favorite text editor I can
| append an entry or search for an entry.
|
| So far this year I have put 686 entries in the file
| FACTS.DAT for about 2.1 entries per day. For anything
| like current personal computers, handling such a file is
| trivial.
|
| The idea works great!
| NavinF wrote:
| It's not just you: Every distro is its own special
| snowflake and patches the programs they distribute to
| store files in a different place.
|
| The "standard" doesn't tell you what directory structure
| to use inside /etc to group related config files. The
| "standard" doesn't tell you where an HTTP server should
| serve its files. Everyone just does their own thing which
| makes upstream docs incorrect and useless for newcomers.
| stryan wrote:
| > The "standard" doesn't tell you what directory
| structure to use inside /etc to group related config
| files. The "standard" doesn't tell you where an HTTP
| server should serve its files. Everyone just does their
| own thing which makes upstream docs incorrect and useless
| for newcomers.
|
| The FHS does actually answer both of those questions.
| Files inside /etc/ should be grouped in subdirectories[0]
| and the HTTP server should serve user-specified website
| files from /srv[1] and normal distro-provided files (such
| as the apache test page) from /var[2].
|
| [0]: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch0
| 3s07.htm...
|
| [1]: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch0
| 3s17.htm...
|
| [2]: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch0
| 5.html#p...
| NavinF wrote:
| > HTTP server should serve user-specified website files
| from /srv
|
| I've never seen that in my life, but I'm sure someone
| does that. This is one of those cases where the people
| who follow the standard are increasing fragmentation
| elevader wrote:
| "use subdirectories" is probably the most handwavey
| answer possible, aside from maybe "just put it somewhere,
| lol". I feel like the standard could provide some sort of
| guidance on how to name folders or something.
| selfhoster11 wrote:
| GoboLinux does exactly that:
| https://en.m.wikipedia.org/wiki/GoboLinux
| DarkWiiPlayer wrote:
| Dammit, I wanted to be the one to mention gobo linux [HN
| deleted my laughing emoji ffs]
| andai wrote:
| Beat the system ?**?
| 0des wrote:
| Wow, thanks for the reply, nice find! I did some poking
| around on my Linux system and even re-arranging the home
| folder was a task of its own because the system kept
| trying to replace folders in their original places. I
| will do some digging in to Gobo and see how they're
| handling this. Thanks again for pointing this out.
| dotancohen wrote:
| > the system kept trying to replace folders in their
| original places.
|
| This is the file that you want:
|
|     $ cat ~/.config/user-dirs.dirs
|     XDG_DESKTOP_DIR="$HOME/"
|     XDG_DOWNLOAD_DIR="$HOME/Downloads"
|     XDG_TEMPLATES_DIR="$HOME/"
|     XDG_PUBLICSHARE_DIR="$HOME/"
|     XDG_DOCUMENTS_DIR="$HOME/"
|     XDG_MUSIC_DIR="$HOME/"
|     XDG_PICTURES_DIR="$HOME/"
|     XDG_VIDEOS_DIR="$HOME/"
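A program that wants to respect this file has to read it and not hardcode ~/Desktop and the like. A rough Python sketch of a reader, assuming only the simple `NAME="$HOME/..."` or `NAME="/absolute/path"` lines shown above; the function name and regular expression are illustrative, not taken from any XDG tool:

```python
import os
import re

# One assignment per line: XDG_<NAME>_DIR="<value>"
_LINE = re.compile(r'^(XDG_[A-Z]+_DIR)="(.*)"$')

def parse_user_dirs(text, home):
    """Parse the contents of a user-dirs.dirs file into {name: path}."""
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = _LINE.match(line)
        if not match:
            continue  # ignore anything this sketch does not understand
        name, value = match.groups()
        if value.startswith("$HOME/"):
            rest = value[len("$HOME/"):]
            # "$HOME/" with nothing after it means the home directory itself.
            value = os.path.join(home, rest) if rest else home
        result[name] = value
    return result

if __name__ == "__main__":
    path = os.path.expanduser("~/.config/user-dirs.dirs")
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            print(parse_user_dirs(handle.read(), os.path.expanduser("~")))
```

Pointing every entry at `$HOME/`, as in the listing above, is the trick that stops desktop environments from recreating the default folders.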
| yjftsjthsd-h wrote:
| That helps, but be warned that there are still programs
| running around that just hardcode their paths
| kaba0 wrote:
| Cries in nixpkgs
|
| (Anyone who tried to package a program that hardcodes the
| "usual" binary paths knows the pain)
| account42 wrote:
| Doesn't nix itself hardcode the nix store path though?
| kaba0 wrote:
| Afaik there is an option to change it, but it is not
| advisable as that will break the binary cache and you are
| left with compiling everything yourself. This is due to a
| technical limitation in that different packages can
| contain paths everywhere and thus they are inherently
| part of the resulting hash, on which other packages can
| depend.
| lostlogin wrote:
| You're clearly a more capable user than me, but even so,
| take care. The time I accidentally moved /etc has scarred
| me for life.
| tech2 wrote:
| Early on in my Linux-using-life I made the mistake of
| deleting /etc. That was a learning experience like no
| other :)
| simonblack wrote:
| A couple of weeks after moving to UNIX from MSDOS, I
| thought I'd remove lots of unnecessary 'dot-directories'
| from the /tmp directory. I was root as I had no concept
| of being a 'normal user'.
|
| So I ran two simple commands:
|
|     cd /tmp
|     rm -fr .*
|
| and wondered why it was taking so long. <grin>
| account42 wrote:
| At least that doesn't happen today anymore. From bash:
|
| > When a pattern is used for pathname expansion, the
| character ``.'' at the start of a name or immediately
| following a slash must be matched explicitly, unless the
| shell option dotglob is set. The filenames ``.'' and
| ``..'' must always be matched explicitly, even if dotglob
| is set.
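The hazard is easy to reproduce outside a shell: as a plain wildcard, `.*` matches `..` just as well as `.cache`. The Python sketch below uses `fnmatch` to show that, plus a guard in the spirit of the quoted rule. How a real shell expands `.*` varies between shells and versions, so treat this as an illustration of the pattern logic, not of bash's implementation:

```python
import fnmatch

def names_matching(pattern, names, protect_dot_entries):
    """Return the names that match a shell-style pattern.

    With protect_dot_entries=True, "." and ".." only match when the
    pattern spells them out literally.
    """
    matched = []
    for name in names:
        if protect_dot_entries and name in (".", "..") and pattern != name:
            continue
        if fnmatch.fnmatchcase(name, pattern):
            matched.append(name)
    return matched

entries = [".", "..", ".cache", ".X11-unix", "notes.txt"]
print(names_matching(".*", entries, protect_dot_entries=False))
print(names_matching(".*", entries, protect_dot_entries=True))
```

Without the guard the result includes `..`, which from /tmp is the root directory; that is why the recursive delete took so long.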
| acquacow wrote:
| I did that on my NAS a few years ago. I had copied in a
| bunch of directories from a mac and they all had tons of
| dot files in each dir that were showing up on my windows
| machines. I popped open a terminal and did the exact same
| thing and wiped most of the NAS out =P Good thing I had
| it mirrored with my other synology.
| mixmastamyk wrote:
| Since Live CDs/Flash drives were invented, I wouldn't
| worry about this stuff any longer. Certainly have your
| personal files in a centralized location and backed up
| first.
|
| Probably the easiest way to experiment these days is to
| create a VM and make a snapshot, then start knocking down
| walls, just to see when and where the house collapses.
| Then revert and try something new.
| genewitch wrote:
| There's a computer game that deletes random files when
| you make a mistake or lose.
|
| There could be a competition!
| gavinray wrote:
| How do you deal with lack of being able to just point to
| "/usr/lib/include" or other things when saying "here's my
| directory of shared libs"?
|
| This is definitely interesting though, and an improvement
| I would say
| JonathonW wrote:
| GoboLinux symlinks everything into an FHS-ish structure
| under /System/Index/ so you still have a single place
| where binaries/libraries/includes/etc. live. (There are
| also symlinks from /usr/lib, /usr/bin, and others into
| /System/Index/ for compatibility with programs where
| those might be hardcoded.)
| short12 wrote:
| That actually seems like some low hanging fruit to go on
| a commit spree correcting code that hard codes paths
| oblio wrote:
| GoboLinux is old enough to vote in most countries.
|
| So either those low hanging fruits are higher than they
| seem, or we're all just a bunch of dwarves.
|
| My bet is on the second option.
| matheusmoreira wrote:
| It _is_ low hanging fruit as far as the software is
| concerned. Simply parameterize all paths.
|
| Will upstream accept such patches though? Sounds unlikely
| to me.
| biryani_chicken wrote:
| You don't even need to rearrange the folders themselves,
| just show them like that in the file explorer. Same way
| the windows explorer does.
| 0des wrote:
| Do you have any docs on how to do that? Thanks for the
| reply, I look forward to trying that.
| post-it wrote:
| MacOS too. /usr/ and /dev/ and whatnot exist, they're
| just flagged as invisible in Finder. There's a command to
| globally unhide them for those who want to see them.
| e0a74c wrote:
| Couldn't you do it with plain old symlinks?
| gary_0 wrote:
| In the Win95 era, it was "C:\My Documents".
| grishka wrote:
| Huh, spaces. There's way too much software, especially on
| Windows, that breaks when there are Cyrillic characters in
| a path. I'll let you guess how I found out.
| DarkWiiPlayer wrote:
| A friend had the username "Rubén" and jfc it broke
| everything other than windows itself xD
| dhosek wrote:
| The problem isn't the Cyrillic or the é but the fact that
| Windows lets you put those characters in file names in
| non-Unicode encodings which will create sequences of
| bytes which are invalid UTF-8. It's 2021, FFS, stop using
| legacy encodings.
| grishka wrote:
| All win32 functions that accept or return strings come in
| two varieties, with A and W suffixes,
| MessageBoxA/MessageBoxW. The A works with the system
| default 8-bit encoding (cp1251 in case of Cyrillic), the
| W works with unicode in wide chars. There shouldn't be
| much of a problem with string handling if you stick
| exclusively with W functions.
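The mismatch between the two families is visible with nothing more than Python's codecs. In this sketch the directory name is an arbitrary example, cp1251 stands in for the "A" side and UTF-16 for the "W" side; no Windows API is called:

```python
name = "Документы"  # arbitrary Cyrillic directory name used as an example

legacy_bytes = name.encode("cp1251")    # 8-bit code page, one byte per letter
wide_bytes = name.encode("utf-16-le")   # 16-bit code units

def is_valid_utf8(data):
    """True if the byte string can be decoded as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# The code page bytes are not valid UTF-8, so a program that assumes
# UTF-8 everywhere fails on them.
print(is_valid_utf8(legacy_bytes))

# Read under a different code page they turn into mojibake, not an error.
print(legacy_bytes.decode("cp1252"))

# The wide form round-trips regardless of the system code page.
print(wide_bytes.decode("utf-16-le") == name)
```

This is the situation both comments describe: the bytes only mean something if the reader knows which code page wrote them, whereas the wide representation carries no such hidden dependency.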
| ziml77 wrote:
| Using the W functions has been the advice from
| Microsoft's documentation for ages. But people still use
| the A functions because they're easier, especially when
| writing cross-platform software since Windows is the only
| major OS that made the unfortunate choice of having the
| base character type 16 bits wide.
|
| Fortunately the future of the Windows API does look
| better: Microsoft has supported UTF-8 as a process code
| page since Windows 10 version 1903. All you have to do is
| request it in the application manifest and the A functions
| will accept and return UTF-8.
| grishka wrote:
| > since Windows is the only major OS that made the
| unfortunate choice of having the base character type 16
| bits wide
|
| Apple OSes use something they call "unichar" inside
| NSStrings. I'm not 100% sure what it is, but it feels
| like it's the same 16-bit wide character.
| ziml77 wrote:
| It's possible! It seemed like a sensible choice back in
| the early 90s when the answer to making a system for
| global use was UCS-2. I know Java was another one that
| went with that decision.
| mjevans wrote:
| I would rather they added a U suffixed version and better
| still backported that all the way to Win 7. Now in 3-7
| years people can write programs that use the A functions,
| but have to check the version of Windows and refuse to
| run if it isn't new enough.
| colejohnson66 wrote:
| There's been some talk of repurposing the A variants to
| work on UTF-8
| account42 wrote:
| > All you have to do is request it in the application
| manifest and the A functions will accept and return
| UTF-8.
|
| They really should have gone with WTF-8 [0] since the W
| functions generally accept WTF-16 and not just the valid
| UTF-16 subset.
|
| [0] https://simonsapin.github.io/wtf-8/
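The gap being pointed at is the unpaired surrogate: a legal 16-bit code unit in a Windows file name that strict UTF-8 cannot express. A Python sketch of the difference follows. Python's `surrogatepass` error handler is used here as a stand-in; for a single lone surrogate it produces the same three bytes WTF-8 would, but it is not a conforming WTF-8 implementation:

```python
lone = "\ud800"  # unpaired high surrogate: a valid UTF-16 code unit, not valid text

def strict_utf8(text):
    """Encode as strict UTF-8, or return None if that is impossible."""
    try:
        return text.encode("utf-8")
    except UnicodeEncodeError:
        return None

filename = "file" + lone

# Strict UTF-8 has no representation for this name.
print(strict_utf8(filename))

# A WTF-8 style encoding keeps it, so the name survives a round trip.
relaxed = filename.encode("utf-8", "surrogatepass")
print(relaxed)
print(relaxed.decode("utf-8", "surrogatepass") == filename)
```

A program that round-trips file names through strict UTF-8 therefore cannot address every file the wide API can create.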
| DnDGrognard wrote:
| I had a really odd one last year where a Grave I (well
| known brand name) got converted by Office/Excel into a
| Double Grave I.
|
| The double grave I is used by some obscure orthodox
| religious texts.
| kaba0 wrote:
| If you have a username with your full name (plus point if
| you have special characters in your name), you will get the
| whole deal with shitty programs. I'm not sure if it's me,
| but there were cases I simply could not use a program
| installed in such a location, to the point where at my
| previous (admittedly shitty) workplace, we often installed
| software in a root location...
| Matthias247 wrote:
| It not only keeps people on their toes due to the whitespace.
| The folder name is even localized. E.g. with German settings
| there is C:\Programme and C:\Programme (x86).
| spacechild1 wrote:
| You can still use the English names, though.
| 323 wrote:
| Laughs in C:\PROGRA~1\ (try it, still works in Windows 10)
| selfhoster11 wrote:
| Truly lifesaving for when she'll quoting gets in the way.
| kijin wrote:
| You've got a stray single quote in your shell. :)
| selfhoster11 wrote:
| That was a typo, but it seemed like a perfect
| illustration of my point, so I left it in.
| Someone wrote:
| Typo? I would guess it's autocomplete at work. iOS does
| that all the time for me.
| the_mitsuhiko wrote:
| There is no guarantee that the short name has that. In fact
| on a lot of German Windows installations it was PROGRA~2.
| 323 wrote:
| Well, on my disk PROGRA~1 is "Program Files" and PROGRA~2
| is "Program Files (x86)", so still works :)
| floatingatoll wrote:
| That order is not guaranteed consistent across
| installations, however.
| marginalia_nu wrote:
| I wonder if code to this effect has ever been written
| before:
|
|     for (int i = 1; i < INT_MAX; i++)
|     {
|         if (dirExists("C:\\PROGRA~%d\\ProgramName", i))
|         {
| gmfawcett wrote:
| And that, children, is when marginalia_nu unlocked the
| seventh circle of the inferno. Tomorrow we'll read the
| story of how our new demon overlords forced us all back
| to Windows 3.1.
| jagged-chisel wrote:
| Win 3.1? on DOS 6.22? Actually, this sounds like heaven.
| Just don't put it on the public 'tubes.
| floatingatoll wrote:
| Or do. Can't hack a Mac Classic web server!
| marginalia_nu wrote:
| Got to tweak HIMEM.SYS before the slumbering one can be
| awakened.
| aksss wrote:
| PEEK and POKE could break the HIMEM.
| floatingatoll wrote:
| For whatever it's worth, this is a terrible idea, for so
| many different reasons:
|
| https://web.archive.org/web/20100107184218/http://blogs.m
| sdn...
|
| And so, yes, I'm certain someone must have done it,
| because it's clearly bad idea jeans and so Murphy's Law
| says it must exist.
| benibela wrote:
| Can that work for i > 9 ?
| floatingatoll wrote:
| If you mkdir PROGRA~10, yes!
| Someone wrote:
| Apart from what others mentioned, that can only work if the
| file system automatically creates 8.3 names. NTFS does not
| necessarily do that (https://docs.microsoft.com/en-
| us/windows-server/administrati...)
| antihero wrote:
| Shame it wasn't
|
| > C:\Pr[?]og[?]ram Fil[?]es[?]\
| gattilorenz wrote:
| Funny, in the Italian Win9x it is C:\Programmi, which I
| always thought was more convenient because of the lack of
| spaces :)
| dan-robertson wrote:
| On the other hand their case sensitivity behaviour means that
| "cross-platform" Java applications can break if they are run
| on a non-windows platform where opening files is case
| sensitive (unlike on windows)
| 908B64B197 wrote:
| It's actually a feature.
|
| Easier to add a flag to ignore case rather than fix bugs
| where files only differ by case and are therefore
| overwritten on a case-insensitive filesystem.
| sysadm1n wrote:
| > other unusual characters in file names
|
| Saw a few hacks where malware authors used the RTL feature
| (which is baked into Windows) to obfuscate file extensions. It
| looked like .exe.innocuous-document.docx, but was actually
| .docx.innocuous-document.exe
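A defensive check for this trick is short. The Python sketch below flags Unicode bidirectional control characters in a name and reports the extension in storage order, which is what the operating system acts on. The sample name is constructed for illustration and is not taken from any real malware:

```python
# Unicode bidirectional formatting characters.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
    "\u200e", "\u200f",                                # LRM, RLM
}

def has_bidi_controls(filename):
    """True if the name contains any bidirectional control character."""
    return any(ch in BIDI_CONTROLS for ch in filename)

def real_extension(filename):
    """Extension in storage order: the text after the last dot."""
    return filename.rpartition(".")[2] if "." in filename else ""

# Stored as "report" + RLO + "xcod.exe"; a bidi-aware UI may render the
# tail reversed, so the visible name appears to end in ".docx".
suspicious = "report\u202excod.exe"
print(has_bidi_controls(suspicious), real_extension(suspicious))
```

Rejecting or escaping such names at upload or download time is one possible policy; whether that is appropriate depends on how many legitimate right-to-left file names the system has to handle.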
| masklinn wrote:
| This exact vulnerability in most modern code editors just
| made the rounds, allowing smuggling malicious code right
| through review.
| redwall_hp wrote:
| I don't know if it's still a problem, but it used to break
| Python virtualenv badly. If your working directory had a space
| anywhere in the path, it would throw a huge fit and not work.
| Which is problematic when the expected name for a Mac's boot
| drive is "Macintosh HD" (if you ever had a reason to run a
| virtualenv outside of your home directory).
| rossy wrote:
| > _anything with subprocesses_
|
| I'm begging software developers to stop using subprocess APIs
| that take a string argument (system(), child_process.exec(),
| Process.Start(string)) and start using subprocess APIs that
| take an array of arguments (execvp(), child_process.execFile(),
| Process.Start(string, IEnumerable<string>).)
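Python's `subprocess` module has the same split, and the list form is the one that survives spaces. The sketch below launches a child Python process that reports how many arguments it received; the path is an arbitrary example and `count_args_received` is a made-up helper name:

```python
import shlex
import subprocess
import sys

def count_args_received(path):
    """Pass `path` to a child process as ONE list element and return
    how many arguments the child saw."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(len(sys.argv) - 1)", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)

# Array form: no shell is involved, so spaces, quotes and dollar signs
# inside the element are just data.
print(count_args_received("/mnt/Macintosh HD/My Project/data file.txt"))

# String form: a POSIX shell would split the same text into several words.
print(shlex.split("cat /mnt/Macintosh HD/My Project/data file.txt"))
```

The string form can be made safe with careful quoting, but every caller has to get that quoting right for every platform's shell, which is exactly the failure mode the comment is complaining about.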
| AlfeG wrote:
| Today I learned that you cannot install Tailscale on Windows if
| the installer is inside a path with non-Latin chars.
| mwcampbell wrote:
| My favorite filename special character bug was when I
| implemented CD ripping in 2005, and one of our beta testers
| ripped a CD with a song called "Have You Ever?". My code wasn't
| prepared to filter out the question mark on Windows.
| mixmastamyk wrote:
| I just hit the one where an album folder ends in a period.
| Rsync copies every time because the period is dropped by the
| filesystem silently. :-/
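Both bugs come from Windows naming rules: a handful of characters are reserved, and trailing dots and spaces are dropped. A Python sketch of a sanitizer that covers those two rules follows; it deliberately ignores reserved device names such as CON or NUL and path-length limits, and `windows_safe_name` is a made-up helper name:

```python
import re

# Characters Windows does not allow in a file name, plus control characters.
_RESERVED = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

def windows_safe_name(name, replacement="_"):
    """Return a version of `name` that avoids reserved characters and
    trailing dots or spaces."""
    cleaned = _RESERVED.sub(replacement, name)
    cleaned = cleaned.rstrip(". ")
    return cleaned or replacement

print(windows_safe_name("Have You Ever?"))
print(windows_safe_name("Greatest Hits Vol."))
```

Applying the same function on every platform, not only on Windows, keeps names stable when a library is later copied between systems, which would also avoid the repeated rsync transfers described above.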
| Foobar8568 wrote:
| Let's not forget carriage returns in filenames within apps...
| shane_b wrote:
| My Mac is formatted case sensitive when the default is case
| insensitive. This will also catch a ton of import related bugs.
|
| League of legends doesn't run until I sed files for instance.
| deckard1 wrote:
| I have coworkers on Mac that write node/JS code. Every once
| in awhile I'd pull down the latest code and it wouldn't run.
| I'm on Linux.
|
| Sure enough, they had SomeFile and were importing Somefile
| and it works fine on Mac but not on Linux (which, of course,
| is what our production servers use). It amazes me that "works
| fine on my machine" is still a thing when I definitely worked
| at companies that solved this back in the 2000s. It was
| solved. It was done. Then devs became enamored with running
| everything locally. Even dozens of microservices or
| databases. Even though JS is fairly isolated, you still have
| NPM packages that need built against the local OS and C/C++
| library and compilers, etc. Which also has caused issues in
| the past.
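This class of bug can be caught on the case-insensitive machine itself, because directory listings still return names with their stored spelling. A Python sketch of such a check, which could run in a pre-commit hook or in CI; the function name is made up and the handling of `..` components is left out for brevity:

```python
import os

def exists_with_exact_case(root, relative_path):
    """True only if every component of relative_path appears in its
    directory listing with exactly this spelling."""
    current = root
    for part in relative_path.replace("\\", "/").split("/"):
        if part in ("", "."):
            continue
        try:
            # listdir returns names as stored, even on a case-insensitive
            # file system where open() would accept any capitalization.
            if part not in os.listdir(current):
                return False
        except (FileNotFoundError, NotADirectoryError):
            return False
        current = os.path.join(current, part)
    return True
```

Running every import path in a project through a check like this gives the Mac developer the same failure the Linux developer would see, before the code is pushed.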
| jatone wrote:
| my favorite is often being the only developer on linux and
| giving two files with different casing and watching their
| systems crash and burn.
| speedgoose wrote:
| Good news, we have solutions. You could use continuous
| integration and software containers like Docker.
| fouric wrote:
| Does Docker abstract filesystem behaviors like this? I
| always thought that it stopped at the libc level - that
| is, libc is included in the container, but it calls the
| host kernel's system calls, and so inherits the host
| kernel's behavior (including things like underlying
| filesystem case sensitivity).
| handrous wrote:
| Docker relies on LXC, so it's Linux-only. On other
| platforms it runs in a Linux VM. The host for Docker,
| then, is Linux no matter where you are.
| zokier wrote:
| > Docker relies on LXC, so it's Linux-only.
|
| Docker hasn't supported LXC since 2016, and stopped
| relying on it in 2014
|
| https://www.docker.com/blog/docker-0-9-introducing-
| execution...
| handrous wrote:
| I thought the name for the collection of _kernel
| features_ was LXC, I didn't realize (until just now)
| that was the name _only_ for the also-kernel-level
| _wrapper_ for those features, which name does _not_ cover
| the features themselves. That is, I didn't realize that
| LXC is to Cgroups+Namespaces as Libvirt is to KVM--I
| thought LXC, as a label, covered the whole feature-set--
| but regardless, it's still married to Linux kernel
| features and runs on other platforms under
| virtualization, no?
| zokier wrote:
| > it's still married to Linux kernel features and runs on
| other platforms under virtualization, no?
|
| Actually no. At least on Windows Docker can do native
| Windows containers too
|
| https://poweruser.blog/lightweight-windows-containers-
| using-...
| [deleted]
| dunham wrote:
| Circa Y2k, I learned that the OSX Palm Pilot software didn't
| work with case-sensitive file systems. I've since given up and
| stuck with the default. (I'm anti-case folding in general,
| because of the ambiguity.)
| mdaniel wrote:
| I also enjoyed doing that, but had to make a DMG just for
| Steam because it straight-up refuses to run on a case
| sensitive FS (that's true on Windows, also, which I suspect
| is how we all got here). I think the most recent Steam
| versions either caught wind of my trickery or -- more likely
| -- run something from $HOME/Library/SomethingOrOther and thus
| the work-around no longer works
|
| When I got a new Mac, I just gave up and acquiesced to the
| case-retentive world :-(
| memsom wrote:
| I once returned a printer because the Mac driver and support
| software expected and enforced case insensitive access and
| basically couldn't install properly on my case-sensitive HFS+
| volume. It half installed and blatantly just didn't work in
| any way when installed.
| NegativeLatency wrote:
| Adobe software used to refuse to install on case sensitive
| file systems back in the not too distant past.
| agumonkey wrote:
| See the recent article about unicode invisible glyphs in
| JavaScript or bash.
|
| Naming freedom needs a stdlib module
| kitkat_new wrote:
| Pro tip2: Use std lib path processing utilities
| idatum wrote:
| Somewhat related to injecting unusual characters, in my
| experience in localization efforts:
|
| Inject a Turkish 'I'. I don't know how to type or paste it
| here, but picture an English lower case 'i' that is upper case.
| It is a splendid way among many to shake out some loc bugs.
| gus_massa wrote:
| İ
|
| From https://en.wikipedia.org/wiki/%C4%B0
| ygra wrote:
| That would only shake out anything if you'd also test in a
| Turkish locale, wouldn't it? Since Unicode casing rules are
| locale-dependent and en-US doesn't care much about dotless i
| or dotted i.
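As a rough illustration of why the dotted capital I shakes out bugs, here is a small Python sketch. It relies on Python's `str.lower()` and `str.upper()`, which apply Unicode's default (locale-independent) case mappings; a runtime that honours a Turkish locale would behave differently, so treat this as one data point and not a statement about every platform.

```python
dotted_capital = "\u0130"  # İ, LATIN CAPITAL LETTER I WITH DOT ABOVE
dotless_small = "\u0131"   # ı, LATIN SMALL LETTER DOTLESS I

# Default Unicode lowercasing turns İ into "i" followed by U+0307
# (COMBINING DOT ABOVE), so the string grows from one code point to two.
lowered = dotted_capital.lower()
code_points = [hex(ord(ch)) for ch in lowered]

# Uppercasing the result gives "I" + U+0307, not the original İ,
# so a lower/upper round trip does not restore the input.
round_trip = lowered.upper()

# The dotless ı uppercases to a plain ASCII "I" under the default rules.
dotless_upper = dotless_small.upper()

print(code_points, round_trip == dotted_capital, dotless_upper)
```

Even without a Turkish locale, code that assumes case conversion preserves string length, or that lowercasing and uppercasing are inverses, can trip over this character.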
| jeffwask wrote:
| It doesn't even have to be complex, often basic automation
| tasks fail with spaces and special characters. Honestly,
| treating a file system like a natural language processor is a
| bad idea. Besides at this point with how digital we have all
| become who can't understand...
|
| thisismyconfig.txt vs this is my config.txt or
| this_is_my_config.txt
|
| ...I've forced myself to stop using spaces, special characters,
| and even caps. They are all constructs that provide minimal
| value for the extra complexity.
| rch wrote:
| I'm similar, but I would like to support labels intended for
| humans, along with various translations, as metadata on top
| of e.g. filesystem path components.
| fouric wrote:
| You nailed it - getting rid of spaces and dashes and
| underscores is extremely human-hostile. People added spaces
| to the English language for a reason, and that's because
| they make it way easier to read.
|
| Your system is only intended for other programs to interact
| with? Go nuts, make hex UUIDs. Actual people are supposed
| to use it? You need separator characters.
|
| I also don't see how those characters add "extra
| complexity" unless you're doing dumb things like text
| processing on paths and filenames (as opposed to using
| OS/library functions that handle paths correctly) - in
| which case, there's your problem.
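To make the contrast between text processing on paths and library path handling concrete, here is a hedged Python sketch. The file name is taken from the example earlier in the thread; everything else (the backup naming, the child command) is made up for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "this is my config.txt"
    config.write_text("key=value\n", encoding="utf-8")

    # Derive a sibling name with pathlib instead of splitting on "." or " ".
    backup = config.with_name(config.stem + " (backup)" + config.suffix)

    # Passing an argument list means no shell ever re-splits the path on
    # spaces; the child process receives it as a single argument.
    result = subprocess.run(
        [sys.executable, "-c",
         "import sys; print(open(sys.argv[1]).read())", str(config)],
        capture_output=True,
        text=True,
        check=True,
    )

print(backup.name)
print(result.stdout.strip())
```

The spaces only become a problem when a path is flattened into a single command string and parsed again; keeping it as a `Path` object or a list element sidesteps that step entirely.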
| long_time_gone wrote:
| > thisismyconfig.txt vs this is my config.txt or
| this_is_my_config.txt
|
| Just wondering, what is the readability of this for people
| who are dyslexic?
| JCharante wrote:
| I'm not sure, but my gut instinct is that it wouldn't help.
| Dyslexia rates are much lower in China, so I suppose we
| could start naming files with Chinese characters (on
| systems that support Unicode). It would take a bit to get
| used to, but eventually we'd develop a pidgin language for
| when we talk about software, much like how if you overhear
| Chinese or Vietnamese developers they will mix in English
| words like "linked list" into their sentences, because
| there's not a more natural sounding alternative.
|
| Switching to Chinese would also help eliminate the spaces
| issue.
| reaperducer wrote:
| Or in my case, people for whom English is a second
| language, or have low education levels.
|
| Saying, "who can't understand..." is arrogant, selfish, and
| an example of why normal people hate people in the SV echo
| chamber.
| jeffwask wrote:
| cestmaconfig.txt vs cest ma config.txt vs
| cest_ma_config.txt
|
| It's the same in any language.
|
| Hugs who hurt you.
|
| I'm also pretty sure most of us in any language use
| Slack, SMS or other forms of communication where text
| isn't necessarily presented in a grammatically correct
| manner and we all figure out what the person is saying.
| throwaway2077 wrote:
| SV echo chamber is on your side here - it is very in
| vogue to denounce anglocentrism. they were defending
| hieroglyphs and emoji in variable names in that thread
| about invisible javascript backdoor a day or two ago if
| you'd like a recent example
| dang wrote:
| Could you please stop posting ideological battle comments
| to HN? We ban accounts that do that, regardless of their
| ideology, because it's (a) not what this site is for, and
| (b) destroys what it is for.
|
| If you wouldn't mind reviewing
| https://news.ycombinator.com/newsguidelines.html and
| taking the intended spirit of the site more to heart,
| we'd be grateful.
| danlugo92 wrote:
| Agreed.
|
| But Hacker News should do something about all of the
| anti-bitcoin and anti-anti-nuclear ideologies running
| around in here.
|
| I don't really mind it that much but it'd be nice, it's
| really the only 2 extremisms I've experienced here, all
| other subjects are discussed in a fair manner.
| beambot wrote:
| I appreciate informed discussion about bitcoin & nuclear,
| as both topics are highly relevant to the technical,
| business, and hacker roots of HN. They seem distinctly
| different from, say, "anglocentrism" @dang was calling
| out.
| danlugo92 wrote:
| > discussion
|
| There's no such thing as fair discussion about those
| topics here.
| Hallucinaut wrote:
| What does "fair" mean in this usage? If it means one
| position attracts a lopsided balance of comments either
| for or against then surely that's always going to be the
| case?
|
| Otherwise what is your proposition, don't state any
| opinion unless you find a counter opinion commenter to
| match with?
|
| Lots of folk here are pro-privacy and lots of folk are
| anti-bitcoin (and some of them will be the latter because
| they're the former) so I don't understand how you'd
| extend your position in a way that leaves HN with any
| value.
| long_time_gone wrote:
| > Saying, "who can't understand..." is arrogant, selfish,
| and an example of why normal people hate people in the SV
| echo chamber
|
| Exactly how I feel every time Economics is brought up on
| HN.
| teorema wrote:
| tbh I'm not dyslexic and realized the spaces make it really
| difficult to know what the filename actually is. If you
| just take the second example, how would you know if the
| file was "this is my config.txt" versus "config.txt"?
|
| Aside from parsing errors it just seems to lend itself to
| ambiguity.
| vertere wrote:
| This. People are saying spaces improve ergonomics. Unless
| everyone always quotes their paths in documentation,
| emails, etc -- which they won't -- I say it actually
| reduces readability.
|
| Also, programs that automatically turn paths into links
| don't work with spaces.
| 400thecat wrote:
| > treating a file system like a natural language processor is
| a bad idea
|
| could you please explain what you mean by that?
| Too wrote:
| Why stop there. A computer works more efficiently with
| numbers rather than strings, so let's just give each file a
| number instead of a string. Besides, at this point with how
| digital we have all become who can't understand... But wait,
| that already exists and is called an inode.
|
| A file system has a human interface and a computer interface.
| Don't mix them. Let users give file names in whichever way
| they please.
| KronisLV wrote:
| > Pro tip: rename your development directory (or even better:
| the workspace path in CI) to put a space and/or special
| characters in it.
|
| This will also break any code in external tools that are called
| during the builds of your application and do not handle spaces
| correctly for whatever reason, thus making it so that you won't
| be able to successfully finish the build.
|
| Then again, you probably shouldn't be relying on technologies
| like that, but when you're struggling to keep an old enterprise
| system alive, causing yourself more problems is not necessarily
| what you should do.
|
| Still a good idea in most cases, though.
| wldcordeiro wrote:
| Even capitalization is a pain in the ass thanks to how OSes
| treat file names. I pretty much stick with either
| `file-name.ext` or `file_name.ext` exclusively now.
| BiteCode_dev wrote:
| > Pro tip: rename your development directory (or even better:
| the workspace path in CI) to put a space and/or special
| characters in it.
|
| The problem with that is that YOUR code may handle it, but your
| tooling may not. If my code formatter breaks on spaces, I'm not
| going to change the formatter.
| ChrisSD wrote:
| You could submit a PR to their repo.
| BiteCode_dev wrote:
| I could submit a PR to 5 tools a week on average. I
| actually have the time and resources to do it once a year.
|
| Last week I opened a ticket for a Firefox bug. Following up
| on the bug took me 2 hours in total.
|
| FOSS is not free, you pay it with your time. And as with
| everything you pay for, we all have a budget.
| echelon wrote:
| Better solution: only allow ASCII, maybe dashes, and up to
| twelve characters. Problem solved.
|
| Enforce this in LDAP.
|
| Strict convention is better than flexibility and predicting
| obscure edge cases that can fail.
| pimterry wrote:
| In my case, and for many people writing desktop software, and
| for absolutely everybody writing open-source tools or
| libraries, unfortunately you can't control the environment.
|
| Non-ASCII paths are extremely common (e.g. the user's home
| directory on Windows, for the large majority of users outside
| the English-speaking world) and spaces, punctuation and
| weirder characters will definitely happen when you least
| expect it.
|
| Yes if you can avoid it then absolutely that's great, but I
| don't think most people can.
|
| It's also not usually very difficult to deal with, as long as
| you actually spot the issue in the first place.
| MayeulC wrote:
| Ah, that's the enterprise edition.
|
| But then your program will crash hard and unexpectedly when a
| user decides to save under "~/house plans" or
| ~/Telechargements.
|
| I think it's better to exercise this in CI, that's what CI is
| for.
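A sketch of what exercising this in CI could look like in Python: run the code under test inside a temporary workspace whose path deliberately contains a space and accented characters. `count_lines` is a hypothetical stand-in for whatever function you actually ship, and the directory names are only examples.

```python
import tempfile
from pathlib import Path


def count_lines(path) -> int:
    """Hypothetical function under test: count the lines in a text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


# The prefix puts a space and a non-ASCII letter into the workspace path.
with tempfile.TemporaryDirectory(prefix="house plans é ") as workspace:
    target = Path(workspace) / "Téléchargements" / "notes file.txt"
    target.parent.mkdir()
    target.write_text("one\ntwo\n", encoding="utf-8")
    line_count = count_lines(target)
    had_space = " " in str(target)

print(line_count, had_space)
```

This assumes the CI filesystem accepts non-ASCII names, which is typical but not guaranteed; if it does not, that failure is itself useful information.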
| mikepurvis wrote:
| Ugh, we have the 15 character Active Directory limit now with
| hostnames, and a previous IT administration has imposed a
| convention that every name had to follow
| [prod|dev]-[ph|vm]-[service]-[nn]. So basically every
| production service is prod-vm-owtf-01-- you get exactly four
| characters to actually describe what the machine does. Works
| great when the service is "jira" or "wiki", but there are a
| lot that are pretty mystical-sounding, like jkns, jwrk, cntr,
| hrbr, etc, where you kind of just have to know.
| icedchai wrote:
| Do they at least allow you to set up CNAMEs?
| mikepurvis wrote:
| Yes, and for many of the web-serving machines, that's
| what happens, they're jenkins.example.com or
| containers.example.com or similar. But often a singular
| service is backed by hidden worker nodes, databases,
| whatever else, and it seems silly to give those machines
| that level of indirection vs just using the hostname as
| their sole identifier.
| HNo wrote:
| I kind of like that honestly. No doubt you need some
| documentation so everyone knows what the service
| abbreviations are, but after you've been working there for
| a month you get it. Makes everything clean, consistent, and
| informational. You can quickly ascertain what a specific
| host is doing just from the name.
| mikepurvis wrote:
| Oh absolutely it makes sense to have a standard, and
| being able to tell at a glance if something is a VM or
| physical machine is of value also. But dedicating 2/3s of
| the character budget to such a scheme is madness. If the
| prod-vm- prefix simply became pv-, then you'd at least be
| able to do pv-jenkins-01 again.
|
| Anyway, all this was fine when we were on LDAP rather
| than Active Directory. So basically it's all Windows'
| fault.
| reaperducer wrote:
| _only allow ASCII, maybe dashes, and up to twelve characters.
| Problem solved_
|
| ...and only hire people from the exact same background as
| you, who will never have unusual characters or accents in
| their name. And also make sure not to have any users who
| aren't exactly like you, and conform to this very narrow
| requirement. Surely, excluding 90% of the world won't hurt
| revenue in any way.
| stopagephobia wrote:
| This is not excluding? I just use an ascii canonicalized
| version of my name and works fine.
| badsectoracula wrote:
| You can use an "ASCII-fied" version of the name, only ~27%
| of mine can be typed in ASCII letters that look similar but
| the rest is just phonetically or visually close-enough
| letters. This is something people did for decades and
| nowadays even government IDs have an ASCII-fied (well,
| Latin-fied) version of the name.
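For readers wondering what an automated "ASCII-fied" form might look like, here is a deliberately naive Python sketch using the standard `unicodedata` module. `asciify` is a hypothetical helper; it only strips accents that Unicode can decompose, and it is no substitute for the human-chosen transliterations the comment above describes.

```python
import unicodedata


def asciify(text: str) -> str:
    """Hypothetical helper: drop combining accents after NFKD decomposition
    and replace anything still outside ASCII with '?'."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.encode("ascii", "replace").decode("ascii")


accented = asciify("Téléchargements")  # accents decompose and are dropped
stroked = asciify("Łukasz")            # Ł has no decomposition, becomes "?"

print(accented, stroked)
```

The second result shows the limit of the approach: letters such as Ł, and entire non-Latin scripts, need a real transliteration table or a person's own preferred spelling.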
| echelon wrote:
| Snarky, but I'll take it.
|
| Use strict schema for the hardware interface, networking,
| physical stuff the user never sees. Microservice names
| don't need to be non-Latin. Database replicas,
| infrastructures, etc. And you're not going to piss off
| employees by giving them ASCII ldap/email addresses.
|
| Use utf8mb4 or similar for storing names. Don't state
| "first" or "last". I've been through this rodeo too many
| times. You're not surprising anyone.
| numpad0 wrote:
| UTF-8 strings aren't reproducible anyways. User ID should
| be strictly for identification, be alphanumeric random
| string if necessary.
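One reading of "not reproducible" is that the same visible text can be encoded as different code point sequences. The Python sketch below illustrates that interpretation with the standard `unicodedata` module; whether this is exactly what the commenter meant is an assumption.

```python
import unicodedata

composed = "caf\u00e9"     # é as a single precomposed code point
decomposed = "cafe\u0301"  # e followed by COMBINING ACUTE ACCENT

# The two strings render identically but compare unequal, and their
# UTF-8 encodings have different lengths.
equal_raw = composed == decomposed
byte_lengths = (len(composed.encode("utf-8")), len(decomposed.encode("utf-8")))

# Normalizing both sides to the same form makes them comparable.
equal_nfc = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

print(equal_raw, byte_lengths, equal_nfc)
```

If user-supplied text is ever used as an identifier, normalizing it once at the boundary avoids two accounts or files that look the same but do not match.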
| chris_wot wrote:
| And yet OneDrive won't allow for spaces before or after a file
| name.
| alx__ wrote:
| I spent hours trying to figure out why an entire folder
| suddenly stopped syncing. Turns out I accidentally added a
| hidden space to the end of a folder name.
| wongarsu wrote:
| > Pro tip: rename your development directory
|
| I changed my username to not contain a space because it was too
| annoying to deal with all the random dev tools breaking. The
| worst offender was probably npx on Windows [1] (resolved after
| four years by deprecating npx), but it was far from the only
| one (though the JS ecosystem was somehow the worst in this
| regard of all languages I worked with).
|
| 1: https://github.com/zkat/npx/issues/100
| kermire wrote:
| Same, even I had to rename my user folder to not have a space
| because so many tools were breaking.
| qwertox wrote:
| In that case, be thorough and insert a Chinese and an Arabic
| character to enforce a Unicode check.
| cduzz wrote:
| And add an emoji, a character in a right to left language ( )
| and perhaps Tai . Maybe italicize one of those too...
| dr-detroit wrote:
| there are things you can't do in .net that you need the old
| Registry commands for and those don't accept spaces
| uberswe wrote:
| I did something similar on accident. I used to keep all my
| development work synced with Dropbox and I had a work and a
| personal account. So any of my own projects would have /Dropbox
| (Personal)/ in the path which did catch some bugs. Dropbox
| renamed my folder to "Dropbox (Personal)" automatically when
| connecting a work account.
| achn wrote:
| I maintain a similar system, where a variety of companies
| submit files that get processed through multiple services - it
| is astounding how ridiculous people's naming of files can be;
| spaces are the least concerning!
| 5faulker wrote:
| For those purposes I've found hyphen to be a nice substitute.
| WalterBright wrote:
| Sometimes / works as a path separator in Windows, sometimes it
| doesn't. It's not predictable.
|
| I never use / on Windows as a result.
| ygra wrote:
| The only common place where it doesn't work is in CMD for
| executing programs and as arguments for built-in commands.
| Everything else goes directly to the relevant APIs which
| don't care about / or \\.
|
| These days using CMD instead of PowerShell should be rare
| enough and PowerShell certainly doesn't mind the slashes.
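If the separator question comes up in your own code, one option is to let a path library normalize it. The Python sketch below uses `pathlib.PureWindowsPath`, which parses Windows-style paths on any OS; it shows how pathlib treats the two separators and says nothing about how CMD or a particular Windows API behaves. The paths are invented examples.

```python
from pathlib import PureWindowsPath

forward = PureWindowsPath("C:/Users/me/house plans/roof.txt")
backward = PureWindowsPath(r"C:\Users\me\house plans\roof.txt")

# pathlib accepts both separators for Windows paths and treats the two
# spellings as the same path.
same = forward == backward

# When rendered back to a string, the native backslash form is used.
rendered = str(forward)
parts = forward.parts

print(same, rendered, parts)
```

Building paths from components and converting to a string only at the last moment means the code never has to decide which slash to type.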
| Izkata wrote:
| > Pro tip: rename your development directory (or even better:
| the workspace path in CI) to put a space and/or special
| characters in it.
|
| A former co-worker changed his name in our auth system to
| include an apostrophe, so that whenever we handled names wrong
| he'd find it.
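A common place where an apostrophe in a name causes trouble is SQL built by string concatenation. The following Python sketch, using the standard `sqlite3` module and an in-memory database, shows the usual remedy of parameterized queries. The table and the names are hypothetical, and nothing here is claimed about the auth system in the anecdote.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

names = ["O'Brien", 'J. R. "Bob" Dobbs']

for name in names:
    # The "?" placeholder sends the value separately from the SQL text,
    # so quotes in the name are stored as data, never parsed as syntax.
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

rows = [row[0] for row in conn.execute("SELECT name FROM users ORDER BY rowid")]
conn.close()

print(rows)
```

A test account with such a name is a cheap regression test: if any code path goes back to concatenating strings, the insert or lookup for that account is the first thing to break.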
| geoduck14 wrote:
| Oh, I like this!
| ygra wrote:
| I used to have a space in my user name and even
| contemplated adding a bit of non-1252 Unicode. You find a lot
| of issues, but unfortunately often in tools you have little
| control over and end up not being able to work effectively at
| times. It ended up being more frustrating than helpful.
| curuinor wrote:
| the proper name of the glorious sultan of slack, j. r. "bob"
| dobbs, has the quotation marks and therefore is a great
| subject for this
| soheil wrote:
| Obligatory xkcd https://xkcd.com/327/
| floatingatoll wrote:
| I set my nickname to U+FFFD at one point in one work system,
| resulting in a variety of bug reports and concerned emails. I
| think I dropped it since it was generating false reports from
| people who didn't check what character the page contained
| before reporting it.
| reaperducer wrote:
| One of the systems I built is being used by a group of
| younger people. I included an emoji in the superuser account
| name, just to make sure it would work. And to remind me to
| think more broadly about user input.
| ajmurmann wrote:
| A related tip for CI: change the system time to be a time
| zone that is during your work hours in a different day
| already than UTC. Really helped getting failures earlier than
| 4pm PST.
| brundolf wrote:
| At my last job we had a wild time-zone bug that only
| happened with your system location set to Mumbai. I left
| mine set to that for the rest of my time there.
| cpeterso wrote:
| Related: here's a recent Firefox bug about a test that
| failed during the daylight saving time change:
|
| https://bugzilla.mozilla.org/show_bug.cgi?id=1739847
| scubbo wrote:
| Could you consider rephrasing this? It sounds like an
| interesting observation that I'd love to understand, but
| I'm genuinely not able to parse it.
|
| My best guess is "change the system time to be a timezone
| for which, during your work hours, the other-timezone is in
| a different day than UTC is" - but I'm still not sure what
| effect that would have on CI failures.
| ajmurmann wrote:
| Maybe an example of the failure this detects helps: when
| I used to work on Rails apps in the olden days it was
| easy to call Time.now and get the local time instead of
| Time.zone.now to get UTC time. This often led to wrong
| dates but tests would only fail once it was a new day in
| UTC land but still the old day in the local time zone.
| Making the CI machine's system time something like Fiji time
| really helped in getting failures much sooner after
| changes were pushed.
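The underlying effect can be shown with a Python analogue of the Rails example: one instant in time falls on different calendar dates depending on the zone used to read it. The fixed UTC+12 offset below is a stand-in for Fiji and is an assumption made to keep the sketch free of time zone database dependencies; it ignores daylight saving.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset standing in for a far-east zone such as Fiji (assumption).
UTC_PLUS_12 = timezone(timedelta(hours=12))

# A single instant, expressed in UTC.
instant = datetime(2021, 11, 12, 13, 0, tzinfo=timezone.utc)

utc_date = instant.date()
local_date = instant.astimezone(UTC_PLUS_12).date()

# Code that mixes "local now" with "UTC now" sees two different dates
# for the same moment whenever the zones straddle midnight.
dates_differ = utc_date != local_date

print(utc_date.isoformat(), local_date.isoformat(), dates_differ)
```

With a CI machine far ahead of UTC, that straddling window covers most of a western working day, so date bugs surface soon after a push instead of late in the afternoon.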
| Teknoman117 wrote:
| I read that as "set your CI to run earlier in your
| workday so you don't get new error reports at the end of
| the day." Midnight UTC being 4 pm/16:00 PST.
| ridaj wrote:
| Accents help too
| ygra wrote:
| For anyone curious, this is called Pseudo-localization
| (https://en.wikipedia.org/wiki/Pseudolocalization). I first
| stumbled across this in Raymond Chen's blog.
| [deleted]
| qwertox wrote:
| I add a Japanese character into any .py, .js and .html file
| to ensure that Unicode is working properly through the entire
| chain. Mostly in form of a variable which gets passed along,
| even in URL parameters.
| fernandotakai wrote:
| my test accounts always have emojis + accents + other weird
| characters.
|
| it keeps everybody on their toes lol.
| enragedcacti wrote:
| To have such thoughtful coworkers. On an old team I had two
| coworkers named Chris and once in a blue moon when they
| reviewed each other's code, master would start crashing because
| one of them accidentally left in an absolute path starting
| with "/home/chris/".
| cerved wrote:
| Spaces are a pain in the ass when you're using CLI so I'd
| rather enforce a no space policy
| reayn wrote:
| Most shells will behave just fine if you put a quote (single
| or double) before anything that has a space.
|
| A small extra step but something you get used to if you spend
| a lot of time in the cli.
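When a script has to build a shell command string, the quoting can be delegated to the standard library. The Python sketch below uses `shlex.quote` and `shlex.split`, which follow POSIX shell rules; it does not apply to CMD or PowerShell. The file name is borrowed from earlier in the thread.

```python
import shlex

filename = "this is my config.txt"

# Without quoting, a POSIX shell would split the name into four words.
naive_words = shlex.split(f"cat {filename}")

# shlex.quote wraps the name so that it survives as a single word.
quoted = shlex.quote(filename)
safe_words = shlex.split(f"cat {quoted}")

print(naive_words)
print(quoted)
print(safe_words)
```

The same function also neutralizes quotes and other shell metacharacters inside a name, which manual quoting tends to miss.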
| cerved wrote:
| Escaping spaces is a pain. I have to do it every day.
|
| I set up symlinks which help navigating around but then the
| relative paths are wrong for git.
|
| No thanks.
|
| Friends don't let friends put spaces in paths
___________________________________________________________________
(page generated 2021-11-13 23:03 UTC)