[HN Gopher] Serving my blog posts as Linux manual pages
___________________________________________________________________
Serving my blog posts as Linux manual pages
Author : zerojames
Score : 204 points
Date : 2024-02-29 12:18 UTC (10 hours ago)
(HTM) web link (jamesg.blog)
(TXT) w3m dump (jamesg.blog)
| heartag wrote:
| That's a fun idea, and well executed.
| Annatar wrote:
| One could pipe that output directly to nroff or groff:
|
| | soelim | tbl | eqn | nroff -man - | $PAGER
|
| | soelim | tbl | eqn | nroff -man -Tpost - | /usr/lib/postscript/bin/dpost | ps2pdf - > ~/Desktop/blog.PDF
|
| | soelim | tbl | eqn | groff -Tps -man - | ps2pdf - > ~/Desktop/blog.PDF
| anthk wrote:
| with groff you don't need soelim | tbl...
|
| just run | groff -Tpdf -step -k >
| ~/Desktop/blog.pdf
| nulbyte wrote:
| I never thought of making a blog post a man page. That's pretty
| awesome, actually. I'd be interested to see the code for the
| underlying conversion, but maybe I'll just try my hand at it
| myself this weekend.
|
| Btw, in Bash, you can use process substitution to avoid littering
| your folder with files, if you don't want to save them:
|
| $ man <(curl ...)
| rahimnathwani wrote:
| > I'd be interested to see the code for the underlying
| conversion
|
| The post includes the template he used. You can adapt it to any
| templating engine or put some placeholders in and use sed or
| whatever.
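
A minimal sketch of the placeholders-plus-sed approach mentioned above. The template is not the one from the post: the `@TITLE@` and `@DATE@` placeholders, the section number 7, and the function name `render_man` are all made up for illustration.

```shell
#!/bin/sh
# Fill a tiny man-page template with sed. The @TITLE@/@DATE@ placeholders
# and the section number 7 are hypothetical, chosen for this sketch only.
render_man() {
    # $1 = title, $2 = date; the post body is read from stdin.
    # Assumes neither argument contains '|', '&', or a backslash,
    # all of which are special in the sed expressions below.
    printf '%s\n' \
        '.TH "@TITLE@" 7 "@DATE@"' \
        '.SH NAME' \
        '@TITLE@' \
        '.SH DESCRIPTION' |
        sed -e "s|@TITLE@|$1|g" -e "s|@DATE@|$2|g"
    cat   # append the body unchanged after the header
}
```

Something like `render_man "My Post" 2024-02-29 < post.txt > post.7` would then produce a file that `man` can read, provided the body itself is valid roff input.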
| worble wrote:
| Cool idea. I'm starting on the timer until we get "serving my
| blog posts as playable DOOM wads".
| vonjuice wrote:
| Add it to the list of the handful of actually cool things that
| AI could facilitate.
| yrro wrote:
| FYI, 'curl -sL -H "Accept: text/roff"
| https://jamesg.blog/2024/02/28/programming-projects/ | man -l
| /dev/stdin' works for me - no need to save the roff file locally.
| zerojames wrote:
| Unfortunately, that command doesn't work on macOS
| (`/usr/bin/man: illegal option -- l`). I tried to get a one-
| liner with piping working on my Mac but I always ran into
| errors. The -l flag doesn't exist in the macOS man
| implementation (I consulted the man page; meta!).
| jolmg wrote:
| The `-l` can be skipped (probably because there are `/`s in
| man's argument):
|
| curl -sL -H "Accept: text/roff" https://jamesg.blog/2024/02/28/programming-projects/ | man /dev/stdin
| BossingAround wrote:
| (which still doesn't work on mac)
| divbzero wrote:
| This still doesn't work for me on macOS, perhaps a
| difference in how _/dev/stdin_ is implemented?
|
| Fortunately, macOS has ZSH as the default shell so the
| following does work:
|
| man =(curl -sL -H "Accept: text/roff" https://jamesg.blog/2024/02/28/programming-projects/)
| penguinjanitor wrote:
| Maybe 'man <(curl -sL -H "Accept: text/roff"
| https://jamesg.blog/2024/02/28/programming-projects/)' works
| then?
| BossingAround wrote:
| Nope
| divbzero wrote:
| This doesn't work because it's equivalent to piping
| _stdout_ from _curl_ into _stdin_ for _man_.
| penguinjanitor wrote:
| It's not, that would be `man <<< "$(curl ...)" `. `man
| <(curl ...)` is a slightly different version of the
| process substitution[1] from your comment - `man =(curl
| ...)`.
|
| [1] https://zsh.sourceforge.io/Doc/Release/Expansion.html#Proces...
| divbzero wrote:
| You're right, it isn't quite the same as piping.
|
| _man <(curl ...)_ ends up working on Linux but for
| whatever reason still yields an error on macOS:
| No manual entry for /dev/fd/11
| globular-toast wrote:
| It's actually process substitution:
| https://www.gnu.org/software/bash/manual/html_node/Process-S...
|
| It's essentially running the process in another shell and
| sending its output to a tempfile. The name of that file
| is then substituted in the original command. So, in
| theory, equivalent to `inner > /tmp/file; outer
| /tmp/file`. This works with programs that don't read from
| stdin, or need two or more inputs, for example.
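
The temp-file equivalent described above (`inner > /tmp/file; outer /tmp/file`) can be sketched in portable sh. This is roughly what zsh's `=(...)` form does with a real temporary file, whereas bash's `<(...)` usually hands over a `/dev/fd/N` path instead. The helper name `with_tempfile` is hypothetical.

```shell
#!/bin/sh
# Portable stand-in for `outer =(inner)`: run the producer into a real
# temporary file, pass that file's path to the consumer, then clean up.
with_tempfile() {
    # $1 = consumer command (receives the temp file path as its argument)
    # remaining arguments = producer command and its arguments
    consumer=$1
    shift
    tmp=$(mktemp) || return 1
    if "$@" > "$tmp"; then
        "$consumer" "$tmp"
        status=$?
    else
        status=$?          # producer failed; skip the consumer
    fi
    rm -f "$tmp"
    return "$status"
}
```

Under these assumptions, `with_tempfile man curl -sL -H "Accept: text/roff" https://jamesg.blog/2024/02/28/programming-projects/` would give `man` an ordinary file path, which may help on systems where the `/dev/fd` path is rejected.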
| salviati wrote:
| You can save a few chars with a process substitution instead of
| a pipe if you're using bash:
|
| man -l <(curl -sL -H "Accept: text/roff" https://jamesg.blog/2024/02/28/programming-projects/)
| teaearlgraycold wrote:
| And the world rejoiced as the bytes were saved.
| hk__2 wrote:
| If you're into that you can save four more chars by using
| `-sLH Accept:text/roff`
| godelski wrote:
| I'm pretty sure the OP is intentionally not doing this because
| piping commands/things-from-the-internet into "bash" (or
| anything else) is generally considered bad practice.
|
| For me, I'm all for this. Anyone that is aware of the security
| implications is extremely likely aware of how to make the
| conversion like you've provided. In that case, no reason to
| tell people, because they already know. But it's not a great
| thing to tell noobies to do because they will get burned. So
| don't tell them. As they advance they'll naturally learn about
| this feature. And hopefully they have learned the implications
| by the time they learn how to do this.
|
| (not me): https://www.seancassidy.me/dont-pipe-to-your-shell.html
| globular-toast wrote:
| Piping to your shell and piping to `man` are not the same
| thing.
|
| The really dangerous thing is copy/pasting from a browser to
| your terminal. Always do Ctrl-X Ctrl-E to open an editor,
| paste it in there, then inspect it before saving/closing the
| editor to run the command.
| crtasm wrote:
| Never knew that shortcut, useful!
| godelski wrote:
| I only recently learned of it myself despite being on *nix
| for over a decade and reading many advanced bash shortcut
| blogs. IDK why it isn't well known.
|
| For zsh users see here:
| https://nuclearsquid.com/writings/edit-long-commands/
|
| Note: this can be tricky depending on where it goes in
| your zshrc. If you use a plugin manager (like Sheldon)
| then this should be above that. I ended up with this and
| it works well on OSX and linux:
|
| autoload -U edit-command-line
| # Emacs style (<C-x><C-e>)
| zle -N edit-command-line
| # make sure `set -o vi` is above this line
| bindkey '^xe' edit-command-line
| bindkey '^x^e' edit-command-line
| # (VIM) Use visual mode
| bindkey -M vicmd v edit-command-line
| vladxyz wrote:
| Aren't they? Are you sure piping to 'man' can't result in
| arbitrary code execution?
|
| The two things you need to be able to say you trust are
| your CA store, and the source of your curl -> shell.
| globular-toast wrote:
| No, they're not the same thing! It's not piping to your
| shell! The shell's single purpose is to execute code. Man
| is not supposed to do that and it would be considered a
| huge security issue if it could. In any case, how would
| you check the downloaded file? With a text editor? Are
| you sure that can't result in arbitrary code execution?
| AlecSchueler wrote:
| Man can run groff which can in turn run arbitrary
| subprocesses.
| johannes1234321 wrote:
| Even if it were:
|
| There is no practical difference. "Nobody" will inspect
| the man page using a different viewer first. So if I
| download to disk and then view via man or directly via
| man is no difference.
|
| A shell script one might inspect first using some viewer.
| While only few probably do.
| _joel wrote:
| There's not much difference doing a pipe or using an
| intermediary file and "executing" that straight away.
| There's no manual step in the documentation there to read the
| output before it's run.
| godelski wrote:
| This is not accurate. If the download gets interrupted bash
| will execute the partial line. That means `rm -r
| /tmp/foo.ext` or `rm -r ${HOME}/.tmp_config` can execute as
| `rm -r /`.
|
| This can be mitigated by wrapping the script, but clearly
| no one is looking at the code so this isn't really verified
| anyways. And it's not like we see that all the time.
|
| Edit:
|
| But my main point is about habits. There's the concern
| mananaysiempre brings up[0], but either way, it is best to
| be in good habits.
|
| [0] https://news.ycombinator.com/item?id=39554895
| jmb99 wrote:
| There definitely is a difference. See [1] (or [2] since the
| original site has misconfigured TLS). Long story short,
| using some fairly trivial heuristics, a malicious server
| could change its response when it detects its output being
| piped to a shell rather than saved to a file. Thus, a
| security-minded person downloading the file (who could
| inspect it) would be given a clean copy, but someone less
| security-minded piping it straight to bash would be given
| the malicious copy. The security-minded person wouldn't be
| able to warn people not to pipe the script to a shell,
| since it appears safe.
|
| [1] https://www.idontplaydarts.com/2016/04/detecting-curl-pipe-b...
|
| [2] https://web.archive.org/web/20240228190305/https://www.idont...
| hk__2 wrote:
| > I'm pretty sure the OP is intentionally not doing this
| because piping commands/things-from-the-internet into "bash"
| (or anything else) is generally considered bad practice.
|
| It's generally considered bad practice only for bash and
| similar commands that execute their input. It's not a bad
| practice at all for commands that just display or transform
| their input, like `man`, `less`, `ffmpeg`, etc.
| mananaysiempre wrote:
| Man on Linux runs groff, which (like the original
| nroff/troff) is a fully general macro processor in addition
| to being a typesetting system. I wouldn't bet on it not
| being able to launch subprocesses, or especially on it
| having no overflows and such on untrusted input. I'm not
| even sure about OpenBSD's much more limited mandoc.
|
| (Also, I don't know about ffmpeg as it is somewhat more
| rare to have it be public-facing, but there have definitely
| been exploits against ImageMagick, usually targeted at
| websites using it to process user input.)
| codelobe wrote:
| [insert confused trollface]
|
| > ffmpeg
|
| There are certainly a few hundred exploitable vectors in that
| program alone... to say nothing of the rest.
|
| When in doubt, spin up a VM to run the random untrusted
| thing -- And then go read its mailing list/issue tracker
| for known VM escaping exploits. I have a machine set up to
| test malware, so I just hit my "airgap" switch to isolate
| the system from my network once the questionable code is in
| place and ready to run (potentially amok). Study-up about
| ARP-poison attacks, and remember ARP does not transit to
| upstream routers/switches (Y "combinate" your network for
| fun and profit).
|
| Before you assume non malicious simple text output,
| consider "ANSI" escape code complexity as an intrusion
| vector for whatever terminal you run. I've got "0-days" for
| this going back to MSDOS: ANSI Bomb => arbitrary CMD entry.
| You don't have to take my word for it, your terminal of
| choice is most certainly vulnerable to some ANSI/escape
| code related exploit, look it up.
| jimmaswell wrote:
| Has there been a single recorded case of malware getting
| around this way?
| godelski wrote:
| I'm sorry, your argument is... what exactly? That we
| shouldn't take simple or even trivial preventative measures
| that also reduce other potential problems because we...
| haven't seen anybody abuse such a thing before? Really?
|
| What a terrible argument. Note it's trivial to resolve. Why
| not just fix things that we know are problems or can lead
| to serious problems instead of waiting for it to become a
| problem where it'll then be FAR more work to clean it up?
|
| Seriously, you're a human, not a bug. You have the ability
| to solve things before they become problems. Use it.
|
| While I'm not sure of a specific example, I feel quite
| confident in saying that this has been done before.
| godelski wrote:
| Can't edit my comment so I'll write as a reply.
|
| This is to all the people saying no difference between
| downloading and running right away.
|
| If the download gets interrupted bash will execute the
| partial line. That means `rm -r /tmp/foo.ext` or `rm -r
| ${HOME}/.tmp_config` can execute as `rm -r /`. This can be
| mitigated by wrapping the script. Best way to do this is wrap
| the whole script into a function and then execute at the last
| line[0]. Oh, and you can detect `curl|bash` server side[1]
|
| [0] https://archive.is/20160603044800/https://sandstorm.io/news/...
|
| [1] https://archive.is/20230325190353/https://www.idontplaydarts...
|
| Edit: as an example, rust does the wrapping. But they still
| place this stupid shit in their install instructions. _Bad
| Rust! Bad!_
|
| https://www.rust-lang.org/tools/install
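
The function-wrapping mitigation described above can be sketched as follows. The install steps and the `example-install` directory name are placeholders, not taken from any real installer.

```shell
#!/bin/sh
# Installer sketch: nothing runs until the final line has arrived, so a
# download cut off mid-stream leaves only an incomplete function
# definition (a syntax error) rather than a half-executed command.
main() {
    workdir="${TMPDIR:-/tmp}/example-install.$$"   # hypothetical path
    mkdir -p "$workdir" || return 1
    echo "installing into $workdir"
    # ... real installation steps would go here ...
    rm -r "$workdir"
}

main "$@"
```

This only protects against truncation. It does nothing about a server that deliberately sends different content when it detects a pipe.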
| masto wrote:
| I don't know why this particular thing set off my pedanticism.
| Someone is slightly wrong on the Internet! Maybe because it
| started right off being needlessly Linux-centric. Or that I
| thought it would be one thing but it turned out to be a brief
| demo of content negotiation in NGINX.
|
| In any case, a few pointless things that I seem compelled to say:
|
| * It's not returning roff, as such. Those things like `.TH` are
| not part of roff, they are part of the macro package for writing
| man pages.
|
| * I was disappointed that there was no markdown-to-roff
| conversion, which seemed like it was going to be the interesting
| part of this post. At least use one of the existing ones.
|
| * On a similar note, this means that the text isn't really
| formatted correctly. roff is meant for one sentence per line of
| input, to distinguish between `.` to end a sentence vs. other
| uses.
|
| * Also also wik, this means that any line starting with a `.`
| will be interpreted as a command, potentially wreaking havoc.
|
| Or maybe I'm just a grumpy old man.
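
One way to guard against the leading-dot problem raised above is to filter the body text before it goes into the template. This is a sketch under the assumption that the body is plain text with no intended roff markup; `roff_escape` is a made-up helper name. It uses two standard roff escapes: `\e` for a literal backslash and the zero-width `\&` to stop a leading `.` or `'` from being read as a control character.

```shell
#!/bin/sh
# Neutralise plain text for inclusion in a man page body.
roff_escape() {
    # 1. Turn every backslash into \e so it prints literally.
    # 2. Prefix lines starting with . or ' with \& so roff does not
    #    treat them as requests. Order matters: step 2 adds a backslash
    #    that step 1 must not touch.
    sed -e 's/\\/\\e/g' \
        -e 's/^[.'\'']/\\\&&/'
}
```

For example, `printf '%s\n' '.TH oops' | roff_escape` prints `\&.TH oops`, which roff renders as ordinary text.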
| zerojames wrote:
| I appreciate you sharing this! I wasn't aware of the exact
| ontology of how roff vs. man relate, and I went through several
| iterations of this post trying to get this right. There being
| other tools -- groff, nroff, etc. -- added to my confusion. A
| blog post unto itself is "here is what roff/man
| page/nroff/other variants are, here's how to use them." I would
| have appreciated a succinct description; I'm sure others would,
| too.
|
| As for markdown to roff, I thought about it as a v2. As I
| started to think about implementing a parser, someone shared
| https://github.com/sunaku/md2man with me, which appears to
| solve the problem.
|
| I'd need to figure out how to integrate this into my (Python)
| site that is built on GitHub Pages; a bit of tinkering would be
| required :D
| jorams wrote:
| > * I was disappointed that there was no markdown-to-roff
| conversion
|
| I found this rather surprising too. Pandoc can trivially
| convert markdown to man-page roff. Insert that into the given
| template and it looks more like an actual man page.
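
A rough sketch of the pandoc route, assuming pandoc is installed. The wrapper name `md_to_man` is hypothetical. Without `-s` pandoc emits only the body fragment, which suits dropping it into an existing template. Adding `-s` would make pandoc produce a standalone page with its own `.TH` header.

```shell
#!/bin/sh
# Convert Markdown on stdin to man-page roff on stdout using pandoc's
# "man" output format. Produces a fragment (no .TH line) by default.
md_to_man() {
    pandoc -f markdown -t man
}
```

Usage would look like `md_to_man < post.md >> post.7` after the template header has been written.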
| zerojames wrote:
| Good suggestion!
|
| Context: All man pages are generated on the fly on GitHub
| Pages. My site generates ~2500 pages, for which 826 are
| eligible for a man page. I didn't want to introduce another
| parser since I just got my site build times down :D
|
| I can counter increased build times with caching, but it gets
| a bit icky since some blog pages are evergreen (i.e. my
| blogroll). [insert cache invalidation complaint here] But
| there's certainly a way!
| jorams wrote:
| I feel that, build times can be a pain. I'm calling out to
| pandoc to build a static site and I've had to parallelize
| it to get build times down, and that's with far fewer
| pages.
| yegle wrote:
| It would be cool to provide a deb repo as a way of subscribing to
| your blog.
|
| So `apt update` will pull in all blog posts, `man your-blog` will
| show the latest post with links to the index of all your other
| posts.
| zerojames wrote:
| _adds to their TODO list :D_
| tutfbhuf wrote:
| Please also create an AUR package for Arch Linux.
| basilgohar wrote:
| Release as a Flatpak and make it distro agnostic!
| gkbrk wrote:
| Make it even better and release an AppImage!
| edvinbesic wrote:
| Even even better, put it on the web so it's accessible
| from anywhere
| abulman wrote:
| Whoa, there. Let's not get crazy. It's plenty enough
| to support multiple package managers, but what you are
| considering is just too much!
| hk1337 wrote:
| Okay, just create an iOS application.
| out_of_memory wrote:
| first thing came to my mind when i read the blog. it would be a
| very cool thing. i hope this becomes a thing.
| dcminter wrote:
| On the one hand this idea is brilliant - on the other hand if
| it caught on, there are some obvious opportunities for malware
| intrinsic to the approach :'(
|
| I think I'd be too chicken to subscribe.
| zerojames wrote:
| This is now in progress.
|
| https://github.com/capjamesg/jamesg.blog.deb has all you need
| to build a man-page-only deb file using:
|
| git clone https://github.com/capjamesg/jamesg.blog.deb
| cd jamesg.blog.deb
| dpkg-deb --build --root-owner-group jamesg.blog
| sudo dpkg -i jamesg.blog.deb
|
| You should see:
|
| ... Processing triggers for man-db (2.9.1-1) ...
|
| Which indicates the man page for `man jamesg.blog` is
| available. There is just a placeholder in there for now. I will
| perhaps finish this tomorrow!
|
| NB: This may become a blog post soon :D
| potta_coffee wrote:
| Release your blog as an Electron application please.
| busfahrer wrote:
| Speaking of URLs that do fancy stuff on the terminal, I once
| stumbled upon this one from textfiles.com, it's essentially a
| short animated movie on the terminal using VT100 terminal codes,
| all served from a URI. On modern systems you can use rate-
| limiting to watch it:
|
| curl --limit-rate 1000 http://textfiles.com/sf/STARTREK/trek.vt
| && reset
|
| (the reset is there because it might mess up your terminal)
|
| Other terminal-powered URIs:
|
| curl cheat.sh/tar (gets examples on how to use the program after
| the /)
|
| curl wttr.in/berlin (gets weather info with terminal formatting)
| hk__2 wrote:
| If you want to do your own ASCII video via telnet, I did it in
| Go a few years back: https://github.com/bfontaine/RickASCIIRoll
|
| It's actually quite simple; the hardest part is to generate the
| frames but that can be done with ffmpeg+img2txt.py:
| https://github.com/bfontaine/RickASCIIRoll/tree/master/movie....
| anthk wrote:
| Use tritty to fake a 1200/9600 BPS baud rate.
| kolme wrote:
| There's also star wars over telnet:
|
| https://itsfoss.com/star-wars-linux/
| gglitch wrote:
| Fun project :) Next step: Texinfo, which will output to info,
| html, and pdf, among others, and which includes links and
| indexes.
| pjmlp wrote:
| Cool experiment, as a means to serve UNIX man pages.
| throwaway81523 wrote:
| That doesn't look like man pages. Was it supposed to be an
| example? (EDIT: oh nm, I see, I have to view the image. Nice. But
| the web view should also look like that.)
|
| I noticed the "written by human, not AI" logo, a little too cute
| but the sentiment is good. I had been thinking of putting
| something like "this page created by natural stupidity" in mine.
| rahimnathwani wrote:
| If you like this you might like mdless, which does exactly what
| the name suggests.
|
| https://github.com/ttscoff/mdless
| divbzero wrote:
| Now all we need is a Markdown to roff converter... which
| apparently already exists.
|
| https://github.com/postmodern/kramdown-man
|
| https://rtomayko.github.io/ronn/ronn.1.html
|
| https://kristaps.bsd.lv/lowdown/
| ajvpot wrote:
| I love pandoc[0] for tasks like this. It supports most markup
| formats I've needed.
|
| [0]: https://pandoc.org/
| dpassens wrote:
| The correct media type would be text/troff, as per RFC 4263
| (https://www.rfc-editor.org/rfc/rfc4263.html).
| jedberg wrote:
| Ironically it breaks when you request _this_ page as a man page.
| :)
| medstrom wrote:
| There is an Emacs package that installs Abelson & Sussman's SICP
| (Structure and Interpretation of Computer Programs) into the Info
| directory.
|
| Simply type M-x package-install sicp RET
|
| That gave me the idea that one could install a whole bookshelf of
| blog archives through some modified feed reader. Reading Info in
| Emacs, you even get bookmarks.
| felideon wrote:
| FYI, SICP is from Abelson and Sussman.
| medstrom wrote:
| My bad, thanks.
| Gormo wrote:
| There's no need to fork or use an intermediate file; you can just
| pipe straight into `man`:
|
| `curl -sL -H "Accept: text/roff"
| https://jamesg.blog/2024/02/28/programming-projects/ | man -l -`
| hereonout2 wrote:
| Should have read the manual.
| 0xEF wrote:
| rtfm.lol I guess
| zerojames wrote:
| See https://news.ycombinator.com/item?id=39552852 for more
| discussion.
| hk1337 wrote:
| https://rtfm.lol
|
| already taken.
| godelski wrote:
| Please don't... Actually yrro posted something similar 2hrs
| before you and now we have the whole {curl,wget} pipe into
| command discussion again...
|
| Friends don't let friends pipe streams into commands
|
| https://news.ycombinator.com/item?id=39554044
| adriangrigore wrote:
| I prefer my own UNIX web dev solution https://mkws.sh! Man pages
| are too much imo.
| hk__2 wrote:
| > I prefer my own UNIX web dev solution https://mkws.sh! Man
| pages are too much imo.
|
| Those are orthogonal subjects; you could generate your own
| static page with your generator AND also serve them as
| manpages. The linked article does not suggest serving
| manpages to everyone, just to user agents that request them.
| anthk wrote:
| With mandoc and maybe groff you can typeset pages to man pages,
| HTML, PS and PDF files.
| harryvederci wrote:
| I feel like this is too accessible.
|
| Anyone doing this but with Vim help files?
| darkwater wrote:
| Maybe I can search the interwebs to answer this question but I
| prefer to ask the HN crowd. Back in high school I remember
| someone (on an HP-UX) showing me that you could jump to any
| underlined word (referring to a section) by pressing some key
| combination, but for the life of me I cannot remember which. I
| already checked man(1) and man(7), and could not find it either.
| Maybe this is just a fake memory?
| epcoa wrote:
| I'm not familiar with any custom man viewer functionality but
| maybe you're thinking of the CDE help viewer, dthelpview which
| would display man pages.
| darkwater wrote:
| No, it was in a dumb text terminal. But maybe I just made it
| up, and I'm confusing it with classic "hjkl" navigation.
| _xerces_ wrote:
| For the lazy or short of time, does with worth with tldr pages :)
| https://tldr.sh/
___________________________________________________________________
(page generated 2024-02-29 23:00 UTC)