[HN Gopher] Serving my blog posts as Linux manual pages
       ___________________________________________________________________
        
       Serving my blog posts as Linux manual pages
        
       Author : zerojames
       Score  : 204 points
       Date   : 2024-02-29 12:18 UTC (10 hours ago)
        
 (HTM) web link (jamesg.blog)
 (TXT) w3m dump (jamesg.blog)
        
       | heartag wrote:
       | That's a fun idea, and well executed.
        
       | Annatar wrote:
       | One could pipe that output directly to nroff or groff:
       | 
       |     | soelim | tbl | eqn | nroff -man - | $PAGER
       | 
       |     | soelim | tbl | eqn | nroff -man -Tpost - |
       |       /usr/lib/postscript/bin/dpost | ps2pdf - > ~/Desktop/blog.PDF
       | 
       |     | soelim | tbl | eqn | groff -Tps -man - |
       |       ps2pdf - > ~/Desktop/blog.PDF
        
         | anthk wrote:
         | with groff you don't need soelim | tbl...
         | 
         | just run
         | 
         |     | groff -Tpdf -step -k > ~/Desktop/blog.pdf
        
       | nulbyte wrote:
       | I never thought of making a blog post a man page. That's pretty
       | awesome, actually. I'd be interested to see the code for the
       | underlying conversion, but maybe I'll just try my hand at it
       | myself this weekend.
       | 
       | Btw, in Bash, you can use process substitution to avoid littering
       | your folder with files, if you don't want to save them:
       | 
       | $ man <(curl ...)
        
         | rahimnathwani wrote:
         | > I'd be interested to see the code for the underlying
         | conversion
         | 
         | The post includes the template he used. You can adapt it to any
         | templating engine or put some placeholders in and use sed or
         | whatever.
        
       | worble wrote:
       | Cool idea. I'm starting the timer until we get "serving my
       | blog posts as playable DOOM wads".
        
         | vonjuice wrote:
         | Add it to the list of the handful of actually cool things that
         | AI could facilitate.
        
       | yrro wrote:
       | FYI, 'curl -sL -H "Accept: text/roff"
       | https://jamesg.blog/2024/02/28/programming-projects/ | man -l
       | /dev/stdin' works for me - no need to save the roff file locally.
        
         | zerojames wrote:
         | Unfortunately, that command doesn't work on macOS
         | (`/usr/bin/man: illegal option -- l`). I tried to get a one-
         | liner with piping working on my Mac but I always ran into
         | errors. The -l flag doesn't exist in the macOS man
         | implementation (I consulted the man page; meta!).
        
           | jolmg wrote:
           | The `-l` can be skipped (probably because there are `/`s in
           | man's argument):
           | 
           |     curl -sL -H "Accept: text/roff" \
           |       https://jamesg.blog/2024/02/28/programming-projects/ | man /dev/stdin
        
             | BossingAround wrote:
             | (which still doesn't work on mac)
        
             | divbzero wrote:
             | This still doesn't work for me on macOS, perhaps a
             | difference in how _/dev/stdin_ is implemented?
             | 
             | Fortunately, macOS has ZSH as the default shell so the
             | following does work:
             | 
             |     man =(curl -sL -H "Accept: text/roff" \
             |       https://jamesg.blog/2024/02/28/programming-projects/)
        
           | penguinjanitor wrote:
           | Maybe 'man <(curl -sL -H "Accept: text/roff"
           | https://jamesg.blog/2024/02/28/programming-projects/)' works
           | then?
        
             | BossingAround wrote:
             | Nope
        
             | divbzero wrote:
             | This doesn't work because it's equivalent to piping
             | _stdout_ from _curl_ into _stdin_ for _man_.
        
               | penguinjanitor wrote:
               | It's not, that would be `man <<< "$(curl ...)" `. `man
               | <(curl ...)` is a slightly different version of the
               | process substitution[1] from your comment - `man =(curl
               | ...)`.
               | 
               | [1] https://zsh.sourceforge.io/Doc/Release/Expansion.html#Proces...
        
               | divbzero wrote:
               | You're right, it isn't quite the same as piping.
               | 
               | _man <(curl ...)_ ends up working on Linux but for
               | whatever reason still yields an error on macOS:
               | 
               |     No manual entry for /dev/fd/11
        
               | globular-toast wrote:
               | It's actually process substitution:
               | https://www.gnu.org/software/bash/manual/html_node/Process-S...
               | 
               | It's essentially running the process in another shell and
               | sending its output to a tempfile. The name of that file
               | is then substituted in the original command. So, in
               | theory, equivalent to `inner > /tmp/file; outer
               | /tmp/file`. This works with programs that don't read from
               | stdin, or need two or more inputs, for example.
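
The temp-file equivalence described above can be sketched in portable shell. This only illustrates the `inner > /tmp/file; outer /tmp/file` idea: `via_tempfile` is a hypothetical helper name, not a real tool. Note that bash's `<(...)` normally hands the command a `/dev/fd/N` path or a named pipe (which matches the macOS error quoted earlier in the thread), while zsh's `=(...)` is the form that really uses a temporary file.

```shell
# via_tempfile PRODUCER CONSUMER [ARGS...]
# Hypothetical helper: runs PRODUCER (a shell command string), saves its
# output to a temporary file, then runs CONSUMER with that file name
# appended as its last argument. The file is removed afterwards.
via_tempfile() {
    producer=$1
    shift
    tmpfile=$(mktemp) || return 1
    sh -c "$producer" > "$tmpfile"
    "$@" "$tmpfile"
    status=$?
    rm -f "$tmpfile"
    return "$status"
}
```

Calling `via_tempfile 'printf "one\ntwo\n"' cat` prints the two lines back. Because the consumer sees an ordinary file, this shape also suits programs that refuse pipes or `/dev/fd` paths.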
        
         | salviati wrote:
         | You can save a few chars with a process substitution instead of
         | a pipe if you're using bash:
         | 
         |     man -l <(curl -sL -H "Accept: text/roff" \
         |       https://jamesg.blog/2024/02/28/programming-projects/)
        
           | teaearlgraycold wrote:
           | And the world rejoiced as the bytes were saved.
        
           | hk__2 wrote:
           | If you're into that you can save four more chars by using
           | `-sLH Accept:text/roff`
        
         | godelski wrote:
         | I'm pretty sure the OP is intentionally not doing this because
         | piping commands/things-from-the-internet into "bash" (or
         | anything else) is generally considered bad practice.
         | 
         | For me, I'm all for this. Anyone that is aware of the security
         | implications is extremely likely aware of how to make the
         | conversion like you've provided. In that case, no reason to
         | tell people, because they already know. But it's not a great
         | thing to tell noobies to do because they will get burned. So
         | don't tell them. As they advance they'll naturally learn about
         | this feature. And hopefully they have learned the implications
         | by the time they learn how to do this.
         | 
         | (not me): https://www.seancassidy.me/dont-pipe-to-your-shell.html
        
           | globular-toast wrote:
           | Piping to your shell and piping to `man` are not the same
           | thing.
           | 
           | The really dangerous thing is copy/pasting from a browser to
           | your terminal. Always do Ctrl-X Ctrl-E to open an editor,
           | paste it in there, then inspect it before saving/closing the
           | editor to run the command.
        
             | crtasm wrote:
             | Never knew that shortcut, useful!
        
               | godelski wrote:
               | I only recently learned of it myself despite being on *nix
               | for over a decade and reading many advanced bash shortcut
               | blogs. IDK why it isn't well known.
               | 
               | For zsh users see here:
               | https://nuclearsquid.com/writings/edit-long-commands/
               | 
               | Note: this can be tricky depending on where it goes in
               | your zshrc. If you use a plugin manager (like Sheldon)
               | then this should be above that. I ended up with this and
               | it works well on OSX and linux
               | 
               |     autoload -U edit-command-line
               |     # Emacs style (<C-x><C-e>)
               |     zle -N edit-command-line
               |     # make sure `set -o vi` is above this line
               |     bindkey '^xe' edit-command-line
               |     bindkey '^x^e' edit-command-line
               |     # (VIM) Use visual mode
               |     bindkey -M vicmd v edit-command-line
        
             | vladxyz wrote:
             | Aren't they? Are you sure piping to 'man' can't result in
             | arbitrary code execution?
             | 
             | The two things you need to be able to say you trust are
             | your CA store, and the source of your curl -> shell.
        
               | globular-toast wrote:
               | No, they're not the same thing! It's not piping to your
               | shell! The shell's single purpose is to execute code. Man
               | is not supposed to do that and it would be considered a
               | huge security issue if it could. In any case, how would
               | you check the downloaded file? With a text editor? Are
               | you sure that can't result in arbitrary code execution?
        
               | AlecSchueler wrote:
               | Man can run groff which can in turn run arbitrary
               | subprocesses.
        
               | johannes1234321 wrote:
               | Even if it were:
               | 
               | There is no practical difference. "Nobody" will inspect
               | the man page using a different viewer first. So whether I
               | download to disk and then view via man, or view directly
               | via man, makes no difference.
               | 
               | A shell script one might inspect first using some viewer,
               | though probably only a few do.
        
           | _joel wrote:
           | There's not much difference between doing a pipe and using an
           | intermediary file and "executing" that straight away.
           | There's no manual step in the documentation there to read the
           | output before it's run.
        
             | godelski wrote:
             | This is not accurate. If the download gets interrupted bash
             | will execute the partial line. That means `rm -r
             | /tmp/foo.ext` or `rm -r ${HOME}/.tmp_config` can execute as
             | `rm -r /`.
             | 
             | This can be mitigated by wrapping the script, but clearly
             | no one is looking at the code so this isn't really verified
             | anyways. And it's not like we see that all the time.
             | 
             | Edit:
             | 
             | But my main point is about habits. There's the concern
             | mananaysiempre brings up[0], but either way, it is best to
             | be in good habits.
             | 
             | [0] https://news.ycombinator.com/item?id=39554895
        
             | jmb99 wrote:
             | There definitely is a difference. See [1] (or [2] since the
             | original site has misconfigured TLS). Long story short,
             | using some fairly trivial heuristics, a malicious server
             | could change its response when it detects its output being
             | piped to a shell rather than saved to a file. Thus, a
             | security-minded person downloading the file (who could
             | inspect it) would be given a clean copy, but someone less
             | security-minded piping it straight to bash would be given
             | the malicious copy. The security-minded person wouldn't be
             | able to warn people not to pipe the script to a shell,
             | since it appears safe.
             | 
             | [1] https://www.idontplaydarts.com/2016/04/detecting-curl-pipe-b...
             | 
             | [2] https://web.archive.org/web/20240228190305/https://www.idont...
        
           | hk__2 wrote:
           | > I'm pretty sure the OP is intentionally not doing this
           | because piping commands/things-from-the-internet into "bash"
           | (or anything else) is generally considered bad practice.
           | 
           | It's generally considered bad practice only for bash and
           | similar commands that execute their input. It's not a bad
           | practice at all for commands that just display or transform
           | their input, like `man`, `less`, `ffmpeg`, etc.
        
             | mananaysiempre wrote:
             | Man on Linux runs groff, which (like the original
             | nroff/troff) is a fully general macro processor in addition
             | to being a typesetting system. I wouldn't bet on it not
             | being able to launch subprocesses, or especially on it
             | having no overflows and such on untrusted input. I'm not
             | even sure about OpenBSD's much more limited mandoc.
             | 
             | (Also, I don't know about ffmpeg as it is somewhat more
             | rare to have it be public-facing, but there have definitely
             | been exploits against ImageMagick, usually targeted at
             | websites using it to process user input.)
        
             | codelobe wrote:
             | [insert confused trollface]
             | 
             | > ffmpeg
             | 
             | There are certainly a few hundred exploitable vectors in
             | that program alone... to say nothing of the rest.
             | 
             | When in doubt, spin up a VM to run the random untrusted
             | thing -- And then go read its mailing list/issue tracker
             | for known VM escaping exploits. I have a machine setup to
             | test malware, so I just hit my "airgap" switch to isolate
             | the system from my network once the questionable code is in
             | place and ready to run (potentially amok). Study-up about
             | ARP-poison attacks, and remember ARP does not transit to
             | upstream routers/switches (Y "combinate" your network for
             | fun and profit).
             | 
             | Before you assume non malicious simple text output,
             | consider "ANSI" escape code complexity as an intrusion
             | vector for whatever terminal you run. I've got "0-days" for
             | this going back to MSDOS: ANSI Bomb => arbitrary CMD entry.
             | You don't have to take my word for it, your terminal of
             | choice is most certainly vulnerable to some ANSI/escape
             | code related exploit, look it up.
        
           | jimmaswell wrote:
           | Has there been a single recorded case of malware getting
           | around this way?
        
             | godelski wrote:
             | I'm sorry, your argument is... what exactly? That we
             | shouldn't take simple or even trivial preventative measures
             | that also reduce other potential problems because we...
             | haven't seen anybody abuse such a thing before? Really?
             | 
             | What a terrible argument. Note it's trivial to resolve. Why
             | not just fix things that we know are problems or can lead
             | to serious problems instead of waiting for it to become a
             | problem where it'll then be FAR more work to clean it up?
             | 
             | Seriously, you're a human, not a bug. You have the ability
             | to solve things before they become problems. Use it.
             | 
             | While I'm not sure of a specific example, I feel quite
             | confident in saying that this has been done before.
        
           | godelski wrote:
           | Can't edit my comment, so I'll write this as a reply.
           | This is for all the people saying there's no difference
           | between downloading and running right away.
           | 
           | If the download gets interrupted bash will execute the
           | partial line. That means `rm -r /tmp/foo.ext` or `rm -r
           | ${HOME}/.tmp_config` can execute as `rm -r /`. This can be
           | mitigated by wrapping the script. Best way to do this is wrap
           | the whole script into a function and then execute at the last
           | line[0]. Oh, and you can detect `curl|bash` server side[1]
           | 
           | [0] https://archive.is/20160603044800/https://sandstorm.io/news/...
           | 
           | [1] https://archive.is/20230325190353/https://www.idontplaydarts...
           | 
           | Edit: as an example, rust does the wrapping. But they still
           | place this stupid shit in their install instructions. _Bad
           | Rust! Bad!_
           | 
           | https://www.rust-lang.org/tools/install
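
The "wrap the whole script into a function" mitigation mentioned above could look roughly like this. It is a minimal sketch, not taken from any real installer: `main`, `install_dir` and the path are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch of an installer wrapped in a function. The names and the path
# are hypothetical placeholders.
main() {
    install_dir="${HOME}/.example_tool"   # hypothetical target
    echo "installing into ${install_dir}"
    # ... real work would go here ...
}

# Nothing above has executed yet. Only this last line starts the install,
# so a download cut off earlier leaves an unterminated function
# definition (a syntax error) instead of a half-run command.
main "$@"
```

The shell has to parse the complete `main() { ... }` definition before it can run anything inside it, so a truncated stream should fail to parse rather than execute a shortened `rm` line. This helps with interrupted downloads only; it does nothing about a server that deliberately sends different content.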
        
       | masto wrote:
       | I don't know why this particular thing set off my pedanticism.
       | Someone is slightly wrong on the Internet! Maybe because it
       | started right off being needlessly Linux-centric. Or that I
       | thought it would be one thing but it turned out to be a brief
       | demo of content negotiation in NGINX.
       | 
       | In any case, a few pointless things that I seem compelled to say:
       | 
       | * It's not returning roff, as such. Those things like `.TH` are
       |   not part of roff, they are part of the macro package for writing
       |   man pages.
       | * I was disappointed that there was no markdown-to-roff
       |   conversion, which seemed like it was going to be the
       |   interesting part of this post. At least use one of the existing
       |   ones.
       | * On a similar note, this means that the text isn't really
       |   formatted correctly. roff is meant for one sentence per line of
       |   input, to distinguish between `.` to end a sentence vs. other
       |   uses.
       | * Also also wik, this means that any line starting with a `.`
       |   will be interpreted as a command, potentially wreaking havoc.
       | 
       | Or maybe I'm just a grumpy old man.
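
The leading-`.` problem in the last bullet can be worked around when plain text is pasted into a man page template. A minimal sketch, assuming the post body arrives on stdin as plain text: prefixing a leading `.` or `'` with roff's zero-width `\&` escape stops the line from being read as a control line. `escape_roff_lines` is a hypothetical name, not something from the post, and the sketch does not deal with backslashes inside the text.

```shell
# Hypothetical filter: protect lines that roff would treat as requests.
# A line starting with "." or "'" is a control line in roff; putting the
# zero-width escape \& in front makes it ordinary text again.
escape_roff_lines() {
    # In the sed replacement, \\ is a literal backslash, \& a literal
    # ampersand, and the final & re-inserts the matched character.
    sed -e 's/^[.'\'']/\\\&&/'
}
```

Something like `escape_roff_lines < post.txt` would then be spliced into the template in place of the raw text, where `post.txt` is a stand-in file name.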
        
         | zerojames wrote:
         | I appreciate you sharing this! I wasn't aware of the exact
         | ontology of how roff vs. man relate, and I went through several
         | iterations of this post trying to get this right. There being
         | other tools -- groff, nroff, etc. -- added to my confusion. A
         | blog post unto itself is "here is what roff/man
         | page/nroff/other variants are, here's how to use them." I would
         | have appreciated a succinct description; I'm sure others would,
         | too.
         | 
         | As for markdown to roff, I thought about it as a v2. As I
         | started to think about implementing a parser, someone shared
         | https://github.com/sunaku/md2man with me, which appears to
         | solve the problem.
         | 
         | I'd need to figure out how to integrate this into my (Python)
         | site that is built on GitHub Pages; a bit of tinkering would be
         | required :D
        
         | jorams wrote:
         | > * I was disappointed that there was no markdown-to-roff
         | conversion
         | 
         | I found this rather surprising too. Pandoc can trivially
         | convert markdown to man-page roff. Insert that into the given
         | template and it looks more like an actual man page.
        
           | zerojames wrote:
           | Good suggestion!
           | 
           | Context: All man pages are generated on the fly on GitHub
           | Pages. My site generates ~2500 pages, of which 826 are
           | eligible for a man page. I didn't want to introduce another
           | parser since I just got my site build times down :D
           | 
           | I can counter increased build times with caching, but it gets
           | a bit icky since some blog pages are evergreen (i.e. my
           | blogroll). [insert cache invalidation complaint here] But
           | there's certainly a way!
        
             | jorams wrote:
             | I feel that, build times can be a pain. I'm calling out to
             | pandoc to build a static site and I've had to parallelize
             | it to get build times down, and that's with far fewer
             | pages.
        
       | yegle wrote:
       | It would be cool to provide a deb repo as a way of subscribing to
       | your blog.
       | 
       | So `apt update` will pull in all blog posts, `man your-blog` will
       | show the latest post with links to the index of all your other
       | posts.
        
         | zerojames wrote:
         | _adds to their TODO list :D_
        
           | tutfbhuf wrote:
           | Please also create an AUR package for Arch Linux.
        
             | basilgohar wrote:
             | Release as a Flatpak and make it distro agnostic!
        
               | gkbrk wrote:
               | Make it even better and release an AppImage!
        
               | edvinbesic wrote:
               | Even even better, put it on the web so it's accessible
               | from anywhere
        
               | abulman wrote:
               | Whoa there. Let's not get crazy. It's plenty enough
               | to support multiple package managers, but what you are
               | considering is just too much!
        
               | hk1337 wrote:
               | Okay, just create an iOS application.
        
         | out_of_memory wrote:
         | first thing that came to my mind when i read the blog. it would
         | be a very cool thing. i hope this becomes a thing.
        
         | dcminter wrote:
         | On the one hand this idea is brilliant - on the other hand if
         | it caught on, there are some obvious opportunities for malware
         | intrinsic to the approach :'(
         | 
         | I think I'd be too chicken to subscribe.
        
         | zerojames wrote:
         | This is now in progress.
         | 
         | https://github.com/capjamesg/jamesg.blog.deb has all you need
         | to build a man-page-only deb file using:
         | 
         |     git clone https://github.com/capjamesg/jamesg.blog.deb
         |     cd jamesg.blog.deb
         |     dpkg-deb --build --root-owner-group jamesg.blog
         |     sudo dpkg -i jamesg.blog.deb
         | 
         | You should see:
         | 
         | ... Processing triggers for man-db (2.9.1-1) ...
         | 
         | Which indicates the man page for `man jamesg.blog` is
         | available. There is just a placeholder in there for now. I will
         | perhaps finish this tomorrow!
         | 
         | NB: This may become a blog post soon :D
        
         | potta_coffee wrote:
         | Release your blog as an Electron application please.
        
       | busfahrer wrote:
       | Speaking of URLs that do fancy stuff on the terminal, I once
       | stumbled upon this one from textfiles.com, it's essentially a
       | short animated movie on the terminal using VT100 terminal codes,
       | all served from a URI. On modern systems you can use rate-
       | limiting to watch it:
       | 
       | curl --limit-rate 1000 http://textfiles.com/sf/STARTREK/trek.vt
       | && reset
       | 
       | (the reset is there because it might mess up your terminal)
       | 
       | Other terminal-powered URIs:
       | 
       | curl cheat.sh/tar (gets examples on how to use the program after
       | the /)
       | 
       | curl wttr.in/berlin (gets weather info with terminal formatting)
        
         | hk__2 wrote:
         | If you want to do your own ASCII video via telnet, I did it in
         | Go a few years back: https://github.com/bfontaine/RickASCIIRoll
         | 
         | It's actually quite simple; the hardest part is to generate the
         | frames but that can be done with ffmpeg+img2txt.py:
         | https://github.com/bfontaine/RickASCIIRoll/tree/master/movie....
        
         | anthk wrote:
         | Use tritty to fake a 1200/9600 BPS baud rate.
        
         | kolme wrote:
         | There's also star wars over telnet:
         | 
         | https://itsfoss.com/star-wars-linux/
        
       | gglitch wrote:
       | Fun project :) Next step: Texinfo, which will output to info,
       | html, and pdf, among others, and which includes links and
       | indexes.
        
       | pjmlp wrote:
       | Cool experiment, as a means to serve UNIX man pages.
        
       | throwaway81523 wrote:
       | That doesn't look like man pages. Was it supposed to be an
       | example? (EDIT: oh nm, I see, I have to view the image. Nice. But
       | the web view should also look like that.)
       | 
       | I noticed the "written by human, not AI" logo, a little too cute
       | but the sentiment is good. I had been thinking of putting
       | something like "this page created by natural stupidity" in mine.
        
       | rahimnathwani wrote:
       | If you like this you might like mdless, which does exactly what
       | the name suggests.
       | 
       | https://github.com/ttscoff/mdless
        
       | divbzero wrote:
       | Now all we need is a Markdown to roff converter... which
       | apparently already exists.
       | 
       | https://github.com/postmodern/kramdown-man
       | 
       | https://rtomayko.github.io/ronn/ronn.1.html
       | 
       | https://kristaps.bsd.lv/lowdown/
        
         | ajvpot wrote:
         | I love pandoc[0] for tasks like this. It supports most markup
         | formats I've needed.
         | 
         | [0]: https://pandoc.org/
        
       | dpassens wrote:
       | The correct media type would be text/troff, as per RFC 4263
       | (https://www.rfc-editor.org/rfc/rfc4263.html).
        
       | jedberg wrote:
       | Ironically it breaks when you request _this_ page as a man page.
       | :)
        
       | medstrom wrote:
       | There is an Emacs package that installs Abelson & Sussman's SICP
       | (Structure and Interpretation of Computer Programs) into the Info
       | directory.
       | 
       | Simply type M-x package-install sicp RET
       | 
       | That gave me the idea that one could install a whole bookshelf of
       | blog archives through some modified feed reader. Reading Info in
       | Emacs, you even get bookmarks.
        
         | felideon wrote:
         | FYI, SICP is from Abelson and Sussman.
        
           | medstrom wrote:
           | My bad, thanks.
        
       | Gormo wrote:
       | There's no need to fork or use an intermediate file; you can just
       | pipe straight into `man`:
       | 
       | `curl -sL -H "Accept: text/roff"
       | https://jamesg.blog/2024/02/28/programming-projects/ | man -l -`
        
         | hereonout2 wrote:
         | Should have read the manual.
        
           | 0xEF wrote:
           | rtfm.lol I guess
        
             | zerojames wrote:
             | See https://news.ycombinator.com/item?id=39552852 for more
             | discussion.
        
             | hk1337 wrote:
             | https://rtfm.lol
             | 
             | already taken.
        
         | godelski wrote:
         | Please don't... Actually yrro posted something similar 2hrs
         | before you and now we have the whole {curl,wget} pipe into
         | command discussion again...                 Friends don't let
         | friends pipe streams into commands
         | 
         | https://news.ycombinator.com/item?id=39554044
        
       | adriangrigore wrote:
       | I prefer my own UNIX web dev solution https://mkws.sh! Man pages
       | are too much imo.
        
         | hk__2 wrote:
         | > I prefer my own UNIX web dev solution https://mkws.sh! Man
         | pages are too much imo.
         | 
         | Those are orthogonal subjects; you could generate your own
         | static pages with your generator AND also serve them as
         | manpages. The linked article does not suggest serving
         | manpages to everyone, just to user agents that request them.
        
       | anthk wrote:
       | With mandoc and maybe groff you can typeset pages to man pages,
       | HTML, PS and PDF files.
        
       | harryvederci wrote:
       | I feel like this is too accessible.
       | 
       | Anyone doing this but with Vim help files?
        
       | darkwater wrote:
       | Maybe I can search the interwebs to answer this question but I
       | prefer to ask the HN crowd. Back in high school I remember
       | someone (on an HP-UX) showing me that you could jump to any
       | underlined word (referring to a section) by pressing some key
       | combination, but for the life of me I cannot remember which. I
       | already checked man(1) and man(7), and could not find it either.
       | Maybe this is just a fake memory?
        
         | epcoa wrote:
         | I'm not familiar with any custom man viewer functionality but
         | maybe you're thinking of the CDE help viewer, dthelpview which
         | would display man pages.
        
           | darkwater wrote:
           | No, it was in a dumb text terminal. But maybe I just made it
           | up, and I'm confusing it with classic "hjkl" navigation.
        
       | _xerces_ wrote:
       | For the lazy or short of time, does with worth with tldr pages :)
       | https://tldr.sh/
        
       ___________________________________________________________________
       (page generated 2024-02-29 23:00 UTC)