[HN Gopher] I don't always use LaTeX, but when I do, I compile t...
       ___________________________________________________________________
        
       I don't always use LaTeX, but when I do, I compile to HTML (2013)
        
       Author : pyjamafish
       Score  : 184 points
       Date   : 2024-01-26 01:13 UTC (21 hours ago)
        
 (HTM) web link (www.peterkrautzberger.org)
 (TXT) w3m dump (www.peterkrautzberger.org)
        
       | pyjamafish wrote:
       | So, I originally posted this last year. When I posted it, I was
       | using tectonic as my LaTeX compiler, and since it didn't support
       | HTML output yet, I didn't actually try the article's suggestion.
       | 
       | Today, when I saw that I got an invitation to repost this article
       | from the mods, I thought I'd take the time to try it out.
       | 
       | The two commands that the article suggests can be combined into
       | one:
       | 
       |     latexmlpost --dest=mydoc.html --format=html5 <(latexml mydoc.tex)
       | 
       | I did a comparison[1] of pdflatex and latexml using some old
       | assignments, and it looks like compiling to HTML isn't fully
       | there yet: the spacing was off in some places, and manual line
       | breaks didn't work. But, I remain hopeful. If this gets polished,
       | viewing LaTeX documents on phones would be much nicer.
       | 
       | [1]: https://imgur.com/a/yyyXWL8
        
         | thewakalix wrote:
         | What's the advantage of that subshell redirection over a simple
         | pipe?
        
           | pyjamafish wrote:
           | I don't know if there's an advantage, haha. It was just the
           | first thing that came to mind.
           | 
           | Looks like a pipe is also supported; you just need to pass
           | `-` as the name of the file to `latexmlpost`:
           | 
           |     latexml mydoc.tex | latexmlpost --dest=mydoc.html --format=html5 -
        
             | tkw01536 wrote:
             | You can actually also use the latexmlc omni-executable [1]
             | (that is part of the latexml distribution), which can
             | convert to html in one command:
             | 
             |     latexmlc --dest=mydoc.html --format=html5 mydoc.tex
             | 
             | [1] https://math.nist.gov/~BMiller/LaTeXML/manual/commands/
             | latex...
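
The three invocations above can be collected in a small Python wrapper. This is only a sketch: the function names are made up for illustration, and the only flags used are the ones quoted in the thread (`--dest`, `--format=html5`, and `-` for standard input). It assumes the LaTeXML tools are on the PATH.

```python
import shlex
import subprocess


def latexml_pipeline_cmds(tex_file, html_file):
    """Two-stage form: latexml writes XML to stdout, latexmlpost reads it from '-'."""
    return [
        ["latexml", tex_file],
        ["latexmlpost", f"--dest={html_file}", "--format=html5", "-"],
    ]


def latexmlc_cmd(tex_file, html_file):
    """Single-command form using the latexmlc executable."""
    return ["latexmlc", f"--dest={html_file}", "--format=html5", tex_file]


def run_pipeline(tex_file, html_file):
    """Run the two-stage form, piping latexml's stdout into latexmlpost."""
    first, second = latexml_pipeline_cmds(tex_file, html_file)
    producer = subprocess.Popen(first, stdout=subprocess.PIPE)
    subprocess.run(second, stdin=producer.stdout, check=True)
    producer.stdout.close()
    if producer.wait() != 0:
        raise subprocess.CalledProcessError(producer.returncode, first)


if __name__ == "__main__":
    # Print the one-command form instead of running it, so the sketch
    # works even where LaTeXML is not installed.
    print(shlex.join(latexmlc_cmd("mydoc.tex", "mydoc.html")))
```

Building the argument lists separately from running them keeps the shell out of the picture, so neither process substitution nor a shell pipe is needed.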
        
         | marknazzaro wrote:
         | There's some good news... arXiv just adopted LaTeXML for in-
         | house HTML conversions of its papers. They allow users to
         | submit bug reports and have collected over 700 so far.
         | 
         | LaTeXML is maintained by a team at NIST, and they are actively
         | responding to the bug reports on github issues.
         | 
         | The LaTeX team headed by Frank Mittelbach is also working to
         | add more structural information to the output of LaTeX, which
         | will make compiling to HTML much easier.
        
         | PrimeMcFly wrote:
         | > Today, when I saw that I got an invitation to repost this
         | article from the mods
         | 
         | The mods personally invited you to repost a year later?
        
           | pyjamafish wrote:
           | Yes! I was surprised too. It's a cool hidden mechanic of HN,
           | the second chance pool[1].
           | 
           | [1]: https://news.ycombinator.com/item?id=26998308
        
             | PrimeMcFly wrote:
             | Interesting! Thanks for the link :)
        
       | acidburnNSA wrote:
       | Sphinx and reStructuredText are, IMHO, underrated powerhouses of
       | document building. With extensions, you can hook them up to
       | Zotero (or whatever)-managed bibtex files. You can render to
       | beautiful HTML files, and you get latex PDFs and epubs for free.
       | First class latex-math support, plenty of integrations with
       | things like mermaid, graphviz, and the ability to build super-
       | powerful custom directives to do basically anything. And way
       | simpler/easier than pure LaTeX.
       | 
       | Heck you can even integrate a full-on requirements management
       | system in them using sphinx-needs https://sphinx-
       | needs.readthedocs.io/en/latest/
        
         | mr_mitm wrote:
         | One of the selling points of PDF is that it is a single self-
         | contained file. I found this lacking in Sphinx and wrote an
         | extension for it to zip and bundle the assets into a single
         | HTML file: https://github.com/AdrianVollmer/Zundler
         | 
         | Also works with HTML documents produced in other ways.
        
           | o11c wrote:
           | Hmm, the disadvantage of your approach is that it
           | unconditionally requires Javascript, even if the original
           | didn't.
           | 
           | Also if you're going to embed a giant binary blob, _please_
           | ship a way to extract it.
        
             | mr_mitm wrote:
             | Yes, it's a trade-off.
             | 
             | Not a bad idea, thanks for the suggestion.
        
             | 3rd3 wrote:
             | Aren't the image blobs embedded in the URLs using
             | Base64-encoded strings rather than using JS?
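
For reference, embedding a blob directly in a URL without any JavaScript can look roughly like the sketch below. It is a generic `data:` URI example using only the Python standard library; it is not taken from Zundler and makes no claim about how that tool bundles its assets. The helper names are hypothetical.

```python
import base64
import html


def to_data_uri(payload: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URI."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def img_tag(payload: bytes, mime_type: str, alt: str = "") -> str:
    """Build an <img> element whose source is embedded in the page itself."""
    src = to_data_uri(payload, mime_type)
    return f'<img src="{src}" alt="{html.escape(alt)}">'


if __name__ == "__main__":
    print(to_data_uri(b"hi", "text/plain"))
```

Base64 inflates the payload by about a third, which is one reason a bundler might prefer a compressed archive unpacked by a script instead.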
        
           | markdoubleyou wrote:
           | You're getting close to making your own CHM format, which
           | Sphinx could make for you.
           | 
           | I always thought CHM files were a nice self-contained option
           | for multi-page HTML docs. (Though they'd happily execute
           | whatever JavaScript the author embedded in there... Maybe
           | that's why they fell out of favor?)
        
             | mr_mitm wrote:
             | It would be great if there was an open CHM-like format that
             | was supported by all major browsers. The nice thing about
             | browsers is that everyone already got one installed. They
             | can even open PDFs natively these days. Sadly, they cannot
             | even open epubs (which is almost like CHM without
             | interactivity). I believe firefox used to be able to open
             | epubs, not sure what happened.
        
               | jhoechtl wrote:
               | Edge could. MS cut it out long before the move to the
               | chrome rendering engine.
        
               | WorldMaker wrote:
               | Edge supported epub until the bitter end of the Spartan
               | renderer. It was only Microsoft's attempt at an ebook
               | store that died long before that. Admittedly, most
               | people's visibility into Edge epub support was through
               | the Store and the sidebar dedicated to store purchases,
               | but if you had no other book reader app take over the
               | .epub file extension (or if you realized that you could
               | drag and drop DRM-free .epub files into new tabs) Edge
               | would still read them right up to the Chromium switch.
        
               | Shorel wrote:
               | And it was probably the best EPUB reader available on
               | Windows.
               | 
               | Particularly because of the text-to-speech engine
               | features.
        
               | WorldMaker wrote:
               | I think it was too. I also think a lot of people missed
               | that there was an app in the Microsoft Store from some
               | team adjacent to the Edge team at the time called the
               | boring and easy to overlook name "Reader" that _just_ had
               | the PDF and EPUB viewers from Edge in a file-based UI
               | instead of browser chrome UI. It was such a useful app
               | and you could set it to default for PDF (in Windows 8 and
               | the early years of 10) and EPUB files (in early Windows
               | 10, with some effort). I never understood why their ebook
               | store effort focused on a sidebar in Edge that didn't
               | work like anything else in Edge instead of beefing up a
               | file-based app like Reader. Reader also died when Edge
               | went to Chromium and I still miss it as a lightweight and
               | fast PDF reader.
        
               | WorldMaker wrote:
               | The "Portable EPUBS" discussion happening nearby is on
               | this subject, too.
               | 
               | https://news.ycombinator.com/item?id=39138042
        
           | acidburnNSA wrote:
           | If you just run sphinx-build with the latex builder and then
           | run xelatex or pdflatex on the result you'll get one fully-
           | consistent PDF with everything in it, including fully
           | functional internal hyperlinks. That's what I do for PDF. I
           | can make big documentation packages this way building 2000
           | page pdfs in a minute or two on a modest laptop.
           | 
           | Wait: also, how is what you're saying different from the
           | built-in singlehtml builder? https://www.sphinx-
           | doc.org/en/master/usage/builders/index.ht...
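
The latex-builder route described above could be scripted along these lines. Treat it as a sketch under assumptions: the directory names and `project.tex` are placeholders (the real .tex file name depends on the project's Sphinx configuration), and a large document may need extra passes or index tools that are not shown here.

```python
import subprocess
from pathlib import Path


def sphinx_pdf_steps(source_dir, build_dir, tex_name):
    """Return (argv, working_directory) pairs for a Sphinx -> LaTeX -> PDF build."""
    latex_dir = Path(build_dir) / "latex"
    build = ["sphinx-build", "-b", "latex", str(source_dir), str(latex_dir)]
    typeset = ["xelatex", "-interaction=nonstopmode", tex_name]
    # xelatex runs twice so that internal hyperlinks and the table of
    # contents are resolved on the second pass.
    return [(build, None), (typeset, latex_dir), (typeset, latex_dir)]


def run_steps(steps):
    """Execute each step in order, stopping at the first failure."""
    for argv, cwd in steps:
        subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # "docs", "build" and "project.tex" are hypothetical names.
    for argv, cwd in sphinx_pdf_steps("docs", "build", "project.tex"):
        print(cwd or ".", ":", " ".join(argv))
```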
        
             | mr_mitm wrote:
             | In the product of the singlehtml builder, you will have the
             | entire document in one single DOM tree. For large
             | documents, even modern browsers on a modern machine will be
             | brought to their knees.
             | 
             | Check out the CPython docs for example:
             | https://adrianvollmer.github.io/Zundler/output/cpython.html
             | 
             | This is a huge document, and having this all rendered
             | naively in one single page will not only be hard to
             | navigate, it will also feel really sluggish if not crash
             | the browser.
        
               | acidburnNSA wrote:
               | Ah, ok, so you want a PDF-like single file but in HTML in
               | a way that's more efficient/scalable than the built-in
               | singlehtml builder. Ok fair enough.
               | 
               | For my use cases, the default multi-file HTML builds are
               | ok, and I just pound out a latex-builder generated PDF
               | for the archives.
        
         | anta40 wrote:
         | I guess latex is still unbeatable for writing complex math
         | expressions. These days, when I don't need that, I'm happy with
         | AsciiDoc.
        
           | acidburnNSA wrote:
           | Sphinx/reStructuredText supports math in LaTeX input format
           | [1], so you can still go nuts with complex math expressions
           | while still benefitting from the relative simplicity.
           | 
           | [1] https://www.sphinx-
           | doc.org/en/master/usage/restructuredtext/...
           | 
           | Looks like AsciiDoc supports similar latex math blocks [2].
           | Are there reasons you can't stick with that when doing math?
           | 
           | [2] https://docs.asciidoctor.org/asciidoc/latest/stem/#block
        
             | anta40 wrote:
             | For example: writing complicated expressions involving
             | calculus or matrices. That's not something I need every day,
             | though.
        
               | acidburnNSA wrote:
               | I have documented at least 10 x 10 matrices with rst math
               | directives and found it to be pretty convenient. I don't
               | understand what the benefit of pure latex is in this
               | context.
        
           | chaxor wrote:
           | Typst.
           | 
           | Typst is better IMO
        
             | jamiedumont wrote:
             | As a certified grumpy old developer I spent years writing
             | off the "X but in Rust" projects, but I have to confess
             | that a lot of good things with meaningful improvements have
             | come from the rewrite-everything-in-Rust movement.
             | 
             | I've not used Typst and not authored much LaTeX (but worked
             | on a project with a group of scientists who used nothing
             | but LaTeX) and can see obvious advantages to Typst. Same
             | with many, many other Rust libraries.
        
               | kuschkufan wrote:
               | So funny to me that people assume, oh it's written in
               | Rust, so it must be a rewrite of something else just so
               | they can use Rust.
               | 
               | They never imagine that people choose Rust for something
               | they want to implement anyway and not just to replicate
               | something existing, that they do not want to use since
               | it's not implemented in Rust????
        
               | jamiedumont wrote:
               | Oh I know there's loads of original Rust work, but you
               | have to acknowledge that the "X, but in Rust" trope
               | exists.
        
               | reaperman wrote:
               | Yep even as a big fan of it...it's definitely a trope.
               | And one that's very easy to either dismiss or make fun
               | of. It would be a bit strange for fans to feel
               | defensiveness or denial over that.
        
               | avgcorrection wrote:
               | jamiedumont let out a rambunctious laugh to himself.
               | 
               | - Ah, you got me good you meddling kids!
               | 
               | jamiedumont was talking to himself again.
               | 
               | hackerbod slowly leaned over and squinted at the screen.
               | 
               | - Uh Typst?
               | 
               | - Yeah! It's a typesetting markup language. It's supposed
               | to be better than things like latex.
               | 
               | - Ok. What's so funny about it?
               | 
               | - Oh hehe, it's written in--guess what?
               | 
               | - I dunno?
               | 
               | - Rust!
               | 
               | jamiedumont started giggling but hackerbod remained
               | neutrally unamused.
               | 
               | - Oh come on! Rewrite in Rust? Language zealots? Young
               | adults who can't program without some Ruby syntax
               | sprinkled in?
               | 
               | - So this "typt" thing--
               | 
               | - Typst.
               | 
               | - Right, Typst, this typesetting thing was created to
               | promote Rust in some way?
               | 
               | - Oh I don't think so.
               | 
               | - It doesn't mention Rust on the homepage or something?
               | You know, Written in Rust?
               | 
               | - Nope. Not to my recollection.
               | 
               | - So is it a rewrite of something else in--
               | 
               | - Nope.
               | 
               | - So then what does that have to do with--
               | 
               | - Ah, but you're missing the bigger picture, hackerbod.
               | 
               | - Ok.
               | 
               | - Year after year of this eye-rolling promotion and
               | nagging, blah blah blah memory unsafety is bad, blah blah
               | this is why we used angle brackets for generics, and
               | these sly bastards went and pulled off the most epic
               | Trojan Horse that I've ever seen been--
               | 
               | - And what's that?
               | 
               | - They made an actually useful language!
               | 
               | hackerbod had to scoot back as jamiedumont fell off his
               | swivel chair because he was laughing so hard. hackerbod
               | scratched his head.
               | 
               | jamiedumont finally recovered from the ab-induced
               | euphoria.
               | 
               | - Ah hackerbod, I hate to admit it but they got me good!
               | Those cursed language zealots got one over on me!
        
               | jamiedumont wrote:
               | I...I don't know what to make of this!
        
               | chaxor wrote:
               | I think that typically a rewrite in, well _anything_ ,
               | can be helpful - simply because the first write wasn't
               | sure of what may work or what the correct model for the
               | system should be, or how to handle specific parts of the
               | system etc.
               | 
               | A rewrite in Rust can be good for those reasons, as it
               | removes the "cruft" of old implementation, but also gets
               | the nice properties of speed and such.
               | 
               | But ultimately the thing I love most about Rust is not
               | even the safety and such - it's the package management
               | and build system. Just look at the horrible python/js
               | scene for how bad packaging and build systems can be, and
               | you'll understand why that basic uniform experience can
               | be so nice.
        
             | kuschkufan wrote:
             | No HTML export yet. Which this post is about.
        
             | BeFlatXIII wrote:
             | I wish Textile had won instead of plain Markdown. What are
             | the benefits of Typst over the ConTeXt family?
        
           | bobbylarrybobby wrote:
           | Asciidoc supports math blocks, and there's an extension to
           | render them at compile time
        
         | wodenokoto wrote:
         | I write a fair amount of reports professionally and I use word.
         | 
         | Getting data from my Python analysis into the reports is
         | tedious at best, and updating numbers at the last minute is
         | hair-pullingly frustrating.
         | 
         | But because of the good wysiwyg I can cheat on my adjustments
         | when I need a graph to go "just there", I can edit my paragraph
         | wording such that I don't get an almost completely blank page in
         | between sections, etc, etc which is important to make a good
         | looking report, imho.
         | 
         | How do you go about that with rst? I'd love to write a
         | templated rst file that can be fed from my excel sheets and
         | Python scripts, but how do I go about final layout adjustments?
        
           | acidburnNSA wrote:
           | I've gone a few routes. I have used sphinx's singlehtml
           | builder to make a huge HTML file and then used pandoc to
           | convert it into docx for final adjustments. This worked
           | surprisingly well on a 2000-page document. But it's a bit
           | kludgy.
           | 
           | Another (non-Sphinx) thing you can do is just write (portions
           | of) your docx reports directly from Python using python-docx
           | [1]. I use this approach when people give me strict docx
           | templates that need to be filled in from Python in a very
           | specific way. It can drop data-generated tables in at special
           | placeholder sections and everything.
           | 
           | [1] https://python-docx.readthedocs.io/en/latest/
           | 
           | I will say that I've been more and more happy with just using
           | sphinx straight to pdf for very professional looking reports.
           | Given some latex preamble work in the config you can get it
           | looking quite nice. I haven't personally struggled recently
           | with too many egregious formatting issues on the sphinx-built
           | latex stuff. You do have to swap over to landscape mode for
           | large tables, etc. so it takes some work. But you're right
           | that in many cases, formatting issues do still happen, so
           | YMMV.
           | 
           | Another neat trick in sphinx is the csv-table directive [2],
           | which loads table data directly from a csv file you have
           | around, which you can obviously get from your xlsx.
           | 
           | [2] https://docutils.sourceforge.io/docs/ref/rst/directives.h
           | tml...
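
As a rough illustration of the csv-table route, a Python analysis script can write its numbers to a CSV file and emit the matching directive text. The helper names, the file name `results.csv`, and the sample figures are invented for the example; `:file:` and `:header-rows:` are standard options of the docutils csv-table directive.

```python
import csv
import io


def rows_to_csv(rows) -> str:
    """Serialize rows (lists of values) to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def csv_table_directive(title: str, csv_path: str, header_rows: int = 1) -> str:
    """Return rst that loads a table from an external CSV file."""
    return (
        f".. csv-table:: {title}\n"
        f"   :file: {csv_path}\n"
        f"   :header-rows: {header_rows}\n"
    )


if __name__ == "__main__":
    # Made-up numbers standing in for the output of an analysis script.
    rows = [["metric", "value"], ["mean", 3.5], ["count", 12]]
    with open("results.csv", "w", newline="") as handle:
        handle.write(rows_to_csv(rows))
    print(csv_table_directive("Results", "results.csv"))
```

Because the report only references the CSV file, a last-minute change to the numbers means rerunning the script and rebuilding, with no manual edits to the table.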
        
             | michaelrpeskin wrote:
             | I do something similar for my reports. I write most of it
             | in markdown using Typora and then I export the last draft
             | to docx for fine tuning and distribution (the agencies I
             | work with want docx submissions, not pdf, which always
             | bothers me).
             | 
             | Typora uses pandoc to do the conversion. My reports are
             | mainly text, charts, and lots of math formulae and it works
             | great. You don't get fine adjustment of layout, but I find
             | that a feature not a bug. I see so many people waste time
             | to put a figure in just the right place. It doesn't matter.
             | The goal is clear information transfer so just get the
             | figure in the doc where it makes sense and go on.
        
           | chaxor wrote:
           | Try out Typst.
           | 
           | It senses changes to any file and auto-updates the doc
           | _lightning fast_ - it's far better than LaTeX IMO
        
             | kuschkufan wrote:
             | No HTML export yet. Which this post is about.
             | 
             | Though I too like typst and am subscribed to their Github
             | issue for HTML export, that maybe some day will be
             | available.
        
           | PrimeMcFly wrote:
           | There's a lot you can do with latex to automatically import
           | data and update automatically from external sources, and
           | while it might seem counter-intuitive it is much easier and
           | less effort than Word's wysiwyg interface.
        
             | wodenokoto wrote:
             | I'm jealous of how easy it is to import data when using a
             | structured source code like format such as rst, markdown or
             | latex. I'm sticking with word because I can easily do small
             | layout adjustments like decreasing the margins of a table
             | to make it fit on a page, or easily see when a paragraph is
             | 1 or 2 words too long, causing it to shift all sorts of
             | elements across pages.
        
               | PrimeMcFly wrote:
               | You can do that with Latex as well? I use TexStudio which
               | has a preview pane. Any time I make changes I hit f5 and
               | it updates pretty quickly. It's not instant but pretty
               | close to it, and there are already fewer problems with
               | things shifting around because it manages that better
               | than Word does, by design.
        
           | spinningslate wrote:
           | I've recently switched to Quarto[0] with RStudio desktop[1]
           | as the editor. It's my preferred approach for all writing
           | now:
           | 
           | 1. Great markdown editor with both source and WYSIWYG views
           | 
           | 2. Render to a wide range of formats including html, pdf,
           | epub, docx
           | 
           | 3. Generate books, web sites, single page docs, presentations
           | 
           | 4. Incorporate code (like jupyter) except the source is plain
           | text with fenced blocks
           | 
           | 5. Supports code in a number of languages including Python
           | and R.
           | 
           | 6. Can use other editors too (iirc there's a plugin for VS
           | Code though never tried it).
           | 
           | 7. Built in support for MathJax for mathematical formulae and
           | Mermaid for text-based diagramming with auto inline preview
           | 
           | I prefer it to Word for writing and jupyter for notebooks. No
           | affiliation to Posit, the company that develops both Quarto &
           | RStudio. Just a fan of the products.
           | 
           | --
           | 
           | [0]: https://quarto.org/ [1]:
           | https://posit.co/download/rstudio-desktop/
        
         | DrSantow wrote:
         | I agree! I've been also using this as a personal website (for
         | academia). This works like a charm. It's easy to render any
         | equation, and it's fast (because not bloated).
        
         | ReleaseCandidat wrote:
         | It is too complex compared to Markdown and hasn't got enough
         | features to be comparable to Latex. And I still (almost) use
         | the same Latex templates that I used at university, 25 years
         | ago.
        
         | mgaunard wrote:
         | I forced myself to use it recently, I mostly found it to be
         | both limited (cannot have part of a link in bold or italics)
         | and inconvenient (each line of inline code must be indented).
        
           | acidburnNSA wrote:
           | It does have some limits, for sure. I haven't tried bolding a
           | portion of a url before.
           | 
           | I have enjoyed including inline code using the literalinclude
           | directive, which allows you to just include sections of code
           | directly from a file on disk. This is great because
           | you can cover your example code with unit tests while also
           | talking about it in docs without replication. You can even
           | use little border comments to mark snippet sections so that
           | it's not sensitive to specific line numbers.
           | 
           | https://www.sphinx-
           | doc.org/en/master/usage/restructuredtext/...
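
The "border comment" idea can be pictured with a few lines of Python. The sketch below only imitates, in simplified form, what the `:start-after:` and `:end-before:` options of Sphinx's literalinclude directive do; it is not Sphinx's implementation, and the marker strings are arbitrary examples.

```python
def extract_snippet(source: str, start_after: str, end_before: str) -> str:
    """Return the lines between two marker comments, excluding the markers."""
    lines = source.splitlines()
    start = None
    for index, line in enumerate(lines):
        if start is None:
            if start_after in line:
                start = index + 1  # the snippet begins after the marker line
        elif end_before in line:
            return "\n".join(lines[start:index])
    raise ValueError("start or end marker not found")


EXAMPLE = """import math
# [snippet-start]
def area(radius):
    return math.pi * radius * radius
# [snippet-end]
print(area(1))
"""

if __name__ == "__main__":
    print(extract_snippet(EXAMPLE, "[snippet-start]", "[snippet-end]"))
```

Since the markers travel with the code, inserting or deleting lines elsewhere in the file does not shift what the documentation shows.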
        
         | fireflash38 wrote:
         | Sphinx/rst are a nice middle ground between the simplicity of
         | markdown and complexity of LaTeX. I used it to generate a lot
         | of html docs for test reports. I did try pdf gen via
         | LaTeX and pdflatex for a bit, but stopped after the pdf was
         | breaking the multiple thousands of pages.
         | 
         | And it's really tweakable, especially with html output where
         | you can provide your own templates, or add in your own
         | CSS/scripts even manual tags.
        
           | PeterisP wrote:
           | Providing my own templates is kind of a weird feature,
           | because that's not really what I want (in the sense "people
           | don't want to buy drills, they want to buy holes") -
           | obviously that's a necessary feature, but I never ever want
           | to make my own template, what I want instead is to have a
           | template that does _exactly_ what I need but that's made and
           | maintained by someone else.
           | 
           | E.g. I don't care about a configurable formatting for
           | bibliography, but I would want a pre-made template that
           | implements the APA bibliography guidelines with _all the tiny
           | nuances_ correctly. I don't want to configure margins for
           | columns, I want a template that does the IEEE formatting
           | standard exactly. (95% compatibility doesn't work, if a
           | single missing feature means the tool can't produce the
           | required document because it's wrong at one spot on page 3,
           | then I'd need to abandon the tool and pick something that
           | works). And crucially, I want the separation between content
           | and formatting so that I can easily take a blob of content
           | that was formatted for one layout and just copy it in a
           | completely different template and have it match the new
           | formatting guidelines, e.g. automatically moving all the
           | image captions to the other side, changing how they're
           | numbered and referenced, etc.
           | 
           | Latex has all this baggage solved, almost everyone who wants
           | a specific format from me will provide a Latex template with
           | their weird typesetting fetishes included, and I just need to
           | provide the content - while any upcoming tool has an uphill
           | battle to become compatible and provide the same things, at
           | the very least pre-made (and _well_ made) templates for all
           | the major formats (each discipline of science generally uses
           | something different).
        
       | mattl wrote:
       | I write markdown, use pandoc to make LaTeX and from that a PDF
       | for a printed thing and just supply markdown for non-printed
       | stuff.
        
         | davidthewatson wrote:
         | I was surprised recently when I changed up my HTML and PDF
         | toolstack not just how good pandoc was, but the entire
         | ecosystem that had emerged around pandoc including pandocomatic
         | and pandoc-resume.
        
           | mattl wrote:
           | pandoc is so good. And volunteer maintained.
        
         | chaxor wrote:
         | Typst is pretty close to markdown for simple things, and scales
         | nicely to hard things. So you don't really need to worry about
         | the markdown-pandoc shuffle anymore.
        
           | amai wrote:
           | Unfortunately typst doesn't support HTML output. It can only
           | generate PDFs.
        
       | xattt wrote:
       | Correct me if I'm wrong, but there isn't a way to do a compile
       | that incorporates Biblatex.
        
         | acidburnNSA wrote:
         | I've started auto-exporting Zotero-managed references to a
         | bibtex file using better bibtex [1] and then using Sphinx and
         | reStructuredText to process them uniformly into nicely
         | formatted HTML, pdf, and epub using sphinxcontrib-bibtex [2].
         | 
         | [1] https://github.com/retorquere/zotero-better-bibtex
         | 
         | [2] https://sphinxcontrib-
         | bibtex.readthedocs.io/en/latest/usage....
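
A minimal `conf.py` fragment for this setup might look like the following. It is a sketch, not the commenter's actual configuration: `references.bib` is a placeholder for wherever the Zotero export lands, and the style choice is just one of the options the extension offers.

```python
# conf.py (fragment) for a Sphinx project using sphinxcontrib-bibtex.

# Enable the bibliography extension.
extensions = ["sphinxcontrib.bibtex"]

# BibTeX files to load; "references.bib" is a hypothetical path for the
# file that the Zotero auto-export keeps up to date.
bibtex_bibfiles = ["references.bib"]

# Bibliography style; "unsrt" lists entries in citation order.
bibtex_default_style = "unsrt"
```

Within the reStructuredText sources, the extension's cite role and bibliography directive then pull entries from that file.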
        
       | bradrn wrote:
       | I honestly don't see the point of using LaTeX if you're
       | generating HTML. The great strength of LaTeX, in my view, is the
       | precise control it provides over typography and formatting. As
       | such, it works best with an output format which can faithfully
       | render these documents -- such as PDF. For an output format like
       | HTML, which encourages reflowability over faithful rendering, I'd
       | much prefer to use an 'easier' document format like Markdown or
       | reStructuredText.
        
         | golol wrote:
         | Exactly, there is a triangle of tradeoffs here: prettiness vs
         | easiness vs responsiveness. You can only have 2 of them. Pretty
         | and easy is Latex. The reason people call CSS a nightmare is
         | because responsiveness fundamentally makes it much more
         | difficult to make a document pretty. So HTML+CSS gives you
         | pretty + responsive or easy + responsive. That's not the same
         | functionality as a pdf for a fixed scientific document.
        
       | seeknotfind wrote:
       | I spent a few weeks last year doing the opposite, HTML to LaTeX
       | in order to print and nicely typeset top HN articles, so I'd have
       | a nicely printed booklet each morning. I think creating hard
       | copies of web content for offline reading holds a lot of promise,
       | but the internet is a beast.
        
         | AzuraIsCool wrote:
         | Interesting, I have done exactly that too! I have it sent to my
         | laser printer to print out just before I wake up.
        
         | PrimeMcFly wrote:
         | > so I'd have a nicely printed booklet each morning.
         | 
         | Why? If you're just printing to read on the train or whatever,
         | wouldn't you just discard after reading?
        
       | generationP wrote:
       | This is from 2013, so the bet that "nobody will want to read
       | [PDFs] in 5 years" can be considered failed. If anything, PDF has
       | become the lingua franca of the academic web, crowding out even
       | DjVu at the thing that DjVu was made for and PDF was not.
       | 
       | I have not been following the development of mathjax, pandoc,
       | etc. carefully, so I'm wondering: Have the main issues been
       | solved? By these I mean
       | 
       | (1) support for most popular packages,
       | 
       | (2) automatically breaking long outputs into small pages that
       | don't overheat my laptop or crash my browser and yet reference
       | each other properly,
       | 
       | (3) printability (without lines broken in half, senseless
       | overflows and the like) or cross-compilability with a regular
       | PDF compiler?
       | 
       | I know the ar5iv project is getting closer and closer to (1) and
       | (3), but is that available to regular users?
        
         | bloaf wrote:
         | And it is a shame. The current AI explosion is the poorer for
         | it, due to the greater difficulty of extracting the text from
         | PDFs.
        
         | adastra22 wrote:
         | mathjax has come tremendously far, but not on the problems you
         | mention :(
        
         | bowsamic wrote:
         | The problem with DjVu is that its viewers suck, especially on
         | macOS, which is very popular in modern academia
        
         | roel_v wrote:
         | But don't worry, 2024 is going to be the Year Of Math On The
         | Web.
         | 
         | (I've been trying to do 'math on the web' (ish) since 2002,
         | and it's always sucked in some way; and all that time,
         | images/pdf have Just Worked(TM). The emphasis in the OP on how
         | much you'll have to report/chip in/fix is telling...)
        
       | matt3210 wrote:
       | At work all reports are html. If you want pdf, cmd-P
        
       | whatever1 wrote:
       | I don't always use latex but when I do I always hate it.
        
       | IAmLiterallyAB wrote:
       | Tangentially related, does anyone have experience with AsciiDoc?
       | I've used reStructuredText before, but AsciiDoc is tempting, it
       | looks cleaner.
        
         | pbronez wrote:
         | Asciidoc has potential. Last time I dug into it the ecosystem
         | was lacking, but there were glimmers of a reboot. I hope that
         | pulls through because it's a great format.
         | 
         | Edit: yeah it's managed through the Eclipse Foundation now.
         | They're slowly working towards a formal spec, haven't hit 1.0
         | yet.
         | 
         | Details here https://gitlab.eclipse.org/eclipse/asciidoc-lang/asciidoc-la...
        
         | jiehong wrote:
         | Using it for internal docs, but we don't generate pdfs so I
         | can't comment on that part.
         | 
         | I personally find asciidoc easier to write manually.
        
         | throwaway290 wrote:
         | I had experience with AsciiDoc and personally not a fan. IMO it
         | has weird features like totally illegible compact table syntax
         | (seriously, that stuff is worse than XML) and the spec looks
         | abandoned. But I keep seeing it being used, I guess it appeals
         | to people who want something more flexible than Markdown (and
         | who like Ruby, or they would go with RST)
        
         | lkuty wrote:
         | There is also Asciidoctor ( https://asciidoctor.org/ ), which is
         | alive and well. I am using it for technical CS documentation
         | internally, but only for single page documents. I did not try
         | to deploy their whole multi-document setup called Antora (
         | https://antora.org/ ).
        
       | abdullahkhalids wrote:
       | LaTeXML [1] is presumably the LaTeX-to-HTML tool that arXiv is
       | testing right now. What are people's thoughts about it compared to
       | other such tools?
       | 
       | [1] https://github.com/brucemiller/LaTeXML
        
       | j2kun wrote:
       | The recommendation to use Markdown+MathJax falls short when you
       | want to write longer documents with numbered sections, subsections,
       | and theorem/definition/figure etc. tracking and referencing.
       | 
       | I'm sure with Sphinx and reStructuredText you can get that large-
       | scale document tracking stuff, but with LaTeX it just works for
       | the most part and you don't need to juggle a bunch of different
       | side-projects and extensions. Plus you get things like automatic
       | index generation (for a physical book).
        
         | bigpeopleareold wrote:
         | I searched for a comment that supports the fact that LaTeX shines
         | in certain areas.
         | 
         | My memory of LaTeX has weakened over the years, since I am not
         | writing long texts with lots of figures and such, but I know
         | it's more than this statement lets on in the article:
         | "Something that is more modern than learning a hundred bits of
         | print typesetting that your student will never, ever need?"
         | 
         | What exactly, in the end, is 'modern'? Is it because there
         | is less syntax in Markdown to remember and the Modern is
         | syntax-averse? :D Aren't there editors for these in the first
         | place to avoid the daily grind of remembering syntax?
        
           | BlueTemplar wrote:
           | Modern as in "more recent" (and not as in "the modern era"
           | that ended decades ago). More recent doesn't mean better
           | though : the likes of Overleaf, Google Docs, Github are also
           | "more modern" than some of their alternatives, yet ought to
           | be avoided like the plague.
        
         | phiresky wrote:
         | Markdown actually works great for larger documents when you use
         | it with pandoc [1]. That way you get HTML output _and_ PDF
         | output via Latex, without the HTML being a second class
         | citizen.
         | 
         | I wrote my thesis (50 pages) and multiple published papers this
         | way. Maybe it seems janky but honestly my experience with LaTeX
         | and its 10 incompatible compilers and thousands of
         | semi-incompatible packages has been much worse.
         | 
         | I also don't understand why (academic) publishing is so PDF
         | focused. It's a horrible format to read on screens (think
         | multi-column PDFs, and scrolling / jumping up and down to find
         | references), and who actually prints stuff anymore?
         | 
         | The thing I love most about Pandoc is that my notes can just
         | slowly turn into a fully fledged document. Like bullet points -
         | The syntax in Latex is far too verbose to make taking notes
         | with it comfortable.
         | 
         | It's also much easier to extend, I wrote a simple tool that
         | automatically converts URLs into full and correctly formatted
         | citations, so I don't even need a citation manager to get the
         | same results:
         | 
         |     The GAN was first introduced in
         |     [@gan](https://papers.nips.cc/paper/5423-generative-adversarial-nets).
         | 
         | Turns into https://github.com/phiresky/pandoc-url2cite/blob/master/exam...
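         | 
         | The idea of keeping the URL right next to the citation key can
         | be sketched in a few lines of Python. This is not how
         | pandoc-url2cite is implemented (the real tool also has to turn
         | each URL into a formatted citation); it only illustrates pulling
         | [@key](url) pairs out of Markdown, and the function name
         | split_url_citations is made up for the example:

```python
import re

# A pandoc-style citation key written as a Markdown link: [@key](url)
CITE_LINK = re.compile(r"\[@(?P<key>[^\]\s]+)\]\((?P<url>[^)\s]+)\)")


def split_url_citations(markdown):
    """Return the text with links reduced to [@key], plus a key -> URL map."""
    urls = {}

    def replace(match):
        # Remember where the key points, then leave a plain citation behind.
        urls[match.group("key")] = match.group("url")
        return "[@" + match.group("key") + "]"

    return CITE_LINK.sub(replace, markdown), urls


text = (
    "The GAN was first introduced in "
    "[@gan](https://papers.nips.cc/paper/5423-generative-adversarial-nets)."
)
plain, urls = split_url_citations(text)
print(plain)
print(urls)
```

         | The collected key -> URL map is what a citation lookup step
         | would then consume; that part is left out here.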
         | 
         | Another great project with similar structure is Manubot [3],
         | though the PDFs there are not generated by LaTeX.
         | 
         | [1]: https://pandoc.org/
         | 
         | [2]: https://github.com/phiresky/pandoc-url2cite
         | 
         | [3]: https://manubot.org/
        
       | dwheeler wrote:
       | One solution is to embed alternatives within PDF itself.
       | LibreOffice can embed inside a PDF the original editable source
       | in ODF format. You could also embed ePub. That would mean you
       | would have a single file that could be processed in many useful
       | ways.
        
       | froh wrote:
       | I just moved "up" from gfm markdown to asciidoc and oh do I miss
       | LaTeX.
       | 
       | html rendering of LaTeX is a godsend. and imnsho asciidoc a work
       | around to not fully having that.
        
       | bmacho wrote:
       | I feel ambivalent about LaTeX.
       | 
       | I don't like the language, the ecosystem is too big, complicated
       | and breaks, but the end result is hard to do any other way.
       | 
       | This applies to both the equations part and the text reflow part (I
       | think of them as separate things, but they usually go together).
       | 
       | It should be possible to write text in HTML or markdown, and
       | write the equations in latex or asciimath, and turn it into a
       | beautiful/article style pdf, but sadly it is not.
       | 
       | Although CSS (colored and rounded boxes and such) + MathJax-SVG
       | also can look nice.
        
         | loxdalen wrote:
         | I believe I have used pandoc to convert markdown to PDF. Maybe
         | this is something you could try?
        
           | criddell wrote:
           | That's probably what they were referring to when they
           | described it as big, complicated, and fragile.
           | 
           | https://pandoc.org/chunkedhtml-demo/2.4-creating-a-pdf.html
        
             | Muehe wrote:
             | Well you need to install the appropriate texlive
             | dependencies which can be somewhat complicated, but once
             | that's done it's just writing inline LaTeX
             | 
             |     $$\like{this}$$
             | 
             | into your Markdown files and then doing
             | 
             |     pandoc -f markdown -t pdf -o output.pdf input.md
             | 
             | Haven't used this in a while and just tried it again, was
             | just a matter of searching a few error messages, gleaning
             | the missing texlive package names from the results, and
             | installing them. Works like a charm now.
             | 
             | I also had this working for Markdown to HTML conversion
             | back in the day when I needed it, but that requires the
             | website using a JS library like Mathjax.
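             | 
             | For scripting this, the invocation can be wrapped in a few
             | lines of Python. A rough sketch only: the function names
             | are invented for the example, and the HTML variant assumes
             | pandoc's --standalone and --mathjax options rather than
             | reproducing the original setup:

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_pdf_command(source, target):
    """Argument list for the Markdown-to-PDF conversion."""
    return ["pandoc", "-f", "markdown", "-t", "pdf", "-o", target, source]


def pandoc_html_command(source, target):
    """Argument list for a standalone HTML page whose math is left to MathJax."""
    return ["pandoc", "-f", "markdown", "--standalone", "--mathjax",
            "-o", target, source]


def run_if_possible(command, source):
    """Run pandoc only when both the binary and the input file are present."""
    if shutil.which("pandoc") is None or not Path(source).exists():
        return False  # nothing to do on machines without pandoc or the input
    return subprocess.run(command).returncode == 0


pdf_command = pandoc_pdf_command("input.md", "output.pdf")
html_command = pandoc_html_command("input.md", "output.html")
```

             | Calling run_if_possible(pdf_command, "input.md") then
             | performs the conversion, provided the texlive packages
             | mentioned above are installed.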
        
         | bowsamic wrote:
         | Using REVTeX I honestly have no issues with LaTeX, especially
         | if I just stick to Overleaf
        
         | i_am_proteus wrote:
         | It's entirely possible. One tool one could use for this is
         | Quarto: https://quarto.org/
        
         | jhoechtl wrote:
         | Time to sunset Latex
         | 
         | https://github.com/typst/typst
        
           | kuschkufan wrote:
           | No HTML export yet. Which this post is about.
        
             | datadeft wrote:
             | I am hoping this is going to be implemented soon:
             | 
             | https://github.com/fenjalien/obsidian-typst/issues/5
        
           | Diti wrote:
           | How do you handle internationalization, and, in particular,
           | hyphenation? That's the main thing I use LaTeX for (well,
           | specifically XeTeX & Tectonic, which are pretty modern).
           | Without those two features, one might as well use
           | LibreOffice, no?
        
           | vouaobrasil wrote:
           | It will be hard to replace LaTeX. I still use it. It's
           | virtually bug-free and compiles documents from 30 years ago.
           | I sincerely think it will be around for another 30. It's
           | tried and tested and that's hard to find in the software
           | world. Typst looks interesting though. I'll keep my eye on
           | it...
        
           | martopix wrote:
           | Might still be pretty limited, but I've been looking for
           | something with a more modern syntax for years, and this seems
           | a good candidate! Thanks for sharing.
           | 
           | Of course it will take years to replace LaTeX, but we need to
           | begin working on it.
        
         | mbirth wrote:
         | This is from a week ago:
         | 
         | https://news.ycombinator.com/item?id=39027543
         | 
         | Talks about "htmldocs" (which shows maths formulas on one of
         | their templates) but there are also various other alternatives
         | mentioned in the discussion.
        
         | ants_everywhere wrote:
         | Document formatting seems like one of those problems where 80%
         | or so of the problem space is simple and the remaining 20% is
         | an unfathomable pit of nightmares.
         | 
         | There are so many different ways people could want characters
         | printed on a sheet of virtual paper that the problem is
         | virtually unconstrained in its difficulty.
         | 
         | TeX was a major theoretical advance, and LaTeX is a nice enough
         | UI layer on TeX that has gotten significant traction. But even
         | outside of TeX, it feels like even software like MS Word is
         | impossibly complex and clunky.
         | 
         | You can make something nicer by dramatically simplifying or
         | cutting the feature set. I think that's probably how Google
         | Docs has a pretty simple interface. But I'm not convinced
         | there's a real replacement for the incumbents that simply tries
         | to improve UI without having a deep technical insight about
         | document layout the way Knuth had with TeX.
        
           | pydry wrote:
           | Latex has a lot of caked in design mistakes which are never
           | going away.
           | 
           | Unfortunately typst seems to have replicated the primary one
           | - inventing a new turing complete programming language rather
           | than piggybacking off an existing one.
           | 
           | It's possible to conceptualize a much better latex but it
           | would take years to build properly and build the ecosystem
           | around it to do all the odd things people need when doing
           | markup requiring 1000-2000 community packages.
        
             | BeFlatXIII wrote:
             | What are the other caked-in design mistakes in LaTeX, and
             | which existing language(s) would you like to see a DSL
             | piggybacked off?
        
           | PeterisP wrote:
           | Thing is, you can't really cut the feature set much. Nobody
           | needs 90% of the features but for almost everyone there's
           | some 10% of the less-used features that's a must-have, a
           | total dealbreaker if the other tool doesn't have them or does
           | them poorly; and that's a _different_ 10% for different
           | people, so if you have a cut-down feature set you lose many
           | people - some because you don't have A, some because you
           | don't have B, some because you don't have Z, and they all
           | instead use the same old, complex tool that has support for
           | "their thing".
        
         | da_chicken wrote:
         | Every time I encounter LaTeX, I think of something I heard:
         | "You shouldn't need a build environment for a word processor."
         | I can't get away from that sentiment. Almost nobody I've seen
         | using LaTeX has actually been using it for _typesetting_.
         | Usually they're using a typesetter for word processing.
         | 
         | Sometimes it feels like they're only using LaTeX because they
         | "learned it in college." You ever notice that? So many people
         | in LaTeX threads say they learned it in college, or they've
         | been using the same setting since college, or whatever. People
         | learn LaTeX to make college papers look nice, and then they
         | _never need to configure it again_? Isn't that strange?
         | 
         | The worst part, though, is that people complain if you call it
         | latex. Which I think says quite a lot about its userbase.
        
       | bowsamic wrote:
       | If I'm using LaTeX, I'm writing scientific articles. I expect
       | scientific articles to be read by people on computers with normal
       | screen sizes or printed off. Therefore there's no reason to
       | bother with anything other than PDF. PDF works great.
        
       | Retr0id wrote:
       | > don't just produce PDFs that nobody can read on small screens
       | 
       | I was thinking about this recently. If you get pedantic enough*
       | about it, the typesetting quality you can get from a LaTeX+PDF is
       | strictly better than what can be achieved using (sane) HTML.
       | 
       | I wanted to blog in LaTeX, and to solve the screen-size issue I
       | thought I'd pre-bake to a wide range of page geometries, and then
       | serve up an appropriate one to the client using pdf.js.
       | 
       | Fortunately for everyone, I decided against it in the end and
       | continued blogging in markdown+html (with mathml support)
       | 
       | *well beyond what most readers would possibly care about
        
       | mbid wrote:
       | For me, the main problem with most tools that render to HTML was
       | that they don't support all math typesetting libraries that latex
       | supports. I used to work with category theory, where it's common
       | to use the tikz-cd library to typeset commutative diagrams. tikz-
       | cd is based on tikz, which is usually not supported for HTML
       | output.
       | 
       | But apart from math typesetting, my latex documents were usually
       | very simple: They just used sections, paragraphs, some theorem
       | environments and references to those, perhaps similar to what the
       | stack project uses [3]. Simple latex such as this corresponds
       | relatively directly to HTML (except for the math formulas of
       | course). But many latex to html tools try to implement a full tex
       | engine, which I believe means that they lower the high-level
       | constructs to something more low level (or that's at least my
       | understanding). This results in very complicated HTML documents
       | from even simple latex input documents.
       | 
       | So what would've been needed for me was a tool that can (1)
       | render all math that pdflatex can render, but that apart from
       | math only needs to (2) support a very limited set of other latex
       | features. In a hacky way, (1) can be accomplished by simply using
       | pdflatex to render each formula of a latex document in isolation
       | to a separate pdf, then converting this pdf to svg, and then
       | including this svg in the output HTML in the appropriate position.
       | And (2) is simply a matter of parsing this limited subset of
       | latex. I've prototyped a tool like that here [1]. An example
       | output can be found here [2].
       | 
       | Of course, SVGs are not exactly great for accessibility. But my
       | understanding is that many blind mathematicians are very good at
       | reading latex source code, so perhaps an SVG with alt text set to
       | the latex source for that image is already pretty good.
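       | 
       | The per-formula approach can be sketched roughly as follows. This
       | is an illustration in Python, not the code of the prototype: it
       | only recognises $...$ and \[...\] math, the helper names are
       | hypothetical, and the pdflatex and PDF-to-SVG steps are left out.

```python
import html
import re

# Inline math as $...$ and display math as \[...\]; $$...$$ and math
# environments are deliberately not handled in this sketch.
FORMULA = re.compile(r"\\\[(.+?)\\\]|(?<!\\)\$(.+?)(?<!\\)\$", re.DOTALL)

# Each formula gets its own tiny document that pdflatex can compile alone.
TEMPLATE = "\n".join([
    r"\documentclass{standalone}",
    r"\usepackage{amsmath}",
    r"\begin{document}",
    r"$\displaystyle @FORMULA@$",
    r"\end{document}",
    "",
])


def extract_formulas(latex):
    """Return the source of every formula, in document order."""
    return [(m.group(1) or m.group(2)).strip() for m in FORMULA.finditer(latex)]


def standalone_document(formula):
    """Wrap one formula in a compilable standalone LaTeX document."""
    return TEMPLATE.replace("@FORMULA@", formula)


def img_tag(index, formula):
    """Reference the rendered SVG, with the LaTeX source as alt text."""
    return f'<img src="formula-{index}.svg" alt="{html.escape(formula)}">'


sample = r"Let $f \colon A \to B$ be a map. Then \[ g \circ f = h \] holds."
formulas = extract_formulas(sample)
documents = [standalone_document(f) for f in formulas]
tags = [img_tag(i, f) for i, f in enumerate(formulas)]
```

       | Each entry in documents would be compiled and converted on its
       | own, and the matching entry in tags would replace the formula in
       | the HTML output.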
       | 
       | [1] https://github.com/mbid/latex-to-html
       | 
       | [2] https://www.mbid.me/lcc-model/
       | 
       | [3] https://stacks.math.columbia.edu/
        
         | ykonstant wrote:
         | Tangentially, for me the stacks project is the gold standard of
         | mathematical typography on the web. Look at this beauty:
         | https://stacks.math.columbia.edu/tag/074J
         | 
         | Also check the diagrams:
         | https://stacks.math.columbia.edu/tag/001U
         | 
         | If anyone can explain to me, a complete noob regarding html,
         | how they achieve this result with html, css and whichever latex
         | engine they use, I would be grateful. I want to make a personal
         | webpage in this style.
        
           | dolmen wrote:
           | uMatrix tells me there are 8 external sites to grant
           | permissions for access to resources. Definitely not a
           | "beauty".
        
             | ykonstant wrote:
             | I don't understand what this has to do with typography.
        
           | artagnon wrote:
           | It's standard MathJax that's rendered client-side. I managed
           | to get MathJax + XyPic rendered server-side on my website,
           | which is a lot nicer.
        
             | ykonstant wrote:
             | Oh, you misunderstand the level of my question; rephrased,
               | how do maek wabpag with "MathJax that's rendered
               | client-side"? (o'V`o)
        
               | red_trumpet wrote:
               | Take a look at MathJax's website:
               | https://www.mathjax.org/#gettingstarted
               | 
               | They have a link to JSBin which contains an easy example
               | html page.
        
               | ykonstant wrote:
               | Thanks!
        
         | datadeft wrote:
         | Have you seen typst? I have moved over from LaTeX to Typst and
         | most if not all your use cases are covered.
         | 
         | https://typst.app/
        
           | _flux wrote:
           | Except the main theme, which was HTML export?
           | https://github.com/typst/typst/issues/721
           | 
           | Though it's in the roadmap!
        
       | opentokix wrote:
       | LyX is the way to LaTeX
        
       | notpushkin wrote:
       | Instead of MathJax, maybe also consider KaTeX: https://katex.org/
       | 
       | It's faster than MathJax and also can be pre-rendered on the
       | server (or in your SSG!).
        
         | amai wrote:
         | That is old news. MathJax 3 is a lot faster nowadays than it
         | used to be and it supports more LaTeX keywords than KaTeX.
         | Especially the important \label and \ref are still not
         | supported by KaTeX.
        
       | bovermyer wrote:
       | When I use LaTeX, it's because I want a way to store book
       | manuscripts and their layout as code in version control. I never
       | use any of the math layout. I get the impression that my use case
       | is rather in the minority.
       | 
       | I would use CSS+HTML for layout, but what do I do about
       | automatically generating tables of contents and indexes?
       | 
       | I guess I could write my own tool for that. Hmm.
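       | 
       | Such a tool could start from something like the sketch below,
       | which uses only Python's standard library. It assumes every
       | heading already carries an id attribute, the names are invented
       | for illustration, and it does nothing for indexes.

```python
from html import escape
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, id, text) for every h1-h6 element."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # [level, id, text parts] while inside a heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._current = [int(tag[1]), dict(attrs).get("id") or "", []]

    def handle_data(self, data):
        if self._current is not None:
            self._current[2].append(data)

    def handle_endtag(self, tag):
        if self._current is not None and tag == f"h{self._current[0]}":
            level, anchor, parts = self._current
            self.headings.append((level, anchor, "".join(parts).strip()))
            self._current = None


def table_of_contents(document):
    """Build a flat <ul> whose items link to each heading's id."""
    collector = HeadingCollector()
    collector.feed(document)
    collector.close()
    items = [
        f'<li class="toc-h{level}"><a href="#{escape(anchor)}">{escape(text)}</a></li>'
        for level, anchor, text in collector.headings
    ]
    return "\n".join(["<ul>", *items, "</ul>"])


page = '<h1 id="intro">Intro</h1><p>Text.</p><h2 id="bg">Back<em>ground</em></h2>'
toc = table_of_contents(page)
print(toc)
```

       | The toc-h1, toc-h2, ... classes leave the indentation of nested
       | entries to CSS, which keeps the generator itself trivial.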
        
         | gglitch wrote:
         | Looks like Pandoc can generate tables of contents for HTML,
         | though I don't see anything about indexes. Roff and friends,
         | and Texinfo, can do both, though with their own tradeoffs.
         | 
         | https://pandoc.org/MANUAL.html
        
       | riperoni wrote:
       | This article really doesn't get what LaTeX does. Of course it is
       | overkill to have 5 lines of text rendered with LaTeX into a PDF.
       | But the point of LaTeX is exactly to set the typesetting of an
       | output document in stone. PDF is meant to do that and HTML cannot
       | do that. A PDF conserves everything and that is precisely the
       | point to have a set layout for printing or displaying on
       | different devices.
       | 
       | Yes, there should be easy ways to display math on the web. No,
       | this doesn't mean that LaTeX is obsolete.
       | 
       | Besides, what about references, both external and internal?
       | Probably needs more "modern" tooling.
        
         | geon wrote:
         | > to have a set layout for printing or displaying on different
         | devices.
         | 
         | That's a horrible way to go about it. Already in the 90s it was
         | clear that varying display sizes was a problem, and it has
         | gotten orders of magnitude worse since then.
         | 
         | The concept of a single set layout that is suitable for
         | everyone is utterly absurd.
        
           | master-lincoln wrote:
           | Then do not use a tool that was designed for typesetting
           | printed pages which is what LaTeX is for. The author of the
           | article seems to think about LaTeX only for math rendering.
           | But that is just a fraction of what it is used for. Complex
           | diagrams with tikz or typesetting entire books, so that
           | adding content in an arbitrary place still makes the rest of
           | the book look good without breaking layout are some of the
           | examples of why I would use LaTeX instead of html
        
             | geon wrote:
             | That's exactly what the author says at the beginning:
             | 
             | > are you sure you have to use LaTeX?
        
       | artagnon wrote:
       | LaTeXML has come a long way. Even arXiv uses LaTeXML internally
       | to offer HTML5 versions as of late 2023. It does have limitations
       | in not supporting all packages, or producing a high-quality
       | translation in all cases.
       | 
       | If you don't need to convert entire LaTeX documents, MathJax and
       | KaTeX are really good at rendering a subset of LaTeX as
       | MathML/SVG. I run MathJax + an xypic extension for commutative
       | diagrams with server-side rendering on my website, and it works
       | great in practice.
        
       | kkfx wrote:
       | I like LaTeX for the quality of its PDF output. I use it for
       | docs that need to be "printed" (not necessarily on paper, but
       | still in a fixed typographical form for potentially long-term
       | archiving), not for anything else. And yes, I DO HATE PDFs because
       | of their design, but PostScript is not very common these days and,
       | while a bit better in certain respects, is not much better in
       | general; DVI is even worse.
       | 
       | For my notes, for anything that needs to be "live" I use org-mode
       | because:
       | 
       | - it's a far more natural markup than anything else
       | 
       | - it's rendered INLINE, no need to jump between a source form and
       | a rendered one, a thing MD lovers fail to understand
       | 
       | - it's an outlining tool, another thing most other tools fail
       | miserably to understand
       | 
       | - it easily incorporates live things in other languages
       | (org-babel), a thing no modern REPL-alike DocUI like Jupyter can do
       | 
       | Long story short I prefer the best tool depending on the job.
       | HTML might be the least common denominator tool, making it the
       | worst in essentially all cases. XML, and SGML in general, are
       | good for machine usage, but they are very impractical in current
       | usage: just see the actual crappy state of things for e-invoicing
       | with XML/XADES docs + XSL to render them in the end as PDF for the
       | human. They are a good tool in some cases, but again not the best
       | for any specific case.
        
       | amai wrote:
       | A lot has happened since 2013. Have a look at
       | https://quarto.org/, if you plan to publish in HTML. Quarto
       | already has support for Typst:
       | https://quarto.org/docs/output-formats/typst.html
        
       | DominikPeters wrote:
       | Another LaTeX-to-HTML tool is lwarp
       | (https://github.com/bdtc/lwarp) which starts from the idea that
       | there only exists one program that can parse LaTeX: the LaTeX
       | compiler itself. Implementing a new parser is almost futile. So
       | instead, the lwarp package redefines all the macros to output
       | HTML. Something like \renewcommand{\textbf}[1]{<b>#1</b>}. This
       | way, compiling LaTeX gives you a PDF whose text is HTML code, so
       | now you can extract the plain text from it and you have an HTML
       | file. The advantage is that it can easily deal with custom macros
       | etc., because these are natively resolved by the LaTeX compiler.
       | 
       | I use lwarp to make https://tikz.dev/, an HTML version of the
       | TikZ manual, which is probably one of the most complicated LaTeX
       | documents in existence.
        
         | magnio wrote:
         | You are the author of tikz.dev? I have always thought it was
         | made by the tikz author. Mad props to you, the site is very
         | functional and helpful to me. With it, using tikz feels a bit
         | less like a chore.
        
       | soegaard wrote:
       | Note that one can convert PDF to HTML using tools like:
       | 
       | https://pdf2htmlex.github.io/pdf2htmlEX/
       | 
       | Example of a paper with equations:
       | 
       | https://pdf2htmlex.github.io/pdf2htmlEX/demo/demo.html
        
         | eadmund wrote:
         | Oh, now that is beautiful! Thanks for sharing.
        
         | smaddox wrote:
         | That's just HTML that looks like a PDF, though. Incredible
         | feat, but not really what I want from PDF turned to HTML. I
         | want something mobile friendly.
        
       | setgree wrote:
       | I learned LaTeX in grad school in 2013, starting with LyX.
       | Yesterday, I compiled an Rmarkdown document into an
       | APA6-conformant PDF with just a bit of YAML, with a tex file as
       | an intermediate output.
       | 
       | We're almost there for skipping LaTeX entirely, but in my
       | experience, Google Docs and Overleaf still offer vastly superior
       | collaboration tools. Now if we could just edit {.md; .rmd;
       | .ipynb} files directly on Overleaf, with comments and track
       | changes, we'd be in business...
        
       | asimpletune wrote:
       | I love the author's "if you want to leave a comment email me". I
       | saw this somewhere else and it motivated me to make an automated
       | system that works like that: https://r3ply.com
        
       ___________________________________________________________________
       (page generated 2024-01-26 23:02 UTC)