[HN Gopher] Lessons from Building a Static Site Generator (2020)
       ___________________________________________________________________
        
       Lessons from Building a Static Site Generator (2020)
        
       Author : ramshorst
       Score  : 88 points
       Date   : 2021-06-30 12:56 UTC (10 hours ago)
        
 (HTM) web link (nicholasreese.com)
 (TXT) w3m dump (nicholasreese.com)
        
       | whydoineedthis wrote:
       | What is hydration? You talk a lot about it, but I've never had to
       | add water to my websites before.
        
         | nickreese wrote:
         | Good question. Hydration is when HTML has been rendered from
         | JavaScript statically or on the server, and the client then
         | needs to take over that HTML.
         | 
         | Traditional frameworks like Next.js, Gatsby, Nuxt.js all "fully
         | hydrate" the client.
         | 
         | This means that every bit of HTML that is sent to the browser
         | is taken over by JS on the client.
         | 
         | This has its costs, but it is done to give interactivity.
         | 
         | Partial hydration is where you only add interactivity to the
         | parts of the site that need it. Think of it like the good old
         | days of jQuery, but with a modern front-end framework: Svelte.
        
           | whydoineedthis wrote:
           | Thank you, very helpful explanation.
        
         | Santosh83 wrote:
         | Here you go:
         | https://en.wikipedia.org/wiki/Hydration_(web_development)
        
       | klodolph wrote:
       | I wrote a static site generator for my own personal site. I've
       | been using it for over 10 years, and it's gone through several
       | major refactors / redesigns. A few comments:
       | 
       | 1. Template system
       | 
       | There are tons of different template systems out there for things
       | like the "shortcodes" in the article.
       | 
       |     {{youtube id="123asdf4" /}}
       | 
       | My conclusion is that the correct way to do this is with custom
       | tags,
       | 
       |     <embed-youtube id="123asdf4"></embed-youtube>
       | 
       | I apologize for the verbosity... but this is completely valid
       | HTML5, and you do not need anything but an ordinary HTML5 parser
       | to parse this. This maximizes your choices for the libraries you
       | use in the static site generator and it maximizes the level of
       | support in whatever editor you choose to author the site in. For
       | example, you can just use the HTML mode in Vim or Emacs, or you
       | can use VS Code, TextMate, Sublime Text, etc. and get a ton of
       | features: syntax highlighting, indenting, etc.
       | 
       | While on the surface it looks verbose because of the closing tag,
       | in most editors, you only have to press a key or two to close the
       | tag. HTML5, strictly speaking, does not support self-closing tag
       | syntax for custom tags. That syntax is only supported for void
       | elements. There are only 16 void elements in HTML5.
       | 
       | I use "prefix-suffix" syntax to avoid ambiguity... any tag with a
       | hyphen is obviously a custom tag.
       | 
       | 2. Routing
       | 
       | Something you can use to tackle the routing complexity is to
       | place your source files in the same path as the canonical URL.
       | You only need routes for generated content, like index pages and
       | such.
       | 
       | 3. Index data
       | 
       | You'll naturally want to generate indexes and create previews for
       | links. I suggest that you start by looking at the schema.org
       | schema for web pages and work with a useful subset of that. This
       | way, you can generate indexes on your web page using the same
       | exact data, same exact schema, that you use for the JSON-LD data
       | you provide for search engines like Google.
       | 
       | This is a minor point, but it reduces duplicated effort between
       | the code for generating content for your website and the code for
       | generating JSON-LD metadata.
       | 
       | Don't dive too deep into the schema.org schema, just take a
       | couple bits and pieces that you need, and refer to the feature
       | guides in Google's documentation:
       | 
       | https://developers.google.com/search/docs/guides/intro-struc...
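A minimal sketch of the "index data" point above, assuming each page carries a small record that uses a few schema.org property names (`headline`, `url`, `datePublished`, `description`). The records, the example.com URLs, and both function names are hypothetical; the point is only that one record can feed both the on-site index and the JSON-LD block.

```python
import json
from html import escape

# Hypothetical page records using a small subset of schema.org properties.
PAGES = [
    {"@type": "BlogPosting", "headline": "First post",
     "url": "https://example.com/blog/first/",
     "datePublished": "2020-01-15", "description": "An introduction."},
    {"@type": "BlogPosting", "headline": "Second post",
     "url": "https://example.com/blog/second/",
     "datePublished": "2020-03-02", "description": "A follow-up."},
]


def render_index(pages):
    """Build an HTML index, newest first, from the shared records."""
    items = []
    for page in sorted(pages, key=lambda p: p["datePublished"], reverse=True):
        items.append(
            '<li><a href="{url}">{headline}</a> ({date}): {desc}</li>'.format(
                url=escape(page["url"], quote=True),
                headline=escape(page["headline"]),
                date=escape(page["datePublished"]),
                desc=escape(page["description"]),
            )
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


def render_json_ld(page):
    """Emit the same record as a JSON-LD script element for search engines."""
    data = {"@context": "https://schema.org", **page}
    # Escape "<" so a value containing "</script>" cannot end the element early.
    payload = json.dumps(data, indent=2).replace("<", "\\u003c")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"
```

Calling `render_index(PAGES)` gives the list for an index page, and `render_json_ld(PAGES[0])` gives the metadata block for that page's `<head>`, both from the same dictionary.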
        
         | EricE wrote:
         | > 2. Routing
         | >
         | > Something you can use to tackle the routing complexity is to
         | > place your source files in the same path as the canonical
         | > URL. You only need routes for generated content, like index
         | > pages and such.
         | 
         | Thank you! The only saving grace for a Gatsby site I recently
         | did was that it leveraged a template that did that
         | automatically. To say that it dramatically simplified things is
         | a gross understatement. Template for those interested:
         | https://github.com/18F/federalist-uswds-gatsby
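The routing idea quoted above can be sketched as a pure path mapping, with no route table at all. This is an illustration under assumed conventions (Markdown or HTML sources, "pretty" URLs ending in a slash), not how Gatsby or the linked template does it; `output_path` and `canonical_url` are hypothetical names.

```python
from pathlib import PurePosixPath


def output_path(source: str) -> str:
    """Map a source path (relative to the content root) to an output file."""
    path = PurePosixPath(source)
    if path.suffix not in {".md", ".html"}:
        return str(path)  # static assets are copied through unchanged
    if path.stem == "index":
        return str(path.with_suffix(".html"))
    # about.md -> about/index.html, so the page is served at /about/
    return str(path.with_suffix("") / "index.html")


def canonical_url(source: str) -> str:
    """Derive the canonical URL from the output path alone."""
    out = PurePosixPath(output_path(source))
    if out.name == "index.html":
        parent = str(out.parent)
        return "/" if parent == "." else "/" + parent + "/"
    return "/" + str(out)
```

With this mapping, `blog/2021/ssg.md` is written to `blog/2021/ssg/index.html` and served at `/blog/2021/ssg/`; only generated pages such as tag indexes would need explicit routes.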
        
         | nickreese wrote:
         | Author of the post. This is really an interesting take on
         | shortcodes. I've been struggling with a format that the svelte
         | compiler likes and can be used in markdown. This may be the
         | answer.
         | 
         | > <embed-youtube id="123asdf4"></embed-youtube>
         | 
         | Thank you.
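A rough sketch of how a generator might expand the custom tag quoted above using nothing but a stock HTML parser, here Python's standard `html.parser`. The class name, the `expand` helper, and the choice of iframe markup are assumptions for illustration; the sketch also drops comments and doctype declarations, which a real tool would pass through.

```python
from html import escape
from html.parser import HTMLParser


class CustomTagExpander(HTMLParser):
    """Copy HTML through unchanged, expanding <embed-youtube> elements."""

    def __init__(self):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "embed-youtube":
            video_id = escape(dict(attrs).get("id") or "", quote=True)
            self.out.append(
                '<iframe src="https://www.youtube.com/embed/' + video_id
                + '" allowfullscreen></iframe>'
            )
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Tags written as <br/> are copied verbatim.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != "embed-youtube":
            self.out.append("</" + tag + ">")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&" + name + ";")

    def handle_charref(self, name):
        self.out.append("&#" + name + ";")


def expand(source: str) -> str:
    """Return source with every <embed-youtube> replaced by an iframe."""
    parser = CustomTagExpander()
    parser.feed(source)
    parser.close()
    return "".join(parser.out)
```

Because the hyphenated tag is ordinary HTML to the parser, no template grammar is needed: adding another custom tag would mean adding another branch in `handle_starttag` and `handle_endtag`.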
        
       | na85 wrote:
       | I went down this rabbit hole and found everything to be
       | overcomplicated for my use case. I'm so sick of static site
       | generators that have seven layers of templating engines and
       | complicated build systems.
       | 
       | Most of the static site builders I tried were either way too
       | complex, or else just straight-up didn't work at all (looking at
       | you, coleslaw).
       | 
       | I tried to go full emacs and use org export (org being my
       | favorite text format) but the default export is horrendous and
       | the documentation for org to html export is so bad it might as
       | well not exist.
       | 
       | Software is simultaneously awesome and infuriating.
       | 
       | So after three days I just gave up on org-export and now I have
       | pandoc shit out an html snippet that I concatenate with a hand-
       | rolled html preamble and postamble via a Makefile.
       | 
       | Found a few rough edges, probably because org is underspecified.
       | It's not elegant but it works for my use case.
        
         | jstrieb wrote:
         | For what it's worth, if you're using Pandoc, you can set the
         | HTML output to be "standalone" based on a simple template.[0]
         | You can also include a standard header and footer to be
         | automatically inserted for each generated page.
         | 
         |     pandoc \
         |       --standalone \
         |       --css=/style.css \
         |       --highlight-style=code-highlight.theme \
         |       --variable=lang:en \
         |       --include-before-body=navbar.html \
         |       --include-after-body=footer.html \
         |       --template=template.html \
         |       $MD -o $HTML
         | 
         | I use a variation of this command in a bash script to generate
         | my entire static site.[1] A friend improved upon my script with
         | a Go implementation that does some more advanced stuff, but
         | still compiles Markdown to HTML using this command under the
         | hood.[2]
         | 
         | 0: https://pandoc.org/MANUAL.html#option--standalone
         | 
         | 1: https://github.com/jstrieb/personal-site/blob/master/compile...
         | 
         | 2: https://github.com/lsnow99/dudu
        
         | breck wrote:
         | Try https://scroll.pub
         | 
         | It uses Scrolldown instead of markdown, which is simpler,
         | cleaner, and incredibly extensible.
         | 
         | The command line app has just a few commands---and they all
         | take zero params.
         | 
         | A site is just a single folder, and because content is written
         | in Scrolldown, it works great with git and is great for content
         | sites or collaborative strongly typed databases.
         | 
         | It's fast (I get about 300 pages per second), there's not a lot
         | of code (sub 1k excluding dependencies), and the code is tested.
         | 
         | It's in Node.js now, but there's no reason Scroll can't be
         | language agnostic.
         | 
         | I've been around SSGs for over a decade and am designing this
         | one to be simple, reliable, and to stay out of the creators'
         | way. I think it could be the last SSG you'll ever need.
        
         | wishinghand wrote:
         | Is this hyperbole or am I just lucky to avoid whatever you
         | tried? What SSGs require 7 layers of templating and complicated
         | build systems? When I tried out Sergey, Nuxt, Docpad, and a
         | few others, there was just one templating engine each and a
         | build command for the CLI.
        
       | eatonphil wrote:
       | This is a good review of a fairly complex piece of software. But
       | don't let this convince you that all static site generators must
       | be complex.
       | 
       | Out of laziness, most sites I run have their own 100-200 line
       | Python static site generator that takes Markdown (if I'm really
       | feeling it) or HTML files with Jinja templates and generates
       | pages around them. The core generator code hardly ever changes
       | year by year. Here's an example [0].
       | 
       | This isn't to say that everyone should always write their own. I
       | am just a bit surprised by all the debate around each generator
       | because they all produce the same thing and the only (or major)
       | variables are the template language and what themes are built in
       | (though of course you can always bring your own CSS).
       | 
       | But _using_ a static site generator is a very good idea. If you
       | have no other inclinations, I think the stack that makes sense
       | for anyone with multiple contributors is to use WordPress for
       | editing and then have a plugin that will generate static pages
       | from it so not every request to your site hits the database.
       | 
       | [0]
       | https://github.com/eatonphil/notes.eatonphil.com/blob/master...
        
         | jjjbokma wrote:
         | Mine is just over 1KLOC :-) But it includes both an RSS and a
         | JSON feed, support for Twitter card / Facebook sharing, a
         | calendar view, and a tag cloud. Live demo: https://plurrrr.com/
         | 
         | Code is available on GitHub:
         | https://github.com/john-bokma/tumblelog
        
           | jazzyjackson wrote:
           | The choice of using one markdown document to render all the
           | pages is really interesting to me. Do you just edit the
           | document in your terminal? Trying to imagine the usability of
           | that, I guess if I was quick at jumping from one "page" to
           | another it might be faster than opening and closing files.
        
         | nickreese wrote:
         | Author here. At its core, most static site generators are just
         | fancy "string concatenation" tools.
         | 
         | In my experience playing with several generators before
         | building Elder.js, it isn't so much the output that matters as
         | how the static site generator lets you do non-trivial
         | customization: things that would be hard without a larger
         | framework.
         | 
         | More importantly, when building a major project on a static
         | site generator, it is important to have an upgrade path beyond
         | a static site generator should you require it. Elder.js was
         | built with that use case in mind though I didn't cover that in
         | the article.
         | 
         | Being able to move to SSR should the project require it is a
         | huge plus in my book.
        
           | eatonphil wrote:
           | > At its core most static site generators are just fancy
           | "string concatenation" tools.
           | 
           | I think that's a bit reductive (especially since template
           | libraries themselves are already complicated, not in a bad
           | way). To me a static site generator is: 1) a string template
           | library, plus 2) a file system walker/trigger to generate
           | from a template, plus 3) additional data to feed the
           | templates, plus 4) the actual content.
           | 
           | Then there's of course the additional features you may or may
           | not need: a tagging system, a comment system, a subscription
           | system, etc.
           | 
           | Thankfully despite all these components you don't need to
           | write much of the actual code since they all exist as builtin
           | (file system walking) or major OSS (Jinja, Mustache.js, etc)
           | libraries. The SSG is primarily glue.
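The four parts listed above can be sketched in a couple of dozen lines. This is an illustrative toy, not any commenter's generator: `string.Template` from the standard library stands in for Jinja, and `PAGE`, `SITE`, and `build` are hypothetical names.

```python
from pathlib import Path
from string import Template

# 1) The template library: string.Template stands in for Jinja here.
PAGE = Template(
    "<!doctype html>\n<html><head><title>$title - $site_name</title></head>\n"
    "<body>\n$content\n</body></html>\n"
)

# 3) Additional data fed to every template (hypothetical site-wide values).
SITE = {"site_name": "Example Notes"}


def build(src: Path, dest: Path) -> list:
    """Render every .html fragment under src into a full page under dest."""
    written = []
    # 2) The file system walker that triggers one render per source file.
    for fragment in sorted(src.rglob("*.html")):
        # 4) The actual content, with the file name doubling as the title.
        content = fragment.read_text(encoding="utf-8")
        title = fragment.stem.replace("-", " ").capitalize()
        target = dest / fragment.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(
            PAGE.substitute(SITE, title=title, content=content),
            encoding="utf-8",
        )
        written.append(target)
    return written
```

Running `build(Path("src"), Path("out"))` mirrors the source tree into `out`, which is the "glue" role described above; tagging, feeds, and similar features would be layered on top of the same loop.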
        
         | susam wrote:
         | > But don't let this convince you that all static site
         | generators must be complex.
         | 
         | Indeed! For example, https://github.com/sunainapai/makesite is
         | a simple and lightweight static site generator written in
         | Python. It can be customized easily by modifying the Python
         | source code and adapting it to one's needs. I like that when I
         | need a new feature, I can add it quite easily by writing a few
         | Python functions. It is meant to be programmer-friendly.
         | 
         | Disclosure: My wife wrote this project. I am just a happy user
         | of the project.
        
           | mturmon wrote:
           | I used ```makesite.py``` as a template for a small site of
           | 100-200 pages that I maintain. It has worked quite well.
           | 
           | The animating idea ("use this as a template but don't be
           | afraid to customize or reinvent certain parts") liberated me
           | from feature-by-feature evaluation of a bunch of complex
           | config/templating systems.
           | 
           | For a small site like mine, getting the content right is the
           | main thing and the site generator should mostly get out of
           | the way. I update the site sporadically and don't want to re-
           | learn a complex templating and config-file system every time
           | I go back to it.
        
         | pjc50 wrote:
         | I've become convinced that it's easier to write a SSG than
         | understand someone else's, and it's _definitely_ quicker than
         | trying to evaluate the market and pick one.
        
         | AndrewStephens wrote:
         | > Out of laziness, most sites I run have their own 100-200 line
         | Python static site generator that takes Markdown (if I'm really
         | feeling it) or HTML files with Jinja templates and generates
         | pages around them. The core generator code hardly ever changes
         | year by year.
         | 
         | I maintain my site exactly the same way (minus the Jinja) and
         | it is a workflow that works for me. Simplicity is best (even
         | beating flexibility) when it comes to tools that help you
         | express yourself. Otherwise you spend all your time wrestling
         | with your tools rather than creating.
        
         | Bayart wrote:
         | Yesterday someone ran a thread where people posted tons of
         | headless CMS options[1]. There might be a few ideas there for
         | people interested in your comment.
         | 
         | [1] https://news.ycombinator.com/item?id=27674105
        
       | jrm4 wrote:
       | Alright, I'm skimming this whole idea of newfangled "static site
       | generators" that involve a lot of Javascript and I'm left with a
       | whole lot of "Isn't this just ____ with extra steps?"
       | 
       | I'm seeing "shortcodes" and I'm like -- as in variables and/or
       | configuration files?
       | 
       | Or, more broadly -- why Javascript for the backend? This looks
       | like a silly level of complexity. I'd start thinking about it in
       | Bash and then probably head over to e.g. Python once databases et
       | al start getting involved. What am I missing here?
        
         | nickreese wrote:
         | Shortcodes as they are implemented in Elder.js (what the
         | article is about) are borrowed from WordPress. Basically they
         | are a placeholder such as [[lastestTweet/]] that let you add
         | dynamic content into otherwise static content.
         | 
         | While Elder.js does allow for full server-side rendering,
         | making "Javascript the backend", the goal of the static site
         | generator is to generate static HTML/CSS/JS that can be hosted
         | from a CDN, S3, or other static host.
         | 
         | That said, one of the biggest pitfalls with building a major
         | site on a static site generator is there is often no upgrade
         | path to server rendering. Elder.js does offer that out of the
         | box.
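To make the placeholder idea concrete, here is a small sketch of shortcode expansion at build time. It is not Elder.js's API or syntax; it only mirrors the `[[lastestTweet/]]` form written in the comment above, and `SHORTCODE`, `expand_shortcodes`, and the handler are hypothetical.

```python
import re

# Matches self-closing placeholders such as [[lastestTweet/]].
SHORTCODE = re.compile(r"\[\[\s*([A-Za-z_][A-Za-z0-9_]*)\s*/\]\]")


def expand_shortcodes(text: str, handlers: dict) -> str:
    """Replace each known shortcode with the output of its handler."""

    def replace(match):
        handler = handlers.get(match.group(1))
        # Unknown shortcodes are left in place so mistakes stay visible.
        return handler() if handler else match.group(0)

    return SHORTCODE.sub(replace, text)
```

A handler is just a function returning HTML, so the "dynamic" part (fetching a tweet, say) runs once at build time and the result is baked into otherwise static content.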
        
         | valenterry wrote:
         | > What am I missing here?
         | 
         | People usually start their projects with the language they feel
         | most comfortable in, not the one that is best suited for the
         | project.
        
         | frabert wrote:
         | Curious you mention bash! I did exactly that some time ago when
         | I thought about starting blogging... And then never started
         | blogging!
         | 
         | https://blog.frabert.me/posts/2018/11/11/blash.html
        
         | brundolf wrote:
         | > why Javascript for the backend?...[I'd] probably head over to
         | e.g. Python once databases et al start getting involved
         | 
         | I could ask the same thing: why Python for the back-end?
         | 
         | Both languages have similar feature-sets and are roughly
         | equally suitable to this task. Given that, a person should use
         | whichever they're most comfortable with. Lots of people are
         | comfortable with JavaScript these days, ergo there's lots of
         | activity in the JavaScript SSG space.
        
           | jrm4 wrote:
           | Fair -- but let me be more precise then. I do like the idea
           | of "use what you learned and are most comfortable in."
           | 
           | That being said, I think what I mean by "e.g. Python" is
           | specifically - "older slash more text oriented slash proven"
           | 
           | If you like Javascript, fine. But I think there's a case to
           | be made that there is a completely unnecessary "bloat" to
           | Javascript -- especially even as the author himself has
           | suggested that at the end of the day it's all string
           | concatenation.
           | 
           | If that's the case, (as it frequently is) it ends up boiling
           | down to "what handles _text_ well and in an established and
           | smooth way, " and Javascript does not score high there, I'd
           | suggest.
        
             | brundolf wrote:
             | It doesn't sound like you write much JavaScript :)
             | 
             | > I think there's a case to be made that there is a
             | completely unnecessary "bloat" to Javascript
             | 
             | "Bloat in JS" is something of a trope, and I'm not really
             | even sure what you mean by it here. Typically people mean
             | "too many JS dependencies are often drawn in for web
             | pages", but that's not really relevant. Maybe you mean
             | syntax bloat (method calls instead of list comprehensions)?
             | If so, you're not totally wrong, but it's also not really a
             | big deal in my experience. If you're talking about
             | runtime/performance, well... V8 tends to be faster than
             | CPython in raw compute (excluding native modules) because
             | Google has put so much work into optimizing it (not that
             | that matters much for a SSG either)
             | 
             | > it ends up boiling down to "what handles text well and in
             | an established and smooth way," and Javascript does not
             | score high there
             | 
             | Whenever I need to process some nontrivial text, the first
             | thing I do is open up a Chrome tab for the JS repl. I think
             | the following is pretty smooth:
             | 
             |     const csv = `
             |     Name,Email,Phone Number,Address
             |     Bob Smith,bob@example.com,123-456-7890,123 Fake Street
             |     Mike Jones,mike@example.com,098-765-4321,321 Fake Avenue`
             | 
             |     const lines = csv.trim().split('\n').map(line =>
             |       line.split(','))
             | 
             |     const [headings, ...data] = lines
             | 
             |     const objs = data.map(datum =>
             |       Object.fromEntries(
             |         headings.map((heading, index) => [heading, datum[index]])))
             | 
             |     console.log(objs)
             | 
             |     // output:
             |     [
             |       {
             |         "Name": "Bob Smith",
             |         "Email": "bob@example.com",
             |         "Phone Number": "123-456-7890",
             |         "Address": "123 Fake Street"
             |       },
             |       {
             |         "Name": "Mike Jones",
             |         "Email": "mike@example.com",
             |         "Phone Number": "098-765-4321",
             |         "Address": "321 Fake Avenue"
             |       }
             |     ]
             | 
             | Not to mention template-strings, which I use extensively in
             | my own website:
             | 
             |     const header = `
             |       <h1>Welcome to ${pageTitle}</h1>
             |     `
             | 
             | Of course it's all subjective and Python does have format
             | strings and some slick dedicated syntaxes. But I don't
             | think it's fair to say "JavaScript does not score high"
             | when it comes to text-shuffling.
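For readers weighing the two languages, a rough Python counterpart of the snippets in the comment above, using the standard library's `csv` module and an f-string (the "format strings" the comment mentions). `CSV_TEXT`, `rows`, and `page_title` are illustrative names; this is a comparison sketch, not a claim about which language handles text better.

```python
import csv
import io

CSV_TEXT = """\
Name,Email,Phone Number,Address
Bob Smith,bob@example.com,123-456-7890,123 Fake Street
Mike Jones,mike@example.com,098-765-4321,321 Fake Avenue
"""

# DictReader uses the first row as the keys for every following row.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# An f-string plays the role of the JavaScript template literal.
page_title = "my site"
header = f"<h1>Welcome to {page_title}</h1>"
```

Both versions come out at a similar length, which is consistent with the comment's point that the choice is largely one of familiarity.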
        
       | EricE wrote:
       | Having recently done a small Gatsby site I can identify with the
       | comments about complexity over time. And graphql does seem like
       | utter overkill too!
        
       | klelatti wrote:
       | A big thank you to the author for open sourcing this - I've been
       | playing with this to implement a largish static site (11,000
       | pages) and (as a definite non expert) have found it relatively
       | easy to understand and use - and it's lightning fast.
       | 
       | Just one comment: I found implementing a Svelte Leaflet Map
       | component a bit of a struggle - an addition to the docs on this
       | would be very useful.
        
       | pier25 wrote:
       | This is a great post, but why doesn't it have a date?
       | 
       | It's infuriating to see blog posts or even news that don't
       | clearly display the publish date near the title.
        
         | corobo wrote:
         | Probably forgotten rather than whatever the other comments have
         | heard haha. I didn't even realise my own blog was missing dates
         | till just now
         | 
         | Infuriating is a bit much
        
           | pier25 wrote:
           | > Infuriating is a bit much
           | 
           | Ok maybe I was exaggerating :) but it really annoys me.
        
         | yarinr wrote:
         | The publish date is 2020-11-02, as appears on the articles list
         | at https://nicholasreese.com/
         | 
         | I agree they should probably make it visible on the article
         | page itself...
        
         | nickreese wrote:
         | Hey, author here. I banged this out quickly and, as an
         | oversight, didn't update the template to include the date. It
         | was written in Nov 2020, as shown on the homepage.
        
           | pier25 wrote:
           | > _It was written in Nov 2020 as shown on the homepage._
           | 
           | Nobody visits homepages anymore. :)
        
             | nickreese wrote:
             | Not arguing that. Updated the article template to keep
             | others from getting infuriated. ;)
        
               | pier25 wrote:
               | Awesome!
        
         | tomjen3 wrote:
         | If it is a great post, why does it matter when it was written?
         | 
         | How to win friends and influence people was written close to
         | 100 years ago, but it is still recommended in plenty of places.
         | 
         | The dragon book is older than me, but it is still one of the
         | recommended books for learning to write a compiler.
         | 
         | Why does a blog post have to have a date displayed on it? If it
         | is about a specific version of some software, I can understand,
         | and agree with you, why it should be mentioned in the post.
        
           | pier25 wrote:
           | > _How to win friends and influence people was written close
           | to 100 years ago, but it is still recommended in plenty of
           | places._
           | 
           | Because human nature hasn't changed. Front end stuff changes
           | every day.
        
         | mkr-hn wrote:
         | Some marketing blog long ago said to remove dates so people
         | can't tell when a post was written, and it's plagued blogging
         | ever since. The idea is that "evergreen content" shouldn't need
         | a date, but a date is context. No one is so good at writing
         | that their writing has nothing anchoring it to the context of a
         | point on a timeline.
        
           | kevincox wrote:
           | I agree. You can always update the date, or add a "refresh"
           | date if you do update the content (or just verify that it
           | still applies).
        
       | lawwantsin17 wrote:
       | Can't we all agree that adding additional compiled languages to
       | the frontend is not the answer? JS has template literals now,
       | written in C++. Who "falls in love" with a compiled frontend
       | language and then spends the next 7 months answering questions on
       | Discord? What is the point of all of this?
        
       | nickreese wrote:
       | Hey all -- Author here. This was a reflection on building
       | Elder.js.[0]
       | 
       | Happy to answer any questions. I'll be going through and adding
       | context to the questions I see.
       | 
       | [0]: https://elderguide.com/tech/elderjs/
        
         | tsegratis wrote:
         | Really appreciate your frank assessment of the design and
         | current status. For instance:
         | 
         | > the ElderGuide.com team expects to maintain this project at
         | least until 2023-2024
         | 
         | So nice to see the absence of a hype train, and also so nice to
         | get an insight into your view of the design space
         | 
         | I would never be your market for elder.js. But I appreciate
         | learning from you, and appreciate building software alongside
         | you
        
       ___________________________________________________________________
       (page generated 2021-06-30 23:01 UTC)