[HN Gopher] Lessons from Building a Static Site Generator (2020)
___________________________________________________________________
Lessons from Building a Static Site Generator (2020)
Author : ramshorst
Score : 88 points
Date : 2021-06-30 12:56 UTC (10 hours ago)
(HTM) web link (nicholasreese.com)
| whydoineedthis wrote:
| What is hydration? You talk a lot about it, but I've never had to
| add water to my websites before.
| nickreese wrote:
| Good question. Hydration is where Javascript is rendered
| statically or on the server and the client needs to take over
| that HTML.
|
| Traditional frameworks like Next.js, Gatsby, Nuxt.js all "fully
| hydrate" the client.
|
| This means that every bit of HTML that is sent to the browser
| is taken over by JS on the client.
|
| This has its costs but it is done to give interactivity.
|
| Partial hydration is where you are only adding interactivity to
| the parts of the site that need it... think of it like the good
| old days of jquery but with a modern front end framework...
| Svelte.
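|
| A minimal sketch of the partial-hydration idea described above,
| not how Elder.js or Svelte actually implement it. The
| data-hydrate / data-props markers, the registry object, and the
| Counter component are all hypothetical. Plain objects stand in
| for the DOM elements a browser would return from
| document.querySelectorAll, so the sketch runs in Node.

```javascript
// Hypothetical registry: component name -> function that attaches behavior.
const registry = {
  Counter: (node, props) => {
    node.hydrated = true;
    node.count = props.start;
  },
};

// Stand-ins for server-rendered elements. Only one carries a hydrate marker.
const staticNodes = [
  { tag: "p", dataset: {} },
  { tag: "div", dataset: { hydrate: "Counter", props: '{"start": 3}' } },
  { tag: "footer", dataset: {} },
];

// Attach JS only to marked nodes; everything else stays plain HTML.
function hydratePartially(nodes, components) {
  const hydrated = [];
  for (const node of nodes) {
    const name = node.dataset.hydrate;
    if (!name) continue; // no marker: leave the static HTML alone
    const mount = components[name];
    if (!mount) continue; // unknown component: skip rather than fail
    mount(node, JSON.parse(node.dataset.props || "{}"));
    hydrated.push(node.tag);
  }
  return hydrated;
}

const hydratedTags = hydratePartially(staticNodes, registry);
console.log(hydratedTags);
```

| Full hydration would correspond to mounting one root component
| over the whole page instead of skipping the unmarked nodes.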
| whydoineedthis wrote:
| Thank you, very helpful explanation.
| Santosh83 wrote:
| Here you go:
| https://en.wikipedia.org/wiki/Hydration_(web_development)
| klodolph wrote:
| I wrote a static site generator for my own personal site. I've
| been using it for over 10 years, and it's gone through several
| major refactors / redesigns. A few comments:
|
| 1. Template system
|
| There are tons of different template systems out there for things
| like the "shortcodes" in the article. {{youtube id="123asdf4" /}}
|
| My conclusion is that the correct way to do this is with custom
| tags, <embed-youtube id="123asdf4"></embed-youtube>
|
| I apologize for the verbosity... but this is completely valid
| HTML5, and you do not need anything but an ordinary HTML5 parser
| to parse this. This maximizes your choices for the libraries you
| use in the static site generator and it maximizes the level of
| support in whatever editor you choose to author the site in. For
| example, you can just use the HTML mode in Vim or Emacs, or you
| can use VS Code, TextMate, Sublime Text, etc. and get a ton of
| features: syntax highlighting, indenting, etc.
|
| While on the surface it looks verbose because of the closing tag,
| in most editors, you only have to press a key or two to close the
| tag. HTML5, strictly speaking, does not support self-closing tag
| syntax for custom tags. That syntax is only supported for void
| elements. There are only 16 void elements in HTML5.
|
| I use "prefix-suffix" syntax to avoid ambiguity... any tag with a
| hyphen is obviously a custom tag.
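|
| A rough sketch of expanding such a custom tag at build time.
| Node ships no HTML parser, so a regex stands in here for the
| ordinary HTML5 parser the comment recommends, and it only
| handles empty tags with double-quoted attributes. The renderers
| table and function names are hypothetical.

```javascript
// Hypothetical renderers keyed by custom tag name.
const renderers = {
  "embed-youtube": ({ id }) =>
    `<iframe src="https://www.youtube.com/embed/${id}" allowfullscreen></iframe>`,
};

// Collect name="value" pairs from the attribute text of a tag.
function parseAttributes(text) {
  const attrs = {};
  for (const [, name, value] of text.matchAll(/([\w-]+)="([^"]*)"/g)) {
    attrs[name] = value;
  }
  return attrs;
}

function expandCustomTags(html, tagRenderers) {
  // Matches <prefix-suffix attr="..."></prefix-suffix>; \1 forces the
  // closing tag to carry the same name as the opening tag.
  const pattern = /<([a-z]+-[a-z-]+)((?:\s+[\w-]+="[^"]*")*)\s*><\/\1>/g;
  return html.replace(pattern, (match, tag, attrText) => {
    const render = tagRenderers[tag];
    // Unknown custom tags are left untouched.
    return render ? render(parseAttributes(attrText)) : match;
  });
}

const page = '<p>Intro</p><embed-youtube id="123asdf4"></embed-youtube>';
const expanded = expandCustomTags(page, renderers);
console.log(expanded);
```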
|
| 2. Routing
|
| Something you can use to tackle the routing complexity is to
| place your source files in the same path as the canonical URL.
| You only need routes for generated content, like index pages and
| such.
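|
| One way to picture this: the URL is derived from the source
| path, so no route table is needed for authored pages. The
| function name, the .md/.html extensions, and the trailing-slash
| convention below are assumptions made for illustration.

```javascript
// Map a source file path (relative to the content root) to its URL.
function sourcePathToUrl(sourcePath) {
  // Normalize Windows separators and drop any leading slashes.
  const normalized = sourcePath.replace(/\\/g, "/").replace(/^\/+/, "");
  const withoutExt = normalized.replace(/\.(md|html)$/, "");
  if (withoutExt === "index") return "/";
  if (withoutExt.endsWith("/index")) {
    // about/index -> /about/
    return "/" + withoutExt.slice(0, -"index".length);
  }
  return "/" + withoutExt + "/";
}

const examples = ["index.md", "about/index.html", "posts/2020/ssg.md"];
const urls = examples.map(sourcePathToUrl);
console.log(urls);
```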
|
| 3. Index data
|
| You'll naturally want to generate indexes and create previews for
| links. I suggest that you start by looking at the schema.org
| schema for web pages and work with a useful subset of that. This
| way, you can generate indexes on your web page using the same
| exact data, same exact schema, that you use for the JSON-LD data
| you provide for search engines like Google.
|
| This is a minor point, but it reduces duplicated effort between
| the code for generating content for your website and the code for
| generating JSON-LD metadata.
|
| Don't dive too deep into the schema.org schema, just take a
| couple bits and pieces that you need, and refer to the feature
| guides in Google's documentation:
|
| https://developers.google.com/search/docs/guides/intro-struc...
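|
| A small sketch of the "same data, same schema" idea from point
| 3: one page record feeds both the generated index and the
| JSON-LD block. The records, URLs, and function names are made
| up; BlogPosting, headline, datePublished, description, and url
| are schema.org terms. No HTML escaping is done here.

```javascript
// Hypothetical page records using a small subset of BlogPosting properties.
const pages = [
  {
    url: "https://example.com/posts/first/",
    headline: "First post",
    datePublished: "2020-01-15",
    description: "Where it started.",
  },
  {
    url: "https://example.com/posts/second/",
    headline: "Second post",
    datePublished: "2021-03-02",
    description: "A follow-up.",
  },
];

// The JSON-LD object is the page record plus context and type.
function toJsonLd(page) {
  return { "@context": "https://schema.org", "@type": "BlogPosting", ...page };
}

function jsonLdScript(page) {
  return `<script type="application/ld+json">${JSON.stringify(toJsonLd(page))}</script>`;
}

// The index page is rendered from the very same records, newest first.
function renderIndex(allPages) {
  const items = [...allPages]
    .sort((a, b) => b.datePublished.localeCompare(a.datePublished))
    .map((p) => `<li><a href="${p.url}">${p.headline}</a> (${p.datePublished})</li>`);
  return `<ul>\n${items.join("\n")}\n</ul>`;
}

const indexHtml = renderIndex(pages);
const firstJsonLd = toJsonLd(pages[0]);
console.log(indexHtml);
console.log(jsonLdScript(pages[0]));
```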
| EricE wrote:
| > 2. Routing
| >
| > Something you can use to tackle the routing complexity is to
| > place your source files in the same path as the canonical URL.
| > You only need routes for generated content, like index pages
| > and such.
|
| Thank you! The only saving grace for a Gatsby site I recently did
| was that it leveraged a template that did that automatically. To
| say that it dramatically simplified things is a gross
| understatement. Template for those interested:
| https://github.com/18F/federalist-uswds-gatsby
| nickreese wrote:
| Author of the post. This is really an interesting take on
| shortcodes. I've been struggling with a format that the svelte
| compiler likes and can be used in markdown. This may be the
| answer.
|
| > <embed-youtube id="123asdf4"></embed-youtube>
|
| Thank you.
| na85 wrote:
| I went down this rabbit hole and found everything to be
| overcomplicated for my use case. I'm so sick of static site
| generators that have seven layers of templating engines and
| complicated build systems.
|
| Most of the static site builders I tried were either way too
| complex, or else just straight-up didn't work at all (looking at
| you, coleslaw).
|
| I tried to go full emacs and use org export (org being my
| favorite text format) but the default export is horrendous and
| the documentation for org to html export is so bad it might as
| well not exist.
|
| Software is simultaneously awesome and infuriating.
|
| So after three days I just gave up on org-export and now I have
| pandoc shit out an html snippet that I concatenate with a hand-
| rolled html preamble and postamble via a Makefile.
|
| Found a few rough edges, probably because org is underspecified.
| It's not elegant but it works for my use case.
| jstrieb wrote:
| For what it's worth, if you're using Pandoc, you can set the
| HTML output to be "standalone" based on a simple template.[0]
| You can also include a standard header and footer to be
| automatically inserted for each generated page.
|
|     pandoc \
|       --standalone \
|       --css=/style.css \
|       --highlight-style=code-highlight.theme \
|       --variable=lang:en \
|       --include-before-body=navbar.html \
|       --include-after-body=footer.html \
|       --template=template.html \
|       $MD -o $HTML
|
| I use a variation of this command in a bash script to generate
| my entire static site.[1] A friend improved upon my script with
| a Go implementation that does some more advanced stuff, but
| still compiles Markdown to HTML using this command under the
| hood.[2]
|
| 0: https://pandoc.org/MANUAL.html#option--standalone
|
| 1: https://github.com/jstrieb/personal-site/blob/master/compile...
|
| 2: https://github.com/lsnow99/dudu
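|
| To illustrate driving that command over a whole site, here is a
| sketch that only builds the argument list per Markdown file. It
| is not the linked bash or Go code. The flags are the ones quoted
| in the comment; the file list and function name are
| hypothetical, and actually running pandoc (for instance with
| Node's child_process.execFileSync) is left out.

```javascript
// Build the pandoc arguments for one Markdown source file.
function pandocArgs(mdPath) {
  const htmlPath = mdPath.replace(/\.md$/, ".html");
  return [
    "--standalone",
    "--css=/style.css",
    "--highlight-style=code-highlight.theme",
    "--variable=lang:en",
    "--include-before-body=navbar.html",
    "--include-after-body=footer.html",
    "--template=template.html",
    mdPath,
    "-o",
    htmlPath,
  ];
}

// Hypothetical list of sources; a real script would walk a directory.
const sources = ["index.md", "posts/hello.md"];
const commands = sources.map((md) => ["pandoc", ...pandocArgs(md)]);
for (const command of commands) console.log(command.join(" "));
```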
| breck wrote:
| Try https://scroll.pub
|
| It uses Scrolldown instead of markdown, which is simpler,
| cleaner, and incredibly extensible.
|
| The command line app has just a few commands---and they all
| take zero params.
|
| A site is just a single folder, and because content is written
| in Scrolldown, it works great with git and is great for content
| sites or collaborative strongly typed databases.
|
| It's fast (I get about 300 pages per second), it's not a lot of
| code (sub 1k excluding dependencies), and the code is tested.
|
| It's in nodejs now but no reason scroll can't be language
| agnostic.
|
| I've been around SSGs for over a decade and designed this one
| to be simple, reliable, and to stay out of the creators' way. I
| think it could be the last SSG you'll ever need.
| wishinghand wrote:
| Is this hyperbole or am I just lucky to avoid whatever you
| tried? What SSGs require 7 layers of templating and complicated
| build systems? Whenever I tried out Sergey, Nuxt, Docpad, and a
| few others, there was just one templating engine each and a
| build command for the CLI.
| eatonphil wrote:
| This is a good review of a fairly complex piece of software. But
| don't let this convince you that all static site generators must
| be complex.
|
| Out of laziness, most sites I run have their own 100-200 line
| Python static site generator that takes Markdown (if I'm really
| feeling it) or HTML files with Jinja templates and generates
| pages around them. The core generator code hardly ever changes
| year by year. Here's an example [0].
|
| This isn't to say that everyone should always write their own. I
| am just a bit surprised by all the debate around each generator
| because they all produce the same thing and the only (or major)
| variables are the template language and what themes are built in
| (though of course you can always bring your own CSS).
|
| But _using_ a static site generator is a very good idea. If you
| have no other inclinations, I think the stack that makes sense
| for anyone with multiple contributors is to use WordPress for
| editing and then have a plugin that will generate static pages
| from it so not every request to your site hits the database.
|
| [0]
| https://github.com/eatonphil/notes.eatonphil.com/blob/master...
| jjjbokma wrote:
| Mine is just over 1KLOC :-) But it includes both a RSS and a
| JSON feed, support for Twitter card / Facebook sharing, a
| calendar view, and a tag cloud. Live demo: https://plurrrr.com/
|
| Code is available at GitHub:
| https://github.com/john-bokma/tumblelog
| jazzyjackson wrote:
| The choice of using one markdown document to render all the
| pages is really interesting to me. Do you just edit the
| document in your terminal? Trying to imagine the usability of
| that, I guess if I was quick at jumping from one "page" to
| another it might be faster than opening and closing files.
| nickreese wrote:
| Author here. At its core, most static site generators are just
| fancy "string concatenation" tools.
|
| In my experience playing with several generators before
| building Elder.js, it isn't so much the output that matters as
| how the static site generator lets you do non-trivial
| customization: things that would be hard without a larger
| framework.
|
| More importantly, when building a major project on a static
| site generator, it is important to have an upgrade path beyond
| a static site generator should you require it. Elder.js was
| built with that use case in mind though I didn't cover that in
| the article.
|
| Being able to move to SSR should the project require it is a
| huge plus in my book.
| eatonphil wrote:
| > At its core, most static site generators are just fancy
| "string concatenation" tools.
|
| I think that's a bit reductive (especially since template
| libraries themselves are already complicated, not in a bad
| way). To me a static site generator is: 1) a string template
| library, plus 2) a file system walker/trigger to generate
| from a template, plus 3) additional data to feed the
| templates, plus 4) the actual content.
|
| Then there's of course the additional features you may or may
| not need: a tagging system, a comment system, a subscription
| system, etc.
|
| Thankfully despite all these components you don't need to
| write much of the actual code since they all exist as builtin
| (file system walking) or major OSS (Jinja, Mustache.js, etc)
| libraries. The SSG is primarily glue.
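|
| The four parts listed above can be sketched in a few lines. This
| is an illustration only: a toy {{ name }} replacer stands in for
| a real template library such as Jinja or Mustache.js, and an
| in-memory map stands in for walking the file system. All names
| and sample content are hypothetical.

```javascript
// 1) String template library: replace {{ name }} placeholders with data.
function renderTemplate(template, data) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, key) =>
    key in data ? String(data[key]) : match
  );
}

// 2) File system walker: this map stands in for files found on disk.
// 4) The actual content is the value stored for each path.
const contentFiles = {
  "index.html": "<p>Welcome.</p>",
  "about.html": "<p>About this site.</p>",
};

// 3) Additional data fed to every template.
const siteData = { siteName: "Example Site" };

const layout =
  "<html><head><title>{{ siteName }}: {{ path }}</title></head>" +
  "<body>{{ content }}</body></html>";

// The glue: render every content file through the layout.
function buildSite(files, template, data) {
  const output = {};
  for (const [path, content] of Object.entries(files)) {
    output[path] = renderTemplate(template, { ...data, path, content });
  }
  return output;
}

const site = buildSite(contentFiles, layout, siteData);
console.log(site["about.html"]);
```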
| susam wrote:
| > But don't let this convince you that all static site
| generators must be complex.
|
| Indeed! For example, https://github.com/sunainapai/makesite is
| a simple and lightweight static site generator written in
| Python. It can be customized easily by modifying the Python
| source code and adapting it to one's needs. I like that when I
| need a new feature, I can add it quite easily by writing a few
| Python functions. It is meant to be programmer-friendly.
|
| Disclosure: My wife wrote this project. I am just a happy user
| of the project.
| mturmon wrote:
| I used makesite.py as a template for a small site of
| 100-200 pages that I maintain. It has worked quite well.
|
| The animating idea ("use this as a template but don't be
| afraid to customize or reinvent certain parts") liberated me
| from feature-by-feature evaluation of a bunch of complex
| config/templating systems.
|
| For a small site like mine, getting the content right is the
| main thing and the site generator should mostly get out of
| the way. I update the site sporadically and don't want to re-
| learn a complex templating and config-file system every time
| I go back to it.
| pjc50 wrote:
| I've become convinced that it's easier to write a SSG than
| understand someone else's, and it's _definitely_ quicker than
| trying to evaluate the market and pick one.
| AndrewStephens wrote:
| > Out of laziness, most sites I run have their own 100-200 line
| Python static site generator that takes Markdown (if I'm really
| feeling it) or HTML files with Jinja templates and generates
| pages around them. The core generator code hardly ever changes
| year by year.
|
| I maintain my site exactly the same way (minus the Jinja) and
| it is a workflow that works for me. Simplicity is best (even
| beating flexibility) when it comes to tools that help you
| express yourself. Otherwise you spend all your time wrestling
| with your tools rather than creating.
| Bayart wrote:
| Yesterday someone ran a thread where people posted tons of
| headless CMS options[1]. There might be a few ideas there for
| people interested in your comment.
|
| [1] https://news.ycombinator.com/item?id=27674105
| jrm4 wrote:
| Alright, I'm skimming this whole idea of newfangled "static site
| generators" that involve a lot of Javascript and I'm left with a
| whole lot of "Isn't this just ____ with extra steps?"
|
| I'm seeing "shortcodes" and I'm like -- as in variables and/or
| configuration files?
|
| Or, more broadly -- why Javascript for the backend? This looks
| like a silly level of complexity. I'd start thinking about it in
| Bash and then probably head over to e.g. Python once databases et
| al start getting involved. What am I missing here?
| nickreese wrote:
| Shortcodes as they are implemented in Elder.js (what the
| article is about) are borrowed from WordPress. Basically they
| are a placeholder such as [[lastestTweet/]] that let you add
| dynamic content into otherwise static content.
|
| While Elder.js does allow for full server side rendering, making
| "Javascript the backend", the goal of the static site generator
| is to generate static HTML/CSS/JS that can be hosted from a CDN,
| S3, or other static host.
|
| That said, one of the biggest pitfalls with building a major
| site on a static site generator is there is often no upgrade
| path to server rendering. Elder.js does offer that out of the
| box.
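|
| A sketch of the placeholder idea, using the [[name/]] shape
| shown in the comment. This is not Elder.js's actual shortcode
| API; the handler table, shortcode names, and returned markup
| are hypothetical.

```javascript
// Hypothetical handlers: each returns the HTML to splice into the page.
const handlers = {
  latestTweet: () => "<blockquote>Most recent tweet goes here</blockquote>",
  buildYear: () => "2021",
};

// Replace every [[name/]] placeholder whose name has a handler.
function expandShortcodes(content, shortcodeHandlers) {
  return content.replace(/\[\[\s*(\w+)\s*\/\]\]/g, (match, name) => {
    const handler = shortcodeHandlers[name];
    // Unknown shortcodes are left in place so they are easy to spot.
    return handler ? handler() : match;
  });
}

const source = "<p>Built in [[buildYear/]].</p>[[latestTweet/]][[unknown/]]";
const rendered = expandShortcodes(source, handlers);
console.log(rendered);
```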
| valenterry wrote:
| > What am I missing here?
|
| People usually start their projects with the language they feel
| most comfortable in, not the one that is best suited for the
| project.
| frabert wrote:
| Curious you mention bash! I did exactly that some time ago when
| I thought about starting blogging... And then never started
| blogging!
|
| https://blog.frabert.me/posts/2018/11/11/blash.html
| brundolf wrote:
| > why Javascript for the backend?...[I'd] probably head over to
| e.g. Python once databases et al start getting involved
|
| I could ask the same thing: why Python for the back-end?
|
| Both languages have similar feature-sets and are roughly
| equally suitable to this task. Given that, a person should use
| whichever they're most comfortable with. Lots of people are
| comfortable with JavaScript these days, ergo there's lots of
| activity in the JavaScript SSG space.
| jrm4 wrote:
| Fair -- but let me be more precise then. I do like the idea
| of "use what you learned and are most comfortable in."
|
| That being said, I think what I mean by "e.g. Python" is
| specifically - "older slash more text oriented slash proven"
|
| If you like Javascript, fine. But I think there's a case to
| be made that there is a completely unnecessary "bloat" to
| Javascript -- especially even as the author himself has
| suggested that at the end of the day it's all string
| concatenation.
|
| If that's the case, (as it frequently is) it ends up boiling
| down to "what handles _text_ well and in an established and
| smooth way," and Javascript does not score high there, I'd
| suggest.
| brundolf wrote:
| It doesn't sound like you write much JavaScript :)
|
| > I think there's a case to be made that there is a
| completely unnecessary "bloat" to Javascript
|
| "Bloat in JS" is something of a trope, and I'm not really
| even sure what you mean by it here. Typically people mean
| "too many JS dependencies are often drawn in for web
| pages", but that's not really relevant. Maybe you mean
| syntax bloat (method calls instead of list comprehensions)?
| If so, you're not totally wrong, but it's also not really a
| big deal in my experience. If you're talking about
| runtime/performance, well... V8 tends to be faster than
| CPython in raw compute (excluding native modules) because
| Google has put so much work into optimizing it (not that
| that matters much for a SSG either)
|
| > it ends up boiling down to "what handles text well and in
| an established and smooth way," and Javascript does not
| score high there
|
| Whenever I need to process some nontrivial text, the first
| thing I do is open up a Chrome tab for the JS repl. I think
| the following is pretty smooth:
|
|     const csv = `
|     Name,Email,Phone Number,Address
|     Bob Smith,bob@example.com,123-456-7890,123 Fake Street
|     Mike Jones,mike@example.com,098-765-4321,321 Fake Avenue`
|
|     const lines = csv.trim().split('\n').map(line => line.split(','))
|     const [headings, ...data] = lines
|     const objs = data.map(datum =>
|       Object.fromEntries(
|         headings.map((heading, index) => [heading, datum[index]])))
|
|     console.log(objs)
|     // output:
|     [
|       {
|         "Name": "Bob Smith",
|         "Email": "bob@example.com",
|         "Phone Number": "123-456-7890",
|         "Address": "123 Fake Street"
|       },
|       {
|         "Name": "Mike Jones",
|         "Email": "mike@example.com",
|         "Phone Number": "098-765-4321",
|         "Address": "321 Fake Avenue"
|       }
|     ]
|
| Not to mention template-strings, which I use extensively in
| my own website:
|
|     const header = `
|       <h1>Welcome to ${pageTitle}</h1>
|     `
|
| Of course it's all subjective and Python does have format
| strings and some slick dedicated syntaxes. But I don't
| think it's fair to say "JavaScript does not score high"
| when it comes to text-shuffling.
| EricE wrote:
| Having recently done a small Gatsby site I can identify with the
| comments about complexity over time. And graphql does seem like
| utter overkill too!
| klelatti wrote:
| A big thank you to the author for open sourcing this - I've been
| playing with this to implement a largish static site (11,000
| pages) and (as a definite non expert) have found it relatively
| easy to understand and use - and it's lightning fast.
|
| Just one comment: found implementing a Svelte Leaflet Map
| component a bit of a struggle - an addition to the docs on this
| would be very useful.
| [deleted]
| pier25 wrote:
| This is a great post, but why doesn't it have a date?
|
| It's infuriating to see blog posts or even news that don't
| clearly display the publish date near the title.
| corobo wrote:
| Probably forgotten rather than whatever the other comments have
| heard haha. I didn't even realise my own blog was missing dates
| till just now
|
| Infuriating is a bit much
| pier25 wrote:
| > Infuriating is a bit much
|
| Ok maybe I was exaggerating :) but it really annoys me.
| yarinr wrote:
| The publish date is 2020-11-02, as appears on the articles list
| at https://nicholasreese.com/
|
| I agree they should probably make it visible on the article
| page itself...
| nickreese wrote:
| Hey, author here. I banged this out quickly and didn't update
| the template to include the date as an oversight. It was
| written in Nov 2020 as shown on the homepage.
| pier25 wrote:
| > _It was written in Nov 2020 as shown on the homepage._
|
| Nobody visits homepages anymore. :)
| nickreese wrote:
| Not arguing that. Updated the article template to keep
| others from getting infuriated. ;)
| pier25 wrote:
| Awesome!
| tomjen3 wrote:
| If it is a great post, why does it matter when it was written?
|
| How to win friends and influence people was written close to
| 100 years ago, but it is still recommended in plenty of places.
|
| The dragon book is older than me, but it is still one of the
| recommended books for writing a compiler.
|
| Why does a blog post have to have a date displayed on it? If it
| is about a specific version of some software, I can understand,
| and agree with you, why it should be mentioned in the post.
| pier25 wrote:
| > _How to win friends and influence people was written close
| to 100 years ago, but it is still recommended in plenty of
| places._
|
| Because human nature hasn't changed. Front end stuff changes
| every day.
| mkr-hn wrote:
| Some marketing blog long ago said to remove dates so people
| can't tell when a post was written, and it's plagued blogging
| ever since. The idea is that "evergreen content" shouldn't need
| a date, but a date is context. No one is so good at writing
| that their writing has nothing anchoring it to the context of a
| point on a timeline.
| kevincox wrote:
| I agree. You can always update the date, or add a "refresh"
| date if you do update the content (or just verify that it
| still applies).
| lawwantsin17 wrote:
| Can't we all agree that adding additional compiled languages to
| the frontend is not the answer? JS has template literals now,
| written in C++. Who "falls in love" with a compiled frontend
| language and then spends the next 7 months answering questions
| on Discord? What is the point of all of this?
| nickreese wrote:
| Hey all -- Author here. This was a reflection on building
| Elder.js.[0]
|
| Happy to answer any questions. I'll be going through and adding
| context to the questions I see.
|
| [0]: https://elderguide.com/tech/elderjs/
| tsegratis wrote:
| Really appreciate your frank assessment of the design and
| current status. For instance:
|
| > the ElderGuide.com team expects to maintain this project at
| least until 2023-2024
|
| So nice to see the absence of a hype train, and also so nice to
| get an insight into your view of the design space
|
| I would never be your market for elder.js. But I appreciate
| learning from you, and appreciate building software alongside
| you
___________________________________________________________________
(page generated 2021-06-30 23:01 UTC)