[HN Gopher] From Ugly to Beautiful
       ___________________________________________________________________
        
       From Ugly to Beautiful
        
       Author : mihaitodor
       Score  : 39 points
       Date   : 2023-02-25 14:41 UTC (8 hours ago)
        
 (HTM) web link (martinkysel.com)
 (TXT) w3m dump (martinkysel.com)
        
       | Swizec wrote:
       | Oh hey I'm quoted in this article as a cautionary tale :D
       | 
       | It is true: Migrating ~10 years' worth of WordPress mess to
       | markdown that can be slurped in by something like Gatsby was a
       | pain in the arse in my case. So many strange little things
       | regularly broke that I'm pretty sure some are still broken.
       | Naturally I no longer had source files, just what was live in
       | WordPress.
       | 
       | Also a word of warning - _host images yourself_. So many lost
       | images (and links) on my site due to link rot. Makes me sad.
       | 
       | But I now have all my content in text files on a GitHub
       | repository. The next migration will be easier.
        
         | mkysel wrote:
         | you are famous on the google-s!
        
       | danwork wrote:
       | There are a multitude of stepping stones between a computer in
       | your friend's basement and an app with a distributed database
       | hosted in AWS.
       | 
       | There are literally hundreds of thousands of WP hosting sites.
       | 
       | I get that it's not always a cost exercise and more exploratory
       | learning to change your blog around, but that stuck out as an odd
       | complaint when justifying a shift in your stack.
       | 
       | So... is it still hosted in your friend's basement?
        
         | nicbou wrote:
         | Even if cost is not a concern, it's amazing not to maintain a
         | CMS and harden it against attackers. You just put the files
         | there and forget about them.
         | 
         | WordPress requires more babysitting.
        
       | Brajeshwar wrote:
       | I moved my blog from WordPress to Jekyll too. Not much about the
       | "beautiful" part but it definitely is simpler, much easier to
       | maintain, and the writing has been cleaner. I also made the
       | Jekyll theme available for anyone to try it out.
       | 
       | https://brajeshwar.com/2021/brajeshwar.com-2021/
        
         | extr0pian wrote:
         | I moved my blog from WordPress to just a static HTML/CSS site
         | that I maintain myself, including the RSS XML. It's certainly
         | simpler and I've learned a great deal, but maintaining even a
         | personal website this way can be a pain (it's far too easy to
         | miss a typo and break something like the RSS feed). Jekyll is
         | likely what I'll end up switching to.
         | 
         | https://chuck.is/html/
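
For anyone hand-maintaining a feed as described above, a small pre-publish check can catch the kind of typo that silently breaks RSS. This is only a minimal sketch, assuming an RSS 2.0 feed and Python's standard library; `check_feed` is a hypothetical helper, not something mentioned in the thread.

```python
import xml.etree.ElementTree as ET


def check_feed(xml_text: str) -> list[str]:
    """Return a list of problems found in a hand-written RSS 2.0 feed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # A single unclosed or mistyped tag ends up here.
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != "rss":
        problems.append(f"root element is <{root.tag}>, expected <rss>")

    channel = root.find("channel")
    if channel is None:
        return problems + ["missing <channel>"]

    # RSS 2.0 requires these three elements inside <channel>.
    for name in ("title", "link", "description"):
        if channel.find(name) is None:
            problems.append(f"<channel> is missing <{name}>")

    # Each <item> needs at least a <title> or a <description>.
    for i, item in enumerate(channel.findall("item"), start=1):
        if item.find("title") is None and item.find("description") is None:
            problems.append(f"item {i} has neither <title> nor <description>")

    return problems
```

Running this on the feed file before uploading, and refusing to publish if the returned list is non-empty, would be one way to use it. It only checks structure, not dates or link validity.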
        
           | Brajeshwar wrote:
           | The content I interact with is plain text (Markdown). I use
           | Jekyll just as a tool. I have tried Hugo and it is way faster
           | locally. However, GitHub Pages has Jekyll built in, so I
           | stayed with it. I rarely run Jekyll locally to write.
        
       | mihaitodor wrote:
       | A story about various options for migrating a personal blog from
       | a self-hosted WordPress instance to a static site using Jekyll
        
       | nicbou wrote:
       | I just moved from Craft CMS to a DIY static site generator,
       | Ursus. I moved 850 entries with various kinds of metadata,
       | including various relationships between entries. This is the
       | website that pays my bills, so it's important to get it right.
       | 
       | It took 6-8 weeks, and I was on a roadtrip for 3 of those. It
       | includes writing an SSG from scratch and migrating both the
       | content and the templates.
       | 
       | I'm about to write my own recap, but to summarize it here: it
       | went okay, and I love working with markdown.
       | 
       | The biggest issue was the conversion of 5 years of HTML noise
       | caused by Redactor, a WYSIWYG editor with many quirks. The
       | converter didn't know what to do with stray line breaks at the
       | end of block elements, among other oddities. I had to fix a lot
       | of stuff manually.
       | 
       | The second biggest issue was implementing responsive images with
       | captions, which replace img tags with elaborate figure elements.
       | I converted those to markdown manually, then wrote my own
       | markdown extension. It wasn't that bad!
       | 
       | I've now been editing markdown for a month, and I am loving it.
       | It's text in files, not HTML in a database. You can transform it
       | with an arsenal of tools from the last 5 decades, not just a
       | crummy WYSIWYG editor. You can apply the same rigour to the
       | written word as you would to code. You can use scripts, regex,
       | linters, and fancy text editors with fine-tuned plugins and
       | themes. It's awesome to have so much control over your work
       | environment.
       | 
       | It's also great to review changes with a git merge tool, to
       | deploy content updates like code, to work offline on slow
       | hardware.
       | 
       | Oh and the server has so few moving parts now. No more WordPress
       | updates, no MySQL CPU spikes. The whole thing is so simple.
       | 
       | I waited a long time to move to an SSG, but it was absolutely
       | worth it.
        
         | swatcoder wrote:
         | Nice!
         | 
         | > The biggest issue was the conversion of 5 years of HTML noise
         | caused by Redactor, a WYSIWYG editor with many quirks. The
         | converter didn't know what to do with stray line breaks at the
         | end of block elements, among other oddities. I had to fix a lot
         | of stuff manually.
         | 
         | For anybody else trying this sort of rewrite, don't rule out
         | automating a (headless) browser in place of digesting noisy
         | source files. The browser is designed to normalize a lot of
         | that awful input content to a sane DOM, and you can then
         | extract your clean markdown/whatever representation from that
         | normalized _content_ tree rather than from a raw HTML model
         | (with all its loosey goosey allowances).
         | 
         | And if this is your own project/code, you might also introduce
         | a few changes into your CMS template to make the scraping more
         | reliable.
        
           | nicbou wrote:
           | The input was valid. It just wasn't clean. It had empty tags,
           | bold tags that wrapped a line break and ended at the
           | beginning of the second line, and other oddities that don't
           | quite translate to markdown. There were also site-specific
           | conventions that I replaced (basically, how I handle
           | footnotes).
           | 
           | The nice part is that I could fix those issues with regex. I
           | could fix hundreds of pages in a few keystrokes, and only
           | commit changes if they looked right. I also fixed thousands
           | of little things that were impossible to fix before, because
           | editing text files is trivial.
           | 
           | To migrate the content, I converted the templates to output
           | plain markdown files instead of HTML. I also had one page
           | that was a plain text list of all entries to migrate. A
           | script read the list and saved each rendered markdown file.
           | 
           | I can't overstate how pleasant it is to work with plain text
           | files. Today, I'm reviewing 450 locations with the Google
           | Places API. I fixed a dozen businesses that moved or closed.
           | This would have taken days before. It was too tedious to even
           | try.
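
The regex cleanup described above (empty tags, bold tags that wrap a line break) could look roughly like the sketch below. The patterns and function names are assumptions for illustration; the actual expressions used in the migration are not given in the thread.

```python
import difflib
import re

# Hypothetical patterns, not the ones used in the original migration.
# Bold that wraps a line break and closes right after it: <b>text<br></b>
BOLD_OVER_BREAK = re.compile(r"<(b|strong)>([^<]*?)\s*<br\s*/?>\s*</\1>")
# Tags that contain nothing but whitespace: <p> </p>, <b></b>, ...
EMPTY_TAG = re.compile(r"<(p|b|strong|i|em|span)\b[^>]*>\s*</\1>")


def clean_html_noise(html: str) -> str:
    """Remove WYSIWYG leftovers that do not translate cleanly to markdown."""
    # Move the line break outside the bold tag.
    html = BOLD_OVER_BREAK.sub(r"<\1>\2</\1><br>", html)
    # Repeat until stable so nested empty tags such as <p><b></b></p> vanish.
    previous = None
    while previous != html:
        previous = html
        html = EMPTY_TAG.sub("", html)
    return html


def preview_changes(html: str) -> str:
    """Return a unified diff so a change can be reviewed before saving it."""
    cleaned = clean_html_noise(html)
    diff = difflib.unified_diff(
        html.splitlines(), cleaned.splitlines(),
        fromfile="before", tofile="after", lineterm="",
    )
    return "\n".join(diff)
```

For example, `clean_html_noise("<p>Hello <b>world<br></b>again</p><p> </p>")` returns `<p>Hello <b>world</b><br>again</p>`. Reviewing the output of `preview_changes` before writing files back mirrors the "only commit changes if they looked right" step; with files in git, `git diff` serves the same purpose.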
        
       | ChrisMarshallNY wrote:
       | _> Suddenly it's 6 months later and you're losing your mind_
       | 
       | The Programmer's Credo:
       | 
       |  _"We do what we do; not because it is easy, but because we
       | thought it would be easy."_
       | 
       | I have considered migrating my sites from WP, but they aren't
       | really central enough to my work to justify the effort.
       | 
       | WP lets me keep a fairly "hands off" approach, while also giving
       | me a great deal of control over things like formatting.
       | 
       | But the new MD-based static site generators are pretty cool, and
       | I may get around to it.
        
         | CharlesW wrote:
         | You could also try a static site generator (e.g. Simply Static)
         | for WordPress as a stepping stone. You can even use the
         | now-official SQLite support
         | (https://wordpress.org/plugins/sqlite-database-integration/)
         | to avoid the MySQL overhead.
        
       ___________________________________________________________________
       (page generated 2023-02-25 23:01 UTC)