[HN Gopher] EPUBCheck - The official conformance checker for ePu...
       ___________________________________________________________________
        
       EPUBCheck - The official conformance checker for ePub publications
        
       Author : auraham
       Score  : 121 points
       Date   : 2024-08-23 04:31 UTC (18 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | mdaniel wrote:
       | One can also $(brew install epubcheck) if so inclined
       | https://formulae.brew.sh/formula/epubcheck#default
        
         | dindresto wrote:
         | Or `nix run nixpkgs#epubcheck`
        
       | m101 wrote:
       | I'm not sure why but when I download some epubs and try to send
       | to my kindle it fails. Only after using an online converter to
       | convert them from epub to epub does it then work.
        
         | daveoc64 wrote:
         | You might want to try running the EPUB through
         | https://kindle-epub-fix.netlify.app/
         | 
         | This applies a few fixes for problems with EPUBs that the Send
         | to Kindle service doesn't like.
        
         | sillystuff wrote:
         | Maybe a little less hassle than using some web thingy.
         | ebook-convert is a CLI application that comes with Calibre, and
         | is probably what the online sites are using anyway.
         | ebook-convert infilename.epub outfilename.epub
         | 
         | If I get an ebook that works with fbreader, but has issues on
         | my nook, the above will fix it.
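The epub-to-epub round trip above can be scripted. This is a minimal sketch, assuming Calibre's `ebook-convert` is on the PATH; the helper names `build_convert_command` and `roundtrip_epub` are hypothetical, and only the two positional arguments shown in the comment are used.

```python
import shutil
import subprocess


def build_convert_command(src: str, dst: str) -> list:
    """Return the argv for `ebook-convert infilename.epub outfilename.epub`."""
    return ["ebook-convert", src, dst]


def roundtrip_epub(src: str, dst: str) -> bool:
    """Re-save an EPUB through Calibre and report whether the tool exited with 0.

    Assumption: `ebook-convert` is installed. Returns False if it is not found.
    """
    if shutil.which("ebook-convert") is None:
        return False
    result = subprocess.run(build_convert_command(src, dst), check=False)
    return result.returncode == 0
```

Calling `roundtrip_epub("infilename.epub", "outfilename.epub")` writes the re-saved copy next to the original, so the source file is never overwritten.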
        
         | dopa42365 wrote:
         | I've used Calibre with the KFX Output plugin for years, never
         | had an issue with converting pirated epubs.
        
       | leoc wrote:
       | Cutting and pasting an old 2019 comment
       | https://news.ycombinator.com/item?id=19944627 :
       | 
       | > There are similar problems with uploading to publishers in ePub
       | format. The last time I was bashing my head against ebook
       | publishing, about a couple of years ago, many (most? all?) of the
       | sites were validating ePub uploads using an old version of the
       | ePub suite which rejected some ebooks which were valid per the
       | up-to-date validator. Which version they were using was ofc not
       | documented, and you were lucky to even get to see an error
       | message. And of course tech support was largely unhelpful.
       | (Especially kobo.com 's.) The people working on the ePub spec
       | seemed to be largely unbothered by the
       | fragmentation/noncompliance and hideous experience for those
       | authoring and uploading in the format, too.
       | 
       | > Which is a pity, because aside from this and some other bugs
       | and pitfalls EPUB 2.0 has some attractive features and is nice to
       | work with for anyone who doesn't mind bashing out a good old
       | directory tree of HTML docs by hand.
       | 
       | Maybe things are a lot better by now. Here's hoping!
        
       | tannhaeuser wrote:
       | Why is this linked now? There's no new release or milestone at
       | this time.
       | 
       | Citing my comment from when this was new about eight months ago:
       | 
       | > Even more unfortunate is that this change has already spilled
       | to derived standards such as EPUB3 which hence makes existing
       | EPUB3 content using compound headings going back to 2011 invalid,
       | and EPUB3 writers lacking a tool for actually verifying what
       | readers can support (epubcheck was blindly updated without
       | consideration for the installed base).
       | 
       | See also the blog [1] about W3C's most recent HTML spec. Lack of
       | HTML backward compat along with gross import of all of CSS
       | without profiles, or paged media requirements and deemphasis of
       | long-standing EPub mechanisms in favor of CSS and JS, and general
       | impression of a low-effort, merely editorial nature really makes
       | Epub's move to W3C questionable but nobody seems to care anyway,
       | sticking with EPub 2 and 3.1 (which is also what Calibre is
       | recommending as target format for conversion).
       | 
       | [1]: https://sgmljs.net/blog/blog2303.html
        
       | redman25 wrote:
       | I worked for a publisher for about 10 years as a typesetter and
       | ebook developer. There are a lot of things about the publishing
       | industry that are antiquated, especially for non-technical
       | publishing companies. Unfortunately it's a low margin business.
       | 
       | Most authors are only familiar with Microsoft Word, so on the
       | front end you often have to take a messily styled Word document
       | and manually caress it into a structured document that can be
       | used for ebooks and print.
       | 
       | For print, a majority of non-technical publishers use Adobe
       | InDesign and/or InCopy. Editors edit manuscripts in InCopy and
       | typesetters style documents for print. PDFs are generally
       | exported and sent to printers via FTP.
       | 
       | For ebooks, every publisher seems to have their own bespoke
       | system. You _can_ export books in epub format from InDesign but
       | the process for getting a clean ebook is difficult to say the
       | least since InDesign was primarily designed for print
       | publications. Generally, you end up structuring books for the
       | lowest common denominator of ebook platform (epub, kindle, etc.)
       | unless you are creating something like a children's book or a
       | poetry book where you might do something more custom.
       | 
       | Many publishers use ebook distribution platforms where you upload
       | epub, mobi, cover images via FTP. They use an XML standard called
       | ONIX for distributing metadata that's unique to say the least...
        
         | breck wrote:
         | This is extremely helpful information [0]. Thank you.
         | 
         | [0] I'm currently working on a new language for writing books.
        
         | Finnucane wrote:
         | I oversee ebook production for a university press publisher,
         | and indeed, our typesetters have to do some pre-processing of
         | our authors' Word files before typesetting (we do all editing
         | and copyediting in Word), and then post-processing of the
         | InDesign output to get acceptable ebook files. There are some
         | plugins that will help. InDesign, left to its own devices, will
         | give you garbage.
         | 
         | Our sales vendors all check the files with epubcheck. If it
         | doesn't pass, they'll bounce it.
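A pass/fail gate like the one the vendors apply can be approximated locally before uploading. This is a sketch, assuming an `epubcheck` launcher is on the PATH (for example from the brew or nix packages mentioned earlier in the thread) and assuming it signals validation failure with a non-zero exit status; `passes_epubcheck` and the injectable `run` parameter are hypothetical names used here so the logic can be exercised without the tool installed.

```python
import subprocess


def _run_with_subprocess(argv):
    """Run a command and return its exit status."""
    return subprocess.run(list(argv), check=False).returncode


def passes_epubcheck(epub_path, run=_run_with_subprocess):
    """Return True when the checker exits with status 0 for the given file.

    Assumption: a zero exit status means the file passed validation.
    """
    return run(["epubcheck", epub_path]) == 0
```

In a build script this could guard the upload step, e.g. `if not passes_epubcheck("book.epub"): raise SystemExit(1)`.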
        
         | grecy wrote:
         | FWIW I've published a few of my own books. I write them in
         | LaTeX to get perfect print-ready PDFs, then pandoc gives me
         | flawless epub files, all from the same source.
         | 
         | It works really well
        
           | bhaak wrote:
           | Any special LaTeX packages you are using?
        
             | philistine wrote:
             | Not OP but I basically do the same thing (except the epub
             | part) and the secret is to use XeTeX, since it allows you
             | to use modern fonts.
        
             | grecy wrote:
             | Here's my process http://theroadchoseme.com/how-i-self-
             | published-a-professiona...
        
         | lxgr wrote:
         | > You _can_ export books in epub format from InDesign but the
         | process for getting a clean ebook is difficult to say the least
         | since InDesign was primarily designed for print publications.
         | 
         | I wonder if that's why so many, and even relatively new, ePubs
         | feel a lot like poorly OCRed PDFs?
         | 
         | It generally seems like most publishers and I have opposite
         | goals when it comes to ePubs: They want them to look and feel
         | as much like the physical book as possible (by including custom
         | fonts, applying custom margins/padding etc.), while I want
         | absolutely none of that.
         | 
         | It's frustrating having to fight the publisher to get something
         | readable on a small display or non-Kindle ePub reader, and I
         | don't even want to get started on dark mode...
        
       | DiggyJohnson wrote:
       | Genuinely thankful for this tool and use it for both sides of my
       | non-fiction book project (research: verifying converted or
       | misbehaving epubs for use on Remarkable, iPad, Calibre and Kindle
       | (I know, I know...)) as well as typesetting and review.
        
       | everybodyknows wrote:
       | Last commit was in 2023:
       | https://github.com/w3c/epubcheck/commits/main/
       | 
       | 85 open bugs; 9 open PRs.
        
       | KingOfCoders wrote:
       | Tried to publish an epub to the German platform Tolino. Tried
       | everything, including an external service agency and doctoring
       | around in Calibre. No success; they didn't accept the epub.
       | 
       | Printed PDF for decades at print shops, never had a problem.
       | 
       | Why is this such a problem? Because of the HTML? The JS?
        
         | 0cf8612b2e1e wrote:
         | I believe an epub is just a zip archive of HTML. Could you
         | extract the files and then use an HTML validator, which would
         | clean up the presumably broken markup?
         | 
         | My initial attempt would be to use pandoc to roundtrip the
         | files.
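The "zip archive of HTML" idea can be checked with the Python standard library alone. This is a rough sketch, not a substitute for EPUBCheck: it only verifies that `mimetype` is the first, uncompressed entry containing `application/epub+zip`, and that XML-based files parse as well-formed XML. The function names and the in-memory sample archive are hypothetical, and the sample is deliberately incomplete (it has no `META-INF/container.xml`). Real content that uses named entities such as `&nbsp;` would need a more capable parser.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def check_epub_container(data: bytes) -> list:
    """Return a list of problems found in an EPUB given as raw bytes."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        entries = archive.infolist()
        # The container format expects `mimetype` first and stored uncompressed.
        if not entries or entries[0].filename != "mimetype":
            problems.append("mimetype is not the first entry")
        elif entries[0].compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype is compressed")
        elif archive.read("mimetype") != b"application/epub+zip":
            problems.append("unexpected mimetype content")
        # XHTML, OPF and NCX files should at least be well-formed XML.
        for info in entries:
            if info.filename.endswith((".xhtml", ".opf", ".ncx")):
                try:
                    ET.fromstring(archive.read(info))
                except ET.ParseError as exc:
                    problems.append(f"{info.filename}: {exc}")
    return problems


def make_sample_epub(chapter: bytes) -> bytes:
    """Build a tiny EPUB-like archive in memory (sample data for illustration)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("OEBPS/ch1.xhtml", chapter)
    return buffer.getvalue()
```

For a file on disk, `check_epub_container(open("book.epub", "rb").read())` returns an empty list when none of these basic problems are present.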
        
       ___________________________________________________________________
       (page generated 2024-08-23 23:01 UTC)