[HN Gopher] EPUBCheck - The official conformance checker for ePu...
___________________________________________________________________
EPUBCheck - The official conformance checker for ePub publications
Author : auraham
Score : 121 points
Date : 2024-08-23 04:31 UTC (18 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| mdaniel wrote:
| One can also $(brew install epubcheck) if so inclined
| https://formulae.brew.sh/formula/epubcheck#default
| dindresto wrote:
| Or `nix run nixpkgs#epubcheck`
| m101 wrote:
| I'm not sure why but when I download some epubs and try to send
| to my kindle it fails. Only after using an online converter to
| convert them from epub to epub does it then work.
| daveoc64 wrote:
| You might want to try running the EPUB through
| https://kindle-epub-fix.netlify.app/
|
| This applies a few fixes for problems with EPUBs that the Send
| to Kindle service doesn't like.
| sillystuff wrote:
| Maybe a little less hassle than using some web thingy:
| ebook-convert is a CLI application that comes with Calibre, and is
| probably what the online sites are using anyway.
| ebook-convert infilename.epub outfilename.epub
|
| If I get an ebook that works with fbreader, but has issues on
| my nook, the above will fix it.
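For anyone who wants to script the EPUB-to-EPUB pass described above, here is a minimal Python sketch. It assumes Calibre is installed and `ebook-convert` is on the PATH; the function names `build_roundtrip_command` and `roundtrip` are made up for illustration and are not part of Calibre.

```python
import shutil
import subprocess
from pathlib import Path


def build_roundtrip_command(src: Path, dst: Path) -> list[str]:
    """Return the argv for an EPUB-to-EPUB pass through ebook-convert."""
    if src.suffix.lower() != ".epub" or dst.suffix.lower() != ".epub":
        raise ValueError("both paths must end in .epub")
    if src.resolve() == dst.resolve():
        # Hypothetical safety rule: never overwrite the original file.
        raise ValueError("write to a new file so the original is kept")
    # Same shape as the command quoted in the comment above.
    return ["ebook-convert", str(src), str(dst)]


def roundtrip(src: Path, dst: Path) -> None:
    """Run the conversion; raises CalledProcessError if ebook-convert fails."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH (it ships with Calibre)")
    subprocess.run(build_roundtrip_command(src, dst), check=True)
```

Usage would look like `roundtrip(Path("infilename.epub"), Path("outfilename.epub"))`. Whether the rewritten file then works on a given device is not something this sketch can promise.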
| dopa42365 wrote:
| I've used Calibre with the KFX Output plugin for years, never
| had an issue with converting pirated epubs.
| leoc wrote:
| Cutting and pasting an old 2019 comment
| https://news.ycombinator.com/item?id=19944627 :
|
| > There are similar problems with uploading to publishers in ePub
| format. The last time I was bashing my head against ebook
| publishing, about a couple of years ago, many (most? all?) of the
| sites were validating ePub uploads using an old version of the
| ePub suite which rejected some ebooks which were valid per the
| up-to-date validator. Which version they were using was ofc not
| documented, and you were lucky to even get to see an error
| message. And of course tech support was largely unhelpful.
| (Especially kobo.com 's.) The people working on the ePub spec
| seemed to be largely unbothered by the
| fragmentation/noncompliance and hideous experience for those
| authoring and uploading in the format, too.
|
| > Which is a pity, because aside from this and some other bugs
| and pitfalls EPUB 2.0 has some attractive features and is nice to
| work with for anyone who doesn't mind bashing out a good old
| directory tree of HTML docs by hand.
|
| Maybe things are a lot better by now. Here's hoping!
| tannhaeuser wrote:
| Why is this linked now? There's no new release or milestone at
| this time.
|
| Citing my comment from when this was new about eight months ago:
|
| > Even more unfortunate is that this change has already spilled
| to derived standards such as EPUB3 which hence makes existing
| EPUB3 content using compound headings going back to 2011 invalid,
| and EPUB3 writers lacking a tool for actually verifying what
| readers can support (epubcheck was blindly updated without
| consideration for the installed base).
|
| See also the blog post [1] about W3C's most recent HTML spec. The
| lack of HTML backward compat, the gross import of all of CSS
| without profiles or paged media requirements, the deemphasis of
| long-standing EPUB mechanisms in favor of CSS and JS, and the
| general impression of a low-effort, merely editorial nature really
| make EPUB's move to W3C questionable. But nobody seems to care
| anyway; people stick with EPUB 2 and 3.1 (which is also what
| Calibre recommends as the target format for conversion).
|
| [1]: https://sgmljs.net/blog/blog2303.html
| redman25 wrote:
| I worked for a publisher for about 10 years as a typesetter and
| ebook developer. There are a lot of things about the publishing
| industry that are antiquated, especially for non-technical
| publishing companies. Unfortunately it's a low margin business.
|
| Most authors are only familiar with Microsoft Word, so on the
| front end you often have to take a messily styled Word document
| and manually caress it into a structured document that can be
| used for ebooks and print.
|
| For print, a majority of non-technical publishers use Adobe
| InDesign and/or InCopy. Editors edit manuscripts in InCopy and
| typesetters style documents for print. PDFs are generally
| exported and sent to printers via FTP.
|
| For ebooks, every publisher seems to have their own bespoke
| system. You _can_ export books in epub format from InDesign but
| the process for getting a clean ebook is difficult to say the
| least since InDesign was primarily designed for print
| publications. Generally, you end up structuring books for the
| lowest common denominator of ebook platform (epub, kindle, etc.)
| unless you are creating something like a children's book or a
| poetry book where you might do something more custom.
|
| Many publishers use ebook distribution platforms where you upload
| epub, mobi, cover images via FTP. They use an XML standard called
| ONIX for distributing metadata that's unique to say the least...
| breck wrote:
| This is extremely helpful information [0]. Thank you.
|
| [0] I'm currently working on a new language for writing books.
| Finnucane wrote:
| I oversee ebook production for a university press publisher,
| and indeed, our typesetters have to do some pre-processing of
| our authors' Word files before typesetting (we do all editing
| and copyediting in Word), and then post-processing of the
| InDesign output to get acceptable ebook files. There are some
| plugins that will help. InDesign, left to its own devices, will
| give you garbage.
|
| Our sales vendors all check the files with epubcheck. If it
| doesn't pass, they'll bounce it.
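A pre-upload gate along the lines described above could be sketched in Python as follows. This is only an illustration, under two assumptions: an `epubcheck` launcher is on the PATH (for example from the brew or nix installs mentioned earlier in the thread), and a non-zero exit status means the file did not pass. The names `epubcheck_command` and `passes_epubcheck` are hypothetical.

```python
import subprocess
from pathlib import Path


def epubcheck_command(epub: Path, launcher: tuple[str, ...] = ("epubcheck",)) -> list[str]:
    """Build the argv; pass launcher=("java", "-jar", "epubcheck.jar") to use the jar."""
    return [*launcher, str(epub)]


def passes_epubcheck(epub: Path, launcher: tuple[str, ...] = ("epubcheck",)) -> bool:
    """Return True when the checker exits with status 0 (assumed to mean 'passed')."""
    result = subprocess.run(
        epubcheck_command(epub, launcher),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Show whatever the checker reported so the file can be fixed.
        print(result.stderr or result.stdout)
    return result.returncode == 0
```

A build script might call `passes_epubcheck(Path("book.epub"))` and refuse to upload when it returns False. As an older comment quoted in this thread points out, a distributor may run a different epubcheck version than yours, so a local pass is not a guarantee.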
| grecy wrote:
| FWIW I've published a few of my own books. I write them in
| LaTeX to get perfect print-ready PDFs, then pandoc gives me
| flawless epub files, all from the same source.
|
| It works really well
| bhaak wrote:
| Any special LaTeX packages you are using?
| philistine wrote:
| Not OP but I basically do the same thing (except the epub
| part) and the secret is to use XeTeX, since it allows you
| to use modern fonts.
| grecy wrote:
| Here's my process http://theroadchoseme.com/how-i-self-
| published-a-professiona...
| lxgr wrote:
| > You _can_ export books in epub format from InDesign but the
| process for getting a clean ebook is difficult to say the least
| since InDesign was primarily designed for print publications.
|
| I wonder if that's why so many, and even relatively new, ePubs
| feel a lot like poorly OCRed PDFs?
|
| It generally seems like most publishers and I have opposite
| goals when it comes to ePubs: They want them to look and feel
| as much like the physical book as possible (by including custom
| fonts, applying custom margins/padding etc.), while I want
| absolutely none of that.
|
| It's frustrating having to fight the publisher to get something
| readable on a small display or non-Kindle ePub reader, and I
| don't even want to get started on dark mode...
| DiggyJohnson wrote:
| Genuinely thankful for this tool and use it for both sides of my
| non-fiction book project (research: verifying converted or
| misbehaving epubs for use on Remarkable, iPad, Calibre and Kindle
| (I know, I know...)) as well as typesetting and review.
| everybodyknows wrote:
| Last commit was in 2023:
| https://github.com/w3c/epubcheck/commits/main/
|
| 85 open bugs; 9 open PRs.
| KingOfCoders wrote:
| Tried to publish an epub to the German platform Tolino. Tried
| everything: an external service agency, doctoring around in
| Calibre, etc. No success, they didn't accept the epub.
|
| Printed PDFs for decades at print shops, never had a problem.
|
| Why is this such a problem? Because of the HTML? The JS?
| 0cf8612b2e1e wrote:
| I believe epub is just a zip archive of HTML. Could you extract
| the files and then use an HTML validator, which would clean up the
| presumably broken markup?
|
| My initial attempt would be to use pandoc to roundtrip the
| files.
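To make the "zip archive of HTML" idea concrete, here is a rough Python sketch that opens an EPUB with the standard library and reports content documents that are not well-formed XML. It is a first-pass triage under stated assumptions, not a conformance check and not a replacement for epubcheck: it only looks at files with HTML-like extensions, and it will also flag named entities such as `&nbsp;` because the parser does not load DTDs. The function name `find_malformed_xhtml` is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET


def find_malformed_xhtml(epub_file) -> dict[str, str]:
    """Map each content document that fails XML parsing to the parser's message.

    epub_file may be a path or a binary file-like object.
    """
    problems: dict[str, str] = {}
    with zipfile.ZipFile(epub_file) as archive:
        for name in archive.namelist():
            # Assumption: content documents use one of these extensions.
            if not name.lower().endswith((".xhtml", ".html", ".htm")):
                continue
            try:
                ET.fromstring(archive.read(name))
            except ET.ParseError as exc:
                problems[name] = str(exc)
    return problems
```

Calling `find_malformed_xhtml("book.epub")` returns an empty dict when every checked file parses. Anything it reports is a candidate for hand repair before trying a roundtrip through a converter.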
___________________________________________________________________
(page generated 2024-08-23 23:01 UTC)