[HN Gopher] Standard Ebooks: liberated ebooks, carefully produce...
___________________________________________________________________
Standard Ebooks: liberated ebooks, carefully produced for the true
book lover
Author : tosh
Score : 900 points
Date : 2025-04-06 07:36 UTC (15 hours ago)
(HTM) web link (standardebooks.org)
(TXT) w3m dump (standardebooks.org)
| smallnix wrote:
| Awesome project. Gutenberg is mentioned, does this project feed
| back to Gutenberg?
| aegypti wrote:
| Absolutely, from a previous discussion:
|
| https://news.ycombinator.com/item?id=32217313
| miles wrote:
| As the linked comment says, it's up to the individual
| contributor to inform PG of any corrections; SE does not do
| so as a matter of course (at least, that was the case when I
| last contributed).
| opto wrote:
| Looks like a great project, and one sorely needed by people like
| me who find themselves trying to get hold of old books they can't
| get in their local library and that are too expensive to buy
| secondhand.
| carlosjobim wrote:
| The shadow libraries such as Anna's Archive are a treasure
| trove of old books, and you're not breaking any imaginary law
| by downloading old books which are out of copyright.
| zozbot234 wrote:
| If a book is out of copyright you can usually find the scan
| on Internet Archive. No need to look elsewhere at all.
| ZeroGravitas wrote:
| The internet archive's open library will also link to
| Standard Ebooks (and Gutenberg and a few others) if a
| version exists of a book you are looking at e.g.:
|
| https://openlibrary.org/books/OL37044523M/The_Woodlanders
| notpushkin wrote:
| If a book is still in copyright, chances are you'll find it
| there as well.
|
| Scans suck though, even a badly OCR'ed EPUB is way better.
| charcircuit wrote:
| The scans can have a different copyright date than the book
| itself.
| eesmith wrote:
| There is no copyright on scans.
|
| Scanning is not transformative and does not result in a
| derivative work which can be protected by copyright law.
|
| https://en.wikipedia.org/wiki/Wikipedia:Scanning_an_image_d
| o...
|
| https://law.stackexchange.com/questions/1214/who-owns-a-
| copy... points us to read the Compendium of US Copyright
| Office Practices at
| https://www.copyright.gov/comp3/docs/compendium.pdf
|
| > 313.4(A) Mere Copies
|
| > A work that is a mere copy of another work of authorship
| is not copyrightable. The Office cannot register a work
| that has been merely copied from another work of authorship
| without any additional original authorship. See L. Batlin &
| Son, 536 F.2d at 490 ("one who has slavishly or
| mechanically copied from others may not claim to be an
| author"); Bridgeman Art Library, Ltd. v. Corel Corp., 36 F.
| Supp. 2d 191, 195 (S.D.N.Y. 1999) ("exact photographic
| copies of public domain works of art would not be
| copyrightable under United States law because they are not
| original").
| charcircuit wrote:
| A pdf file can contain more than just the raw images of
| the pages.
| eesmith wrote:
| Certainly! If you add my latest Kirk/Spock slash fanfic
| to the end of the text, then that is transformative, so
| the resulting PDF is covered under copyright.
|
| But you wrote "scan". Adding an OCR'ed text layer, or
| doing manual proofreading and layout ("sweat of the
| brow") is not sufficiently transformative to have
| copyright protection.
|
| And we were specifically talking about scans of old books
| stored in shadow libraries.
| mariusor wrote:
| As far as I know Standard gets their raw ebooks from Project
| Gutenberg which has a vastly greater collection of public
| domain works. What they're doing is typesetting them for the
| average reader. But if all you're looking for is just the
| content, Gutenberg is the place to look for ethically clean
| copies.
| LordGronk wrote:
| I would love this if it were to produce viable unabridged ebooks
| of Francis Parkman's "France and England in North America" vol
| 2-7. All the existent digital editions were poorly scanned and
| don't separate footnotes from the main text.
| poidos wrote:
| If you have the cash, you can pay them to do so! Scroll down to
| "SPONSOR A NEW EBOOK":
|
| https://standardebooks.org/donate
|
| > Sponsoring a new ebook of your choice calls for a donation of
| $900 + $0.02 per word over the first 100,000
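The quoted pricing rule can be worked through as a small calculation. This is only a sketch of the arithmetic as quoted above, assuming the truncated sentence means "per word over the first 100,000 words"; the function and constant names are made up for illustration and the donate page remains the authority on actual pricing.

```python
# Sketch of the quoted sponsorship rule: $900 plus $0.02 for each word
# beyond the first 100,000. Names here are illustrative, not from the site.
BASE_CENTS = 900 * 100          # flat donation, in cents
FREE_WORDS = 100_000            # words covered by the flat donation
CENTS_PER_EXTRA_WORD = 2        # $0.02 per additional word


def sponsorship_cost_usd(word_count: int) -> float:
    """Return the donation in US dollars for a book of `word_count` words."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    extra_words = max(0, word_count - FREE_WORDS)
    # Work in integer cents to avoid floating-point rounding surprises.
    return (BASE_CENTS + extra_words * CENTS_PER_EXTRA_WORD) / 100
```

Under this reading, a book at or below 100,000 words costs the flat $900, and each further 50,000 words adds $1,000.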
| squigz wrote:
| I love this project and don't want to disparage the work that
| goes into it, but 900 USD, and it has to be a book _that is
| already transcribed online_? That seems a bit much to me.
| eesmith wrote:
| That sounds quite reasonable to me. That's about what a
| freelance proofreader charges to edit a book, if
| https://thewritelife.com/how-much-to-pay-for-a-book-editor/
| is correct, and that's working with a (likely Word)
| document which isn't poorly scanned from paper.
| hombre_fatal wrote:
| You're paying a human to remaster the book word for word
| and hand transform it into epub html paragraph by
| paragraph.
|
| How much less would you do it for?
| carlosjobim wrote:
| If you pooled the funds with 10 other people who want the
| book, it would be $90 each. Or imagine pooling it with 100
| people.
| acabal wrote:
| You can also join our Patrons Circle to have this book added to
| our Wanted Ebooks list, which is a list of suggestions for our
| volunteers to work on:
| https://standardebooks.org/donate#patrons-circle
| tailspin2019 wrote:
| I love this. They pay attention to everything I normally despise
| about (many) ebooks (poor layout, lack of metadata, no chapter
| headings etc).
| jimnotgym wrote:
| Is there anything similar for Audiobooks (which I wish would go
| back to being called Talking Books)
| kybernetikos wrote:
| Librivox https://librivox.org/ is the closest I know.
| cdrini wrote:
| I would also recommend using Microsoft Edge's built-in
| ReadAloud (TTS) on standard ebooks. They have a mind boggling
| number of hyper realistic voices; more than any other browser
| I've tested.
| mentalgear wrote:
| Beautifully made! Wish gutenberg.org would be updated with this
| design & approach!
| ssttoo wrote:
| I recently started on my first title contribution to the project,
| it's a rewarding experience https://github.com/stoyan/edith-
| wharton_the-custom-of-the-co... It's HTML all the way down
|
| The step-by-step:
| https://standardebooks.org/contribute/producing-an-ebook-ste...
|
| In a nutshell: start with a Project Gutenberg text, clean it up
| to a high standard, have it peer reviewed and published
| Touche wrote:
| Love this. So many in the archivist community are only
| interested in preservation and don't care at all about making
| the material accessible. Love to see a project like this
| prioritizing the latter.
| stog wrote:
| You're spot on with this. I recently converted a local
| history book from 1911 to Markdown, ePub and HTML and tracked
| the changes on GitHub. Only a handful of copies of this book
| exist in physical form and it has been photo copied (which is
| great).
|
| However, I was completely shot down by the local library when
| I was discussing it with them. They said they already had a
| photo copy and didn't need any more digital editions. I tried
| to explain the benefits of having it in a machine-readable
| format but they wouldn't entertain it. I completed the
| project for me, so I wasn't too bothered, but thought they
| might have been interested in archiving it but they weren't.
|
| My general feeling is that they didn't like an outsider
| contributing and touching on a format they didn't know so got
| slightly defensive.
| simpaticoder wrote:
| Interesting. I wonder if libraries suffer a supply-chain
| risk and so avoid taking contributions from (non-vetted)
| individuals? I imagine that over time a library gets lots
| of offers to take "important works of literature" from
| cranks, and perhaps they've developed this culture to
| protect them from that. Pure speculation, of course.
| badlibrarian wrote:
| Libraries typically don't even accept print books or
| CDs/DVDs. If there's a donation bin outside it probably
| isn't even theirs. And if stuff actually winds up with
| them, it just gets sold off so they can purchase material
| via vetted channels.
|
| https://www.betterworldbooks.com/go/donate
| pajop wrote:
| can you share the links to your project?
| badlibrarian wrote:
| Find an archive and make sure they're aware of the work
| you've done. Archivists always love meeting people who've
| done good work in the space they're in. Especially when
| they have some tech chops which is desperately lacking in
| the space.
|
| Beyond that, if the material is public domain, that library
| is called The Internet. Post it and promote it. The only
| reason to seek association with a library is if you're
| looking for cred for some reason, and that's not the
| business they're in.
|
| If it's not public domain, or if you haven't marked your
| derivative work public domain, then you put a library in an
| awkward position. Realize that these are the types of
| people who still post little notes by the copy machines
| saying what's permissible and enjoy policing it.
|
| Most just say no for the same reason that Hollywood returns
| ideas and scripts unopened. They're busy and the
| cost/benefit isn't there.
|
| Although the self-described online ones tend to play fast
| and loose, real librarians have a formal code of ethics
| which is worth reviewing.
|
| https://www.ala.org/tools/ethics
| raybb wrote:
| Thanks for doing this. We need more people to take
| initiative like this!
| frereubu wrote:
| Do you "claim" a book, to make sure that no-one else is trying
| to work on the same book? I presume that's part of step 4 in
| your link, given that it would be heartbreaking to get 90% of
| the way through and then be beaten to it by someone who'd
| started at roughly the same time!
| contact9879 wrote:
| Yes, you signal your intent on the mailing list subject to
| approval by the editor-in-chief
| ssttoo wrote:
| Exactly, you do get approval before you start, as step 4
| says: https://standardebooks.org/contribute/producing-an-
| ebook-ste...
|
| In my case I picked a title from the project's wishlist and
| almost started but searching the mailing list showed that
| someone had just started. I found another title by the same
| author: https://groups.google.com/g/standardebooks/c/IP0emh
| SQ6Bw/m/B...
| miles wrote:
| Some of the higher ranking previous discussions:
|
| 2017, 441 points, 97 comments
| https://news.ycombinator.com/item?id=14570035
|
| 2019, 820 points, 131 comments
| https://news.ycombinator.com/item?id=20594802
|
| 2022, 1578 points, 256 comments
| https://news.ycombinator.com/item?id=32215324
|
| 2024, 701 points, 154 comments
| https://news.ycombinator.com/item?id=38831219
| kpjas wrote:
| How about https://en.wikisource.org/wiki/Main_Page ?
| grues-dinner wrote:
| It's not very obvious, but Wikisource provides EPUBs via the
| Tools menu for every book.
| BoingBoomTschak wrote:
| What a great project! This should really be funded by states,
| states which often already have some money dedicated to the
| preservation of culture.
|
| Too bad most stuff I really like will never enter the public
| domain in my lifetime... well, paper and the high seas still
| exist!
| contact9879 wrote:
| it's never too late to expand your "stuff I really like" further
| into the public domain!
|
| there are whole generations of wonderful and insightful works
| that essentially disappeared from present consciousness for no
| reason other than for being old
| thfuran wrote:
| It would be better to expand the public domain. Whole
| generations of works were stolen by extensions of copyright.
| contact9879 wrote:
| while I don't disagree, ¿por qué no los dos? (why not both?)
| Sverigevader wrote:
| It's thanks to this site that I learned that Kobo uses a really
| bad renderer for epubs unless converted to their own ebook format
| (Kepub). It makes a huge difference in appearance and performance
| on a Kobo device.
|
| https://standardebooks.org/help/how-to-use-our-ebooks#kobo-f...
| RVuRnvbM2e wrote:
| Wow I never knew this!
| robin_reala wrote:
| Yeah, if you just load normal epubs it defaults to an old
| version of Adobe Digital Editions unfortunately.
| wyclif wrote:
| Yes, though I understand Kobo is working on correcting
| these issues with the epub format.
| crashingintoyou wrote:
| Are they? Where have you heard that?
|
| Recently Calibre was updated to convert things to kepub
| when loading to Kobo devices - see
| https://www.omgubuntu.co.uk/2025/03/calibre-update-
| convert-k... - but I haven't heard anything about Kobo itself
| doing anything to improve this.
| crtasm wrote:
| I assume KOReader has a better renderer for epub but will have
| to test how it compares to the stock software+kepub. So far
| I've only used KOReader on my device.
| contact9879 wrote:
| the only issues i've found with koreader is its default
| margin size and its display of standard ebooks' titlepages
| but (I believe) these can be fixed with a fairly simple user
| tweaks css
| _emacsomancer_ wrote:
| You can set default margins in the user interface of
| KOReader too.
| stog wrote:
| I discovered this too. However, I now use Plato Reader on my
| Kobo with standard ePub and it's lovely.
| Uvix wrote:
| You don't even have to convert it, just rename the extension to
| .kepub.epub. https://github.com/kobolabs/epub-spec?tab=readme-
| ov-file#sid...
| acabal wrote:
| This is not _entirely_ correct - Kobo also expects a bunch of
| special <span>s inserted for things like highlighting and
| page numbers to work.
|
| It kills me that Kobo is _so close_ to having plain epubs
| rendered with Webkit but for some reason they just won't
| take the leap!
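The rename Uvix describes can be sketched in a few lines. This is a minimal illustration that only copies the file under the `.kepub.epub` name; as acabal notes, it does not add the extra `<span>`s Kobo expects, so features like highlighting may not work. The function name is hypothetical.

```python
import shutil
from pathlib import Path


def copy_as_kepub(epub_path) -> Path:
    """Copy `book.epub` to `book.kepub.epub` next to it and return the new path.

    Only the file name changes; the contents are copied byte for byte.
    """
    src = Path(epub_path)
    if src.suffix.lower() != ".epub":
        raise ValueError(f"expected an .epub file, got {src.name!r}")
    if src.name.lower().endswith(".kepub.epub"):
        return src  # already carries the Kobo-specific extension
    dest = src.with_name(src.stem + ".kepub.epub")
    shutil.copyfile(src, dest)  # keep the original untouched
    return dest
```

A proper conversion with a tool such as kepubify or Calibre, both mentioned in this thread, is probably the safer route when the Kobo-specific markup matters.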
| lazyeye wrote:
| You can use kepubify to convert epubs to kepubs (and calibre
| will do this as well)
|
| https://pgaskin.net/kepubify/
| coopykins wrote:
| I found it curious that if you order the books by reading difficulty
| (easier to harder), The Sound and the Fury is in second place.
| acabal wrote:
| We use the Flesch-Kincaid algorithm to calculate reading ease.
| For most books it works pretty well, but for avant-garde prose
| like _The Sound and the Fury_ it fails pretty badly. It also
| considers _Ulysses_ to be "fairly easy"!
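To see why a formula like this can misjudge experimental prose, here is a rough sketch of the Flesch Reading Ease score, which depends only on sentence length and syllables per word. It is an assumption that this is the exact variant Standard Ebooks uses, and the syllable counter is a crude vowel-group heuristic written for this example, not the project's code.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic: count vowel groups, dropping a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean 'easier' text.

    score = 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )
```

Nothing in the formula looks at meaning or structure, so prose built from short, common words in short sentences scores as easy no matter how hard it is to follow.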
| acabal wrote:
| Editor-in-chief here, happy to answer any questions, as always.
| We also recently celebrated Public Domain Day with an especially
| notable crop of books, including _The Sound and the Fury_, _All
| Quiet on the Western Front_, John Steinbeck's first novel, some
| Hemingway, Gandhi, two Dashiell Hammett novels, and more:
| https://standardebooks.org/blog/public-domain-day-2025
| Erlangen wrote:
| Hi, Alex. Is there any way to browse the ebooks filtered by
| language? I tried to find some texts in French, but there don't
| seem to be any.
| LtWorf wrote:
| Same for me. I think it's English only.
| acabal wrote:
| Standard Ebooks only works on English-language books, as
| typography varies between languages and we're only experts in
| English.
| philistine wrote:
| I can tell you there is a lot of appetite for other
| languages. I looked at the project and the amount of stuff
| that would need to be rewritten to work with multiple
| languages was daunting. I would consider working on making
| your documentation and workflow functional with multiple
| languages.
| acabal wrote:
| Lots of people have tried similar projects in other
| languages but as far as I know none have persevered.
|
| Personally I think it's important to have one person in
| charge who is able to approve of the quality of all the
| project's output; for now, at SE, that person is me and
| I'm only an expert in English.
| colonwqbang wrote:
| Project Runeberg seems to be still going after 30-odd
| years.
| robin_reala wrote:
| Project Runeberg is trying to be a nordic Project
| Gutenberg, not a nordic Standard Ebooks.
| loloquwowndueo wrote:
| Which ebook reader works well with standard ebooks in 2025?
|
| (More concretely my reader is a 2nd-gen kindle which is
| basically useless these days and I'd love an idea of something
| that can display standard ebooks with all their advanced
| formatting)
|
| Thanks!
| wyclif wrote:
| A Kobo would be a great choice. I use a Kobo Libra 2 and love
| it a lot more than my old Kindle Paperwhite that got stolen:
| https://gl.kobobooks.com/products/kobo-libra-2 The Kobo Sage
| is also good because it has an 8" screen.
|
| Standard Ebooks offers kepub files for Kobo devices; those
| files use Kobo's more advanced Webkit-based renderer:
| https://standardebooks.org/help/how-to-use-our-
| ebooks#kobo-f...
| loloquwowndueo wrote:
| What did you do with purchased books you had in your
| kindle? Rebuy them? Just "let them go"?
|
| Thanks for the recommendation!
| wyclif wrote:
| Fortunately, I had them backed up to a cloud folder. I
| remember almost deciding not to go to the trouble to back
| them up, but isn't that how it always works with backups?
| The Kobo also works with epub.
| acabal wrote:
| I read on an old Kobo, using Kepub files. Their Kepub
| renderer is quite good.
|
| I think Kindle's renderer hasn't changed significantly for
| many years, and it had always been pretty bad. I always say
| that Kindle seems to have been created by people who hate
| books.
|
| The best renderer around is iBooks on an iPad, which as far
| as I can tell uses an up-to-date Webkit.
| loloquwowndueo wrote:
| Thanks! I don't like reading on a backlit screen (hurts the
| eyes) so iPad is a no-go, but a kobo would probably work!
| CarterATX wrote:
| Kobo Libra 2 is a great e-reader. Works well one-handed
| (screen rotates for left/right hands), has buttons for
| page turns. Integrates with Overdrive (what Libby uses).
| Drawbacks are Kobo's bookstore is weaker than
| Amazon/Apple. Screen is also not flush which means some
| dust can collect in the recess.
| _emacsomancer_ wrote:
| I'd suggest KOReader, on various devices, as the best
| renderer and interface.
| turrican wrote:
| A note for Kobo users: a lot of us (myself included) use
| Calibre to manage and upload our ebooks. Something about
| Calibre messes up Kepub files and strips out a lot of the
| formatting (including the book's cover).
|
| If I want to appreciate a nice Kepub from Standard Ebooks, I
| upload it directly to the Kobo.
| kps wrote:
| Piggybacking: for _computers_, what is a good epub viewer?
|
| What I'm personally looking for:
|
| - Linux and/or OS X
|
| - No 'import' requirement (a viewer, not a collection
| manager)
|
| - Single page _or_ continuous (no forced double spread)
|
| - No required animations
|
| - At least basic control over font size, spacing, margins.
|
| - Keyboard navigation (at least next/previous page)
| carlosjobim wrote:
| OS X: FB Reader
| boredhedgehog wrote:
| Alexandria.
| buu709 wrote:
| For Linux, Foliate is very nice.
| skydhash wrote:
| That's calibre viewer, but it may require some
| customization to get something nice. Foliate is ok, but
| it's a library. i'd say that's OK because epub is a zip
| file and you need to extract it to read it.
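The "epub is a zip file" point can be checked with Python's standard `zipfile` module, which can list and read entries without unpacking the archive to disk. The sketch below builds a tiny in-memory stand-in with a few entries; it is not a complete, valid EPUB (there is no package document), and the chapter file name is invented for the example.

```python
import io
import zipfile


def build_demo_archive() -> bytes:
    """Build a tiny zip laid out like the start of an EPUB container."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # EPUB containers start with an uncompressed `mimetype` entry.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", "<container/>",
                    compress_type=zipfile.ZIP_DEFLATED)
        # Hypothetical content file, standing in for a real chapter.
        zf.writestr("chapter-1.xhtml", "<p>Hello, reader.</p>",
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def list_entries(data: bytes) -> list:
    """Return the entry names inside a zip-based ebook held in memory."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


def read_entry(data: bytes, name: str) -> str:
    """Read one entry as UTF-8 text without extracting the archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.read(name).decode("utf-8")
```

For a real file, `zipfile.ZipFile("book.epub")` works the same way, which is roughly what a viewer without an import step has to do each time it opens a book.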
| tehnub wrote:
| Apple Books on macOS is pretty nice
| jzb wrote:
| Check out Foliate, it's a really nice reader and Standard
| Ebooks display quite nicely using Foliate IMO.
| carlosjobim wrote:
| My Kindle is 8 years old and works excellent with standard
| ebooks. I think you can select any device that you prefer and
| it will be good.
| rodolphoarruda wrote:
| For Android, Moon Reader Pro.
|
| Unmatched UI tweaking features which make reading a pleasure.
| Syncs bookmarks with cloud services, thus across different
| devices.
| bodantogat wrote:
| Is there an API or downloadable catalog of the titles? Happy to
| feature them on meetnewbooks.com so more readers can find them.
| acabal wrote:
| Yes, we have complete feeds available for our Patrons:
| https://standardebooks.org/feeds
| sbarre wrote:
| What's the point of including books that aren't public domain
| yet in your collections?
|
| It makes it hard to browse those collections to find actual
| books to read. The first 3 series I clicked on all said "not
| P.D." (which at first I didn't know what "P.D" meant - remember
| your audience does not have your level of familiarity with your
| context, perhaps a tooltip on that badge would help)..
|
| Then I see "this book will enter public domain in 2050"..
|
| I commend you for this project, it's really awesome work.. From
| a user's experience, it would be great to have a filter on your
| various lists that restricts only to books that are available,
| and excludes these books that are not yet in your collection.
| robin_reala wrote:
| Whenever we add a collection, the books that are in that
| collection but not yet in PD in the US get placeholders. But
| a filter might not be a bad idea.
| acabal wrote:
| In addition to what Robin mentioned below, some of these
| placeholders are for books on our Wanted list. I also think
| it's useful to show readers that particular books are looking
| for volunteers to produce, and also to show that some books
| they might want are locked away by copyright for possibly
| decades. In that sense it's partly a political message.
| salviati wrote:
| It sounds like implementing the filter gp suggested would
| still send the political message though.
| frereubu wrote:
| I love this. However, I couldn't find an alphabetical list of
| authors, which is the way I wanted to browse on my first visit.
| Instead my only option is to show 48 on a page and paginate
| through, which is tedious. I know there are author pages - e.g.
| https://standardebooks.org/ebooks/william-makepeace-thackera...
| - so I presume it's feasible. An author index would
| significantly increase my likelihood of understanding what's
| available and engaging with the content.
| acabal wrote:
| We don't have a list of authors yet, but that's a good idea
| to add!
| Kye wrote:
| You could reuse whatever process generates the sitemap:
| https://standardebooks.org/sitemap
|
| All the author pages come before any pages with books from
| those authors.
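One way to turn a sitemap-style URL list into an author index is to keep only the URLs with a single path segment after `/ebooks/`. This is a sketch under the assumption that author pages look like `/ebooks/<author>` and book pages add further segments, which matches the links quoted in this thread but is not confirmed by the project; the example URLs in the usage note are placeholders.

```python
from urllib.parse import urlparse


def author_slugs(urls):
    """Return sorted, de-duplicated author slugs from a list of page URLs.

    A URL counts as an author page when its path is exactly
    /ebooks/<slug>; deeper paths are treated as book pages and skipped.
    """
    slugs = set()
    for url in urls:
        parts = [p for p in urlparse(url).path.split("/") if p]
        if len(parts) == 2 and parts[0] == "ebooks":
            slugs.add(parts[1])
    return sorted(slugs)
```

For example, `author_slugs(["https://example.org/ebooks/author-b", "https://example.org/ebooks/author-a/some-title"])` keeps only `author-b`, since the second URL is a book page.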
| homebrewer wrote:
| https://standardebooks.org/bulk-downloads/authors
|
| Links in the first column.
| frereubu wrote:
| Another question - in
| https://standardebooks.org/contribute/producing-an-ebook-ste...
| you talk about "modernising" spelling, e.g. changing "some one"
| to "someone". This may be against the implicit goal of making
| these accessible for a general reader, but I prefer to read
| what was originally written, and it feels like it crosses a
| line into editorialising rather than letting the original feel
| stand as-is. (Although of course these texts have already been
| "editorialised" by their original editors!) Totally your
| decision given the amount of effort that has clearly gone into
| this, but I'd be interested to read the rationale for that
| decision.
| acabal wrote:
| That's fine! Our editions didn't erase any of the other
| editions you can find online and in print. You're more than
| welcome to select any edition that fits your reading
| preferences.
| frereubu wrote:
| Apologies if that came across as at all critical. Genuinely
| interested in the rationale rather than it being a how-
| dare-you demand for you to explain yourself!
| acabal wrote:
| Spelling varies widely across the eras our ebooks were
| published in. Therefore we attempt to standardize
| spelling to what a modern reader might be familiar with.
| We only make sound-alike changes, like to-morrow ->
| tomorrow.
|
| This is a common practice that editors and publishers
| have quietly engaged in for centuries. For example, today
| you are not reading Shakespeare in the way it was spelled
| in its first printing.
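A sound-alike modernization pass of the kind described can be sketched as a word-list substitution. The two entries below are the examples given in this thread; the code is an illustration only, not the project's actual tooling, and a blind replacement like this would still need human review (for instance, "some one thing" should not become "someone thing").

```python
import re

# Archaic spelling -> modern spelling. Only the examples from the thread.
REPLACEMENTS = {
    "to-morrow": "tomorrow",
    "some one": "someone",
}

_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(old) for old in REPLACEMENTS) + r")\b",
    re.IGNORECASE,
)


def modernize(text: str) -> str:
    """Replace listed archaic spellings, keeping an initial capital."""
    def swap(match):
        old = match.group(0)
        new = REPLACEMENTS[old.lower()]
        return new.capitalize() if old[0].isupper() else new

    return _PATTERN.sub(swap, text)
```

Because the mapping only changes spelling and never the sound of a word, the text reads aloud the same before and after.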
| cenamus wrote:
| And you're for sure not speaking it like he would have
| frereubu wrote:
| Fair enough - thanks for the explanation.
| wpollock wrote:
| A wonderful project!
|
| After reading this comment I couldn't help but picture
| medieval monks, toiling away copying old manuscripts into
| "modern" English. Normally a thankless task, so thank
| you!
| Alive-in-2025 wrote:
| I appreciate this service you are doing, but it would be
| much much better to also have an original version with
| archaic spelling. Double bonus points for having optional
| (hidden by default) explanations of words. This would be
| tremendously helpful to some students.
| idoubtit wrote:
| I respect this choice of modernization, and I suppose some
| readers enjoy it, but it makes the publisher's whole work
| useless to me. When a text has been altered, I can't trust it
| respects the intent of the author, and any style
| inconsistency I find may be a by-product of the publisher's
| mangling.
|
| So, when I care about a book, I never read Standard Ebooks'
| edition.
|
| By the way, the modernization is more than joining a few
| words. Sometimes, Standard Ebooks replaces the word used at
| the time the book was written. For instance:
| This time, however, the mountain was going to
| [-Mahomet;-]{+Muhammad;+}
|
| The previous quote is from Galsworthy's "Forsyte Saga". The
| author used many French words and French spellings - like
| "Tchekov" for the Russian playwright that was living in
| Paris. These subtleties are lost with the _modernization_.
|
| I also think some alterations are plain mistakes. For
| instance, in the same book:
| if she wanted a good book she should read
| [-"Job"-]{+Job+}; his father was rather like Job while Job
| still had land.
| KennyBlanken wrote:
| Anyone who has read books for classes in high school and
| above knows that even classics are routinely fucked with by
| publishers. Even early in the work's history. I remember
| even in middle school someone would invariably end up with
| a different publisher's edition of a book for summer
| reading or whatnot and we'd find changes.
|
| Unless the book is specifically declared to be the original
| text - and it may have to specify _which_ original text -
| they're going to be edited.
|
| However, in electronic form it should be possible to
| include both in one file, or two files with the original in
| a repo branch once all the document structure stuff has
| been added. That text will never change, so merging
| formatting-only changes should be pretty painless.
| greenie_beans wrote:
| ooo tempted to reprint faulkner as part of a small press,
| thanks for the idea
| fauria wrote:
| Roughly speaking, how long does it take you to produce a single
| ebook?
| contact9879 wrote:
| it varies widely depending on the length and type of book and
| how much free time the volunteer has to devote to it
|
| Anywhere between 1 week for the simplest (straight narrative,
| not too much verse or endnotes) and ~1 year (thousands of
| endnotes, pages of verse, drama, in-line references to book
| titles, use of technical terms, etc)
| acabal wrote:
| Once you're very familiar with the process, you could get a
| draft of a basic prose novel ready for proofreading in a few
| hours. Then it has to be proofread and completed.
|
| Beginners, and people working on more advanced books, can
| take much, much, much longer.
| htunnicliff wrote:
| I'd love to know more about the pattern of keeping each book in
| individual repos, rather than in a singular repo.
| remus wrote:
| Presumably to keep the repo size reasonable. Say I want to
| make an ad hoc contribution to a book, if step 1 is "download
| this multi-gigabyte repo" then that's a fairly big hurdle.
| acabal wrote:
| Each repo is a history of the ebook including editorial
| changes, typos fixes, and the like. Having a single repo
| containing thousands of ebooks and their histories would be
| pretty annoying to browse.
| jayanmn wrote:
| I am from India. Could you add local UPI based donation option
| at some point? Not everyone has card here.
| mourner wrote:
| Wonderful project! One thing I wish the website would have is
| being able to find the right book to read out of this enormous
| list -- e.g. showing / sorting by Goodreads ratings (which I
| realize you might not want to do), or at least having some kind
| of a "Featured" section with the most critically acclaimed /
| must read books of the project on one page.
| theyinwhy wrote:
| Great work! Gutenberg project books have always been a pain to
| read. Thank you for caring!
| agiacalone wrote:
| Been using Standard Ebooks for a while now, but wanted to drop
| by here and say how great this site is! It's replaced P.G. for
| me (for whatever is on this site, at least) and I like the much
| nicer formatting on the texts. It's great on both my physical
| Kindle and Apple Books on my iPhone.
| virtualritz wrote:
| That website is hopefully not an indication of how these ebooks
| will look on my mobile.
|
| A screenshot from the typography section:
|
| https://ibb.co/nqhyTR3M
| contact9879 wrote:
| if you're reading a style manual it might :)
|
| but no, the manual itself is not really mobile-friendly. you
| can check what an actual ebook would look like though:
|
| https://standardebooks.org/ebooks/louis-couperus/the-tour/al...
| virtualritz wrote:
| Much too tight leading for a book text.
|
| This is a leading you'd see on the ingredients list of an
| energy bar packaging.
|
| The other choices are fine.
|
| Caveat: I studied typography and worked in that field for a
| decade.
| contact9879 wrote:
| the online view is not the primary way readers are expected
| to read the ebooks. downloading the epub and reading on an
| ereader (edit: where line height and font size are
| customizable) is the expected and best supported method
|
| however, contributions are very welcome and everything is
| hosted on GitHub if you'd like to suggest improvements; or
| send your thoughts on the mailing list
| SamBam wrote:
| But if they have an online view, why not make it
| readable? The suggestion above about the line height is
| presumably a 1-line CSS change.
| contact9879 wrote:
| presumably, which is why i encouraged submitting a note
| to the mailing list or the standardebooks/web repo on
| github
| virtualritz wrote:
| I think the point of parent was that the issue, the too
| narrow leading, is not a change that needs debating. On a
| mailing list, issue tracker or whatever.
|
| Or if you think it actually was, this was not a project
| that I'd want to get involved in.
|
| As someone who reads mostly ePubs, many of which suffer
| from issues this project aims to fix, I mean that in a
| very caring way.
| contact9879 wrote:
| i also don't think it needs debating. my point was that
| the issue, the too narrow leading in the online view, is
| just not going to be fixed unless someone points it out
| to someone that can fix it. if that's you, great! you can
| submit a PR to the git repo. or, if you don't have the time
| or want to have to go find where the line height is
| defined, submitting a comment to the mailing list or
| noting it on the issue tracker will let a volunteer fix
| it
|
| from my own experience, Alex is very amenable to
| improvements. the online view of the ebooks is just not
| used by probably anyone to actually read the books (just
| use an ereader app or device, it's a way better experience
| anyway) and because of that no one has cared to point it
| out until now
| acabal wrote:
| The manual has some known issues on mobile, I believe there's a
| GitHub issue open about it. It's low priority because the
| manual is rarely read on mobile. PRs welcomed!
| Animats wrote:
| Most of the big print-on-demand companies will now make
| hardcovers, for about $10. You can't feed raw Gutenberg files
| into those mills, but these "standard ebooks" have enough
| formatting info for that. So that would be a useful service.
| m-hodges wrote:
| What are some examples of companies that do this?
| pmarreck wrote:
| It surprises me that the eBook (clarification: epub) format is
| basically XHTML because 1) that means that every eReader needs to
| basically be a web browser 2) this sounds like it would make
| reformatting for different devices NOT easier
| contact9879 wrote:
| this also somewhat surprised me at first but I think it's
| obvious in hindsight, though they don't have to be a full-blown
| web browser (you can go read the epub specs at W3C to see
| what's supported)
|
| as for (2) I'm not sure why you think it would make it less
| easy? being html, text reflows automatically based on screen
| size, font size, line height, etc
| pmarreck wrote:
| I guess I assumed that, for example, multi device support on
| websites for various device widths entails a bunch of CSS,
| which means the epub renderer would have to also do that,
| which basically means a whole web browser.
|
| also that things like footnotes or anything that has a
| floating reference (table of contents links for example)
| might get very complex or require javascript
| contact9879 wrote:
| since ebooks are primarily (only?) text you don't have to
| worry about UI elements and such which simplifies a lot of
| the css
|
| footnotes aren't really a thing with ebooks (at least as
| far as displaying the note on the page with the text).
| Because it is just a html renderer, footnotes are presented
| as mutual <a> elements located in the endnotes at the end
| of the book
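|
| A minimal sketch of what "mutual <a> elements" means in practice:
| the reference in the text links to the note, and the note links
| back to the reference. The file names, ids, and helper names below
| are hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (id, href) pairs from every <a> element in a fragment."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            self.links.append((attributes.get("id"), attributes.get("href")))


def collect(markup):
    parser = LinkCollector()
    parser.feed(markup)
    return parser.links


# Hypothetical fragments: a note reference in a chapter, and its endnote.
CHAPTER = '<p>Some text.<a id="noteref-1" href="endnotes.xhtml#note-1">1</a></p>'
ENDNOTES = '<li id="note-1"><p>The note. <a href="chapter-1.xhtml#noteref-1">back</a></p></li>'


def is_mutual(chapter, endnotes):
    """True if every reference id in the chapter is the target of a backlink."""
    reference_ids = {link_id for link_id, _ in collect(chapter) if link_id}
    backlink_targets = {
        href.split("#", 1)[1]
        for _, href in collect(endnotes)
        if href and "#" in href
    }
    return reference_ids <= backlink_targets
```

| Because both directions are plain links, no javascript is needed;
| the reader only has to follow an href.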
| badsectoracula wrote:
| Yeah (i guess you mean epub), though in practice readers
| support only a tiny subset and epubs avoid using anything
| fancier than basic XHTML. Epubs that try to use fancy stuff
| (like most CSS outside of setting fonts - that readers can
| ignore either because they do not support it, or because the
| user wants to use another font) tend to not display correctly.
| acabal wrote:
| It makes a lot of sense when you recall that HTML and its
| ancestors were designed to mark up and format _documents_,
| i.e. books. One of the most fundamental elements is <p>, which
| stands for... paragraph.
|
| Each renderer differs in capabilities, and most are stuck in a
| subset of early-2000s capabilities, so designing an ebook is
| very much like designing for the 90s era web. Lots of hacks are
| required to get the same file to look good on many different
| renderers, and achieving that is one of the goals of Standard
| Ebooks.
| hombre_fatal wrote:
| Including a web browser seems a lot easier and simpler than
| coming up with your own rendering system once you want to
| support a feature set past the trivial.
|
| Also, xhtml is just markup. It doesn't mean you have to support
| all the possible tags and styles of modern html and css. It
| would be a sensible choice even if you had basic needs. You
| just parse it into whatever representation you want.
| carlosjobim wrote:
| The greatest surprise is that no popular web browser opens
| ePubs natively! This in 2025, where they all display PDFs, high
| resolution video, 3D games, etc.
| robin_reala wrote:
| Edge used to, until MS rebuilt it on top of Chromium. Shame.
| mjmas wrote:
| Yes, and that was a great viewer too. Having the whole book
| laid out horizontally rather than vertically was a good
| idea.
| sandreas wrote:
| What I'm missing in modern ebooks (like epub format) is more
| metadata. Who's talking (character data)? What emotional aspects
| does the scene have (angry, happy, sad, in a hurry)? Where does
| the conversation take place (geodata)?
|
| I'd love to see at least:
|
| - character: ID, Name, Gender, Age
| - mood: ID, Name (Happy, Sad, Angry, ...)
| - place: ID, Name, Acoustic (Outside, Inside, Cave, ...)
|
| This could be prepared by the author, work as a glossary, enrich
| the whole ebook experience and also would be a great preparation
| to teach AI voices how to convert a book into an audiobook.
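|
| As a rough sketch only, the fields listed above could be modelled
| like this. The Passage type and the speakers helper are assumptions
| added for illustration; nothing like this exists in the epub format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Character:
    id: str
    name: str
    gender: str
    age: int


@dataclass(frozen=True)
class Mood:
    id: str
    name: str  # e.g. "Happy", "Sad", "Angry"


@dataclass(frozen=True)
class Place:
    id: str
    name: str
    acoustic: str  # e.g. "Outside", "Inside", "Cave"


@dataclass(frozen=True)
class Passage:
    """Hypothetical unit of text annotated with the metadata above."""
    text: str
    speaker: Character
    mood: Mood
    place: Place


def speakers(passages):
    """Return speaker names in order of first appearance."""
    seen = []
    for passage in passages:
        if passage.speaker.name not in seen:
            seen.append(passage.speaker.name)
    return seen
```

| A glossary view would then just be a lookup over these records,
| e.g. listing every character who speaks in a chapter.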
| acabal wrote:
| TEI is something like that, but the amount of effort required
| to mark a book up like that would be astronomical.
| xondono wrote:
| Starts to sound like the kind of task an AI could do
| reasonably well though
| kec wrote:
| If the goal of these tags is metadata for AI consumption,
| and the solution to generate them is "use an AI"... what is
| the point?
| roskelld wrote:
| Specialization I presume, so one produces the metadata
| that can be consumed by another.
|
| Also, the thing from the above post that stood out to me
| would be to act as a reminder for the reader. Not so much
| the location and emotion, but the character data. I've
| often found myself wondering who the character is that's
| appeared in a scene, forgetting that they previously
| appeared earlier.
| hombre_fatal wrote:
| If it can be derived from the book text, then LLMs or readers
| can already derive it.
|
| If it can't be derived from the book text, then it's extra
| content that probably shouldn't be there because it came from
| elsewhere.
| huhkerrf wrote:
| What's the point of reading a book, then? The joy of reading
| fiction is to try to understand the humanity in the scene. I
| don't need the author to force feed me all of these details. I
| want to wrestle with the answers, to try to grasp what it might
| mean.
| mjmas wrote:
| That sounds like you are asking for a play.
| llm_nerd wrote:
| A good initiative, but the "us vs them" framing -- where the
| "them" are other people trying to do a service for people --
| gives off bad juju. It positions the value proposition by
| seemingly denigrating other providers of free ebooks.
|
| It begins with "Other free ebooks don't put much effort into..."
| which sounds extremely catty.
|
| Maybe I'm reading too much into it, but it seems there's a way to
| stand on other people's shoulders and celebrate each other.
| kseistrup wrote:
| I love Standard Ebooks.
|
| See also Global Grey ebooks: https://www.globalgreyebooks.com/
| One woman has formatted hundreds of ebooks herself.
| konstantinua00 wrote:
| Forbidden You don't have permission to access this resource.
|
| thanks for being open ...I guess
| generationP wrote:
| You're probably in some country that has longer copyright
| duration than the US (life+70a, which is atrocious enough). Use
| Tor or a proxy.
| SamBam wrote:
| Are there any non-English books? When I go to the search page,
| language isn't even a pull-down option, so I'm guessing not.
|
| There is a huge world of out-of-copyright non-English texts, and
| Project Gutenberg has many thousands of them. I wonder if any
| interest could be generated to help bring them in by posting on
| foreign language subreddits or something.
| slevis wrote:
| Just looked through the entire website to answer this question.
| Seems like they only accept English books :( "Types of ebooks
| we don't accept: - Non-English-language books. Translations to
| English are, of course, OK."
| (https://standardebooks.org/contribute/collections-policy)
| SamBam wrote:
| Weird. Why the explicit rule against them?
|
| I understand if the existing editors can't personally
| proofread the submissions, but that's why peer-review exists.
| Or an open-source project in general where people can post
| corrections. Jimbo Wales didn't need to speak a hundred
| languages to launch Wikipedia.
| contact9879 wrote:
| To me, that niche is already covered by Wikisource.
| Standard Ebooks as a project is very strict about
| conforming to its editorial and quality standards.
| Onboarding more languages would require volunteer editorial
| experts in those languages.
|
| Besides, projects in other languages can absolutely build
| upon Standard Ebooks work, but to expect Standard Ebooks
| itself to support other languages is just too outside the
| scope and expertise of the volunteers available.
| kelvinjps10 wrote:
| If you were to find the expert editors for the other
| languages would you let them publish the works in those
| other languages on the Standard Ebooks website?
| contact9879 wrote:
| well, that would be up to Alex. but as that would require
| a pretty substantial organizational and responsibility
| shift, I imagine, no, he would not.
|
| As it is now, Alex is editorially responsible for all
| output of Standard Ebooks. Changing that would require
| someone with the time and experience willing to take on
| all the responsibilities that Alex currently has for each
| of those other languages.
| kmoser wrote:
| Answered here:
| https://news.ycombinator.com/item?id=43601273
| npteljes wrote:
| A well-defined focus can help management of a project, for
| example, by not having the participants spread too thin.
|
| The website and toolchain are open source, so if someone
| would build an international version, and do it
| persistently, I'm sure they would link or maybe even merge
| the projects a bit.
| amelius wrote:
| Do they use AI tools in their conversion workflow?
| contact9879 wrote:
| No, LLMs are not used (nor would they be allowed). As for
| whether you would consider OCR to be AI, then... possibly?
| UncleEntity wrote:
| Does it use any automation?
|
| My bro-in-law supported his family as a freelance editor for
| years while my sister was doing the "maternity leave" thing
| so I know there's a non-trivial amount of work that goes into
| book editing. Cutting out some of that human labor seems like
| a good thing for a volunteer project.
| contact9879 wrote:
| there are quite a lot of automated changes using the Standard
| Ebooks open source tools package
|
| the vast majority of textual tooling is regex-galore, but
| there is also automated epub tooling in there too
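|
| To give a flavour of what regex-based text cleanup looks like,
| here is a toy sketch. It is not the actual Standard Ebooks
| tooling; the tidy function and its three rules are assumptions
| made up for illustration.

```python
import re


def tidy(text):
    """Apply a few illustrative typographic fixes with regular expressions."""
    # Three literal dots become a single ellipsis character.
    text = re.sub(r"\.{3}", "\u2026", text)
    # Runs of spaces or tabs collapse to one space.
    text = re.sub(r"[ \t]{2,}", " ", text)
    # A pair of straight double quotes becomes curly quotes.
    text = re.sub(r'"([^"]*)"', "\u201c\\1\u201d", text)
    return text
```

| Real tooling needs many more rules and a human proofread
| afterwards, since patterns like these misfire on edge cases
| such as nested or unbalanced quotes.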
| kelvinjps10 wrote:
| Sorry for the question but how behind are the LLMs in terms
| of quality for something like this?
| contact9879 wrote:
| I can't really answer that because I haven't actually tried
| to use an LLM on any part of the process. The vast majority
| of the process is semantic markup using (x)html and
| proofreading. The markup process could, I guess, use an
| LLM, but most of it is already automated using regex and
| linting.
| tcoff91 wrote:
| For those who are into ebooks and audiobooks, I've been having a
| blast with the app Storyteller:
| https://storyteller-platform.gitlab.io/storyteller/
|
| You can self host the server, and it will create epub3s with the
| audiobook and ebook synced up.
|
| Then you use the mobile app to listen and read the books. It
| works way better than whispersync from kindle.
|
| Read on your boox e reader then switch to your phone and listen
| and everything syncs up seamlessly.
| tass wrote:
| Where do you find the books to host?
|
| Also your link has an erroneous .com
| tcoff91 wrote:
| You can get drm free audiobooks from libro and you can strip
| drm from kindle and audible books with calibre and libation.
| rr808 wrote:
| I like the idea. I read a bunch of classics from Gutenberg. In
| reality so many old books are very long and boring I ended up
| getting more modern books from the library instead.
|
| Maybe TikTok ruined me but maybe these things really do literally
| have a shelf life. Hopefully reformatting will help. Perhaps a
| better way to review and find the gems would be most helpful.
| TomasBM wrote:
| Perhaps it's not just about the 'shelf life' of a book, but
| also the language and style they use. The more archaic the
| language, and the more distant the style that the authors use,
| the harder it is for me to focus on the book.
|
| Perhaps it would be useful to have expertly abridged and
| modernized versions of (e)books, with interpreter's notes for
| each change.
| zozbot234 wrote:
| > Perhaps it would be useful to have expertly abridged and
| modernized versions of (e)books, with interpreter's notes for
| each change.
|
| A good AI can do this for you nowadays. So if anything it's
| nice to have the original version available.
| dr_dshiv wrote:
| Did you ever consider making them public domain but still
| offering an optional $10 donation for download?
|
| I'm interested in a similar approach for a rare book library, but
| funding for staff is a real challenge so we want to make some
| kind of revenue stream.
| contact9879 wrote:
| Standard Ebooks grew out of a pay-what-you-want experiment that
| Alex did ~10 years ago
| MilnerRoute wrote:
| Another great ebook/volunteer project is Librivox - free
| public-domain _audiobooks_ read by volunteers around the world...
|
| https://librivox.org/
| tcoff91 wrote:
| You can pair these together with the Storyteller app to create
| an epub3 with the audio embedded and aligned to have a
| whispersync-esque experience
| reassess_blind wrote:
| A sort by popularity filter would be appreciated.
| jomohke wrote:
| Some places resist this because it causes a "rich get richer"
| effect in popularity. But it's admittedly convenient.
| fuddle wrote:
| It would be great to be able to sort by popularity, to make it
| easier to find popular books. Or have a list of top 100
| downloads.
___________________________________________________________________
(page generated 2025-04-06 23:00 UTC)