[HN Gopher] Can you read this cursive handwriting? The National ...
___________________________________________________________________
Can you read this cursive handwriting? The National Archives wants
your help
Author : lemonberry
Score : 205 points
Date : 2025-01-18 02:42 UTC (20 hours ago)
(HTM) web link (www.smithsonianmag.com)
(TXT) w3m dump (www.smithsonianmag.com)
| Unearned5161 wrote:
| cheers! I was looking for something semi productive to sink a
| Friday night into
|
| on a more serious note, working through a transcription project
| for letters and journals that nobody has touched since they've
| been archived is such a wonderful feeling. Aside from being in
| front of the physical document itself, your degree of separation
| from the writer and point in time is vanishingly small!
|
| I always like to observe when they cross something out or make a
| mistake and think about what could have caused that. Did a friend
| pass by the door and scare them? Did they get distracted looking
| out the window? It's all so close and yet so far away :)
| Over2Chars wrote:
| It says "The following is the dedication of James Lambert a
| soldier of the Revolutionary wars with the Americas."
|
| blah blah blah
| sayrer wrote:
| Yes, that seems right. Not that difficult. This one suffers
| from some poor penmanship, though.
| Unearned5161 wrote:
| I'm not too sure about that reading, I got "The following is
| the declaration of James Lambert a soldier of the Revolutionary
| War in South America." rather different
| sayrer wrote:
| oh, it is "declaration", yes, but not South America. this guy
| is even on Amazon:
|
| https://www.amazon.com/James-Lambert-1758-1847-Elaboration-R...
| jahewson wrote:
| I got "North America"
| ripe wrote:
| Hmm, interesting: "North America" does make sense, and 4o
| also seems to transcribe it that way, but the handwriting
| looks like it says "South America" to me.
| Baeocystin wrote:
| Funnily enough, there have been a few times over the past couple
| of years I've been asked by younger co-workers to read something
| for them that was written in cursive. I hadn't really realized it
| had become such a (comparatively) rare skill. This fact is making
| me feel older than my actual 50th birthday did!
| MattGaiser wrote:
| I'm 28. I can only read the document in the article with a lot
| of effort and fiddling with the contrast.
| toolslive wrote:
| I'm a middle aged European and I have no issue reading the
| cursive handwriting shown there. I'm pretty sure there are
| plenty of (UK) senior citizens who would be thrilled to help
| out here. The retirement homes are filled with bored people
| eager to engage in anything.
| jncfhnb wrote:
| I don't think I believe that OCR can't do it but random humans
| can
|
| OCR is VERY good
| BugsJustFindMe wrote:
| > _I don't think I believe that OCR can't do it but random
| humans can_
|
| I do.
|
| > _OCR is VERY good_
|
| Uh, my experience is extremely different.
| CamperBob2 wrote:
| Your experience is obsolete.
| BugsJustFindMe wrote:
| Oh, ok then.
| CamperBob2 wrote:
| I mean, all you have to do is feed the image to ChatGPT,
| and it will read it basically as well as you can.
|
| Denying/downvoting reality is always an option, of
| course.
| bigstrat2003 wrote:
| Not being rude was also an option, one you chose not to
| take for some reason. Seriously, all it would've taken
| was for you to say something like "there have been a lot
| of advancements so it's probably different than you
| remember". This conversation would've gone much smoother
| for you if you had.
|
| And BugsJustFindMe _can't_ downvote you, because it was
| a reply to him. So don't bite his head off over it. You
| got downvoted because you were a jerk, plain and simple.
| CamperBob2 wrote:
| _Not being rude was also an option_
|
| Refraining from reflexively pooh-poohing AI with
| uninformed and/or out-of-date opinions is also an option,
| but not one often exercised on HN.
|
| It gets old not being able to carry on a discussion
| without squinting at grayed-out text, simply because
| someone pointed out that humans aren't robots and should
| no longer have to emulate them.
| BugsJustFindMe wrote:
| Can you feed these to ChatGPT and tell me what it says
| they say?
|
| https://imgur.com/a/CDU6Lgs
|
| It gets them wrong for me, but maybe it will get them
| right for you. Maybe you're better at prompting or have
| access to a better model or something.
| CamperBob2 wrote:
| Eh, I was talking about OCR'ing modern English cursive
| handwriting, not translating medieval script written in a
| dead language. It seems reasonable to expect specialized
| models to be used for this type of work.
|
| Still, here's the first one, via Gemini 2.0 experimental:
| https://i.imgur.com/HtnwfHp.png
|
| How does the response look? Did it correctly identify the
| language as Old French, at least? Even if 100% made up,
| which I have a feeling it is, it's a more credible (not
| to mention creative) attempt than most non-specialists
| would come up with.
|
| o1-pro, on the other hand, completely shat the bed:
| https://i.imgur.com/mivdjkA.png I haven't seen it fail
| like that in a LONG time, so good job, I guess. :) I
| resubmitted it by uploading the .jpg directly, and it
| mumbled something about a "Problem generating the
| response."
|
| Second image:
|
| Gemini 2.0 seemed to have more trouble with this one:
| https://i.imgur.com/oEktMP6.png
|
| o1-pro gave another error message, but 4o did pretty well
| from what I can tell (agree/disagree?):
| https://i.imgur.com/7iR1y7U.png I thought it was
| interesting that it got the date wrong, as '1682' is
| pretty easy to make out compared to much of the text.
|
| In summary, I think you broke o1-pro.
| BugsJustFindMe wrote:
| > _Did it correctly identify the language as Old French,
| at least_
|
| Yes! But that's the easy part. :)
|
| > _I was talking about OCR'ing modern English cursive
| handwriting_
|
| Yeah, see, I think that's a very narrow expectation.
| Archive paleography is substantially broader than that.
| I'm not saying that the tools are useless, but they're
| often still not better than humans directing focused care
| and attention.
|
| > _o1-pro, on the other hand, completely shat the bed_
|
| The result is absolutely hilarious though! So kudos to
| the model for making me laugh at least.
|
| > _4o did pretty well_
|
| It is indeed pretty good and very impressive as a
| technological feat. The big problems I guess are:
|
| 1) Pretty good isn't necessarily good enough.
|
| 2) If one machine gets it right and one machine gets it
| wrong, can a machine reconcile them? Or must we again
| recruit humans?
|
| 3) If a machine seems to get a lot right but also clearly
| makes important factual errors in ways where a human
| looks and says "how could you possibly get _this_ part
| wrong, of all things?" (like the year), how much do we
| trust and rely on it?
| CamperBob2 wrote:
| The technique of pitting one model against another is
| usually pretty effective in my experience. If Gemini 2.0
| Advanced and o1-pro agree on something, you can usually
| take it to the bank. If they don't, that's when human
| intervention is necessary, given the lack of additional
| first-rank models to query. (Edit: 1682 versus 1692 being
| a great example of something that a tiebreaker model
| could handle.)
|
| It seems likely that a mixture-of-models approach like
| this will be a good thing to formalize at some level.
| Using appropriately-trained models to begin with seems
| even more important, though, and I can't agree that this
| type of content is relevant when discussing
| straightforward OCR tasks on modern languages.
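
A minimal sketch of the agree-or-escalate idea described above, in
Python. It assumes each model's transcription is already available as a
plain string; the model names are placeholders, the function `reconcile`
is hypothetical, and no real OCR or LLM API is called. Tokens backed by a
strict majority of readings are accepted, and everything else is flagged
for a human.

```python
from collections import Counter


def reconcile(readings):
    """Token-level majority vote across several transcriptions.

    readings: dict of {model_name: transcription}; names are placeholders.
    Returns (consensus, flagged): consensus has one entry per token (None
    where no strict majority exists), and flagged lists those positions.
    """
    token_lists = [text.split() for text in readings.values()]
    if not token_lists:
        raise ValueError("no readings supplied")
    # Simplifying assumption: every reading has the same number of tokens.
    # Real transcriptions would need sequence alignment first.
    if len({len(tokens) for tokens in token_lists}) != 1:
        raise ValueError("token counts differ; align or send to a human")

    consensus, flagged = [], []
    for position, candidates in enumerate(zip(*token_lists)):
        winner, votes = Counter(candidates).most_common(1)[0]
        if votes * 2 > len(candidates):  # strict majority agrees
            consensus.append(winner)
        else:  # tie or no majority: needs human review
            consensus.append(None)
            flagged.append(position)
    return consensus, flagged


if __name__ == "__main__":
    two = {"model_a": "dated 1682", "model_b": "dated 1692"}
    print(reconcile(two))
    three = dict(two, model_c="dated 1682")
    print(reconcile(three))
```

With only two readings that disagree on the year, the year position is
flagged for review. Adding a third reading lets the vote settle it, which
is the tiebreaker case mentioned in the comment.
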
| BugsJustFindMe wrote:
| > _I can't agree that this type of content is relevant
| when discussing straightforward OCR tasks on modern
| languages._
|
| 1682 is a number though, language independent, and you
| noted it as being extremely obvious to a human, even one
| who can't read any of the other language. So I do think
| the tools are useful, but people probably still need to
| be there for now until better models for this are made
| that stop getting especially obvious parts wrong.
| jncfhnb wrote:
| I would challenge you to find a picture of text that you
| think a human can read and OCR cannot. I'm happy to
| demonstrate. The text shown in this article is trivial.
| demosthanos wrote:
| The archivists themselves say that they run into such texts
| often enough that this program was needed:
|
| > The agency uses artificial intelligence and a technology
| known as optical character recognition to extract text from
| historical documents. But these methods don't always work,
| and they aren't always accurate.
|
| They are _absolutely_ aware of the advances in these tools,
| so if they say they're not completely there yet I believe
| them. One likely reason is that the models probably have
| less 1800s-era cursive in their training set than they do
| modern cursive.
|
| It's likely that with more human-tagged data they could
| _improve_ on the state of the art for OCR, but it's pretty
| arrogant to doubt the agency in charge of this sort of
| thing when they say the tech isn't there yet.
| tedunangst wrote:
| Can someone please post a sample of one of these images
| that can only be read by a human for us naive OCR
| believers to see?
| CamperBob2 wrote:
| To be fair there was a similar discussion a few days ago
| in which an SME remained unconvinced:
| https://news.ycombinator.com/item?id=42566391
|
| I don't necessarily agree with her conclusion because she
| wasn't participating directly in the thread and wasn't
| completely responsive to some of the points raised, but
| still, it appears that there _are_ a few instances of
| difficult-to-read handwriting where OCR is still coming
| in second to skilled human interpretation.
| jncfhnb wrote:
| That's comprehension of English not reading characters
| BugsJustFindMe wrote:
| I've posted these above, but I'll give you your own copy
| because the bits are free. Does your OCR work on these?
| Mine sadly doesn't. But if yours does, then I'll switch
| to it.
|
| https://imgur.com/a/CDU6Lgs
| jncfhnb wrote:
| The problem statement was text that random humans can
| read and OCR cannot.
|
| If you want to provide a good faith answer at least make
| it English. I assume this is French but it's obviously
| much harder to evaluate on both ends when you're mixing
| up the language.
| jncfhnb wrote:
| Then please provide a single example that we can't
| instantly solve. Happy to prove them wrong.
| AdieuToLogic wrote:
| > I would challenge you to find a picture of text that you
| think a human can read and OCR cannot.
|
| Are you aware of CAPTCHA[0] images?
|
| 0 - https://en.wikipedia.org/wiki/CAPTCHA
| jahewson wrote:
| Solvable with the right tools.
|
| https://github.com/noCaptchaAi/NoCaptcha-Ai-Browser-Extensio...
| AdieuToLogic wrote:
| > Solvable with the right tools.
|
| The original assertion was:
|
| > I would challenge you to find a picture of text that you
| think a human can read and OCR cannot.
|
| Not whether many CAPTCHA image challenges could be automated.
| Unless the tool referenced guarantees 100% correct
| solutions for all manipulated text images.
| CamperBob2 wrote:
| The AI models are now better at CAPTCHAs than I am, for
| both text- and image-based questions. But when confronted
| with a CAPTCHA, humans work for free, and the models
| don't. :(
|
| As long as that's the case, CAPTCHAs probably won't be
| considered truly obsolete.
| jncfhnb wrote:
| Text that is _intentionally constructed_ to fool
| computers but not humans is obviously out of scope. But
| they're generally easily solved with OCR these days
| anyway.
| BugsJustFindMe wrote:
| Yeah ok, but it might take me a few tries because I don't
| know what you're using. I hope that's agreeable?
|
| What does your OCR say that these say? The first one isn't
| too hard for a human (assuming appropriate language skill).
| The second one is a bit more difficult.
|
| https://imgur.com/a/CDU6Lgs
| AdieuToLogic wrote:
| > I don't think I believe that OCR can't do it but random
| humans can
|
| Considering the people involved are experts in their field, are
| certainly aware of OCR capabilities, and have publicized a need
| thusly:
|
| > ... the National Archives is looking for volunteers who can
| help transcribe and organize its many handwritten records ...
|
| Perhaps "random humans" can perform tasks which could reshape
| your belief:
|
| > OCR is VERY good
| jncfhnb wrote:
| There are conceivable reasons why they may be telling a half
| truth here. Just engaging the public is a worthy goal here.
| AdieuToLogic wrote:
| > There are conceivable reasons why they may be telling a
| half truth here. Just engaging the public is a worthy goal
| here.
|
| Asserting an ulterior motive without supporting proof is to
| engage in conspiracy theories.
|
| Sometimes a cigar is just a cigar.[0]
|
| 0 - https://quoteinvestigator.com/2011/08/12/just-a-cigar/
| Dylan16807 wrote:
| It doesn't look like a cigar (very tricky documents)
| though. Hence the skepticism.
| jncfhnb wrote:
| The alternative is me saying that appealing to their
| "expertise" is an appeal to authority fallacy that flies
| in the face of general evidence that modern OCR is far
| better than humans at character recognition. Especially
| random non specialized humans.
| tptacek wrote:
| No. Sign up and look at the current missions. A lot of what
| they want transcribed is totally straightforward to OCR ---
| not even LLM, OCR. Whatever's going on, and I'm not second-
| guessing them, a pretty big chunk of their problem appears to
| be well within the state of the art. The appeal to authority
| isn't going to play here, because you can just click through
| to the archives and see what they're trying to figure out.
| AdieuToLogic wrote:
| > No. Sign up and look at the current missions. A lot of
| what they want transcribed is totally straightforward to
| OCR --- not even LLM, OCR. Whatever's going on, and I'm not
| second-guessing them, a pretty big chunk of their problem
| appears to be well within the state of the art.
|
| If it's that easy, then do it and be the hero they want.
|
| Or maybe, just maybe, "a pretty big chunk of their problem
| appears to be well within the state of the art" is a
| sweeping generalization lacking understanding of the
| difficulties involved.
| tptacek wrote:
| Go ahead and find something hard, and relate back the
| steps you took to find it.
| AdieuToLogic wrote:
| > Go ahead and find something hard, and relate back the
| steps you took to find it.
|
| This is a strawman[0] argument. You proclaimed:
|
| > A lot of what they want transcribed is totally
| straightforward to OCR
|
| And I replied:
|
| > If it's that easy, then do it and be the hero they want.
|
| So do it or do not. Nowhere does my finding "something
| hard" have any relevance to your proclamation.
|
| 0 - https://en.wikipedia.org/wiki/Straw_man
| tptacek wrote:
| I did in fact do it, and what I got was much, much easier
| than the samples in the article, which 4o did fine with.
| I'm sorry, but I declare the burden of proof here to be
| switched. Can you find a hard one?
|
| (I don't think you need to Wikipedia-cite "straw man" on
| HN).
| AdieuToLogic wrote:
| > I did in fact do it, and what I got was much, much
| easier than the samples in the article, which 4o did fine
| with.
|
| Awesome.
|
| Can you guarantee its results are completely accurate
| every time, with every document, and need no human
| review?
|
| > I'm sorry, but I declare the burden of proof here to be
| switched.
|
| If you are referencing my stating:
|
| > If it's that easy, then do it and be the hero they want.
|
| Then I don't really know how to respond. Otherwise, if
| you are referencing my statement:
|
| > Perhaps "random humans" can perform tasks which could
| reshape your belief:
|
| >> OCR is VERY good
|
| To which I again ask, can you guarantee the correctness
| of OCR results will exceed what "random humans" can
| generally provide? What about "non-random motivated
| humans"?
|
| My point is that automated approaches to tasks such as
| what the National Archives have outlined here almost
| always require human review/approval, as accuracy is
| paramount.
|
| > (I don't think you need to Wikipedia-cite "straw man"
| on HN).
|
| I do so for two purposes. First, if I misuse a cited term
| someone here will quickly correct me. Second, there is
| always a probability of someone new here who is unaware
| of the cited term(s).
| Dylan16807 wrote:
| > If you are referencing my stating:
|
| > > If it's that easy, then do it and be the hero they
| want.
|
| > Then I don't really know how to respond.
|
| If someone says a thing is easy, and you respond by
| demanding they do it a million times to prove that it's
| easy, you are the one that has screwed up the burden of
| proof.
| tptacek wrote:
| Can I ask, did you sign up and look at what they're
| actually looking for? Show of good faith: can you give 3
| of the headers for the top-level "missions" they have for
| transcriptions?
| Dylan16807 wrote:
| There are two claims. The main one is that all of these
| documents are easy to _individually_ transcribe by
| machine. The other is that a whole lot can be OCR'd,
| which is pretty simple to check.
|
| That's not a claim that processing the _entire archive_
| would be trivial. And even if it was, whether that would
| make someone the "hero they want" is part of what's
| being called into question.
|
| So your silly demand going unmet proves nothing.
|
| Also, "give me an example please" is not a strawman!
|
| If you actually want to prove something, you need to show
| at least one document in the set that a human can do but
| not a machine, or to _really_ make a good point you need
| to show that a _non-negligible fraction_ fit that
| description.
| AdieuToLogic wrote:
| > So your silly demand going unmet proves nothing.
|
| I made demands of no one.
|
| > Also, "give me an example please" is not a strawman!
|
| My identification of the strawman was that it referenced
| "find something hard" when I had said "be the hero they
| want" and that what is needed in this specific problem
| domain may be more difficult than what a generalization
| addresses.
|
| > If you actually want to prove something, you need to
| show at least one document in the set that a human can do
| but not a machine, or to really make a good point you
| need to show that a non-negligible fraction fit that
| description.
|
| Maybe this is the proof you demand.
|
| LLMs are statistical prediction algorithms. As such,
| they are nondeterministic and, therefore, provide no
| guarantees as to the correctness of their output.
|
| The National Archives have specific artifacts requiring
| precise textual data extraction.
|
| Use of nondeterministic tools known to produce provably
| incorrect results eliminates their applicability in this
| workflow due to _all_ of their output requiring human
| review. This is an unnecessary step and can be eliminated
| by the human reading the original text themself.
|
| Does that satisfy your demand?
| Dylan16807 wrote:
| > I made demands of no one.
|
| Whatever you want to call "If it's that easy, then do it"
|
| > LLMs [...] Does that satisfy your demand?
|
| That's a different argument from the one above where you
| were trying to contradict tptacek. And that argument is
| flawed itself. In particular, humans don't have
| guarantees either.
|
| > provably incorrect results
|
| This gets back to the actual request from earlier, which
| is showing an example where the machine performs below
| some human standard. Just pointing out that LLMs make
| mistakes is not enough proof of incorrectness in this
| specific use case.
| jncfhnb wrote:
| Also, you seem to have taken issue with the phrase "random
| humans" because you're confused at what's being done here. It
| is random humans. Non experts.
|
| Experts are asking for the help of non experts.
|
| > Anyone with an internet connection can volunteer to
| transcribe historical documents and help make the archives'
| digital catalog more accessible
| ozbonus wrote:
| I've been trying every state of the art OCR solution on my
| students' handwritten essays for fifteen years and have yet to
| find anything even close to acceptable.
| jncfhnb wrote:
| What methods have you tried?
| wriggler wrote:
| I'm the founder of handwritingocr.com - have you checked out
| our free trial? We have loads of educators using our service
| for exactly this, and they seem quite happy with it.
| jahewson wrote:
| Actually I think in 2025 you are correct, we just haven't got
| the best tech into the OCR software that's out there in the
| real world. I just pasted the letter from the article into
| ChatGPT (4o) and asked "what does this old letter say?" The
| response:
|
| ---
|
| The following is the declaration of James Lambert, a soldier of
| the Revolutionary War in North America.
|
| The said James Lambert on this day personally appeared in the
| Probate Court of the County of Dearborn in the State of Indiana
| and at the November Term of said Court (1841), it being a court
| of record established by the laws of Indiana and made oath
| that:
|
| On the 25th day of March 1842 he will be eighty-five years old;
| that he was born in the State of Maryland; that he is now a
| resident of said county and has been for the 27 years last
| past; that he has lived in Virginia, Maryland, Pennsylvania...
|
| ---
| musicale wrote:
| It might be nice for people to be able to actually read the
| documents in the National Archives rather than relying on a
| transcription or a mobile app.
|
| I wonder if they've considered making a simple tutorial on how to
| read cursive? It's not that hard if you can already read printed
| English. And of course you can practice on documents in the
| National Archives.
|
| It's exciting and fun to learn to read an unfamiliar script, like
| the runes on the cover of The Hobbit ... or the engraving-style
| cursive of the US Constitution.
| posterguy wrote:
| i dont think the problem is the lack of resources to learn how
| to read and write cursive
| jb1991 wrote:
| Except that it does say that in the article, that it's a
| lack of education in reading cursive.
| posterguy wrote:
| no, it says the opposite, that there is growing interest in
| bringing it back into curriculums in various states. but
| that's aside from the point that the smithsonian making a
| tutorial on reading cursive would just represent an
| additional resource, of which we are not lacking, to learn.
| whether or not we teach it is different, but finding a
| resource to learn is not hard.
| musicale wrote:
| Maybe linking to the resource. "Learn how to read this
| document."
| Levitz wrote:
| Those two statements aren't at odds with each other.
|
| For example, there's a great abundance of resources to
| learn about music theory and such too, the average person
| doesn't know such things because they aren't interested.
| jhanschoo wrote:
| I find the article's conflation of two topics involving
| cursive writing ignorant or disingenuous to the point that
| I almost wanted to respond with my own comment on that
| itself. If you study cursive writing in class, you are
| likely to learn simple and standard letterforms like Palmer
| script.
|
| But the task requested by the National Archives is more
| akin to paleography where you can expect each author or
| work to have their own (region-based/family-based)
| handwriting that requires decipherment, even for experts.
| You may have encountered a coworker or schoolmate's
| indecipherable chicken scratch print writing; that is what
| you should expect, only cursive.
| ternaryoperator wrote:
| I think the great variety of cursive styles makes simple
| teaching rather complicated. Folks who spent years in school
| reading and writing in cursive can quickly adapt to the various
| styles, in a way that I'm not sure could be taught in a simple
| tutorial.
| AdieuToLogic wrote:
| > I wonder if they've considered making a simple tutorial on
| how to read cursive?
|
| In generations past, this was called "elementary school."
| iambateman wrote:
| This is all very cool so I'm not trying to be dismissive. In a
| lot of ways, giving a hobby out as a way to participate in the
| national archives is an end in itself.
|
| But...computers can definitely do this way better, right?
| jonahx wrote:
| I had the same thought but maybe on old handwriting they
| can't?
|
| EDIT:
|
| I tried giving the sample to 4o and it gave:
|
| The following is the declaration of James Lambert, a soldier of
| the Revolutionary War in North America.
|
| The said James Lambert this day personally appeared in the
| Probate Court of the County of Dearborn in the State of Indiana
| and at the November Term of said Court (1841), it being a court
| of record created by the laws of Indiana and made oath that:
|
| On the 25th day of March 1842, he will be eighty-five years
| old, that he was born in the State of Maryland, that he is now
| a resident of said county and has been for the 27 years last
| past; that he has lived in Virginia, Maryland, and
| Pennsylvania...
| AdieuToLogic wrote:
| > This is all very cool so I'm not trying to be dismissive. In
| a lot of ways, giving a hobby out as a way to participate in
| the national archives is an end in itself.
|
| > But...computers can definitely do this way better, right?
|
| No.
|
| Cursive writing is analog and fluid, lacking consistency across
| authors and often inconsistent by an individual author as well.
| When done well, it could be classified as its own art form.
| When done poorly, it can resemble the path walked by a chicken
| on meth.
| sulam wrote:
| Current LLMs can absolutely do this as well as you can,
| probably better.
| AdieuToLogic wrote:
| > Current LLMs can absolutely do this as well as you can,
| probably better.
|
| This is obviously disprovable, in that if they could, they
| would, and this call to action would not exist.
| Osyris wrote:
| That's quite a lot of faith you have in them.
| nozzlegear wrote:
| Them being the National Archives? What about the National
| Archives makes you think they're particularly inept at
| utilizing LLMs?
|
| I'm tired of this brand of dismissive cynicism.
| musicale wrote:
| iPad seems to do OK, but it has more to go by since it has
| the timing and pressure as well as the written text.
| zabzonk wrote:
| After using a keyboard for circa 50 years, I can't read my own
| handwriting. I can't even give a reproducible signature.
| munchler wrote:
| Me too, and I used to be proud of my handwriting back in the
| 90's. Definitely a loss in self-expression.
| dpb001 wrote:
| Same here. Old enough to remember when your signature on a
| credit card receipt would be given a quick look to compare it
| to the scrawl on the back of the card. If this was still being
| done I'd probably fail 50% of the transactions I attempt.
| kmoser wrote:
| Nobody has checked the back of my credit card for the
| presence of a signature in decades, let alone whether the
| signature matches. (I also haven't bothered to sign my credit
| card for this reason, but also because why would I want
| somebody to have my actual signature if my card is stolen?)
| These days my "signature" on a credit card purchase is
| usually a smiley face. Nobody has ever complained.
| dpb001 wrote:
| Yup, it's been decades - I remember it happening with the
| carbon copy imprinting devices and it may have been more
| common in the US rural South where I was working at the
| time. The squiggles I fingerpaint on checkout screens now
| are my version of your smiley face.
| jb1991 wrote:
| > particularly for Americans who never learned cursive in school.
|
| American schools don't teach it anymore?!
| jghn wrote:
| Why would they? It's an anachronism optimizing for writing
| speed
| galangalalgol wrote:
| They started teaching it again because it correlated with
| better outcomes for things seemingly unrelated to writing.
| And it was important to learn it before typing supposedly.
| There is probably some better way to accomplish whatever it
| is actually doing, but they don't seem to know that.
| adrian_b wrote:
| I agree that cursive handwriting has become useless.
|
| As a child, many years before having access to personal
| computers or any other kind of typewriting, I switched
| my handwriting from cursive to the kind of sans-serif
| typefaces used in technical drawing, and since then I have
| never again written in cursive, with the exception of my
| signature, where required on official documents.
|
| Nevertheless, I believe that some kind of calligraphy is
| necessary for developing fine motor skills in children,
| unless it is replaced with some other activity that requires
| a similar precision in the movements of the fingers and of
| the hand.
| _pktm_ wrote:
| Not that I can tell, unless you encounter a teacher who
| (personally) believes it's worthwhile.
|
| The real problem, IMO, is that they don't teach cursive but
| also don't teach typing. They've thrown laptops at the kids
| without giving them the basic skill necessary to be effective
| in that medium.
| galangalalgol wrote:
| They stopped teaching cursive for a number of years but all
| the schools in my area start it around age 6 or 7 now. They
| start typing the next year with some horribly boring typing
| program.
| jez wrote:
| The handwriting in some of these snippets, while sometimes
| difficult to read for one reason or another, is nonetheless
| beautiful: did everyone who wrote have such great handwriting
| back then?
|
| I'm looking at the piece in the Instagram post linked by the
| page, which begins, "honor of holding in their service". The
| lines are so straight, the letters are so uniform!
| cyberax wrote:
| Handwriting is a skill, you get better with practice!
|
| A lot of bad handwriting stems from using it to write down
| things quickly (see: https://imgur.com/doctors-strike-5ANma ).
|
| If you instead focus on doing slow calligraphy, your
| handwriting can improve rapidly.
| 999900000999 wrote:
| Widespread literacy is an extremely recent phenomenon.
|
| I highly doubt most people could write that well
| quickthrowman wrote:
| The US was an extreme outlier with regards to a high rate of
| literacy compared to almost everywhere else during the
| 1600-1800s. Today is a different story: Massachusetts had a
| higher rate of literacy when education was made compulsory in
| the 19th century than it does currently, which is kind of
| astounding.
|
| > Sheldon Richman quotes data showing that from 1650 to 1795,
| American male literacy climbed from 60 to 90 percent. Between
| 1800 and 1840 literacy in the North rose from 75 percent to
| between 91 and 97 percent. In the South the rate grew from
| about 55 percent to 81 percent. Richman also quotes evidence
| indicating that literacy in Massachusetts was 98 percent on
| the eve of legislated compulsion and is about 91 percent
| today.
|
| https://www.independent.org/publications/article.asp?id=307
| 999900000999 wrote:
| I'm happy to be proven wrong.
|
| Any reason for this being an American thing?
|
| I'd still assume fine penmanship was a mark of the upper
| class though
| hello_newman wrote:
| As someone with terrible handwriting but decent cursive, I
| think cursive provides a better structure for achieving cleaner
| penmanship compared to non-cursive writing. My theory is that
| cursive's consistency of soft, flowing loops rather than a mix
| of abrupt angles and disconnected lines helps create a more
| uniform result.
|
| I also remember teachers telling you when writing cursive to
| seldom lift your hand from the page. I think that act of
| keeping your pen on the page for most of the writing process
| encourages a smoother and more natural flow, reducing the
| chance of jerky, uneven strokes
| tptacek wrote:
| Isn't this like a bread-and-butter AI task?
|
| _"The following is the declaration of James Lambert, a soldier
| of the Revolutionary War in North America."_ _"The said James
| Lambert, on this day personally appeared in the Probate Court of
| the County of Dearborn in the State of Indiana, at the November
| Term of said Court [1841], it being a court of record created by
| the laws of Indiana, and made oath that on the 25th day of March
| 1842 he will be eighty-five years old; that he was born in the
| State of Maryland; that he is now a resident of [said] county and
| has been for the [27] years last past; that he has lived in
| Virginia, Maryland, [and Pennsylvania]; that..."_
|
| These kinds of problems, matching up cursive to actual text,
| would seem to play to the absolute best strengths of an LLM,
| given how much basic language structure the models encode.
| saagarjha wrote:
| > The agency uses artificial intelligence and a technology
| known as optical character recognition to extract text from
| historical documents. But these methods don't always work, and
| they aren't always accurate.
| edelbitter wrote:
| I've seen people do that, and the results are.. just sad. These
| modern models insert their twitter-era "what grabs attention
| must be true" view into the very little authentic past we still
| possess.
| tptacek wrote:
| What did 4o get wrong about the title image in the
| transcription I just gave you?
| saagarjha wrote:
| Seems like something that some of those big AI companies that are
| desperately starved of training material could chip in on, no?
| Actually do something for the public good, spend a few cents of
| that VC money, get some high-quality training data out of it?
| myth_drannon wrote:
| Why do they need volunteers to manually do it? Open AI models
| like Microsoft's TrOCR are very effective for handwritten English.
| demosthanos wrote:
| Before commenting asking about why they don't just use LLMs,
| please note that the article specifically calls out that they do,
| but it's not always a viable solution:
|
| > The agency uses artificial intelligence and a technology known
| as optical character recognition to extract text from historical
| documents. But these methods don't always work, and they aren't
| always accurate.
|
| The document at the top is likely an especially _easy_ document
| to read precisely because it's meant to be the hook to get
| people to sign up and get started. It isn't going to be
| representative of the full breadth of documents that the National
| Archives want people to go through.
| tptacek wrote:
| OK, fair enough, but can you find one in this article that's
| hard for an LLM? The gnarliest one I saw, 4o handled instantly,
| and I went back and looked carefully at the image and the text
| and I'm sold.
|
| Like if this is a crowdsourcing project, why not do a first
| pass with an LLM and present users with both the image and the
| best-effort LLM pass?
|
| _Later_
|
| I signed up, went to the current missions, and they all seem to
| be post-1900 and all typeset. They're blurry, but 4o cuts
| through them like a hot knife through butter.
| varenc wrote:
| My guess is because it's the Smithsonian, they're just not
| willing to trust an LLM's transcription enough to put their
| name on it. I imagine they're rather conservative. And maybe
| some AI-skeptic protectionist sentiments from the
| professional archivists. Seems like it could change with time
| though.
| ugh123 wrote:
| > My guess is because it's the Smithsonian, they're just
| not willing to trust an LLM's transcription enough to put
| their name on it. I imagine they're rather conservative
|
| I expect that's a common theme from companies like that, yet
| I don't think they understand the issue they think they
| have there.
|
| Why not have the LLMs do as much work as possible and have
| humans review and put their own name on it? Do you think
| they need to just trust and publish the output of the LLM
| wholeheartedly?
|
| I think too many people saw what a few idiot lawyers did
| last year and closed the book on LLM usage.
| patrick451 wrote:
| The incident with the lawyers just highlighted the
| fundamental problem with LLMs and AI in general. They
| can't be trusted for anything serious. Worse, they give
| the appearance of being correct, which leads human
| "checkers" into complacency. Total dumpster fire.
| dr_dshiv wrote:
| Instead of thinking about this as an all-or-nothing
| outcome, consider how this might work if they were made
| accessible with LLMs, and then you used randomized spot
| checks with experts to create a clear and public error
| rate. Then, when people see mistakes they can fix them.
|
| I'm trying to do this for old Latin books at the Embassy
| of the Free Mind in Amsterdam. So many of the books have
| never been digitized, let alone OCRd or translated. There
| is a huge amount of work to be done to make these works
| accessible.
|
| LLMs won't make it perfect. But isn't perfect the enemy
| of the good? If we make it an ongoing project where the
| source image material is easily accessible (unlike in a
| normal published translation, where you just have to
| trust the translator), then the knowledge and
| understanding can improve over time.
|
| This approach also has the benefit of training readers
| not to believe everything they read -- but to question it
| and try to get directly at the source. I think that's a
| beautiful outcome.
| patrick451 wrote:
| These kinds of ideas just sound to me like "Suppose you
| had to use broken technology X. How do you make it work?"
| Dylan16807 wrote:
| I don't think you're wrong, but that's because there are
| no alternative technologies. The only alternative is
| leaving much more of the archive inaccessible for a much
| longer period, possibly forever.
| miltonlost wrote:
| > The only alternative is leaving much more of the
| archive inaccessible for a much longer period, possibly
| forever.
|
| No, the alternative is volunteers transcribing. Like this
| project.
|
| Not every problem needs a computer.
| Dylan16807 wrote:
| Volunteers transcribing leaves much more of the archive
| inaccessible for a much longer period.
| thaumasiotes wrote:
| > Why not have the LLMs do as much work as possible and
| have humans review and put their own name on it?
|
| That's not a good way to improve on the accuracy of the
| LLM. Humans reviewing work that is 95% accurate are
| mostly just going to rubber-stamp whatever you show them.
| This is equally a problem for humans reviewing the work
| of other humans.
|
| What you actually want, if you're worried about accuracy,
| is to do the same work multiple times independently and
| then compare results.
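|
| A minimal sketch of that "transcribe independently, then compare"
| idea, assuming two plain-text transcriptions of the same page. The
| function name is made up for illustration, and the sample strings
| reuse the "North America" / "South America" readings from earlier
| in this thread. It uses Python's standard difflib to line up the
| words and report only the places where the two readers disagree:

```python
import difflib


def disagreements(a: str, b: str) -> list[tuple[str, str]]:
    """Return (words_in_a, words_in_b) for every span where the two
    transcriptions differ. Identical spans are skipped."""
    words_a, words_b = a.split(), b.split()
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            out.append((" ".join(words_a[i1:i2]), " ".join(words_b[j1:j2])))
    return out


# Two independent readings of the same line (illustrative data).
first = "a soldier of the Revolutionary War in North America"
second = "a soldier of the Revolutionary War in South America"

print(disagreements(first, second))
```

| Only the flagged spans would need a third reader or an expert, which
| is the point of doing the work independently instead of reviewing a
| single draft.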
| dfc wrote:
| The article is from The Smithsonian. The actual project is
| with the National Archives.
| defaultcompany wrote:
| My parents have saved letters from their parents which are
| written in cursive but in two perpendicular layers. Meaning
| the writing goes horizontally in rows and then when they got
| to the end of the page it was turned 90 degrees and continued
| right on top of what was already there for the whole page.
| This was apparently to save paper and postage. It looks like
| an unintelligible jumble but my mother can actually decipher
| it. Maybe that's what the LLMs are having trouble with?
|
| Edit: apparently it's called cross writing [1]
|
| 1: https://highshrink.com/2018/01/02/criss-cross-letters/
| tptacek wrote:
| Are they having trouble? You can sign up right now and get
| tasks from the archive that seem trivial for 4o (by which I
| mean: feed a screenshot to 4o, get a transcription, and
| spot check it).
| doodlebugging wrote:
| I'm doing some genealogy work right now on my family's old
| papers covering the time period from recent years back to the
| late 17th century. Handwriting styles changed a lot over the
| centuries and individuals can definitely be identified by
| their personal cursive style of writing and you can see their
| handwriting change as they aged.
|
| Then you have the problem that some of these ancestors not
| only had terrible penmanship but also spelled multi-syllabic
| words phonetically since they likely were barely educated
| kids who spent more time when they were young working on the
| farm or ranch instead of attending school where they would've
| learned how to spell correctly.
|
| I don't know whether your LLM can handle English words
| spelled phonetically written in cursive by an individual who
| had no consistency in forming letters in the words. It is
| clear after reading a lot of correspondence from this person
| that they ignored things that didn't seem important in the
| moment like dotting i's or crossing t's or forming tails on
| g's, p's, j's, or even beginning letters consistently since
| they switched between cursive and block letters within a
| sentence, maybe while they paused to clarify their thoughts.
| I don't know but it is fascinating to take a walk through
| life with someone you'll never meet and to discover that many
| of the things that seemed awesome to you as a kid were also
| awesome to them and that their life had so many challenges
| that our generations will never need to endure.
|
| Some of my people have the most beautiful flowing cursive
| handwriting that looks like the cursive that I was taught in
| grade school. Others have the most beautiful flowing cursive
| with custom flourishes and adornments that make their
| handwriting instantly recognizable and easy to read once you
| understand their style.
|
| I think there are plenty of edge cases where LLMs will take a
| drunkard's walk through the scribble and spit out gibberish.
|
| I'm reminded of an old joke though.
|
| Ronald Reagan woke up one snowy Washington, DC morning and
| took a look out of the window to admire the new-fallen snow.
| He enjoys the beautiful scene laid out before him until he
| sees tracks in the snow below his window and a message
| obviously written in piss that said - "Reagan sucks".
|
| He dispatched the Secret Service to the site where samples
| were taken of the affected snow and photos of the tracks of
| two people were made.
|
| After an investigation he receives a call from the Secret
| Service agent in charge who tells him he has some good news
| and some bad news for him.
|
| The good news is that they know who pissed the message. It
| was George HW Bush, his Vice President. The bad news is that
| it was Nancy's handwriting.
| rtkwe wrote:
| One that requires additional work beyond simply feeding the
| image into the model would be this example which is a mix of
| barely legible hand written cursive and easy to read typed
| form. [0] Initially 4o just transcribes (successfully) the
| bottom half of the text and has to be prompted to attempt the
| top half at which point it seems to at best summarize the
| text instead of giving a direct transcription. [1] In fact it
| seems to mix up some portions of the latter half of the typed
| text with the written text in the portion of its
| "transcription" about "reduced and indigent circumstances".
|
| [0] https://catalog.archives.gov/id/54921817?objectPage=8&object...
|
| [1] Reproducing here since I cannot share the chat since it
| has user uploaded images. " The text in the top half of the
| image is handwritten and partially difficult to read due to
| its cursive style and some smudging. Here's my best
| transcription attempt for the top section:
|
| ...resident within four? years, swears and says that the name
| of the John Hopper mentioned in the foregoing declaration is
| the same person, and he verily believes the facts as stated
| in the declaration are true.
|
| He further swears that the said John Hopper is in reduced and
| indigent circumstances and requires the aid of his country.
|
| The declarant further swears he has no evidence now in his
| power of service, except the statement of Capt. (illegible
| name), as to his reduced circumstances ...
|
| Sworn to before me, this day...
|
| Some parts remain unclear due to the handwriting, but let me
| know if you'd like me to attempt further clarification on
| specific sections!"
| thaumasiotes wrote:
| > this example which is a mix of barely legible hand
| written cursive and easy to read typed form.
|
| > In fact it seems to mix up some portions of the latter
| half of the typed text with the written text in the portion
| of it's "transcription" about "reduced and indigent
| circumstances".
|
| What typed form? What typed text? That image is a single
| handwritten page, and the writing is quite clean, not
| "barely legible".+ The file related to John Hopper appears
| to be 59 pages, and some of them are typed, but they're all
| separate images.
|
| Are you trying to process all 59 pages at once? Why?
|
| I should note that transcription is an excellent use of an
| LLM in the sense of a language model, as opposed to an
| "LLM" in the sense of several different pieces of software
| hooked together in cryptic ways. It would be a lot more
| useful, for this task, to have direct access to the
| language model backing 4o than to have access to a chatbot
| prompt that intermediates between you and the model.
|
| + My biggest problems in reading the page: Cursive _n_ and
| _u_ are often identical glyphs (both written i), leading me
| to read "Ind." as "Jud."; and I had trouble with the
| "roster" at the bottom of the page. What felt weirdest
| about that was that the crossbar of the "t" is positioned
| well above the top of the stem, but that can't actually be
| what tripped me up, because on further review it's a common
| feature of the author's handwriting that I didn't even
| notice until I got to the very end of the letter. It's even
| true in the earlier instance of "Roster" higher up on the
| page. So my best guess is that the "os" doesn't look right
| to me.
|
| I misread 1758 as 1958, too, but hopefully (a) that kind of
| thing wears off as you get used to reading documents about
| the Revolutionary War; and (b) it's a red flag when someone
| who died in 1838 was born in 1958 according to a letter
| written in 1935.
| ellen364 wrote:
| > Like if this is a crowdsourcing project, why not do a first
| pass with an LLM and present users with both the image and
| the best-effort LLM pass?
|
| Possibly for the reason that came up in your other post: you
| mentioned that you spot checked the result.
|
| Back when I was in historical research, and occasionally
| involved in transcription projects, the standard was 2-3
| independent transcriptions per document.
|
| Maybe the National Archive will pass documents to an LLM and
| use the output as 1 of their 2-3 transcriptions. It could
| reduce how many duplicate transcriptions are done by humans.
| But I'll be surprised if they jump to accepting spot checked
| LLM output anytime soon.
| tptacek wrote:
| You get that I'm not saying they should just commit LLM
| outputs as transcriptions, right?
| Avshalom wrote:
| Real quick, how long do you think chatgpto4 has existed? How
| long do you think the National Archive has been archiving?
| tptacek wrote:
| It's 4o. The crowdsourced transcription project dates back
| to 2012. My comment is mostly on this article.
| anaisbetts wrote:
| Did you actually check it? Sonnet 3.5 generates text that
| seems legitimate and generally correct, but misreads
| important details. LLMs are particularly deceptive because
| they will be internally consistent - they'll reuse the same
| incorrect name in both places and will hallucinate
| information that seems legit, but in fact is just made-up.
| dr_dshiv wrote:
| Just have version control, and allow randomized spot checks
| with experts to have a known error rate.
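|
| A rough sketch of how randomized spot checks could yield a public
| error rate. Everything here is an assumption for illustration: the
| function names, the idea that documents have integer IDs, and the
| choice of a Wilson score interval to express the uncertainty that
| comes from checking only a sample.

```python
import math
import random


def spot_check_sample(doc_ids, k, seed=0):
    """Pick k distinct documents at random for expert review."""
    rng = random.Random(seed)
    return rng.sample(list(doc_ids), k)


def error_rate_interval(errors, checked, z=1.96):
    """Observed error rate plus a Wilson score interval
    (z=1.96 gives roughly 95% confidence)."""
    p = errors / checked
    denom = 1 + z * z / checked
    centre = (p + z * z / (2 * checked)) / denom
    half = z * math.sqrt(p * (1 - p) / checked
                         + z * z / (4 * checked * checked)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical numbers: experts review 50 of 1000 pages and find 3 bad ones.
sample = spot_check_sample(range(1000), 50)
rate, low, high = error_rate_interval(errors=3, checked=50)
print(f"observed {rate:.1%}, plausible range {low:.1%} to {high:.1%}")
```

| With a small sample the range is wide, so the published number
| should be the interval, not just the point estimate.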
| myth_drannon wrote:
| You don't use an LLM but other transformer-based OCR models
| like TrOCR, which have very low CER and WER.
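|
| For readers unfamiliar with the metrics: CER (character error rate)
| and WER (word error rate) are the edit distance between a reference
| transcription and the model output, divided by the reference length
| in characters or words. A minimal sketch in plain Python, assuming a
| non-empty reference; the function names are illustrative and not
| taken from any particular OCR toolkit:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits / reference characters."""
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits / reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Illustrative pair based on the misreading discussed upthread.
reference = "the declaration of James Lambert"
hypothesis = "the dedication of James Lambert"
print(f"WER={wer(reference, hypothesis):.2f}  CER={cer(reference, hypothesis):.2f}")
```

| One wrong word out of five gives a WER of 0.20 here, while the CER
| is lower because most characters in that word still match.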
| vintermann wrote:
| I don't know about this project, but I can easily find
| thousands of images that gpt-4o can't read, but a human
| expert can. It can do typed text excellently, antiqua-style
| cursive if it's very neat, and Kurrent-style cursive never.
| tptacek wrote:
| For straightforward reasons, I am commenting on this
| project, not the space of all possible projects. I did try,
| once, to get 4o to decode the Zodiac Killer's message. It
| didn't work.
| morning-coffee wrote:
| > Like if this is a crowdsourcing project...
|
| I'm confused by what you're asking. Are you asking me to like
| (upvote) your comment if this is a crowdsourcing project?
| Don't we already know it is a crowdsourcing project?
| enlightens wrote:
| The use of the word "like" here could be replaced with the
| word "so"
|
| "So if this is a crowdsourcing project..."
|
| Like is serving as an indication that someone else
| approximately said the phrase it introduced, in a way often
| associated with the "Valley Girl" social dialect but
| regularly seen outside of it.
|
| https://en.wikipedia.org/wiki/Like#As_a_colloquial_quotative
| thaumasiotes wrote:
| > The use of the word "like" here could be replaced with
| the word "so"
|
| Correct, but that's not a quotative use of the word. It's
| a discourse particle. You want to link one subsection
| down, _like_ as a discourse particle.
|
| https://en.wikipedia.org/wiki/Like#As_a_discourse_particle,_...
| tedunangst wrote:
| Something about extraordinary claims and extraordinary
| evidence? The evidence presented, a seemingly easily
| transcribed image, is hardly persuasive.
| rtkwe wrote:
| Some are significantly harder to read. I took the page below
| and tried to get GPT 4o to transcribe it and it basically
| couldn't do it. I'm not going to sit and prompt hack for ages
| to see if it can but it seems unable to tackle the
| handwritten text at the top. When I first just fed it the
| image and asked for a transcription it only (but
| successfully) read the bottom portion, prompted for a
| transcription of the top it dropped into more of a summary of
| the whole document mainly pulling some phrases from the
| bottom text. (Sadly can't share it but I copied its reply
| out in a comment upthread) [0]
|
| It was more successful at a few others I tried but it's still
| a task that requires manual processing like a lot of LLM
| output to check for accuracy and prompt modification to get
| it to output what you need for some documents.
|
| https://catalog.archives.gov/id/54921817?objectPage=8&object...
|
| [0] https://news.ycombinator.com/item?id=42746490
| interludead wrote:
| Still, the fact that they're combining AI and human effort
| makes sense
| mkoubaa wrote:
| High quality human transcriptions are the most valuable kind
| of training data
| prng2021 wrote:
| Determining whether the latest off the shelf LLMs are good
| enough should be straightforward because of this:
|
| "Some participants have dedicated years of their lives to the
| program--like Alex Smith, a retiree from Pennsylvania. Over
| nine years, he transcribed more than 100,000 documents"
|
| Have different LLMs transcribe those same documents and compare
| to see if the human or machine is more accurate and by how much.
| sandworm101 wrote:
| This is not an LLM problem. It was solved years ago via OCR.
| Worldwide, postal services long ago deployed OCR to read
| handwritten addresses. And there was an entire industry of
| OCR-based data entry services, much of it translating the
| chicken scratch of doctor's handwriting on medical forms, long
| before LLMs were a thing.
| lukeschlather wrote:
| LLMs improve significantly on state of the art OCR. LLMs
| can do contextual analysis. If I were transcribing these by
| hand, I would probably feed them through OCR + an LLM, then
| ask an LLM to compare my transcription to its transcription
| and comment on any discrepancies. I wouldn't be surprised
| if I offered minimal improvement over just having the LLM
| do it though.
| sandworm101 wrote:
| Why assume that OCR does not involve context? OCR systems
| regularly use context. It doesn't require an LLM for a
| machine reading medical forms to generate and use a list
| of the hundred most common drugs appearing in a particular
| place on a specific form. And an OCR reading envelopes
| can be directed to prefer numbers or letters depending on
| what it expects.
|
| Even if LLMs can push a 99.9% accuracy to 99.99, at least
| an OCR-based system can be audited. Ask an OCR vendor why
| the machine confused "Vancouver WA" and "Vancouver CA"
| and one can get a solid answer based in repeated testing.
| Ask an LLM vendor why and, at best, you'll get a shrug
| and some line citing how much better they were in all the
| other situations.
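|
| A toy sketch of the kind of context described above: snapping a raw
| OCR reading to the closest entry in a field-specific word list. The
| drug list, the cutoff value and the function name are all
| assumptions made for illustration, not how any real OCR product
| works. It relies only on difflib from the Python standard library:

```python
import difflib

# Hypothetical lexicon for one field on one form.
DRUG_LEXICON = ["amoxicillin", "ibuprofen", "metformin", "lisinopril"]


def snap_to_lexicon(raw, lexicon, cutoff=0.7):
    """Return the lexicon entry most similar to the raw OCR reading,
    or None when nothing is similar enough to trust."""
    matches = difflib.get_close_matches(raw.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# "rn" misread for "m" and a dropped "l" are typical OCR slips.
print(snap_to_lexicon("arnoxicilin", DRUG_LEXICON))
print(snap_to_lexicon("zzzz", DRUG_LEXICON))
```

| Because the lexicon and cutoff are explicit, a wrong correction can
| be traced back to a specific rule, which is the auditability
| argument made above.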
| iterance wrote:
| Are you guessing, or are there results somewhere that
| demonstrate how LLMs improve OCR in practical
| applications?
| Modified3019 wrote:
| Someone linked this above
|
| https://trustdecision.com/resources/blog/revolutionizing-ocr...
|
| > Our internal tests reveal a leap in accuracy from
| 98.97% to 99.56%, while customer test sets have shown an
| increase from 95.61% to 98.02%. In some cases where the
| document photos are unclear or poorly formatted, the
| accuracy could be improved by over 20% to 30%.
|
| While a small percentage increase, when applied to
| massive amounts of text it's a big deal.
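|
| The accuracy figures quoted above look small until they are
| restated as error rates. A short worked calculation, taking the
| vendor's quoted numbers at face value (they are that blog's claims,
| not independently verified); the function name is illustrative:

```python
def relative_error_reduction(acc_before, acc_after):
    """Fraction of errors eliminated when accuracy (in percent)
    moves from acc_before to acc_after."""
    err_before = 100.0 - acc_before
    err_after = 100.0 - acc_after
    return (err_before - err_after) / err_before


# Accuracy pairs quoted from the linked article.
for before, after in [(98.97, 99.56), (95.61, 98.02)]:
    cut = relative_error_reduction(before, after)
    print(f"{before}% -> {after}%: errors cut by {cut:.0%}")
```

| Both cases remove a bit more than half of the remaining errors
| (about 57% and 55%), which matters when every remaining error needs
| a human to find and fix it.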
| dambi0 wrote:
| For the addresses it might be a bit easier because they are
| a lot more structured and, in theory, the vocabulary is a
| lot more limited. I'm less sure about medical notes,
| although I'd suspect that there are fairly common things
| they are likely to say.
|
| The (admittedly single) example from the National Archives
| seems a bit more open-ended than the other two examples.
| It's not impossible that LLMs could help with this.
| WillAdams wrote:
| Yes, but there was usually a fall-back mechanism where an
| unrecognized address would be shown on a screen to an
| employee who would type it so that it could then be
| inkjetted with a barcode.
| prng2021 wrote:
| It was never "solved" unless you can point me to OCR
| software that is 100% accurate. You can take 5 seconds to
| google "ocr with llm" and find tons of articles explaining
| how LLMs can enhance OCR. Here's an example:
|
| https://trustdecision.com/resources/blog/revolutionizing-ocr...
| sandworm101 wrote:
| By that standard, no problem has ever been solved by
| anyone. I prefer to believe that a great many everyday
| tech issues were in fact tackled and solved in the past
| by people who had never even heard of LLMs. So, too, many
| things were done in finance long before blockchains
| solved everything for us.
| VWWHFSfQ wrote:
| OCR is not perfect. And therefore it is not "solved".
| Dylan16807 wrote:
| That definition, solved=perfect, is not what sandworm
| meant and it's an irrelevant definition to this
| conversation because it's an impossible standard.
|
| Insisting we switch to that definition is just being
| unproductive and unhelpful. And it's pure semantics
| because you know what they meant.
| philipwhiuk wrote:
| Not really, because this entire post is about that last
| fraction of a %.
| Dylan16807 wrote:
| It's not, because then they wouldn't want humans, because
| humans can't do 100% either.
| jadamson wrote:
| That's only true if the x% humans can't do is the same x%
| that OCR can't do.
| prng2021 wrote:
| From the article I linked:
|
| "Our internal tests reveal a leap in accuracy from 98.97%
| to 99.56%, while customer test sets have shown an
| increase from 95.61% to 98.02%. In some cases where the
| document photos are unclear or poorly formatted, the
| accuracy could be improved by over 20% to 30%."
| flir wrote:
| In my experience the chatbots have bumped transcription
| accuracy quite a bit. (Of course, it's possible I just
| don't have access to the best-in-class OCR software I
| should be comparing against).
|
| (I always go over the transcript by hand, but I'd have to
| do that with OCR anyway).
| asveikau wrote:
| OCR is very bad.
|
| As an example look at subtitle rips for DVD and Blu-ray.
| The discs store them as images of rendered computer text.
| A popular format for rippers is SRT, where it will be
| stored as utf-8 and rendered by the player. So when you
| rip subtitles, there's an OCR step.
|
| These are computer rendered text in a small handful of
| fonts. And decent OCR still chokes on it often.
| iandanforth wrote:
| Fun fact, convolutional neural networks developed by Yann
| LeCun were instrumental in that rollout!
| pinoy420 wrote:
| Agree. Sounds like not wanting to let go of a legacy
| brenainn wrote:
| The Australian War Memorial has a volunteer program for
| transcribing old letters and diaries and such:
| https://transcribe.awm.gov.au/
|
| I gave it a go but it was too hard for me! I write in cursive but
| I found most of it illegible.
| Decabytes wrote:
| I'm interested to give this a go because I want to practice
| reading cursive. I do a lot of longhand writing including writing
| all my notes in cursive. It's exciting to watch my binding fill
| up with all sorts of different subjects!
|
| I like to write in cursive for a few reasons
|
| 1. I find it makes my hand cramp less
| 2. It offers some shallow privacy in public
| 3. I don't want to lose the skill
| 4. It's fun!
| gabeio wrote:
| All of the same reasons I love practicing a little calligraphy!
| I love how it looks as well. I don't use a special pen but just
| add my own style to my cursive to make it look even nicer. But
| I used to write my notes in school with calligraphy (mostly
| because it gave me an excuse to not care about the subject) but
| it made the teachers hate me because I would never finish
| copying their scribbles fast enough.
| c0brac0bra wrote:
| I have a family heirloom civil war journal and much of it is
| unfortunately near undecipherable cursive writing.
|
| It would be great if this would eventually develop into some kind
| of set of open models that would work on content like this.
| tkgally wrote:
| This reminded me of something the historian Megan Marshall wrote
| in the introduction to her book _The Peabody Sisters: Three Women
| Who Ignited American Romanticism_ (2005):
|
| "I became expert in deciphering the sisters' handwriting, and
| that of their ancestors, parents, and friends. Each era and each
| correspondent presented different challenges. Some hands were
| sprawling, some spindly, some cramped; _t_'s went uncrossed at
| the ends of words, and _f_'s and _s_'s were interchanged;
| spelling, capitalization, and punctuation could be erratic or
| idiosyncratic. Often, to save paper and postage, the sisters
| turned a single sheet ninety degrees and wrote back across a page
| already covered with handwriting. I learned to be especially
| attentive to these cross-written lines, in which the sisters
| invariably confided their deepest feelings in the last hurried
| moments of closing a letter. Here I would find the urgent
| personal message that had been put off for the sake of dispensing
| news or settling business. In one such postscript, I discovered
| Elizabeth's account of a conversation with Horace Mann in which
| the two spoke frankly of their love for each other and finally
| settled on what it meant."
|
| A photograph of a letter with cross-writing is here:
|
| https://www.masshist.org/database/1774
|
| Marshall wrote more in an article for _Slate_:
|
| https://slate.com/news-and-politics/2005/05/reading-the-peab...
| teddyh wrote:
| > _and f's and s's were interchanged_
|
| Could these be instances of the long s, "ſ", easily confused
| with an f?
| Unearned5161 wrote:
| Ok I did one letter, from a woman in 1814 writing to James Monroe
| (then Secretary of State) asking for a passport to go to Scotland
| to get her late brother's property. What a trip! So enjoyable to
| get into the flow once you've "synchronized" with the person's
| handwriting. Furthermore, due to the fact that you're reading and
| re-writing word for word of whatever you're transcribing, the
| stories you end up reading have tremendous memory-stick. This is
| not surprising, considering that you are dedicating an inordinate
| amount of time per page, but it's a welcome side effect when you
| try and recollect.
| jhanschoo wrote:
| > Furthermore, due to the fact that you're reading and re-
| writing word for word of whatever you're transcribing, the
| stories you end up reading have tremendous memory-stick. This
| is not surprising, considering that you are dedicating an
| inordinate amount of time per page, but it's a welcome side
| effect when you try and recollect.
|
| This was something I enjoyed when I decided to learn a language
| by translating short stories. (Edit: Of course, you have to
| choose an author whose diction you respect. Your unfamiliarity
| with the target language encourages you to mull over the
| author's use of diction and the nuances the author is trying to
| convey, and then find appropriate diction in English. This
| means you spend a long time immersed in the imagery.)
| Unearned5161 wrote:
| What a brilliant idea. I've had learning to read French on my
| list for a while now, I'm going to try transcription as
| another way at it.
| interludead wrote:
| I love the idea of "synchronizing" with someone's handwriting
| Daneel_ wrote:
| I wish this technique worked for me. I can transcribe something
| verbatim and then have absolutely no idea what I've written - I
| have to go back and read it to actually parse the text.
| dylan604 wrote:
| That's not uncommon. I was the same way back when I took an
| actual typing class. The part of my brain used for
| storage/recall just seems to go to sleep when doing the whole
| transcription stage. Maybe it was a mental thing realizing it
| was just a task and no actual interest in the content other
| than accomplishing a task vs doing something I had a
| vested interest in???
| geuis wrote:
| It's a really interesting project. But boy do they make it hard
| to participate.
|
| * Article doesn't provide a direct link to the topic mission
|
| * Signup is pretty easy. Well organized and even gently requires
| you to have two forms of 2FA.
|
| * Sign up complete. Go back to the primary page and try to find
| the mission. A little buried but not too deep.
|
| * Notice I'm not signed in. Ok, let's do that. Now I'm back on
| the main page and navigate back. Find the first document and open
| it. Really interesting to scan through the doc and to read.
| People back then generally had really nice handwriting.
|
| * Ok, what next, how do I transcribe? ... ? Oh it says I'm not
| logged in again. Fine, click the link and...
|
| * I'm logged in and directed back to the main page, again.
|
| Look, this is an interesting project and I'd love to spend my
| spare cycles to help out. But they really need to clean up this
| process.
|
| Volunteers shouldn't have to jump through kinda poorly designed
| interfaces to help out.
| rtkwe wrote:
| The social post embedded in the page links directly to this
| page with all the instructions. Once I created an account and
| signed in I just selected a state in the original tab and was
| right there and could start translating.
|
| Do you perhaps have uBlock Origin enabled or some other
| limitation on Javascript/cookies that might be messing with
| your login status?
|
| The direct link to the mission that was in the social post.
| https://www.archives.gov/citizen-archivist/missions/revoluti...
| jcoby wrote:
| I had the exact same experience when I tried to contribute last
| week. I had to jump between multiple sessions and browsers and
| eventually managed to log in after about 30 minutes of trying.
| There is no indication of what is going right or wrong. Once
| you're in the UI changes very little as well so it's quite easy
| to miss that you've managed to log in.
|
| Once I was logged in I spent another 45 minutes trying to find
| a document to transcribe. Every single one I found or was given
| from a challenge had either already been transcribed or was a
| typewritten document or manifest that the OCR had already done
| an OK job with. I reviewed a few documents for accuracy, closed
| the browser, and never went back.
|
| It's a shame it's so hard to use. I really was hoping for
| something I could pop open for 15-30 minutes a day as a break
| from work and contribute to instead of doing a crossword or
| watching a video.
| poulpy123 wrote:
| My brother in history, I can't even read mine
| seletskiy wrote:
| To tptacek and other guys who seem to have unwavering trust in
| OCRs/LLMs, as well as to the opposite party who think that the
| technology is not there yet -- you are all partially right, but
| somehow fail to hear each other while also spending time on
| baseless arguing instead of factual examples and attempts to find
| common truth.
|
| Can it be used to greatly simplify efforts by getting through
| boilerplate? -- Yes.
|
| Should the result be reviewed and proofread by a human? -- Also
| yes.
|
| ---
|
| Here is a subtle one:
| https://catalog.archives.gov/id/34384201?objectPage=40
|
| Here is one of the transcripts made by `o1-pro`:
| (2) ...and I don't know whether it can be reset for a
| date in December or not. Cornell seemed anxious that it
| should not come up too close to Christmas, and of course
| new suspicion [would be aroused?] [about?] him. I will take
| this up with the Judge as soon as I can get rid of the brief.
| Meanwhile I would like to know whether there is anything else
| in which I can be useful to you, since it behooves me in
| ways of uncomfortable relations with the present management.
| Are you going East in December? Has any word come from
| Hagerman? Were there any noteworthy developments at the
| hearings on the [Teapot?] trial? I have no
| inclination yet whether Wheeler will be wanted in
| Washington, but the chances are that he will not. With
| regards to all the brethren and [flock?], I am very
| sincerely yours, George A. H. Fraser
|
| I'm not a native English speaker, but even I can see where it is
| wrong. I'll leave it as an exercise for the reader to find the
| mistakes, but it is certainly not a Teapot trial.
|
| Somehow GPT-4o performs better on this example and fails only on
| "New Mexican practise" part.
| teddyh wrote:
| A "Teapot trial" is not actually that farfetched:
| <https://en.wikipedia.org/w/index.php?title=Teapot_Dome_scand...>
| wriggler wrote:
| From https://www.handwritingocr.com - seemed to be more
| accurate, mostly getting the New Mexican and possibly other
| parts:
|
| ---
|
| and I don't know whether it can be reset for a date in December
| or not. Cornell seemed anxious that it should not come off too
| close to Christmas, and of course New Mexican practice would
| support him. I will take this up with the Judge and with Hanna
| the moment I can get rid of the brief. Meanwhile I would like
| to know whether there is anything else in which I can be useful
| to you, since it behooves me to be diligent in view of
| uncomfortable relations with the present management.
|
| Are you going East in December?
|
| Has any word come from Hagerman?
|
| Were there any noteworthy developments at the hearings on the
| Tenorio tract?
|
| I have no intimation yet whether I will be wanted in
| Washington, but the chances are that I will not.
|
| With regards to all the brethren and flock, Dan
|
| Very sincerely yours, George H. H. Baser
| dahart wrote:
| Looks entirely accurate except for the end. It's interesting
| it didn't catch "I am" or George's name correctly, given how
| difficult some of the text is on this page.
|
| Edit: Oh I see from another thread this OCR site is your
| creation. Nice work!
| dahart wrote:
| Consider using the reply feature so that your comment appears
| in context.
|
| Also your link goes to the wrong page. Here's the right one:
| https://catalog.archives.gov/id/34384201?objectPage=190
| electricant wrote:
| Today I learned that in the US children are not taught cursive
| handwriting. This is rather absurd to me. How are they supposed
| to write?
| animal531 wrote:
| In print? In general it's faster to write and a lot easier to
| read, also you save time by not having to learn two different
| systems.
| electricant wrote:
| Let me disagree. IMHO cursive is faster than print once you
| get the hang of it.
|
| However my point is valid for print too I guess.
|
| Regarding time saved and the fact that they are two different
| systems, I don't get it. Time saved for what? They are not so
| different, cursive is built on top of print, just optimized
| for not lifting the pen from the paper too often (hence it is
| supposedly faster to write).
| dahart wrote:
| > However my point is valid for print too I guess.
|
| What do you mean? You asked how kids can write without
| learning cursive, and print is the answer how. What is your
| point about print?
|
| Cursive might be faster for an experienced writer (though
| Google tells me that claim is debatable), but it takes a
| long time to get there. I learned cursive as a child, used
| it for years, and it was never faster than printing, it was
| much slower. When I say 'print', I use an in-between style
| of half-cursive fast print that isn't cursive but a lot of
| people use in practice, and it's much faster for me than
| trying to write legible cursive.
|
| However cursive is neither faster nor more legible to read,
| as evidenced by this article and the pages that need
| translating. If we're going to compare cursive and print,
| the metric should be overall speed and accuracy of
| communication, not how many milliseconds the pen-holder can
| save while writing something nobody can read.
|
| Today, it no longer matters. People type & text mostly, and
| typing is _way_ faster than either cursive or print. The
| number of situations that require handwriting continues to
| decline. We don't use handwriting enough anymore to develop
| cursive fluency and efficiency.
| astura wrote:
| >Cursive might be faster for an experienced writer
| (though Google tells me that claim is debatable), but it
| takes a long time to get there. I learned cursive as a
| child, used it for years, and it was never faster than
| printing, it was much slower.
|
| Cursive probably made sense at a time when everyone was
| writing with quill pens.
| IshKebab wrote:
| Print is definitely not faster to write. That's kind of the whole
| point of cursive. Also it's barely a "different" system. You just
| join the letters together. In the UK it's called "joined-up
| writing" and everyone learns it in primary school where there
| is plenty of time for learning.
|
| It is definitely easier to read print though - for a lot of
| people's handwriting anyway. It's much easier to be lazy and
| just do an illegible scrawl with joined-up writing than
| print.
|
| It varies a lot though; I had a PhD supervisor whose
| handwriting was illegible to everyone - even himself! My
| wife's handwriting on the other hand is practically a font.
|
| Print is just so slow to write...
| Havoc wrote:
| It's pretty country specific & not just US.
|
| German school: You have to write cursive with fountain pen
|
| South African uni: You're not allowed to write cursive, we
| can't read it
|
| ...sigh...just decide ffs
| lupire wrote:
| You do realize that you are posting on a thread whose OP thesis
| is that cursive is unreadable for most people.
| kmoser wrote:
| They could be forgiven for writing in print, but I wonder how
| they will "sign" their signature, e.g. for legal documents.
| Sure, they could print that, too, but it would be a departure
| from the many generations before them that learned how to
| "properly" sign their name. Are they embarrassed that they
| don't know how to write their name as a traditional signature?
| Do they care at all?
|
| I realize many legal documents are "signed" via keyboard,
| meaning you just need to type your name, but some things are
| still done via pen and paper.
| coffeefirst wrote:
| I've heard in Europe the kids are taught script using fountain
| pens, which are actually _faster_ when you don't pick up a
| pen.
|
| In the US, 25+ years ago when cursive was taught, we were
| largely using pencils and crappy bic pens. At which point, you
| don't really get the benefit of staying in contact with the
| paper for longer.
|
| This _might_ be part of the disconnect.
| euribates wrote:
| Americans... consider reading handwriting a superpower?
|
| This explains a lot
| jll29 wrote:
| Prompt: You are a paleologist specializing in
| analysis of cursive handwriting; tell me what the following text
| says: (pasting the picture).
|
| Output: The following is the declaration of James
| Lambert, a soldier of the Revolutionary War in North America.
| The said James Lambert this day personally appeared in the
| Probate Court of the County of Dearborn in the state of Indiana
| and at the November Term of said court (1841), it being a court
| for claims created by the laws of Indiana and makes oath that:
| "On the 25th day of March 1842, he will be eighty-five years old;
| that he was born in the state of Maryland; that he is now a
| resident of said county and has been for the 27 years last past;
| that he has lived in Virginia, Maryland, Pennsylvania ..."
| nejsjsjsbsb wrote:
| Might be in the training data:
|
| https://www.google.com/url?sa=t&source=web&rct=j&opi=8997844...
| ChrisMarshallNY wrote:
| They should ask a medical school for help ;)
|
| My family is Ivy-League, all the way, and has the worst goddamn
| cursive writing I've ever seen. It can take me an hour to read a
| Christmas card from my sister.
| wkjagt wrote:
| I've always wondered how pharmacists can read those
| prescriptions. There must be some kind of course in university
| that they followed.
| ivanjermakov wrote:
| I think with experience they know how each medicine is
| usually written? It's often easier to listen/read when you
| already know what it is about.
| valiant55 wrote:
| Not really a problem anymore, it's all been digitized at
| least for the most part.
| FireBeyond wrote:
| A lot of it is understanding the abbreviations.
|
| "2T BD IAF UF": "2 tablets, twice a day, immediately after
| food, until finished"
| kopirgan wrote:
| Is that true?! US kids don't learn cursive? How do they write?!
| thesagan wrote:
| Those around me just write a lot more slowly, writing in print
| (they don't connect the letters like in cursive, they can't
| easily read my very-clean cursive either, which gives a feeling
| that my cursive is a sort of superpower)
| astura wrote:
| I learned cursive in 2nd grade and was very strictly REQUIRED
| to use it up until high school, where they stopped requiring
| cursive.
|
| 1) My cursive was always slower than print. I was happy to go
| back to print so I could write fast. I went to school in the
| "analog" era, so 100% of all assignments were hand written
| and not typed.
|
| 2) I noticed that literally only 1 person in my school stayed
| with cursive when printing was an option. It was so unusual
| it stuck out.
|
| 3) I only know one person who writes cursive now in everyday
| life even though 100% of us learned it in school.
|
| 4) That person is my dad and he writes in the style of these
| documents. If you gave me one of these documents and told me
| my dad wrote it, I'd believe you.
|
| Which makes me think we all somehow were taught cursive wrong
| or practiced it wrong. My cursive was never fast and never
| looked like these documents.
|
| Anyway, I found this, which summed up my feelings learning
| cursive perfectly
|
| https://nautil.us/cursive-handwriting-and-other-education-my...
|
| >Reading and literacy expert Randall Wallace, of Missouri
| State University, says "it seems odd and perhaps distracting
| that early readers, just getting used to decoding manuscript,
| would be asked to learn another writing style."
|
| I found it so frustrating that I just learned how to write
| one way and then they tell me that's not the "proper" way to
| write and we need to learn this other way to write.
| kopirgan wrote:
| Very interesting.. Frankly did not know most of what's said
| in replies.. That it's not compulsorily taught and more
| surprisingly it's slower to write!
|
| I thought having to lift pen repeatedly would be slower?
| Anyway I need to try to really know I guess! Versus the
| time taken to add those extra links.
|
| Like most others I've not written much in years perhaps
| decades, that has screwed up my handwriting as even minor
| notes are these days illegible even to me after a few days
|
| Thanks for the replies.. Cleared a few misconceptions...
| One of them being writing in blocks is somewhat 'childish'
| and cursive is more literate.
|
| Added later: read parts of the long article it's very
| interesting.. Need to read it fully.
| astura wrote:
| >I thought having to lift pen repeatedly would be slower?
|
| The extra strokes required for all those fucking loops
| more than make up for having to pick up the pen.
|
| Cursive probably made a lot of sense when people were
| writing with quill pens, but in modern times each
| individual has their own comfort level and preferences.
|
| >Cleared a few misconceptions... One of them being
| writing in blocks is somewhat 'childish' and cursive is
| more literate.
|
| I was taught exactly that when I was growing up, which is
| why cursive was required for all school assignments pre
| high school. I always thought it was bullshit though
| because books aren't written in cursive and I only knew a
| single adult that used cursive in their everyday lives.
| It seemed like a weird academic script.
|
| I think a big reason I was so frustrated with being
| forced to use cursive in school was because after I
| learned to write in print and before I learned cursive I
| wrote a LOT. Like I'd write stories almost every day. I
| loved writing so much and then they gave me this new
| script that I needed to use for writing that slowed me
| down. It's like... Stop changing things on me.
|
| I'm really glad cursive is no longer required in a lot of
| places. My school years would have been so much better
| without being forced to use cursive.
| mikedelfino wrote:
| I guess that using block letters, also known as print writing.
| From Wikipedia: Elementary education in English-speaking
| countries typically introduces children to the literacy of
| handwriting using a method of block letters, which may later
| advance to cursive. The policy of teaching cursive in American
| elementary schools has varied over time, from strict
| endorsement, to removal, to being reinstated.
| celsoazevedo wrote:
| Print/block letters. Random picture from the web:
| https://i.imgur.com/4X1Mz11.jpeg
|
| I grew up in Portugal, so a different education system, and
| used cursive until I was 11 or 12. But I had terrible hand
| writing and one day during class I decided to write text like
| it was printed on books, computers, etc, and that's what I've
| been doing since then. Still looks bad, but at least it's
| readable :P
| Daneel_ wrote:
| I learnt it here in Australia in my early school years, and
| hated it because it was both slower to write and more difficult
| to read. I switched back to standard writing as soon as I was
| allowed.
| kopirgan wrote:
| Unless it's really badly written, like mine is these days, I
| can read cursive quite comfortably. Guess it's a matter of
| habit.
| mkoubaa wrote:
| I learned cursive in elementary school in the US. But I went to
| a private Islamic school
| awithrow wrote:
| I'm in the US and learned it in school. I just never really
| needed to use it consistently. Assignments and papers that were
| still handwritten could be done either way. Cursive never felt
| noticeably faster for me to write. I'm sure it would have had I
| been forced to do it. By the time I was in high school (1999),
| I remember typing most long form assignments. Now the only time
| I ever read cursive is on letters from my mom and her cursive
| is not particularly neat or clean.
| astura wrote:
| >Cursive never felt noticeably faster for me to write. I'm
| sure it would have had I been forced to do it.
|
| I was forced to use cursive and it was still slower than
| print.
| Aloisius wrote:
| I'm not sure why cursive would be faster given the
| letterforms require a lot more travel.
|
| Maybe it would be when writing with a quill where splatter
| and breakage were a concern, but surely not with a
| ballpoint pen.
| SCPlayz7000 wrote:
| This is cool.
| epgui wrote:
| An army of pharmacists ought to do the trick!
| Adachi91 wrote:
| A dying breed of them, perhaps before they retire.
|
| I haven't seen a prescription pad in a decade, it's all
| electronic now in my part of the southern US, my current
| pharmacist is so young I don't know if they would even be able
| to read some of my previous providers' writing.
| gdubs wrote:
| FWIW since so many people here seem set on the idea that cursive
| is archaic / useless today, Montessori schools still teach
| cursive before print because the flowing letters are easier for
| kids and more similar to drawing, and all the exercises they do
| around letter tracing.
|
| The result is that kids in Montessori learn to read faster and
| earlier. (They're usually writing in cursive _first_, which
| gives them a foundation of the letters and their phonetic sounds,
| before they begin reading exercises in earnest.)
| ternnoburn wrote:
| Kids with dysgraphia sometimes can successfully write in
| cursive and cannot write in block letters. I don't know where I
| fall on how hard it should be taught, generally, but it's
| clearly very helpful to some kids.
| nosioptar wrote:
| I'm the opposite. Dysgraphia rarely impacts my print writing,
| my cursive is an absolute mess of kludged up letters that are
| completely indecipherable.
| MarkusWandel wrote:
| Curious, how hard is the sample in the article meant to be? I
| grew up (in the 1970s) in a world in which cursive still ruled.
| But the variant that we were taught in school was already
| considerably evolved from the one used by my grandparents, and
| those were modern compared to the archaic German script (
| https://en.wikipedia.org/wiki/S%C3%BCtterlin ) so I've never
| thought of myself as good at reading cursive. And of course
| haven't written (or read) much of it in the decades since.
|
| It took about one minute to decipher the first sentence in the
| sample. Is that considered good these days?
| t1amat wrote:
| For me, the first sentence was almost immediately readable, I
| just had to slow down a bit to decipher the name
| ggddv wrote:
| I've found much of the "reading" of cursive of my teachers was
| just basically snobbery. If it's illegible but curly, well I
| just read it wrong! Illegible but straight, you wrote it wrong!
| TheRealPomax wrote:
| They're not "meant to be hard", they're just normal texts. The
| question is literally "can you read this?" because if you can:
| "Cool! Want to help transcribe it because the constraining
| factor when it comes to digitizing cursive is literally how
| many humans we can get to help out".
| cvoss wrote:
| Someone with practice at reading old cursive would likely be
| able to read a sample such as this one at least at a pace
| suitable for reading aloud. An expert, of course, could do it
| as fast as if it were their "native" script.
|
| Here is an example of a non-expert compared to an expert
| reading aloud [0].
|
| I learned cursive in school in the early 2000s, but I could
| never read my grandmother's handwriting. Whenever she mailed me
| a card, I would have to have my mom read it to me.
|
| [0] https://www.youtube.com/watch?v=cRhDClIs8XE&t=165
| peter_retief wrote:
| How does one actually sign up?
| kmoser wrote:
| From https://www.archives.gov/citizen-archivist/register-and-get-...:
|
| > Citizen Archivists must register for a free user account in
| order to contribute to the National Archives Catalog. Begin the
| registration process by clicking on the Log in / Sign Up button
| found in the upper right hand corner of the Catalog.
|
| Catalog: https://catalog.archives.gov/
| jedberg wrote:
| They should hire a bunch of teachers to do this over the summer!
| Every teacher I know is an expert at reading terrible
| handwriting.
| anonymous_379 wrote:
| Why did people use to write like this?
| slater wrote:
| It's faster than writing out individual letters.
| madmask wrote:
| I still write like that
___________________________________________________________________
(page generated 2025-01-18 23:01 UTC)