[HN Gopher] First word discovered in unopened Herculaneum scroll...
___________________________________________________________________
First word discovered in unopened Herculaneum scroll by CS student
Author : razin
Score : 249 points
Date : 2023-10-12 14:11 UTC (6 hours ago)
(HTM) web link (scrollprize.org)
(TXT) w3m dump (scrollprize.org)
| sillysaurusx wrote:
| See also Nat's twitter announcement:
| https://twitter.com/natfriedman/status/1712470683207532906
|
| $700k is a life changing amount of money. I admit, it's tempting
| to drop everything and go devote myself like a monk to the
| pursuit of ancient enlightenment via modern ML. I wonder where
| we'd start...
|
| It's also funny that the scroll might just be a laundry list.
| chakintosh wrote:
| Or a customer complaint:
| https://www.thearchaeologist.org/blog/complaint-tablet-to-ea...
| Wojtkie wrote:
| What I love about the Ea-Nasir story is the tablet was found
| in a pile of other tablets, suggesting that Ea-Nasir saved
| them. Why? Who knows, maybe he found them funny.
| vimax wrote:
| I heard somewhere it was common practice to reuse tablets.
| It was easier to scrape the surface clean than to make a
| new tablet.
|
| You'd save any tablets you have, and might wait until you
| need it to scrape it clean.
|
| In Mesopotamia there was a period where it was fashionable
| to use a more rare softer red clay on top of the white
| clay. Your stylus would cut through the top layer leaving
| nice white letters on a red background. It made it easier
| to scrape clean and reuse, but much less durable over time.
| jdminhbg wrote:
| Yes, the clay tablets were used over and over. The ones
| that are preserved have what was written on them when
| they were fired, accidentally, by being in a building
| that was destroyed by fire.
| 0xf00ff00f wrote:
| A laundry list with something purple...
| seydor wrote:
| Purple was the color of nobility and rather rare. It might be
| the description of a king or a room or Roman fashion items.
| riffraff wrote:
| Or a complaint about bad writers
|
| https://en.m.wikipedia.org/wiki/Purple_prose
| empath-nirvana wrote:
| it might cost more than $700k in compute.
| latchkey wrote:
| It certainly did.
| https://news.ycombinator.com/item?id=36312385
| cosmojg wrote:
| Where do they say that the winners used that cluster?
| latchkey wrote:
| It is an assumption based on the fact that the codebase
| uses CUDA and the main backer of the project owns the
| cluster.
| terhechte wrote:
| This is one of more than 600 scrolls that could be read
| afterwards if the method becomes scalable. What's more:
| "excavations were never completed, and many historians believe
| that thousands more scrolls remain underground." [0]
|
| [0]: https://scrollprize.org
| jdminhbg wrote:
| > It's also funny that the scroll might just be a laundry list.
|
| Most likely not, I believe they're starting with scrolls that
| were readable on the outside, which we know are minor works of
| Greek stoic philosophy. Also a laundry list would be written on
| a reusable wax tablet, rather than costly papyrus.
| michael_nielsen wrote:
| It's likely somehow a reference to the Emperor. Purple cloth
| was extremely rare and expensive, and it was the colour worn by
| the Emperors. Indeed, it eventually became a capital crime for
| people outside the Emperor's family to wear it. I don't know if
| that was yet true at the time of Vesuvius, although Wikipedia
| claims Caligula may have had someone killed for wearing purple.
| Arete314159 wrote:
| The other word visible is "oino", wine. Wine can be described
| as purple.
| OfSanguineFire wrote:
| While modern people make that connection, that is
| culturally dependent. The color terms available to speakers
| of a language, and what objects those terms can be
| associated with, change over time. In the case of the Greek
| word for "purple", it was connected to a dye and therefore
| used for clothing, but one shouldn't expect it to be used
| for wine.
| dataflow wrote:
| > $700k is a life changing amount of money
|
| Probably ~half of that will go to taxes?
| thrway63245 wrote:
| Not sure why this is downvoted. Yes, in California half will
| go to taxes and the rest is enough for a downpayment on a
| shack. Hardly life changing.
| jedberg wrote:
| > It's also funny that the scroll might just be a laundry list.
|
| Even if it were, a laundry list from 2000 years ago would be a
| fascinating read.
| davidw wrote:
| That's extremely cool. I wonder what we'll learn.
|
| As an aside, the "Professor Seales and team scanning at the
| particle accelerator" photo looks like it came from a TV show.
| "If we keep telling the computer 'enhance', we'll be able to read
| it".
| adamlgerber wrote:
| i love this project. i feel like this is going to be a great
| source of interest and value over the next few years (and
| potentially immeasurable value over longer time frames).
| kelsey9876543 wrote:
| I recently saw a wonderful youtube video on this:
| https://www.youtube.com/watch?v=Z_L1oN8y7Bs
|
| Title: Herculaneum scrolls: A 20-year journey to read the
| unreadable
|
| It goes a little bit into the technology of how this was done:
| deep learning finally cracked the code. They had the scans for a
| decade but it took ML training to be able to identify which parts
| were paper and which parts were the ink on top. This had been
| done on a different set of scrolls with easier to read higher
| contrasting materials like the video says, 20 years ago. Deep
| learning is cracking the code for these datasets we had
| previously thought were impossible to algorithmically solve.
| nulbyte wrote:
| Thank you for sharing. It's a month old, but even so, I just
| saw a pinned comment posted an hour ago about an announcement
| coming later today.
| versteegen wrote:
| Can't speak for the video, but this is a bit misleading
| actually. What cracked this was actually visual inspection
| looking for patterns which could then be used as better
| training data, which so far apparently hasn't found very many
| letters that were too hard to see. Read the OP describing the
| iterative process of hand-annotation guided by output of a
| model, then retraining the model with the additional data, it's
| a fascinating technique! Simply using deep learning on the
| initially available ground truths without knowing what features
| the models should be looking for actually pretty much didn't
| work!
|
| Also, so far the process of virtually unrolling the scrolls is
| mostly manual and extremely labour intensive.
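|
| The annotate-then-retrain loop described above can be sketched
| in a few lines. This is only a toy illustration, not the
| contestants' code: the 1-D "crackle" feature, the threshold
| model, and the simulated annotator (who simply looks up the true
| label) are all assumptions made up for the example.

```python
import random


def train(labeled):
    """Fit a toy 1-D model: the threshold midway between the two class means."""
    ink = [x for x, y in labeled if y == 1]
    bare = [x for x, y in labeled if y == 0]
    return (sum(ink) / len(ink) + sum(bare) / len(bare)) / 2.0


def predict(threshold, x):
    """Return 1 (ink) if the feature exceeds the threshold, else 0 (bare papyrus)."""
    return 1 if x > threshold else 0


def iterative_labeling(features, truth, seed_idx, rounds=5, per_round=10):
    """Grow the labeled set round by round, retraining after each round."""
    # Start from a handful of hand-labeled seeds (hypothetical).
    labeled = {i: truth[i] for i in seed_idx}
    for _ in range(rounds):
        threshold = train([(features[i], y) for i, y in labeled.items()])
        unlabeled = [i for i in range(len(features)) if i not in labeled]
        # Surface the points the model is most confident about first.
        unlabeled.sort(key=lambda i: abs(features[i] - threshold), reverse=True)
        for i in unlabeled[:per_round]:
            # Simulated annotator: inspects the suggestion and records the
            # true label. A real annotator would judge the image by eye.
            labeled[i] = truth[i]
    final = train([(features[i], y) for i, y in labeled.items()])
    return final, labeled


def demo():
    """Run the loop on synthetic data; return (accuracy, number of labels)."""
    rng = random.Random(0)
    truth = [i % 2 for i in range(200)]  # alternate bare / ink
    features = [rng.gauss(1.0 if y else 0.0, 0.3) for y in truth]
    threshold, labeled = iterative_labeling(features, truth, [0, 1, 2, 3])
    correct = sum(predict(threshold, x) == y for x, y in zip(features, truth))
    return correct / len(features), len(labeled)


if __name__ == "__main__":
    accuracy, n_labels = demo()
    print(f"accuracy={accuracy:.2f} with {n_labels} labels")
```

| The point of the sketch is the shape of the loop: a few seed
| labels, a model pass that suggests where to look next, a human
| check, then retraining on the larger set.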
| kelsey9876543 wrote:
| Thank you for adding the deeper insight! The competition and
| the methods used are very fascinating indeed.
| tclancy wrote:
| Somewhat off-topic but if you clicked in here, you might be
| interested in this book: "The Riddle of the Labyrinth: The Quest
| to Crack an Ancient Code".
| jdminhbg wrote:
| This is the 21st-century equivalent of living through the opening
| of Tut's tomb. Incredible to think there's a very real chance
| that in the medium-term future you might be able to buy a copy of
| a newly-translated work on Amazon that hasn't been read for
| millennia.
| carapace wrote:
| Why the ad for Amazon?
| jdminhbg wrote:
| It's just a reference to it becoming a boring, pervasive part
| of culture. Please feel free to buy those translations from any
| book company you feel like.
| carapace wrote:
| Sorry, I'm just cranky this morning.
| alanbernstein wrote:
| Surely they will be public domain by now??
| lexicality wrote:
| It is disgraceful that the ancient Greek authors won't
| see an obol that these so called "translators" and
| "historians" make from reselling their work.
|
| They should sue! /s
| jdminhbg wrote:
| The original Greek text is, but I got a C in Greek so
| I'll have to pay for a copyrighted English translation.
| versteegen wrote:
| The lettering was found by looking for 'crackle' texture on
| papyrus segments from the CT scans which obviously were in the
| shape of Greek letters, and annotating those as training data.
| Unfortunately such crackle texture isn't visible, at least by
| eye, on most of the papyrus. Probably it's only that visible
| where the ink was very thick. You can easily see the difference
| in texture in this electron microscope image [1] (far higher
| resolution than the CT scans) but especially on the very edge of
| the inked area (the narrow strip in the left image; I think the
| whole right image is inked) where the ink was pushed to. I'm
| surprised the crackle was discovered only after the Kaggle Ink
| Detection contest. Looking at the CT-scanned fragments with
| infrared ground truths, which were used in the Kaggle contest,
| Casey Handmer wrote [2]:
|
| > The ongoing apparent failure of deep-learning based ink
| detection based on the fragments indicated to me that direct
| inspection of the actual data would be more fruitful, as it has
| been here.
|
| > ...
|
| > I found similar "cracked mud" and "flake" textures
| corresponding to known character ink, but only for perhaps 10% of
| the known characters. It's been a long day, I can probably find
| more on closer inspection, but that does make one wonder about
| automated ink detection and what that is seeing.
|
| These new images are much better than I hoped for, but still only
| in one small area, so I'm still pessimistic about more than an
| odd sentence being readable.
|
| [1] https://scrollprize.org/img/tutorials/sem.png
|
| [2] https://caseyhandmer.wordpress.com/2023/08/05/reading-ancien...
| munificent wrote:
| I love uses of machine learning like this a thousand times more
| than generative LLMs spouting probable-sounding nonsense.
| esafak wrote:
| It is amazing what some college student can pull off with today's
| technology.
| Rallen89 wrote:
| >Shortly after that, another contestant, Youssef Nader,
| independently discovered the same word in the same area, with
| even clearer results -- winning the second place prize of
| $10,000.
|
| That's what u get for optimising your code
| hansoolo wrote:
| I thought the same. He had the better results, but too late.
| QuercusMax wrote:
| Or maybe the winner optimized his code, resulting in faster
| time to get results. Either one is equally plausible!
| zeteo wrote:
| Not really:
|
| >Youssef used a model from the Kaggle competition and was
| inspired by Luke's results to look in the same area.
| autokad wrote:
| imagine the person making this scroll 2,000 years ago wondering
| 'I wonder if some kid 2000 years in the future is going to win a
| boat load of money by reading this'
| nataliste wrote:
| I wrote this for a different community (filled with semiliterate
| sophists), but this is absolutely huge and could upend huge
| swathes of understanding about the last two thousand years.
|
| You can avoid the longform essay below if you want. The short of
| it is there are several potentially common works possibly in the
| library that could directly prove or disprove what is found in
| the New Testament and the predicates of Rabbinic Judaism as
| established at the Council of Jamnia.
|
| We could be seeing the beginning of conclusive proof that
| invalidates the narratives of Christianity, Judaism, and Islam by
| the end of the year.
|
| The Vesuvius Challenge isn't just an interesting contest in the
| machine learning realm; it's a groundbreaking endeavor that could
| redefine our understanding of the humanities if successful. The
| opportunity to digitally unroll and read the Herculaneum Papyri
| could offer unprecedented insights into ancient civilizations and
| the total feedstock of civilization today. This is not merely
| about filling in some historical gaps; it's about fundamentally
| altering how we understand antiquity and, by extension, our own
| intellectual heritage.
|
| The loss of the Library of Alexandria has long been considered a
| "dark age" event for intellectual progress. Now, consider the
| Herculaneum library--a collection of papyri from a villa once
| owned by Julius Caesar's father-in-law, carbonized but preserved
| by the Vesuvius eruption in 79 AD. Hundreds of these scrolls are
| unreadable because their carbon-based ink blends in with the
| carbonized papyrus, and thus are invisible to conventional
| imaging techniques. Yet, these scrolls are quite possibly on the
| cusp of revelation.
|
| Recent developments have introduced machine learning and high-
| resolution X-ray scans as methods for reading these "unreadable"
| scrolls. What texts do they contain? Treatises on science and
| philosophy? The lost books of Livy? The epic cycle? Governmental
| policies like the Twelve Tables? It's a tantalizing question
| because whatever is locked in those scrolls could be an
| unfiltered look at the Roman Empire--an empire that fundamentally
| influenced the trajectory of Western culture, religion,
| governance, and philosophy.
|
| Ponder a history of Rome that has not been retouched by myriadic
| emperors, by Constantine's Christianity, or the interpretive lens
| of the Roman Catholic Church. Unmediated accounts of Roman
| society, unaltered by the layers of religious and political power
| that came later, could rewrite our textbooks and shift the
| justification of history. It's not just about enriching our
| understanding of ancient civilizations; this could be a
| cornerstone on which to build a fresh philosophical understanding
| of human society.
|
| If the project succeeds, there will be repercussions in the
| academic realm. The humanities have long struggled to justify
| their existence in a world that increasingly prizes STEM and
| lacks any novel sources for the classical world. Suddenly, there
| could be a concrete, urgent task at hand: to decode, interpret,
| and integrate an influx of new knowledge. The Vesuvius Challenge
| could revitalize the field, offering an unforeseen but compelling
| reason for its study. In essence, it provides a utilitarian
| justification for the humanities, one that transcends 'cultural
| enrichment' and enters the realm of 'historical redefinition.'
|
| The Vesuvius Challenge could be the hinge upon which history
| swings, yielding intellectual treasure that could be as
| groundbreaking as the writings that were lost in Alexandria. For
| millennia, those scrolls have remained unread. Now, it's a
| software problem. That's not just a challenge; it's an
| imperative.
|
| The presence of specific works in the Herculaneum Papyri could
| dramatically impact our understanding of major historical events.
|
| In particular for me, I pray that the biography of Herod the
| Great by Nicholas of Damascus is discovered intact. While
| mainstream accounts generally portray the life of Herod within
| the context of Roman patronage and Judaean politics, uncovering a
| contemporary account by a close intimate (and used as a primary
| source by Josephus) would offer fresh, unmediated insights into
| his rule and its socio-political intricacies. Chronologies of the
| life of Jesus could be explicitly validated or disproved.
|
| The relevance here is far from academic. Consider the following
| naturalistic hypothesis: that the inception and rise of
| Christianity was entirely a dynastic struggle within the
| Hasmonean-Herodian line. What if the tale of Jesus is, in
| essence, a dramatized, mystified rendition of a 1st-century
| dynastic conflict, one that was subsequently co-opted and
| transformed into a religious narrative by an early form of
| conspiratorial thinking? Something like a 1st-century version of
| Q-anon, distorting real events to serve an alternative, concealed
| agenda in the aftermath of the First Jewish-Roman War.
|
| Unveiling a document like Nicholas of Damascus' biography could
| be groundbreaking in testing such a hypothesis. If Herod's life
| and rule were detailed without the religious overlays that later
| Christian interpretations bring into the picture, one could make
| more definitive assertions about the socio-political environment
| of the time. Furthermore, it could provide concrete evidence to
| either substantiate or refute theories about Christianity's
| emergence as a byproduct of a Herodian-Hasmonean power struggle.
|
| The fact that such a theory could be _tested_ is significant in
| its own right. Traditionally, discussions about early
| Christianity rely heavily on religious texts and subsequent
| historical accounts, many of which are fraught with dogma and
| ideological interpretations. A primary source devoid of such
| influences would be a game-changer, offering a baseline of raw
| data from which more accurate and reliable hypotheses could be
| drawn.
|
| And it's not limited solely to Christianity. The implications for
| Rabbinic Judaism could be equally monumental. The owner
| of the villa, likely a wealthy Roman, would be unlikely to have
| had any primary Hebrew texts like the Pentateuch. However, that
| doesn't rule out the possibility of possessing Greek or Latin
| works discussing Jewish culture, beliefs, and politics. Given the
| villa's historical context, it's conceivable that there might be
| indirect ethnographic accounts from the period surrounding the
| destruction of Jerusalem in 70 AD but before the Council of
| Jamnia, traditionally dated around 90 AD, which helped canonize
| Hebrew scriptures.
|
| Why is this important? The Council of Jamnia is often cited as a
| crucial moment for the development of Rabbinic Judaism. It
| allegedly led to the fixing of the Hebrew Bible canon and
| crystallized what would become Talmudic tradition. If documents
| were to surface that provide a snapshot of Judaic thought and
| practice just before this council, it could upend millennia of
| precedent and identity.
|
| In a broader context, discovering pre-Jamnia ethnographic sources
| could significantly change our understanding of how Judaism
| adapted and evolved in the aftermath of the Second Temple's
| destruction. This could lead to far-reaching questions. How much
| of the Talmudic tradition was actually a post-hoc rationalization
| or systematization of beliefs and practices that were far more
| fluid before the Council of Jamnia? How much anti-Romanism was
| pared away to prevent suppression? Moreover, how would such a
| revelation interact with or even challenge the validity of
| current Rabbinic and Orthodox Jewish practices?
|
| The implications for the Judeo-Christian heritage as a whole are
| staggering. If both Christianity and Judaism could be traced back
| explicitly to politically or socially motivated machinations,
| rather than divinely inspired or time-honored traditions, the
| entire foundation of Judeo-Christian culture would come into
| question. In essence, the Vesuvius Challenge has the potential to
| destabilize two of the world's major religious traditions at
| their historical roots. It is difficult to overstate the
| potential impacts.
|
| The Vesuvius Challenge is not just an academic or technological
| endeavor. Its success could instigate an unparalleled
| epistemological crisis in religious studies and the humanities.
| It provides the opportunity to re-examine, with primary sources,
| the historical foundations of Western religious, cultural, and
| ultimately political traditions. We're not just potentially
| rewriting history here; we're reevaluating the very frameworks
| through which that history has been understood.
| narag wrote:
| So this is just the very beginning? Will they be able to
| decipher whole docs? I guess you wouldn't have written all that
| otherwise!
|
| Anyway, if there's religion involved, I doubt any revelation
| will shake anything.
| 1vuio0pswjnm7 wrote:
| "He found a few dozen ink strokes - and some complete letters -
| that could be labeled and used as training data.
|
| Before long, the model was unveiling traces of crackle invisible
| to his own eye. Soon, these traces began to form letters and
| hints of actual words."
|
| This does not sound like a "Large Language Model (LLM)" or other
| large set of training data, like the sort hyped by so-called
| "tech" companies; this sounds relatively small. What am I
| missing? (Besides brain cells.)
___________________________________________________________________
(page generated 2023-10-12 21:00 UTC)