[HN Gopher] Autogenerating a Book Series from Three Years of iMe...
___________________________________________________________________
Autogenerating a Book Series from Three Years of iMessages
Author : wonger_
Score : 370 points
Date : 2024-03-06 13:27 UTC (1 days ago)
(HTM) web link (benkettle.xyz)
(TXT) w3m dump (benkettle.xyz)
| helboi4 wrote:
| Now to make this work for Whatsapp for the brits... Got excited
| at the idea of a project and then realised I will have to learn
| Rust if I was to fork this haha.
|
| Anyway, this is definitely a cool idea. Reading my chat history
| with friends is actually very nostalgic.
| hu3 wrote:
| This Python package exports WhatsApp backup chat to JSON or
| HTML:
|
| https://github.com/KnugiHK/WhatsApp-Chat-Exporter
|
| It also links to a Telegram exporter.
| westernpopular wrote:
| > It also links to a Telegram exporter.
|
| Telegram has this natively already
| hu3 wrote:
| Indeed! That was unexpected for me.
| netsharc wrote:
| WhatsApp lets you export chats as txt, but I guess that's
| lossy.. e.g. I'm not sure the emojis will be there. Surely no
| attachments or voice messages.
|
| As for extracting from the backup DB, they'll be encrypted..
| andyjohnson0 wrote:
| Whatsapp on Android lets you export a chat with or without
| media (images etc), but it limits the number of messages.
| With media you get the last 10k messages, and without you get
| the last 40k. Emojis are preserved though.
|
| See https://faq.whatsapp.com/1180414079177245/?cms_platform=a
| ndr...
|
| The limits are supposedly due to email size limits but, as
| they also apply when exporting to non-email endpoints like
| Google Drive, I suspect they're more to do with preventing
| people from moving their chats to other services.
|
| And yes, the db is encrypted.
| nolongerthere wrote:
| I wonder if those limits will be eased with the new EU
| regulations...
| Rygian wrote:
| GDPR is already 6 years old.
| adastral wrote:
| > they'll be encrypted
|
| Since some months (years?) ago, WhatsApp lets you set up your
| own encryption password for the DB backup. I set one up and
| used https://github.com/ElDavoo/wa-crypt-tools to get access
| to the decrypted SQLite and run some analytics over my
| messages :)
| dav43 wrote:
| For the rest of the world...
| helboi4 wrote:
| I was going to say that but I then remembered all the many
| many other apps that a lot of other countries used, and
| therefore I didn't want to act like I wasn't aware of those.
| For example, WeChat, Line, KakaoTalk, and more. Whatsapp is
| not at all universal, even if it might be the most common in
| many European countries.
| solstice wrote:
| There is a python library for Wechat, but I'm having
| problems using it. Also, 1) getting and then 2) decrypting
| the Wechat database isn't easy. Because my phone is not
| rooted, I had to use an android emulator on my PC, transfer
| all my chats over there, extract the DB and all media.
| After installing all the fickle dependencies of the
| bruteforce decrypter, it took three days straight on my
| laptop from 2014 to decrypt, and now I can finally open it
| in an SQLite viewer. But that still leaves the major step
| of getting formatted messages out of there like in the OP.
| The HTML-conversion script that I used produced half-decent
| results, but hasn't been maintained for a while and thus
| chokes on certain messages so that the conversion of large
| chats invariably breaks down before being finished. Anyway.
| Maybe it is time to learn Python...
| VikingIV wrote:
| This desperately needs to happen -- in a way that all messages
| and media can be exported in a sensibly reviewable format.
| Heck, I'd just like to be able to archive a backup that I know
| can be restored in the future on another device.
| mif wrote:
| That's awesome. I would like to do that with Telegram or
| WhatsApp.
| hu3 wrote:
| https://github.com/KnugiHK/WhatsApp-Chat-Exporter
|
| Just in case you missed my other comment.
|
| Not my repo.
| ivanjermakov wrote:
| Telegram gives you a JSON export of a full conversation, should
| not be complicated to adapt the input.
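|
| A minimal sketch of what adapting the input might look like. The
| field names ("messages", "type", "from", "date", "text") and the
| mixed string/entity form of "text" are assumptions about the export
| layout, so check them against a real export before relying on this:

```python
import json

def flatten_text(text):
    """Return plain text from a 'text' field.

    Assumption: the field is either a plain string or a list mixing
    strings with entity dicts that carry their own 'text' key.
    """
    if isinstance(text, str):
        return text
    parts = []
    for piece in text:
        if isinstance(piece, str):
            parts.append(piece)
        else:
            parts.append(piece.get("text", ""))
    return "".join(parts)

def load_messages(export):
    """Return (date, sender, text) tuples from a parsed export dict."""
    rows = []
    for msg in export.get("messages", []):
        if msg.get("type") != "message":
            continue  # skip service entries such as calls or joins
        rows.append((msg.get("date"), msg.get("from"),
                     flatten_text(msg.get("text", ""))))
    return rows

# Hypothetical sample data, shaped like the assumed export layout.
SAMPLE_JSON = """
{"name": "Alice", "messages": [
  {"id": 1, "type": "message", "date": "2024-03-06T13:27:00",
   "from": "Alice", "text": "hello"},
  {"id": 2, "type": "service", "date": "2024-03-06T13:28:00",
   "action": "phone_call"},
  {"id": 3, "type": "message", "date": "2024-03-06T13:29:00",
   "from": "Bob",
   "text": ["see ", {"type": "link", "text": "https://example.com"}]}
]}
"""

rows = load_messages(json.loads(SAMPLE_JSON))
for date, sender, text in rows:
    print(date, sender, text)
```

| The resulting tuples could then be fed to whatever formatter
| produces the book pages.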
| ThePowerOfFuet wrote:
| >My first approach at this LaTeX generation is quite simple:
| align left if the message is from me and right otherwise
|
| Isn't that backwards?
| wantlotsofcurry wrote:
| Maybe they wanted it from/for the other persons perspective for
| some reason?
| g4zj wrote:
| Perhaps the book is a gift for the person on the other side of
| the messages in question.
| suddenclarity wrote:
| It is. I thought the author just mistyped in the post but
| looking at the repository example it's indeed backwards.
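|
| For illustration, the alignment rule under discussion could be
| sketched like this. It is not the project's actual code (that is
| written in Rust); the function names and the me_on_right flag are
| hypothetical, and the default follows the usual chat-app convention
| of putting your own messages on the right:

```python
def latex_escape(text):
    """Escape characters that LaTeX treats specially in running text."""
    replacements = {
        "\\": r"\textbackslash{}",
        "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
        "_": r"\_", "{": r"\{", "}": r"\}",
        "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
    }
    # Character-by-character mapping avoids escaping an escape twice.
    return "".join(replacements.get(ch, ch) for ch in text)

def render_message(text, from_me, me_on_right=True):
    """Wrap one message in a flushleft or flushright environment."""
    on_right = (from_me == me_on_right)
    env = "flushright" if on_right else "flushleft"
    return "\\begin{%s}\n%s\n\\end{%s}" % (env, latex_escape(text), env)

print(render_message("50% off today", from_me=True))
print(render_message("nice_one", from_me=False))
```

| Passing me_on_right=False gives the orientation described in the
| post, which would suit a book meant for the other person.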
| throwuwu wrote:
| I like this a lot. We need more hard records of personal
| correspondence. It would be cool to do this as a service.
|
| Honestly when I read the title I thought it was going to be about
| using message history as a basis for generating a narrative
| account of the events using an LLM.
| trizoza wrote:
| The same, I expected a whole criminal AI generated novel based
| on a history of a chat.
| shawnc wrote:
| Ditto. Exactly what I pictured from the title and I was already
| thinking how interesting that would be. I'm curious to try
| something like it now.
| Cthulhu_ wrote:
| With the EU's DMA law and the preceding GDPR, some services
| have to offer an API so that your hypothetical service can pull
| this data. However, iMessage was notably excluded from this
| law, and then there's the encryption thing where you can't just
| pull data from e.g. whatsapp.
| darkwater wrote:
| Me too. But the real story is way better! Now I want to do the
| same with my Telegram chat history.
| thefourthchime wrote:
| I love it as art, but its usefulness is questionable. If you
| want a hard copy, just copy it to a microsd. Or three if you
| are worried about losing it.
| refulgentis wrote:
| "Hard copy" means "a collection of paper sheets bound in some
| fashion" in the context of books. So they probably didn't
| mean it in a way where microSD is equivalent.
| tivert wrote:
| > I love it as art, but its usefulness is questionable. If
| you want a hard copy, just copy it to a microsd. Or three if
| you are worried about losing it.
|
| Hard copy means paper. Also microsd is terrible for long
| term storage.
| roland35 wrote:
| I love this idea! I think this would be a fun idea, except 1) not
| sure how it would handle pictures, and 2) there are probably some
| texts which should not be published!
|
| Also - noto emoji is great. It is also nice to use for 3d
| printing/laser cutting
| nkko wrote:
| Imagine having GPT generate a haiku from each unique
| correspondence.
| risenshinetech wrote:
| Why?
| velcrovan wrote:
| I'll bite.
|
| A book full of years of texts would be an interesting
| artifact, but how often would you pick it up? How interesting
| could it really be? Would you even want anyone else to see
| it?
|
| Now suppose each exchange comes with a haiku summary, a fresh
| high-level look at the conversation that condenses its vague
| essence into a little linguistic locket, portable and easily
| recalled. The interplay between the mundane raw material and
| the poetic take would render it much more interesting, and
| tend to reward repeated examination.
|
| A human poet would undoubtedly do a better job at this than
| an LLM, if any human poet could be persuaded to start and
| complete such a project. But having an LLM do it would be a
| very low-cost, low-effort way to at least try for interesting
| results.
| marban wrote:
| Python Script to export them on a Mac
| https://pypi.org/project/imessage-reader/
| gregorymichael wrote:
| Great project and a great writeup
| btbuildem wrote:
| Would the next level be to use an LLM to take the essence of each
| set of exchanges and present it in a format of a play or movie
| script? Perhaps novelize it?
| input_sh wrote:
| Or, hear me out: you keep the personal communication personal
| instead of feeding it through a randomness machine.
| risenshinetech wrote:
| Ya bro and then the next level would be to like feed that AI
| generated play or script and feed it into an AI movie maker. It
| would totally be a game changer! I'm super stoked and pumped
| about it
| balu_ wrote:
| Nice idea, tried to use the source and build a quick example
| myself... Now I'm reminded of my dislike of LaTeX (it just doesn't
| work)
| jjice wrote:
| I wasn't familiar with BN Press for personal use. I've done some
| research into KDP and Lulu, but I've decided that ebooks would be
| my main focus after getting a Kindle and loving it. For a
| limited/test run, BN Press seems fantastic. $30 for 1300 pages is
| fantastic.
| fragmede wrote:
| People will pay money for this! What a heartfelt, sentimental
| thing to be able to give to someone on an anniversary or birthday
| or something.
| russfink wrote:
| Or keep it to remind yourself of what a toxic relationship
| looks like, be that the case.
| frankfrank13 wrote:
| My Aunt has done a wonderful job at preserving the letters and
| diary entries between my grandfather + grandmother during WWII.
| My immediate thought is how our children and grandchildren will
| not have the same joy!
|
| Here is the blog for those interested:
| http://www.honeylightsletters.com/
| CSSer wrote:
| Ha, I'm not sure it's quite the same. If my understanding of
| most couples is correct, this would be more akin to preserving
| all of their sticky notes e.g. "Pick up milk on the way home",
| "You're getting the kids, right?", "see you in 20", etc.
|
| Yet I suppose there's a certain charm to that, so I hope I
| don't sound like too much of a wet blanket.
| yaky wrote:
| Perhaps an unpopular opinion, but this is slightly creepy.
|
| I never understood why people care to keep their private
| conversation history in the first place. IMO private messages (as
| opposed to public posts, blogs, etc) are supposed to be temporary
| ("ephemeral") - one does not record every face-to-face
| conversation or phone call after all.
| vanjajaja1 wrote:
| private messages are not face to face conversations nor phone
| calls. letters last a long time and would be a more accurate
| comparison
| the-grump wrote:
| Because they're cherished memories for one.
| famahar wrote:
| I agree. But it's more to do with the part of me that cringes
| at messages I sent and exist forever. The person I was 10 years
| ago is so different. It feels so jarring reading old messages.
| jakespencer wrote:
| I think this is interesting, and not necessarily unpopular. It
| seems different people just think about this issue differently.
| I do everything I can to preserve every single chat history
| that I can. And I would like to have every face-to-face
| conversation and phone call recorded and easily accessible for
| that matter. I have a sense that I am the sum of my experiences
| and I don't want to forget those experiences - it feels like I
| am somehow less than myself if I don't remember them.
|
| But I've seen that episode of Black Mirror, too. So I wrestle
| with the desire to perfectly remember everything that I've ever
| experienced vs the mental and emotional health benefits that
| clearly come from being able to forget things.
| Sigliotio wrote:
| I don't assume those are private or ephemeral.
|
| It tells a story and it's the zeitgeist of our generation.
|
| People haven't thought too much about how to preserve something
| like this.
|
| I personally like the idea and I can imagine exporting this
| with all / few messages of my mother and having a memory of
| that time.
| digging wrote:
| > IMO private messages (as opposed to public posts, blogs, etc)
| are supposed to be temporary
|
| Why? Privacy and permanency are orthogonal axes. You've never
| kept a cherished letter or re-read a thoughtful text message?
| yaky wrote:
| If the message was something I found interesting, important,
| or funny, I would usually copy or screenshot it. Or remember
| it. Although I don't proactively delete old messages, I never
| intentionally backed up or transferred message history
| between devices either.
|
| As for privacy and permanency - if data stops existing, it is
| definitely private now :)
| tivert wrote:
| > I never understood why people care to keep their private
| conversation history in the first place.
|
| One reason that's understandable without relying on
| sentimentality is that they're a record of what you were doing or
| thinking at a particular time, much like a private diary.
|
| There's been a few times where I've gone back through stuff like
| chat history to better understand something that I didn't
| realize the significance of at the time.
| nonameiguess wrote:
| I can see both sides. I did actually correspond with people
| using written letters up until maybe 2004 or so, and in many of
| those cases, especially old girlfriends and letters from my
| little sisters when I first went to college, reading them years
| later was intensely nostalgic.
|
| On the other hand, when I left the Army I moved into a much
| smaller place, put most of my stuff in storage, then three
| years later figured anything I hadn't touched or used for three
| years was something I didn't actually need, and let the
| facility have all of it. That seems to have included both all
| those old letters and all of my old photographs. I can't say I
| actually miss those things. People in here are saying they
| don't want to forget the past but the reality of forgetting is
| you don't know you forgot it so it has no perceivable effect
| once it happens.
|
| To be honest, I'm nostalgic enough as is and don't think I need
| even more things to hold onto. I already don't watch new
| television or listen to new music. I'm mentally stuck in 1999
| and not sure that's healthy.
| sciencesama wrote:
| We need bubbles as well!!
| federalbob wrote:
| A French company does this: https://www.monlivresms.com/
| (warning: the website is annoying)
| j1elo wrote:
| I love the idea!
|
| The thing I like the least though is the table of contents, it's
| so dry with just the months and years. Despite the skepticism I
| have about latest AI use and abuse, generating a one-liner from
| the contents of each month seems like it would be a fitting usage
| for it.
| css wrote:
| Awesome to see someone using my library [0] in the wild! Very
| cool use case.
|
| [0]: https://github.com/ReagentX/imessage-exporter
| bkettle wrote:
| Author of the post here, thanks so much for making it
| available! It's an excellent library and I was thrilled to find
| it.
| jxramos wrote:
| > All of my friends, for putting up with me sending them
| random messages to test things
|
| Good sports taking one for the team! Thank you
| jborichevskiy wrote:
| Big fan of this library. Thanks for making it!
| alchemist1e9 wrote:
| Thank you for this! I recently was digging into the sqlite
| files with an idea to monitor them for changes indicating new
| messages and then extract them. My initial prototype seemed to
| mostly work, with a few hacks. Next time I look at that idea
| I'll switch to your library. Any suggestions or tips around
| near-time accessing?
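|
| One common way to approach this kind of monitoring is to poll for
| rows past a remembered high-water mark. The sketch below runs
| against an in-memory database; the table name "message" and its
| "text" column are assumptions for illustration, not a confirmed
| description of the real iMessage database or of the library's API:

```python
import sqlite3

def fetch_new(conn, last_rowid):
    """Return rows inserted after last_rowid and the new high-water mark."""
    rows = conn.execute(
        "SELECT rowid, text FROM message WHERE rowid > ? ORDER BY rowid",
        (last_rowid,),
    ).fetchall()
    if rows:
        last_rowid = rows[-1][0]
    return rows, last_rowid

# Stand-in database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (text TEXT)")
conn.executemany("INSERT INTO message (text) VALUES (?)",
                 [("first",), ("second",)])

batch1, mark = fetch_new(conn, 0)      # picks up both existing rows
conn.execute("INSERT INTO message (text) VALUES (?)", ("third",))
batch2, mark = fetch_new(conn, mark)   # picks up only the new row

print(batch1)
print(batch2)
```

| Against a live file it is probably safer to open the database
| read-only, e.g. sqlite3.connect("file:PATH?mode=ro", uri=True), and
| call fetch_new on a timer.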
| demondemidi wrote:
| I did this for my partner on Valentine's Day back in 2015. 20,000+
| messages in one HTML page. I never thought of binding a book,
| though.
|
| I suspect this person's project will become very popular as a
| service. This is a great idea.
| kirmerzlikin wrote:
| Am I the only one who finds the idea of sending a full history of
| your private messages to some publisher for printing a little bit
| unsettling?
| cooper_ganglia wrote:
| I think if I sent my full history of private messages to a
| publisher, the most unsettled one would be the publisher!
| russfink wrote:
| Maybe try the --rot13 switch.
|
| :-)
| janfoeh wrote:
| No, I do too. I've been planning on doing what the author did
| for quite some time now, and this is one of the unsolved
| stumbling blocks.
|
| Printing and binding at home is probably the only option. All
| that's left to figure out is how to make the end result durable
| and haptically pleasant...
| dogline wrote:
| You can always print it out, then run to Kinko's and use
| their comb binder that they usually have out. Not as elegant
| as real binding, but enough to make it work on my shelves.
| janfoeh wrote:
| For the cost of the plane ticket, I could probably hire a
| retired book binder ;)
| red-iron-pine wrote:
| the publisher is in the business of spitting ink on paper. you
| should be more unsettled by being MITM'd by data mining
| companies whose job it is to change behavior via ads or other
| consensus-building tools.
| ghaff wrote:
| I might not do this if I were a high public official or
| celebrity on the off-chance that someone in the printing and
| packaging chain might happen to notice. But, for an average
| person, it seems pretty harmless. (Personally, the last thing
| I need is more paper but I get the attraction.)
| achristmascarl wrote:
| This is really cool, and also seems like it could be a great gift
| to a loved one.
|
| I was playing around with Nomic Atlas (https://docs.nomic.ai/)
| recently and dumped a bunch of my chat history in there, and it
| was pretty interesting to visualize and browse my messages as
| clusters around topics.
|
| Which leads me to think that you could bring the searchability of
| digital to the physical format by generating embeddings for the
| messages and running topic modeling on them; then, you could
| create an index of topics at the end of the physical book with
| page number references to messages about that topic.
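|
| The last step, turning topic assignments into a back-of-book index,
| could look roughly like the sketch below. It assumes the embedding
| and topic-modeling stage has already produced (page, topic) pairs;
| that stage is not shown, and the function names and sample data are
| hypothetical:

```python
from collections import defaultdict

def collapse(pages):
    """Turn sorted page numbers into a string such as '3-5, 9'."""
    runs = []
    for p in pages:
        if runs and p == runs[-1][1] + 1:
            runs[-1][1] = p          # extend the current run
        else:
            runs.append([p, p])      # start a new run
    return ", ".join(str(a) if a == b else "%d-%d" % (a, b)
                     for a, b in runs)

def build_index(tagged):
    """Map (page, topic) pairs to alphabetized index lines."""
    pages_by_topic = defaultdict(set)
    for page, topic in tagged:
        pages_by_topic[topic].add(page)
    return ["%s: %s" % (topic, collapse(sorted(pages)))
            for topic, pages in sorted(pages_by_topic.items())]

# Hypothetical output of the topic-modeling step.
tagged = [(3, "travel"), (4, "travel"), (5, "travel"), (9, "travel"),
          (4, "food"), (12, "food")]

index_lines = build_index(tagged)
print("\n".join(index_lines))
```

| Page numbers would have to come from the typesetting pass, so in
| practice the index would be generated after the main body is laid
| out.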
| jll29 wrote:
| In 2000 years, your books may be the only thing left to study how
| we lived in the 21st century, because all ephemeral information
| (tweets, chat, SMS, emails, digital photos on people's devices)
| may have vanished.
| mym1990 wrote:
| Very cool. A while ago I took a trip down memory lane with my
| partner to take a look at the first messages we sent each other,
| it was very neat and the memories definitely came back, even
| though it has been years since we met! A little bit like looking
| at a photograph and remembering the location and feeling in that
| moment.
| johncalvinyoung wrote:
| This is fantastic. I've done some semi-similar things, have a
| tool I wrote for generating nice documents from Facebook
| Messenger conversations, for archiving important personal
| conversations. But I didn't take it so far as to generate a
| _book_ yet! What a great idea!
| pimlottc wrote:
| Doesn't seem like this includes images, which for some people
| could be a significant part of their conversations: photos,
| memes, reaction gifs, etc
| janfoeh wrote:
| I've been thinking about doing just this for a bit now. I plan on
| showing thumbnails plus a QR code for animations and videos; I
| have yet to figure out how to make the files accessible in a
| private and durable manner though.
| bm-rf wrote:
| Maybe you could use something like GPT-4 vision to include a
| text description of the image in the transcript
| samatman wrote:
| Filtering full-color images down to a halftone suitable for
| book publishing is a mature technology, setting up an
| ImageMagick pipeline to do so would not be among the hard
| parts of preparing a book like this. Picking the right still
| frame out of gifs and video is a bit trickier, but not by
| much.
| gavmor wrote:
| I like to listen to blogs through Pocket's TTS mode, but this one
| made me laugh because I couldn't easily skip these sections:
|
| > /.../00008120-001854410CEB401E >>> cd 3d
| /.../00008120-001854410CEB401E/3d >>> ls
| 3d0292d3fe90e1e22c247403c0e9105ea0f9ff44
| 3d8830b71e98aae80b6eaf8bdd5500d79ce74946
| 3d02fe309afa7de839822d6f1b8433aa90090d17
| 3d88cdc16ff2b5231e5ea4b52271ee195a6f4b96
| 3d072c4fca5db4a5678fa10b137435f757e98492
| 3d8a425d70f4049417e855d273c44d8199de30c9
| 3d0739c90579fa907246d5c21bd8d8ebaa2d9d6b
| 3d8a43a1921f504bb4393250f75b24bfc2c5cedb
| 3d0798b3cc4d2f5ad347ffb8bc5a0f9d8c82cfb9
| 3d8a7c0460aadabf1b7fc9adea9e6a2a6e7bc73b
| 3d07a0adc5c5c22dc525ccd3a93fb05a50ef1ac5
| 3d8b6ad12c7617b3d783790a457b0aa19b193b68
| 3d0880f091c51ddc145e17c78d8e6f9a3e7e20c8
| 3d8b82abe05a9d697102d8b665c9d499e07492ea
| 3d093e92cf03abf3650411e09a647630a1e0c478
| 3d8ba897240ad32580bf8dfd00db8f181658cdfd
| 3d095e908ff898be3b3ffd64a75db959a58ac70a
| 3d8bc227d67ec4944df8e75291102367034d7214
| 3d09d5dcd5a9bdad67a80cd83201a9e1fb75aada
| 3d8c722f1d92f7cd6f90c936c14f60f51aad128b
| 3d0abb83123be82abf43ce20118e72fea06023c5
| 3d8ca6eeabeb1c01fae05bb20f08dedf734cfd04
| 3d0b246304c42d2ab1eb1892d629fcdfde689cb7
| 3d8d0c6b1bf7946c6bef91d60cccb32207b7bc01
| 3d0bb5f49e6f0e31348ef8feb9a38d4ce71f5ec7
| 3d8fd2fbcaf3079a683a8e486ecde8875f0a591d
| 3d0c1283936c45fec533a507b78558b5aa3159fa
| 3d8ff93bd94b3ea14edc77d1e677cf4ee4306e4e
| 3d0cb8e28462780bb9af1440e297ecd8224c70ff
| 3d90ea8bfbf62feda080cd0ccbd12fa5c8673993
| 3d0ce10de5f69606c52882215b99ebab259dc194
| 3d932638fe8ed669725b7a143c6a8b02b8959923
| 3d0d7e5fb2ce288813306e4d4636395e047a3d28
| 3d93c92679aa9d398331e27fdeed64b5094e68d1 ...
|
| All I could think was, "Oh no, the nam-shub of Enki!"
| egypturnash wrote:
| are you posting this while having a drink at the Black Sun
| firewolf34 wrote:
| I often listen to Pocket TTS on the train or when I can't
| access my device to skip or do much other than play/pause, and
| oh my god this gets me every time haha. I am actually thinking
| of DIY'ing my own web-scraper thing to do a better job at it
| because especially for scientific articles, it's really rough
| when it gets to any LaTeX. And then I'm sitting there listening
| to some very automated sounding voice read off cryptic numbers
| and greek letters and code and math notation like some kind of
| Soviet number station (which is kinda cool at first, but gets
| annoying haha).
|
| I want some kind of local document host that I can run a
| summarization or filtering script over to extract the portions
| that are legible to TTS, pipe it into something nice like
| ElevenLabs (if I was rich) or whatever, and then host an OGG for
| me to listen to on the go...
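|
| As a rough idea of the filtering step, here is a small sketch that
| drops long hexadecimal strings (like the directory listing quoted
| above) before the text reaches a TTS engine. The 32-character
| threshold and the function name are arbitrary choices for
| illustration; LaTeX and code would need their own rules:

```python
import re

# Runs of 32 or more lowercase hex digits: hashes, IDs and similar.
HEX_BLOB = re.compile(r"\b[0-9a-f]{32,}\b")

def strip_for_tts(text):
    """Remove long hex strings and drop lines left empty afterwards."""
    kept = []
    for line in text.splitlines():
        cleaned = HEX_BLOB.sub("", line)
        cleaned = " ".join(cleaned.split())  # normalize whitespace
        if cleaned:
            kept.append(cleaned)
    return "\n".join(kept)

sample = (
    "Here is the listing:\n"
    "3d0292d3fe90e1e22c247403c0e9105ea0f9ff44\n"
    "3d8830b71e98aae80b6eaf8bdd5500d79ce74946\n"
    "And back to prose."
)

print(strip_for_tts(sample))
```

| Short hex-looking words such as "deadbeef" fall below the length
| threshold and are left alone.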
| nico wrote:
| Really cool, looks great
|
| Also before going to the article, I thought it was about using an
| LLM to write a book with stories and characters inspired by the
| message history
| larodi wrote:
| I somehow initially thought that the iMessages went through some
| LLM which retold them in nice Brothers Grimm style. But from
| another perspective it also makes sense to have the originals,
| although the author is perhaps much better than me in writing
| messages which may one day be worth reading...
___________________________________________________________________
(page generated 2024-03-07 23:00 UTC)