[HN Gopher] Ghostwriter - use the reMarkable2 as an interface to...
       ___________________________________________________________________
        
       Ghostwriter - use the reMarkable2 as an interface to vision-LLMs
        
       Author : wonger_
       Score  : 182 points
       Date   : 2025-02-08 03:02 UTC (19 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | rpicard wrote:
       | This is so cool. I'm going to try it this weekend.
       | 
       | I've been playing with the idea of auto creating tasks when I
       | write todos by emailing the PDF and sending it to an LLM.
       | 
       | This just opened up a whole realm of better ways to accomplish
       | that goal in realtime.
        
         | r2_pilot wrote:
          | This worked pretty well when I did a proof of concept with
          | Claude and the rMPP a couple of months ago. It even handled
          | scheduling fuzzy times ("I want to do this sometime but I
          | don't have any real time I want to do it; pick a time that
          | doesn't conflict with my actually scheduled tasks"). All with
          | minimal prompting. I just didn't have a decent workflow and
          | did exactly what you considered: emailed the PDF. I should
          | probably revisit this, but I haven't had the inclination
          | since I just ignored the tasks anyway lol
        
           | rpicard wrote:
           | Ha, automating the doing of the task is the next step.
        
         | awwaiid wrote:
         | Let me know if you need any help, I think only one other person
         | has tried to get this working. I'm over on the reMarkable
         | discord server, https://discord.gg/u3P9sDW (linked from
         | https://github.com/reHackable/awesome-reMarkable)
         | 
         | Rust binary so should be easy to install. In theory :)
        
           | rpicard wrote:
            | Will do! My wife and I love Harry Potter, so I'm motivated
            | to show her that my investment in the tablet actually got me
            | Tom Riddle's diary.
           | 
           | I don't use discord much but I'll find you somewhere around
           | here!
        
             | awwaiid wrote:
              | I'm on there as awwaiid@gmail.com and probably other places :)
             | 
             | "proof" to partner of tablet investment value based on
             | interactive fiction conversation == excellent strategy and
             | nothing could go wrong
        
       | owulveryck wrote:
       | Awesome.
       | 
       | I wanted to try to implement this for months. You did a really
       | good job.
        
         | owulveryck wrote:
          | At some point I wanted to turn goMarkableStream into an MCP
          | server (Model Context Protocol). I could get the screen, but
          | without a "hack" I couldn't write the response back.
        
           | awwaiid wrote:
            | The trick here is to inject events as if they came from the
            | user. The virtual keyboard works really reliably; you can see
            | it over at
            | https://github.com/awwaiid/ghostwriter/blob/main/src/keyboar...
            | It is the equivalent of plugging in the reMarkable type-folio.
            | 
            | The main limitation is that the reMarkable drawing app is
            | very, very minimal: it doesn't let you place text in
            | arbitrary screen locations and is instead sort of a weird
            | overlay text area spanning the entire screen.
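            | 
            | For a rough idea of what event injection looks like, here is
            | a minimal sketch (not ghostwriter's actual code): the
            | /dev/input path is a placeholder and the constants are the
            | standard Linux evdev ones.
            | 
            |   // Sketch: emit one 'a' keypress on a Linux evdev node.
            |   use std::fs::{File, OpenOptions};
            |   use std::io::Write;
            | 
            |   #[repr(C)]
            |   struct InputEvent {
            |       tv_sec: i64,  // struct timeval on 64-bit Linux
            |       tv_usec: i64,
            |       kind: u16,    // event type (EV_*)
            |       code: u16,
            |       value: i32,
            |   }
            | 
            |   const EV_SYN: u16 = 0x00;
            |   const EV_KEY: u16 = 0x01;
            |   const SYN_REPORT: u16 = 0;
            |   const KEY_A: u16 = 30;
            | 
            |   fn emit(dev: &mut File, kind: u16, code: u16,
            |           value: i32) -> std::io::Result<()> {
            |       let ev = InputEvent { tv_sec: 0, tv_usec: 0,
            |                             kind, code, value };
            |       let bytes = unsafe {
            |           std::slice::from_raw_parts(
            |               &ev as *const _ as *const u8,
            |               std::mem::size_of::<InputEvent>())
            |       };
            |       dev.write_all(bytes)
            |   }
            | 
            |   fn main() -> std::io::Result<()> {
            |       // placeholder path; the real keyboard node differs
            |       let mut dev = OpenOptions::new().write(true)
            |           .open("/dev/input/event1")?;
            |       emit(&mut dev, EV_KEY, KEY_A, 1)?;      // key down
            |       emit(&mut dev, EV_SYN, SYN_REPORT, 0)?; // sync
            |       emit(&mut dev, EV_KEY, KEY_A, 0)?;      // key up
            |       emit(&mut dev, EV_SYN, SYN_REPORT, 0)?;
            |       Ok(())
            |   }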
        
         | awwaiid wrote:
          | Thank you! Still a WIP, but a very fun learning / inspiration
          | project. Got a bit of Rust jammed in there, a bit of device-
          | constraint dancing, a bit of multi-LLM API normalization, a bit
          | of spatial vision-LLM education, etc.
        
       | tony_francis wrote:
        | Harry Potter Half-Blood Prince vibes. Interesting just how much
        | the medium changes the feeling of interacting with a chat model.
        
         | s2l wrote:
          | Now if only the LLM response font were some handwritten style.
        
           | satvikpendem wrote:
            | That's definitely pretty easy to achieve: just change the
            | font settings to use a particular handwriting-style font [0].
            | 
            | [0] https://fonts.google.com/?categoryFilters=Calligraphy:%2FScr...
        
           | memorydial wrote:
           | That would be next-level immersion! You could probably
           | achieve this by rendering the LLM's response using a
           | handwritten font--maybe even train a model on your own
           | handwriting to make it feel truly personal.
        
             | dharma1 wrote:
             | Script fonts don't really look like handwriting - too
             | regular.
             | 
             | But one of the early deep learning papers from Alex Graves
             | does this really well with LSTMs -
             | https://arxiv.org/abs/1308.0850
             | 
             | Implementation - https://www.calligrapher.ai/
        
               | awwaiid wrote:
               | ooo -- thanks for the link!
        
           | memorydial wrote:
           | Actually if you figure that out please post it here!! I'd
           | love to see that!
        
           | wdb wrote:
           | Like Apple Notes's Smart Script?
        
           | awwaiid wrote:
            | This uses LLM Tools to pick between outputting an SVG and
            | plugging in a virtual keyboard to type. The keyboard is much
            | more reliable, and that's what you see in the screenshot.
            | 
            | If nothing else, it could use an SVG font that has
            | handwriting; you'd need to bundle that for rendering via
            | resvg or use some other technique.
            | 
            | But if I ever make a pen backend for resvg then it would be
            | even cooler: you would be able to see it trace out the
            | letters.
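            | 
            | To make the tool choice concrete, here is a sketch of what
            | two such tool definitions could look like, built with the
            | serde_json crate; the names draw_svg and type_text and the
            | schemas are illustrative, not the project's real ones.
            | 
            |   // Illustrative tool definitions for an LLM tool-use API.
            |   use serde_json::json;
            | 
            |   fn main() {
            |       let tools = json!([
            |           {
            |               "name": "draw_svg",
            |               "description": "Draw the reply as an SVG",
            |               "input_schema": {
            |                   "type": "object",
            |                   "properties": {
            |                       "svg": { "type": "string" }
            |                   },
            |                   "required": ["svg"]
            |               }
            |           },
            |           {
            |               "name": "type_text",
            |               "description": "Type via the virtual keyboard",
            |               "input_schema": {
            |                   "type": "object",
            |                   "properties": {
            |                       "text": { "type": "string" }
            |                   },
            |                   "required": ["text"]
            |               }
            |           }
            |       ]);
            |       println!("{}",
            |           serde_json::to_string_pretty(&tools).unwrap());
            |   }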
        
         | memorydial wrote:
         | Exactly! There's something about handwriting that makes it feel
         | more personal--like scribbling notes in the margins of a
         | spellbook. The shift from typing to pen input definitely
         | changes the vibe of interacting with AI.
        
         | GeoAtreides wrote:
         | erm, you mean harry potter tom riddle's horcrux diary, sure
         | 
         | you know, the diary that wrote back to you and possessed your
         | soul? that cursed diary?
        
           | guax wrote:
            | I wonder if it's better than the current version, where my
            | soul gets possessed by YouTube Shorts for 40 minutes.
        
         | hexomancer wrote:
          | That's beside the point, but you are probably referring to
          | Harry Potter and the Chamber of Secrets, not the Half-Blood
          | Prince.
        
       | cancelself wrote:
       | @apple.com add to iPadOS Notes?
        
       | newman314 wrote:
       | I wonder if this can be abstracted to accept interaction from a
       | Daylight too.
        
       | complex1314 wrote:
        | Really cool. Would this run on the reMarkable Paper Pro too?
        
         | awwaiid wrote:
         | Buy me one and I'll find out! hahahaha
         | 
         | But also -- the main thing that might be different is the
         | screenshot algorithm. I'm over on the reMarkable discord; if
         | you want to take up a bit of Rust and give it a go then I'd be
         | happy to (slowly/async) help!
        
           | complex1314 wrote:
           | :) Thanks! Been looking into learning rust recently, so will
           | keep that in mind if I get it off the ground.
        
             | awwaiid wrote:
              | Initially most of the Rust was written by Copilot or
              | Sourcegraph's Cody; then I learn more and more Rust as I
              | disagree with the code helper's taste and organization. I
              | have a solid foundation in other programming languages,
              | which accelerates the process ... it's still a weird way to
              | learn a language, but one I'm getting used to and kinda
              | like.
              | 
              | That said, I based the memory capture on
              | https://github.com/cloudsftp/reSnap/tree/latest which is a
              | shell script that slurps the framebuffer out of process-
              | space device files. If you can find something like that
              | which works on the rPP then I can blindly slap it in there
              | and we can see what happens!
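              | 
              | A heavily simplified sketch of that kind of memory capture,
              | for flavor only: the PID, the framebuffer offset, and the
              | one-byte-per-pixel assumption are all placeholders that a
              | real tool would discover per device (e.g. by parsing
              | /proc/<pid>/maps).
              | 
              |   // Sketch: read a raw frame from another process's memory.
              |   use std::fs::File;
              |   use std::io::{Read, Seek, SeekFrom, Write};
              | 
              |   const WIDTH: usize = 1404;  // reMarkable 2 panel width
              |   const HEIGHT: usize = 1872; // reMarkable 2 panel height
              | 
              |   fn main() -> std::io::Result<()> {
              |       let pid = 123;         // placeholder drawing-app PID
              |       let offset: u64 = 0x0; // placeholder, from maps
              |       let mut mem = File::open(format!("/proc/{pid}/mem"))?;
              |       mem.seek(SeekFrom::Start(offset))?;
              | 
              |       // assuming one byte per pixel for the sketch
              |       let mut frame = vec![0u8; WIDTH * HEIGHT];
              |       mem.read_exact(&mut frame)?;
              |       File::create("/tmp/screen.raw")?.write_all(&frame)?;
              |       Ok(())
              |   }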
        
       | 0xferruccio wrote:
       | This is so cool! I love to see people hacking together apps for
       | the reMarkable tablet
       | 
       | I made a little app for reMarkable too and I shared it here some
       | time back: https://digest.ferrucc.io/
        
         | memorydial wrote:
         | That's awesome! Love seeing the reMarkable get more
         | functionality through creative hacks. Just checked out your app
         | --what was the biggest challenge you faced while developing for
         | the reMarkable?
        
           | 0xferruccio wrote:
            | I think the thing I really didn't like was the lack of an
            | OAuth-like flow with fine-grained permissions.
            | 
            | Basically, authentication with devices is "all-access" or
            | "no-access". I would've liked it if a "write-only" or "add-
            | only" API permission scope existed.
        
           | pieterhg wrote:
           | Blocked for AI reply @dang
        
             | defrost wrote:
              | Good catch; the last few pages of comment history are
              | inhumanly insincere.
              | 
              | https://news.ycombinator.com/threads?id=memorydial
              | 
              | " @dang " isn't a thing; he doesn't watch for it. Take
              | credit and email him directly.
        
               | kordlessagain wrote:
               | Do you have proof this is true?
        
               | awwaiid wrote:
               | I might be biased because memorydial was complimentary to
               | me ... but they SEEM like a human! Also I'm not all that
               | opposed to robot participation in the scheme of things.
               | Especially if they are nice to me or give good ideas :)
        
               | loxias wrote:
                | Most people don't correctly use an em-dash, as distinct
                | from a hyphen. That jumps out at me. :)
        
               | defrost wrote:
               | He has commented on this.
               | 
               | Retrieval is tricky as Algolia doesn't index '@' symbols:
               | 
                | https://hn.algolia.com/?query=%40dang%20by%3Adang&sort=byDat...
        
         | Ensign35 wrote:
          | It's so great seeing these; they always make me want to play
          | with developing apps for the reMarkable 2. Do you have any
          | sources you can recommend? Thank you!
         | 
         | edit: found the official developer website
         | https://developer.remarkable.com/documentation
        
           | 0xferruccio wrote:
           | IMO the easiest way to play around is to use the reverse
           | engineered APIs
           | 
           | https://github.com/erikbrinkman/rmapi-js
        
             | Ensign35 wrote:
             | Much appreciated :+1:
        
           | awwaiid wrote:
            | https://github.com/reHackable/awesome-reMarkable is a great
            | index of other resources, including how to get onto the
            | Discord if you want some interactive conversation.
        
       | t0bia_s wrote:
        | How about this on Android-driven Onyx Boox e-readers? Would it
        | be possible?
        
         | awwaiid wrote:
          | The reMarkable's limitations pushed me toward taking a
          | screenshot and then injecting input events to interact with
          | the proprietary drawing app. Cross-app screenshots with the
          | right permission are probably possible on Android; I'm not
          | sure about injecting the drawing events.
          | 
          | The other way to go would be to make a specific app. I just
          | picked up an Apple Pencil and am thinking of porting the
          | concepts to a web app, which so far works surprisingly well ...
          | but for a real solution it'd be better for this agent to
          | interact with existing apps.
        
       | memorydial wrote:
       | This is a brilliant use case--handwriting input combined with
       | LLMs makes for a much more natural workflow. I wonder how well it
       | handles messy handwriting and if fine-tuning on personal notes
       | would improve recognition over time.
        
         | r2_pilot wrote:
          | I did this a few months ago with the reMarkable Paper Pro and
          | Claude. It worked quite well even though my handwriting is
          | pretty terrible, and I even had a clunky workflow where I could
          | just write down stuff I wanted to do, and roughly (or
          | specifically) when I wanted to do it, and it was able to
          | generate an iCal file I could load into my calendar.
        
         | awwaiid wrote:
          | Generally if I can read my handwriting then it can! It has no
          | issues with that. Really the problem is more in spatial
          | awareness -- it can't reliably draw an X in a box, let alone
          | play tic-tac-toe or dots-and-boxes.
        
       | chrismorgan wrote:
       | > _Things that worked at least once:_
       | 
       | I like it.
        
         | awwaiid wrote:
         | Top quality modern AI Eval!!!
        
       | 8bithero wrote:
       | Not to distract from the project but if anyone is interested in
       | eink tablets with LLMs, the ViWoods tablet might be of interest
       | to you.
        
         | Ensign35 wrote:
         | Is this a Remarkable rebrand? Even the UI looks the same!
         | 
         | edit: https://viwoods.com/ (based in Hong Kong)
         | 
         | edit 2:
         | 
         | It's a blatant copy of the Remarkable 2 for sure :/ LLM
         | integration is interesting --> Remarkable are you listening?
        
       | seethedeaduu wrote:
        | Kinda unrelated, but should I go for a Kobo or the reMarkable?
        | I mostly want to read papers and maybe take notes. How do they
        | compare in terms of hackability and freedom?
        
       | vessenes wrote:
       | Love this! There are some vector diffusion models out there; why
       | not use tool calling to outsource to one of those if the model
       | decides to draw something? Then it could specify coordinate range
       | and the prompt.
        
         | awwaiid wrote:
         | Two reasons. One, because I haven't gotten to it yet. Two... er
         | no just the one reason! Do you have a particular one, ideally
         | with a hosted API, that you recommend?
        
       | 3abiton wrote:
        | I own a Boox tablet (a full-fledged Android tablet with an
        | e-ink screen), and this sort of thing would be perfect for it. I
        | wonder if in 5 years mobile hardware will support something like
        | that locally!
        
       | xtiansimon wrote:
        | For PDF paper readers, is the reMarkable's 11" size sufficient?
        | I have the Sony DPT 2nd version at 13", and it's a perfect
        | viewing experience. But projects like this keep drawing me to
        | the reMarkable product.
        
         | pilotneko wrote:
         | I have used the Remarkable 2 for papers, but it is slightly too
         | small to read text comfortably. I'm also an active reader, so I
         | miss the color highlighting. Annotations are excellent. For
         | now, I'm sticking to reviewing papers in the Zotero application
         | on my iPad.
        
         | kordlessagain wrote:
         | It's barely usable for PDFs
        
           | freedomben wrote:
           | Depends mostly on the font size in the PDF. For dense PDFs I
           | agree, it's barely usable. For most PDFs though I'd call it
           | "acceptable." If you have control over the font size (such as
           | when you're converting some source material to PDF) you can
           | make it an excellent reading experience IMHO.
        
         | abawany wrote:
          | I got the reMarkable Pro tablet recently and as a result was
          | able to move on from my Sony DPT-S1 and reMarkable 2. The
          | latter was nice for its hackability, but the Pro's screen
          | size and color functionality have made it a great replacement.
        
       | awwaiid wrote:
        | Project author here -- happy to elaborate on anything; it's a
        | continuous WIP project. The biggest insight has been the
        | limitations of vision models in spatial awareness -- see
        | https://github.com/awwaiid/ghostwriter/blob/main/evaluation_...
        | for some sketchy examples of my rudimentary eval.
        | 
        | Next top things:
        | 
        | * Continue to build/extract into a yaml+shellscript agentic
        | framework/tool
        | 
        | * Continue exploring pre-segmenting or other methods of spatial
        | awareness
        | 
        | * Write a resvg backend that sends actual pen strokes instead of
        | lots of dots (rough sketch below)
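        | 
        | For the pen-stroke idea, a very rough sketch of the event stream
        | a synthetic stroke could produce; the constants are the standard
        | Linux ones, while the device path, coordinates, and pressure
        | value are placeholders, not the actual backend.
        | 
        |   // Sketch: drag a synthetic pen across a digitizer evdev node.
        |   use std::fs::{File, OpenOptions};
        |   use std::io::Write;
        | 
        |   #[repr(C)]
        |   struct InputEvent {
        |       tv_sec: i64,
        |       tv_usec: i64,
        |       kind: u16,
        |       code: u16,
        |       value: i32,
        |   }
        | 
        |   const EV_SYN: u16 = 0x00;
        |   const SYN_REPORT: u16 = 0;
        |   const EV_KEY: u16 = 0x01;
        |   const BTN_TOUCH: u16 = 0x14a;
        |   const EV_ABS: u16 = 0x03;
        |   const ABS_X: u16 = 0x00;
        |   const ABS_Y: u16 = 0x01;
        |   const ABS_PRESSURE: u16 = 0x18;
        | 
        |   fn emit(dev: &mut File, kind: u16, code: u16,
        |           value: i32) -> std::io::Result<()> {
        |       let ev = InputEvent { tv_sec: 0, tv_usec: 0,
        |                             kind, code, value };
        |       let bytes = unsafe {
        |           std::slice::from_raw_parts(
        |               &ev as *const _ as *const u8,
        |               std::mem::size_of::<InputEvent>())
        |       };
        |       dev.write_all(bytes)
        |   }
        | 
        |   fn main() -> std::io::Result<()> {
        |       // placeholder pen device node
        |       let mut pen = OpenOptions::new().write(true)
        |           .open("/dev/input/event2")?;
        |       emit(&mut pen, EV_KEY, BTN_TOUCH, 1)?;       // pen down
        |       for i in 0..=50 {
        |           emit(&mut pen, EV_ABS, ABS_X, 1000 + i * 20)?;
        |           emit(&mut pen, EV_ABS, ABS_Y, 1000 + i * 10)?;
        |           emit(&mut pen, EV_ABS, ABS_PRESSURE, 2000)?;
        |           emit(&mut pen, EV_SYN, SYN_REPORT, 0)?;  // one sample
        |       }
        |       emit(&mut pen, EV_KEY, BTN_TOUCH, 0)?;       // pen up
        |       emit(&mut pen, EV_SYN, SYN_REPORT, 0)?;
        |       Ok(())
        |   }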
        
         | rybosome wrote:
         | This is a really cool effect. How do you envision this being
         | used?
         | 
          | Thinking about it as a product, I'd want a way to easily slip
          | in and out of "LLM please respond" so it wasn't constantly
          | trying to write back the moment I stopped the stylus - maybe
          | I'd want a while to sketch and think, then restart a
          | conversation. Or maybe I'd want certain pages to be
          | LLM-enabled, and others not.
         | 
         | Does it require any sort of jailbreak to get SSH access to the
         | device?
        
           | awwaiid wrote:
            | The reMarkable comes with root SSH out of the box, so
            | installation here is scp'ing a Rust-built binary over and
            | then ssh'ing in and running it. I haven't wrapped it in a
            | start-on-boot service yet.
           | 
           | It is triggered right now by finger-tapping in the upper-
           | right corner, so you can ask it to respond to the current
           | contents of the screen on-demand. I think it would be cool to
           | have another out-of-band communication, like voice, but this
           | device has no microphone.
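            | 
            | For flavor, a sketch of how such a corner-tap trigger could
            | be detected by reading the touchscreen's evdev stream; the
            | device path, axis maxima, and multitouch details here are
            | placeholders rather than ghostwriter's real logic.
            | 
            |   // Sketch: fire when a finger lifts in the upper-right
            |   // corner of the touch surface.
            |   use std::fs::File;
            |   use std::io::Read;
            | 
            |   #[repr(C)]
            |   struct InputEvent {
            |       tv_sec: i64,
            |       tv_usec: i64,
            |       kind: u16,
            |       code: u16,
            |       value: i32,
            |   }
            | 
            |   const EV_ABS: u16 = 0x03;
            |   const ABS_MT_POSITION_X: u16 = 0x35;
            |   const ABS_MT_POSITION_Y: u16 = 0x36;
            |   const EV_KEY: u16 = 0x01;
            |   const BTN_TOUCH: u16 = 0x14a;
            |   const MAX_X: i32 = 1403; // placeholder axis ranges
            |   const MAX_Y: i32 = 1871;
            | 
            |   fn main() -> std::io::Result<()> {
            |       // placeholder touchscreen node
            |       let mut touch = File::open("/dev/input/event3")?;
            |       let mut buf =
            |           [0u8; std::mem::size_of::<InputEvent>()];
            |       let (mut x, mut y) = (0i32, 0i32);
            |       loop {
            |           touch.read_exact(&mut buf)?;
            |           let ev: InputEvent =
            |               unsafe { std::mem::transmute(buf) };
            |           match (ev.kind, ev.code) {
            |               (EV_ABS, ABS_MT_POSITION_X) => x = ev.value,
            |               (EV_ABS, ABS_MT_POSITION_Y) => y = ev.value,
            |               (EV_KEY, BTN_TOUCH) if ev.value == 0 => {
            |                   // finger up: was it in the corner?
            |                   if x > MAX_X * 9 / 10 && y < MAX_Y / 10 {
            |                       println!("tap: screenshot + LLM call");
            |                   }
            |               }
            |               _ => {}
            |           }
            |       }
            |   }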
           | 
           | Also right now it is one-shot, but on my long long TODO list
           | is a second trigger that would _continue_ a back and forth
           | multi-screenshot (like multi-page even) conversation.
        
             | rybosome wrote:
             | Ah great, I will definitely give this a try later then,
             | thanks!
             | 
             | I'm curious if this is becoming something that you are
             | using in your own day-to-day, or if your focus right now is
             | on building it?
             | 
             | The context for my question is just a general interest in
             | the transition to AI-enabled workflows. I know that I could
             | be much more productive if I figured out how to integrate
             | AI assistance into my workflows better.
        
               | awwaiid wrote:
               | Only building so far.
               | 
               | The one use-case that is _close_ to ready-for-useful: I
               | often take business meeting notes. In these notes I often
               | write a T in a circle to indicate a TODO item. I am going
               | to add a bit of config in there, basically "If you see a
               | circle-T, then go add that to my todo list if it isn't
               | already there. If you see a crossed-out circle-T then go
                | mark it as done on the todo list".
               | 
               | I got slightly distracted implementing this, working
               | instead toward a pluggable "if you see X call X.sh"
               | interface. Almost there though :)
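                | 
                | A tiny sketch of that pluggable idea, assuming the
                | model's tool call comes back as a trigger name plus a
                | JSON argument blob; the hooks/ layout and the
                | circle_t_todo name are hypothetical.
                | 
                |   // Sketch: run ./hooks/<trigger>.sh with the tool
                |   // arguments on stdin.
                |   use std::io::Write;
                |   use std::path::Path;
                |   use std::process::{Command, Stdio};
                | 
                |   fn dispatch(trigger: &str, args_json: &str)
                |       -> std::io::Result<()> {
                |       let script = format!("./hooks/{trigger}.sh");
                |       if !Path::new(&script).exists() {
                |           eprintln!("no hook for {trigger:?}");
                |           return Ok(());
                |       }
                |       let mut child = Command::new(&script)
                |           .stdin(Stdio::piped())
                |           .spawn()?;
                |       child.stdin.as_mut().unwrap()
                |           .write_all(args_json.as_bytes())?;
                |       child.wait()?;
                |       Ok(())
                |   }
                | 
                |   fn main() -> std::io::Result<()> {
                |       // e.g. the model saw a circle-T by "email Alice"
                |       dispatch("circle_t_todo",
                |                r#"{"text":"email Alice"}"#)
                |   }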
        
         | loxias wrote:
         | Wow! This is really cool! Really really cool! I imagine some
         | sort of use where it's even more collaborative and not just
         | "unadorned turn-by-turn".
         | 
         | For example, maybe I'm taking notes involving words, simple
         | math, and a diagram. Underline a key phrase and "the device"
         | expands on the phrase in the margin. Maybe the device is
         | diagramming, and I interrupt and correct it, crossing out some
         | parts, and it understands and alters.
         | 
         | Sorry, I know this is vague, I don't know precisely what I
         | mean, but I do think that the combination of text (via some
         | sort of handwriting recognition), stroke gestures, and a small
         | iconography language with things enabled by LLMs probably opens
         | up all sorts of new user interaction paradigms that I (and
         | others) might be too set in our ways to think of immediately.
         | 
         | I think there's a "mother of all demos" moment potentially
         | coming soon with stuff like this, but I am NOT a UX designer
         | and can't quite imagine it clearly enough. Maybe you can.
        
           | awwaiid wrote:
           | Yes! I have flashbacks to productive times standing in front
           | of a whiteboard, alone or with others, doodling out thoughts
           | and annotating them. When working with others I can usually
           | talk to them, so we are also discussing as we are drawing and
           | annotating. But also I've handed diagrams / equations to
           | someone and then later they hand me back an annotated version
           | -- that's interesting too.
        
       | vendiddy wrote:
       | I wish the remarkable tablets weren't so locked down.
       | 
       | It's one of my favorite pieces of hardware and wish there were
       | more apps for it.
        
         | thrtythreeforty wrote:
         | Locked down? You can get a shell by ssh'ing to it. Call me when
         | an iPad lets you do that...
        
           | freedomben wrote:
            | I agree, I definitely wouldn't call them "locked down." I do,
            | however, think they could do a lot more to make it
            | usable/hackable. This slightly undermines their cloud-service
            | ambitions, but I think the hackability is what makes the
            | reMarkable so ... well ... remarkable. Certainly that's why I
            | bought one!
        
       ___________________________________________________________________
       (page generated 2025-02-08 23:01 UTC)