[HN Gopher] Llama.ttf: A font which is also an LLM
       ___________________________________________________________________
        
       Llama.ttf: A font which is also an LLM
        
       Author : fuglede_
       Score  : 330 points
       Date   : 2024-06-23 12:01 UTC (10 hours ago)
        
 (HTM) web link (fuglede.github.io)
 (TXT) w3m dump (fuglede.github.io)
        
       | fuglede_ wrote:
       | Very much inspired by this earlier HackerNews post which put Tetris
       | into a font, today we put an LLM and an inference engine into a
       | font so you can chat with your font, or write stuff with your
       | font without having to write stuff with your font.
       | 
       | https://news.ycombinator.com/item?id=40737961
        
       | hsfzxjy wrote:
       | cool. is there a github repo to produce this thing?
        
         | skilled wrote:
         | https://github.com/fuglede/llama.ttf
        
           | simonw wrote:
           | I love that the "Why?" section is deliberately left blank.
        
       | LeonigMig wrote:
       | this is over my head
        
         | polshaw wrote:
         | The critical part is knowing that TTF fonts can include a
         | virtual machine.. then he pops an llm into that and replaces
         | instances of !!!!!! with whatever the llm outputs.
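
A minimal sketch of the idea described above, not the project's actual code. The stored text is never changed; only the displayed string differs. `toy_generate` is a hypothetical stand-in for the model, and showing one character per "!" for a trailing run only is a simplifying assumption.

```python
import re


def toy_generate(prompt: str, n_chars: int) -> str:
    """Hypothetical stand-in for the LLM: a canned continuation, truncated."""
    canned = " there was a font that could talk."
    return canned[:n_chars]


def shape(text: str) -> str:
    """Return what would be displayed; the stored text is left untouched."""
    match = re.search(r"!+$", text)  # only a trailing run of "!" is handled
    if match is None:
        return text
    prompt = text[: match.start()]
    # Assumption: each "!" reveals one more character of the continuation.
    return prompt + toy_generate(prompt, len(match.group()))


stored = "Once upon a time!!!!!!"
shown = shape(stored)
print(repr(stored))  # what a copy and paste would give you
print(repr(shown))   # what the "font" would draw
```

Because only the drawing changes, copying the text still yields the exclamation marks, which is the behaviour several commenters point out further down.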
        
           | abecedarius wrote:
           | Thank you. I wasn't going to watch a video to find out how
           | the LLM actually affects any output.
        
       | xrd wrote:
       | After watching part of the video, I believe the world would
       | benefit from a weekly television program where you could tune in
       | each week to watch something weird, brilliant and funny. This
       | would be a great episode #1 for that television show.
        
         | pininja wrote:
         | This reminds me of Posy. His channel is so fun, weird, and
         | captivating. https://youtube.com/@posymusic
        
           | hawski wrote:
           | Watching his videos is often very calming for me at the same
           | time. Visually striking beauty in simple things, calm
           | narration and pleasing music. I also recommend his Lazy
           | channel.
        
             | pininja wrote:
             | Watching "Thing on my carpet" is like witnessing curiosity
             | in its purest form
        
         | btown wrote:
         | On the esoteric software engineering side, Tom7 is the channel
         | you're looking for! https://www.youtube.com/@tom7
        
           | nextaccountic wrote:
           | Also Paralogical https://youtube.com/@paralogical-dev
        
         | haunter wrote:
         | Adult Swim's Off the Air
         | 
         | https://www.adultswim.com/videos/off-the-air
         | 
         | or on Youtube
         | https://www.youtube.com/playlist?list=PLQl8zBB7bPvLWfGCVicg_...
        
         | amelius wrote:
         | Didn't Slashdot try this?
        
         | surfingdino wrote:
         | So, basically TechMoan /s
        
       | electric_mayhem wrote:
       | While cool, technically... From a security perspective today I
       | learned that TrueType fonts have arbitrary code execution as a
       | 'feature' which seems mostly horrific.
        
         | samwillis wrote:
         | Not really, no more so than a random webpage running js/WASM in
         | a sandbox.
         | 
         | The only output from the WASM is to draw to screen. There is no
         | chance of a RCE, or data exfiltration.
        
           | electric_mayhem wrote:
           | I'm open to your idea, but can you explain in technical terms
           | why a wasm sandbox is invulnerable to the possibility of
           | escape vulnerabilities when other flavors of sandboxes have
           | not been?
        
           | turnsout wrote:
           | The risk is that you could have the text content say one
           | thing while the visual display says another. There are social
           | engineering and phishing risks.
        
           | xg15 wrote:
           | It's still horrible, not in a (direct) security but in an
           | interop sense: Now you have to embed an entire WASM engine,
           | including proper sandboxing, just to render the font
           | correctly. That's a huge increase of complexity and attack
           | surface.
        
             | simonw wrote:
             | I'm hoping that in a few years time WASM sandboxes will be
             | an expected part of how most things in general purpose
             | computing devices work.
             | 
             | There's very little code in the world that I wouldn't want
             | to run in a robust sandbox. Low level OS components that
             | manage that sandbox is about it.
        
               | xg15 wrote:
               | Normalizing the complexity doesn't make it go away.
               | 
               | Ideally, I'd like not to execute any kind of arbitrary
               | code when doing something as mundane as rendering a font. If
               | that's not possible, then the code could be restricted to
               | something less than Turing complete, e.g. formula
               | evaluation (i.e. lambda calculus) without arbitrary
               | recursion.
               | 
               | The problem is that even sandboxed code is unpredictable
               | in terms of memory and runtime cost and can only be
               | statically analyzed to a limited extent (halting problem
               | and all).
               | 
               | Additionally, once it's there, people will bring in
               | libraries, frameworks and sprawling dependency trees,
               | which will further increase the computing cost and
               | unpredictability of it.
        
               | simonw wrote:
               | That's why I care so much about WebAssembly (and other
               | sandbox) features that can set a strict limit on the
               | amount of memory and CPU that the executing code can
               | access.
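
To make the idea of a strict CPU and memory limit concrete, here is a toy sketch. It is not WebAssembly and not any real runtime's API; `run`, `BudgetExceeded` and the three-instruction set are invented for illustration. The host counts steps and stack slots and aborts the guest program when a budget is exceeded.

```python
class BudgetExceeded(Exception):
    """Raised when the guest program uses more steps or stack than allowed."""


def run(program, max_steps, max_stack):
    """Interpret a tiny stack program under a step budget and a stack budget."""
    stack = []
    pc = 0
    steps = 0
    while pc < len(program):
        steps += 1
        if steps > max_steps:
            raise BudgetExceeded("step budget exhausted")
        op, arg = program[pc]
        if op == "push":
            if len(stack) >= max_stack:
                raise BudgetExceeded("stack budget exhausted")
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "jump":
            pc = arg  # unconditional jump, enough to build an endless loop
            continue
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1
    return stack


print(run([("push", 2), ("push", 3), ("add", None)], max_steps=10, max_stack=4))
try:
    run([("jump", 0)], max_steps=1000, max_stack=4)  # would never halt
except BudgetExceeded as exc:
    print("aborted:", exc)
```

The host does not need to decide in advance whether the program halts; it only needs to stop paying for it after a fixed budget.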
        
               | rft wrote:
               | Your comment reminded me of this great talk [1] (humor
               | ofc). While it talks about asm.js, WASM is in many ways,
               | IMO, the continuation of asm.js
               | 
               | [1] https://www.destroyallsoftware.com/talks/the-birth-
               | and-death...
        
             | Bluestein wrote:
             | While neat in a "because we can" kind of sense, it really
             | is maddening: Have we gone "compute-mad" and will end up
             | needing a full-fledged VM to render ever-smaller subsets of
             | UI or content until ... what?
             | 
             | What is the end game here?
             | 
             | It is kind of like a "fractal" attack surface, with
             | increasing surface the "deeper" one looks into it. It is
             | nightmarish from that perspective ...
        
           | Hizonner wrote:
           | > Not really, no more so than a random webpage running
           | js/WASM in a sandbox.
           | 
           | ... except that it can happen in non-browser contexts.
           | 
           | Even for browsers, it took 20+ years to arrive at a
           | combination of ugly hacks and standard practices where
           | developers who make no mistakes in following a million arcane
           | rules can mostly avoid the massive day-one security problems
           | caused by JavaScript (and its interaction with other
           | misfeatures like cookies and various cross-site nonsense).
           | During all of which time the "Web platform" types were
           | beavering away giving it more access to more things.
           | 
           | The Worldwide Web technology stack is a pile of ill-thought-
           | out disasters (or, for early, core architectural decisions,
           | not-thought-out-at-all disasters), all vaguely contained with
           | horrendous hackery. This adds to the pile.
           | 
           | > The only output from the WASM is to draw to screen.
           | 
           | Which can be used to deceive the user in all kinds of well-
           | understood ways.
           | 
           | > There is no chance of a RCE, or data exfiltration.
           | 
           | Assuming there are no bugs in the giant mass of code that a
           | font can now exercise.
           | 
           | I used to write software security standards for a living.
           | Finding out that you could embed WASM in fonts would have
           | created maybe two weeks of work for me, figuring out the
           | implications and deciding what, if anything, could be done
           | about them. Based on, I don't know, a hundred similar cases,
           | I believe I probably would have found some practical issues.
           | I might or might not have been able to come up with any
           | protections that the people writing code downstream of me
           | could (a) understand and (b) feasibly implement.
           | 
           | Assuming I'd found any requirements-worthy response, it
           | probably would have meant much, much more work than that for
           | the people who at least theoretically had to implement it,
           | and for the people who had to check their compliance. At one
           | company.
           | 
           | So somebody can make their kerning pretty in some obscure
           | corner case.
        
           | kenferry wrote:
           | Why do you say that? Security exploits involving fonts are
           | extremely common.
        
         | px43 wrote:
         | If you think that's bad, until very recently, Windows used to
         | parse ttf directly in the kernel, meaning that a target could
         | look at a webpage, or read an email, and be executing arbitrary
         | code in ring0.
         | 
         | Last I checked there were about 4-10 TTF bugs discovered and
         | actively exploited per year. I think I heard those stats in
         | 2018 or so. This has been a well known and very commonly
         | exploited attack vector for at least 20 years.
        
           | anthk wrote:
           | The same with Wav files.
        
         | rft wrote:
         | (Sadly) this is nothing new. Years ago I wrangled a (modified)
         | bug in the font rendering of Firefox [1, 2016] into an exploit
         | (for a research paper). Short version: the Graphite2 font
         | rendering engine in FF had/has? a stack machine that can be
         | used to execute simple programs during font rendering. It
         | sounded insane to me back then, but I dug into it a bit. Turns
         | out while rendering Roman based scripts is relatively
         | straightforward [2], there are scripts that need heavy use of
         | ligatures etc. to reproduce correctly [3]. Using a basic
         | scripting (heh) engine for that does make some sense.
         | 
         | Whether this is good or bad, I have no opinion on. It is "just"
         | another layer of complexity and attack surface at this point.
         | We have programmable shaders, rowhammer, speculative execution
         | bugs, data timing side channels, kernel level BPF scripting,
         | prompt injection and much more. Throwing WASM based font
         | rendering into the mix is just balancing more on top of the
         | pile. After some years in the IT security area, I think there
         | are so many easier ways to compromise systems than these arcane
         | approaches. Grab the data you need from a public AWS bucket or
         | social engineer your access, far easier and cheaper.
         | 
         | For what it's worth, I think embedded WASM is a better idea
         | than rolling your own eco systems for scripting capabilities.
         | 
         | [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1248876
         | 
         | [2] I know, there are so many edge cases. I put this in the
         | same do not touch bucket as time and names.
         | 
         | [3]
         | https://scripts.sil.org/cms/scripts/page.php?id=cmplxrndexam...
        
       | closetkantian wrote:
       | This is really cool, but I'm left with a lot of questions. Why
       | does the font always generate the same string to replace the
       | exclamation points as he moves from gedit to gimp? Shouldn't the
       | LLM be creating a new "inference"?
       | 
       | As an aside, I originally thought this was going to generate a
       | new font "style" that matched the text. So for example, "once
       | upon a time" would look like a storybook style font or if you
       | wrote something computer science-related, it would look like a
       | tech manual font. I wonder if that's possible.
        
         | closetkantian wrote:
         | So, another poster cleared up my first question. It's probably
         | because the seed is the same. I think it would have been a
         | better demo if it hadn't been, though.
        
           | thomasfromcdnjs wrote:
           | But having the same "seed" doesn't guarantee the same
           | response from an LLM, hence the question above.
        
             | wavemode wrote:
             | I fail to understand how an LLM could produce two different
             | responses from the same seed. Same seed implies all random
             | numbers generated will be the same. So where is the source
             | of nondeterminism?
        
               | furyofantares wrote:
               | I believe people are confused because ChatGPT's API
               | exposes a seed parameter which is not guaranteed to be
               | deterministic.
               | 
               | But that's due to the possibility of model configuration
               | changes on the service end and not relevant here.
        
             | dragonwriter wrote:
             | Barring subtle incompatibilities in underlying
             | implementations on different environments, it does,
             | assuming all other generation settings (temperature, etc.)
             | are held constant.
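
A small sketch of the determinism argument, using a toy next-token distribution rather than a real model (the logits and token names are made up). With a fixed seed the random draws repeat exactly, and at temperature 0 the sampler reduces to picking the highest-scoring token, so the seed no longer matters at all.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Pick one token from a dict of token -> logit."""
    if temperature == 0:
        # Greedy decoding: no randomness involved.
        return max(logits, key=logits.get)
    tokens = list(logits)
    weights = [math.exp(logits[t] / temperature) for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(seed, temperature, steps=5):
    """Draw `steps` tokens from a fixed toy distribution."""
    rng = random.Random(seed)  # private generator, fully determined by seed
    logits = {"once": 2.0, "upon": 1.5, "a": 1.0, "time": 0.5}
    return [sample_token(logits, temperature, rng) for _ in range(steps)]


print(generate(seed=42, temperature=1.0) == generate(seed=42, temperature=1.0))
print(generate(seed=1, temperature=0) == generate(seed=2, temperature=0))
```

This only covers the sampling step. Differences in floating-point behaviour between environments, as mentioned above, are outside what the sketch models.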
        
           | fuglede_ wrote:
           | You got it, same seed in practice, but also just temperature
           | = 0 for the demo actually. A few things I considered adding
           | for the fun of it were 1) a way to specify a seed in the
           | input text, 2) a way to use a symbol to say "I didn't like
           | that token, try to generate another one", so you could do,
           | say, "!" to generate tokens, "?" to replace the last
           | generated token. So you would end up typing things like
           | 
           | "Once upon a
           | time!!!!!!!!!!!!!!!!!!!!!!!!!!!!!SEED42!!!!!??!!!??!"
           | 
           | and 3) actually just allow you to override the suggestions by
           | typing letters of your own, to be used in future
           | inferences. At that point it'd be a fairly generic auto-
           | complete kind of thing.
        
             | jameshart wrote:
             | Using the input characters to affect the token selection
             | would increase the 'magic' a little.
             | 
             | As it is, if you go back into a string of !!!!!!!!!! that
             | has been turned into 'upon a time', and try to delete the
             | 'a', you'll just be deleting an ! and the string will turn
             | into 'once upon a tim'.
             | 
             | If you could just keyboard mash to pass entropy to the
             | token sampler, deleting a specific character would alter
             | the generation from that point onwards.
        
       | wiradikusuma wrote:
       | So how do you copy the output?
        
         | simonw wrote:
         | Screenshot and OCR!
        
         | phaym wrote:
         | Since it only alters the presentation of the text, not the
         | text/data itself, maybe using a type of image-to-text tool like
         | this could work: https://www.imagetotext.info/
         | 
         | I guess that's the closest you get to copying.
        
       | jonathaneunice wrote:
       | I never imagined a future in which PDFs talked back. Now I can.
        
         | anthk wrote:
         | PostScript files are dynamic code. You can create polygons
         | dynamically with commands. And, of course, font FX's, styles,
         | ellipses...
         | 
         | Also, there's a ZMachine interpreter (text adventure player)
         | written in PostScript which can play Zork and some libre games
         | such as Calypso with just GhostScript, the PostScript
         | interpreter most software uses to render PostScript files.
        
       | polshaw wrote:
       | This is cool; one practical issue though (aside from the
       | 280GB TTF file!) is that it makes it incompatible with all other
       | fonts; if you copy and paste your "improved" text then it will no
       | longer say what you thought it did. It just alters the
       | presentation, not the content. I guess you would have to ocr to
       | get the content as you see it.
       | 
       | I was wondering why this was never used for a simpler
       | autocorrect, but I guess that's why.
       | 
       | Also perhaps someone more educated on LLMs could tell me; this
       | wouldn't always be consistent right? Like "once upon a time
       | _____" wouldn't always output the same thing, yes? If so even
       | copying and pasting in your own system using the correct font
       | could change the content.
        
         | Retr0id wrote:
         | If there's any randomness involved in inference, it ought to be
         | deterministic as long as the same seed is used each time.
        
           | furyofantares wrote:
           | Is there even any possibility of using a different seed? I'd
           | doubt the WASM shaper has access to any source of
           | non-determinism.
        
         | magnat wrote:
         | > if you copy and paste your "improved" text then it will no
         | longer say what you thought it did
         | 
         | It's not a bug, it's a feature - a DRM. Your content can now be
         | consumed, but cannot be copied or modified - all without
         | external tools, as long as you embed that TTF somehow.
         | 
         | Which kind of reminds me of the PDF invoices I got from my
         | electricity provider. They looked and printed perfectly fine,
         | but used a weird codepoint mapping which resulted in complete
         | garbage when trying to copy any text from them. Fun times,
         | especially when pasting the account number into a banking app.
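
The copy and paste garbage described above can be reproduced with a toy model of a font whose character-to-glyph mapping is scrambled. This is an illustrative sketch only: the rotation-based mapping and the sample number are made up, and real PDFs use more elaborate encodings.

```python
def build_cmap(alphabet, shift):
    """Map each stored character to the glyph drawn for it (a rotation)."""
    return {
        ch: alphabet[(i + shift) % len(alphabet)]
        for i, ch in enumerate(alphabet)
    }


def encode_for_font(visible, cmap):
    """Choose stored characters so that the font draws `visible`."""
    inverse = {glyph: stored for stored, glyph in cmap.items()}
    return "".join(inverse.get(ch, ch) for ch in visible)


def render(stored, cmap):
    """What the reader sees when the stored text is drawn with this font."""
    return "".join(cmap.get(ch, ch) for ch in stored)


cmap = build_cmap("0123456789", shift=3)
stored = encode_for_font("2024-0615", cmap)  # made-up account number
print("on screen :", render(stored, cmap))
print("clipboard :", stored)
```

The page looks right because rendering goes through the font, while the clipboard receives the stored characters, which is why OCR is often the only reliable way to get the visible text back.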
        
           | mbb70 wrote:
           | This is why pretty much all software that extracts
           | structured data from PDFs throws away the text and just OCRs
           | the page. Too many tricks with layouts and fonts.
        
       | xg15 wrote:
       | > _The font shaping engine Harfbuzz, used in applications such as
       | Firefox and Chrome, comes with a Wasm shaper allowing arbitrary
       | code to be used to "shape" text._
       | 
       | Has there already been a proposal to add scripting functionality
       | to Unicode itself? Seems to me we're not very far from that
       | anymore...
        
         | magicalhippo wrote:
         | Unicode OS when?
        
         | DemocracyFTW2 wrote:
         | Considering the actual complexity of rendering e.g. Urdu in
         | decent, native-looking way you presumably do want some Turing-
         | complete capabilities at least in some cases, cf "One
         | handwritten Urdu newspaper, The Musalman, is still published
         | daily in Chennai.[232] InPage, a widely used desktop publishing
         | tool for Urdu, has over 20,000 ligatures in its Nasta'liq
         | computer fonts."
         | (https://en.wikipedia.org/wiki/Urdu#Writing_system)
         | 
         |  _Edit_ --the OP uses this exact use case, Urdu typesetting, to
         | justify WASM in Harfbuzz (video around 6:00); seems like Urdu
         | has really become the posterchild for typographic complexity
         | these days
        
         | crazygringo wrote:
         | To Unicode? Good god please no. Unicode is just codepoints. I
         | shudder to think what adding scripting support to that would
         | even mean.
         | 
         | Maybe you meant adding it to OpenType?
        
           | xg15 wrote:
           | I was being sarcastic, but yes, I meant unicode...
        
             | crazygringo wrote:
             | Sometimes you just can't tell, you know... OK, my sanity is
             | restored, thanks. :)
        
         | winternewt wrote:
         | You mean encoding executable code in plain text files, that
         | execute when you open them? No, that seems unnecessary and very
         | insecure.
        
       | yourfriendpalsy wrote:
       | Interesting idea, but needs to be ported to the Typescript type
       | system.
        
       | simonw wrote:
       | > The font shaping engine Harfbuzz, used in applications such as
       | Firefox and Chrome, comes with a Wasm shaper allowing arbitrary
       | code to be used to "shape" text.
       | 
       | In that case could you ship a live demo of this that's a web page
       | with the font embedded in the page as a web font, such that
       | Chrome and Firefox users can try it out without installing
       | anything else?
        
         | chazeon wrote:
         | As shown in the video, the font is 280 GB, so opening such a
         | page will practically be a nightmare, especially if you are on
         | cellular.
        
         | binwiederhier wrote:
         | In the video he shows that the font file size is 290GB, so I
         | would assume that's a little prohibitive.
        
           | codezero wrote:
           | That's only for a 70B param LLM. The one he includes is 15M
           | params and weighs about 60MB. Not tiny, but doable.
        
           | azeirah wrote:
           | That's LLaMa-3-70B. The demo he gives at 6:09 is
           | tinystories-15m, which is 30.4MB, so you'd only have to add
           | the font to that (80~KB?)
           | 
           | https://huggingface.co/nickypro/tinyllama-15M/tree/main
        
         | erk__ wrote:
         | The wasm shaper is an experimental feature that is not enabled
         | in any browser at the moment.
        
       | Xlythe wrote:
       | It seems like it'd be possible to, instead of typing multiple
       | exclamation points, have one trigger-character (eg. ). And then
       | replace that character visually with an entire paragraph of text,
       | assuming there aren't limits to the width of a character in
       | fonts. I suppose the cursor and text wrapping would go wonky,
       | though.
       | 
       | You could also use this to make animated fonts. An excuse to hook
       | up a diffusion model next?
        
       | bitwize wrote:
       | > The font shaping engine Harfbuzz, used in applications such as
       | Firefox and Chrome, comes with a Wasm shaper allowing arbitrary
       | code to be used to "shape" text.
       | 
       | Oh, this can't be used for nefarious purposes. What could
       | POSSIBLY go wrong?!
        
       | lacoolj wrote:
       | Hello. I'm Dr. Sheldon Cooper. And welcome to Sheldon Cooper
       | Presents: Fun with Fonts
        
       | exe34 wrote:
       | your engineers were so busy finding out if they could, they never
       | stopped to ask if they should!
        
       | rhyjyrtjhtyn wrote:
       | The author categorizes this as "pointless", but one thing I can
       | think of is being able to create automated workflows within an
       | app that didn't previously allow it or had limited scope, and
       | then creating interoperability with other apps using the same
       | method.
        
         | ComputerGuru wrote:
         | You mean via wasm hinting in general or embedded llm in
         | specific? Because I don't see why you need an llm for that.
        
       | pk-protect-ai wrote:
       | I will never allow my linux to update my fonts ever again ...
       | Arbitrary code execution in its finest form.
        
       | tcsenpai wrote:
       | I may be doing this wrong but... the font provided just installs
       | as OpenSans and does not provide any functionality, at least in
       | Mousepad or LibreOffice Writer. I am talking about the 90MB one.
        
         | fuglede_ wrote:
         | Yeah, sorry, that could have been clearer, I added a few more
         | instructions. Basically, chances are that even if you've got
         | Harfbuzz running, you're still running a version with no Wasm
         | runtime. If so, chances are you can get away with building it
         | with Wasm support, then add the built library to LD_PRELOAD
         | before running the editor.
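
A hedged sketch of the LD_PRELOAD step described above, written as a small Python launcher. The library file names are the ones quoted from the project's instructions elsewhere in this thread; the directories are hypothetical placeholders, and `preload_env` is an invented helper, not part of Harfbuzz or any tool.

```python
import os


def preload_env(libraries, base_env=None):
    """Return a copy of the environment with `libraries` put first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    parts = [str(lib) for lib in libraries]
    existing = env.get("LD_PRELOAD", "")
    if existing:
        parts.append(existing)  # keep whatever was already preloaded
    # ld.so accepts a colon-separated (or space-separated) list.
    env["LD_PRELOAD"] = ":".join(parts)
    return env


# Hypothetical locations; point these at wherever your own builds ended up.
libs = [
    "/path/to/harfbuzz/build/libharfbuzz.so.0.60811.0",
    "/path/to/wasm-micro-runtime/build/libiwasm.so",
]
print(preload_env(libs, base_env={})["LD_PRELOAD"])

# To actually launch an editor with the custom build (not run here):
#   import subprocess
#   subprocess.run(["gedit"], env=preload_env(libs))
```

Passing the environment to a single child process keeps the experimental Harfbuzz build from affecting every other program on the system.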
        
           | tcsenpai wrote:
           | That was useful. I have indeed compiled and installed wasm-
           | micro and now meson build it successfully. Tho "meson compile
           | -C build" returns an error about not finding "hb-wasm-api-
           | list.hh". Do you have any experience of that?
           | 
           | EDIT: Nevermind. Using the exact commits you linked give
           | another error (undefined reference to
           | wasm_externref_ref2obj). I give up
        
             | fuglede_ wrote:
             | Another font connoisseur put together a script here that
             | might be helpful: https://github.com/hsfzxjy/Harfbuzz-WASM-
             | Fantasy/blob/master...
        
               | tcsenpai wrote:
               | Managed to build it using "-DWAMR_BUILD_REF_TYPES=1" And
               | it works! Now is time to dive deep :)
        
       | Dwedit wrote:
       | I thought the Bad Apple font was really neat, but this is just
       | too much.
        
       | NayamAmarshe wrote:
       | This is the coolest thing I've seen this week.
        
       | geor9e wrote:
       | >build Harfbuzz with -Dwasm=enabled and build wasm-micro-runtime,
       | then add the resulting shared libraries, libharfbuzz.so.0.60811.0
       | and libiwasm.so to the LD_PRELOAD environment variable before
       | running a Harfbuzz-based application such as gedit or GIMP
       | 
       | It'd be lovely if someone embedded the font in a website form to
       | save us all the trouble of demoing it
        
         | erk__ wrote:
         | It would not be of much use as no browser enables this
         | experimental feature. So unless you somehow build a wasm build
         | of Harfbuzz with the feature enabled and embed it on there
         | nothing will happen.
        
           | choppaface wrote:
           | And thank goodness it's disabled, or we could have another
           | JBIG2 https://googleprojectzero.blogspot.com/2021/12/a-deep-
           | dive-i...
        
             | pennomi wrote:
             | Yeah I know these posts are all funny use cases, but all I
             | can see are font-based security nightmares.
        
       | jraph wrote:
       | (Show HN)
        
       | ranger_danger wrote:
       | This is terrifying.
        
       | bastien2 wrote:
       | Well this definitely won't get exploited at all or lead to new
       | strict limits on what Harfbuzz/WASM can do
        
         | lxgr wrote:
         | WASM sandboxing is pretty good! Together with the presumably
         | very limited API with which this can communicate with the
         | outside world, I wouldn't be too concerned.
         | 
         | To me, it's a great reminder that the line between well-
         | sandboxed turing-complete execution environments and messy
         | implementations of decoders for "purely declarative" data
         | formats can be quite blurry.
         | 
         | Said differently, I'd probably trust Harfbuzz/WASM more than
         | the average obscure codec implementation in ffmpeg.
        
       | zharknado wrote:
       | My takeaway is that if you can efficiently simulate rendering
       | raster graphics with text ligatures, you could run Doom in a TTF.
       | 
       | Right?
        
       | petters wrote:
       | The page links to https://www.coderelay.io/fontemon.html which is
       | a game embedded into a font. Playable in the browser.
        
       | freitasm wrote:
       | Stopped watching when the demo showed the letter O with a slash.
       | That would confuse me a lot. I am an old timer and expect the
       | zero to have it.
        
       | anthk wrote:
       | A Z Machine in a TTF font, anyone?
        
       | tcsenpai wrote:
       | After your help and troubleshooting, I am happy to notify you
       | that your work has been archived
       | (https://archive.tunnelsenpai.win/archive/1719179042.512455/i...
       | and in the Internet Archive). Thanks!
        
       | bbor wrote:
       | Wow, this is incredible. OP you (I?) should train a few models
       | with different personalities/tasks and pair them with the 5
       | GitHub Monaspace fonts accordingly, allowing people in multifont
       | programs to easily get different kinds of help in different
       | situations. Lots of little ideas sparked by this... in general, I
       | think this a good reminder that we are vastly underestimating
       | fonts in discussions of UI (and, it appears, UX in full!)
        
       ___________________________________________________________________
       (page generated 2024-06-23 23:00 UTC)