[HN Gopher] Hey, computer, make me a font
       ___________________________________________________________________
        
       Hey, computer, make me a font
        
       Author : pavanyara
       Score  : 239 points
       Date   : 2023-10-03 12:17 UTC (10 hours ago)
        
 (HTM) web link (serce.me)
 (TXT) w3m dump (serce.me)
        
       | scarygliders wrote:
       | Okay I can't try it out anyway. "Blocksparse is not available:
       | the current GPU does not expose Tensor cores"
       | 
       | My "best" GPU is an RTX 2070 Super, Turing architecture.
       | 
       | I've seen similar messages when using stable-diffusion... either
       | with -webui or with automatic, can't exactly remember, but they
       | both run fine on that RTX 2070 Super, so I can only guess that
       | they revert to some other method than Blocksparse on seeing that
       | it doesn't support Turing. Or something. I haven't looked into
       | how they deal with it.
       | 
       | I've submitted an Issue [0] for it. I don't have enough knowledge
       | to know if there's some way of saying "don't use Blocksparse" for
       | fontogen.
       | 
       | [0] https://github.com/SerCeMan/fontogen/issues/2
        
       | philipwhiuk wrote:
       | > To train the model, I assembled a dataset of 71k distinct
       | fonts.
       | 
       | I give it a week before Monotype sues your face off.
        
         | yellow_postit wrote:
         | Font law is almost as complex and fascinating as Tree law.
         | Given how complex font licensing can be, a generative use case
         | that produces usable fonts would be a huge threat to the
          | foundries and I expect they will be very litigious, just as
         | Getty and others are in the image space.
        
           | dwaltrip wrote:
           | Tree law? Please say more, sounds interesting
        
             | 123pie123 wrote:
             | possibly this? https://www.atlasobscura.com/articles/tree-
             | law-is-a-gnarly-t...
             | 
             | ...."It's never about the trees," Bonapart says. "The trees
             | often serve as lightning rods for other issues that are the
             | psychological underpinning of a dispute that people might
             | have with each other."
        
         | mock-possum wrote:
          | Not this again /eyeroll
         | 
         | It's not illegal for a human to look through 71,000 fonts and
          | then create their own. It can't be illegal for a human to use a
         | robot to look through the fonts for them.
        
           | ChristianGeek wrote:
           | It depends on exactly what is learned from looking through
           | them. If you end up copying shapes and segments then there
           | are possible grounds for a lawsuit. If you're able to
           | determine the rules to make a good font from your analysis,
           | however, then nothing is stopping you from applying them.
        
         | ballenf wrote:
          | Copyright around fonts may not support such a suit in the same
          | way as works of art.
         | 
         | Wikipedia says: "In the United States, the shapes of typefaces
         | are not eligible for copyright but may be protected by design
         | patent (although it is rarely applied for, the first US design
         | patent that was ever awarded was for a typeface).[1]"
         | 
         | So just scanning the rendered font (as opposed to the code that
         | generates it), may be harder to stop than scanning of artwork.
         | 
         | https://en.wikipedia.org/wiki/Intellectual_property_protecti...
        
           | jddj wrote:
           | > may be harder to stop than scanning of artwork.
           | 
           | Which has not been particularly easy to stop either
        
           | virtue3 wrote:
           | The interesting thing about law is that even if the law
           | doesn't absolutely protect you; the person that you believe
           | infringing on your work for free had better be prepared to
           | pony up lawyer fees to defend their work.
           | 
           | I think this would be one of the few times where I think
           | that's useful. Typefaces take a lot of time and consideration
           | and work to create so just blanket ripping off that work
           | because we all take them for granted is kind of bullshit.
           | 
           | I have conflicting thoughts about this.
           | 
            | But at the end of the day, if you only trained on open fonts,
            | they just aren't generally as good, and the output won't be
            | either, as opposed to training on nicer fonts that you
            | technically don't have the rights to (but no one thought of
            | this being an issue at the time of design patents / etc.).
           | 
           | But we're now in the world where we will pay money to compute
           | an AI model to design fonts instead of just paying designers
           | to design fonts. The race to the bottom is accelerating at an
           | alarming rate.
        
         | BeFlatXIII wrote:
         | That's why it's so important for the weights to be released to
         | the public ASAP. Even when the original is sued, they can still
         | be passed around in torrents for hobbyists and third-world
         | businessmen to enjoy.
        
       | andybak wrote:
       | Everyone knows that AIs can't draw sans...
        
       | TheRealPomax wrote:
       | Neat! Does it have prompt capabilities for things like FVAR,
       | GSUB, and GPOS? E.g. "okay now include a many-to-one ligature
       | that turns the word 'chicken' into an emoji of a chicken in the
       | same style" or "now make a second, sans-serif, robotic style and
       | add an axis called interpol that varies the font from the style
       | we just made to this new style"?
        
         | simbolit wrote:
         | Not OP, but the answer is "no".
         | 
         | What exactly made you suspect such abilities?
        
           | TheRealPomax wrote:
           | Odd phrasing, but: the part where I've worked on OpenType
           | parsing for decades and love seeing people with a passion for
           | digital typefaces make new and creative tools in that space.
           | Typically folks don't stop working on a cool tool after they
           | write a blog post, they're still refining and extending, so
           | you never know how far someone is trying to take a tool
           | without asking them.
        
             | simbolit wrote:
             | So, if I understand you correctly, it was less a question
             | "does it do x?" and more an indirect form of "hey OP, would
             | be cool if it did X" ?
        
               | TheRealPomax wrote:
               | This is not the place for starting a discussion about
               | whether context and subtext should be implicit or
               | explicit in written English. That's what
               | https://philosophy.stackexchange.com is for.
        
               | simbolit wrote:
               | I don't want to start a discussion, I wanted to know if I
               | misread your original comment, and whether you meant
               | something different from what I thought you meant.
               | 
               | From your answer, tho very indirect, I now suspect that I
               | did misunderstand your initial comment, and answering
               | your question was missing the point.
               | 
               | That is all I wanted (after at first wanting to be
               | helpful).
        
               | TheRealPomax wrote:
               | Fair enough, I thought you were trolling, but you've made
               | it clear you weren't. I wrote my comment as a question
               | that would hopefully engage the author on the
               | capabilities (both concrete, as well as hypothetical) of
               | this approach to font generation.
        
       | PaulHoule wrote:
        | Kinda funny how it works well at this, whereas diffusion models
        | go to die when it comes to drawing text. But of course, it works
        | in a completely different manner.
        
         | TheRealPomax wrote:
         | There's a huge difference between "pictures of letters" and
         | "writing text" though. Ask stable diffusion to write text and
         | it'll generate hilarious weird-looking results. But, ask it to
         | generate individual letters (e.g. "Show me an ornate uppercase
         | letter b") and it'll do that for you with (mostly) no problems.
        
         | ilaksh wrote:
         | SDXL can do text kind of. Also isn't DALLE-3 a diffusion model?
         | 
         | But yeah overall diffusion has not generally been able to do it
         | at all before.
        
           | gwern wrote:
           | > But yeah overall diffusion has not generally been able to
           | do it at all before.
           | 
           | Imagen/Parti were doing text just fine long before DALL-E 3
           | was announced. GANs were also learning some text in the
           | earlier runup (even ProGAN was doing striking 'moon runes' -
           | amusingly, they were complete gibberish because it did
           | mirroring data augmentation).
        
       | euroderf wrote:
       | Has anyone tried using an LLM to make a font based on their
       | handwriting ?
       | 
       | EDIT: There's a couple (IIRC) of online services that offer this.
        
         | OnlyMortal wrote:
         | If it was my handwriting, it wouldn't be popular.
         | 
         | Perhaps a cursive font might be good though I'm pretty sure one
         | exists.
         | 
         | An expert system might be able to join up the letters in
         | cursive and make intentional mistakes to give it the character
         | of natural handwriting?
        
       | logdahl wrote:
       | Cool! Now generate 'upper-uppercase' and see what happens :^)
        
         | matsemann wrote:
         | I think this is a reference to "Uppestcase and Lowestcase
         | Letters", a submission a while back about someone training a ML
         | model to generate lowercase/uppercase letters, and used it to
         | uppercase letters already in uppercase. Quite fun
         | https://news.ycombinator.com/item?id=26667852
        
       | Nevermark wrote:
       | In honor of all the times he pressed his hands into his eyes (and
       | myself doing the same thing):
       | 
        | I present: "Perplexed" by Nisla. [0]
       | 
       | I have a print in my office, in lieu of a mirror.
       | 
       | [0] https://www.sargentsfineart.com/img/nisla/all/nisla-
       | perplexe...
        
       | scarygliders wrote:
       | Hmmm. The model is a ckpt instead of a safetensor.
       | 
       | Pondering on whether to keep proceeding trying this out or not...
       | 
       | EDIT: a scan with picklescan[0] found nothing.. exciting.
       | 
       | [0] https://github.com/mmaitre314/picklescan
        
         | eurekin wrote:
         | proxmox/virtualbox/qemu + throwaway vm
        
           | scarygliders wrote:
           | Quite, I was thinking about doing so.
           | 
           | I just scanned it with picklescan, which found nothing
           | malicious. I just updated my original reply.
        
         | 7moritz7 wrote:
          | Haven't seen a single malicious ckpt file so far. Sure, there
         | is a possibility, but huggingface scans pickled weights
         | automatically so the likelihood of someone using that site to
         | spread malware in this form is super low
        
           | artursapek wrote:
           | "pickled weights"?
           | 
           | serious question, how on Earth should someone like me, who
           | has completely missed the last 12 months of AI development,
           | catch up with the state of the art?
        
             | omneity wrote:
             | Two separate terms here, pickling is a serialization method
             | for Python objects (unrelated to AI per se).
             | 
             | Read more here:
             | https://docs.python.org/3/library/pickle.html
             | 
             | Then "weights" is just referring to a model's weights, a
             | specific instance of a python object that can be pickled.
        
             | scarygliders wrote:
             | Just know that the .ckpt format has more or less been
             | replaced by .safetensors these days.
             | 
             | tl;dr .ckpt files can contain Python pickles containing
             | runnable Python code, which means a Bad Guy could create a
             | .ckpt model containing malicious python code. Basically.
        
             | simbolit wrote:
             | I suppose you being here means that you are already fluent
             | in some programming languages. If so, I would start here:
             | 
              | Conway & White - Machine Learning for Hackers: Case Studies
             | and Algorithms to Get You Started
             | 
             | Once you read and understood this, I'd do an online
             | course...
        
               | artursapek wrote:
               | thank you
        
           | scarygliders wrote:
           | I've never spotted one in the wild either, but, y'know, I
           | like to not be the one who first finds one out... the bad
           | way. ;)
        
       | mastersummoner wrote:
       | Poof! You're a font.
        
       | kleiba wrote:
       | Obligatory xkcd reference: https://xkcd.com/1015/
        
       | dexsst wrote:
        | I used to make some fonts for rare, non-Latin alphabets like the
        | Orkhon script by hand using a Paint-like freeware tool; it was fun
        
       | RugnirViking wrote:
       | Ooh I have to try this out when I get home, looks like the
       | weights are under 1GB too
        
       | boffinAudio wrote:
       | I've long had a project in mind involving the various typefaces
       | of the signage around the city of Vienna, which I find very
       | inspiring in many cases.
       | 
       | The idea is to just take a picture of every different typeface I
       | can find, attached to the local buildings at street level.
       | 
       | There are some truly wonderful typefaces out there, on signage
       | dating back to last century, and I find the aesthetics often
       | quite appealing.
       | 
       | With this tool, could I take a collection of the various
       | typefaces I've captured, and get it to complete the font, such
       | that a sign that only has a few of the required characters could
       | be 'completed' in the same style?
       | 
       | Because if so, I'm going to start taking more pictures of
       | Vienna's wonderful types ..
        
         | chris_st wrote:
         | Even if you never get around to using the photos, I think it
         | would be a wonderful service to take the photos and put them up
         | somewhere for non-Vienna residents to enjoy.
        
           | boffinAudio wrote:
           | Oh, definitely .. but first I must amass an archive worthy of
           | it ..
        
         | simbolit wrote:
         | With this tool: no.
         | 
         | With a next-gen tool: if you do some pre-processing on the
         | images, quite possibly.
        
       | rogual wrote:
       | > THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG
       | 
       | It's "a" quick brown fox, otherwise the sentence has no "a".
        
         | dave78 wrote:
         | Huh, I never gave that sentence much thought, and I guess I
         | never realized it conveniently covered the whole alphabet. It
         | makes so much more sense now!
        
         | lordfrito wrote:
         | Lazy?
        
         | jnosCo wrote:
         | There is an "a" in "lazy"?
        
           | rogual wrote:
           | Well, I'm stupid.
        
             | jameshart wrote:
             | The usual mistake people make in reciting this is to say
             | the fox _jumped_ over the lazy dog, causing them to omit an
             | 's' from the sentence.
             | 
             | Making sure it's 'the' lazy dog rather than 'a' lazy dog is
             | actually important if you care about completing the
             | lowercase alphabet, as without it there's only an uppercase
             | 'T'.
        
           | SamBam wrote:
           | Indeed.
           | 
            | But it's worth having "a lazy" anyway to avoid the repetitive
           | "the."
        
       | lawlessone wrote:
       | This is interesting but i think generating the next letter from
       | the letters before may not be the best way to do it. As you
       | mentioned they degrade with each letter.
       | 
       | Maybe creating one long image of a whole font would work better.
       | 
        | edit: in the above I was misunderstanding what is happening here.
       | 
       | But i still think there must be another way to structure this so
       | the attention mechanism doesn't have to work so hard.
        
         | jrmg wrote:
         | Since the first three letters are good, and generated only with
         | the context of the preceding letters, shouldn't just using the
         | first three (instead of the preceding three) as context for
         | every other one be good enough?
        
       | nitrofurano wrote:
        | lots of kerning misfits ftw
        
         | tabtab wrote:
         | Stop your bigotry of Kernians.
        
       | lachlan_gray wrote:
       | I found a few months ago that the gpt-4 code interpreter is
       | capable of converting a black and white png of a glyph to an svg
       | 
       | https://twitter.com/lfegray/status/1678787763905126400
       | 
       | It would be cool to combine a script like the one gpt-4 gave me
       | with an image generation model to generate fonts. The approach
       | from this blog post is way more interesting though.
       | 
       | On a separate note it reminds me of this suckerpinch video :)
       | maybe we can finally get uppestcase and lowestcase fonts
       | 
       | https://www.youtube.com/watch?v=HLRdruqQfRk
        
         | logicallee wrote:
         | >I found a few months ago that the gpt-4 code interpreter is
         | capable of converting a black and white png of a glyph to an
         | svg
         | 
         | :) Easy there, let's not make all the naysayers who say it only
         | just predicts plausible words sweat.
         | 
         | Your phrasing almost makes it sound like you're sharing a clear
         | example of it analyzing and completing a complex task
         | correctly, while perfectly understanding what it's doing.
         | 
         | Perhaps we should say it only just predicted words that are
         | plausible responses to someone asking to do that, while also
         | predicting plausible words someone might say in response to an
         | error message along the way. It might not actually be doing any
         | converting, just predicting words and tokens without really
         | doing anything.
         | 
         | My favorite part of its predictive capabilities is how it is
         | able to predict the other half of a conversation that literally
         | goes "didn't work, try again", "didn't work, try again", "still
         | didn't work, try again", "all right you finally fixed it good
         | job" - without even telling it why it didn't work or quoting
         | the error message. Somehow it is still able to predict the
         | other half of the conversation so that it ends up with
         | "finally, good job!"
         | 
         | Who knew that to get results that look like it knows what it's
         | doing, it's enough to predict what could make someone say that!
         | 
         | We are truly living in the golden age of statistical prediction
         | that does not involve any degree of thinking, analysis, or
         | understanding.
         | 
         | Truly our age of applied statistics is going better than anyone
         | could have, er, "predicted". :)
        
           | wizzwizz4 wrote:
           | > _Your phrasing almost makes it sound like you 're sharing a
           | clear example of it analyzing and completing a complex task
           | correctly, while perfectly understanding what it's doing._
           | 
           | OpenAI has hardcoded (or heavily overfit) several special-
           | purpose functions into their ChatGPT systems. In the past few
           | months, they've integrated other special-purpose models, so
           | their tools can do more than just predictive text (e.g. image
           | recognition).
           | 
           | GPT can do limited verbal reasoning, whatever else can do
           | image recognition, but that does not mean the combined system
           | can do visual reasoning. There's no mechanism by which it
           | would (unless you specifically create one, but that's not
           | trivial and doesn't generalise).
           | 
           | > _Who knew that to get results that look like it knows what
           | it 's doing, it's enough to predict what could make someone
           | say that!_
           | 
           | Everyone. Some call it "specification gaming" or "reward
           | hacking", and we've known about it for a _long_ time. It 's a
           | really obvious concept if you have a good mental model of
           | reinforcement learning.
           | https://doi.org/10.1162%2Fartl_a_00319 is a fun example.
           | 
           | > _We are truly living in the golden age of statistical
           | prediction that does not involve any degree of thinking,
           | analysis, or understanding._
           | 
           | This is a straw argument. I can't speak for anyone else, but
           | my criticisms are mainly of people seeing some thinking-like,
           | analysis-like or understanding-like behaviour, and assuming
           | that it _is_ human-like thinking, analysis or understanding,
           | while ignoring other hypotheses (some of which make
           | successful advance predictions in a way the "it 's doing what
           | humans do!" models don't).
           | 
           | I will note: the people being the most loudly exuberant about
           | ChatGPT's vast intelligence seem to view it as a _tool_. If I
           | were faced with an opaque box, inside which was a being
           | capable of general-purpose problem solving, conversation, and
           | original thought, my first reaction would _not_ be "I can use
           | this for my own ends". I am glad that I have seen nothing to
           | convince me that ChatGPT is such a being, and I have
           | theoretical arguments that ChatGPT probably _won 't ever_ be
           | such a being, but if you genuinely think this technology has
           | the potential to produce such a being, you have an ethical
           | responsibility.
        
         | specproc wrote:
         | Thank you for sharing that suckerpinch, enjoyed watching that
         | immensely
        
         | bambax wrote:
         | The author says he achieved text-to-SVG generation but doesn't
         | point to a code repository for it... It would be super
         | interesting (or does gpt-4 do it natively?)
         | 
         | That said, I'm not sure that you need GPT-4 for outlining a BW
         | image and making a path out of it; Corel Draw did that well,
         | over 25 years ago?
         | 
         | So yes, another approach to what the author is doing, would be
         | to generate font bitmaps using any of the leading image
         | generators, and then vectorize the bitmaps. Less
         | straightforward and precise, but probably simpler.
        
           | SerCe wrote:
           | Hi, I am the author. For text-to-SVG, check out IconShop [1].
           | It was the paper that I tried to reproduce results from
           | initially. In the paper, there is a comparison of their
           | approach against using GPT-4 [2].
           | 
            | Using vectorisation tools like potrace is indeed a much more
           | popular approach, and there are quite a few papers generating
           | fonts this way. The most recent I believe is DualVector [3].
           | But I tried to approach the problem from another angle.
           | 
           | [1]: https://icon-shop.github.io
           | 
           | [2]: https://arxiv.org/pdf/2304.14400.pdf
           | 
           | [3]: https://openaccess.thecvf.com/content/CVPR2023/html/Liu_
           | Dual...
        
           | simonbw wrote:
           | ChatGPT/GPT-4 does it natively. You can say "Please generate
           | me an SVG image of a unicorn" and it will spit out the SVG
           | code.
        
             | cjaybo wrote:
             | Here's my stupid question of the day:
             | 
             | Would you mind to explain what you mean by "native" in this
             | context?
        
               | tough wrote:
               | Not using a -plugin- probably
        
         | alana314 wrote:
         | That's amazing. One of my favorite things to do with copilot is
         | to comment something like "//white arrow pointing right" and
         | then start "<svg" and have it complete it. If it doesn't get it
         | right the first time I update my comment. Saves me time
         | searching for the right SVG and digging through free but really
         | paid image sites.
        
           | toddmorey wrote:
           | This is such a good idea. Not sure why svg code escaped my
           | mind as something copilot would be good at.
        
             | pphysch wrote:
             | In general, copilots are a massive boon to "boilerplatey",
             | simple syntax languages from XML/HTML to Go.
        
           | speps wrote:
           | And it saves you having to credit anyone, win-win!
        
             | fragmede wrote:
             | You must be new to "professional" software development...
        
             | jbc1 wrote:
             | Awful lot of sites have icons on them. I can't recall ever
             | seeing icon credit. Copilot is like a year old.
        
           | duskwuff wrote:
           | > Saves me time searching for the right SVG and digging
           | through free but really paid image sites.
           | 
           | FWIW, Google's Material Design Icons and The Noun Project are
           | decent sources of high quality, actually-free SVG icons:
           | 
           | * https://fonts.google.com/icons (Apache license)
           | 
           | * https://thenounproject.com/ (CC-BY)
        
       | itsyaboi wrote:
       | https://www.youtube.com/watch?v=a8K6QUPmv8Q
        
       | gigglesupstairs wrote:
       | "Fucking Hell" - first thing I yelled to myself when I saw that
       | headline
       | 
       | Kudos for the project, of course, but it just saddens me a bit
       | more. Nothing is sacred anymore.
        
         | waterheater wrote:
         | Let's assume the technology will eventually work.
         | 
         | What if you had a "personal font"? Sure, you have a user name,
         | but what if you had a custom-generated font which communicates
         | your personality to other people on the Internet? The font
         | could be on a spectrum between static (generated once and
         | reused indefinitely) and dynamic (continuous online learning of
         | personal information causes an adjustment of the font).
         | 
         | I'm just making up an example here, but say you're feeling sad,
         | and your smart technology figures out you're feeling sad. When
         | you send a text message to family, then your personal font
         | takes on "sad" characteristics.
        
         | nvy wrote:
         | >Nothing is sacred anymore.
         | 
         | If it can be specified, it can be automated.
        
           | gigglesupstairs wrote:
           | I know, I know. I am not disputing the technicalities.
        
         | layer8 wrote:
         | I mean, looking at the kerning of the second example in
         | particular, there's still a lot to be done. And something like
         | "extend this latin-1 font to all scripts of the BMP so that is
         | looks stylistically consistent and, within that constraint, the
         | glyphs and their combinations look natural and readable for
         | native readers of each script, assuming Japanese readers for
         | the Han characters" is probably still way off.
        
           | chefandy wrote:
           | Just like all visual generative AI, it gets the first 95% but
           | doesn't get the last 5% that takes 95% of the time. Kerning
           | pairs on typefaces take an incredible amount of human time.
           | Years of full-time work for a large type family. After all
           | these years, even Adobe can't perfectly automate kerning
           | because making letters look right next to each other isn't
           | (obviously) formulaic. Maybe generative AI will nip it in the
           | bud? Certainly hasn't so far, but maybe it will. Obviously in
           | monospaced fonts, like that last one, kerning isn't an issue.
           | 
           | More correctable in these models would be the balance between
           | the letterforms. Surely if there was some kind of prompt you
           | could tell it to not make those Ms in that bold serif font to
           | be obnoxiously wide?
           | 
           | Either way, as of now, what this gets us is exactly 90% less
           | useful and probably of lower quality than the stuff you can
           | get for free on dafont.com. I know it will progress, but I
           | imagine the best use case for generative AI and font creation
            | for commercially viable fonts would be to give rough glyphs
           | to fill out a large character set as an aid for a
           | professional type designer.
           | 
           | And surely there will be a chorus of people insisting that it
           | doesn't matter. Well, you're wrong. If you blindly showed
           | people a headline, book, poster or whatever with properly
           | kerned type and then one without, they will see how much more
           | polished the properly kerned page is, even if they couldn't
           | tell you specifically why. In a lot of situations, that
           | really, really matters, even to people who haven't developed
           | the ability to point out the differences.
        
         | colesantiago wrote:
         | > Kudos for the project, of course, but it just saddens me a
         | bit more. Nothing is sacred anymore.
         | 
         | Why does this sadden you?
         | 
         | I'm quite happy everything is being done by AI, time will be
         | freed for other things that are more important.
         | 
         | Manual font making will not go away though and now anyone can
         | make their own fonts for free.
        
           | ori_b wrote:
           | Genuine question: What do you think is more important that
           | won't eventually be done with AI?
        
           | rogerclark wrote:
           | You don't know that the time will be freed for other things
           | that are more important. We don't know for sure how this is
           | all going to work out at all.
           | 
           | And people who make fonts, create art, and write prose
           | generally do these things because they like doing them, not
           | because they're forced to. These technologies aren't
           | automating drudgery, they're automating things that give
           | people's lives meaning. What's the endgame here exactly?
        
             | colesantiago wrote:
              | Time will be freed. ChatGPT, DALL-E, Midjourney, and
              | Stable Diffusion have collectively saved countless people
              | billions of hours of time, and this will do the same.
             | 
             | The big font makers no longer have a hold on extremely
             | pricey fonts that are inaccessible, the general endgame is
             | most software is going free and open source thanks to AI,
             | and that is a good thing.
        
               | morph12 wrote:
               | Time for what?
        
               | rogerclark wrote:
               | Creatives will have their hopes and dreams stripped away
               | so that artless and tasteless software engineers can type
               | words into a box and instantly get exactly what they
               | want, with no surprises, no feelings, and no economic
               | upsides for anyone else. A beautiful future indeed.
        
               | axus wrote:
               | Won't the creatives be able to type the software
               | specification into a box, and add functionality to their
               | endeavor without needing programmers? I'm not sure that
               | the process and the paycheck are more important than the
               | final artifact.
        
               | ori_b wrote:
               | Why would we need people to have endeavors? That sounds
               | like automatable drudgery.
               | 
               | Won't the AIs be able to infer what would best maximize
               | engagement for their owners and type the specifications
               | necessary to create whatever the entity running them
               | would want their users to consume?
        
               | morph12 wrote:
               | Why would we need people? Seems like a pointless
               | bottleneck in the pursuit of efficiency.
        
               | BeFlatXIII wrote:
                | Designing cereal boxes is not authentic human expression.
        
               | rogerclark wrote:
               | You must not know any designers. Pretty much everyone I
               | know would consider that to be pretty fun - this is
               | exactly the kind of thing artistic kids say they want to
               | do when they grow up. And most of us would rather get
               | paid to design cereal boxes than to do many other things,
               | and almost everyone would rather do it than to not get
               | paid at all.
        
             | nvy wrote:
             | >What's the endgame here exactly?
             | 
             | All of us paying a set of subscriptions to the FAANGs, for
             | literally every aspect of our lives.
        
               | aatd86 wrote:
               | With what money once everyone is out of a job?
        
         | rogerclark wrote:
         | It's so depressing to think that this is what people want.
        
           | tpmoney wrote:
           | What is that? The ability to quickly and easily generate
           | creative or expressive pieces of computer wizardry without
           | first having to delve into the depths of esoteric knowledge?
           | Of course people want that. It turns out you can't specialize
           | in everything, but sometimes you just want to be able to make
           | something good enough without having to engage the services
           | of an expert in the field.
           | 
           | No this might not be the most beautiful font with the most
           | perfect kerning or optimized code. But if it's functional
           | enough for the person who requests it, that should be good
           | enough shouldn't it? Most things people are printing on their
           | 3d printers aren't high quality designed parts either. Plenty
           | of scientists and accountants have scripts and code that
           | would make most developers cringe, but if it's good enough
           | then why be bothered?
           | 
           | The ability of people to make things with tools that they
           | otherwise never would have been able to make before without
           | dedicating months or years of time they may not have is
           | awesome and we should be excited for it. I've watched 70 year
           | old grandmothers learn to make little home movies of their
           | grandkids in iMovie. No they weren't doing "real" film
           | editing and certainly weren't learning any skills that would
            | transfer to Avid or Final Cut. And so what? That home movie
           | cut together with a minimum of skill and a whole lot of
           | technology hiding the esoterica was probably more meaningful
           | and joy inducing for that woman than most blockbuster
           | cinematics produced by the best minds.
        
         | mock-possum wrote:
         | 'Anymore' ha
         | 
          | You don't actually believe anything was ever sacred to begin
          | with, do you?
        
       | martincmartin wrote:
        | Douglas Hofstadter, the author of Gödel, Escher, Bach, thought the
       | task of creating fonts could only be solved with general AI.
       | 
       | https://www.m-u-l-t-i-p-l-i-c-i-t-y.org/media/pdf/Metafont-M...
       | 
       | The Letter Spirit project aims to model artistic creativity by
       | designing stylistically uniform "gridfonts" (typefaces limited to
       | a grid).
        
         | adastra22 wrote:
         | Well, GPT is a general AI.
        
         | gwern wrote:
         | I read that a while ago and thought that it was interesting:
         | Hofstadter was right that it would require much more general
         | approaches than Knuth's approach of 'think very hard and tweak
         | a hand-engineered knob', because that's how all the past
         | VAE/GAN/RNN work on typography-related stuff has worked.
         | 
         | As for the broader question of whether such approaches are
         | general AI, well, that's a bullet Hofstadter is increasingly
         | willing to bite, as upset as it makes him:
         | https://www.lesswrong.com/posts/kAmgdEjq2eYQkB5PP/douglas-ho...
        
           | svat wrote:
           | Hofstadter's article is very interesting and delightful (as
           | is typical of him). But as a response to Knuth's article it's
           | basically reacting to a straw-man or misunderstanding: by "a
           | metafont" in "The Concept of a Meta-Font"[1] Knuth simply
           | meant a common description of many related fonts in a family
           | (like the Computer Modern family where different font sizes,
           | bold, italics, sans-serif, typewriter style etc are all
           | generated from common code and tweakable knobs) -- this is a
           | consciously chosen and designed family. But when he joked
           | about
           | 
           | > _The idea of a meta-font should now be clear. But what good
           | is it? The ability to manipulate lots of parameters may be
           | interesting and fun, but does anybody really need a
           | 6[?]-point font that is one fourth of the way between
           | Baskerville and Helvetica?_
           | 
           | Hofstadter ran with it, imagining Knuth to mean a single
           | universal "metafont" from which every single font can be
           | achieved by suitable tweaking of knobs. This is of course
           | nonsense.
           | 
            | Knuth wrote a (little-known or -referenced) short response
            | in the same journal's Vol. 17 No. 4 (1983), p. 412 (page
            | 89 of 96 in the PDF at
            | https://journals.uc.edu/index.php/vl/issue/view/364/183)
           | [from the tone I imagine him very annoyed :-)]:
           | 
           | > _I never meant to imply that all typefaces could usefully
           | be combined into one single meta-font, not even if
           | consideration is restricted to book faces. For example, [...]
            | Meanwhile, I'm pleased to see that my article has stimulated
           | people to have other ideas, even if those ideas have little
           | or no connection with the main point I was trying to make.
           | Misunderstandings of meta-fonts may well prove to be more
           | important than my own simple observations in the long run._
           | 
           | Returning to the thread a bit, all these "write code to draw
           | an image" systems--like Metafont/MetaPost, Asymptote, TikZ
           | (and also I guess DOT/Graphviz, Mermaid, nomnoml, ...)--are
           | IMO interesting as a way for those who think in language /
           | symbols / concepts to do visual stuff (and vice-versa to some
           | extent), and also (along Knuth's lines) "truly understand"
           | shapes by translating them into precise descriptions.
           | Metafont was never going to become popular expecting font
           | designers to write code (and the fact that hand-writing SVG
           | is a negligible fraction of usage makes sense), but now that
           | LLMs can help translate back-and-forth, it's going to be
           | interesting to see if we ever get to "understanding" shapes.
           | 
           | [1]:
           | https://web.archive.org/web/20220629082019/https://s3-us-
           | wes... /
           | https://journals.uc.edu/index.php/vl/article/view/5329/4193
        
       | dleeftink wrote:
       | Although I would be sad to see the handcrafting that goes into
       | designing custom fonts go, some iterations down the line a model
       | like this would greatly aid tedious glyph alignment and
       | consistency tasks when designing CJK, hiragana, katakana and
       | kanji fonts. Inspiring stuff.
        
         | matsemann wrote:
          | Writing on computers is already quite US-centric (or at
          | least English-centric). While this might help with some of
          | the shortcomings, I'm also a bit afraid it will concentrate
          | even more focus on the US side while the rest of the world
          | gets a "good enough" AI-made implementation that kinda
          | erases some heritage.
        
           | dleeftink wrote:
           | Maybe, but the Latin font market is quite saturated, whereas
           | the CJK space has ample opportunity for innovating and is
           | likely even in need of it, cf. [0][1][2]
           | 
           | [0]: https://qz.com/522079/the-long-incredibly-tortuous-and-
           | fasci...
           | 
           | [1]: https://fonts.google.com/knowledge/type_in_china_japan_a
           | nd_k...
           | 
           | [2]: https://stackoverflow.com/a/14573813
        
         | california-og wrote:
         | I think that would be ideal. The 'killer' feature would be:
         | Handcraft a set of control characters, like the letters in
         | "handglove" and then let AI generate the rest. Designing a
         | typeface is fun, until you need to add support for multiple
         | languages and need to make 800+ characters. Or, maybe there is
         | a nice (open source) font, that is unfortunately missing some
         | characters you really need: let AI generate them.
        
       | paulcnichols wrote:
       | Inevitable in a good way. Keep going! There's gold here.
        
       | Rantenki wrote:
       | OK, that's cool, but those fonts are all terrible. The serifs are
       | all different sizes and shapes, sometimes on the same letter. The
       | kerning looks like a random walk. The stroke widths are all over
       | the place, and/or the hinting is busted.
       | 
       | Now, that said, it's pretty amazing that this works at all, but
       | it'll take some pretty specific training on a model to get
       | something that can compete with a human made font that's curated
       | for good usability _and_ aesthetics.
       | 
       | Sadly, we'll also probably see adoption of these kinds of fonts
       | (along with graphic design, illustration, songwriting,
       | screenwriting, etc)... because "meh, good enough" combined with
       | some Dunning-Kruger.
       | 
       | TL;DR: Thanks, I hate it.
        
         | BoorishBears wrote:
         | > Sadly, we'll also probably see adoption of these kinds of
         | fonts (along with graphic design, illustration, songwriting,
         | screenwriting, etc)... because "meh, good enough" combined with
         | some Dunning-Kruger.
         | 
          | Ironic to bring up Dunning-Kruger while treating generic
          | RLHF as "pretty specific training" and making sweeping
          | declarations about how people will use AI, as if the current
          | SOTA on several of the tasks you just mentioned didn't come
          | from refusing to settle for "meh, good enough" and instead
          | applying the "pretty specific training" you alluded to (see
          | Midjourney).
        
         | jeron wrote:
          | I don't think any self-respecting graphic designer would use
          | these fonts in their current state, but it's a cool proof of
          | concept that could be improved to a more usable state.
        
       | Jack000 wrote:
       | I think this approach isn't ideal because you're representing
       | pixels as 150x150 unique bins. With only 71k fonts it's likely a
       | lot of these bins are never used, especially at the corners.
       | Since you're quantizing anyways, you might as well use a convnet
       | then trace the output, which would better take advantage of the
       | 2d nature of the pixel data.
       | 
       | This kind of reminds me of dalle-1 where the image is represented
       | as 256 image tokens then generated one token at a time. That
       | approach is the most direct way to adapt a causal-LM architecture
       | but it clearly didn't make a lot of sense because images don't
       | have a natural top-down-left-right order.
       | 
       | For vector graphics, the closest analogous concept to pixel-wise
       | convolution would be the Minkowski sum. I wonder if a Minkowski
       | sum-based diffusion model would work for svg images.
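The bin-count point above can be made concrete: quantizing each outline point to a 150x150 grid turns every (x, y) coordinate into one of 22,500 token ids, many of which a 71k-font corpus may never emit. A minimal sketch, where the grid size and helper names are illustrative assumptions rather than fontogen's actual code:

```python
# Illustrative coordinate quantization: each normalized (x, y) point
# on a glyph outline becomes a single token id in a 150 x 150 grid.
GRID = 150  # assumed quantization resolution (22,500 possible tokens)

def point_to_token(x: float, y: float) -> int:
    """Map a point in [0, 1]^2 to a bin index in [0, GRID * GRID)."""
    xi = min(int(x * GRID), GRID - 1)
    yi = min(int(y * GRID), GRID - 1)
    return yi * GRID + xi

def token_to_point(token: int) -> tuple[float, float]:
    """Invert the mapping to the bin center (lossy: quantization error)."""
    yi, xi = divmod(token, GRID)
    return ((xi + 0.5) / GRID, (yi + 0.5) / GRID)
```

Round-tripping through `token_to_point` shows the quantization error, which is bounded by half a bin width in each axis.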
        
         | briandw wrote:
         | How would the Minkowski sum be used in the diffusion model? Is
         | the idea to look at the Minkowski sum of the prediction and
         | label?
        
       | yklcs wrote:
        | I've tried some work on generating vector fonts too,
        | representing them as Bezier curves with a seq2seq model. The
        | problem was that fonts output by ML models were imprecise:
        | lines were not perfectly parallel, corners sat at 89deg
        | instead of 90, and curves were kinked. It's not too difficult
        | to get fonts that look good
       | enough, but the imperfections are glaring as fonts are normally
       | perfectly precise. These imperfections are evident in OP's output
       | too, and in my opinion make these types of models unusable for
       | actual typesetting.
       | 
       | A 1% error in a raster output would be pixel colors being
        | slightly off, but an 89deg corner in a vector image is immediately
       | noticeable, making this a hard problem to solve. I haven't looked
       | into this problem too much since, but I'm interested to hear
       | about possible solutions and reading material.
        
         | waterheater wrote:
         | Without changing the fundamental learning process, one could
         | conceivably introduce a "post-production" step, where you
         | tighten up the output according to a set of pre-defined rules
         | (e.g., if an angle is 89 degrees, adjust the angle to 90).
         | 
         | Of course, changing the learning process would be best. One
         | idea which comes to mind is finding a way to embed
         | relationships into the ML training system itself (e.g., output
         | no angles other than 90 degrees or some predefined set). Such
          | an approach is a type of constraint-based ML, where the ML agent
         | identifies a solution given certain constraints on the output.
         | In my experience, the right approach to accomplish this goal is
         | using factor graphs.
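The rule-based "post-production" step suggested above could be sketched as follows: walk a polygonal glyph contour and snap any edge within a small tolerance of horizontal or vertical exactly onto the axis. Everything here (the tolerance, the function name, the polygonal representation) is an illustrative assumption, not part of any real font pipeline:

```python
import math

# Sketch of a rule-based cleanup pass: edges of a closed polygonal
# contour that are almost horizontal or almost vertical are snapped
# exactly so, squaring off near-90-degree corners between them.
TOL_DEG = 1.5  # assumed tolerance: snap edges within 1.5 deg of an axis

def snap_axis_aligned(contour, tol_deg=TOL_DEG):
    """Return a copy of `contour` (list of (x, y)) with near-axis edges squared."""
    pts = [list(p) for p in contour]
    n = len(pts)
    for i in range(n):
        (x0, y0), (x1, y1) = pts[i], pts[(i + 1) % n]
        # Edge direction folded into [0, 180); 0 is horizontal, 90 vertical.
        angle = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0
        if min(angle, 180.0 - angle) <= tol_deg:
            # Nearly horizontal: level both endpoints at the mean y.
            mid = (y0 + y1) / 2.0
            pts[i][1] = pts[(i + 1) % n][1] = mid
        elif abs(angle - 90.0) <= tol_deg:
            # Nearly vertical: align both endpoints at the mean x.
            mid = (x0 + x1) / 2.0
            pts[i][0] = pts[(i + 1) % n][0] = mid
    return [tuple(p) for p in pts]
```

Running a slightly skewed rectangle such as `[(0, 0), (10, 0.1), (10.1, 5), (0, 5)]` through `snap_axis_aligned` yields a contour whose edges are exactly axis-aligned, turning its 89-ish corners into true right angles. Real outlines use Bezier segments rather than polygons, so a production version would need to handle control points too.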
        
       ___________________________________________________________________
       (page generated 2023-10-03 23:00 UTC)