[HN Gopher] Hey, computer, make me a font
___________________________________________________________________
Hey, computer, make me a font
Author : pavanyara
Score : 239 points
Date : 2023-10-03 12:17 UTC (10 hours ago)
(HTM) web link (serce.me)
(TXT) w3m dump (serce.me)
| scarygliders wrote:
| Okay I can't try it out anyway. "Blocksparse is not available:
| the current GPU does not expose Tensor cores"
|
| My "best" GPU is an RTX 2070 Super, Turing architecture.
|
| I've seen similar messages when using stable-diffusion... either
| with -webui or with automatic, can't exactly remember, but they
| both run fine on that RTX 2070 Super, so I can only guess that
| they revert to some other method than Blocksparse on seeing that
| it doesn't support Turing. Or something. I haven't looked into
| how they deal with it.
|
| I've submitted an Issue [0] for it. I don't have enough knowledge
| to know if there's some way of saying "don't use Blocksparse" for
| fontogen.
|
| [0] https://github.com/SerCeMan/fontogen/issues/2
| philipwhiuk wrote:
| > To train the model, I assembled a dataset of 71k distinct
| fonts.
|
| I give it a week before Monotype sues your face off.
| yellow_postit wrote:
| Font law is almost as complex and fascinating as Tree law.
| Given how complex font licensing can be, a generative use case
| that produces usable fonts would be a huge threat to the
| foundries, and I expect they will be very litigious, just as
| Getty and others are in the image space.
| dwaltrip wrote:
| Tree law? Please say more, sounds interesting
| 123pie123 wrote:
| possibly this? https://www.atlasobscura.com/articles/tree-
| law-is-a-gnarly-t...
|
| ...."It's never about the trees," Bonapart says. "The trees
| often serve as lightning rods for other issues that are the
| psychological underpinning of a dispute that people might
| have with each other."
| mock-possum wrote:
| Not this again /eyeroll
|
| It's not illegal for a human to look through 71,000 fonts and
| then create their own. It can't be illegal for a human to use a
| robot to look through the fonts for them.
| ChristianGeek wrote:
| It depends on exactly what is learned from looking through
| them. If you end up copying shapes and segments then there
| are possible grounds for a lawsuit. If you're able to
| determine the rules to make a good font from your analysis,
| however, then nothing is stopping you from applying them.
| ballenf wrote:
| Copyright around fonts may not support such a suit in the same
| way as works of art.
|
| Wikipedia says: "In the United States, the shapes of typefaces
| are not eligible for copyright but may be protected by design
| patent (although it is rarely applied for, the first US design
| patent that was ever awarded was for a typeface).[1]"
|
| So just scanning the rendered font (as opposed to the code that
| generates it) may be harder to stop than scanning of artwork.
|
| https://en.wikipedia.org/wiki/Intellectual_property_protecti...
| jddj wrote:
| > may be harder to stop than scanning of artwork.
|
| Which has not been particularly easy to stop either
| virtue3 wrote:
| The interesting thing about law is that even if the law
| doesn't absolutely protect you, the person you believe is
| infringing on your work for free had better be prepared to
| pony up lawyer fees to defend their work.
|
| This would be one of the few times where I think
| that's useful. Typefaces take a lot of time and consideration
| and work to create so just blanket ripping off that work
| because we all take them for granted is kind of bullshit.
|
| I have conflicting thoughts about this.
|
| But at the end of the day, if you only trained on open fonts,
| they just generally aren't as good, so the output won't
| generally be as good as if you'd trained on nicer fonts
| that you technically don't have the rights to (but no one
| thought of this being an issue at the time of design patents
| / etc.).
|
| But we're now in the world where we will pay money to compute
| an AI model to design fonts instead of just paying designers
| to design fonts. The race to the bottom is accelerating at an
| alarming rate.
| BeFlatXIII wrote:
| That's why it's so important for the weights to be released to
| the public ASAP. Even when the original is sued, they can still
| be passed around in torrents for hobbyists and third-world
| businessmen to enjoy.
| andybak wrote:
| Everyone knows that AIs can't draw sans...
| TheRealPomax wrote:
| Neat! Does it have prompt capabilities for things like FVAR,
| GSUB, and GPOS? E.g. "okay now include a many-to-one ligature
| that turns the word 'chicken' into an emoji of a chicken in the
| same style" or "now make a second, sans-serif, robotic style and
| add an axis called interpol that varies the font from the style
| we just made to this new style"?
| simbolit wrote:
| Not OP, but the answer is "no".
|
| What exactly made you suspect such abilities?
| TheRealPomax wrote:
| Odd phrasing, but: the part where I've worked on OpenType
| parsing for decades and love seeing people with a passion for
| digital typefaces make new and creative tools in that space.
| Typically folks don't stop working on a cool tool after they
| write a blog post, they're still refining and extending, so
| you never know how far someone is trying to take a tool
| without asking them.
| simbolit wrote:
| So, if I understand you correctly, it was less a question
| "does it do x?" and more an indirect form of "hey OP, would
| be cool if it did X" ?
| TheRealPomax wrote:
| This is not the place for starting a discussion about
| whether context and subtext should be implicit or
| explicit in written English. That's what
| https://philosophy.stackexchange.com is for.
| simbolit wrote:
| I don't want to start a discussion, I wanted to know if I
| misread your original comment, and whether you meant
| something different from what I thought you meant.
|
| From your answer, tho very indirect, I now suspect that I
| did misunderstand your initial comment, and answering
| your question was missing the point.
|
| That is all I wanted (after at first wanting to be
| helpful).
| TheRealPomax wrote:
| Fair enough, I thought you were trolling, but you've made
| it clear you weren't. I wrote my comment as a question
| that would hopefully engage the author on the
| capabilities (both concrete, as well as hypothetical) of
| this approach to font generation.
| PaulHoule wrote:
| Kinda funny how it works well at this, whereas diffusion models
| go to die when it comes to drawing text, but of course it works
| in a completely different manner.
| TheRealPomax wrote:
| There's a huge difference between "pictures of letters" and
| "writing text" though. Ask stable diffusion to write text and
| it'll generate hilarious weird-looking results. But, ask it to
| generate individual letters (e.g. "Show me an ornate uppercase
| letter b") and it'll do that for you with (mostly) no problems.
| ilaksh wrote:
| SDXL can do text kind of. Also isn't DALLE-3 a diffusion model?
|
| But yeah overall diffusion has not generally been able to do it
| at all before.
| gwern wrote:
| > But yeah overall diffusion has not generally been able to
| do it at all before.
|
| Imagen/Parti were doing text just fine long before DALL-E 3
| was announced. GANs were also learning some text in the
| earlier runup (even ProGAN was doing striking 'moon runes' -
| amusingly, they were complete gibberish because it did
| mirroring data augmentation).
| euroderf wrote:
| Has anyone tried using an LLM to make a font based on their
| handwriting?
|
| EDIT: There's a couple (IIRC) of online services that offer this.
| OnlyMortal wrote:
| If it was my handwriting, it wouldn't be popular.
|
| Perhaps a cursive font might be good though I'm pretty sure one
| exists.
|
| An expert system might be able to join up the letters in
| cursive and make intentional mistakes to give it the character
| of natural handwriting?
| logdahl wrote:
| Cool! Now generate 'upper-uppercase' and see what happens :^)
| matsemann wrote:
| I think this is a reference to "Uppestcase and Lowestcase
| Letters", a submission a while back about someone training a ML
| model to generate lowercase/uppercase letters, and used it to
| uppercase letters already in uppercase. Quite fun
| https://news.ycombinator.com/item?id=26667852
| Nevermark wrote:
| In honor of all the times he pressed his hands into his eyes (and
| myself doing the same thing):
|
| I present: "Perplexed" by Nisla. [0]
|
| I have a print in my office, in lieu of a mirror.
|
| [0] https://www.sargentsfineart.com/img/nisla/all/nisla-
| perplexe...
| scarygliders wrote:
| Hmmm. The model is a ckpt instead of a safetensor.
|
| Pondering on whether to keep proceeding trying this out or not...
|
| EDIT: a scan with picklescan[0] found nothing.. exciting.
|
| [0] https://github.com/mmaitre314/picklescan
| eurekin wrote:
| proxmox/virtualbox/qemu + throwaway vm
| scarygliders wrote:
| Quite, I was thinking about doing so.
|
| I just scanned it with picklescan, which found nothing
| malicious. I just updated my original reply.
| 7moritz7 wrote:
| Haven't seen a single malicious ckpt file so far. Sure, there
| is a possibility, but huggingface scans pickled weights
| automatically, so the likelihood of someone using that site to
| spread malware in this form is super low.
| artursapek wrote:
| "pickled weights"?
|
| serious question, how on Earth should someone like me, who
| has completely missed the last 12 months of AI development,
| catch up with the state of the art?
| omneity wrote:
| Two separate terms here, pickling is a serialization method
| for Python objects (unrelated to AI per se).
|
| Read more here:
| https://docs.python.org/3/library/pickle.html
|
| Then "weights" is just referring to a model's weights, a
| specific instance of a python object that can be pickled.
| scarygliders wrote:
| Just know that the .ckpt format has more or less been
| replaced by .safetensors these days.
|
| tl;dr .ckpt files can contain Python pickles containing
| runnable Python code, which means a Bad Guy could create a
| .ckpt model containing malicious python code. Basically.
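The risk described above comes straight from how pickle works: deserialising can call arbitrary callables. A minimal standard-library sketch (the class name and string are invented for illustration; a real attack would return something like `os.system`):

```python
import pickle

# An object can tell pickle to call any function on load via __reduce__.
class Payload:
    def __reduce__(self):
        # Harmless stand-in: a real attack would return (os.system, ("...",)).
        return (str.upper, ("code ran during pickle.load",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # loading *executes* str.upper and returns its result
print(result)  # CODE RAN DURING PICKLE.LOAD
```

This is why scanners like picklescan inspect the pickle opcode stream for suspicious imports instead of loading the file, and why .safetensors (plain tensor data, no code) has largely replaced .ckpt.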
| simbolit wrote:
| I suppose you being here means that you are already fluent
| in some programming languages. If so, I would start here:
|
| Conway & White - Machine Learning for Hackers: Case Studies
| and Algorithms to Get You Started
|
| Once you've read and understood this, I'd do an online
| course...
| artursapek wrote:
| thank you
| scarygliders wrote:
| I've never spotted one in the wild either, but, y'know, I
| like to not be the one who first finds one out... the bad
| way. ;)
| mastersummoner wrote:
| Poof! You're a font.
| kleiba wrote:
| Obligatory xkcd reference: https://xkcd.com/1015/
| dexsst wrote:
| I used to make some fonts for rare, non-Latin alphabets like the
| Orkhon script by hand using Paint-like freeware; it was fun.
| RugnirViking wrote:
| Ooh I have to try this out when I get home, looks like the
| weights are under 1GB too
| boffinAudio wrote:
| I've long had a project in mind involving the various typefaces
| of the signage around the city of Vienna, which I find very
| inspiring in many cases.
|
| The idea is to just take a picture of every different typeface I
| can find, attached to the local buildings at street level.
|
| There are some truly wonderful typefaces out there, on signage
| dating back to last century, and I find the aesthetics often
| quite appealing.
|
| With this tool, could I take a collection of the various
| typefaces I've captured, and get it to complete the font, such
| that a sign that only has a few of the required characters could
| be 'completed' in the same style?
|
| Because if so, I'm going to start taking more pictures of
| Vienna's wonderful types ..
| chris_st wrote:
| Even if you never get around to using the photos, I think it
| would be a wonderful service to take the photos and put them up
| somewhere for non-Vienna residents to enjoy.
| boffinAudio wrote:
| Oh, definitely .. but first I must amass an archive worthy of
| it ..
| simbolit wrote:
| With this tool: no.
|
| With a next-gen tool: if you do some pre-processing on the
| images, quite possibly.
| rogual wrote:
| > THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG
|
| It's "a" quick brown fox, otherwise the sentence has no "a".
| dave78 wrote:
| Huh, I never gave that sentence much thought, and I guess I
| never realized it conveniently covered the whole alphabet. It
| makes so much more sense now!
| lordfrito wrote:
| Lazy?
| jnosCo wrote:
| There is an "a" in "lazy"?
| rogual wrote:
| Well, I'm stupid.
| jameshart wrote:
| The usual mistake people make in reciting this is to say
| the fox _jumped_ over the lazy dog, causing them to omit an
| 's' from the sentence.
|
| Making sure it's 'the' lazy dog rather than 'a' lazy dog is
| actually important if you care about completing the
| lowercase alphabet, as without it there's only an uppercase
| 'T'.
| SamBam wrote:
| Indeed.
|
| But it's worth having "a lazy" anyway to avoid the repetitive
| "the."
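The coverage claims in this subthread are easy to check mechanically; a quick standard-library sketch:

```python
import string

def missing_letters(sentence):
    """Return the alphabet letters a candidate pangram fails to cover."""
    return sorted(set(string.ascii_lowercase) - set(sentence.lower()))

# "jumps" keeps the sentence a pangram; "jumped" silently drops the 's':
print(missing_letters("The quick brown fox jumps over the lazy dog"))   # []
print(missing_letters("The quick brown fox jumped over the lazy dog"))  # ['s']
```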
| lawlessone wrote:
| This is interesting, but I think generating the next letter from
| the letters before may not be the best way to do it. As you
| mentioned, they degrade with each letter.
|
| Maybe creating one long image of a whole font would work better.
|
| edit: in the above I'm misunderstanding what is happening here.
|
| But I still think there must be another way to structure this so
| the attention mechanism doesn't have to work so hard.
| jrmg wrote:
| Since the first three letters are good, and generated only with
| the context of the preceding letters, shouldn't just using the
| first three (instead of the preceding three) as context for
| every other one be good enough?
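Not the post's actual code, but the two conditioning strategies being debated here can be sketched abstractly; `sample_glyph` is a hypothetical stand-in for one decoding step of the model:

```python
def generate_glyphs(sample_glyph, prompt, n, strategy="sliding", k=3):
    """Generate n glyphs, conditioning each on a small context window.

    'sliding' uses the k most recent glyphs (errors can compound);
    'anchored' always reuses the first k glyphs, as suggested above.
    """
    glyphs = []
    for _ in range(n):
        context = glyphs[-k:] if strategy == "sliding" else glyphs[:k]
        glyphs.append(sample_glyph(prompt, context))
    return glyphs

# Toy stand-in model: each "glyph" is just a number derived from its context,
# so the two strategies visibly diverge once the window starts moving.
toy = lambda prompt, ctx: sum(ctx) + 1
print(generate_glyphs(toy, "serif", 5, "sliding"))   # [1, 2, 4, 8, 15]
print(generate_glyphs(toy, "serif", 5, "anchored"))  # [1, 2, 4, 8, 8]
```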
| nitrofurano wrote:
| lots of kernings misfits ftw
| tabtab wrote:
| Stop your bigotry of Kernians.
| lachlan_gray wrote:
| I found a few months ago that the gpt-4 code interpreter is
| capable of converting a black and white png of a glyph to an svg
|
| https://twitter.com/lfegray/status/1678787763905126400
|
| It would be cool to combine a script like the one gpt-4 gave me
| with an image generation model to generate fonts. The approach
| from this blog post is way more interesting though.
|
| On a separate note it reminds me of this suckerpinch video :)
| maybe we can finally get uppestcase and lowestcase fonts
|
| https://www.youtube.com/watch?v=HLRdruqQfRk
| logicallee wrote:
| >I found a few months ago that the gpt-4 code interpreter is
| capable of converting a black and white png of a glyph to an
| svg
|
| :) Easy there, let's not make all the naysayers who say it only
| just predicts plausible words sweat.
|
| Your phrasing almost makes it sound like you're sharing a clear
| example of it analyzing and completing a complex task
| correctly, while perfectly understanding what it's doing.
|
| Perhaps we should say it only just predicted words that are
| plausible responses to someone asking to do that, while also
| predicting plausible words someone might say in response to an
| error message along the way. It might not actually be doing any
| converting, just predicting words and tokens without really
| doing anything.
|
| My favorite part of its predictive capabilities is how it is
| able to predict the other half of a conversation that literally
| goes "didn't work, try again", "didn't work, try again", "still
| didn't work, try again", "all right you finally fixed it good
| job" - without even telling it why it didn't work or quoting
| the error message. Somehow it is still able to predict the
| other half of the conversation so that it ends up with
| "finally, good job!"
|
| Who knew that to get results that look like it knows what it's
| doing, it's enough to predict what could make someone say that!
|
| We are truly living in the golden age of statistical prediction
| that does not involve any degree of thinking, analysis, or
| understanding.
|
| Truly our age of applied statistics is going better than anyone
| could have, er, "predicted". :)
| wizzwizz4 wrote:
| > _Your phrasing almost makes it sound like you're sharing a
| clear example of it analyzing and completing a complex task
| correctly, while perfectly understanding what it's doing._
|
| OpenAI has hardcoded (or heavily overfit) several special-
| purpose functions into their ChatGPT systems. In the past few
| months, they've integrated other special-purpose models, so
| their tools can do more than just predictive text (e.g. image
| recognition).
|
| GPT can do limited verbal reasoning, whatever else can do
| image recognition, but that does not mean the combined system
| can do visual reasoning. There's no mechanism by which it
| would (unless you specifically create one, but that's not
| trivial and doesn't generalise).
|
| > _Who knew that to get results that look like it knows what
| it's doing, it's enough to predict what could make someone
| say that!_
|
| Everyone. Some call it "specification gaming" or "reward
| hacking", and we've known about it for a _long_ time. It's a
| really obvious concept if you have a good mental model of
| reinforcement learning.
| https://doi.org/10.1162%2Fartl_a_00319 is a fun example.
|
| > _We are truly living in the golden age of statistical
| prediction that does not involve any degree of thinking,
| analysis, or understanding._
|
| This is a straw argument. I can't speak for anyone else, but
| my criticisms are mainly of people seeing some thinking-like,
| analysis-like or understanding-like behaviour, and assuming
| that it _is_ human-like thinking, analysis or understanding,
| while ignoring other hypotheses (some of which make
| successful advance predictions in a way the "it's doing what
| humans do!" models don't).
|
| I will note: the people being the most loudly exuberant about
| ChatGPT's vast intelligence seem to view it as a _tool_. If I
| were faced with an opaque box, inside which was a being
| capable of general-purpose problem solving, conversation, and
| original thought, my first reaction would _not_ be "I can use
| this for my own ends". I am glad that I have seen nothing to
| convince me that ChatGPT is such a being, and I have
| theoretical arguments that ChatGPT probably _won't ever_ be
| such a being, but if you genuinely think this technology has
| the potential to produce such a being, you have an ethical
| responsibility.
| specproc wrote:
| Thank you for sharing that suckerpinch, enjoyed watching that
| immensely
| bambax wrote:
| The author says he achieved text-to-SVG generation but doesn't
| point to a code repository for it... It would be super
| interesting (or does gpt-4 do it natively?)
|
| That said, I'm not sure that you need GPT-4 for outlining a BW
| image and making a path out of it; Corel Draw did that well,
| over 25 years ago?
|
| So yes, another approach to what the author is doing, would be
| to generate font bitmaps using any of the leading image
| generators, and then vectorize the bitmaps. Less
| straightforward and precise, but probably simpler.
| SerCe wrote:
| Hi, I am the author. For text-to-SVG, check out IconShop [1].
| It was the paper that I tried to reproduce results from
| initially. In the paper, there is a comparison of their
| approach against using GPT-4 [2].
|
| Using vectorisation tools like potrace is indeed a much more
| popular approach, and there are quite a few papers generating
| fonts this way. The most recent I believe is DualVector [3].
| But I tried to approach the problem from another angle.
|
| [1]: https://icon-shop.github.io
|
| [2]: https://arxiv.org/pdf/2304.14400.pdf
|
| [3]: https://openaccess.thecvf.com/content/CVPR2023/html/Liu_
| Dual...
| simonbw wrote:
| ChatGPT/GPT-4 does it natively. You can say "Please generate
| me an SVG image of a unicorn" and it will spit out the SVG
| code.
| cjaybo wrote:
| Here's my stupid question of the day:
|
| Would you mind explaining what you mean by "native" in this
| context?
| tough wrote:
| Not using a -plugin- probably
| alana314 wrote:
| That's amazing. One of my favorite things to do with copilot is
| to comment something like "//white arrow pointing right" and
| then start "<svg" and have it complete it. If it doesn't get it
| right the first time I update my comment. Saves me time
| searching for the right SVG and digging through free but really
| paid image sites.
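As a hedged illustration of the kind of completion described (the coordinates and helper name are invented here, not actual Copilot output), a white right-pointing arrow is just a handful of SVG polygon points:

```python
# Build a minimal SVG of a white arrow pointing right; sizes are arbitrary.
def right_arrow_svg(size=24):
    s = size
    # Rectangular shaft on the left plus a triangular head on the right.
    points = (
        f"2,{s // 3} {s // 2},{s // 3} {s // 2},2 "
        f"{s - 2},{s // 2} {s // 2},{s - 2} {s // 2},{2 * s // 3} 2,{2 * s // 3}"
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{s}" height="{s}">'
        f'<polygon points="{points}" fill="white"/></svg>'
    )

print(right_arrow_svg())
```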
| toddmorey wrote:
| This is such a good idea. Not sure why svg code escaped my
| mind as something copilot would be good at.
| pphysch wrote:
| In general, copilots are a massive boon to "boilerplatey",
| simple syntax languages from XML/HTML to Go.
| speps wrote:
| And it saves you having to credit anyone, win-win!
| fragmede wrote:
| You must be new to "professional" software development...
| jbc1 wrote:
| Awful lot of sites have icons on them. I can't recall ever
| seeing icon credit. Copilot is like a year old.
| duskwuff wrote:
| > Saves me time searching for the right SVG and digging
| through free but really paid image sites.
|
| FWIW, Google's Material Design Icons and The Noun Project are
| decent sources of high quality, actually-free SVG icons:
|
| * https://fonts.google.com/icons (Apache license)
|
| * https://thenounproject.com/ (CC-BY)
| itsyaboi wrote:
| https://www.youtube.com/watch?v=a8K6QUPmv8Q
| gigglesupstairs wrote:
| "Fucking Hell" - first thing I yelled to myself when I saw that
| headline
|
| Kudos for the project, of course, but it just saddens me a bit
| more. Nothing is sacred anymore.
| waterheater wrote:
| Let's assume the technology will eventually work.
|
| What if you had a "personal font"? Sure, you have a user name,
| but what if you had a custom-generated font which communicates
| your personality to other people on the Internet? The font
| could be on a spectrum between static (generated once and
| reused indefinitely) and dynamic (continuous online learning of
| personal information causes an adjustment of the font).
|
| I'm just making up an example here, but say you're feeling sad,
| and your smart technology figures out you're feeling sad. When
| you send a text message to family, then your personal font
| takes on "sad" characteristics.
| nvy wrote:
| >Nothing is sacred anymore.
|
| If it can be specified, it can be automated.
| gigglesupstairs wrote:
| I know, I know. I am not disputing the technicalities.
| layer8 wrote:
| I mean, looking at the kerning of the second example in
| particular, there's still a lot to be done. And something like
| "extend this latin-1 font to all scripts of the BMP so that it
| looks stylistically consistent and, within that constraint, the
| glyphs and their combinations look natural and readable for
| native readers of each script, assuming Japanese readers for
| the Han characters" is probably still way off.
| chefandy wrote:
| Just like all visual generative AI, it gets the first 95% but
| doesn't get the last 5% that takes 95% of the time. Kerning
| pairs on typefaces take an incredible amount of human time.
| Years of full-time work for a large type family. After all
| these years, even Adobe can't perfectly automate kerning
| because making letters look right next to each other isn't
| (obviously) formulaic. Maybe generative AI will nip it in the
| bud? Certainly hasn't so far, but maybe it will. Obviously in
| monospaced fonts, like that last one, kerning isn't an issue.
|
| More correctable in these models would be the balance between
| the letterforms. Surely with some kind of prompt you could
| tell it not to make those Ms in that bold serif font so
| obnoxiously wide?
|
| Either way, as of now, what this gets us is something 90% less
| useful and probably of lower quality than the stuff you can
| get for free on dafont.com. I know it will progress, but I
| imagine the best use case for generative AI in creating
| commercially viable fonts would be to generate rough glyphs
| to fill out a large character set as an aid for a
| professional type designer.
|
| And surely there will be a chorus of people insisting that it
| doesn't matter. Well, you're wrong. If you blindly showed
| people a headline, book, poster or whatever with properly
| kerned type and then one without, they will see how much more
| polished the properly kerned page is, even if they couldn't
| tell you specifically why. In a lot of situations, that
| really, really matters, even to people who haven't developed
| the ability to point out the differences.
| colesantiago wrote:
| > Kudos for the project, of course, but it just saddens me a
| bit more. Nothing is sacred anymore.
|
| Why does this sadden you?
|
| I'm quite happy everything is being done by AI, time will be
| freed for other things that are more important.
|
| Manual font making will not go away, though, and now anyone can
| make their own fonts for free.
| ori_b wrote:
| Genuine question: What do you think is more important that
| won't eventually be done with AI?
| rogerclark wrote:
| You don't know that the time will be freed for other things
| that are more important. We don't know for sure how this is
| all going to work out at all.
|
| And people who make fonts, create art, and write prose
| generally do these things because they like doing them, not
| because they're forced to. These technologies aren't
| automating drudgery, they're automating things that give
| people's lives meaning. What's the endgame here exactly?
| colesantiago wrote:
| Time will be freed. ChatGPT, DALL-E, Midjourney and Stable
| Diffusion have collectively saved countless people billions
| of hours of time, and this will do the same.
|
| The big font makers no longer have a hold on extremely
| pricey fonts that are inaccessible, the general endgame is
| most software is going free and open source thanks to AI,
| and that is a good thing.
| morph12 wrote:
| Time for what?
| rogerclark wrote:
| Creatives will have their hopes and dreams stripped away
| so that artless and tasteless software engineers can type
| words into a box and instantly get exactly what they
| want, with no surprises, no feelings, and no economic
| upsides for anyone else. A beautiful future indeed.
| axus wrote:
| Won't the creatives be able to type the software
| specification into a box, and add functionality to their
| endeavor without needing programmers? I'm not sure that
| the process and the paycheck are more important than the
| final artifact.
| ori_b wrote:
| Why would we need people to have endeavors? That sounds
| like automatable drudgery.
|
| Won't the AIs be able to infer what would best maximize
| engagement for their owners and type the specifications
| necessary to create whatever the entity running them
| would want their users to consume?
| morph12 wrote:
| Why would we need people? Seems like a pointless
| bottleneck in the pursuit of efficiency.
| BeFlatXIII wrote:
| Designing cereal boxes is not authentic human expression.
| rogerclark wrote:
| You must not know any designers. Pretty much everyone I
| know would consider that to be pretty fun - this is
| exactly the kind of thing artistic kids say they want to
| do when they grow up. And most of us would rather get
| paid to design cereal boxes than to do many other things,
| and almost everyone would rather do it than to not get
| paid at all.
| nvy wrote:
| >What's the endgame here exactly?
|
| All of us paying a set of subscriptions to the FAANGs, for
| literally every aspect of our lives.
| aatd86 wrote:
| With what money once everyone is out of a job?
| rogerclark wrote:
| It's so depressing to think that this is what people want.
| tpmoney wrote:
| What is that? The ability to quickly and easily generate
| creative or expressive pieces of computer wizardry without
| first having to delve into the depths of esoteric knowledge?
| Of course people want that. It turns out you can't specialize
| in everything, but sometimes you just want to be able to make
| something good enough without having to engage the services
| of an expert in the field.
|
| No this might not be the most beautiful font with the most
| perfect kerning or optimized code. But if it's functional
| enough for the person who requests it, that should be good
| enough shouldn't it? Most things people are printing on their
| 3d printers aren't high quality designed parts either. Plenty
| of scientists and accountants have scripts and code that
| would make most developers cringe, but if it's good enough
| then why be bothered?
|
| The ability of people to make things with tools that they
| otherwise never would have been able to make before without
| dedicating months or years of time they may not have is
| awesome and we should be excited for it. I've watched 70 year
| old grandmothers learn to make little home movies of their
| grandkids in iMovie. No they weren't doing "real" film
| editing and certainly weren't learning any skills that would
| transfer to avid or Final Cut. And so what? That home movie
| cut together with a minimum of skill and a whole lot of
| technology hiding the esoterica was probably more meaningful
| and joy inducing for that woman than most blockbuster
| cinematics produced by the best minds.
| mock-possum wrote:
| 'Anymore' ha
|
| You don't actually believe anything was ever sacred to begin
| with, do you?
| martincmartin wrote:
| Douglas Hofstadter, the author of Gödel, Escher, Bach, thought
| the task of creating fonts could only be solved with general AI.
|
| https://www.m-u-l-t-i-p-l-i-c-i-t-y.org/media/pdf/Metafont-M...
|
| The Letter Spirit project aims to model artistic creativity by
| designing stylistically uniform "gridfonts" (typefaces limited to
| a grid).
| adastra22 wrote:
| Well, GPT is a general AI.
| gwern wrote:
| I read that a while ago and thought that it was interesting:
| Hofstadter was right that it would require much more general
| approaches than Knuth's approach of 'think very hard and tweak
| a hand-engineered knob', because that's how all the past
| VAE/GAN/RNN work on typography-related stuff has worked.
|
| As for the broader question of whether such approaches are
| general AI, well, that's a bullet Hofstadter is increasingly
| willing to bite, as upset as it makes him:
| https://www.lesswrong.com/posts/kAmgdEjq2eYQkB5PP/douglas-ho...
| svat wrote:
| Hofstadter's article is very interesting and delightful (as
| is typical of him). But as a response to Knuth's article it's
| basically reacting to a straw-man or misunderstanding: by "a
| metafont" in "The Concept of a Meta-Font"[1] Knuth simply
| meant a common description of many related fonts in a family
| (like the Computer Modern family where different font sizes,
| bold, italics, sans-serif, typewriter style etc are all
| generated from common code and tweakable knobs) -- this is a
| consciously chosen and designed family. But when he joked
| about
|
| > _The idea of a meta-font should now be clear. But what good
| is it? The ability to manipulate lots of parameters may be
| interesting and fun, but does anybody really need a
| 6[?]-point font that is one fourth of the way between
| Baskerville and Helvetica?_
|
| Hofstadter ran with it, imagining Knuth to mean a single
| universal "metafont" from which every single font can be
| achieved by suitable tweaking of knobs. This is of course
| nonsense.
|
| Knuth wrote a (little-known or referenced) short response in
| the same journal's Vol. 17 No. 4 (1983): Volume 17.4 (p 412,
| or in the PDF page 89 of 96 at
| https://journals.uc.edu/index.php/vl/issue/view/364/183)
| [from the tone I imagine him very annoyed :-)]:
|
| > _I never meant to imply that all typefaces could usefully
| be combined into one single meta-font, not even if
| consideration is restricted to book faces. For example, [...]
| Meanwhile, I'm pleased to see that my article has stimulated
| people to have other ideas, even if those ideas have little
| or no connection with the main point I was trying to make.
| Misunderstandings of meta-fonts may well prove to be more
| important than my own simple observations in the long run._
|
| Returning to the thread a bit, all these "write code to draw
| an image" systems--like Metafont/MetaPost, Asymptote, TikZ
| (and also I guess DOT/Graphviz, Mermaid, nomnoml, ...)--are
| IMO interesting as a way for those who think in language /
| symbols / concepts to do visual stuff (and vice-versa to some
| extent), and also (along Knuth's lines) "truly understand"
| shapes by translating them into precise descriptions.
| Metafont was never going to become popular expecting font
| designers to write code (and the fact that hand-writing SVG
| is a negligible fraction of usage makes sense), but now that
| LLMs can help translate back-and-forth, it's going to be
| interesting to see if we ever get to "understanding" shapes.
|
| [1]:
| https://web.archive.org/web/20220629082019/https://s3-us-wes... /
| https://journals.uc.edu/index.php/vl/article/view/5329/4193
| dleeftink wrote:
| Although I would be sad to see the handcrafting that goes into
| designing custom fonts disappear, a few iterations down the
| line a model like this would greatly aid the tedious
| glyph-alignment and consistency tasks of designing CJK fonts
| (hiragana, katakana, and especially the thousands of kanji).
| Inspiring stuff.
| matsemann wrote:
| Writing on computers is already quite US-centric (or at least
| English-centric). While this might help with some of the
| shortcomings, I'm also a bit afraid that it will concentrate
| even more focus on the US part, while the rest of the world
| gets a "good enough" implementation made by AI that kinda
| erases some heritage.
| dleeftink wrote:
| Maybe, but the Latin font market is quite saturated, whereas
| the CJK space has ample opportunity for innovation and is
| likely even in need of it, cf. [0][1][2]
|
| [0]: https://qz.com/522079/the-long-incredibly-tortuous-and-fasci...
|
| [1]: https://fonts.google.com/knowledge/type_in_china_japan_and_k...
|
| [2]: https://stackoverflow.com/a/14573813
| california-og wrote:
| I think that would be ideal. The 'killer' feature would be:
| handcraft a set of control characters, like the letters in
| "handglove", and then let AI generate the rest. Designing a
| typeface is fun until you need to add support for multiple
| languages and make 800+ characters. Or maybe there is a nice
| (open-source) font that is unfortunately missing some
| characters you really need: let AI generate them.
| paulcnichols wrote:
| Inevitable in a good way. Keep going! There's gold here.
| Rantenki wrote:
| OK, that's cool, but those fonts are all terrible. The serifs are
| all different sizes and shapes, sometimes on the same letter. The
| kerning looks like a random walk. The stroke widths are all over
| the place, and/or the hinting is busted.
|
| Now, that said, it's pretty amazing that this works at all, but
| it'll take some pretty specific training on a model to get
| something that can compete with a human-made font that's curated
| for good usability _and_ aesthetics.
|
| Sadly, we'll also probably see adoption of these kinds of fonts
| (along with graphic design, illustration, songwriting,
| screenwriting, etc)... because "meh, good enough" combined with
| some Dunning-Kruger.
|
| TL;DR: Thanks, I hate it.
| BoorishBears wrote:
| > Sadly, we'll also probably see adoption of these kinds of
| fonts (along with graphic design, illustration, songwriting,
| screenwriting, etc)... because "meh, good enough" combined with
| some Dunning-Kruger.
|
| Ironic to bring up Dunning-Kruger while treating generic RLHF
| as "pretty specific training" and making sweeping declarations
| about how people will use AI. The current SOTA on several of
| the tasks you just mentioned came precisely from not settling
| for "meh, good enough" and instead applying the kind of
| "pretty specific training" you alluded to (see Midjourney).
| jeron wrote:
| I don't think any self-respecting graphic designer would use
| these fonts in their current state, but it's a cool proof of
| concept that could be improved to a more usable state.
| Jack000 wrote:
| I think this approach isn't ideal because you're representing
| pixels as 150x150 unique bins. With only 71k fonts it's likely a
| lot of these bins are never used, especially at the corners.
| Since you're quantizing anyway, you might as well use a convnet
| then trace the output, which would better take advantage of the
| 2d nature of the pixel data.
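| For concreteness, here's a sketch of the joint (x, y) binning
| described above (the grid resolution and em size are my
| guesses, not fontogen's actual setup): each coordinate pair
| collapses to one of 150 x 150 = 22,500 token ids, so many
| corner bins will indeed go unseen in 71k fonts.

```python
GRID = 150  # one token id per (x, y) cell: 150 * 150 = 22,500 ids

def point_to_token(x, y, em_size=1000):
    """Quantize an (x, y) glyph coordinate to a single grid token."""
    col = min(GRID - 1, max(0, int(x / em_size * GRID)))
    row = min(GRID - 1, max(0, int(y / em_size * GRID)))
    return row * GRID + col

def token_to_point(token, em_size=1000):
    """Invert the quantization to the centre of the grid cell."""
    row, col = divmod(token, GRID)
    cell = em_size / GRID
    return ((col + 0.5) * cell, (row + 0.5) * cell)
```

| The round trip loses up to half a cell (~3.3 em units here)
| per axis, one plausible source of the wobbly outlines noted
| elsewhere in the thread.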
|
| This kind of reminds me of dalle-1 where the image is represented
| as 256 image tokens then generated one token at a time. That
| approach is the most direct way to adapt a causal-LM architecture
| but it clearly didn't make a lot of sense because images don't
| have a natural top-down-left-right order.
|
| For vector graphics, the closest analogous concept to pixel-wise
| convolution would be the Minkowski sum. I wonder if a Minkowski
| sum-based diffusion model would work for svg images.
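| For intuition, the Minkowski sum of two point sets is just
| every pairwise vector sum; for convex polygons, the convex
| hull of the vertex sums is the exact result. A brute-force
| sketch:

```python
from itertools import product

def minkowski_sum(points_a, points_b):
    """Brute-force Minkowski sum of two point sets: all pairwise
    vector sums {a + b}. For convex polygons, the convex hull of
    the vertex sums is the exact Minkowski sum of the shapes."""
    return {(ax + bx, ay + by)
            for (ax, ay), (bx, by) in product(points_a, points_b)}

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
segment = [(0, 0), (2, 0)]  # sweeping the square along a segment
swept = minkowski_sum(square, segment)  # hull is a 3 x 1 rectangle
```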
| briandw wrote:
| How would the Minkowski sum be used in the diffusion model? Is
| the idea to look at the Minkowski sum of the prediction and
| label?
| yklcs wrote:
| I've tried out some work on generating vector fonts too, as
| sequences of Bezier curves with a seq2seq model. The problem
| was that fonts output by ML models were imprecise: lines were
| not perfectly parallel, corners were at 89deg, and curves were
| kinked. It's not too difficult to get fonts that look good
| enough, but the imperfections are glaring as fonts are normally
| perfectly precise. These imperfections are evident in OP's output
| too, and in my opinion make these types of models unusable for
| actual typesetting.
|
| A 1% error in a raster output would be pixel colors being
| slightly off, but an 89deg corner in a vector image is
| immediately noticeable, making this a hard problem to solve. I
| haven't looked
| into this problem too much since, but I'm interested to hear
| about possible solutions and reading material.
| waterheater wrote:
| Without changing the fundamental learning process, one could
| conceivably introduce a "post-production" step, where you
| tighten up the output according to a set of pre-defined rules
| (e.g., if an angle is 89 degrees, adjust the angle to 90).
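| A minimal sketch of such a snapping rule (the tolerance and
| the rotate-the-outgoing-edge fix are assumptions for
| illustration, not a tested pipeline):

```python
import math

def snap_corner(prev_pt, corner, next_pt, tol_deg=2.0):
    """Snap a near-right-angle corner (e.g. 89deg) to an exact
    right angle by rotating the outgoing edge around the corner."""
    ax, ay = prev_pt[0] - corner[0], prev_pt[1] - corner[1]
    bx, by = next_pt[0] - corner[0], next_pt[1] - corner[1]
    angle = math.degrees(math.atan2(by, bx) - math.atan2(ay, ax)) % 360.0
    target = min((90.0, 180.0, 270.0), key=lambda t: abs(angle - t))
    off = target - angle
    if abs(off) > tol_deg or off == 0.0:
        return next_pt  # already exact, or too far off to be a glitch
    c, s = math.cos(math.radians(off)), math.sin(math.radians(off))
    return (corner[0] + bx * c - by * s, corner[1] + bx * s + by * c)

# An ~89deg corner: the outgoing edge gets rotated to a right angle.
fixed = snap_corner((0, 1), (0, 0), (1, 0.02))
```

| The same idea extends to snapping nearly-parallel lines (the
| 180-degree target) without retraining the model.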
|
| Of course, changing the learning process would be best. One
| idea which comes to mind is finding a way to embed
| relationships into the ML training system itself (e.g., output
| no angles other than 90 degrees or some predefined set). Such
| an approach is a type of constraint-based ML, where the ML agent
| identifies a solution given certain constraints on the output.
| In my experience, the right approach to accomplish this goal is
| using factor graphs.
___________________________________________________________________
(page generated 2023-10-03 23:00 UTC)