[HN Gopher] NeuralSVG: An Implicit Representation for Text-to-Ve...
       ___________________________________________________________________
        
       NeuralSVG: An Implicit Representation for Text-to-Vector Generation
        
       Author : lnyan
       Score  : 262 points
       Date   : 2025-01-08 18:16 UTC (4 hours ago)
        
 (HTM) web link (sagipolaczek.github.io)
 (TXT) w3m dump (sagipolaczek.github.io)
        
       | pizza wrote:
       | Prompting Claude to make SVGs then dropping them into Inkscape
       | and getting the last ~20% of it to match the picture in my head
       | has been a phenomenal user experience for me. This, too, piques
       | my curiosity..!
        
         | xvfLJfx9 wrote:
         | Claude doesn't work at all for me when generating SVGs
        
           | iambateman wrote:
           | For me, it depends on whether there is an existing SVG in its
           | training set.
           | 
           | "Make an SVG of a clock icon" is likely to work. "Make an SVG
           | of a playground swingset with the sun setting" is not.
        
             | varunneal wrote:
             | Your prompt verbatim turned out quite well, single-shot.
             | 
             | https://claude.site/artifacts/0f696bf8-399d-42c3-93c0-296493...
        
         | jgalt212 wrote:
         | I poked around with NeoSVG a few months back. I was not happy
         | with the results, the computation time, or the cost. That being
         | said, I do hope they've made big progress lately because SVGs
         | work real nice when you have an LLM and human working in tandem
         | (as per the comment above).
         | 
         | https://neosvg.com/generations
        
       | kelseyfrog wrote:
       | Why does the fourth example show a hamburger but is labeled as a
       | dragon?
        
         | jsheard wrote:
         | American cultural bias in the training data led it to infer
         | that dragons would be turned into burgers if they were real.
        
         | airstrike wrote:
         | Most likely just a clerical error, since the dragon is two
         | examples to the left with the same caption.
        
         | dekhn wrote:
         | because hamburgers aren't made from chopped ham.
        
       | murtio wrote:
       | This is really cool! I have been using Claude to animate SVG, and
       | it has been great.
        
         | NiloCK wrote:
         | I'd be interested to see examples and hear about process here,
         | if you're willing to share.
        
       | zellyn wrote:
       | The sketch generation is wild... and apparently comes for free.
        
       | airstrike wrote:
       | This opens up lots of opportunities for document authoring tools.
       | Really cool stuff, can't wait to try out the code once it's
       | available.
        
       | lbj wrote:
       | "Code coming soon" - I hope someone reposts this when there's
       | more to dig into
        
       | janalsncm wrote:
       | I am a huge fan of this type of incremental generative approach.
       | Language isn't precise enough to describe a final product, so
       | generating intermediate steps is very powerful.
       | 
       | I'd also like to see this in music generation. Tools like Suno
       | are cool but I would much rather have something that generates
       | MIDIs and instrument configurations instead.
       | 
       | Maybe this is a good lesson for generative tools. It's possible
       | to generate _something_ that's a good starting point. But what
       | people actually want is long tail, so including the capability of
       | precision modification is the difference between a canned demo
       | and a powerful tool.
       | 
       | > Code coming soon
       | 
       | The examples are quite nice but I have no idea how reproducible
       | they are.
        
         | kadushka wrote:
         | _I'd also like to see this in music generation. Tools like Suno
         | are cool but I would much rather have something that generates
         | MIDIs and instrument configurations instead._
         | 
         | Sounds like you're looking for something like
         | https://www.aiva.ai
        
           | janalsncm wrote:
           | Honestly that site feels like they have a database of midis
           | tagged by genre and pick them out randomly. It's totally
           | different from their demo song.
           | 
           | I guess I'm hoping for something better. It's also closed
           | source, the web ui doesn't have editing functionality, and
           | the output is pretty disjointed. Maybe if I messed around
           | with it enough the result would be decent.
        
         | bufferoverflow wrote:
         | MIDI isn't enough. I want MIDI + filters, plus separate voice
         | and custom sound tracks.
        
         | gexaha wrote:
         | microtonal midi would be super awesome
        
       | jonathaneunice wrote:
       | Nice! Looking forward to similar textual generation of diagrams.
       | (The Pic/Pikchr for the LLM age.)
        
         | da_rob wrote:
         | It's not PIC and not really suitable for complex diagrams, yet,
         | but you can use Vizzlo's Chart Vizzard to create a subset of
         | the supported chart types (let's say a Gantt) and then continue
         | editing it using the chart editor: https://vizzlo.com/ai
        
       | fosterbuster wrote:
       | It's a wasted opportunity not using SVG to show the examples.
        
       | TeMPOraL wrote:
       | Available in ComfyUI when? :).
       | 
       | Seriously though, this is amazing, I'm glad to see this tackled
       | directly.
       | 
       | Also, I just learned from this thread that Claude is apparently
       | usable for generating SVGs (unlike e.g. GPT-4 when I tested for
       | it some months ago), so I'll play with that while waiting for
       | NeuralSVG to become available.
        
       | vipshek wrote:
       | This is excellent!
       | 
       | I think the utility of generating vectors is far, far greater
       | than all the raster generation that's been a big focus thus far
       | (DALL-E, Midjourney, etc). Those efforts have been incredibly
       | impressive, of course, but raster outputs are _so_ much more
       | difficult to work with. You're forced to "upscale" or "inpaint"
       | the rasters using subsequent generative AI calls to actually
       | iterate towards something useful.
       | 
       | By contrast, generated vectors are inherently scalable and easy
       | to edit. These outputs in particular seem to be low-complexity,
       | with each shape composed of as few points as possible. This is a
       | boon for "human-in-the-loop" editing experiences.
       | 
       | When it comes to generative visuals, creating simplified
       | representations is much harder (and, IMO, more valuable) than
       | creating highly intricate, messy representations.
        
         | tasuki wrote:
         | Ah, we should be friends!
         | 
         | I'm not sure what else to add, except that these are exactly
         | the thoughts I think, and it used to feel lonely ;)
        
         | Lerc wrote:
         | There is also the possibility for using these images as
         | guidance for rasterization models. Generate easily
         | manipulatable and composable images as a first stage then add
         | detail once the image composition is satisfactory.
        
         | SillyUsername wrote:
         | My little project for the highly intricate, messy
         | representation ;) https://github.com/KodeMunkie/shapesnap (it
         | stands on the backs of giants, original was not mine). It's
         | also available on npm.
        
         | gwern wrote:
         | Have you looked at https://www.recraft.ai/ recently? The image
         | quality of their vector outputs seems to have gotten quite
         | good, although you obviously still wouldn't want to try to
         | generate densely textured or photographic-like images like
         | Midjourney excels at. (For https://gwern.net/dropcap last year
         | or before, we had to settle for Midjourney and create a
         | somewhat convoluted workflow through Recraft; but if I were
         | making dropcaps now, I think the latest Recraft model would
         | probably suffice.)
        
       | 1970-01-01 wrote:
       | Aside: I've been having a very hard time prompting ChatGPT to
       | spit out ASCII art. It really seems to not be able to do it.
       | Here is an ASCII art representation of a hopping rabbit:
       | ```
       | (\(\
       | ( -.-)
       | o_(")(")
       | ```
       | This is a simple representation of a rabbit with its ears up
       | and in a hopping stance. Let me know if you'd like me to adjust
       | it!
        
         | Kiro wrote:
         | Pretty good if you ask me. What would a proper hopping rabbit
         | ASCII art look like?
        
           | jansan wrote:
           | Not sure, but that is a sitting rabbit.
        
         | jsheard wrote:
         | It seems to have just pulled an ASCII rabbit from the training
         | data verbatim
         | 
         | https://old.reddit.com/r/identifythisfont/comments/ytd25m/wh...
        
       | goeiedaggoeie wrote:
       | This is very nice.
       | 
       | I had to convert a bitmask to SVG and wanted to skip the
       | intermediate step, so I looked around for papers about
       | segmentation models outputting SVG and found this one:
       | https://arxiv.org/abs/2311.05276
        
       | scosman wrote:
       | I've been impressed with even applying Sonnet to SVGs for
       | animations. This looks like it could be a lot more powerful.
       | 
       | Fun example:
       | https://gist.github.com/scosman/701275e737331aaab6a2acf74a52...
        
       | toisanji wrote:
       | This is a group applying vector generation to animations:
       | https://www.youtube.com/@studyturtlehq The graphic fidelity has
       | been slowly improving over time.
        
         | gcr wrote:
         | can you say more? all of these videos have less than 5 views
         | and i can't find any explanation of their process
        
       ___________________________________________________________________
       (page generated 2025-01-08 23:00 UTC)