[HN Gopher] NeuralSVG: An Implicit Representation for Text-to-Ve...
___________________________________________________________________
NeuralSVG: An Implicit Representation for Text-to-Vector Generation
Author : lnyan
Score : 707 points
Date : 2025-01-08 18:16 UTC (1 day ago)
(HTM) web link (sagipolaczek.github.io)
(TXT) w3m dump (sagipolaczek.github.io)
| pizza wrote:
| Prompting Claude to make SVGs then dropping them into Inkscape
| and getting the last ~20% of it to match the picture in my head
| has been a phenomenal user experience for me. This, too, piques
| my curiosity..!
| xvfLJfx9 wrote:
| Claude doesn't work at all for me when generating SVGs
| iambateman wrote:
| For me it depends on whether there is an existing SVG in its
| training set.
|
| "Make an SVG of a clock icon" is likely to work. "Make an SVG
| of a playground swingset with the sun setting" is not.
| varunneal wrote:
| Your prompt verbatim turned out quite well, single-shot.
|
| https://claude.site/artifacts/0f696bf8-399d-42c3-93c0-29649
| 3...
| jgalt212 wrote:
| I poked around with NeoSVG a few months back. I was not happy
| with the results, the computation time, or the cost. That being
| said, I do hope they've made big progress lately because SVGs
| work real nice when you have an LLM and human working in tandem
| (as per the comment above).
|
| https://neosvg.com/generations
| kelseyfrog wrote:
| Why does the fourth example show a hamburger but is labeled as a
| dragon?
| jsheard wrote:
| American cultural bias in the training data led it to infer
| that dragons would be turned into burgers if they were real.
| airstrike wrote:
| Most likely just a clerical error, since the dragon is two
| examples to the left with the same caption.
| dekhn wrote:
| because hamburgers aren't made from chopped ham.
| murtio wrote:
| This is really cool! I have been using Claude to animate SVG, and
| it has been great.
| NiloCK wrote:
| I'd be interested to see examples and hear about process here,
| if you're willing to share.
| zellyn wrote:
| The sketch generation is wild... and apparently comes for free.
| airstrike wrote:
| This opens up lots of opportunities for document authoring tools.
| Really cool stuff, can't wait to try out the code once it's
| available.
| lewisjoe wrote:
| Curious how this can augment document authoring! Can you toss
| some ideas?
| airstrike wrote:
| I just think about how often professionals need placeholder
| images or doodles in their documents, but cliparts are
| generally terrible and actually making a nice looking drawing
| for those purposes is out of scope for business users and
| immensely time consuming... so this fills a nice gap.
|
| I'm obviously biased as a former "business user" writing
| document authoring software!
| lbj wrote:
| "Code coming soon" - I hope someone reposts this when there's
| more to dig into
| janalsncm wrote:
| I am a huge fan of this type of incremental generative approach.
| Language isn't precise enough to describe a final product, so
| generating intermediate steps is very powerful.
|
| I'd also like to see this in music generation. Tools like Suno
| are cool but I would much rather have something that generates
| MIDIs and instrument configurations instead.
|
| Maybe this is a good lesson for generative tools. It's possible
| to generate _something_ that's a good starting point. But what
| people actually want is long tail, so including the capability of
| precision modification is the difference between a canned demo
| and a powerful tool.
|
| > Code coming soon
|
| The examples are quite nice but I have no idea how reproducible
| they are.
| kadushka wrote:
| _I'd also like to see this in music generation. Tools like Suno
| are cool but I would much rather have something that generates
| MIDIs and instrument configurations instead._
|
| Sounds like you're looking for something like
| https://www.aiva.ai
| janalsncm wrote:
| Honestly that site feels like they have a database of midis
| tagged by genre and pick them out randomly. It's totally
| different from their demo song.
|
| I guess I'm hoping for something better. It's also closed
| source, the web ui doesn't have editing functionality, and
| the output is pretty disjointed. Maybe if I messed around
| with it enough the result would be decent.
| kadushka wrote:
| Fair enough. Still, for what you've described, Aiva is the
| best tool available.
| bufferoverflow wrote:
| MIDI isn't enough. I want MIDI + filters, plus separate voice
| and custom sound tracks.
| gexaha wrote:
| microtonal midi would be super awesome
| jonathaneunice wrote:
| Nice! Looking forward to similar textual generation of diagrams.
| (The Pic/Pikchr for the LLM age.)
| da_rob wrote:
| It's not PIC and not really suitable for complex diagrams, yet,
| but you can use Vizzlo's Chart Vizzard to create a subset of
| the supported chart types (let's say a Gantt) and then continue
| editing it using the chart editor: https://vizzlo.com/ai
| _1 wrote:
| I've had some success with converting SQL to Mermaid Markdown
| diagrams.
| fosterbuster wrote:
| It's a wasted opportunity not to use SVG to show the examples.
| TeMPOraL wrote:
| Available in ComfyUI when? :).
|
| Seriously though, this is amazing, I'm glad to see this tackled
| directly.
|
| Also, I just learned from this thread that Claude is apparently
| usable for generating SVGs (unlike e.g. GPT-4 when I tested for
| it some months ago), so I'll play with that while waiting for
| NeuralSVG to become available.
| vipshek wrote:
| This is excellent!
|
| I think the utility of generating vectors is far, far greater
| than all the raster generation that's been a big focus thus far
| (DALL-E, Midjourney, etc). Those efforts have been incredibly
| impressive, of course, but raster outputs are _so_ much more
| difficult to work with. You're forced to "upscale" or "inpaint"
| the rasters using subsequent generative AI calls to actually
| iterate towards something useful.
|
| By contrast, generated vectors are inherently scalable and easy
| to edit. These outputs in particular seem to be low-complexity,
| with each shape composed of as few points as possible. This is a
| boon for "human-in-the-loop" editing experiences.
|
| When it comes to generative visuals, creating simplified
| representations is much harder (and, IMO, more valuable) than
| creating highly intricate, messy representations.
| tasuki wrote:
| Ah, we should be friends!
|
| I'm not sure what else to add, except that these are exactly
| the thoughts I think, and it used to feel lonely ;)
| Lerc wrote:
| There is also the possibility of using these images as
| guidance for rasterization models. Generate easily
| manipulable and composable images as a first stage, then add
| detail once the image composition is satisfactory.
| datadrivenangel wrote:
| Trivially possible with controlnets!
| SillyUsername wrote:
| My little project for the highly intricate, messy
| representation ;) https://github.com/KodeMunkie/shapesnap (it
| stands on the backs of giants, original was not mine). It's
| also available on npm.
| gwern wrote:
| Have you looked at https://www.recraft.ai/ recently? The image
| quality of their vector outputs seems to have gotten quite
| good, although you obviously still wouldn't want to try to
| generate densely textured or photographic-like images like
| Midjourney excels at. (For https://gwern.net/dropcap last year
| or before, we had to settle for Midjourney and create a
| somewhat convoluted workflow through Recraft; but if I were
| making dropcaps now, I think the latest Recraft model would
| probably suffice.)
| esperent wrote:
| Link to their vector page, since the main page makes them
| look like yet another AI image generator:
|
| https://www.recraft.ai/ai-image-vectorizer
|
| The quality does look quite amazing at first glance. How are
| the vectors to work with? Can you just open them in
| Illustrator and start editing?
| gwern wrote:
| No, I actually was referring to their native vector AI
| image generator, not their vector _izer_ - although the
| vectorizer was better than any other we found, and that's
| why we were using it to convert the Midjourney PNG dropcaps
| into SVGs.
|
| (The editing quality of the vectorized ones was not great,
| but it is hard to see how they could be good given their
| raster-style appearance. I can't speak to the editing
| quality of the native-generated ones, either in the old
| obsolete Recraft models or the newer ones, because the old
| ones were too ugly to want to use, and I haven't done much
| with the new one yet.)
| brown_martin wrote:
| I was under the impression that their AI Vector generator
| generates a PNG and vectorizes under the hood.
| zidad wrote:
| I always imagine how useful Sora.ai could be if it generated
| 3D models to render its animations from instead.
| spyder wrote:
| I agree, that's the future of these video models. For
| professional use you want more control and the obvious next
| step towards that is to generate the full 3D scene (in the
| form of animated Gaussian splats, since that's more AI
| friendly than the mesh based 3D). That also helps the model
| to be more consistent and gives the user more control over
| the camera or the scene.
| cochlear wrote:
| I couldn't agree more. I feel that the block-coding and
| rasterized approaches that are ubiquitous in audio codecs (even
| the modern "neural" ones) are a dead-end for the fine-grained
| control that musicians will want. They're just fine for text-
| to-music interfaces of course.
|
| I'm working on a sparse audio codec that's mostly focused on
| "natural" sounds at the moment, and uses some (very roughly)
| physics-based assumptions to promote a sparse representation.
|
| https://blog.cochlea.xyz/sparse-interpretable-audio-codec-pa...
| 1970-01-01 wrote:
| Aside: I've been having a very hard time prompting ChatGPT to
| spit out ASCII art. It really seems to not be able to do it.
| Here is an ASCII art representation of a hopping rabbit:
| ```
| (\(\
| ( -.-)
| o_(")(")
| ```
| This is a simple representation of a rabbit with its ears up
| and in a hopping stance. Let me know if you'd like me to adjust
| it!
| Kiro wrote:
| Pretty good if you ask me. What would a proper hopping rabbit
| ASCII art look like?
| jansan wrote:
| Not sure, but that is a sitting rabbit.
| jsheard wrote:
| It seems to have just pulled an ASCII rabbit from the training
| data verbatim
|
| https://old.reddit.com/r/identifythisfont/comments/ytd25m/wh...
| goeiedaggoeie wrote:
| This is very nice.
|
| I had to convert a bitmask to SVG and wanted to skip the
| intermediate step, so I looked around for papers about
| segmentation models outputting SVG and found this one:
| https://arxiv.org/abs/2311.05276
| scosman wrote:
| I've been impressed even with applying Sonnet to SVGs for
| animations. This looks like it could be a lot more powerful.
|
| Fun example:
| https://gist.github.com/scosman/701275e737331aaab6a2acf74a52...
| astrodude wrote:
| oh, wow. this actually works. I didn't know :) thanks!
| toisanji wrote:
| This is a group applying vector generation to animations:
| https://www.youtube.com/@studyturtlehq The graphic fidelity has
| been slowly improving over time.
| gcr wrote:
| can you say more? all of these videos have less than 5 views
| and i can't find any explanation of their process
| andy_ppp wrote:
| I'm looking forward to seeing what this makes of Simon Willison's
| LLM SVG generation test prompt: "Generate an SVG of a pelican
| riding a bicycle".
|
| It's quite amazing how much progress we are seeing in AI, and
| it will keep getting better, which is somewhat terrifying.
| nojvek wrote:
| I asked both Claude and ChatGPT o3 to "generate svg of mainland
| USA with black outline".
|
| Tried various models and they got it hopelessly wrong. Claude
| does an okay job at "Generate an SVG of a pelican riding a
| bicycle"
| IanCal wrote:
| How's this? https://imgur.com/a/aWQ0J49
|
| I might be missing something but at a first pass it looks
| good. Not from the US though so something may be more
| obviously wrong to you.
| piombisallow wrote:
| This is much more useful for actual design jobs.
| CyberDildonics wrote:
| If you can generate an image you can flatten it and if you can
| flatten it you can cluster it, and if you can cluster the flat
| sections you can draw vectors around them.
| strangecasts wrote:
| This posterization-vectorization approach is what the Flash
| "Trace Bitmap" tool implemented (I'm not sure if Animate still
| has it?), but if your image isn't originally clipart/vector
| art, it gives the resulting vector art a very early 2000s
| look...
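
The flatten, cluster, vectorize chain described in the two comments above can be sketched in a few lines. This is only a naive illustration, not how Flash's "Trace Bitmap" or any real tracer works: it posterizes each RGB channel, groups identical neighbours in a row into runs, and emits one SVG `<rect>` per run instead of fitting curves around regions. The function names (`posterize`, `row_runs`, `to_svg`) and the nested-list image format are assumptions made for the example.

```python
def posterize(pixels, levels=2):
    """Flatten colors: snap each RGB channel to one of `levels` evenly spaced values."""
    step = 255 / (levels - 1)
    return [[tuple(int(round(round(c / step) * step)) for c in px) for px in row]
            for row in pixels]


def row_runs(row):
    """Cluster one row into (start, length, color) runs of identical color."""
    runs, start = [], 0
    for x in range(1, len(row) + 1):
        # Close the current run at the end of the row or when the color changes.
        if x == len(row) or row[x] != row[start]:
            runs.append((start, x - start, row[start]))
            start = x
    return runs


def to_svg(pixels, levels=2):
    """Draw one rectangle per flat run; a real tracer would fit outlines instead."""
    flat = posterize(pixels, levels)
    height, width = len(flat), len(flat[0])
    shapes = []
    for y, row in enumerate(flat):
        for x, length, (r, g, b) in row_runs(row):
            shapes.append(
                f'<rect x="{x}" y="{y}" width="{length}" height="1" '
                f'fill="rgb({r},{g},{b})"/>'
            )
    body = "\n".join(shapes)
    return (f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'viewBox="0 0 {width} {height}">\n{body}\n</svg>')


if __name__ == "__main__":
    # Hypothetical 3x2 image: two reddish pixels, one bluish, then a black row.
    image = [
        [(250, 5, 5), (240, 10, 0), (3, 3, 250)],
        [(0, 0, 0), (0, 0, 0), (0, 0, 0)],
    ]
    print(to_svg(image))
```

With only two levels per channel the output is blocky by design, which is the same effect that gives traced photographs the flat look mentioned above; more levels produce more, smaller shapes.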
| intalentive wrote:
| I've always thought that generation of intermediate
| representations was the way to go. Instead of generating concrete
| syntax, generate AST. Instead of generating PNG, generate SVG.
| Instead of generating a succession of images for animation,
| generate wire frame or rigging plus script.
|
| Once you have your IR, modify and render. Once you have your
| render, apply a final coat of AI pixie dust.
|
| Maybe generative models will get so powerful that fine-grained
| control can be achieved through natural language. But until then,
| this method would have the advantages of controllability,
| interoperability with existing tools (like Intellisense, image
| editors), and probably smaller, cheaper models that don't have to
| accommodate high dimensional pixel space.
| IncreasePosts wrote:
| Shouldn't the girl with the pearl earring have an earring?
| mcraiha wrote:
| No, because it is not a pearl earring.
| https://www.theartnewspaper.com/2023/02/08/the-girl-with-a-g...
| IncreasePosts wrote:
| Okay, but shouldn't she at least have a glass teardrop-shaped
| bauble?
| niemandhier wrote:
| It looks as if this is not autoregressive.
|
| It would be interesting to see a similar approach that
| incrementally works from simpler (fewer curves) to more complex
| representations.
|
| That way one could probably apply RLHF along the trajectory too.
| nbzso wrote:
| So designers, artists, musicians, we are done, right? Who's next, I
| wonder?
| thomasfl wrote:
| Finally something that can benefit artists as a sketching tool.
| nikolayasdf123 wrote:
| very nice. had this idea for a while, but never had time to
| implement it.
|
| glad someone actually did it! great work!
| Jean-Papoulos wrote:
| This is the kind of image generation I've been waiting for. No
| more messing around in Inkscape (or at least, less of it) when I
| need a specific icon.
| cyp0633 wrote:
| Claude has been doing a good job generating SVGs compared to its
| rivals. Happy to see new models bringing image generation even
| further.
| chestervonwinch wrote:
| I wonder if you can use an existing svg as a starting point. I
| would love to use the sketch approach and generate frame-by-frame
| animations to plot with my pen plotter.
| shahzaibmushtaq wrote:
| I am really impressed with how it generates rough sketches
| because everything in the design world begins that way.
___________________________________________________________________
(page generated 2025-01-09 23:00 UTC)